Title: Speed up open().read() pattern by reducing the number of system calls · Issue #120754 · python/cpython · GitHub
Description: Feature or enhancement Proposal: I came across some seemingly redundant fstat() and lseek() calls when working on a tool that scanned a directory of lots of small YAML files and loaded their contents as config. In tracing I found most ex...
Opengraph URL: https://github.com/python/cpython/issues/120754
Author: cmaloney (https://github.com/cmaloney)
Date Published: 2024-06-19T19:36:57.000Z
Comments: 20
URL: https://github.com/python/cpython/issues/120754

# Feature or enhancement

### Proposal:

I came across some seemingly redundant `fstat()` and `lseek()` calls when working on a tool that scanned a directory of lots of small YAML files and loaded their contents as config. In tracing I found most execution time wasn't in the Python interpreter but in system calls (on top of NFS in that case, which made some I/O calls particularly slow).

I've been experimenting with a program that reads all `.rst` files in the Python `Doc` directory to try to remove some of those redundant system calls.

### Test Program
```python
from pathlib import Path

nlines = []
for filename in Path("cpython/Doc").glob("**/*.rst"):
    nlines.append(len(filename.read_text()))
```

In my experimentation, with some tweaks to fileio I can remove over 10% of the system calls the test program makes when scanning the whole `Doc` folder for `.rst` files on both macOS and Linux (I don't have a Windows machine to measure on).

### Current State (9 system calls)
Currently on my Linux machine, reading a whole `.rst` file with the above code produces this series of system calls:
```python
openat(AT_FDCWD, "cpython/Doc/howto/clinic.rst", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=343, ...}) = 0
ioctl(3, TCGETS, 0x7ffe52525930) = -1 ENOTTY (Inappropriate ioctl for device)
lseek(3, 0, SEEK_CUR) = 0
lseek(3, 0, SEEK_CUR) = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=343, ...}) = 0
read(3, ":orphan:\n\n.. This page is retain"..., 344) = 343
read(3, "", 1) = 0
close(3) = 0
```

### Target State (~~7~~ 5 system calls)
It would be nice to get it down to the following (for small files; see the large file caveat in the PR, where large files get an additional seek):
```python
# Open the file
openat(AT_FDCWD, "cpython/Doc/howto/clinic.rst", O_RDONLY|O_CLOEXEC) = 3
# Check if the open fd is a file or directory and early-exit on directories with a specialized error.
# With my changes we also stash the size information from this for later use as an estimate.
fstat(3, {st_mode=S_IFREG|0644, st_size=343, ...}) = 0
# Read the data directly into a PyBytes
read(3, ":orphan:\n\n.. This page is retain"..., 344) = 343
# Read the EOF marker
read(3, "", 1) = 0
# Close the file
close(3) = 0
```

In a number of cases (e.g. importing modules) there is often an `fstat` followed immediately by an open and read of the file (which typically does another `fstat`), but that is an extension point and I want to keep it out of scope for now.

### Questions rattling around in my head around this
Some of these are likely better for Discourse / longer form discussion; I'm happy to start threads there as appropriate.

1. Is there a way to add a test for certain system calls happening with certain arguments and/or a certain amount of time? (I don't currently see a great way to write a test to make sure the number of system calls doesn't change unintentionally.)
2. Running a simple Python script (`python simple.py` that contains `print("Hello, World!")`) currently reads `simple.py` in full at least 4 times and does over 5 seeks. I have been pulling on that thread, but it interacts with importlib as well as how the Python compiler currently works, and I am still trying to get my head around it. Would removing more of those overheads be something of interest / should I keep working to get my head around it?
3. We could potentially save more
    1. with readv (one readv call, two iovecs). I avoided this for now because `_Py_read` does quite a bit.
    2. by dispatching multiple calls in parallel using asynchronous I/O APIs while still meeting the Python API guarantees. I am experimenting with this (backed by relatively new Linux I/O APIs but possibly for kqueue and epoll), but it's _very_ experimental and feels a lot like "has to be an interpreter primitive" to me to work effectively, which is complex to plumb through. Very early days though: many thoughts, not much prototype code.
4. The `_blksize` member of fileio was added in bpo-21679. As far as I can tell it is not used much, either through its Python-level reflection `_blksize` or in the code. The only usage I can find is https://github.com/python/cpython/blob/main/Modules/_io/_iomodule.c#L365-L374, where we could just query for it when needed to save some storage on all `fileio` objects. The behavior of using the stat-returned `st_blksize` is part of the docs, so it doesn't feel like we can fully remove it.

### Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

### Links to previous discussion of this feature:

_No response_

### Linked PRs
* gh-120755
* gh-121143
* gh-121357
* gh-121593
* gh-121633
* gh-122101
* gh-122103
* gh-122111
* gh-122215
* gh-122216
* gh-123303
* gh-123412
* gh-123413
* gh-124225
* gh-125166
* gh-126466
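
### Illustration: the five-call sequence at the `os` level

The target sequence described in this issue (open, one `fstat`, a sized read, a one-byte read that hits EOF, close) can be approximated from Python with the low-level `os` functions. This is only a rough sketch to make the idea concrete; it is not the change made in the linked PRs, which concerns the interpreter's own file I/O code. The function name `read_small_file` and the 64 KiB fallback chunk size are assumptions made up for this example.

```python
import os
import stat


def read_small_file(path):
    """Read a whole file with one fstat and size-guided reads (illustrative only)."""
    # O_BINARY only exists on Windows; fall back to 0 elsewhere.
    fd = os.open(path, os.O_RDONLY | getattr(os, "O_BINARY", 0))
    try:
        # A single fstat serves two purposes: reject directories early
        # and remember st_size as an estimate of how much to read.
        st = os.fstat(fd)
        if stat.S_ISDIR(st.st_mode):
            raise IsADirectoryError(path)

        chunks = []
        # Ask for one byte more than the estimate, so that a read which
        # returns exactly st_size bytes leaves a 1-byte request for EOF.
        want = st.st_size + 1
        while True:
            chunk = os.read(fd, want)
            if not chunk:
                break  # EOF reached
            chunks.append(chunk)
            want -= len(chunk)
            if want <= 0:
                # The file was larger than the estimate (it may have grown);
                # hypothetical fallback: continue in 64 KiB chunks.
                want = 64 * 1024
        return b"".join(chunks)
    finally:
        os.close(fd)
```

For a 343-byte file this should issue a 344-byte read that returns 343 bytes, followed by a 1-byte read that returns nothing, which mirrors the two `read` calls in the target trace. Running it under a tracer such as `strace` is one way to check which calls actually occur on a given platform.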