Title: Add optimized versions of isdir / isfile on Windows · Issue #101196 · python/cpython · GitHub
Description: I went down this rabbit hole when someone mentioned that isfile/isdir/exists all make a rather expensive os.stat call on Windows (which is actually a long wrapper around a number of system calls on Windows), rather than the simpler and m...
Opengraph URL: https://github.com/python/cpython/issues/101196
Author: mdboom (https://github.com/mdboom)
Date Published: 2023-01-20T18:01:32.000Z
Comments: 15

I went down this rabbit hole when someone mentioned that `isfile`/`isdir`/`exists` all make a rather expensive `os.stat` call on Windows (which is actually a long wrapper around a number of system calls), rather than the simpler and more direct call to `GetFileAttributesW`.

I noticed that at one point there was a [version of `isdir`](https://github.com/python/cpython/commits/9c669ccc77c85eac245d460bab510a38b20d9a08) that did exactly this. At the time, it claimed a 2x speedup.

However, this C implementation of `isdir` was removed as part of a large set of changes in df2d4a6f3, and as a result, `isdir` got faster.

With the following benchmark:

<details>
<summary>isdir benchmark</summary>

```
import os
import os.path
import timeit


# Create 100 directories that the "exists" case will probe.
for i in range(100):
    os.makedirs(f"exists{i}", exist_ok=True)


def test_exists():
    for i in range(100):
        os.path.isdir(f"exists{i}")


def test_extinct():
    # These paths are never created, so every lookup fails.
    for i in range(100):
        os.path.isdir(f"extinct{i}")


print("exists:", timeit.timeit(test_exists, number=100))
print("doesn't exist:", timeit.timeit(test_extinct, number=100))


# Clean up the directories created above.
for i in range(100):
    os.rmdir(f"exists{i}")
```

</details>

I get the following with df2d4a6f3:

```
exists: 0.18694799999957468
doesn't exist: 0.08418370000072173
```

and with the prior commit:

```
exists: 0.25393609999991895
doesn't exist: 0.08511730000009265
```

So, from this, I'd conclude that replacing calls to `os.stat` with calls to `GetFileAttributesW` would not bear fruit, but @zooba should probably confirm that I'm benchmarking the right thing and that this makes sense.

In any event, we should probably remove the [little vestige](https://github.com/python/cpython/blob/main/Lib/ntpath.py#LL854-L862C9) that imports this now-removed fast path:

```python
try:
    # The genericpath.isdir implementation uses os.stat and checks the mode
    # attribute to tell whether or not the path is a directory.
    # This is overkill on Windows - just pass the path to GetFileAttributes
    # and check the attribute from there.
    from nt import _isdir as isdir
except ImportError:
    # Use genericpath.isdir as imported above.
    pass
```

### Linked PRs
* gh-101324
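For readers unfamiliar with the two approaches being compared, here is a rough sketch of each. The function names `isdir_via_stat` and `isdir_via_attributes` are hypothetical and are not CPython's actual implementation. The first mirrors what the comment in the vestige describes (call `os.stat`, then check the mode). The second calls `GetFileAttributesW` through `ctypes` on Windows and simply falls back to the `os.stat` version elsewhere. It assumes `str` paths, and its results may differ from `os.stat` for symlinks and other reparse points.

```python
import os
import stat
import sys
import tempfile

# Win32 constants from the Windows SDK headers.
FILE_ATTRIBUTE_DIRECTORY = 0x10
INVALID_FILE_ATTRIBUTES = 0xFFFFFFFF


def isdir_via_stat(path):
    """Stat the path and inspect st_mode, as the vestige's comment describes."""
    try:
        st = os.stat(path)
    except (OSError, ValueError):
        # Missing path, permission problem, or an invalid path string.
        return False
    return stat.S_ISDIR(st.st_mode)


def isdir_via_attributes(path):
    """Hypothetical fast path: a single GetFileAttributesW call on Windows."""
    if sys.platform != "win32":
        # No such API off Windows; fall back so the sketch runs anywhere.
        return isdir_via_stat(path)

    import ctypes
    from ctypes import wintypes

    get_attrs = ctypes.windll.kernel32.GetFileAttributesW
    get_attrs.argtypes = [wintypes.LPCWSTR]
    get_attrs.restype = wintypes.DWORD

    attrs = get_attrs(os.fspath(path))
    if attrs == INVALID_FILE_ATTRIBUTES:
        return False
    return bool(attrs & FILE_ATTRIBUTE_DIRECTORY)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        missing = os.path.join(tmp, "extinct")
        for check in (isdir_via_stat, isdir_via_attributes):
            print(check.__name__, check(tmp), check(missing))
```

Swapping either function into the benchmark in place of `os.path.isdir` gives a rough like-for-like comparison, though the `ctypes` call overhead means it would not reflect what a C implementation could achieve.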