Title: gh-143658: importlib.metadata: Use `str.translate` to improve performance of `importlib.metadata.Prepared.normalized` by hugovk · Pull Request #143660 · python/cpython · GitHub
Description: We can apply @henryiii's improvement to packaging in pypa/packaging#1030 (see also https://iscinumpy.dev/post/packaging-faster/) to improve the performance of canonicalize_name and make it ~3.7 times faster.

Benchmark

Run Prepared.normalize(n) on every name in PyPI:

```python
# benchmark_names_stdlib.py
import sqlite3
import timeit
from importlib.metadata import Prepared

# Get data with:
# curl -L https://github.com/pypi-data/pypi-json-data/releases/download/latest/pypi-data.sqlite.gz | gzip -d > pypi-data.sqlite
# Or use pre-cached files from:
# https://gist.github.com/hugovk/efdbee0620cc64df7b405b52cf0b6e42

CACHE_FILE = "/tmp/bench/names.txt"
DB_FILE = "/tmp/bench/pypi-data.sqlite"

try:
    # Fast path: read the project names cached by a previous run.
    with open(CACHE_FILE) as f:
        TEST_ALL_NAMES = [line.rstrip("\n") for line in f]
except FileNotFoundError:
    # Slow path: read the names from the SQLite dump and write the cache.
    TEST_ALL_NAMES = []
    with sqlite3.connect(DB_FILE) as conn:
        with open(CACHE_FILE, "w") as cache:
            for (name,) in conn.execute("SELECT name FROM projects"):
                if name:
                    TEST_ALL_NAMES.append(name)
                    cache.write(name + "\n")


def bench():
    for n in TEST_ALL_NAMES:
        Prepared.normalize(n)


if __name__ == "__main__":
    print(f"Loaded {len(TEST_ALL_NAMES):,} names")
    t = timeit.timeit("bench()", globals=globals(), number=1)
    print(f"Time: {t:.4f} seconds")
```

Benchmark data can be found at https://gist.github.com/hugovk/efdbee0620cc64df7b405b52cf0b6e42

Before

With optimisations:

```
❯ ./python.exe benchmark_names_stdlib.py
Loaded 8,344,947 names
Time: 5.1483 seconds
```

After

```
❯ ./python.exe benchmark_names_stdlib.py
Loaded 8,344,947 names
Time: 1.3754 seconds
```

3.7 times faster.

Issue: gh-143658
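For readers who have not seen the technique, the following is a minimal sketch of how a `str.translate` table can replace a regular expression for name normalization. It is not the diff from this PR: the helper names `normalize_regex` and `normalize_translate` are hypothetical, the regex baseline is the PEP 503 normalization rule, and the translate variant is assumed to receive ASCII-only names (it lowercases only `A-Z`, whereas `str.lower()` also handles non-ASCII letters).

```python
import re
import string

# Hypothetical names for illustration; not taken from the PR's diff.
# Map ASCII uppercase to lowercase, and "_" and "." to "-".
_TABLE = str.maketrans(
    string.ascii_uppercase + "_.",
    string.ascii_lowercase + "--",
)


def normalize_regex(name: str) -> str:
    """PEP 503 style normalization using a regular expression."""
    return re.sub(r"[-_.]+", "-", name).lower()


def normalize_translate(name: str) -> str:
    """Same result for ASCII names, using str.translate and str.replace."""
    value = name.translate(_TABLE)
    # After translation every separator is "-"; collapse runs to a single "-".
    while "--" in value:
        value = value.replace("--", "-")
    return value


if __name__ == "__main__":
    for sample in ["Foo__Bar..baz", "a---b", "Django", "zope.interface"]:
        print(sample, "->", normalize_translate(sample))
```

The translate variant avoids the regex engine: one table lookup pass handles case folding and separator unification together, and the `while` loop usually runs zero or one time because most names contain no repeated separators.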
Opengraph URL: https://github.com/python/cpython/pull/143660