Title: Add fast path to `deepcopy()` for empty list/tuple/dict/set · Issue #121192 · python/cpython · GitHub
Description: deepcopy() can be surprisingly slow when called with empty containers like lists, tuples, dicts, sets or frozensets. Adding a fast path for this case similar to #114266 would significantly speed up such cases by about 4x - 28x while havi...
Opengraph URL: https://github.com/python/cpython/issues/121192
Author: lgeiger (https://github.com/lgeiger)
Date published: 2024-06-30T21:54:34.000Z
Comments: 9

`deepcopy()` can be surprisingly slow when called with empty containers like lists, tuples, dicts, sets or frozensets.

Adding a fast path for this case, similar to #114266, would speed up such cases by about 4x - 28x while having little impact on the default path and without adding much complexity. With such a patch, the following benchmarking script shows a significant speedup compared to `main`:
```python
import pyperf

runner = pyperf.Runner()

# Shared setup: one non-empty dict (the default path) plus each empty container type.
setup = """
import copy

a = {"list": [1, 2, 3, 4], "t": (1, 2, 3), "str": "hello", "subdict": {"a": True}}

t = ()
fs = frozenset()
l = []
s = set()
d = {}
deep = [[], (), {}, set(), frozenset()]
"""

runner.timeit(name="deepcopy dict", stmt="b = copy.deepcopy(a)", setup=setup)
runner.timeit(name="deepcopy empty tuple", stmt="b = copy.deepcopy(t)", setup=setup)
runner.timeit(name="deepcopy empty frozenset", stmt="b = copy.deepcopy(fs)", setup=setup)
runner.timeit(name="deepcopy empty list", stmt="b = copy.deepcopy(l)", setup=setup)
runner.timeit(name="deepcopy empty set", stmt="b = copy.deepcopy(s)", setup=setup)
runner.timeit(name="deepcopy empty dict", stmt="b = copy.deepcopy(d)", setup=setup)
runner.timeit(name="deepcopy multiple empty containers", stmt="b = copy.deepcopy(deep)", setup=setup)
```
```
deepcopy dict: Mean +- std dev: [baseline] 1.86 us +- 0.06 us -> [optimize-empty-copy] 2.02 us +- 0.02 us: 1.09x slower
deepcopy empty tuple: Mean +- std dev: [baseline] 285 ns +- 2 ns -> [optimize-empty-copy] 48.4 ns +- 0.9 ns: 5.89x faster
deepcopy empty frozenset: Mean +- std dev: [baseline] 1.47 us +- 0.11 us -> [optimize-empty-copy] 49.9 ns +- 1.5 ns: 29.44x faster
deepcopy empty list: Mean +- std dev: [baseline] 323 ns +- 2 ns -> [optimize-empty-copy] 82.7 ns +- 2.5 ns: 3.91x faster
deepcopy empty set: Mean +- std dev: [baseline] 1.46 us +- 0.10 us -> [optimize-empty-copy] 85.4 ns +- 4.9 ns: 17.04x faster
deepcopy empty dict: Mean +- std dev: [baseline] 326 ns +- 4 ns -> [optimize-empty-copy] 83.3 ns +- 2.6 ns: 3.91x faster
deepcopy multiple empty containers: Mean +- std dev: [baseline] 4.13 us +- 0.04 us -> [optimize-empty-copy] 1.16 us +- 0.02 us: 3.56x faster

Geometric mean: 5.48x faster
```

This might conflict with @eendebakpt's efforts in #91610, or it could be something that should be added to the proposed C version as well.

For context, I noticed this when using [pydantic](https://docs.pydantic.dev/latest/) models with mutable default values, where pydantic deep copies the default value upon class instantiation. E.g.:
```python
import pydantic


class Foo(pydantic.BaseModel):
    bar: list[int] = []
```
To be fair, the proper fix in this case is to avoid the mutable default value and switch to `pydantic.Field(default_factory=list)`, similar to dataclasses, which is much faster. But there may be other scenarios where deep copying empty iterables is common.

I'm happy to make a PR unless it conflicts with the efforts going on in #91610.

### Linked PRs
* gh-121193
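To make the proposed idea concrete, here is a minimal sketch of what a fast path for empty built-in containers could look like, written as a wrapper around `copy.deepcopy()`. The function name `deepcopy_with_empty_fast_path` and the two type tuples are hypothetical and chosen for illustration; this is not the code from gh-121193, and a real change inside the `copy` module would also have to respect the `memo` bookkeeping for nested objects.

```python
import copy

# Hypothetical names, not part of the standard library.
_IMMUTABLE_TYPES = (tuple, frozenset)
_MUTABLE_TYPES = (list, dict, set)


def deepcopy_with_empty_fast_path(x, memo=None):
    """Deep copy x, short-circuiting empty built-in containers.

    The shortcut is only taken for top-level calls (memo is None) so that
    identity tracking for shared nested objects stays with copy.deepcopy().
    """
    if memo is None and not x:
        cls = type(x)
        # Exact type checks: subclasses may carry extra state or define
        # __deepcopy__, so they always go through the default path.
        if cls in _IMMUTABLE_TYPES:
            return x  # empty and immutable: safe to share
        if cls in _MUTABLE_TYPES:
            return cls()  # fresh empty container of the same type
    return copy.deepcopy(x, memo)
```

For example, `deepcopy_with_empty_fast_path([])` returns a new empty list without touching the generic dispatch machinery, while a non-empty dict falls through to `copy.deepcopy()` unchanged. The extra emptiness check on every call is the kind of overhead that would show up as a small slowdown on the default path.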