Title: Bytecode positions seem way too broad · Issue #93691 · python/cpython · GitHub
Opengraph URL: https://github.com/python/cpython/issues/93691
Author: brandtbucher (https://github.com/brandtbucher)
Date Published: 2022-06-10T18:12:47.000Z
Comments: 5
URL: https://github.com/python/cpython/issues/93691

(Note that `dis` currently has a [bug](https://github.com/python/cpython/pull/93663) in displaying accurate location info in the presence of `CACHE`s. The correct information can be observed by working with `co_positions` directly or using the code from that PR.)

While developing `specialist`, I realized that there are lots of common code patterns that produce bytecode with unexpectedly large source ranges. In addition to being unhelpful for both friendly tracebacks (the original motivation) and things like bytecode introspection, I suspect these huge ranges may also be bloating the size of our internal position tables.

Consider the following function:

```py
def analyze(path):                                # 1
    upper = lower = total = 0                     # 2
    with open(path) as file:                      # 3
        for line in file:                         # 4
            for character in line:                # 5
                if character.isupper():           # 6
                    upper += 1                    # 7
                elif character.islower():         # 8
                    lower += 1                    # 9
                total += 1                        # 10
    return lower / total, upper / total           # 11


import dis
from pprint import pprint as pp

def pos(p):
    return (p.lineno, p.end_lineno, p.col_offset, p.end_col_offset)

pp([(pos(x.positions), x.opname, x.argval) for x in dis.get_instructions(analyze)])
```

Things that should probably span one line at most:
- The first `GET_ITER`/`FOR_ITER` pair spans all of lines 4 through 10.
- The second `GET_ITER`/`FOR_ITER` pair spans all of lines 5 through 10.
- The first `POP_JUMP_FORWARD_IF_FALSE` spans all of lines 6 through 9.
- The second `POP_JUMP_FORWARD_IF_FALSE` spans all of lines 8 through 9.
- Ten instructions for `with` cleanup each span all of lines 3 through 10.

Things that should probably be artificial:
- A `JUMP_FORWARD` spans all of line 7.
- The first `JUMP_BACKWARD` spans all of line 10.
- The second `JUMP_BACKWARD` spans all of lines 5 through 10.

Things I don't get:
- A `NOP` spans all of lines 4 through 10.

As a result, over half of the generated bytecode for this function claims to span line 9, for instance. Also not shown here: the instructions for building functions and classes have similarly huge spans.

I think this can be tightened up in the compiler by:
- Being more aggressive in calling `SET_LOC` on child nodes.
- Being more aggressive in calling `UNSET_LOC` before unconditional jumps.

### Linked PRs
* gh-120125
* gh-120330
* gh-120399
* gh-120405
* gh-123604
* gh-123605