Title: Incorrect parsing of TarInfo header when GNU long name and type AREGTYPE are combined · Issue #141707 · python/cpython · GitHub
Opengraph URL: https://github.com/python/cpython/issues/141707
Opened by [e-nomem](https://github.com/e-nomem) on 2025-11-18 · 1 comment · https://github.com/python/cpython/issues/141707

# Bug report

### Bug description:

When an entry uses GNU long name encoding, the `tarfile` module reads in the data blocks for the name and then [calls self.fromtarfile() again](https://github.com/python/cpython/blob/4867f717e21c3b5f0ad0e81f950c69dac6c95e6e/Lib/tarfile.py#L1404) to get the 'actual' header. This second header is the source of truth for everything _except_ the name, which is just garbage data.

The problem is that `fromtarfile()` eventually calls `frombuf()`, where [this logic](https://github.com/python/cpython/blob/4867f717e21c3b5f0ad0e81f950c69dac6c95e6e/Lib/tarfile.py#L1310-L1311) incorrectly uses the garbage data and overrides the entry type to directory, corrupting the entry.

Because the entry is detected as a directory, the offset is not updated properly and the next call to read a TarInfo entry will usually result in an exception. However, the exception ends up in [this block](https://github.com/python/cpython/blob/4867f717e21c3b5f0ad0e81f950c69dac6c95e6e/Lib/tarfile.py#L2851-L2857), where neither of the `if` conditions is met, so the exception is silently discarded. `tarinfo` remains `None` and the code [eventually decides](https://github.com/python/cpython/blob/4867f717e21c3b5f0ad0e81f950c69dac6c95e6e/Lib/tarfile.py#L2881-L2882) that there are no more entries in the tar file.

I initially ran into this issue due to reports of invalid sdists being generated by maturin.
See: https://github.com/PyO3/maturin/issues/2855

### CPython versions tested on:

3.9, 3.10, 3.11, 3.12, 3.13, 3.14

### Operating systems tested on:

macOS, Linux

### Linked PRs
* gh-143157
* gh-143934
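To make the type override from the bug description concrete, here is a minimal sketch of the check in isolation. It is a simplified model, not the actual `tarfile` implementation: the helper `apply_v7_directory_heuristic` and the sample names are hypothetical, and only the constants `tarfile.AREGTYPE`, `tarfile.REGTYPE`, and `tarfile.DIRTYPE` come from the standard library. It assumes the linked logic is the old V7 convention that a regular file whose name ends in `/` is really a directory.

```python
import tarfile


def apply_v7_directory_heuristic(typeflag: bytes, header_name: str) -> bytes:
    """Hypothetical helper modeling the check described in the report.

    An AREGTYPE entry whose header name ends with "/" is reclassified
    as a directory; every other entry keeps its type flag.
    """
    if typeflag == tarfile.AREGTYPE and header_name.endswith("/"):
        return tarfile.DIRTYPE
    return typeflag


# Hypothetical sample data: the real name travels in the GNU long name
# record, while the name field of the following header holds leftover data.
long_name = "pkg/" + "a" * 120 + "/data.bin"
leftover_name_field = "pkg/placeholder/"

# The check runs against the leftover name field, so the type flips.
parsed_type = apply_v7_directory_heuristic(tarfile.AREGTYPE, leftover_name_field)

# Run against the real long name, the entry would stay a regular file.
expected_type = apply_v7_directory_heuristic(tarfile.AREGTYPE, long_name)

print(parsed_type == tarfile.DIRTYPE)
print(expected_type == tarfile.AREGTYPE)
```

In this model the misclassification depends only on the leftover name field ending in `/`, which is why the problem needs both the GNU long name record and the AREGTYPE flag to appear together.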