Title: `BaseSubprocessTransport.__del__` fails if the event loop is already closed, which can leak an orphan process · Issue #114177 · python/cpython · GitHub
Open Graph Title: `BaseSubprocessTransport.__del__` fails if the event loop is already closed, which can leak an orphan process · Issue #114177 · python/cpython
X Title: `BaseSubprocessTransport.__del__` fails if the event loop is already closed, which can leak an orphan process · Issue #114177 · python/cpython
Description: Bug report Bug description: there is a race where it's possible for BaseSubprocessTransport.__del__ to try to close the transport after the event loop has been closed. this results in an unraisable exception in __del__, and it can also r...
Open Graph Description: Bug report Bug description: there is a race where it's possible for BaseSubprocessTransport.__del__ to try to close the transport after the event loop has been closed. this results in an unraisable...
X Description: Bug report Bug description: there is a race where it's possible for BaseSubprocessTransport.__del__ to try to close the transport after the event loop has been closed. this results in an unrais...
Opengraph URL: https://github.com/python/cpython/issues/114177
X: @github
Domain: github.com
{"@context":"https://schema.org","@type":"DiscussionForumPosting","headline":"`BaseSubprocessTransport.__del__` fails if the event loop is already closed, which can leak an orphan process","articleBody":"# Bug report\n\n### Bug description:\n\nthere is a race where it's possible for `BaseSubprocessTransport.__del__` to try to close the transport *after* the event loop has been closed. this results in an unraisable exception in `__del__`, and it can also result in an orphan process being leaked.\n\nthe following is a reproducer that triggers the race between `run()` exiting and [the process dying and the event loop learning about the process's death]. on my machine, with this reproducer the bug occurs (due to `run()` winning the race) maybe 90% of the time:\n\n```python\nfrom __future__ import annotations\n\nimport asyncio\nfrom subprocess import PIPE\n\n\nasync def main() -\u003e None:\n try:\n async with asyncio.timeout(1):\n process = await asyncio.create_subprocess_exec(\n \"/usr/bin/env\",\n \"sh\",\n \"-c\",\n \"while true; do sleep 1; done\",\n stdin=PIPE,\n stdout=PIPE,\n stderr=PIPE,\n )\n try:\n await process.wait()\n except BaseException:\n process.kill()\n # N.B.: even though they send it SIGKILL, the user is (very briefly)\n # leaking an orphan process to asyncio, because they are not waiting for\n # the event loop to learn that the process died. if we added\n # while True:\n # try:\n # await process.wait()\n # except CancelledError:\n # pass\n # else:\n # break\n # (i.e. if we used structured concurrency) then the race would not\n # occur.\n raise\n except TimeoutError:\n pass\n\n\nif __name__ == \"__main__\":\n asyncio.run(main())\n```\n\nmost of the time, running this emits\n\n```\nException ignored in: \u003cfunction BaseSubprocessTransport.__del__ at 0x7fd25db428e0\u003e\nTraceback (most recent call last):\n File \"/usr/lib/python3.11/asyncio/base_subprocess.py\", line 126, in __del__\n self.close()\n File \"/usr/lib/python3.11/asyncio/base_subprocess.py\", line 104, in close\n proto.pipe.close()\n File \"/usr/lib/python3.11/asyncio/unix_events.py\", line 566, in close\n self._close(None)\n File \"/usr/lib/python3.11/asyncio/unix_events.py\", line 590, in _close\n self._loop.call_soon(self._call_connection_lost, exc)\n File \"/usr/lib/python3.11/asyncio/base_events.py\", line 761, in call_soon\n self._check_closed()\n File \"/usr/lib/python3.11/asyncio/base_events.py\", line 519, in _check_closed\n raise RuntimeError('Event loop is closed')\nRuntimeError: Event loop is closed\n```\n\nthis case looks similar to GH-109538. i think the following patch (analogous to GH-111983) fixes it:\n\n```diff\ndiff --git a/Lib/asyncio/base_subprocess.py b/Lib/asyncio/base_subprocess.py\nindex 6dbde2b696..ba219ba39d 100644\n--- a/Lib/asyncio/base_subprocess.py\n+++ b/Lib/asyncio/base_subprocess.py\n@@ -123,8 +123,11 @@ def close(self):\n\n def __del__(self, _warn=warnings.warn):\n if not self._closed:\n- _warn(f\"unclosed transport {self!r}\", ResourceWarning, source=self)\n- self.close()\n+ if self._loop.is_closed():\n+ _warn(\"loop is closed\", ResourceWarning, source=self)\n+ else:\n+ _warn(f\"unclosed transport {self!r}\", ResourceWarning, source=self)\n+ self.close()\n\n def get_pid(self):\n return self._pid\n```\n\nhowever, there is another case for which the above patch is not sufficient. in the above example the user orphaned the process after sending `SIGKILL`/`TerminateProcess` (which is not immediate, but only schedules the kill), but what if they fully orphan it?\n\n```python\nfrom __future__ import annotations\n\nimport asyncio\nfrom subprocess import PIPE\n\n\nasync def main_leak_subprocess() -\u003e None:\n await asyncio.create_subprocess_exec(\n \"/usr/bin/env\",\n \"sh\",\n \"-c\",\n \"while true; do sleep 1; done\",\n stdin=PIPE,\n stdout=PIPE,\n stderr=PIPE,\n )\n\n\nif __name__ == \"__main__\":\n asyncio.run(main_leak_subprocess())\n```\n\ncurrently (on `main`), when the race condition occurs (for this example the condition is `run()` winning the race against `BaseSubprocessTransport` GC) then asyncio emits a loud complaint `Exception ignored in: \u003cfunction BaseSubprocessTransport.__del__ at 0x7f5b3b291e40\u003e` and leaks the orphan process (check `htop` after the interpreter exits!). asyncio probably also leaks the pipes.\n\nbut with the patch above, asyncio will quietly leak the orphan process (and probably pipes), but it will not yell about the leak unless the user enables `ResourceWarning`s. which is not good.\n\nso a more correct patch (fixes both cases) may be something along the lines of\n\n```diff\ndiff --git a/Lib/asyncio/base_subprocess.py b/Lib/asyncio/base_subprocess.py\nindex 6dbde2b696..9c86c8444c 100644\n--- a/Lib/asyncio/base_subprocess.py\n+++ b/Lib/asyncio/base_subprocess.py\n@@ -123,8 +123,31 @@ def close(self):\n\n def __del__(self, _warn=warnings.warn):\n if not self._closed:\n- _warn(f\"unclosed transport {self!r}\", ResourceWarning, source=self)\n- self.close()\n+ if not self._loop.is_closed():\n+ _warn(f\"unclosed transport {self!r}\", ResourceWarning, source=self)\n+ self.close()\n+ else:\n+ _warn(\"loop is closed\", ResourceWarning, source=self)\n+\n+ # self.close() requires the event loop to be open, so we need to reach\n+ # into close() and its dependencies and manually do the bare-minimum\n+ # cleanup that they'd do if the loop was open. I.e. we do the syscalls\n+ # only; we can't interact with an event loop.\n+\n+ # TODO: Probably need some more stuff here too so that we don't leak fd's...\n+\n+ if (self._proc is not None and\n+ # has the child process finished?\n+ self._returncode is None and\n+ # the child process has finished, but the\n+ # transport hasn't been notified yet?\n+ self._proc.poll() is None):\n+\n+ try:\n+ self._proc.kill()\n+ except (ProcessLookupError, PermissionError):\n+ # the process may have already exited or may be running setuid\n+ pass\n\n def get_pid(self):\n return self._pid\n```\n\nwith this patch applied, neither example leaks an orphan process out of `run()`, and both examples emit `ResourceWarning`. however this patch is rather messy. it is also perhaps still leaking pipe fd's out of `run()`. (the fd's probably get closed by the OS when the interpreter shuts down, but i suspect one end of each pipe will be an orphan from the time when `run()` exits to the time when the interpreter shuts down, which can be arbitrarily long).\n\n### CPython versions tested on:\n\n3.11, CPython main branch\n\n### Operating systems tested on:\n\nLinux\n\n\u003c!-- gh-linked-prs --\u003e\n### Linked PRs\n* gh-134508\n* gh-134561\n* gh-134562\n\u003c!-- /gh-linked-prs --\u003e\n","author":{"url":"https://github.com/gschaffner","@type":"Person","name":"gschaffner"},"datePublished":"2024-01-17T08:08:30.000Z","interactionStatistic":{"@type":"InteractionCounter","interactionType":"https://schema.org/CommentAction","userInteractionCount":6},"url":"https://github.com/114177/cpython/issues/114177"}
| route-pattern | /_view_fragments/issues/show/:user_id/:repository/:id/issue_layout(.:format) |
| route-controller | voltron_issues_fragments |
| route-action | issue_layout |
| fetch-nonce | v2:10f8d6df-7ec7-8476-b622-15be3318d4e4 |
| current-catalog-service-hash | 81bb79d38c15960b92d99bca9288a9108c7a47b18f2423d0f6438c5b7bcd2114 |
| request-id | C9C8:1D772C:22B6C1F:2DD4F5E:696B156F |
| html-safe-nonce | ef10d716b0943003405acfdd39e0883a6431908ae0e3be6b2e72071fa0804a12 |
| visitor-payload | eyJyZWZlcnJlciI6IiIsInJlcXVlc3RfaWQiOiJDOUM4OjFENzcyQzoyMkI2QzFGOjJERDRGNUU6Njk2QjE1NkYiLCJ2aXNpdG9yX2lkIjoiNDI3NjczNjIyMjQ1NDAyNzYzMSIsInJlZ2lvbl9lZGdlIjoiaWFkIiwicmVnaW9uX3JlbmRlciI6ImlhZCJ9 |
| visitor-hmac | f9bb2d7dacbf98453e69e71239be86163701f1a83dd3cacef626cc124c24ee44 |
| hovercard-subject-tag | issue:2085638107 |
| github-keyboard-shortcuts | repository,issues,copilot |
| google-site-verification | Apib7-x98H0j5cPqHWwSMm6dNU4GmODRoqxLiDzdx9I |
| octolytics-url | https://collector.github.com/github/collect |
| analytics-location | / |
| fb:app_id | 1401488693436528 |
| apple-itunes-app | app-id=1477376905, app-argument=https://github.com/_view_fragments/issues/show/python/cpython/114177/issue_layout |
| twitter:image | https://opengraph.githubassets.com/436851f8416f2282f045839dca69c838e99b28d3f3561702b12b3f7b0a274e02/python/cpython/issues/114177 |
| twitter:card | summary_large_image |
| og:image | https://opengraph.githubassets.com/436851f8416f2282f045839dca69c838e99b28d3f3561702b12b3f7b0a274e02/python/cpython/issues/114177 |
| og:image:alt | Bug report Bug description: there is a race where it's possible for BaseSubprocessTransport.__del__ to try to close the transport after the event loop has been closed. this results in an unraisable... |
| og:image:width | 1200 |
| og:image:height | 600 |
| og:site_name | GitHub |
| og:type | object |
| og:author:username | gschaffner |
| hostname | github.com |
| expected-hostname | github.com |
| None | 5f99f7c1d70f01da5b93e5ca90303359738944d8ab470e396496262c66e60b8d |
| turbo-cache-control | no-preview |
| go-import | github.com/python/cpython git https://github.com/python/cpython.git |
| octolytics-dimension-user_id | 1525981 |
| octolytics-dimension-user_login | python |
| octolytics-dimension-repository_id | 81598961 |
| octolytics-dimension-repository_nwo | python/cpython |
| octolytics-dimension-repository_public | true |
| octolytics-dimension-repository_is_fork | false |
| octolytics-dimension-repository_network_root_id | 81598961 |
| octolytics-dimension-repository_network_root_nwo | python/cpython |
| turbo-body-classes | logged-out env-production page-responsive |
| disable-turbo | false |
| browser-stats-url | https://api.github.com/_private/browser/stats |
| browser-errors-url | https://api.github.com/_private/browser/errors |
| release | 82560a55c6b2054555076f46e683151ee28a19bc |
| ui-target | full |
| theme-color | #1e2327 |
| color-scheme | light dark |
Links:
Viewport: width=device-width