Title: `random_combination_with_replacement` recipe has misleading docstring · Issue #102653 · python/cpython · GitHub
Open Graph Title: `random_combination_with_replacement` recipe has misleading docstring · Issue #102653 · python/cpython
X Title: `random_combination_with_replacement` recipe has misleading docstring · Issue #102653 · python/cpython
Description: Documentation The random module has four recipes that are supposed to "efficiently make random selections from the combinatoric iterators in the itertools module". And their docstrings all say "Random selection from [iterator]". Both sug...
Open Graph Description: Documentation The random module has four recipes that are supposed to "efficiently make random selections from the combinatoric iterators in the itertools module". And their docstrings all say "Ran...
X Description: Documentation The random module has four recipes that are supposed to "efficiently make random selections from the combinatoric iterators in the itertools module". And their docstrings al...
Opengraph URL: https://github.com/python/cpython/issues/102653
X: @github
Domain: github.com
{"@context":"https://schema.org","@type":"DiscussionForumPosting","headline":"`random_combination_with_replacement` recipe has misleading docstring","articleBody":"# Documentation\r\n\r\nThe `random` module has four [recipes](https://docs.python.org/3/library/random.html#recipes) that are supposed to *\"efficiently make random selections from the combinatoric iterators in the itertools module\"*. And their docstrings all say *\"Random selection from [iterator]\"*. Both suggest they're equivalent to `random.choice(list(iterator))`, just efficiently.\r\n\r\nFor example, `itertools.combinations_with_replacement([0, 1], r=4)` produces these five combinations:\r\n```\r\n(0, 0, 0, 0)\r\n(0, 0, 0, 1)\r\n(0, 0, 1, 1)\r\n(0, 1, 1, 1)\r\n(1, 1, 1, 1)\r\n```\r\nSo `random.choice(list(iterator))` would return one of those five with 20% probability each.\r\n\r\nBut the `random_combination_with_replacement` recipe instead produces these probabilities:\r\n```\r\n(0, 0, 0, 0) 6.25%\r\n(0, 0, 0, 1) 25.00%\r\n(0, 0, 1, 1) 37.50%\r\n(0, 1, 1, 1) 25.00%\r\n(1, 1, 1, 1) 6.25%\r\n```\r\n\r\nHere's an implementation that *is* equivalent to `random.choice(list(iterator))`:\r\n\r\n```python\r\ndef random_combination_with_replacement(iterable, r):\r\n \"Random selection from itertools.combinations_with_replacement(iterable, r)\"\r\n pool = tuple(iterable)\r\n n = len(pool)\r\n indices = sorted(random.sample(range(n+r-1), k=r))\r\n return tuple(pool[i-j] for j, i in enumerate(indices))\r\n```\r\n\r\n---\r\n\r\nOne can view the combinations as the result of actually simulating r random draws with replacement, where the multiset `{0,0,1,1}` indeed occurs more often, namely as `0011`, `0101`, `0110`, etc. But that is not the only valid view and isn't the view suggested by the documentation (as my first paragraph argued). Though if that view and the bias is the intention, then I suggest its documentation should mention the bias.\r\n\r\n\u003cdetails\u003e\u003csummary\u003eTest code\u003c/summary\u003e\r\n\r\n[Attempt This Online!](https://ato.pxeger.com/run?1=1ZXNbtQwEMc5cfBTWEVoE5GELqoQqrSn3vfAtaqs1Jmwbv2F7QghxJNw6QUeiqdh_JH9EIvY9oCET7E985-Znyf2tx_2c9gY_fDwfQpj--7n82dCWeMCdb0ejCJlJgK4YIz0ZHRGUW6kBB6E0Z4WiyszaTQiJJr2txLoil6fN3R5Qxx-XhBCXrTtToi2Rwch1gkdqsXWcFGT0TgMqW6p0DuBLq4I3acs2CcRNsyBlT0HBSgwp9FQV18SiiMLR6-6JJNqSYYBIxxLZoCxkGB74X6LxqISm5WOxD57n0Soh8JtF_uxxZwlwW3WqyeJJA0HYXK61NfxjREcKil8MUX1eibFJ-dQo9iiJxcW6BNI_Qs2Fr2QS5ishO12LlnjugRdRZO8IvSAZXtc99jGMFQHOHycfoBK1w29X0Uee-BygCh1LW5obFKROjQrzuisM9Z4GP7E7vQmy0q9_G8Y-l5F94LwlWuXf8PY3mWQd01mCXpSsRUxh0J1xnpllO2d8EaTdD-Mk-bR49F_a3OKy0lG2wPav2_qve-YY8eY7hUwljfWCG15Hkea8hD_6HKXJvuDs0lsWCkTma4LyfmCbPDY4WPcLweBeh0KKF_VpVMOb0J0WHyJPq_Xl2-7Ny-_Lur8EJT3YH4XfgE)\r\n\r\n```python\r\nimport random\r\nimport itertools\r\nfrom collections import Counter\r\n\r\niterable = [0, 1]\r\nr = 4\r\n\r\n\r\n#-- itertools ----------------------\r\n\r\nprint('itertools')\r\nfor comb in itertools.combinations_with_replacement(iterable, r):\r\n print(comb)\r\n\r\n\r\n#-- from iterator ------------------\r\n\r\ndef random_combination_with_replacement_from_iterator(iterable, r):\r\n \"Random selection from itertools.combinations_with_replacement(iterable, r)\"\r\n iterator = itertools.combinations_with_replacement(iterable, r)\r\n return random.choice(list(iterator))\r\n\r\n\r\n#-- current random recipe ----------\r\n\r\ndef random_combination_with_replacement(iterable, r):\r\n \"Random selection from itertools.combinations_with_replacement(iterable, r)\"\r\n pool = tuple(iterable)\r\n n = len(pool)\r\n indices = sorted(random.choices(range(n), k=r))\r\n return tuple(pool[i] for i in indices)\r\n\r\n\r\n#-- proposed random recipe ---------\r\n\r\ndef random_combination_with_replacement_proposal(iterable, r):\r\n \"Random selection from itertools.combinations_with_replacement(iterable, r)\"\r\n pool = tuple(iterable)\r\n n = len(pool)\r\n indices = sorted(random.sample(range(n+r-1), k=r))\r\n return tuple(pool[i-j] for j, i in enumerate(indices))\r\n\r\n\r\n#-- Comparison\r\n\r\nfor func in random_combination_with_replacement_from_iterator, random_combination_with_replacement, random_combination_with_replacement_proposal:\r\n print()\r\n print(func.__name__)\r\n N = 100000\r\n ctr = Counter(func(iterable, r) for _ in range(N))\r\n for comb, freq in sorted(ctr.items()):\r\n print(comb, f'{freq/N:6.2%}')\r\n```\r\n\r\n\u003c/details\u003e\r\n\r\n\u003cdetails\u003e\u003csummary\u003eTest results\u003c/summary\u003e\r\n\r\n```\r\nitertools\r\n(0, 0, 0, 0)\r\n(0, 0, 0, 1)\r\n(0, 0, 1, 1)\r\n(0, 1, 1, 1)\r\n(1, 1, 1, 1)\r\n\r\nrandom_combination_with_replacement_from_iterator\r\n(0, 0, 0, 0) 19.89%\r\n(0, 0, 0, 1) 20.08%\r\n(0, 0, 1, 1) 20.01%\r\n(0, 1, 1, 1) 19.88%\r\n(1, 1, 1, 1) 20.14%\r\n\r\nrandom_combination_with_replacement\r\n(0, 0, 0, 0) 6.14%\r\n(0, 0, 0, 1) 24.98%\r\n(0, 0, 1, 1) 37.71%\r\n(0, 1, 1, 1) 25.04%\r\n(1, 1, 1, 1) 6.13%\r\n\r\nrandom_combination_with_replacement_proposal\r\n(0, 0, 0, 0) 20.17%\r\n(0, 0, 0, 1) 19.82%\r\n(0, 0, 1, 1) 20.18%\r\n(0, 1, 1, 1) 19.88%\r\n(1, 1, 1, 1) 19.95%\r\n```\r\n\r\n\u003c/details\u003e\n\n\u003c!-- gh-linked-prs --\u003e\n### Linked PRs\n* gh-102742\n* gh-102754\n\u003c!-- /gh-linked-prs --\u003e\n","author":{"url":"https://github.com/pochmann","@type":"Person","name":"pochmann"},"datePublished":"2023-03-13T19:27:42.000Z","interactionStatistic":{"@type":"InteractionCounter","interactionType":"https://schema.org/CommentAction","userInteractionCount":7},"url":"https://github.com/102653/cpython/issues/102653"}
| route-pattern | /_view_fragments/issues/show/:user_id/:repository/:id/issue_layout(.:format) |
| route-controller | voltron_issues_fragments |
| route-action | issue_layout |
| fetch-nonce | v2:d03b4e1e-a5b2-d346-a904-f08f928bdd7d |
| current-catalog-service-hash | 81bb79d38c15960b92d99bca9288a9108c7a47b18f2423d0f6438c5b7bcd2114 |
| request-id | D5F2:14EC01:3BBBB:54E31:696A0BB4 |
| html-safe-nonce | 9ea5020639bf2af6509daec3448028b02d00b763ad4501e70d326f174c3d4f54 |
| visitor-payload | eyJyZWZlcnJlciI6IiIsInJlcXVlc3RfaWQiOiJENUYyOjE0RUMwMTozQkJCQjo1NEUzMTo2OTZBMEJCNCIsInZpc2l0b3JfaWQiOiIyMjkyMjk0NzQyODA1NDQ5NjUyIiwicmVnaW9uX2VkZ2UiOiJpYWQiLCJyZWdpb25fcmVuZGVyIjoiaWFkIn0= |
| visitor-hmac | 69f34762c4d5054ea2b4134ea4a80c7b0cf6e2fb6171ce8d64fe1cc7cae64a69 |
| hovercard-subject-tag | issue:1622118323 |
| github-keyboard-shortcuts | repository,issues,copilot |
| google-site-verification | Apib7-x98H0j5cPqHWwSMm6dNU4GmODRoqxLiDzdx9I |
| octolytics-url | https://collector.github.com/github/collect |
| analytics-location | / |
| fb:app_id | 1401488693436528 |
| apple-itunes-app | app-id=1477376905, app-argument=https://github.com/_view_fragments/issues/show/python/cpython/102653/issue_layout |
| twitter:image | https://opengraph.githubassets.com/8600ec77e3f1b1f6b28bb9d38cf2b459ca61056936271be3a7db6e26d8995c42/python/cpython/issues/102653 |
| twitter:card | summary_large_image |
| og:image | https://opengraph.githubassets.com/8600ec77e3f1b1f6b28bb9d38cf2b459ca61056936271be3a7db6e26d8995c42/python/cpython/issues/102653 |
| og:image:alt | Documentation The random module has four recipes that are supposed to "efficiently make random selections from the combinatoric iterators in the itertools module". And their docstrings all say "Ran... |
| og:image:width | 1200 |
| og:image:height | 600 |
| og:site_name | GitHub |
| og:type | object |
| og:author:username | pochmann |
| hostname | github.com |
| expected-hostname | github.com |
| None | 699227a00bbb7fe1eec276d2ae1c3a93068bc5ba483bd9dc4b2a27a8f4f2f595 |
| turbo-cache-control | no-preview |
| go-import | github.com/python/cpython git https://github.com/python/cpython.git |
| octolytics-dimension-user_id | 1525981 |
| octolytics-dimension-user_login | python |
| octolytics-dimension-repository_id | 81598961 |
| octolytics-dimension-repository_nwo | python/cpython |
| octolytics-dimension-repository_public | true |
| octolytics-dimension-repository_is_fork | false |
| octolytics-dimension-repository_network_root_id | 81598961 |
| octolytics-dimension-repository_network_root_nwo | python/cpython |
| turbo-body-classes | logged-out env-production page-responsive |
| disable-turbo | false |
| browser-stats-url | https://api.github.com/_private/browser/stats |
| browser-errors-url | https://api.github.com/_private/browser/errors |
| release | 7266b2d935baa1c6474b16dd9feaa5ca30607261 |
| ui-target | full |
| theme-color | #1e2327 |
| color-scheme | light dark |
Links:
Viewport: width=device-width