Title: In-place sampling · Issue #8 · randomized-algorithm/random · GitHub
Description: If you want to sample k items out of n, Fisher Yates shuffles your item array. Hereunder is an efficient no shuffle implementation. Something better can probably be achieved with prefix sums and rank/select. Theorem We can sample k items...
Opengraph URL: https://github.com/randomized-algorithm/random/issues/8
Author: make-github-pseudonymous-again (https://github.com/make-github-pseudonymous-again)
Date Published: 2017-03-03T17:37:54.000Z
Comments: 1

If you want to sample k items out of n, Fisher-Yates shuffles your item array.
Below is an efficient implementation that does not shuffle. Something better can probably be achieved with prefix sums and rank/select.

#### Theorem

We can sample k items out of n with k random draws and O(k log k) time, without modifying the items array.

#### Proof

TODO fix proof

Given n items, at draw i = 1, 2, ..., k, keep a binary search tree containing the indices (between 0 and n - 1) of all previously drawn items, and pick a random integer x between 0 and n - i inclusive. Binary search for the corresponding index value as follows. Start at the root with p = 0, where p is the number of drawn indices that precede the current subtree. Let l be the size of the left subtree of the current node and let v be the value of the current node. If x + p + l + 1 <= v, the current node becomes the left child; otherwise the current node becomes the right child and p = p + l + 1. If there is no such child, insert x + p in the case of a left child, or x + p + l + 1 in the case of a right child (with p taken before the update).

If x + p + l + 1 <= v, then x cannot be in the right subtree of v. If it were the right child of v, the item index corresponding to x would be x + p + l + 1, which is not larger than v. Pick any node c in the right subtree of v. Let g be the number of predecessors of c in the right subtree of v. The value of c is at least v + 1 + g because all indices are distinct. Suppose c has no right child. The item index of x as a right child of c is x + p + l + 1 + g + 1 <= v + g + 1 <= c. Suppose c has no left child. The item index of x as a left child of c is x + p + l + 1 + g <= v + g < v + g + 1 <= c.

If x + p + l + 1 > v, then x cannot be in the left subtree of v. If v has no left child, then l = 0 and the value of x as the left child of v would be x + p, which is larger than or equal to v (indices must be distinct). Pick any node c in the left subtree of v. Let s be the number of successors of c in the left subtree of v. The value of c is at most v - s - 1 because all indices are distinct. Suppose c has no left child. The item index of x as a left child of c is x + p + l - s - 1 > v - s - 2, so x + p + l - s - 1 >= v - s - 1 >= c.
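The search described in the proof can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the repository's implementation: the names `Node`, `left_size`, `sample_indices` and the `rng` parameter are hypothetical. It descends left when x + p + l + 1 <= v, so the equality case counts as a left descent, and right otherwise. The tree is not rebalanced, so the O(log k) depth per draw holds only in expectation here; a balanced tree with subtree sizes would be needed for a worst-case bound.

```python
import random


class Node:
    """BST node holding one drawn index and the size of its left subtree."""

    __slots__ = ("value", "left_size", "left", "right")

    def __init__(self, value):
        self.value = value
        self.left_size = 0
        self.left = None
        self.right = None


def sample_indices(n, k, rng=random):
    """Return k distinct indices from range(n), in draw order, using k draws."""
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    root = None
    drawn = []
    for i in range(1, k + 1):
        # n - i + 1 indices are still free; x is the rank of the one we pick.
        x = rng.randint(0, n - i)  # randint bounds are inclusive
        if root is None:
            root = Node(x)
            drawn.append(x)
            continue
        node, p = root, 0  # p: drawn indices known to precede the subtree
        while True:
            if x + p + node.left_size + 1 <= node.value:
                # The target is smaller than node.value: descend left.
                node.left_size += 1  # the new index will land in this subtree
                if node.left is None:
                    node.left = Node(x + p)
                    drawn.append(x + p)
                    break
                node = node.left
            else:
                # Skip node.value and everything in its left subtree.
                p += node.left_size + 1
                if node.right is None:
                    node.right = Node(x + p)
                    drawn.append(x + p)
                    break
                node = node.right
    return drawn
```

For example, `sample_indices(10, 3)` returns three distinct indices in `range(10)`, and the caller's item array is only read at those positions, never reordered.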