Title: Low decompression speed in multithreading · Issue #269 · python-lz4/python-lz4 · GitHub
Description: As far as I understand, the GIL should be dropped when calling the underlying LZ4 C library during compression and decompression (see lz4.block.decompress). I only see a minor speedup when using multiple threads for decompression using pyt...
Opengraph URL: https://github.com/python-lz4/python-lz4/issues/269
Author: Dalbasar (https://github.com/Dalbasar)
Date published: 2023-01-02T21:26:52.000Z
Comments: 5
URL: https://github.com/python-lz4/python-lz4/issues/269

As far as I understand, the GIL should be dropped when calling the underlying LZ4 C library during compression and decompression, see [lz4.block.decompress](https://github.com/python-lz4/python-lz4/blob/58df0834b57d485f2483bf5ccf1007c313b25557/lz4/block/_block.c#L355-L361).

I only see a minor speedup when using multiple threads for decompression with python-lz4 4.3.2. This happens both on a 6-core Intel i5-8400 running Debian 11 and on an AMD Ryzen 5900X running Windows 10, with either Python 3.11.1 or 3.8.10.

Compression speed seems to increase almost linearly with the number of threads.

The following code gives me about 4500 MB/s decompression speed with 6 threads (a slight underestimate, due to some overhead from starting the threads) and ~4300 MB/s with 1 thread on the AMD Ryzen 5900X on Windows 10. Using lz4.frame yields similar results. Using py-lz4framed instead gives me about 13000 MB/s with 6 threads and ~8300 MB/s with 1 thread (I am not sure the compression settings are the same, but at the very least there is some speedup from multithreading).

```python
import os
import threading
import time
import lz4.block

size_mb = 2000
n_threads = 6

# Build size_mb MiB of input by repeating one random 1 KiB block.
input_data = size_mb * 1024 * os.urandom(1024)
compressed = lz4.block.compress(input_data)
input_data = None  # release the uncompressed copy before measuring


def decompress(data):
    # Record wall-clock time and per-thread CPU time around the call.
    start_time = time.perf_counter()
    thread_start_time = time.thread_time()
    lz4.block.decompress(data)
    stop_time = time.perf_counter()
    thread_stop_time = time.thread_time()
    print(f"{threading.current_thread()} Decompression took {(stop_time-start_time)*1000:.3f}ms "
          f"(Thread time: {(thread_stop_time-thread_start_time)*1000:.3f}ms)\n")


threads = [threading.Thread(name=str(i), target=decompress, args=(compressed,))
           for i in range(n_threads)]

start_thread_time = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
done_thread_time = time.perf_counter()
duration = done_thread_time - start_thread_time
MBs = n_threads * size_mb / duration
print(f"Total time: {duration}s : {MBs}MB/s")
```
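To separate thread start-up cost from the decompression itself, the measurement above could be restructured roughly as follows. This is only a sketch, not part of the original report: `measure_throughput` is a hypothetical helper name, and the standard-library `zlib.decompress` is used as a stand-in so the snippet runs without python-lz4 installed. With python-lz4 available, `lz4.block.decompress` and an `lz4.block.compress`-ed payload would be passed instead.

```python
import os
import threading
import time
import zlib


def measure_throughput(decompress, payload, raw_size, n_threads):
    """Return aggregate MiB/s when n_threads each decompress payload once.

    `decompress` is any callable taking the compressed bytes (hypothetical
    helper; zlib.decompress below is only a stand-in for lz4.block.decompress).
    """
    # All workers and the timing thread meet at the barrier, so the cost of
    # creating and starting the threads is mostly kept out of the interval.
    barrier = threading.Barrier(n_threads + 1)

    def worker():
        barrier.wait()
        decompress(payload)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    barrier.wait()
    start = time.perf_counter()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return n_threads * raw_size / (1024 * 1024) / elapsed


# Small input (8 MiB) built the same way as in the report: one random
# 1 KiB block repeated many times.
raw = os.urandom(1024) * 1024 * 8
payload = zlib.compress(raw, 1)

for n in (1, 2, 4):
    rate = measure_throughput(zlib.decompress, payload, len(raw), n)
    print(f"{n} thread(s): {rate:.0f} MiB/s")
```

If the aggregate rate stays roughly flat as the thread count grows, the workers are effectively serialized somewhere; if it scales, the time lost in the original script is more likely start-up overhead.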