René's URL Explorer Experiment


Title: gh-101178: refactor base64.b85encode to be memory friendly by romuald · Pull Request #112248 · python/cpython · GitHub

Description:

Current description

Rewrote the base64._85encode method logic in C, by plugging in the binascii module (which already takes care of the base64 methods).

By using C and a single buffer, memory use is reduced to a minimum, addressing the initial issue. It also greatly improves performance as a bonus:

main
    SMALL (11 bytes): 1575427 iterations (1.27 µs per call, 115.41 ns per byte)
    MEDIUM (200 bytes): 204909 iterations (9.76 µs per call, 48.80 ns per byte)
    BIG (5000 bytes): 8623 iterations (231.94 µs per call, 46.39 ns per byte)
    VERYBIG (500000 bytes): 81 iterations (24.69 ms per call, 49.38 ns per byte)

branch
    SMALL (11 bytes): 11230718 iterations (178.08 ns per call, 16.19 ns per byte)
    MEDIUM (200 bytes): 6004721 iterations (333.07 ns per call, 1.67 ns per byte)
    BIG (5000 bytes): 458005 iterations (4.37 µs per call, 873.35 ps per byte)
    VERYBIG (500000 bytes): 4772 iterations (419.11 µs per call, 838.22 ps per byte)

Script used to test: https://gist.github.com/romuald/7aeba5f40693bb351da4abe62ad7321d

Previous description (Python refactor), not up to date with the current PR

Refactor the code to use generators instead of allocating 2 potentially huge lists for large datasets.

Memory gain was only measured on macOS, with a 5 MB input.

Using main:
    Before encoding
        Physical footprint: 16.3M
        Physical footprint (peak): 21.3M
    After encoding
        Physical footprint: 45.0M
        Physical footprint (peak): 244.1M

With refactor:
    Before encoding
        Physical footprint: 14.6M
        Physical footprint (peak): 19.6M
    After encoding
        Physical footprint: 28.5M
        Physical footprint (peak): 34.4M

The execution time is more than doubled, which may not be acceptable. However, the memory used is reduced by more than 90%.

edit: changed the algorithm to be more efficient; the performance decrease now seems to be negligible.

I also have no idea how (and if) I should test this.

Here is the script I've used to measure the execution time. The memdebug function can probably be adapted to read /proc/{pid} on Linux.

edit: updated to work on Linux too

    import os
    import sys
    import random
    import hashlib
    import platform
    import subprocess
    from time import time
    from base64 import b85encode

    def memdebug():
        # Print the peak memory use of the current process.
        if platform.system() == "Darwin":
            # malloc_history only works when MallocStackLogging is set
            if not os.environ.get("MallocStackLogging"):
                return
            res = subprocess.check_output(["malloc_history", str(os.getpid()), "-highWaterMark", "-allBySize"])
            for line in res.splitlines():
                if line.startswith(b"Physical"):
                    print(line.decode())
        elif platform.system() == "Linux":
            with open(f"/proc/{os.getpid()}/status") as reader:
                for line in reader:
                    if line.startswith("VmPeak:"):
                        print(line, end="")

    def main():
        # use a stable input
        rnd = random.Random()
        rnd.seed(42)
        data = rnd.randbytes(5_000_000)
        memdebug()
        start = time()
        import pdb
        try:
            res = b85encode(data)
        except Exception:
            # pdb.post_mortem()
            raise
        end = time()
        memdebug()
        print("Data length:", len(data))
        print("Output length:", len(res))
        print(f"Encode time: {end-start:.3f}s")
        h = hashlib.md5(res).hexdigest()
        print("Hashed result", h)
        assert h == "ad97e45ba085865e70f7aa05c9a31388"

    if __name__ == '__main__':
        main()

Issue: gh-101178
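
The description above credits the memory savings to encoding into a single buffer. The PR does this in C inside binascii; the block below is not that code. It is only a minimal pure-Python sketch of the same idea, assuming the standard base85 alphabet used by base64.b85encode and no padding of the output. The names b85encode_single_buffer and _encode_word are hypothetical and exist only for this illustration.

```python
import struct
from base64 import b85encode

# Alphabet assumed to match the one used by base64.b85encode.
_B85_ALPHABET = (b"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                 b"abcdefghijklmnopqrstuvwxyz!#$%&()*+-;<=>?@^_`{|}~")


def _encode_word(word, out, pos):
    """Write the 5 base-85 digits of a 32-bit word into out[pos:pos + 5]."""
    for i in range(4, -1, -1):
        word, digit = divmod(word, 85)
        out[pos + i] = _B85_ALPHABET[digit]


def b85encode_single_buffer(data):
    """Hypothetical sketch: encode into one preallocated output buffer."""
    view = memoryview(data)
    padding = (-len(view)) % 4          # null bytes needed to reach a multiple of 4
    n_words = (len(view) + padding) // 4
    out = bytearray(n_words * 5)        # the only large allocation made while encoding
    full = len(view) - (len(view) % 4)  # length of the part made of whole 4-byte words

    pos = 0
    for offset in range(0, full, 4):
        (word,) = struct.unpack_from(">I", view, offset)
        _encode_word(word, out, pos)
        pos += 5

    if padding:
        # Pad the trailing partial word with null bytes, encode it,
        # then drop as many output characters as padding bytes were added.
        tail = bytes(view[full:]) + b"\0" * padding
        (word,) = struct.unpack(">I", tail)
        _encode_word(word, out, pos)
        del out[-padding:]

    # bytes(out) makes one final copy; returning the bytearray would avoid it.
    return bytes(out)


# Quick check against the standard library implementation.
assert b85encode_single_buffer(b"hello world") == b85encode(b"hello world")
```

The loop never builds intermediate lists of words or characters, which is the property the description attributes to the single-buffer approach. As interpreted Python, it should not be expected to reproduce the timings quoted above.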

Opengraph URL: https://github.com/python/cpython/pull/112248

X: @github

Domain: github.com

Links:

python https://github.com/python
cpythonhttps://github.com/python/cpython
Fork 33.9k https://github.com/login?return_to=%2Fpython%2Fcpython
Star 71.1k https://github.com/login?return_to=%2Fpython%2Fcpython
Code https://github.com/python/cpython
Issues 5k+ https://github.com/python/cpython/issues
Pull requests 2.1k https://github.com/python/cpython/pulls
Actions https://github.com/python/cpython/actions
Projects 31 https://github.com/python/cpython/projects
Security https://github.com/python/cpython/security
Insights https://github.com/python/cpython/pulse
romualdhttps://github.com/romuald
python:mainhttps://github.com/python/cpython/tree/main
romuald:gh-101178-b58encode-memusehttps://github.com/romuald/cpython/tree/gh-101178-b58encode-memuse
Conversation 26 https://github.com/python/cpython/pull/112248
Commits 7 https://github.com/python/cpython/pull/112248/commits
Checks 37 https://github.com/python/cpython/pull/112248/checks
Files changed 4 https://github.com/python/cpython/pull/112248/files
gh-101178: refactor base64.b85encode to be memory friendly https://github.com/python/cpython/pull/112248/checks#top
Check labels on: pull_request https://github.com/python/cpython/actions/runs/13379131245
DO-NOT-MERGE https://github.com/python/cpython/actions/runs/13379131245/job/37364416507?pr=112248
Unresolved review https://github.com/python/cpython/actions/runs/13379131245/job/37364416701?pr=112248
Lint on: pull_request https://github.com/python/cpython/actions/runs/13379131248
lint https://github.com/python/cpython/actions/runs/13379131248/job/37364416929?pr=112248
Tests on: pull_request https://github.com/python/cpython/actions/runs/13379131251
Change detection / Create context from changed files https://github.com/python/cpython/actions/runs/13379131251/job/37364417053?pr=112248
Docs https://github.com/python/cpython/actions/runs/13379131251/job/37364428401?pr=112248
Check if Autoconf files are up to date https://github.com/python/cpython/actions/runs/13379131251/job/37364427821?pr=112248
Check if generated files are up to date https://github.com/python/cpython/actions/runs/13379131251/job/37364428157?pr=112248
Windows / build and test (x64) https://github.com/python/cpython/actions/runs/13379131251/job/37364431135?pr=112248
Windows (free-threading) / build and test (x64) https://github.com/python/cpython/actions/runs/13379131251/job/37364431303?pr=112248
Windows / build (arm64) https://github.com/python/cpython/actions/runs/13379131251/job/37364430650?pr=112248
Windows (free-threading) / build (arm64) https://github.com/python/cpython/actions/runs/13379131251/job/37364430979?pr=112248
Windows / build and test (Win32) https://github.com/python/cpython/actions/runs/13379131251/job/37364430824?pr=112248
Windows MSI https://github.com/python/cpython/actions/runs/13379131251/job/37364429250?pr=112248
macOS / build and test (ghcr.io/cirruslabs/macos-runner:sonoma) https://github.com/python/cpython/actions/runs/13379131251/job/37364431695?pr=112248
macOS (free-threading) / build and test (ghcr.io/cirruslabs/macos-runner:sonoma) https://github.com/python/cpython/actions/runs/13379131251/job/37364431499?pr=112248
macOS / build and test (macos-13) https://github.com/python/cpython/actions/runs/13379131251/job/37364431905?pr=112248
Ubuntu / build and test (ubuntu-24.04) https://github.com/python/cpython/actions/runs/13379131251/job/37364432248?pr=112248
Ubuntu / build and test (ubuntu-22.04-arm) https://github.com/python/cpython/actions/runs/13379131251/job/37364432060?pr=112248
Ubuntu (free-threading) / build and test (ubuntu-24.04) https://github.com/python/cpython/actions/runs/13379131251/job/37364432802?pr=112248
Ubuntu (free-threading) / build and test (ubuntu-22.04-arm) https://github.com/python/cpython/actions/runs/13379131251/job/37364432405?pr=112248
Ubuntu (bolt) / build and test (ubuntu-24.04) https://github.com/python/cpython/actions/runs/13379131251/job/37364432588?pr=112248
Ubuntu SSL tests with OpenSSL (ubuntu-24.04, 3.0.15) https://github.com/python/cpython/actions/runs/13379131251/job/37364428808?pr=112248
Ubuntu SSL tests with OpenSSL (ubuntu-24.04, 3.1.7) https://github.com/python/cpython/actions/runs/13379131251/job/37364428941?pr=112248
Ubuntu SSL tests with OpenSSL (ubuntu-24.04, 3.2.3) https://github.com/python/cpython/actions/runs/13379131251/job/37364429082?pr=112248
Ubuntu SSL tests with OpenSSL (ubuntu-24.04, 3.3.2) https://github.com/python/cpython/actions/runs/13379131251/job/37364429431?pr=112248
Ubuntu SSL tests with OpenSSL (ubuntu-24.04, 3.4.0) https://github.com/python/cpython/actions/runs/13379131251/job/37364429751?pr=112248
WASI / build and test https://github.com/python/cpython/actions/runs/13379131251/job/37364430444?pr=112248
Hypothesis tests on Ubuntu https://github.com/python/cpython/actions/runs/13379131251/job/37364430303?pr=112248
Address sanitizer (ubuntu-24.04) https://github.com/python/cpython/actions/runs/13379131251/job/37364428666?pr=112248
Thread sanitizer / Thread sanitizer https://github.com/python/cpython/actions/runs/13379131251/job/37364433000?pr=112248
Thread sanitizer (free-threading) / Thread sanitizer https://github.com/python/cpython/actions/runs/13379131251/job/37364433177?pr=112248
Cross build Linux https://github.com/python/cpython/actions/runs/13379131251/job/37364430009?pr=112248
CIFuzz (address) https://github.com/python/cpython/actions/runs/13379131251/job/37364429562?pr=112248
CIFuzz (undefined) https://github.com/python/cpython/actions/runs/13379131251/job/37364429887?pr=112248
CIFuzz (memory) https://github.com/python/cpython/actions/runs/13379131251/job/37364430150?pr=112248
All required checks pass https://github.com/python/cpython/actions/runs/13379131251/job/37365189058?pr=112248
ClusterFuzzLite/CIFuzz https://github.com/python/cpython/pull/112248/checks?check_run_id=37365059541