René's URL Explorer Experiment


Title: More strict rules for group numbers and names in RE · Issue #91760 · python/cpython · GitHub

Description: There were unintentional changes in parsing regular expressions between Python 2 and Python 3. Group references. In patterns and replacement strings you can refer a group by its number using syntax \N where N is a 1-2 digit decimal numbe...

Opengraph URL: https://github.com/python/cpython/issues/91760

Domain: github.com


Issue details (from the page's JSON-LD, schema.org type DiscussionForumPosting):

Headline: More strict rules for group numbers and names in RE
Author: serhiy-storchaka (https://github.com/serhiy-storchaka)
Date published: 2022-04-20T18:30:07.000Z
Comments: 2
URL: https://github.com/python/cpython/issues/91760

Article body:

There were unintentional changes in parsing regular expressions between Python 2 and Python 3.

1. Group references.

   In patterns and replacement strings you can refer to a group by its number using the syntax `\N`, where N is a 1-2 digit decimal number. The number must not start with 0, because that would make it an octal escape sequence. The group number can also be used in the conditional expression `(?(N)...)` in patterns and in references `\g<N>` in replacement strings. Interestingly, in Python 3 the group number need not be a plain sequence of decimal digits. The following things are allowed in the group number:

   * Initial zero: `\g<01>`.
   * Spaces around the number: `\g< 1 >`.
   * Underscores: `\g<1_2>`.
   * Non-decimal digits: `\g<¹>`.
   * Non-ASCII decimal digits: `\g<१>`.

   All this is purely an implementation artifact. After `\g<` we search for the nearest `>` and pass the substring between `<` and `>` to `int()`. A different implementation could search for the longest sequence of decimal digits, and all of the above examples (except maybe the first one) would be filtered out automatically.

2. Group names.

   In `(?P<name>...)`, `(?P=name)`, `(?(name)...)` and `\g<name>` we can refer to groups by name. To avoid ambiguity there is a limitation: the name must follow the rules for identifiers. In Python 2 this means that it may contain only letters, digits and underscores, and must start with a non-digit. Letters and digits are ASCII-only: [A-Za-z] and [0-9].

   In Python 3 identifiers can contain non-ASCII letters and digits. That is good. But in bytes patterns and replacement strings the codes `\xaa`, `\xb2`, `\xb3`, `\xb5`, `\xb9`, `\xba`, `\xc0`-`\xd6`, `\xd8`-`\xf6`, `\xf8`-`\xff` are allowed in the group name. They correspond to the characters `ª²³µ¹ºÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ` after decoding.

   This is an implementation artifact too. Bytes patterns and replacement strings are decoded with the Latin1 encoding for parsing, which simplifies and speeds up the code. There is no other reason why letters and digits in the range U+0080-U+00FF are allowed.

   Note that in Python 3 a bytes literal can only contain printable literal characters in the ASCII range. Codes outside of this range must be written as octal or hexadecimal escape sequences, so supporting non-ASCII letters and digits does not add to readability.

Since the above "features" are not intentional, are not supported by most other RE engines (except `regex`, which is also written in Python), are not tested, and could change as a result of refactoring the parser, I suggest introducing stricter rules on group numbers and names.

1. A group number should only contain ASCII decimal digits in the range [0-9]. An initial 0 is not allowed except for group number 0.
2. A group name in a bytes pattern or replacement string should only contain ASCII letters and digits.

The question: do we need a deprecation period for this? I have written code for both options (with deprecation and with error) and will create PRs tomorrow.
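The leniency described in the issue body, and the two proposed rules, can be illustrated with a minimal Python sketch. It does not reproduce the `re` parser; it only calls `int()` and `str.isidentifier()` directly, since the behavior of `re` itself depends on the Python version. The helper names `is_strict_group_number` and `is_strict_bytes_group_name` are hypothetical, and allowing underscores while requiring a non-digit first character in bytes names is an assumption carried over from the identifier rule the issue describes.

```python
import re

# Hypothetical helpers sketching the proposed stricter rules.
# Rule 1: ASCII decimal digits only, no leading zero except for "0" itself.
_STRICT_NUMBER = re.compile(r"0|[1-9][0-9]*")
# Rule 2: ASCII-only name in bytes patterns. Underscore and the non-digit
# first character are assumptions taken from the identifier rule.
_STRICT_BYTES_NAME = re.compile(rb"[A-Za-z_][A-Za-z0-9_]*")


def is_strict_group_number(text: str) -> bool:
    """Return True if text is an acceptable group number under rule 1."""
    return _STRICT_NUMBER.fullmatch(text) is not None


def is_strict_bytes_group_name(name: bytes) -> bool:
    """Return True if name is an acceptable bytes group name under rule 2."""
    return _STRICT_BYTES_NAME.fullmatch(name) is not None


# int() accepts all of these spellings, which is why passing the raw
# substring between "<" and ">" to int() lets them through.
for text in ["01", " 1 ", "1_2", "१"]:
    print(repr(text), "int() ->", int(text),
          "strict ->", is_strict_group_number(text))

# A Latin-1 decoded byte such as 0xE9 is a valid identifier character,
# which is how non-ASCII bytes can end up accepted in group names.
raw_name = b"caf\xe9"
print(raw_name.decode("latin-1").isidentifier(),
      is_strict_bytes_group_name(raw_name))
```

Matching against an explicit `[0-9]` class rather than `\d` keeps the check ASCII-only for `str` input, because `\d` also matches non-ASCII decimal digits unless `re.ASCII` is set.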

Links:

#91794 https://github.com/python/cpython/pull/91794
#91792 https://github.com/python/cpython/pull/91792
More strict rules for group numbers and names in RE https://github.com/python/cpython/issues/91760#top
3.11 (only security fixes) https://github.com/python/cpython/issues?q=state%3Aopen%20label%3A%223.11%22
topic-regex https://github.com/python/cpython/issues?q=state%3Aopen%20label%3A%22topic-regex%22
type-feature (A feature request or enhancement) https://github.com/python/cpython/issues?q=state%3Aopen%20label%3A%22type-feature%22
serhiy-storchaka https://github.com/serhiy-storchaka
on Apr 20, 2022 https://github.com/python/cpython/issues/91760#issue-1210062413
