René's URL Explorer Experiment


Title: unicodedata module needs way of accurately determining XID_START and XID_CONTINUE properties. · Issue #129117 · python/cpython · GitHub

Description: Bug report Bug description: With the unicodedata module, it is possible to determine if a unicode character is a valid identifier start or identifier continuation character, but not in a few cases. The method is to look at unicodedata.ca...

Opengraph URL: https://github.com/python/cpython/issues/129117

X: @github

Domain: github.com


Hey, it has json ld scripts:
{"@context":"https://schema.org","@type":"DiscussionForumPosting","headline":"unicodedata module needs way of accurately determining XID_START and XID_CONTINUE properties.","articleBody":"# Bug report\n\n### Bug description:\n\nWith the `unicodedata` module, it is possible to determine if a unicode character is a valid identifier start or identifier continuation character, _but not in a few cases_.\nThe method is to look at `unicodedata.category(c)`.\nA start character has category in `\"Lu Ll Lt Lm Lo Nl Pc\".split()`.\nA continue character has category in `\"Lu Ll Lt Lm Lo Mn Mc Nd Nl Pc\".split()`.\n\nHowever, there are several codepoints which don't match these criteria, either because they are not that type of character or because their category is different.\nHere is a complete list of the exceptions, on Python 3.13 and Unicode version 16.0:\nShould be `XID_START` but are not:\n```\n005f Pc True LOW LINE\n037a Lm True GREEK YPOGEGRAMMENI\n0e33 Lo True THAI CHARACTER SARA AM\n0eb3 Lo True LAO VOWEL SIGN AM\n203f Pc True UNDERTIE\n2040 Pc True CHARACTER TIE\n2054 Pc True INVERTED UNDERTIE\n2e2f Lm True VERTICAL TILDE\nfc5e Lo True ARABIC LIGATURE SHADDA WITH DAMMATAN ISOLATED FORM\nfc5f Lo True ARABIC LIGATURE SHADDA WITH KASRATAN ISOLATED FORM\nfc60 Lo True ARABIC LIGATURE SHADDA WITH FATHA ISOLATED FORM\nfc61 Lo True ARABIC LIGATURE SHADDA WITH DAMMA ISOLATED FORM\nfc62 Lo True ARABIC LIGATURE SHADDA WITH KASRA ISOLATED FORM\nfc63 Lo True ARABIC LIGATURE SHADDA WITH SUPERSCRIPT ALEF ISOLATED FORM\nfdfa Lo True ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM\nfdfb Lo True ARABIC LIGATURE JALLAJALALOUHOU\nfe33 Pc True PRESENTATION FORM FOR VERTICAL LOW LINE\nfe34 Pc True PRESENTATION FORM FOR VERTICAL WAVY LOW LINE\nfe4d Pc True DASHED LOW LINE\nfe4e Pc True CENTRELINE LOW LINE\nfe4f Pc True WAVY LOW LINE\nfe70 Lo True ARABIC FATHATAN ISOLATED FORM\nfe72 Lo True ARABIC DAMMATAN ISOLATED FORM\nfe74 Lo True ARABIC KASRATAN ISOLATED FORM\nfe76 Lo True ARABIC FATHA 
ISOLATED FORM\nfe78 Lo True ARABIC DAMMA ISOLATED FORM\nfe7a Lo True ARABIC KASRA ISOLATED FORM\nfe7c Lo True ARABIC SHADDA ISOLATED FORM\nfe7e Lo True ARABIC SUKUN ISOLATED FORM\nff3f Pc True FULLWIDTH LOW LINE\nff9e Lm True HALFWIDTH KATAKANA VOICED SOUND MARK\nff9f Lm True HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK\n```\nShould not be `XID_START` but are:\n```\n1885 Mn False MONGOLIAN LETTER ALI GALI BALUDA\n1886 Mn False MONGOLIAN LETTER ALI GALI THREE BALUDA\n2118 Sm False SCRIPT CAPITAL P\n212e So False ESTIMATED SYMBOL\n```\nShould be `XID_CONTINUE` but are not:\n```\n037a Lm True GREEK YPOGEGRAMMENI\n2e2f Lm True VERTICAL TILDE\nfc5e Lo True ARABIC LIGATURE SHADDA WITH DAMMATAN ISOLATED FORM\nfc5f Lo True ARABIC LIGATURE SHADDA WITH KASRATAN ISOLATED FORM\nfc60 Lo True ARABIC LIGATURE SHADDA WITH FATHA ISOLATED FORM\nfc61 Lo True ARABIC LIGATURE SHADDA WITH DAMMA ISOLATED FORM\nfc62 Lo True ARABIC LIGATURE SHADDA WITH KASRA ISOLATED FORM\nfc63 Lo True ARABIC LIGATURE SHADDA WITH SUPERSCRIPT ALEF ISOLATED FORM\nfdfa Lo True ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM\nfdfb Lo True ARABIC LIGATURE JALLAJALALOUHOU\nfe70 Lo True ARABIC FATHATAN ISOLATED FORM\nfe72 Lo True ARABIC DAMMATAN ISOLATED FORM\nfe74 Lo True ARABIC KASRATAN ISOLATED FORM\nfe76 Lo True ARABIC FATHA ISOLATED FORM\nfe78 Lo True ARABIC DAMMA ISOLATED FORM\nfe7a Lo True ARABIC KASRA ISOLATED FORM\nfe7c Lo True ARABIC SHADDA ISOLATED FORM\nfe7e Lo True ARABIC SUKUN ISOLATED FORM\n```\nShould not be `XID_CONTINUE` but are:\n```\n00b7 Po False MIDDLE DOT\n0387 Po False GREEK ANO TELEIA\n1369 No False ETHIOPIC DIGIT ONE\n136a No False ETHIOPIC DIGIT TWO\n136b No False ETHIOPIC DIGIT THREE\n136c No False ETHIOPIC DIGIT FOUR\n136d No False ETHIOPIC DIGIT FIVE\n136e No False ETHIOPIC DIGIT SIX\n136f No False ETHIOPIC DIGIT SEVEN\n1370 No False ETHIOPIC DIGIT EIGHT\n1371 No False ETHIOPIC DIGIT NINE\n19da No False NEW TAI LUE THAM DIGIT ONE\n200c Cf False ZERO WIDTH NON-JOINER\n200d Cf False ZERO 
WIDTH JOINER\n2118 Sm False SCRIPT CAPITAL P\n212e So False ESTIMATED SYMBOL\n30fb Po False KATAKANA MIDDLE DOT\nff65 Po False HALFWIDTH KATAKANA MIDDLE DOT\n```\n\nMany of these exceptions are specified in the UAX#31 Section 5.1, [NFKC Modifications](https://www.unicode.org/reports/tr31/tr31-41.html#NFKC_Modifications).\n\n### Proposal\nI suggest adding two functions to the module, `unicodedata.isidstart(chr)` and `unicodedata.isidcontinue(chr)`.  These return `True` if `chr` appears in the `DerivedCoreProperties.txt` file as `XID_Start` or  `XID_Continue`, _resp._\n\n\n### CPython versions tested on:\n\n3.13\n\n### Operating systems tested on:\n\nWindows\n\n\u003c!-- gh-linked-prs --\u003e\n### Linked PRs\n* gh-140269\n\u003c!-- /gh-linked-prs --\u003e\n","author":{"url":"https://github.com/mrolle45","@type":"Person","name":"mrolle45"},"datePublished":"2025-01-21T06:21:56.000Z","interactionStatistic":{"@type":"InteractionCounter","interactionType":"https://schema.org/CommentAction","userInteractionCount":3},"url":"https://github.com/python/cpython/issues/129117"}
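
The category-based check described in the issue body can be compared with what the Python parser itself accepts. The following is a minimal sketch, not the reporter's script: the helper names (`category_says_start`, `parser_accepts_continue`, `disagreements`, and so on) are hypothetical, and it assumes that `str.isidentifier()` on a one-character string is a usable stand-in for the parser's start check (with the caveat that it also accepts `_`), and that prefixing a known-valid start character such as `"a"` tests the continue position. Only the two category lists come from the issue text.

```python
import unicodedata

# Category lists quoted from the issue body.
START_CATEGORIES = frozenset("Lu Ll Lt Lm Lo Nl Pc".split())
CONTINUE_CATEGORIES = frozenset("Lu Ll Lt Lm Lo Mn Mc Nd Nl Pc".split())


def category_says_start(ch: str) -> bool:
    """Heuristic from the issue: judge the start position by general category."""
    return unicodedata.category(ch) in START_CATEGORIES


def category_says_continue(ch: str) -> bool:
    """Heuristic from the issue: judge the continue position by general category."""
    return unicodedata.category(ch) in CONTINUE_CATEGORIES


def parser_accepts_start(ch: str) -> bool:
    """Assumption: a one-character identifier is valid iff ch may start one.

    Note that str.isidentifier() also accepts "_" in this position.
    """
    return ch.isidentifier()


def parser_accepts_continue(ch: str) -> bool:
    """Assumption: "a" is a valid start, so "a" + ch tests the continue position."""
    return ("a" + ch).isidentifier()


def disagreements(heuristic, reference, codepoints):
    """Return the code points where the heuristic and the reference differ."""
    return [cp for cp in codepoints if heuristic(chr(cp)) != reference(chr(cp))]
```

For example, `disagreements(category_says_continue, parser_accepts_continue, range(0x100))` should include `0xB7` (MIDDLE DOT), which has category `Po` but is accepted in the continue position. Passing `range(sys.maxunicode + 1)` scans the whole code space; the exact result depends on the Unicode version bundled with the interpreter.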

Links:

unicodedata module needs way of accurately determining XID_START and XID_CONTINUE properties. https://github.com/python/cpython/issues/129117#top
extension-modules (C modules in the Modules dir) https://github.com/python/cpython/issues?q=state%3Aopen%20label%3A%22extension-modules%22
topic-unicode https://github.com/python/cpython/issues?q=state%3Aopen%20label%3A%22topic-unicode%22
type-feature (A feature request or enhancement) https://github.com/python/cpython/issues?q=state%3Aopen%20label%3A%22type-feature%22
mrolle45 https://github.com/mrolle45
on Jan 21, 2025 https://github.com/python/cpython/issues/129117#issue-2800810826
NFKC Modifications https://www.unicode.org/reports/tr31/tr31-41.html#NFKC_Modifications
gh-129117: Expose _PyUnicode_IsXidContinue/Start in unicodedata #140269 https://github.com/python/cpython/pull/140269
