René's URL Explorer Experiment


Title: Refresh big5hkscs mapping to HKSCS-2016 · Issue #93271 · python/cpython · GitHub

Description: While working on #84508 I noticed that the mapping for big5hkscs codec has not been updated in a while. The current version in CPython reflects the Big-5 mappings for HKSCS-2004. Since then, there have been some updates: HKSCS-2008 adds ...

Opengraph URL: https://github.com/python/cpython/issues/93271

X: @github

Domain: github.com


Hey, it has json ld scripts:
{"@context":"https://schema.org","@type":"DiscussionForumPosting","headline":"Refresh big5hkscs mapping to HKSCS-2016","articleBody":"While working on #84508 I noticed that the mapping for `big5hkscs` codec has not been updated in a while. The current version in CPython reflects the Big-5 mappings for *HKSCS-2004*.\r\n\r\nSince then, there have been [some updates](https://www.ccli.gov.hk/en/hkscs/what_is_hkscs.html):\r\n\r\n- *HKSCS-2008* adds 68 code points to the Big-5 encoding scheme\r\n- *HKSCS-2016* adds no code points to Big-5 (it's Unicode-only), but since new characters have been added to Unicode, the mapping can change\r\n- after 2016, at least one mapped code point has been changed in an amendment\r\n\r\nI can update the script and generate the mapping using the latest data available on the [CCLI](https://www.ccli.gov.hk/en/download/) website, since I was already looking into this.\r\n\r\nIf we care about refreshing `big5hkscs` at all, there are a couple questions about compatibility. In case mapping a Big-5 code X used to map to Unicode code point A (in HKSCS-2004), and is changed to map to B (in later versions):\r\n\r\n1) should we: decode X to A, or to B?\r\n\r\n2) should we: encode B to X, A to X, or both?\r\n\r\n---\r\n\r\nE.g. right now the Big-5 sequence 9D73 round-trips:\r\n\r\n```python\r\n\u003e\u003e\u003e x = bytes.fromhex('9D73')\r\n\u003e\u003e\u003e x.decode('big5hkscs') == '\\u4ca4'\r\nTrue\r\n\u003e\u003e\u003e '\\u4ca4'.encode('big5hkscs') == x\r\nTrue\r\n```\r\n\r\nIf we followed the new HKSCS-2016 mapping with no compatibility provisions, this round-trip would instead go through the newly mapped character `\\u9fd0`. This might be fine for some users, but it might break compatibility for others. So the questions are about what kind of compatibility we want to guarantee.\r\n\r\n---\r\n\r\nRelated question which should not block this issue. 
For the web platform, WHATWG defines a [Big5 encoding](https://encoding.spec.whatwg.org/#legacy-multi-byte-chinese-(traditional)-encodings) which includes HKSCS extensions, and already overlaps 99% with `big5hkscs`, but is incompatible in some cases. Since one of the users of the CPython CJK codecs is html5lib, this means that html5lib does not comply with the web platform specifications. Should CPython be concerned with this, since it already provides the codec and the mapping tables, and it could provide a web-compatible codec with just a few fixups? Or does this belong in third-party libraries?","author":{"url":"https://github.com/sorcio","@type":"Person","name":"sorcio"},"datePublished":"2022-05-26T20:13:44.000Z","interactionStatistic":{"@type":"InteractionCounter","interactionType":"https://schema.org/CommentAction","userInteractionCount":3},"url":"https://github.com/python/cpython/issues/93271"}
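
The two compatibility questions in the issue body (decode X to A or to B; encode A, B, or both to X) can be made concrete with a toy lookup table. The following is only a sketch of one possible policy, "decode to the new code point, accept both when encoding", using the 9D73 example from the issue. The names `DECODE`, `ENCODE`, `decode_one`, `encode_one`, and `round_trips` are hypothetical and are not part of CPython's `big5hkscs` codec.

```python
# Toy model of one compatibility policy; NOT CPython's implementation.
# The byte sequence and code points are taken from the issue text.
X = bytes.fromhex("9D73")   # Big-5 sequence discussed in the issue
OLD = "\u4ca4"              # code point X maps to under HKSCS-2004 (per the issue)
NEW = "\u9fd0"              # code point X maps to under HKSCS-2016 (per the issue)

# Assumed policy: decoding follows the new mapping,
# while encoding accepts both the old and the new code point.
DECODE = {X: NEW}
ENCODE = {NEW: X, OLD: X}


def decode_one(data: bytes) -> str:
    """Decode a single two-byte sequence with the toy table."""
    return DECODE[data]


def encode_one(char: str) -> bytes:
    """Encode a single character with the toy table."""
    return ENCODE[char]


def round_trips(char: str) -> bool:
    """Return True if encoding then decoding yields the same character."""
    return decode_one(encode_one(char)) == char


if __name__ == "__main__":
    print(round_trips(NEW))  # the new code point survives a round trip
    print(round_trips(OLD))  # the old code point is rewritten to the new one
```

Under this assumed policy, text containing the old code point still encodes to the same bytes, but it no longer round-trips: decoding returns the new code point. That is the compatibility break the issue asks about.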

Links:

Refresh big5hkscs mapping to HKSCS-2016 https://github.com/python/cpython/issues/93271#top
corona10 https://github.com/corona10
extension-modules (C modules in the Modules dir) https://github.com/python/cpython/issues?q=state%3Aopen%20label%3A%22extension-modules%22
topic-unicode https://github.com/python/cpython/issues?q=state%3Aopen%20label%3A%22topic-unicode%22
type-feature (A feature request or enhancement) https://github.com/python/cpython/issues?q=state%3Aopen%20label%3A%22type-feature%22
sorcio https://github.com/sorcio
on May 26, 2022 https://github.com/python/cpython/issues/93271#issue-1250003822
#84508 https://github.com/python/cpython/issues/84508
some updates https://www.ccli.gov.hk/en/hkscs/what_is_hkscs.html
CCLI https://www.ccli.gov.hk/en/download/
Big5 encoding https://encoding.spec.whatwg.org/#legacy-multi-byte-chinese-(traditional)-encodings

Viewport: width=device-width


URLs of crawlers that visited me.