René's URL Explorer Experiment


Title: Unclear type mapping with both packb/unpackb, especially with mixed type sequences · Issue #281 · msgpack/msgpack-python · GitHub

Description: Problem: The current packing/unpacking situation is confusing and complex when it comes to dealing with the different binary and string types that will be packed to, or unpacked from, the MessagePack "str format family" and "bin format f...

Opengraph URL: https://github.com/msgpack/msgpack-python/issues/281

X: @github

Domain: github.com


Hey, it has json ld scripts:
{"@context":"https://schema.org","@type":"DiscussionForumPosting","headline":"Unclear type mapping with both packb/unpackb, especially with mixed type sequences","articleBody":"## Problem:\r\n\r\nThe current packing/unpacking situation is confusing and complex when it comes to dealing with the different binary and string types that will be packed to, or unpacked from, the MessagePack \"_str format family_\" and \"_bin format family_\" data types. It is difficult to determine the correct combination for a satisfactory type mapping in all situations.\r\n\r\nIn addition, the current `msgpack-python` (now `msgpack`) implementations do not have a solution (in either direction) for dealing with data containers that contain both string and binary data types.\r\n\r\nFor example (on the unpacking/deserialization side), the following byte sequence defines a MessagePack array that contains two elements: a unicode snowman character in utf-8, and an arbitrary byte sequence of [0x00, 0x01, 0x02]:\r\n\r\n```\r\ndata = b'\\x92\\xa3\\xe2\\x98\\x83\\xc4\\x03\\x00\\x01\\x02'\r\n```\r\nWhat possible combination of `msgpack.unpackb` kwargs can _properly_ unpack this to a two element list containing a python string and a suitable binary type (like `bytearray` in Python 2, and `bytes` in Python 3)? 
Conversely, how would you generate such a MessagePack structure from python (aside from direct generation as above)?\r\n\r\n## Proposal:\r\n\r\nRather than having a collection of effectively global switches that can be sent to `packb`/`unpackb` (for example: `raw_as_bytes` and `use_bin_type`), it would be better if there were a method for defining an explicit typemap that would be used at a per-element level, and which defined the type mappings to use for both `packb` (from python to the MessagePack protocol) and `unpackb` (from MessagePack to python).\r\n\r\nFor example, it would be great if `msgpack.unpackb(data_bytes, typemap='ideal')` would get the \"ideal\" behaviour I outline in the tables further below. When using the `typemap` switch, packing/unpacking could then work in a per-element way, rather than having issues with mixed-type sequences like the global switches currently do. Possible `typemap` values could be similar to the columns defined further below: `('ideal', '0.4', '0.5', 'default')`, or somesuch, where `default` would be the default value, and currently equate to `typemap='0.5'`. It would also be illegal (raising `ValueError`) to specify kwargs like `raw_as_bytes` together with a `typemap` specification. I think this proposal will resolve any potential compatibility issues.\r\n\r\nThis `typemap` kwarg behaviour should be bidirectional. Specifically, there should also be a similar possibility for `msgpack.packb(data, typemap='ideal')`.\r\n\r\nIn addition (with the exception of python 2 `str` and `bytearray` ambiguity) it should _always_ be the case that `unpackb(packb(v)) == v`. 
This is currently _not_ true with available `msgpack` python versions.\r\n\r\n## Explicit Type Mapping Tables\r\n\r\nThe type mapping situation for msgpack bin and str (current, and \"ideal\") are covered in the tables below, covering the current situation for different `msgpack` versions, as well as my proposed `ideal` mapping.\r\n\r\nNote that, in the tables below:\r\n* PackType == `mp_str` refers to the \"str format family\", with leading (`101XXXXX`, `0xd9`, `0xda`, `0xdb`)\r\n* PackType == `mp_bin` refers to the \"bin format family\", with lead bits in (`0xc4`, `0xc5`, `0xc6`)\r\n* The \"PackType (_ideal_)\" column indicates what I personally think the accurate pack/unpack targets should be for each type.\r\n* `packb`/`unpackb` results for versions \"\u003c 0.5.x\" and \"\u003e= 0.5.x\" are what you get with default values for all global kwarg switches like `raw_as_bytes`\r\n* I have intentionally not referenced the types where the mapping is (in my opinion) extremely clear, for example:\r\n  * msgpack `nil format` ⇔ `None`\r\n  * msgpack `bool format` ⇔ `bool`\r\n  * msgpack `int format family` ⇔ `int`\r\n  * msgpack `float format family` ⇔ `float`\r\n  * msgpack `array format family` ⇔ `list`\r\n    * aside: I'd prefer an immutable tuple here, although map/dict targets have to be mutable so consistency is an issue\r\n  * msgpack `map format family` ⇔ `dict`\r\n\r\n### Python 2 packing/serialization behaviour\r\n| Python2 type | PackType (_ideal_) | PackType (\u003c0.5.x) | PackType (\u003e=0.5.x) | Comment |\r\n| --- | --- | --- | --- | --- |\r\n| `str`           | `mp_bin` | `mp_str` | `mp_str` | `str` is really bytes in python 2 |\r\n| `unicode`   | `mp_str` | `mp_str` | `mp_str` | Should always encode to utf8 |\r\n| `bytearray` | `mp_bin` | ERROR | `mp_str` | If this doesn't go to `mp_bin`, what does? 
See notes below on the ERROR case (which was good) |\r\n\r\n\\* In the case of the ERROR for `msgpack` \u003c 0.5.x, this was actually extremely useful, since it resulted in the `default` callback being invoked where you could specifically manage `bytearray` (since msgpack-python itself did not). Now that 0.5.x silently encodes `bytearray` to `mp_str`, this is no longer possible. This is actually the problem that triggered me to raise this issue (after an update to 0.5.x broke a build).\r\n\r\n### Python 2 unpacking/deserialization behaviour\r\n| Msgpack type | Python2 type (_ideal_) | Python2 type (\u003c0.5.x) | Python2 type (\u003e=0.5.x) | Comment |\r\n| --- | --- | --- | --- | --- |\r\n| `mp_str` | `unicode` | `str` | `str` | Should always decode assuming utf8 |\r\n| `mp_bin` | `str` | `str` | `str` |  `bytearray` would actually be a more literal/ideal unpack target, but is unfamiliar to most (and is mutable) |\r\n\r\n### Python 3 packing/serialization behaviour\r\n| Python3 type | PackType (_ideal_) | PackType (\u003c0.5.x) | PackType (\u003e=0.5.x) | Comment |\r\n| --- | --- | --- | --- | --- |\r\n| `str`        | `mp_str` | `mp_str` | `mp_str` | Always encode with utf8 |\r\n| `bytes`   | `mp_bin` | `mp_str` | `mp_str` |  |\r\n\r\n### Python 3 unpacking/deserialization behaviour\r\n| Msgpack type | Python3 type (_ideal_) | Python3 type (\u003c0.5.x) | Python3 type (\u003e=0.5.x) | Comment |\r\n| --- | --- | --- | --- | --- |\r\n| `mp_str` | `str` | `bytes` | `bytes` | The default conversion to `bytes` is particularly confusing |\r\n| `mp_bin` | `bytes` | `bytes` | `bytes` |  |\r\n\r\n## References\r\nThere are some other issues related to this that are worth referencing here:\r\n* #191 -- regarding backwards compatibility issues moving towards 1.0\r\n* #99 -- an old issue about properly decoding string types\r\n* #224 -- an issue specifically about serializing `bytearray`\r\n* [msgpack #121](https://github.com/msgpack/msgpack/issues/121) -- an old/lengthy issue (now 
closed) about differentiating between raw binary data and strings\r\n\r\n\r\nI decided to make a new issue, since this is a general proposal about A) explicitly clarifying type mapping in both directions between msgpack and python, and B) it explicitly covers both bin and str msgpack formats (in msgpack terminology) as well as both python str/bytes cases.\r\n \r\n","author":{"url":"https://github.com/rwarren","@type":"Person","name":"rwarren"},"datePublished":"2018-02-06T21:53:10.000Z","interactionStatistic":{"@type":"InteractionCounter","interactionType":"https://schema.org/CommentAction","userInteractionCount":3},"url":"https://github.com/msgpack/msgpack-python/issues/281"}
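
The per-element "ideal" mapping described in the issue body above (Python 3 `str` to the str format family, `bytes` to the bin format family, and back) can be sketched with a tiny standalone codec. This is only an illustration under stated assumptions: the names `pack`, `unpack`, and `_unpack_at` are hypothetical, they are not part of the `msgpack` package, and only the fixarray, fixstr, str 8, and bin 8 encodings needed for the issue's example bytes are handled.

```python
# Minimal sketch of the issue's "ideal" Python 3 type mapping.
# pack/unpack are hypothetical helpers, NOT the msgpack library API.

DATA = b'\x92\xa3\xe2\x98\x83\xc4\x03\x00\x01\x02'  # example bytes from the issue


def pack(obj):
    """Pack str -> str family, bytes/bytearray -> bin family, list -> array."""
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        if len(raw) < 32:
            return bytes([0xA0 | len(raw)]) + raw        # fixstr (101XXXXX)
        if len(raw) < 256:
            return b"\xd9" + bytes([len(raw)]) + raw     # str 8
        raise ValueError("string too long for this sketch")
    if isinstance(obj, (bytes, bytearray)):
        if len(obj) < 256:
            return b"\xc4" + bytes([len(obj)]) + bytes(obj)  # bin 8
        raise ValueError("binary too long for this sketch")
    if isinstance(obj, (list, tuple)):
        if len(obj) < 16:
            return bytes([0x90 | len(obj)]) + b"".join(pack(x) for x in obj)  # fixarray
        raise ValueError("array too long for this sketch")
    raise TypeError("unsupported type: %r" % type(obj))


def _unpack_at(buf, pos):
    """Decode one element starting at pos; return (value, next position)."""
    tag = buf[pos]
    if 0x90 <= tag <= 0x9F:                              # fixarray
        items, pos = [], pos + 1
        for _ in range(tag & 0x0F):
            item, pos = _unpack_at(buf, pos)
            items.append(item)
        return items, pos
    if 0xA0 <= tag <= 0xBF:                              # fixstr -> str
        start, n = pos + 1, tag & 0x1F
        return buf[start:start + n].decode("utf-8"), start + n
    if tag == 0xD9:                                      # str 8 -> str
        start, n = pos + 2, buf[pos + 1]
        return buf[start:start + n].decode("utf-8"), start + n
    if tag == 0xC4:                                      # bin 8 -> bytes
        start, n = pos + 2, buf[pos + 1]
        return bytes(buf[start:start + n]), start + n
    raise ValueError("unsupported format byte: 0x%02x" % tag)


def unpack(buf):
    """Decode a complete buffer, rejecting trailing bytes."""
    value, pos = _unpack_at(buf, 0)
    if pos != len(buf):
        raise ValueError("trailing data")
    return value


value = unpack(DATA)          # a str element and a bytes element, decided per element
roundtrip = pack(value)       # the same bytes as DATA
```

Because the target type is chosen from each element's format byte rather than from a global switch, the mixed str/bin array round-trips, which is the `unpackb(packb(v)) == v` property the issue asks for.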


Links:

Unclear type mapping with both packb/unpackb, especially with mixed type sequences https://github.com/msgpack/msgpack-python/issues/281#top
rwarren https://github.com/rwarren
on Feb 6, 2018 https://github.com/msgpack/msgpack-python/issues/281#issue-294926069
Backward incompatible API change toward 1.0 #191 https://github.com/msgpack/msgpack-python/issues/191
unpack should decode string types by default #99 https://github.com/msgpack/msgpack-python/issues/99
can't serialize bytearray #224 https://github.com/msgpack/msgpack-python/issues/224
msgpack #121 https://github.com/msgpack/msgpack/issues/121

Viewport: width=device-width


URLs of crawlers that visited me.