René's URL Explorer Experiment
Title: feat: i/o for DocVec by JohannesMessner · Pull Request #1562 · docarray/docarray · GitHub
Description: This PR implements the IOMixinArray for DocVec, enabling the following (de)serializations:
bytes
csv
json
pandas dataframe
proto (#1639)
tensor_type for proto (#1645)
save/load
base64
Documentation
Depends on #1561 (protobuf serialization bug)
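JSON data is now serialized column-wise and contains four top-level column groups (tensor_columns, doc_columns, docs_vec_columns, any_columns), like this: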
{"tensor_columns":{"tens":[[0.18736880416567359,0.9418134463574087,0.23608414651162635,0.200237847284912,0.714415793571785],[0.2980003926683161,0.9763157895635218,0.605654616422409,0.3988857003524561,0.9359414534852715],[0.6382713675712178,0.7725828308842624,0.1442529379493993,0.9327649675846935,0.7312168445052849],[0.07432756375020888,0.84017563415619,0.895668448511279,0.636209667658674,0.299134816130166],[0.1693478076528051,0.039113459980241405,0.3005636243582387,0.06480147049887997,0.09461133654601517]],"tens_none":null},"doc_columns":{"inner":{"tensor_columns":{"tens":[[0.27449474647200733,0.10957290479279247,0.6840416906244843,0.02922130361087716,0.1610742166462915],[0.27449474647200733,0.10957290479279247,0.6840416906244843,0.02922130361087716,0.1610742166462915],[0.27449474647200733,0.10957290479279247,0.6840416906244843,0.02922130361087716,0.1610742166462915],[0.27449474647200733,0.10957290479279247,0.6840416906244843,0.02922130361087716,0.1610742166462915],[0.27449474647200733,0.10957290479279247,0.6840416906244843,0.02922130361087716,0.1610742166462915]]},"doc_columns":{},"docs_vec_columns":{"tens":[[0.27449474647200733,0.10957290479279247,0.6840416906244843,0.02922130361087716,0.1610742166462915],[0.27449474647200733,0.10957290479279247,0.6840416906244843,0.02922130361087716,0.1610742166462915],[0.27449474647200733,0.10957290479279247,0.6840416906244843,0.02922130361087716,0.1610742166462915],[0.27449474647200733,0.10957290479279247,0.6840416906244843,0.02922130361087716,0.1610742166462915],[0.27449474647200733,0.10957290479279247,0.6840416906244843,0.02922130361087716,0.1610742166462915]]},"any_columns":{"id":["1fa4159e51fdfa029b2e2bbb45be0e58","1fa4159e51fdfa029b2e2bbb45be0e58","1fa4159e51fdfa029b2e2bbb45be0e58","1fa4159e51fdfa029b2e2bbb45be0e58","1fa4159e51fdfa029b2e2bbb45be0e58"]}},"inner_none":null},"docs_vec_columns":{"tens":[[0.18736880416567359,0.9418134463574087,0.23608414651162635,0.200237847284912,0.714415793571785],[0.2980003926683161,0.9763157895635218,0.605654616422409,0.3988857003524561,0.9359414534852715],[0.6382713675712178,0.7725828308842624,0.1442529379493993,0.9327649675846935,0.7312168445052849],[0.07432756375020888,0.84017563415619,0.895668448511279,0.636209667658674,0.299134816130166],[0.1693478076528051,0.039113459980241405,0.3005636243582387,0.06480147049887997,0.09461133654601517]],"tens_none":null},"any_columns":{"id":["5f2575c95a52380be1a9a68e26cd19c5","a5015ccad05d1b7ead1dce92650e5e8f","f14b45ea6384f235f990f40cddf39444","fe0004643fb7aadb7813438727f42dd1","489f382c9887c3fe3ef075a735722ff4"],"text":["0","1","2","3","4"],"num":[null,null,null,null,null]}}
CSV is not supported for DocVec, since CSV is a row-based format and DocVec stores its data column-wise.
All other formats are used the same way as in DocList, but have a different serialized representation.
Closes #1330
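The column-wise layout shown in the JSON sample above can be illustrated with a small standalone sketch. This is not docarray's implementation and does not use the docarray API: the helper names `rows_to_columns` and `columns_to_rows`, the plain-dict records, and the rule for deciding which group a field belongs to are all assumptions made for illustration. The sketch also ignores columns that are entirely null (such as `tens_none` in the sample).

```python
import json
from typing import Any, Dict, List

Row = Dict[str, Any]
Columns = Dict[str, Dict[str, Any]]


def _is_tensor(value: Any) -> bool:
    # Assumption for this sketch: a non-empty flat list of numbers is a "tensor".
    return (
        isinstance(value, list)
        and len(value) > 0
        and all(isinstance(x, (int, float)) and not isinstance(x, bool) for x in value)
    )


def rows_to_columns(rows: List[Row]) -> Columns:
    """Group the fields of row-wise records into four column groups."""
    columns: Columns = {
        "tensor_columns": {},
        "doc_columns": {},
        "docs_vec_columns": {},
        "any_columns": {},
    }
    if not rows:
        return columns
    for name in rows[0]:
        values = [row[name] for row in rows]
        first = values[0]
        if _is_tensor(first):
            columns["tensor_columns"][name] = values
        elif isinstance(first, dict):
            # A nested record becomes its own column-wise structure.
            columns["doc_columns"][name] = rows_to_columns(values)
        elif isinstance(first, list) and first and all(isinstance(x, dict) for x in first):
            # A nested list of records: one column-wise structure per row.
            columns["docs_vec_columns"][name] = [rows_to_columns(v) for v in values]
        else:
            columns["any_columns"][name] = values
    return columns


def columns_to_rows(columns: Columns) -> List[Row]:
    """Inverse of rows_to_columns for data produced by this sketch."""
    per_field: Dict[str, List[Any]] = {}
    per_field.update(columns["tensor_columns"])
    per_field.update(columns["any_columns"])
    for name, nested in columns["doc_columns"].items():
        per_field[name] = columns_to_rows(nested)
    for name, nested_list in columns["docs_vec_columns"].items():
        per_field[name] = [columns_to_rows(c) for c in nested_list]
    length = len(next(iter(per_field.values()))) if per_field else 0
    return [{name: vals[i] for name, vals in per_field.items()} for i in range(length)]


rows = [
    {"id": "a", "text": "0", "tens": [0.1, 0.2], "inner": {"id": "x", "tens": [1.0, 2.0]}},
    {"id": "b", "text": "1", "tens": [0.3, 0.4], "inner": {"id": "y", "tens": [3.0, 4.0]}},
]
columns = rows_to_columns(rows)
serialized = json.dumps(columns)  # one JSON object with four top-level keys
```

Storing each field once as a column, rather than repeating field names per record, is what makes this layout a poor fit for a row-based format such as CSV.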
Opengraph URL: https://github.com/docarray/docarray/pull/1562
Links:
| JohannesMessner | https://github.com/JohannesMessner |
| main | https://github.com/docarray/docarray/tree/main |
| feat-docvec-io | https://github.com/docarray/docarray/tree/feat-docvec-io |
| Conversation (11) | https://github.com/docarray/docarray/pull/1562 |
| Commits (26) | https://github.com/docarray/docarray/pull/1562/commits |
| Checks (0) | https://github.com/docarray/docarray/pull/1562/checks |
| Files changed | https://github.com/docarray/docarray/pull/1562/files |
| feat: i/o for DocVec | https://github.com/docarray/docarray/pull/1562/commits/f83fb4f7ac73bd1cd7a1ae573cbacea83e7affa5#top |
| Show all changes (26 commits) | https://github.com/docarray/docarray/pull/1562/files |
| d43057b · feat: json and dict for docvec · JohannesMessner · May 22, 2023 | https://github.com/docarray/docarray/pull/1562/commits/d43057b91872eca2b174d24af9d9f1d3f4ae7925 |
| c45bfca · test: add tests · JohannesMessner · May 22, 2023 | https://github.com/docarray/docarray/pull/1562/commits/c45bfcaadb3bbca2464d2ee6bbd250474381585f |
| 564d144 · test: add docvec to dict test · JohannesMessner · May 22, 2023 | https://github.com/docarray/docarray/pull/1562/commits/564d14441bba5abc3b3ca37225c036bafdee0a0e |
| 76f9c8e · feat: to from dataframe for docvec · JohannesMessner · May 22, 2023 | https://github.com/docarray/docarray/pull/1562/commits/76f9c8e97c196209ce7398ccae6b16346f642004 |
| 73a1ac7 · test: dataframe docvec tests · JohannesMessner · May 22, 2023 | https://github.com/docarray/docarray/pull/1562/commits/73a1ac7417b1192562d240a4413300e78953f9a7 |
| f83fb4f · feat: to from csv for docvec · JohannesMessner · May 22, 2023 | https://github.com/docarray/docarray/pull/1562/commits/f83fb4f7ac73bd1cd7a1ae573cbacea83e7affa5 |
| ca8dc12 · test: test csv with docvec · JohannesMessner · May 22, 2023 | https://github.com/docarray/docarray/pull/1562/commits/ca8dc12d19653a66d3b06f6fd5f3510fb9d5ac5c |
| 2b52b1e · Merge branch 'main' into feat-docvec-io · JohannesMessner · Jun 14, 2023 | https://github.com/docarray/docarray/pull/1562/commits/2b52b1edc8fe3d5234d4d83bf23453bdd76c7704 |
| b115637 · feat: pickle serialization for docvec · JohannesMessner · Jun 14, 2023 | https://github.com/docarray/docarray/pull/1562/commits/b115637d00a16134cf0a6158341ddff0610f3797 |
| bd86985 · feat: protbuf array serialization for docvec · JohannesMessner · Jun 14, 2023 | https://github.com/docarray/docarray/pull/1562/commits/bd86985eac44a5a2707141011def112e1e541461 |
| c280ff2 · test: test base64 deser for docvec · JohannesMessner · Jun 14, 2023 | https://github.com/docarray/docarray/pull/1562/commits/c280ff28ad37f1e7ea89742409c7de27a1d3ba31 |
| ad881cf · test: test save and load for docvec · JohannesMessner · Jun 14, 2023 | https://github.com/docarray/docarray/pull/1562/commits/ad881cf60f52992c1cb0af97b6a9c65bd773565d |
| 4b1b533 · feat: docvec json column wise · JohannesMessner · Jun 19, 2023 | https://github.com/docarray/docarray/pull/1562/commits/4b1b533ccf299eca6e37a3571ea71ed161e5a053 |
| 60e651e · Merge branch 'main' into feat-docvec-io · JohannesMessner · Jun 19, 2023 | https://github.com/docarray/docarray/pull/1562/commits/60e651e3b86e429064a6376751abd84839cd976b |
| f9c97ec · Merge branch 'main' into feat-docvec-io · JohannesMessner · Jun 20, 2023 | https://github.com/docarray/docarray/pull/1562/commits/f9c97ecac98ede2d8870920b8349bd5ac4d10b43 |
| 0603fc5 · test: add test for docvec json · JohannesMessner · Jun 20, 2023 | https://github.com/docarray/docarray/pull/1562/commits/0603fc5edc9cf19f9e1535e1a89ca2d8a9ed9957 |
| c6ace8e · test: add tensor type arg · JohannesMessner · Jun 20, 2023 | https://github.com/docarray/docarray/pull/1562/commits/c6ace8e6337da8ea7d53257c1c333c40df446763 |
| 51719b2 · fix: mypy stuff · JohannesMessner · Jun 26, 2023 | https://github.com/docarray/docarray/pull/1562/commits/51719b2a60df308597e38b00a249bf7d463d5db6 |
| ad5f5bd · fix: raising of error when needed · JohannesMessner · Jun 26, 2023 | https://github.com/docarray/docarray/pull/1562/commits/ad5f5bdc478a4e745365e9ffc2628e6de92e7d98 |
| 200dbac · fix: more exception raising · JohannesMessner · Jun 26, 2023 | https://github.com/docarray/docarray/pull/1562/commits/200dbaccd8cd0ac13644b4c19ccb79d4cf2fcc20 |
| 8d1f446 · fix: mypy · JohannesMessner · Jun 26, 2023 | https://github.com/docarray/docarray/pull/1562/commits/8d1f446d80cd5b8e884aa920002cfe1b2da25cfd |
| 6815720 · refactor: don't expose to/from csv for docvec · JohannesMessner · Jun 26, 2023 | https://github.com/docarray/docarray/pull/1562/commits/6815720104f60d2b96444dc17ac94124189db388 |
| 6b5ddc7 · test: adjust tests · JohannesMessner · Jun 26, 2023 | https://github.com/docarray/docarray/pull/1562/commits/6b5ddc76d4f45a686469c6ed02be63b344ce024b |
| 587c20a · docs: add documentation for docvec io · JohannesMessner · Jun 27, 2023 | https://github.com/docarray/docarray/pull/1562/commits/587c20a7d916283c0c08062b33dc01fa7a5309e4 |
| 663f17d · Merge branch 'main' into feat-docvec-io · JohannesMessner · Jun 27, 2023 | https://github.com/docarray/docarray/pull/1562/commits/663f17d4a7e968e409fd3e6b3248e5489988b6ad |
| 7d035fb · Merge branch 'main' into feat-docvec-io · JohannesMessner · Jun 28, 2023 | https://github.com/docarray/docarray/pull/1562/commits/7d035fbff8974a49b2846b8b31637feef52debbd |
| JohannesMessner | https://github.com/docarray/docarray/commits?author=JohannesMessner |
| docarray/array/doc_list/io.py | https://github.com/docarray/docarray/pull/1562/commits/f83fb4f7ac73bd1cd7a1ae573cbacea83e7affa5#diff-267ad6885a280598db88e6957862608d84366cc6ecd0d0cb17b441f9eee558b1 |