René's URL Explorer Experiment


Title: index.find() tries to reshape and fails · Issue #1822 · docarray/docarray · GitHub

Description: Initial Checks I have read and followed the docs and still think this is a bug Description Apologies the title of this is not the best. I have a very odd case and can't seem to understand what is causing it. I have also failed at recreat...

Opengraph URL: https://github.com/docarray/docarray/issues/1822

X: @github

Domain: github.com


Hey, it has json ld scripts:
{"@context":"https://schema.org","@type":"DiscussionForumPosting","headline":"index.find() tries to reshape and fails","articleBody":"### Initial Checks\r\n\r\n- [X] I have read and followed [the docs](https://docs.docarray.org/) and still think this is a bug\r\n\r\n### Description\r\n\r\nApologies the title of this is not the best. I have a very odd case and can't seem to understand what is causing it. I have also failed at recreating the issue in a simpler example.\r\n\r\nI have a Doc List where each document has been built with the same process however the data is obviously different for each doc. I am using the hnswlib backend.\r\n\r\nThe issue I have is after I built the doc list with no issues I then try to run a .find() on the individual elements of the doc list, some of which fail and some don't. The error I get on some of these can be seen in the traceback below.\r\n\r\nCode Snippet:\r\n```python\r\nclass AddressDoc(BaseDoc):\r\n    ELID: int\r\n    FULL_ADDRESS: str\r\n    EMBEDDINGS: NdArray[768]\r\n\r\ndef build_doc_list(data):\r\n    st = time.time()\r\n    dl = DocList[AddressDoc](\r\n            AddressDoc(\r\n                ELID=0000000,\r\n                FULL_ADDRESS=\"\",\r\n                EMBEDDINGS=d[\"EMBEDDINGS\"],\r\n            )\r\n            for d in data\r\n    )\r\n    logger.info(f\"Doc list created... {time.time()-st}\")\r\n    return dl\r\n\r\ndoc_index = HnswDocumentIndex[AddressDoc](work_dir=db_path)\r\ndl = build_doc_list(data)\r\n\r\n# This works!\r\nresults = doc_index.find(dl[2], search_field=\"EMBEDDINGS\", limit=1)\r\n\r\n# This doesn't!\r\nresults = doc_index.find(dl[3], search_field=\"EMBEDDINGS\", limit=1)\r\n\r\ntype(dl[2].EMBEDDINGS) == type(dl[3].EMBEDDINGS) # returns True\r\ntype(dl[2].EMBEDDINGS.shape) == type(dl[3].EMBEDDINGS.shape) # returns True\r\n\r\n```\r\nI have compared dl[2] and dl[3] left right and center and can't understand what the issue is. 
The embeddings array in both documents are the same shape which I have checked with numpy (.shape, .ndims, .size). I can't understand what the difference is between the two that causes the error below.\r\n\r\n\r\nTraceback below:\r\n```\r\nFile /usr/local/lib/python3.11/site-packages/docarray/index/abstract.py:503, in BaseDocIndex.find(self, query, search_field, limit, **kwargs)\r\n    [501](file:///usr/local/lib/python3.11/site-packages/docarray/index/abstract.py?line=500)     query_vec = query\r\n    [502](file:///usr/local/lib/python3.11/site-packages/docarray/index/abstract.py?line=501) query_vec_np = self._to_numpy(query_vec)\r\n--\u003e [503](file:///usr/local/lib/python3.11/site-packages/docarray/index/abstract.py?line=502) docs, scores = self._find(\r\n    [504](file:///usr/local/lib/python3.11/site-packages/docarray/index/abstract.py?line=503)     query_vec_np, search_field=search_field, limit=limit, **kwargs\r\n    [505](file:///usr/local/lib/python3.11/site-packages/docarray/index/abstract.py?line=504) )\r\n    [507](file:///usr/local/lib/python3.11/site-packages/docarray/index/abstract.py?line=506) if isinstance(docs, List) and not isinstance(docs, DocList):\r\n    [508](file:///usr/local/lib/python3.11/site-packages/docarray/index/abstract.py?line=507)     docs = self._dict_list_to_docarray(docs)\r\n\r\nFile /usr/local/lib/python3.11/site-packages/docarray/index/backends/hnswlib.py:328, in HnswDocumentIndex._find(self, query, limit, search_field)\r\n    [324](file:///usr/local/lib/python3.11/site-packages/docarray/index/backends/hnswlib.py?line=323) def _find(\r\n...\r\n--\u003e [197](file:///usr/local/lib/python3.11/site-packages/docarray/typing/tensor/ndarray.py?line=196)     return cls._docarray_from_native(x.reshape(source.shape))\r\n    [198](file:///usr/local/lib/python3.11/site-packages/docarray/typing/tensor/ndarray.py?line=197) elif len(source.shape) \u003e 0:\r\n    
[199](file:///usr/local/lib/python3.11/site-packages/docarray/typing/tensor/ndarray.py?line=198)     return cls._docarray_from_native(np.zeros(source.shape))\r\n\r\nValueError: cannot reshape array of size 768 into shape (768,768)\r\n```\r\n\r\n\r\n### Example Code\r\n\r\n_No response_\r\n\r\n### Python, DocArray \u0026 OS Version\r\n\r\n```Text\r\n0.39.0\r\n```\r\n\r\n\r\n### Affected Components\r\n\r\n- [X] [Vector Database / Index](https://docs.docarray.org/user_guide/storing/docindex/)\r\n- [ ] [Representing](https://docs.docarray.org/user_guide/representing/first_step)\r\n- [ ] [Sending](https://docs.docarray.org/user_guide/sending/first_step/)\r\n- [ ] [storing](https://docs.docarray.org/user_guide/storing/first_step/)\r\n- [ ] [multi modal data type](https://docs.docarray.org/data_types/first_steps/)","author":{"url":"https://github.com/nikhilmakan02","@type":"Person","name":"nikhilmakan02"},"datePublished":"2023-10-12T05:28:01.000Z","interactionStatistic":{"@type":"InteractionCounter","interactionType":"https://schema.org/CommentAction","userInteractionCount":10},"url":"https://github.com/docarray/docarray/issues/1822"}
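
The traceback above ends in an attempt to reshape a 768-element array into shape (768, 768). As a minimal sketch of why that has to fail, the snippet below checks the rule a reshape enforces: the element count must equal the product of the target shape. This is not DocArray's or NumPy's code; `reshape_is_valid` is a hypothetical helper introduced only for illustration.

```python
from math import prod


def reshape_is_valid(size: int, shape: tuple[int, ...]) -> bool:
    """Return True when `size` elements can exactly fill an array of `shape`.

    Hypothetical helper: it mirrors the size check behind the ValueError
    in the traceback, without calling NumPy or DocArray.
    """
    return size == prod(shape)


# One embedding vector, as declared by NdArray[768] in the issue.
query_size = 768

# A one-dimensional target of 768 elements matches the query.
ok_for_vector = reshape_is_valid(query_size, (768,))

# A (768, 768) target needs 768 * 768 elements, so the same query cannot fill it.
ok_for_matrix = reshape_is_valid(query_size, (768, 768))

print(ok_for_vector, ok_for_matrix)
```

A (768, 768) target needs 589,824 elements, so the error message implies that `source.shape` in the failing call was two-dimensional, even though the query embedding itself held 768 values.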


Links:

index.find() tries to reshape and fails https://github.com/docarray/docarray/issues/1822#top
nikhilmakan02 https://github.com/nikhilmakan02
on Oct 12, 2023 https://github.com/docarray/docarray/issues/1822#issue-1939203142
the docs https://docs.docarray.org/
Vector Database / Index https://docs.docarray.org/user_guide/storing/docindex/
Representing https://docs.docarray.org/user_guide/representing/first_step
Sending https://docs.docarray.org/user_guide/sending/first_step/
storing https://docs.docarray.org/user_guide/storing/first_step/
multi modal data type https://docs.docarray.org/data_types/first_steps/
JoanFM https://github.com/JoanFM
DocArray backlog https://github.com/orgs/docarray/projects/4
