Title: Seeking advice: remove is painfully slow · Issue #25 · objectbox/objectbox-python · GitHub
Description: I'm finding that adding and searching an objectbox database is really fast. However, the remove operation is really slow (1 second per object.) The database is on a local NVME SSD drive. It contains about 20,000 hashes and takes about 6G...
Opengraph URL: https://github.com/objectbox/objectbox-python/issues/25
Headline: Seeking advice: remove is painfully slow
Author: patknight (https://github.com/patknight)
Date published: 2024-11-17T21:40:06.000Z
Comments: 0

I'm finding that adding to and searching an ObjectBox database is really fast. However, the remove operation is really slow (about 1 second per object). The database is on a local NVMe SSD. It contains about 20,000 hashes and takes about 6 GB.

My find_unique lookup (a hash_box.query call) is fast; it's literally the call to hash_box.remove that takes the time.

What am I doing wrong?

```python
@Entity()
class ImHash:
    id = Id
    key = String(index=Index(IndexType.HASH), unique=True)
    cos_value = Float32Vector(index=HnswIndex(
        dimensions=62720,
        distance_type=VectorDistanceType.COSINE,
    ))


def hash_image(im: Image.Image) -> list[float]:
    vector = img2vec.get_vec(im, tensor=True)
    return vector.detach().cpu().numpy().flatten()


def hash_and_store(name_or_fp, key: str):
    im = Image.open(name_or_fp)
    h = hash_image(im)
    ih = find_unique(key)
    if ih is None:
        # create
        ih = ImHash()
        ih.key = key
        ih.cos_value = h
        with store_lock:
            hash_box.put(ih)


def init(db_dir: pathlib.Path):
    global store, hash_box, img2vec
    store = Store(directory=str(db_dir / directory_name),
                  model_json_file=str(db_dir / json_model_name),
                  max_db_size_in_kb=10 * 1024 * 1024)
    hash_box = store.box(ImHash)
    img2vec = Img2Vec(cuda=False, model='efficientnet_b0')


def close():
    store.close()


def find_unique(key: str):
    with store_lock:
        query = hash_box.query(ImHash.key.equals(key)).build()
        result = query.find()
        if len(result) == 0:
            return None
        elif len(result) > 1:
            print('Multiple matches found')
            return None
        else:
            return result[0]


def find_similar(key: str) -> list[tuple[ImHash, float]]:
    target = find_unique(key)
    with store_lock:
        query = hash_box.query(ImHash.cos_value.nearest_neighbor(target.cos_value, 8)).build()
        results = query.find_with_scores()
        results.sort(key=lambda x: x[1])
        return results


def remove(key: str):
    target = find_unique(key)
    if target is not None:
        with store_lock:
            hash_box.remove(target)


def remove_many(keys: list[str]):
    with store.write_tx():
        for k in keys:
            i = find_unique(k)
            if i is None:
                print('Hash key "%s" was already gone' % k)
            else:
                with store_lock:
                    hash_box.remove(i.id)
```
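One way to check the claim that the lookup is fast and the remove call alone accounts for the delay is to time the two steps separately. The following is a minimal sketch, not part of the original post: `time_each` and `FakeBox` are hypothetical names, and `FakeBox` is only an in-memory stand-in so the example runs without ObjectBox installed. Against the real store, the same helper could presumably be passed `find_unique` and a small wrapper around `hash_box.remove`.

```python
import time
from statistics import mean


def time_each(label, fn, items):
    """Call fn(item) for every item and return the per-call durations in seconds."""
    durations = []
    for item in items:
        start = time.perf_counter()
        fn(item)
        durations.append(time.perf_counter() - start)
    if durations:
        print(f"{label}: n={len(durations)} "
              f"mean={mean(durations):.6f}s max={max(durations):.6f}s")
    return durations


class FakeBox:
    """Hypothetical in-memory stand-in for hash_box (not the ObjectBox API)."""

    def __init__(self, keys):
        self._by_key = {k: i + 1 for i, k in enumerate(keys)}

    def find_id(self, key):
        # Lookup only: returns the stored id, or None if the key is absent.
        return self._by_key.get(key)

    def remove(self, key):
        # Returns True if the key existed and was removed.
        return self._by_key.pop(key, None) is not None

    def count(self):
        return len(self._by_key)


keys = [f"key-{n}" for n in range(100)]
box = FakeBox(keys)

# Time the lookup phase and the remove phase independently.
lookup_times = time_each("lookup", box.find_id, keys)
remove_times = time_each("remove", box.remove, keys)
```

Comparing the two printed means shows which phase dominates. The numbers from the stand-in are meaningless in themselves; only a run against the real database says anything about ObjectBox.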