Title: [DynamoDB] BatchWriteItem operation: Provided list of item keys contains duplicates · Issue #2483 · feast-dev/feast · GitHub
Description: Expected Behavior: Duplication should be handled if a partition key already exists in the batch to be written to DynamoDB. Current Behavior: The following exception is raised when running the local test test_online_retrieval[LOCAL:File:dynamo...
Opengraph URL: https://github.com/feast-dev/feast/issues/2483
Author: TremaMiguel (https://github.com/TremaMiguel)
Date Published: 2022-04-04T17:03:26.000Z
Comments: 0

## Expected Behavior

Duplication should be handled if a partition key already exists in the batch to be written to DynamoDB.

## Current Behavior

The following exception is raised when running the local test `test_online_retrieval[LOCAL:File:dynamodb-True]`:

```bash
botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the BatchWriteItem operation: Provided list of item keys contains duplicates
```

## Steps to reproduce

This is the output from the pytest log:

```bash
environment = Environment(name='integration_test_63b98a_1', test_repo_config=LOCAL:File:dynamodb, feature_store=<feast.feature_store...sal.data_sources.file.FileDataSourceCreator object at 0x7fb91d38f730>, python_feature_server=False, worker_id='master')
universal_data_sources = (UniversalEntities(customer_vals=[1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012, 1013, 1014, ...object at 0x7fb905f6b7c0>, field_mapping=<feast.infra.offline_stores.file_source.FileSource object at 0x7fb905f794f0>))
full_feature_names = True

    @pytest.mark.integration
    @pytest.mark.universal
    @pytest.mark.parametrize("full_feature_names", [True, False], ids=lambda v: str(v))
    def test_online_retrieval(environment, universal_data_sources, full_feature_names):
        fs = environment.feature_store
        entities, datasets, data_sources = universal_data_sources
        feature_views = construct_universal_feature_views(data_sources)

        feature_service = FeatureService(
            "convrate_plus100",
            features=[feature_views.driver[["conv_rate"]], feature_views.driver_odfv],
        )
        feature_service_entity_mapping = FeatureService(
            name="entity_mapping",
            features=[
                feature_views.location.with_name("origin").with_join_key_map(
                    {"location_id": "origin_id"}
                ),
                feature_views.location.with_name("destination").with_join_key_map(
                    {"location_id": "destination_id"}
                ),
            ],
        )

        feast_objects = []
        feast_objects.extend(feature_views.values())
        feast_objects.extend(
            [
                driver(),
                customer(),
                location(),
                feature_service,
                feature_service_entity_mapping,
            ]
        )
        fs.apply(feast_objects)
>       fs.materialize(
            environment.start_date - timedelta(days=1),
            environment.end_date + timedelta(days=1),
        )

sdk/python/tests/integration/online_store/test_universal_online.py:426:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
sdk/python/feast/feature_store.py:1165: in materialize
    provider.materialize_single_feature_view(
sdk/python/feast/infra/passthrough_provider.py:164: in materialize_single_feature_view
    self.online_write_batch(
sdk/python/feast/infra/passthrough_provider.py:86: in online_write_batch
    self.online_store.online_write_batch(config, table, data, progress)
sdk/python/feast/infra/online_stores/dynamodb.py:208: in online_write_batch
    progress(1)
../venv/lib/python3.9/site-packages/boto3/dynamodb/table.py:168: in __exit__
    self._flush()
../venv/lib/python3.9/site-packages/boto3/dynamodb/table.py:144: in _flush
    response = self._client.batch_write_item(
../venv/lib/python3.9/site-packages/botocore/client.py:395: in _api_call
    return self._make_api_call(operation_name, kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <botocore.client.DynamoDB object at 0x7fb9056a0eb0>, operation_name = 'BatchWriteItem'
api_params = {'RequestItems': {'integration_test_63b98a_1.global_stats': [{'PutRequest': {'Item': {'entity_id': '361ad244a817acdb9c...Item': {'entity_id': '361ad244a817acdb9cb041cf7ee8b4b0', 'event_ts': '2022-04-03 16:00:00+00:00', 'values': {...}}}}]}}

    def _make_api_call(self, operation_name, api_params):
        operation_model = self._service_model.operation_model(operation_name)
        service_name = self._service_model.service_name
        history_recorder.record('API_CALL', {
            'service': service_name,
            'operation': operation_name,
            'params': api_params,
        })
        if operation_model.deprecated:
            logger.debug('Warning: %s.%s() is deprecated',
                         service_name, operation_name)
        request_context = {
            'client_region': self.meta.region_name,
            'client_config': self.meta.config,
            'has_streaming_input': operation_model.has_streaming_input,
            'auth_type': operation_model.auth_type,
        }
        request_dict = self._convert_to_request_dict(
            api_params, operation_model, context=request_context)
        resolve_checksum_context(request_dict, operation_model, api_params)

        service_id = self._service_model.service_id.hyphenize()
        handler, event_response = self.meta.events.emit_until_response(
            'before-call.{service_id}.{operation_name}'.format(
                service_id=service_id,
                operation_name=operation_name),
            model=operation_model, params=request_dict,
            request_signer=self._request_signer, context=request_context)

        if event_response is not None:
            http, parsed_response = event_response
        else:
            apply_request_checksum(request_dict)
            http, parsed_response = self._make_request(
                operation_model, request_dict, request_context)

        self.meta.events.emit(
            'after-call.{service_id}.{operation_name}'.format(
                service_id=service_id,
                operation_name=operation_name),
            http_response=http, parsed=parsed_response,
            model=operation_model, context=request_context
        )

        if http.status_code >= 300:
            error_code = parsed_response.get("Error", {}).get("Code")
            error_class = self.exceptions.from_code(error_code)
>           raise error_class(parsed_response, operation_name)
E           botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the BatchWriteItem operation: Provided list of item keys contains duplicates
```

### Specifications

- Version: `feast 0.18.1`
- Platform: `Windows`

## Possible Solution

Overwrite by partition keys in the `DynamoDB.online_write_batch()` method:

```python
with table_instance.batch_writer(overwrite_by_pkeys=["entity_id"]) as batch:
    for entity_key, features, timestamp, created_ts in data:
        entity_id = compute_entity_id(entity_key)
```

This solution comes from [StackOverflow](https://stackoverflow.com/questions/56632960/dynamodb-batchwriteitem-provided-list-of-item-keys-contains-duplicates).

## Other Comments

This error appeared while developing #2358. I can provide a solution to both in the same PR if possible.
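To make the failure mode concrete, here is a minimal sketch of the de-duplication that `overwrite_by_pkeys=["entity_id"]` is expected to perform before a batch is sent. It runs without DynamoDB. The helper `dedupe_by_keys` and the sample items are hypothetical, written only for illustration; they are not part of boto3 or Feast, and the sketch assumes that a later item with the same key should replace an earlier one in the buffer.

```python
from typing import Any, Dict, List, Sequence, Tuple


def dedupe_by_keys(
    items: Sequence[Dict[str, Any]], pkeys: Sequence[str]
) -> List[Dict[str, Any]]:
    """Keep only the last item seen for each combination of key attributes.

    Hypothetical helper: approximates buffer de-duplication on primary keys.
    """
    latest: Dict[Tuple[Any, ...], Dict[str, Any]] = {}
    for item in items:
        key = tuple(item[name] for name in pkeys)
        # Drop any earlier item with the same key, then append the new one,
        # so the surviving item sits at the position of its last occurrence.
        latest.pop(key, None)
        latest[key] = item
    return list(latest.values())


# Two items share entity_id "a", which BatchWriteItem would reject.
batch = [
    {"entity_id": "a", "event_ts": "2022-04-03 15:00:00+00:00"},
    {"entity_id": "b", "event_ts": "2022-04-03 15:00:00+00:00"},
    {"entity_id": "a", "event_ts": "2022-04-03 16:00:00+00:00"},
]
deduped = dedupe_by_keys(batch, ["entity_id"])
```

Note that "last one wins" depends on input order, not on `event_ts`. If the rows passed to `online_write_batch()` are not sorted by timestamp, the item that survives may not be the most recent one, so sorting before writing may be worth considering.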