Title: Duplicate data source in feature registry · Issue #2581 · feast-dev/feast · GitHub
Description: Expected Behavior: Registry protobuf should not have duplicate data source definitions. Current Behavior: Every time we run feast apply, we noticed that a data source is appended to the registry protobuf with the exact same details....
Opengraph URL: https://github.com/feast-dev/feast/issues/2581
{"@context":"https://schema.org","@type":"DiscussionForumPosting","headline":"Duplicate data source in feature registry","articleBody":"## Expected Behavior \r\nRegistry protobuf should not have duplicate data source definitions.\r\n\r\n## Current Behavior\r\nEvery time we run `feast apply`, we noticed that a data source is appended to the registry protobuf with the exact same details.\r\n\r\n```python\r\nfrom feast.protos.feast.core.Registry_pb2 import Registry as RegistryProto\r\n\r\nregistry_proto = RegistryProto()\r\n\r\nwith open('registry.db', 'rb') as f: data = f.read()\r\n\r\nprint(registry_proto.FromString(data))\r\n```\r\n\r\nThis is what the protobuf looks like after running `feast apply` 3 times in a row without making any changes to the definition.\r\n\r\n```\r\nentities {\r\n spec {\r\n name: \"__dummy\"\r\n value_type: STRING\r\n join_key: \"__dummy_id\"\r\n project: \"myproject\"\r\n }\r\n meta {\r\n created_timestamp {\r\n seconds: 1650485581\r\n nanos: 37674000\r\n }\r\n last_updated_timestamp {\r\n seconds: 1650485581\r\n nanos: 37723000\r\n }\r\n }\r\n}\r\nregistry_schema_version: \"1\"\r\nversion_id: \"743b0630-373b-405f-861d-3de738962d7c\"\r\nlast_updated {\r\n seconds: 1650485581\r\n nanos: 37831000\r\n}\r\nfeature_views {\r\n spec {\r\n name: \"IRIS\"\r\n project: \"myproject\"\r\n entities: \"__dummy\"\r\n features {\r\n name: \"PETAL_LENGTH\"\r\n value_type: FLOAT\r\n }\r\n features {\r\n name: \"PETAL_WIDTH\"\r\n value_type: FLOAT\r\n }\r\n features {\r\n name: \"SEPAL_LENGTH\"\r\n value_type: FLOAT\r\n }\r\n features {\r\n name: \"SEPAL_WIDTH\"\r\n value_type: FLOAT\r\n }\r\n features {\r\n name: \"SPECIES\"\r\n value_type: INT64\r\n }\r\n ttl {\r\n seconds: 31449600\r\n }\r\n batch_source {\r\n type: BATCH_SNOWFLAKE\r\n timestamp_field: \"EVENT_TIMESTAMP\"\r\n created_timestamp_column: \"CREATE_TIMESTAMP\"\r\n data_source_class_type: \"feast.infra.offline_stores.snowflake_source.SnowflakeSource\"\r\n snowflake_options {\r\n 
table: \"IRIS\"\r\n schema: \"MY_SCHEMA\"\r\n database: \"MY_DATABASE\"\r\n }\r\n }\r\n online: true\r\n description: \"A sample feature view containing the Iris dataset.\"\r\n }\r\n meta {\r\n created_timestamp {\r\n seconds: 1650485581\r\n nanos: 37333000\r\n }\r\n last_updated_timestamp {\r\n seconds: 1650485581\r\n nanos: 37333000\r\n }\r\n }\r\n}\r\ndata_sources {\r\n type: BATCH_SNOWFLAKE\r\n timestamp_field: \"EVENT_TIMESTAMP\"\r\n created_timestamp_column: \"CREATE_TIMESTAMP\"\r\n data_source_class_type: \"feast.infra.offline_stores.snowflake_source.SnowflakeSource\"\r\n snowflake_options {\r\n table: \"IRIS\"\r\n schema: \"MY_SCHEMA\"\r\n database: \"MY_DATABASE\"\r\n }\r\n project: \"myproject\"\r\n}\r\ndata_sources {\r\n type: BATCH_SNOWFLAKE\r\n timestamp_field: \"EVENT_TIMESTAMP\"\r\n created_timestamp_column: \"CREATE_TIMESTAMP\"\r\n data_source_class_type: \"feast.infra.offline_stores.snowflake_source.SnowflakeSource\"\r\n snowflake_options {\r\n table: \"IRIS\"\r\n schema: \"MY_SCHEMA\"\r\n database: \"MY_DATABASE\"\r\n }\r\n project: \"myproject\"\r\n}\r\ndata_sources {\r\n type: BATCH_SNOWFLAKE\r\n timestamp_field: \"EVENT_TIMESTAMP\"\r\n created_timestamp_column: \"CREATE_TIMESTAMP\"\r\n data_source_class_type: \"feast.infra.offline_stores.snowflake_source.SnowflakeSource\"\r\n snowflake_options {\r\n table: \"IRIS\"\r\n schema: \"MY_SCHEMA\"\r\n database: \"MY_DATABASE\"\r\n }\r\n project: \"myproject\"\r\n}\r\n```\r\n## Steps to reproduce\r\n\r\nHere's the `feature_store.yaml`:\r\n```yaml\r\nproject: myproject\r\nregistry: s3://mybucket/registry.db\r\nprovider: local\r\noffline_store:\r\n type: snowflake.offline\r\n account: myaccount\r\n user: myuser\r\n password: mypasssword\r\n role: myrole\r\n warehouse: mywarehouse\r\n database: mydatabase\r\nonline_store:\r\n type: redis\r\n connection_string: myredisconnectionstring\r\n```\r\n\r\nAnd here's the feature definition `feature_definitions/iris.py`:\r\n```python\r\nfrom datetime import 
timedelta\r\nfrom pathlib import Path\r\n\r\nfrom feast import Feature, FeatureView, SnowflakeSource, ValueType\r\n\r\nname = Path(__file__).stem\r\n\r\niris_source = SnowflakeSource(\r\n schema=\"MY_SCHEMA\",\r\n database=\"MY_DATABASE\",\r\n table=\"IRIS\",\r\n timestamp_field=\"EVENT_TIMESTAMP\",\r\n created_timestamp_column=\"CREATE_TIMESTAMP\"\r\n)\r\n\r\niris = FeatureView(\r\n name=name.upper(),\r\n entities=[],\r\n ttl=timedelta(weeks=52),\r\n schema=[\r\n Feature(name=\"PETAL_LENGTH\", dtype=ValueType.FLOAT),\r\n Feature(name=\"PETAL_WIDTH\", dtype=ValueType.FLOAT),\r\n Feature(name=\"SEPAL_LENGTH\", dtype=ValueType.FLOAT),\r\n Feature(name=\"SEPAL_WIDTH\", dtype=ValueType.FLOAT),\r\n Feature(name=\"SPECIES\", dtype=ValueType.INT64),\r\n ],\r\n online=True,\r\n source=iris_source,\r\n description=\"A sample feature view containing the Iris dataset.\"\r\n)\r\n```\r\n\r\n### Specifications\r\n\r\n- Version: `0.2.0`\r\n- Platform: AWS EC2 (Linux2 AMI)\r\n- Subsystem:\r\n\r\n## Possible Solution\r\n","author":{"url":"https://github.com/tkch3n","@type":"Person","name":"tkch3n"},"datePublished":"2022-04-20T20:50:31.000Z","interactionStatistic":{"@type":"InteractionCounter","interactionType":"https://schema.org/CommentAction","userInteractionCount":4},"url":"https://github.com/feast-dev/feast/issues/2581"}
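The duplication described in the issue body can be checked without reading the full protobuf dump by eye. The following is a minimal sketch, not part of the original report: `count_duplicates` is a hypothetical helper, and the plain dictionaries are stand-ins that mirror the three identical `data_sources` entries shown in the dump.

```python
from collections import Counter
from typing import Callable, Hashable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def count_duplicates(
    items: Iterable[T], key: Callable[[T], Hashable]
) -> List[Tuple[Hashable, int]]:
    """Return (key, count) pairs for every key that occurs more than once."""
    counts = Counter(key(item) for item in items)
    return [(k, n) for k, n in counts.items() if n > 1]


# Stand-in records (assumption): one dict per data_sources entry in the dump,
# repeated three times to mimic three consecutive `feast apply` runs.
sources = [
    {
        "type": "BATCH_SNOWFLAKE",
        "table": "IRIS",
        "schema": "MY_SCHEMA",
        "database": "MY_DATABASE",
        "project": "myproject",
    }
] * 3

# A sorted tuple of items gives a hashable key that ignores dict ordering.
duplicates = count_duplicates(sources, key=lambda s: tuple(sorted(s.items())))

for source_key, occurrences in duplicates:
    print(f"{occurrences} identical entries: {dict(source_key)}")
```

Against a real registry, the same helper could presumably be applied to `registry_proto.data_sources` with a key such as `lambda ds: ds.SerializeToString(deterministic=True)`, on the assumption that identical messages serialize to identical bytes within one process.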