Title: OnDemandFeatureView.feature_transformation.infer_features does not pass UDF outputs to python_type_to_feast_value_type · Issue #4308 · feast-dev/feast · GitHub
Description: Expected Behavior OnDemandFeatureView.feature_transformation.infer_features should be able to infer features from primitive python types for all supported feast data types, for all transformation backends. Current Behavior All on demand ...
Opengraph URL: https://github.com/feast-dev/feast/issues/4308
Author: alexmirrington (https://github.com/alexmirrington)
Date Published: 2024-06-24T08:49:49.000Z
Comments: 1

## Expected Behavior

`OnDemandFeatureView.feature_transformation.infer_features` should be able to infer features from primitive Python types for all supported Feast data types, for all transformation backends.

## Current Behavior

All on-demand feature views are currently broken for list types, as there is no way to bypass schema inference.

### Details

`OnDemandFeatureView.feature_transformation.infer_features` can only infer features whose type names appear in the type map inside `python_type_to_feast_value_type`, _i.e._

```Python
type_map = {
    "int": ValueType.INT64,
    "str": ValueType.STRING,
    "string": ValueType.STRING,  # pandas.StringDtype
    "float": ValueType.DOUBLE,
    "bytes": ValueType.BYTES,
    "float64": ValueType.DOUBLE,
    "float32": ValueType.FLOAT,
    "int64": ValueType.INT64,
    "uint64": ValueType.INT64,
    "int32": ValueType.INT32,
    "uint32": ValueType.INT32,
    "int16": ValueType.INT32,
    "uint16": ValueType.INT32,
    "uint8": ValueType.INT32,
    "int8": ValueType.INT32,
    "bool": ValueType.BOOL,
    "boolean": ValueType.BOOL,
    "timedelta": ValueType.UNIX_TIMESTAMP,
    "timestamp": ValueType.UNIX_TIMESTAMP,
    "datetime": ValueType.UNIX_TIMESTAMP,
    "datetime64[ns]": ValueType.UNIX_TIMESTAMP,
    "datetime64[ns, tz]": ValueType.UNIX_TIMESTAMP,
    "category": ValueType.STRING,
}
```

This is because, when a type such as `ValueType.FLOAT_LIST` has no mapping in the dictionary above and `value is None`, the `isinstance(value, dtype)` checks all fail and execution falls through to the `ValueError` in `python_type_to_feast_value_type`.

## Steps to reproduce

1. Initialize a new repository:

```bash
feast init
```

2. Modify the sample `on_demand_feature_view` to return an array of floats instead of individual floats, _e.g._

```diff
diff --git a/true_garfish/feature_repo/example_repo.py b/true_garfish/feature_repo/example_repo.py
index 1f5b946..59d4501 100644
--- a/true_garfish/feature_repo/example_repo.py
+++ b/true_garfish/feature_repo/example_repo.py
@@ -16,7 +16,7 @@ from feast import (
 from feast.feature_logging import LoggingConfig
 from feast.infra.offline_stores.file_source import FileLoggingDestination
 from feast.on_demand_feature_view import on_demand_feature_view
-from feast.types import Float32, Float64, Int64
+from feast.types import Float32, Float64, Int64, Array
 
 # Define an entity for the driver. You can think of an entity as a primary key used to
 # fetch features.
@@ -72,15 +72,16 @@ input_request = RequestSource(
 @on_demand_feature_view(
     sources=[driver_stats_fv, input_request],
     schema=[
-        Field(name="conv_rate_plus_val1", dtype=Float64),
-        Field(name="conv_rate_plus_val2", dtype=Float64),
+        Field(name="conv_rate_plus_vals", dtype=Array(Float64)),
     ],
 )
 def transformed_conv_rate(inputs: pd.DataFrame) -> pd.DataFrame:
-    df = pd.DataFrame()
-    df["conv_rate_plus_val1"] = inputs["conv_rate"] + inputs["val_to_add"]
-    df["conv_rate_plus_val2"] = inputs["conv_rate"] + inputs["val_to_add_2"]
-    return df
+    result = {"conv_rate_plus_vals": []}
+    for _, row in inputs.iterrows():
+        result["conv_rate_plus_vals"].append(
+            [row["conv_rate"] + row["val_to_add"], row["conv_rate"] + row["val_to_add_2"]]
+        )
+    return pd.DataFrame(data=result)
```

3. Run `feast apply`, and you should get the following error:

```bash
Traceback (most recent call last):
  File "~/.../.venv/bin/feast", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "~/.../.venv/lib/python3.12/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.../.venv/lib/python3.12/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "~/.../.venv/lib/python3.12/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.../.venv/lib/python3.12/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.../.venv/lib/python3.12/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.../.venv/lib/python3.12/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.../.venv/lib/python3.12/site-packages/feast/cli.py", line 506, in apply_total_command
    apply_total(repo_config, repo, skip_source_validation)
  File "~/.../.venv/lib/python3.12/site-packages/feast/repo_operations.py", line 347, in apply_total
    apply_total_with_repo_instance(
  File "~/.../.venv/lib/python3.12/site-packages/feast/repo_operations.py", line 299, in apply_total_with_repo_instance
    registry_diff, infra_diff, new_infra = store.plan(repo)
                                           ^^^^^^^^^^^^^^^^
  File "~/.../.venv/lib/python3.12/site-packages/feast/feature_store.py", line 745, in plan
    self._make_inferences(
  File "~/.../.venv/lib/python3.12/site-packages/feast/feature_store.py", line 640, in _make_inferences
    odfv.infer_features()
  File "~/.../.venv/lib/python3.12/site-packages/feast/on_demand_feature_view.py", line 521, in infer_features
    inferred_features = self.feature_transformation.infer_features(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/....venv/lib/python3.12/site-packages/feast/transformation/pandas_transformation.py", line 47, in infer_features
    python_type_to_feast_value_type(f, type_name=str(dt))
  File "~/.../.venv/lib/python3.12/site-packages/feast/type_map.py", line 215, in python_type_to_feast_value_type
    raise ValueError(
ValueError: Value with native type object cannot be converted into Feast value type
```

Adding some debug statements inside `python_type_to_feast_value_type` gives the following locals just before the error is raised:

```
name='conv_rate_plus_vals'
value=None
recurse=True
type_name='object'
type(value)=<class 'NoneType'>
```

As mentioned above, this happens because none of the transformation backends pass values to the type mapper, _e.g._ the [pandas backend in this case](https://github.com/feast-dev/feast/blob/master/sdk/python/feast/transformation/pandas_transformation.py#L47).

### Specifications

- Version: 0.39.0
- Platform: arm64
- Subsystem: macOS

## Possible Solution

- Pass the sample values generated for type inference through to the type mapper.
- Update the type mapper to handle lists that are two levels deep. Primitive UDF outputs are wrapped in either an `np.array` or a `list` of length 1, so a list-valued output is two levels deep, with the inner list holding the feature values.
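
The two suggestions under Possible Solution could look roughly like the sketch below. It is illustrative only and does not use Feast's real code: `ValueType` here is a hypothetical stand-in for a small subset of Feast's enum, and `infer_value_type` and `_scalar_type` are made-up helper names. It assumes the sample handed to the type mapper is a plain `list` of length 1 wrapping one row of UDF output, so a list-valued feature arrives as a list nested two levels deep.

```python
from enum import Enum
from typing import Any, Sequence


class ValueType(Enum):
    """Hypothetical stand-in for a subset of Feast's ValueType enum."""

    BOOL = "BOOL"
    INT64 = "INT64"
    DOUBLE = "DOUBLE"
    STRING = "STRING"
    BOOL_LIST = "BOOL_LIST"
    INT64_LIST = "INT64_LIST"
    DOUBLE_LIST = "DOUBLE_LIST"
    STRING_LIST = "STRING_LIST"


# Looked up by exact type, so that bool is not mistaken for int.
_SCALAR_TYPES = {
    bool: ValueType.BOOL,
    int: ValueType.INT64,
    float: ValueType.DOUBLE,
    str: ValueType.STRING,
}

_LIST_TYPES = {
    ValueType.BOOL: ValueType.BOOL_LIST,
    ValueType.INT64: ValueType.INT64_LIST,
    ValueType.DOUBLE: ValueType.DOUBLE_LIST,
    ValueType.STRING: ValueType.STRING_LIST,
}


def _scalar_type(value: Any) -> ValueType:
    """Map a single Python primitive to a scalar value type."""
    try:
        return _SCALAR_TYPES[type(value)]
    except KeyError:
        raise ValueError(
            f"Value with native type {type(value).__name__} cannot be converted"
        ) from None


def infer_value_type(sample: Sequence[Any]) -> ValueType:
    """Infer a value type from a length-1 sample of UDF output."""
    if len(sample) != 1:
        raise ValueError("expected a sample of length 1")
    value = sample[0]
    if isinstance(value, (list, tuple)):
        # Second level: the inner list holds the feature values.
        inner = {_scalar_type(v) for v in value}
        if len(inner) != 1:
            raise ValueError("cannot infer an element type from an empty or mixed list")
        return _LIST_TYPES[inner.pop()]
    return _scalar_type(value)


print(infer_value_type([[0.5, 1.5]]))  # list-valued feature
print(infer_value_type([0.5]))         # scalar feature
```

Because the sample value itself is inspected, the `value=None` / `type_name='object'` case from the traceback no longer decides the result. A real fix would also need to cover NumPy arrays and the remaining Feast types, which this sketch leaves out.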