Title: Feast materialize throws an unhandled "ZeroDivisionError: division by zero" exception · Issue #2399 · feast-dev/feast · GitHub
Description: Expected Behavior With feast version 0.19.3, feast materialize should not throw an unhandled exception In feast version 0.18.1, everything works as expected. → python feast_materialize.py Materializing 1 feature views from 2022-03-10 05:...
Opengraph URL: https://github.com/feast-dev/feast/issues/2399
Author: RenaultAI (https://github.com/RenaultAI)
Date published: 2022-03-11T05:44:57.000Z
Comments: 4

## Expected Behavior

With feast version `0.19.3`, `feast materialize` should not throw an unhandled exception.

In feast version `0.18.1`, everything works as expected.

```
→ python feast_materialize.py
Materializing 1 feature views from 2022-03-10 05:41:44-08:00 to 2022-03-11 05:41:44-08:00 into the dynamodb online store.

ryoung_division_by_zero_reproducer:
100%|█████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 19.34it/s]
```

## Current Behavior

```
→ python feast_materialize.py
/Users/ryoung/.pyenv/versions/3.8.10/lib/python3.8/importlib/__init__.py:127: DeprecationWarning: The toolz.compatibility module is no longer needed in Python 3 and has been deprecated. Please import these utilities directly from the standard library. This module will be removed in a future release.
  return _bootstrap._gcd_import(name[level:], package, level)
Materializing 1 feature views from 2022-03-10 05:42:56-08:00 to 2022-03-11 05:42:56-08:00 into the dynamodb online store.

ryoung_division_by_zero_reproducer:
Traceback (most recent call last):
  File "feast_materialize.py", line 32, in <module>
    fs.materialize(
  File "/Users/ryoung/.pyenv/versions/3.8.10/envs/python-monorepo-3.8.10/lib/python3.8/site-packages/feast/usage.py", line 269, in wrapper
    return func(*args, **kwargs)
  File "/Users/ryoung/.pyenv/versions/3.8.10/envs/python-monorepo-3.8.10/lib/python3.8/site-packages/feast/feature_store.py", line 1130, in materialize
    provider.materialize_single_feature_view(
  File "/Users/ryoung/.pyenv/versions/3.8.10/envs/python-monorepo-3.8.10/lib/python3.8/site-packages/feast/infra/passthrough_provider.py", line 154, in materialize_single_feature_view
    table = offline_job.to_arrow()
  File "/Users/ryoung/.pyenv/versions/3.8.10/envs/python-monorepo-3.8.10/lib/python3.8/site-packages/feast/infra/offline_stores/offline_store.py", line 121, in to_arrow
    return self._to_arrow_internal()
  File "/Users/ryoung/.pyenv/versions/3.8.10/envs/python-monorepo-3.8.10/lib/python3.8/site-packages/feast/usage.py", line 280, in wrapper
    raise exc.with_traceback(traceback)
  File "/Users/ryoung/.pyenv/versions/3.8.10/envs/python-monorepo-3.8.10/lib/python3.8/site-packages/feast/usage.py", line 269, in wrapper
    return func(*args, **kwargs)
  File "/Users/ryoung/.pyenv/versions/3.8.10/envs/python-monorepo-3.8.10/lib/python3.8/site-packages/feast/infra/offline_stores/file.py", line 75, in _to_arrow_internal
    df = self.evaluation_function().compute()
  File "/Users/ryoung/.pyenv/versions/3.8.10/envs/python-monorepo-3.8.10/lib/python3.8/site-packages/feast/infra/offline_stores/file.py", line 309, in evaluate_offline_job
    source_df = source_df.sort_values(by=event_timestamp_column)
  File "/Users/ryoung/.pyenv/versions/3.8.10/envs/python-monorepo-3.8.10/lib/python3.8/site-packages/dask/dataframe/core.py", line 4388, in sort_values
    return sort_values(
  File "/Users/ryoung/.pyenv/versions/3.8.10/envs/python-monorepo-3.8.10/lib/python3.8/site-packages/dask/dataframe/shuffle.py", line 146, in sort_values
    df = rearrange_by_divisions(
  File "/Users/ryoung/.pyenv/versions/3.8.10/envs/python-monorepo-3.8.10/lib/python3.8/site-packages/dask/dataframe/shuffle.py", line 446, in rearrange_by_divisions
    df3 = rearrange_by_column(
  File "/Users/ryoung/.pyenv/versions/3.8.10/envs/python-monorepo-3.8.10/lib/python3.8/site-packages/dask/dataframe/shuffle.py", line 473, in rearrange_by_column
    df = df.repartition(npartitions=npartitions)
  File "/Users/ryoung/.pyenv/versions/3.8.10/envs/python-monorepo-3.8.10/lib/python3.8/site-packages/dask/dataframe/core.py", line 1319, in repartition
    return repartition_npartitions(self, npartitions)
  File "/Users/ryoung/.pyenv/versions/3.8.10/envs/python-monorepo-3.8.10/lib/python3.8/site-packages/dask/dataframe/core.py", line 6859, in repartition_npartitions
    npartitions_ratio = df.npartitions / npartitions
ZeroDivisionError: division by zero
```

## Steps to reproduce
Create a list of feature records in PySpark and write them out as a parquet file.
```
from pyspark.sql import types as T

from datetime import datetime, timedelta

INPUT_SCHEMA = T.StructType(
    [
        T.StructField("id", T.StringType(), False),
        T.StructField("feature1", T.FloatType(), False),
        T.StructField("feature2", T.FloatType(), False),
        T.StructField("event_timestamp", T.TimestampType(), False),
    ]
)

now = datetime.now()
one_hour_ago = now - timedelta(hours=1)

feature_records = [
    {
        "id": "foo",
        "event_timestamp": one_hour_ago,
        "feature1": 5.50,
        "feature2": 7.50,
    },
    {
        "id": "bar",
        "event_timestamp": one_hour_ago,
        "feature1": -1.10,
        "feature2": 2.20,
    },
]

df = spark.createDataFrame(data=feature_records, schema=INPUT_SCHEMA)
df.show(truncate=False)

df.write.parquet(mode="overwrite", path="s3://XXX/reproducer/2022-03-11T05:34:51.599215/")
```

The output should look something like:
```
+---+--------+--------+--------------------------+
|id |feature1|feature2|event_timestamp           |
+---+--------+--------+--------------------------+
|foo|5.5     |7.5     |2022-03-11 04:35:39.318222|
|bar|-1.1    |2.2     |2022-03-11 04:35:39.318222|
+---+--------+--------+--------------------------+
```

Create a `feast_materialize.py` script.

```
from datetime import datetime, timedelta

from feast import FeatureStore, Entity, Feature, FeatureView, FileSource, ValueType

now = datetime.utcnow()
one_day_ago = now - timedelta(days=1)

s3_url = "s3://XXX/reproducer/2022-03-11T05:34:51.599215/"

offline_features_dump = FileSource(
    path=s3_url,
    event_timestamp_column="event_timestamp",
)

entity = Entity(name="id", value_type=ValueType.STRING)
feature_names = ["feature1", "feature2"]

feature_view = FeatureView(
    name="ryoung_division_by_zero_reproducer",
    entities=["id"],
    ttl=timedelta(days=30),
    features=[Feature(name=f, dtype=ValueType.FLOAT) for f in feature_names],
    online=True,
    batch_source=offline_features_dump,
)

fs = FeatureStore(".")
fs.apply(entity)
fs.apply(feature_view)

fs.materialize(
    start_date=one_day_ago,
    end_date=now,
    feature_views=["ryoung_division_by_zero_reproducer"],
)
```

Note that you need to supply your own S3 bucket.

### Specifications

- Version: `0.19.3`
- Platform: `Darwin Kernel Version 21.3.0`
- Subsystem:

## Possible Solution

I downgraded back to feast version `0.18.1`.
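
The last frame of the traceback fails on `npartitions_ratio = df.npartitions / npartitions`, which means the target `npartitions` passed to Dask's `repartition` was `0`. The report does not establish why that value is zero. The following is a minimal sketch that reproduces only that arithmetic with plain integers instead of a Dask DataFrame. The function names `npartitions_ratio` and `safe_target_npartitions` are hypothetical, and the guard is an illustration, not the fix adopted by Feast or Dask.

```python
def npartitions_ratio(current_npartitions: int, target_npartitions: int) -> float:
    """Mirror the failing expression from the traceback:
    df.npartitions / npartitions."""
    return current_npartitions / target_npartitions


def safe_target_npartitions(target_npartitions: int) -> int:
    """Hypothetical guard (an assumption, not library code):
    never request fewer than one partition."""
    return max(1, target_npartitions)


# A target of zero partitions reproduces the error class seen in the report.
raised = False
try:
    npartitions_ratio(1, 0)
except ZeroDivisionError:
    raised = True

# Clamping the target to at least one partition avoids the division by zero.
guarded_ratio = npartitions_ratio(1, safe_target_npartitions(0))
```

Clamping only hides the symptom. Whether a zero partition count is legitimate for this input is a question the traceback alone cannot answer.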