Title: Remove feature "granularity" and relegate to metadata · Issue #17 · feast-dev/feast · GitHub
Description: [edit] Granularities are no longer required in FeatureRow, or FeatureSpecs, as we have removed history from serving store and the serving api. Thus there is also no requirement for it to be in the warehouse store. Additionally the notion...
Opengraph URL: https://github.com/feast-dev/feast/issues/17
{"@context":"https://schema.org","@type":"DiscussionForumPosting","headline":"Remove feature \"granularity\" and relegate to metadata","articleBody":"[edit] Granularities are no longer required in FeatureRow, or FeatureSpecs, as we have removed history from serving store and the serving api. Thus there is also no requirement for it to be in the warehouse store. Additionally the notion of granularity has proven to be confusing to end users. History of issue kept below:\r\n\r\n\r\nI'd like to discuss feature granularities.\r\n\r\n## What is granularity\r\n\r\nCurrently we have a fixed set of feast granularities {seconds, minutes, hours, days}.\r\nIt is not always obvious what the feast granularity refers to.\r\n\r\nIn general a feature is handled by a few different datetimes throughout its lifecycle:\r\n\r\n* the **window** duration of an aggregation (this is upstream to feast)\r\n* the trigger **frequency** at which an event is emitted per key, likely irregular if more than once per window (this is upstream to feast)\r\n* the **ingestion event timestamp** that Feast receives during ingestion, determined by the feature creator\r\n* the **storage event timestamp** used to store and retrieve features in Feast, determined by feast. \r\n\r\nThe storage event timestamp is derived by rounding the ingestion event timestamp to the start of the granularity for all the features in a feature row. E.g. for a granularity of 1 hour, we round the ingestion timestamp to the start of the enclosing hour.\r\n\r\nFor example, say we have a feature that is aggregated over a 1 hour fixed window and triggered every one minute. Each minute an update of the 1 hour window aggregation is provided. We would naturally use a 1 hour granularity for this. The ingestion event timestamp should be within the one hour window. 
The storage event timestamp would be the start of the window.\r\n\r\nAnother example, say we have a feature that is aggregated over a 10 minute sliding window, and triggered only once at the end of every window. In this case, the feast granularity actually needs to be 1 minute, which can seem confusing. \r\n\r\n## Limitations of current approach\r\nFeast rounds the ingested timestamps to a granularity provided at creation. This seemed a convenience, but it hinders the use of custom granularities and it can cause confusion.\r\n\r\nFor example: because the granularities are an enum, there is no 5 minute option. If we wanted to store and overwrite a new key every five minutes, we would need to use a finer granularity and manually round the ingestion timestamps to the 5 minute marks during feature creation. \r\n\r\nAnother example: Let's say we have a feature called \"product.day.sold\". As it is updated throughout the day, it could represent the number of products sold on that day so far, or just as easily it could represent the number of products sold in the last 24 hours at the time it was updated. It could also represent the last 7 days of sold products as it stood on that particular day. Basically the meaning of this feature is determined by how the feature was created. The **feature granularity is not enough information**, and could be misleading when feature creators are forced to work around its limitations.\r\n\r\nI suggest that instead of attempting to handle granularities, we should just require that rounding the timestamps should always happen during feature creation, not within Feast, and we should simply store features against the event timestamp provided. \r\n\r\nThe problem of how to serve keys if we do not have a fixed granularity is not as bad as it sounds.\r\n* firstly, it is only an issue at all when a feature is requested across a time range, not \"latest\". 
And \"latest\" is the most common request.\r\n* secondly, our currently supported stores, BigTable and Redis, both support scans across key date ranges (Redis via our bucketing approach).\r\n\r\nAnother problem is how we prevent feature creators from over-polluting a key space with far too granular timestamps. We will still have this problem regardless, as a feature creator can always use the \"seconds\" granularity.\r\n\r\n## My proposal\r\n\r\n* The storage event timestamp should be the same thing as the ingestion event timestamp.\r\n* We should drop granularity from FeatureRow and ignore it for ingestion and storage purposes. \r\n* We should drop the requirement that granularity is part of the featureId. So instead of {entityName}.{granularity}.{featureName}, it should just be {entityName}.{featureName}.\r\n* BigQuery tables (which are currently separated by granularities) should instead be separated by a feature's group\r\n\r\nWe would be committing to a requirement that timely short scans across a key range are supported by all stores.\r\n\r\n## Benefits\r\n\r\n* An easier to understand data model.\r\n* Enables storing at custom granularities.\r\n* Simplified code\r\n\r\nWhat do people think?\r\nIs there an issue with serving I have missed?","author":{"url":"https://github.com/tims","@type":"Person","name":"tims"},"datePublished":"2018-12-26T04:48:51.000Z","interactionStatistic":{"@type":"InteractionCounter","interactionType":"https://schema.org/CommentAction","userInteractionCount":8},"url":"https://github.com/feast-dev/feast/issues/17"}
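
To make the proposal in the issue body concrete, here is a minimal Python sketch of what "rounding during feature creation" could look like once Feast no longer does it. The helper names `floor_to_granularity` and `strip_granularity` are hypothetical and are not part of any Feast API. The first floors an ingestion event timestamp to an arbitrary granularity (such as the 5 minute case the enum cannot express), and the second shows the proposed featureId change from `{entityName}.{granularity}.{featureName}` to `{entityName}.{featureName}`.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical helpers for illustration only; not part of Feast.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def floor_to_granularity(ts: datetime, granularity: timedelta) -> datetime:
    """Round a timezone-aware timestamp down to the start of its enclosing window.

    Windows are aligned to the Unix epoch, so a 5 minute granularity yields
    :00, :05, :10, ... and a 1 hour granularity yields the start of the hour.
    """
    if ts.tzinfo is None:
        raise ValueError("expected a timezone-aware timestamp")
    if granularity <= timedelta(0):
        raise ValueError("granularity must be positive")
    # timedelta // timedelta gives the number of whole windows since the epoch.
    whole_windows = (ts - EPOCH) // granularity
    return EPOCH + whole_windows * granularity


def strip_granularity(feature_id: str) -> str:
    """Convert '{entityName}.{granularity}.{featureName}' to '{entityName}.{featureName}'."""
    parts = feature_id.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected three dot-separated parts, got {feature_id!r}")
    entity_name, _granularity, feature_name = parts
    return f"{entity_name}.{feature_name}"


if __name__ == "__main__":
    event_time = datetime(2018, 12, 26, 4, 48, 51, tzinfo=timezone.utc)
    print(floor_to_granularity(event_time, timedelta(minutes=5)))
    print(floor_to_granularity(event_time, timedelta(hours=1)))
    print(strip_granularity("product.day.sold"))
```

Under this sketch the feature creator picks the window size and applies the rounding before sending rows, and the store simply keeps whatever event timestamp it receives.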