René's URL Explorer Experiment


Title: Feast API: Feature references, concept hierarchy, and data model · Issue #479 · feast-dev/feast · GitHub

Description: This issue is meant to be a discussion of the current Feast API as it relates to feature references, a key component of the user facing API. Additionally, it will also discuss the current data model and our concept hierarchy. 1. Backgrou...

Opengraph URL: https://github.com/feast-dev/feast/issues/479

Domain: github.com


Hey, it has json ld scripts:
{"@context":"https://schema.org","@type":"DiscussionForumPosting","headline":"Feast API: Feature references, concept hierarchy, and data model","articleBody":"This issue is meant to be a discussion of the current Feast API as it relates to `feature references`, a key component of the user facing API. It will also discuss the current data model and our concept hierarchy. \r\n\r\n## 1. Background\r\nThe Feast user facing API and data model changed dramatically from 0.1 to 0.2+. The original intention was to simplify the API as much as possible and gradually evolve it as new user requirements became available.\r\n\r\nTwo important reference documents on this topic are\r\n* [Feast 0.3 RFC](https://docs.google.com/document/d/1QnUQWhwJ1fDnMQ4sZdUdLa_4C-x5Hhpw3C8nYLg4nsY/edit)\r\n* [Feast Projects RFC](https://docs.google.com/document/d/14-QBz9X8zK_aGAY2ti43a7PqMxvqp0035ec7QYgmEBM/edit#)\r\n\r\n## 2. Problem statement\r\nThe Feast API is evolving as more and more teams adopt the software and share their requirements with us. In most cases this means an expansion of the API, but in some cases it means a reversal. \r\n\r\nWith the introduction of projects into Feast ([Feast Projects RFC](https://docs.google.com/document/d/14-QBz9X8zK_aGAY2ti43a7PqMxvqp0035ec7QYgmEBM/edit#)), our API has evolved again. This change has affected feature references, the data model, and concept hierarchy.\r\n\r\nThe most critical feedback on this change has been that it introduces unnecessary complexity to address problems (isolation, namespacing, security) that could be solved in a different way.\r\n\r\n## 3. 
Objective\r\nThe point of this GitHub issue is to settle our API for feature references, our concept hierarchy, and data model in such a way that we\r\n* Meet all our known requirements for future development\r\n* Minimize user facing changes and migration requirements\r\n* Maintain flexibility in accepting new user requirements and evolving our API\r\n\r\nPut simply, we want to make sure that we are on the right path and make the necessary changes now when it's least disruptive.\r\n\r\n## 4. What are feature references?\r\nFeature references (previously Feature Ids) are strings/objects within Feast that allow Feast and users of Feast to reference specific features. Feature references are primarily used as a means of indicating to Feast which features a user would like to retrieve.\r\n\r\nOriginally, feature references were defined as follows\r\n`\u003cfeature-set\u003e:\u003cfeature-name\u003e:\u003cfeature-version\u003e`\r\nAll parts of the above reference were required at the time.\r\n\r\nFeature references have recently been updated (as part of the [Projects RFC](https://docs.google.com/document/d/14-QBz9X8zK_aGAY2ti43a7PqMxvqp0035ec7QYgmEBM/edit#))\r\n\r\nThe move towards project namespaces places feature sets and features/entities into the following hierarchy\r\n![Screenshot from 2020-02-18 10-23-19](https://user-images.githubusercontent.com/6728866/74698640-8c60f780-5239-11ea-8583-53671dd55ac2.png)\r\n\r\nFeature references are now defined as: `\u003cproject\u003e/\u003cfeature-name\u003e:\u003cfeature-version\u003e`\r\n\r\nThe following constraints apply\r\n- Versions are optional. If no version is provided then the latest version of a feature is used.\r\n- Feature names must be unique within a project (even across feature sets within that project).\r\n- Entity names must be unique within a project (but can be reused across feature sets).\r\n\r\nOne of our primary motivations was to allow users to reference features directly by name. 
With `versions` becoming optional and allowing the `project` to be set externally, this is now possible. Users can provide features as a list of feature names\r\n\r\nAn example of feature references being used is shown below (from the Python SDK):\r\n```\r\nonline_features = client.get_online_features(\r\n    feature_refs=[\r\n        \"daily_transactions\",\r\n        \"total_transactions\",\r\n    ],\r\n    entity_rows=entity_rows,\r\n)\r\n```\r\n\r\n## 5. How are feature references used?\r\n### 5.1 During online serving\r\nDuring online serving the user will provide two sets of information to Feast during feature retrieval. \r\n- A list of feature references \r\n- A list of entities\r\n\r\nFeast wants to construct a response object with all of the data from these features on all of these entities.\r\n\r\nFor example, if a user sends a request with a single feature reference as `daily_transactions`, Feast will attempt to add the missing information. It will add the `project` id (which currently must be provided by the user), it will then determine the `feature set` that contains that feature name, and then finally it will determine the latest `version` of the feature set in which the feature occurs. \r\n\r\nInternally, Feast is left with something that resembles the following\r\n`my_customer_project/my_customer_feature_set:daily_transactions:3`\r\n\r\nSince features are stored based on feature sets, Feast first converts the above into what we can informally define as a feature set reference, resembling the following\r\n`\u003cproject\u003e/\u003cfeature-set-name\u003e:\u003cfeature-set-version\u003e`\r\nor tangibly\r\n`my_customer_project/my_customer_feature_set:3`\r\n\r\nIn the case of Redis, Feast will use the above feature set reference, along with the entities the user has provided, to construct a list of keys to look up. 
The responses from the database are then used to build a response object that is returned to the user.\r\n\r\n### 5.2 During batch serving\r\nThe batch serving case is very similar to the online serving case, but with more complexity on queries and joins.\r\n\r\nThe user provides the following during batch retrieval\r\n- A list of feature references \r\n- A list of entities paired with timestamps\r\n\r\nFeature references are converted into their full form, as well as used to create feature set references (as in online serving). In the case of BigQuery, the feature set reference maps directly to a table. For each feature set table that Feast needs to query features from, Feast runs a point in time correct query using the entities+timestamps for the specific feature columns. This produces a resultant table with the user's requested feature data, over the timestamps and features, but for one specific feature set. \r\n\r\nFeast then uses the entity columns in each feature set table as a means of joining the results of these sub-queries into a single resultant dataframe. \r\n\r\n### 5.3 During ingestion of data into stores\r\nWhen loading data into Feast, data first needs to be converted into [FeatureRow](https://github.com/gojek/feast/blob/master/protos/feast/types/FeatureRow.proto#L28) format and then pushed into a Kafka stream. \r\n\r\nDuring this conversion to feature row form, it is necessary to set a field called `feature_set` with the feature set reference. To reiterate, the feature set reference looks something like:\r\n`\u003cproject\u003e/\u003cfeature-set-name\u003e:\u003cfeature-set-version\u003e`\r\n\r\nIngestion jobs that pick up these rows are then able to easily identify the row as belonging to a specific project and feature set. The jobs then write all of these rows to all of the stores that subscribe to these feature sets. \r\n\r\n## 6. 
Problems with the current implementation\r\n### 6.1 **Feature set versions are unnecessary:**\r\nThe concept of feature set versions was introduced in order to allow users to reuse feature set names. However, they add additional complexity at both ingestion time as well as retrieval time. Users need to maintain a knowledge of the correct version of feature set to ingest data to and to retrieve data from. If they don't pin their retrieval to a specific version then they risk having their system go down at a version increment.\r\n\r\n### 6.2 **Projects could be unnecessary at the top of the concept hierarchy:**\r\nThe concept of projects was introduced to provide a means of \r\n* **Isolation between users:** Users can register the same feature sets and features within their own project namespace without conflicts arising between users.\r\n* **Access control:** Projects provide a top level hierarchy that makes access control more convenient to implement\r\n* **Ease of feature retrieval:** By introducing naming constraints at the project level, it is easier to logically group and reference features by name. Thus, projects provide a way of grouping based on retrieval where feature sets provide a means of grouping based on ingestion.\r\n\r\nThe problem with `projects` is that it introduces a layer into the concept hierarchy that makes Feast harder to understand and could be introducing unnecessary complexity. It's possible that all of the above requirements for introducing projects could be addressed while still maintaining feature sets as the top level concept.\r\n\r\n### 6.3 **Projects are a cause for code smell in the data model:**\r\nThere are currently three locations where projects occur.\r\n1. Ingestion (FeatureRows)\r\n2. Stores (tables and keys)\r\n3. Serving/retrieval (incoming queries)\r\n\r\nThe current approach has code smell in the fact that FeatureRows have to know their own identity. 
Today, having each FeatureRow know its own identity allows Feast to consume from topics that contain mixed feature sets (versions and names). Feast is able to differentiate FeatureRows from each other and can know how to interpret their contents based on a feature reference contained within the row.\r\n\r\nHowever, in the case that Feast were to consume features from an external stream that it had no control over (not even the data model), Feast would not have the feature set reference conveniently available inside the event payload. \r\n\r\nThe second occurrence of projects is in the store. Tables are currently named according to `projectName_featureSet_version`. Projects are a necessity here since feature set names can be duplicated across projects. However, projects are not essential complexity in the same way a feature set is, and don't seem natural to encode into the data model itself.\r\n\r\n### 6.4 **Feature sets are a leaky abstraction**: \r\nFeature sets are a core part of the existing data model. Feature data is stored on a feature set within a Feast store like Redis or BigQuery. In order to find the features a user is looking for, it is still necessary to determine the feature set they need from their `feature reference`. This seems to work at retrieval time since Feast Serving can maintain a cache of available feature sets (albeit introducing a new inefficiency during lookup). Two problems exist here:\r\n\r\n1. There is a disconnect between how users are producing data (`feature set references`) and how users are consuming data (`feature references`). Users are loading in FeatureRows into feature sets, but they are querying out features from projects. Ideally these two concepts wouldn't be so distinct.\r\n2. Currently, feature references are defined as follows: `\u003cproject\u003e/\u003cfeature-name\u003e:\u003cfeature-version\u003e`. However, the concept of a `feature-version` doesn't exist. 
Features are currently inheriting their version from their feature set. So right now `feature references` still contain trace information about the parent feature set.","author":{"url":"https://github.com/woop","@type":"Person","name":"woop"},"datePublished":"2020-02-18T04:48:11.000Z","interactionStatistic":{"@type":"InteractionCounter","interactionType":"https://schema.org/CommentAction","userInteractionCount":23},"url":"https://github.com/feast-dev/feast/issues/479"}
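
The resolution steps described in section 5.1 of the issue body can be sketched roughly as follows. This is a minimal illustration, not Feast's implementation: `FeatureRef`, `parse_feature_ref`, `to_feature_set_ref`, and the in-memory `REGISTRY` are hypothetical names invented here, and the only things taken from the issue are the reference formats `<project>/<feature-name>:<feature-version>` and `<project>/<feature-set-name>:<feature-set-version>` and the example names.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class FeatureRef:
    """Parsed form of `<project>/<feature-name>:<feature-version>` (hypothetical type)."""
    name: str
    project: Optional[str] = None
    version: Optional[int] = None


def parse_feature_ref(ref: str) -> FeatureRef:
    """Split a reference string; project and version are both optional."""
    project, sep, rest = ref.partition("/")
    if not sep:
        # No "/" present, so the whole string is name[:version].
        project, rest = "", ref
    name, sep, version = rest.partition(":")
    if not name:
        raise ValueError(f"missing feature name in {ref!r}")
    return FeatureRef(
        name=name,
        project=project or None,
        version=int(version) if sep else None,
    )


# Hypothetical registry: (project, feature name) -> (feature set, latest version).
# It stands in for whatever lookup the serving layer actually performs.
REGISTRY: Dict[Tuple[str, str], Tuple[str, int]] = {
    ("my_customer_project", "daily_transactions"): ("my_customer_feature_set", 3),
}


def to_feature_set_ref(ref: str, default_project: str) -> str:
    """Resolve a feature reference to `<project>/<feature-set-name>:<feature-set-version>`."""
    parsed = parse_feature_ref(ref)
    # The project may be set externally when the reference omits it.
    project = parsed.project or default_project
    feature_set, latest = REGISTRY[(project, parsed.name)]
    # Fall back to the latest version when none is pinned.
    version = parsed.version if parsed.version is not None else latest
    return f"{project}/{feature_set}:{version}"


print(to_feature_set_ref("daily_transactions", "my_customer_project"))
```

Because feature names are assumed to be unique within a project, the (project, name) pair is enough to find the owning feature set, which is why the sketch keys the registry that way.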


Links:

Feast 0.3 RFC: https://docs.google.com/document/d/1QnUQWhwJ1fDnMQ4sZdUdLa_4C-x5Hhpw3C8nYLg4nsY/edit
Feast Projects RFC: https://docs.google.com/document/d/14-QBz9X8zK_aGAY2ti43a7PqMxvqp0035ec7QYgmEBM/edit
Screenshot from 2020-02-18 10-23-19: https://user-images.githubusercontent.com/6728866/74698640-8c60f780-5239-11ea-8583-53671dd55ac2.png
FeatureRow: https://github.com/gojek/feast/blob/master/protos/feast/types/FeatureRow.proto#L28
