Title: Refactor Source & Job data model and Stop Duplicate Ingestion Jobs by woop · Pull Request #685 · feast-dev/feast · GitHub
Description: What this PR does / why we need it:

1. Generalise Source Model. The current Source model in Feast Core is Kafka-specific. For all intents and purposes it is a hardcoded implementation of KafkaSource, with topics and brokers as top-level fields, despite being named Source. Not generalizing the data model at this point (prior to the release of 0.5) will cause further problems down the road when new sources are introduced.
- This PR moves Source configuration into a config object and isolates Kafka-specific logic to case statements.
- isDefault is retained for the time being as a top-level field, since it can easily be phased out if needed.
- Configuration is stored in String format.
- Comparison between Source objects with .equals() no longer takes the isDefault field into account.
- Under this model, identical Source objects (i.e. objects that are equal under .equals()) can be stored as duplicate Source objects. De-duplication code has been added to JobCoordinatorService to take this into account.

2. Make Feast stop duplicate ingestion Jobs. Currently JobCoordinatorService does not stop duplicate jobs (i.e. ingestion jobs that ingest from the exact same source-to-store pairing). This PR updates JobCoordinatorService to abort these extra ingestion Jobs when it is safe to do so (i.e. only when JobCoordinatorService can find a running ingestion job for each Source-to-Store pairing).

3. Job Model Refactors.
- Standardized and updated the JobManager API: startJob() is standardised as transitioning a Job from PENDING to RUNNING, and abortJob() is standardised as transitioning a Job from RUNNING to ABORTING. abortJob() now takes a Job as its argument and returns a Job, to be consistent with the other methods.
- Refactored JobUpdateTask.call() to be easier to follow.
- Refactored JobCoordinatorService.poll() into multiple methods (i.e. getSourceToStoreMapping(), makeJobUpdateTasks()) to make the code more readable.
- Updated Job to store source fields (i.e. type and config) as inline fields in the Job table. This makes the Job model more consistent with the ingestion Job it represents: modifying the Source that the Job model references does not carry over to the underlying ingestion Job, so the source fields are copied onto Job to reflect this in the Job model.

Which issue(s) this PR fixes: Fixes #632

Does this PR introduce a user-facing change?:
- The database schema for Source has been generalized. This is a breaking change and requires a migration.
- The database schema for Job has changed. The Job table no longer stores Sources by id; instead it stores Source.config and Source.type as inline fields.
- Feast now stops duplicate ingestion Jobs with the same source and store pairing.
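
The source de-duplication and duplicate-job handling described above can be illustrated with a minimal sketch. This is not the Feast Core code (which is Java); it is a Python illustration under stated assumptions. The names `dedupe_sources` and `jobs_to_abort`, the shape of the `Job` fields, and the example config string are all hypothetical, and the "safe" condition is one reading of the description: extra jobs are only selected for abort if every source-to-store pairing already has a running job.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Tuple


class JobStatus(Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    ABORTING = "ABORTING"


@dataclass(frozen=True)
class Source:
    type: str
    config: str  # configuration kept as an opaque string
    # Excluded from equality and hashing, mirroring the .equals() change.
    is_default: bool = field(default=False, compare=False)


@dataclass
class Job:
    id: str
    source_type: str    # source fields copied inline onto the job
    source_config: str
    store: str
    status: JobStatus


def dedupe_sources(sources: List[Source]) -> List[Source]:
    """Drop sources that compare equal, keeping the first occurrence."""
    return list(dict.fromkeys(sources))


def abort_job(job: Job) -> Job:
    """Transition RUNNING -> ABORTING; takes and returns a Job."""
    if job.status is not JobStatus.RUNNING:
        raise ValueError(f"cannot abort job {job.id} in state {job.status}")
    job.status = JobStatus.ABORTING
    return job


def jobs_to_abort(jobs: List[Job], pairings: List[Tuple[Source, str]]) -> List[Job]:
    """Return the extra running jobs per pairing, or [] if aborting is unsafe."""
    running: Dict[Tuple[Source, str], List[Job]] = {}
    for job in jobs:
        if job.status is JobStatus.RUNNING:
            key = (Source(job.source_type, job.source_config), job.store)
            running.setdefault(key, []).append(job)
    # Assumed safety rule: every pairing must already have a running job.
    if any(pairing not in running for pairing in pairings):
        return []
    extras: List[Job] = []
    for pairing in dict.fromkeys(pairings):
        extras.extend(running[pairing][1:])  # keep the first job, flag the rest
    return extras
```

In this sketch two sources that differ only in `is_default` collapse to one, and when two running jobs share a source and store, only the second is returned for `abort_job`.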
Opengraph URL: https://github.com/feast-dev/feast/pull/685