René's URL Explorer Experiment


Title: GitHub - exajobs/data-engineering-collection: A collection of awesome software, libraries, Learning Tutorials, documents, books, resources and interesting stuff about Big Data Science & Engineering

Description: A collection of awesome software, libraries, Learning Tutorials, documents, books, resources and interesting stuff about Big Data Science & Engineering - exajobs/data-engineering-collection

Opengraph URL: https://github.com/exajobs/data-engineering-collection

X: @github

Domain: patch-diff.githubusercontent.com

Links:

exajobs https://patch-diff.githubusercontent.com/exajobs
data-engineering-collectionhttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection
MIT license https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection/blob/main/LICENSE
11 stars https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection/stargazers
1 fork https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection/forks
Branches https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection/branches
Tags https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection/tags
Activity https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection/activity
Code https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection
Issues 0 https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection/issues
Pull requests 0 https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection/pulls
Actions https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection/actions
Projects 0 https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection/projects
Security 0 https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection/security
Insights https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection/pulse
8 Commitshttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection/commits/main/
.gitignorehttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection/blob/main/.gitignore
LICENSEhttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection/blob/main/LICENSE
README.mdhttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection/blob/main/README.md
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#data-engineering-collection
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#table-of-contents-
Data Engineeringhttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#data-engineering
RDBMShttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#rdbms
Frameworkshttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#frameworks
Distributed Programminghttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#distributed-programming
Distributed Filesystemhttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#distributed-filesystem
Distributed Indexhttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#distributed-index
Document Data Modelhttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#document-data-model
Key Map Data Modelhttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#key-map-data-model
Key-value Data Modelhttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#key-value-data-model
Graph Data Modelhttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#graph-data-model
Databaseshttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#databases
Columnar Databaseshttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#columnar-databases
NewSQL Databaseshttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#newsql-databases
Time-Series Databaseshttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#time-series-databases
SQL-like processinghttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#sql-like-processing
Data Ingestionhttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#data-ingestion
Service Programminghttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#service-programming
Schedulinghttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#scheduling
Machine Learninghttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#machine-learning
Benchmarkinghttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#benchmarking
Securityhttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#security
System Deploymenthttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#system-deployment
Applicationshttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#applications
Search engine and frameworkhttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#search-engine-and-framework
MySQL forks and evolutionshttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#mysql-forks-and-evolutions
PostgreSQL forks and evolutionshttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#postgresql-forks-and-evolutions
Memcached forks and evolutionshttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#memcached-forks-and-evolutions
Embedded Databaseshttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#embedded-databases
Business Intelligencehttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#business-intelligence
Data Visualizationhttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#data-visualization
Internet of things and sensor datahttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#internet-of-things-and-sensor-data
Interesting Readingshttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#interesting-readings
Interesting Papershttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#interesting-papers
2015 - 2016https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#2015---2016
2013 - 2014https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#2013---2014
2011 - 2012https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#2011---2012
2001 - 2010https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#2001---2010
Videoshttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#videos
Bookshttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#books
Streaminghttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#streaming
Distributed systemshttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#distributed-systems
Graph Based approachhttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#graph-based-approach
Data Visualizationhttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#data-visualization-1
Other Awesome Listshttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#other-awesome-lists
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#rdbms-
MySQLhttps://www.mysql.com/
PostgreSQLhttps://www.postgresql.org/
Oracle Databasehttp://www.oracle.com/us/corporate/features/database-12c/index.html
Teradatahttp://www.teradata.com/products-and-services/teradata-database/
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#frameworks-
Bistrohttps://github.com/facebook/bistro
IBM Streamshttps://www.ibm.com/analytics/us/en/technology/stream-computing/
Apache Hadoophttp://hadoop.apache.org/
Tigonhttps://github.com/caskdata/tigon
Pachydermhttp://pachyderm.io/
Polyaxonhttps://github.com/polyaxon/polyaxon
Smookshttps://github.com/smooks/smooks
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#distributed-programming-
AddThis Hydrahttps://github.com/addthis/hydra
AMPLab SIMRhttp://databricks.github.io/simr/
Apache APEXhttps://apex.apache.org/
Apache Beamhttps://beam.apache.org/
Apache Crunchhttp://crunch.apache.org/
Apache DataFuhttp://incubator.apache.org/projects/datafu.html
Apache Flinkhttp://flink.apache.org/
Apache Gearpumphttp://gearpump.apache.org/
Apache Gorahttp://gora.apache.org/
Apache Hamahttp://hama.apache.org/
Apache MapReducehttps://wiki.apache.org/hadoop/MapReduce/
Apache Pighttps://pig.apache.org/
Apache REEFhttp://reef.apache.org/
Apache S4http://incubator.apache.org/projects/s4.html
Apache Sparkhttp://spark.apache.org/
Apache Spark Streaminghttps://spark.apache.org/docs/latest/streaming-programming-guide.html
Apache Stormhttp://storm.apache.org
Apache Samzahttp://samza.apache.org/
Apache Tezhttp://tez.apache.org/
Apache Twillhttps://incubator.apache.org/projects/twill.html
Baidu Bigflowhttp://bigflow.cloud/en/index.html
Cascaloghttp://cascalog.org/
Cheetahhttp://vldbarc.org/pvldb/vldb2010/pvldb_vol3/I08.pdf
Concurrent Cascadinghttp://www.cascading.org/
Damballa Parkourhttps://github.com/damballa/parkour
Datasalt Pangoolhttps://github.com/datasalt/pangool
DataTorrent StrAMhttps://www.datatorrent.com/
Facebook Coronahttps://www.facebook.com/notes/facebook-engineering/under-the-hood-scheduling-mapreduce-jobs-more-efficiently-with-corona/10151142560538920
Facebook Peregrinehttp://peregrine_mapreduce.bitbucket.org/
Facebook Scubahttps://www.facebook.com/notes/facebook-engineering/under-the-hood-data-diving-with-scuba/10150599692628920
Google Dataflowhttps://googledevelopers.blogspot.it/2014/06/cloud-platform-at-google-io-new-big.html
Google MapReducehttps://research.google.com/archive/mapreduce.html
Google MillWheelhttps://research.google.com/pubs/pub41378.html
IBM Streamshttps://www.ibm.com/analytics/us/en/technology/stream-computing/
JAQLhttps://code.google.com/p/jaql/
Kitehttp://kitesdk.org/docs/current/
Metamarkets Druidhttp://druid.io/
Netflix PigPenhttps://github.com/Netflix/PigPen
Nokia Discohttp://discoproject.org/
Onyxhttp://www.onyxplatform.org/
Pinterest Pinlaterhttps://medium.com/@Pinterest_Engineering/pinlater-an-asynchronous-job-execution-system-b8664cb8aa7d
Pydoophttp://crs4.github.io/pydoop/
Rayhttps://github.com/ray-project/ray
Rackerlabs Bluefloodhttp://blueflood.io/
Skalehttps://github.com/skale-me/skale-engine
Stratospherehttp://stratosphere.eu/
Streamdrillhttps://streamdrill.com/
streamsx.topologyhttps://github.com/IBMStreams/streamsx.topology
Tuktuhttps://github.com/UnderstandLingBV/Tuktu
Twitter Heronhttps://github.com/twitter/heron
Twitter Scaldinghttps://github.com/twitter/scalding
Twitter Summingbirdhttps://github.com/twitter/summingbird
Twitter TSARhttps://blog.twitter.com/engineering/en_us/a/2014/tsar-a-timeseries-aggregator.html
Wallaroohttp://www.wallaroolabs.com/community
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#distributed-filesystem-
Ambryhttps://github.com/linkedin/ambry
Apache HDFShttp://hadoop.apache.org/
Apache Kuduhttp://kudu.apache.org/
BeeGFShttps://www.beegfs.io/content/
Ceph Filesystemhttp://ceph.com/ceph-storage/file-system/
Disco DDFShttp://disco.readthedocs.org/en/latest/howto/ddfs.html
Facebook Haystackhttps://www.facebook.com/note.php?note_id=76191543919
Google GFShttp://static.googleusercontent.com/media/research.google.com/en//archive/gfs-sosp2003.pdf
Google Megastorehttps://research.google.com/pubs/pub36971.html
GridGainhttps://www.gridgain.com/
Lustre file systemhttp://wiki.lustre.org/
Microsoft Azure Data Lake Storehttps://hadoop.apache.org/docs/current/hadoop-azure-datalake/index.html
Quantcast File System QFShttps://www.quantcast.com/about-us/quantcast-file-system/
Red Hat GlusterFShttp://gluster.org/
Seaweed-FShttps://github.com/chrislusf/seaweedfs
Alluxiohttp://www.alluxio.org/
Tahoe-LAFShttps://www.tahoe-lafs.org/trac/tahoe-lafs
Baidu File Systemhttps://github.com/baidu/bfs
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#distributed-index-
Pilosahttps://github.com/pilosa/pilosa
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#document-data-model-
Actian Versanthttps://www.actian.com/data-management/ingres-sql-rdbms/
Crate Datahttps://crate.io/
Facebook Apollohttp://www.infoq.com/news/2014/06/facebook-apollo
jumboDBhttp://comsysto.github.io/jumbodb/
LinkedIn Espressohttps://engineering.linkedin.com/data
MarkLogichttp://www.marklogic.com/
Microsoft Azure DocumentDBhttps://azure.microsoft.com/en-us/services/cosmos-db/
MongoDBhttps://www.mongodb.com/
RavenDBhttps://ravendb.net/
RethinkDBhttps://rethinkdb.com/
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#key-map-data-model-
Key-value Data Modelhttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#key-value-data-model
Columnar Databaseshttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#columnar-databases
Distinguishing two major types of Column Storeshttp://dbmsmusings.blogspot.com/2010/03/distinguishing-two-major-types-of_29.html
Apache Accumulohttp://accumulo.apache.org/
Apache Cassandrahttp://cassandra.apache.org/
Apache HBasehttp://hbase.apache.org/
Baidu Terahttps://github.com/baidu/tera
Facebook HydraBasehttps://code.facebook.com/posts/321111638043166/hydrabase-the-evolution-of-hbase-facebook/
Google BigTablehttp://static.googleusercontent.com/media/research.google.com/en//archive/bigtable-osdi06.pdf
Google Cloud Datastorehttps://cloud.google.com/datastore/docs/concepts/overview
Hypertablehttp://www.hypertable.org/
InfiniDBhttps://github.com/infinidb/infinidb/
Tephrahttps://github.com/caskdata/tephra
Twitter Manhattanhttps://blog.twitter.com/engineering/en_us/a/2014/manhattan-our-real-time-multi-tenant-distributed-database-for-twitter-scale.html
ScyllaDBhttp://www.scylladb.com/
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#key-value-data-model
Aerospikehttp://www.aerospike.com/
Amazon DynamoDBhttps://aws.amazon.com/dynamodb/
Badgerhttps://open.dgraph.io/post/badger/
Bolthttps://github.com/boltdb/bolt
BTDBhttps://github.com/Bobris/BTDB
BuntDBhttps://github.com/tidwall/buntdb
Edishttps://github.com/cbd/edis
ElephantDBhttps://github.com/nathanmarz/elephantdb
EventStorehttps://geteventstore.com/
GhostDBhttps://github.com/jakekgrog/GhostDB
Gravitonhttps://github.com/deroproject/graviton
GridDBhttps://github.com/griddb/griddb_nosql
HyperDexhttps://github.com/rescrv/HyperDex
Ignitehttps://ignite.apache.org/index.html
LinkedIn Kratihttps://github.com/linkedin-sna/sna-page/tree/master/krati
LinkedIn Voldemorthttp://www.project-voldemort.com/voldemort/
Oracle NoSQL Databasehttp://www.oracle.com/technetwork/database/database-technologies/nosqldb/overview/index.html
Redishttps://redis.io/
Riakhttps://github.com/basho/riak
Storehaushttps://github.com/twitter/storehaus
SummitDBhttps://github.com/tidwall/summitdb
Tarantoolhttps://github.com/tarantool/tarantool
TiKVhttps://github.com/pingcap/tikv
Tile38https://github.com/tidwall/tile38
TreodeDBhttps://github.com/Treode/store
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#graph-data-model
AgensGraphhttp://www.agensgraph.com/
Apache Giraphhttp://giraph.apache.org/
Apache Spark Bagelhttp://spark.apache.org/docs/0.7.3/bagel-programming-guide.html
ArangoDBhttps://www.arangodb.com/
DGraphhttps://github.com/dgraph-io/dgraph
EliasDBhttps://github.com/krotik/eliasdb
Facebook TAOhttps://www.facebook.com/notes/facebook-engineering/tao-the-power-of-the-graph/10151525983993920
GCHQ Gafferhttps://github.com/gchq/Gaffer
Google Cayleyhttps://github.com/cayleygraph/cayley
Google Pregelhttp://kowshik.github.io/JPregel/pregel_paper.pdf
GraphLab PowerGraphhttps://turi.com/products/create/docs/
GraphXhttps://amplab.cs.berkeley.edu/publication/graphx-grades/
Gremlinhttps://github.com/tinkerpop/gremlin
Infovorehttps://github.com/paulhoule/infovore
Intel GraphBuilderhttps://01.org/graphbuilder/
JanusGraphhttp://janusgraph.org
MapGraphhttps://www.blazegraph.com/mapgraph-technology/
Microsoft Graph Enginehttps://github.com/Microsoft/GraphEngine
Neo4jhttps://neo4j.com/
OrientDBhttp://orientdb.com/
Phoebushttps://github.com/xslogic/phoebus
Titanhttp://thinkaurelius.github.io/titan/
Twitter FlockDBhttps://github.com/twitter-archive/flockdb
NodeXLhttps://nodexl.codeplex.com/
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#databases
RQLitehttps://github.com/rqlite/rqlite
MySQLhttps://www.mysql.com/
TiDBhttps://github.com/pingcap/tidb
Percona XtraBackuphttps://www.percona.com/software/mysql-database/percona-xtrabackup
mysql_utilshttps://github.com/pinterest/mysql_utils
MariaDBhttps://mariadb.org/
PostgreSQLhttps://www.postgresql.org/
Amazon RDShttps://aws.amazon.com/rds/
Crate.IOhttps://crate.io/
Redishttps://redis.io/
Riakhttp://docs.basho.com/riak/kv/
AWS DynamoDBhttps://aws.amazon.com/dynamodb/
HyperDexhttps://github.com/rescrv/HyperDex
SSDBhttp://ssdb.io
Kyoto Tycoonhttps://github.com/alticelabs/kyoto
IonDBhttps://github.com/iondbproject/iondb
Cassandrahttps://cassandra.apache.org/
Cassandra Calculatorhttps://www.ecyrd.com/cassandracalculator/
CCMhttps://github.com/pcmanus/ccm
ScyllaDBhttps://github.com/scylladb/scylla
https://www.scylladb.com/
HBasehttps://hbase.apache.org/
AWS Redshifthttps://aws.amazon.com/redshift/
FiloDBhttps://github.com/filodb/FiloDB
Verticahttps://www.vertica.com
ClickHousehttps://clickhouse.tech
MongoDBhttps://www.mongodb.com
Percona Server for MongoDBhttps://www.percona.com/software/mongo-database/percona-server-for-mongodb
MemDBhttps://github.com/rain1017/memdb
Elasticsearchhttps://www.elastic.co/
Couchbasehttps://www.couchbase.com/
RethinkDBhttps://rethinkdb.com/
RavenDBhttps://ravendb.net/
Neo4jhttps://neo4j.com/
OrientDBhttps://orientdb.com
ArangoDBhttps://www.arangodb.com/
Titanhttps://titan.thinkaurelius.com
FlockDBhttps://github.com/twitter-archive/flockdb
Datomichttps://www.datomic.com
Apache Geodehttps://geode.apache.org/
Gaffer https://github.com/gchq/Gaffer
InfluxDBhttps://github.com/influxdata/influxdb
OpenTSDBhttps://github.com/OpenTSDB/opentsdb
QuestDBhttps://questdb.io/
KairosDBhttps://github.com/kairosdb/kairosdb
Heroichttps://github.com/spotify/heroic
Druidhttps://github.com/apache/incubator-druid
Riak-TShttp://basho.com/products/riak-ts/
Akumulihttps://github.com/akumuli/Akumuli
Rhombushttps://github.com/Pardot/Rhombus
Dalmatiner DBhttps://github.com/dalmatinerdb/dalmatinerdb
Bluefloodhttps://github.com/rackerlabs/blueflood
Timelyhttps://github.com/NationalSecurityAgency/timely
Tarantoolhttps://github.com/tarantool/tarantool/
Greenplumhttps://github.com/greenplum-db/gpdb
Cayleyhttps://github.com/cayleygraph/cayley
SnappyDatahttps://github.com/SnappyDataInc/snappydata
TimescaleDBhttps://www.timescale.com/
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#columnar-databases
Key-Map Data Modelhttps://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#key-map-data-model
Columnar Storagehttp://the-paper-trail.org/blog/columnar-storage/
Actian Vectorhttp://www.actian.com/
ClickHousehttps://clickhouse.yandex/
EventQLhttp://eventql.io/
MonetDBhttps://www.monetdb.org/
Parquethttp://parquet.apache.org/
Pivotal Greenplumhttps://pivotal.io/pivotal-greenplum
Verticahttps://www.vertica.com/
SQream DBhttp://sqream.com/
Google BigQueryhttps://cloud.google.com/bigquery/what-is-bigquery
Amazon Redshifthttps://aws.amazon.com/redshift/
IndexRhttps://github.com/shunfei/indexr
LocustDBhttps://github.com/cswinter/LocustDB
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#newsql-databases
Actian Ingreshttp://www.actian.com/products/operational-databases/
ActorDBhttps://github.com/biokoda/actordb
Amazon Redshifthttp://aws.amazon.com/redshift/
BayesDBhttps://github.com/probcomp/BayesDB
Bedrockhttp://bedrockdb.com/
CitusDBhttps://www.citusdata.com/
Cockroachhttps://github.com/cockroachdb/cockroach
Comdb2https://github.com/bloomberg/comdb2
Datomichttp://www.datomic.com/
FoundationDBhttps://foundationdb.com/
Google F1https://research.google.com/pubs/pub41344.html
Google Spannerhttps://research.google.com/archive/spanner.html
H-Storehttp://hstore.cs.brown.edu/
Haeinsahttps://github.com/VCNC/haeinsa
HandlerSockethttps://www.percona.com/doc/percona-server/5.5/performance/handlersocket.html
InfiniSQLhttp://www.infinisql.org/
KarelDBhttps://github.com/rayokota/kareldb
Map-Dhttps://www.mapd.com/
MemSQLhttp://www.memsql.com/
NuoDBhttp://www.nuodb.com/
Oracle TimesTen in-Memory Databasehttp://www.oracle.com/technetwork/database/database-technologies/timesten/overview/index.html
Pivotal GemFire XDhttp://gemfirexd.docs.pivotal.io/latest/
SAP HANAhttps://hana.sap.com/abouthana.html
SenseiDBhttp://senseidb.github.io/sensei/
Skyhttp://skydb.io/
SymmetricDShttp://www.symmetricds.org/
TiDBhttps://github.com/pingcap/tidb
VoltDBhttps://www.voltdb.com/
YugabyteDBhttps://github.com/YugaByte/yugabyte-db
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#time-series-databases
Axibase Time Series Databasehttp://axibase.com/products/axibase-time-series-database/
Chronixhttp://chronix.io/
Cubehttp://square.github.io/cube/
Heroichttps://spotify.github.io/heroic/#!/index
InfluxDBhttps://www.influxdata.com/
QuestDBhttps://questdb.io/
IronDBhttps://www.circonus.com/irondb/
KairosDBhttps://github.com/kairosdb/kairosdb
M3DBhttp://m3db.github.io/m3/m3db/
Newtshttps://opennms.github.io/newts/
TDenginehttps://github.com/taosdata/TDengine/
OpenTSDBhttp://opentsdb.net
Prometheushttps://prometheus.io/
Beringeihttps://github.com/facebookincubator/beringei
TrailDBhttp://traildb.io/
Druidhttps://github.com/druid-io/druid/
Riak-TShttp://basho.com/products/riak-ts/
Akumulihttps://github.com/akumuli/Akumuli
Rhombushttps://github.com/Pardot/Rhombus
Dalmatiner DBhttps://github.com/dalmatinerdb/dalmatinerdb
Bluefloodhttps://github.com/rackerlabs/blueflood
Timelyhttps://github.com/NationalSecurityAgency/timely
SiriDBhttps://github.com/transceptor-technology/siridb-server
Thanoshttps://github.com/improbable-eng/thanos
VictoriaMetricshttps://github.com/VictoriaMetrics/VictoriaMetrics
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#sql-like-processing
Actian SQL for Hadoophttp://www.actian.com/analytic-database/vectorh-sql-hadoop
Apache Drillhttp://drill.apache.org/
Apache HCataloghttps://cwiki.apache.org/confluence/display/Hive/HCatalog
Apache Hivehttp://hive.apache.org/
Apache Calcitehttp://calcite.apache.org/
Apache Phoenixhttp://phoenix.apache.org/index.html
Aster Databasehttp://www.teradata.com/products-and-services/Teradata-Aster/teradata-aster-database
Cloudera Impalahttps://www.cloudera.com/products/apache-hadoop/impala.html
Concurrent Lingualhttp://www.cascading.org/projects/lingual/
Datasalt Splout SQLhttp://www.datasalt.com/products/splout-sql/
Dremiohttps://www.dremio.com/
Facebook PrestoDBhttps://prestodb.io/
Google BigQueryhttps://research.google.com/pubs/pub36632.html
Materializehttps://github.com/materializeinc/materialize
Invantive SQLhttps://documentation.invantive.com/2017R2/invantive-sql-grammar/invantive-sql-grammar-17.30.html
PipelineDBhttps://www.pipelinedb.com/
Pivotal HDBhttps://pivotal.io/pivotal-hdb
RainstorDBhttp://rainstor.com/products/rainstor-database/
Spark Catalysthttps://github.com/apache/spark/tree/master/sql
SparkSQLhttps://databricks.com/blog/2014/03/26/spark-sql-manipulating-structured-data-using-spark-2.html
Splice Machinehttps://www.splicemachine.com/
Stingerhttps://hortonworks.com/innovation/stinger/
Tajohttp://tajo.apache.org/
Trafodionhttps://wiki.trafodion.org/wiki/index.php/Main_Page
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#data-ingestion
redpandahttps://vectorized.io/redpanda
Amazon Kinesishttps://aws.amazon.com/kinesis/
Amazon Web Services Gluehttps://aws.amazon.com/glue/
Censushttps://getcensus.com/
Apache Chukwahttp://chukwa.apache.org/
Apache Flumehttp://flume.apache.org/
Apache Kafkahttp://kafka.apache.org/
Apache NiFihttps://nifi.apache.org/
Apache Pulsarhttps://github.com/apache/pulsar
Apache Sqoophttp://sqoop.apache.org/
Embulkhttp://www.embulk.org
Facebook Scribehttps://github.com/facebookarchive/scribe
Fluentdhttp://www.fluentd.org
Gazettehttps://github.com/gazette/core
Google Photonhttps://research.google.com/pubs/pub41318.html
Hekahttps://github.com/mozilla-services/heka
HIHOhttps://github.com/sonalgoyal/hiho
Kestrelhttps://github.com/papertrail/kestrel
LinkedIn Databushttps://engineering.linkedin.com/data
LinkedIn Kamikazehttps://github.com/linkedin/kamikaze
LinkedIn White Elephanthttps://github.com/linkedin/white-elephant
Logstashhttps://www.elastic.co/products/logstash
Netflix Surohttps://github.com/Netflix/suro
Pinterest Secorhttps://github.com/pinterest/secor
LinkedIn Gobblinhttps://github.com/linkedin/gobblin
Skizzehttps://github.com/skizzehq/skizze
StreamSets Data Collectorhttps://github.com/streamsets/datacollector
Aloomahttps://www.alooma.com/integrations/mysql
RudderStackhttps://github.com/rudderlabs/rudder-server
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#service-programming
Akka Toolkithttp://akka.io/
Apache Avrohttp://avro.apache.org/
Apache Curatorhttp://curator.apache.org/
Apache Karafhttp://karaf.apache.org/
Apache Thrifthttp://thrift.apache.org//
Apache ZooKeeperhttp://zookeeper.apache.org/
Google Chubbyhttps://research.google.com/archive/chubby.html
Hydrosphere Misthttps://github.com/Hydrospheredata/mist
LinkedIn Norberthttps://engineering.linkedin.com/data
Marahttps://github.com/mara/data-integration
OpenMPIhttps://www.open-mpi.org/
Serfhttps://www.serf.io/
Spotify Luigihttps://github.com/spotify/luigi
Spring XDhttps://github.com/spring-projects/spring-xd
Twitter Elephant Birdhttps://github.com/twitter/elephant-bird
Twitter Finaglehttps://twitter.github.io/finagle/
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#scheduling
Apache Airflowhttps://github.com/apache/incubator-airflow
Apache Aurorahttp://aurora.apache.org/
Apache Falconhttp://falcon.apache.org/
Apache Ooziehttp://oozie.apache.org/
Azure Data Factoryhttps://docs.microsoft.com/en-us/azure/data-factory/data-factory-introduction
Chronoshttp://mesos.github.io/chronos/
Croniclehttps://github.com/jhuckaby/Cronicle
Dagsterhttps://github.com/dagster-io/dagster
LinkedIn Azkabanhttps://azkaban.github.io/
Schedoscopehttps://github.com/ottogroup/schedoscope
Sparrowhttps://github.com/radlab/sparrow
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#machine-learning
Azure ML Studiohttps://studio.azureml.net/
brainhttps://github.com/harthur/brain
Oryxhttps://github.com/OryxProject/oryx
Concurrent Patternhttp://www.cascading.org/projects/pattern/
convnetjshttps://github.com/karpathy/convnetjs
DataVechttps://github.com/deeplearning4j/DataVec
Deeplearning4jhttps://github.com/deeplearning4j
Deciderhttps://github.com/danielsdeleo/Decider
ENCOGhttp://www.heatonresearch.com/encog/
etcMLhttp://www.etcml.com/
Etsy Conjecturehttps://github.com/etsy/Conjecture
Feasthttps://github.com/gojek/feast
GraphLab Createhttps://dato.com/products/create/
H2Ohttps://github.com/h2oai/h2o-3/
Karate Clubhttps://github.com/benedekrozemberczki/karateclub
Kerashttps://github.com/fchollet/keras
Lambdohttps://github.com/johnsonc/lambdo
Little Ball of Furhttps://github.com/benedekrozemberczki/littleballoffur
Mahouthttp://mahout.apache.org/
MLbasehttp://www.mlbase.org/
MLPNeuralNethttps://github.com/nikolaypavlov/MLPNeuralNet
ML Workspacehttps://github.com/ml-tooling/ml-workspace
MOAhttp://moa.cms.waikato.ac.nz
MonkeyLearnhttps://monkeylearn.com/
ND4Jhttps://github.com/deeplearning4j/nd4j
nupichttps://github.com/numenta/nupic
PredictionIOhttp://predictionio.incubator.apache.org/index.html
PyTorch Geometric Temporalhttps://github.com/benedekrozemberczki/pytorch_geometric_temporal
RL4Jhttps://github.com/deeplearning4j/rl4j
SAMOAhttp://samoa.incubator.apache.org/
scikit-learnhttps://github.com/scikit-learn/scikit-learn
Shapleyhttps://github.com/benedekrozemberczki/shapley
Spark MLlibhttp://spark.apache.org/docs/0.9.0/mllib-guide.html
Sibylhttps://users.soe.ucsc.edu/~niejiazhong/slides/chandra.pdf
TensorFlowhttps://github.com/tensorflow/tensorflow
Theanohttps://github.com/theano
Torchhttps://github.com/torch
Veloxhttps://github.com/amplab/velox-modelserver
Vowpal Wabbithttps://github.com/JohnLangford/vowpal_wabbit/wiki
WEKAhttp://www.cs.waikato.ac.nz/ml/weka/
BidMachhttps://github.com/BIDData/BIDMach
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#benchmarking
Apache Hadoop Benchmarkinghttps://issues.apache.org/jira/browse/MAPREDUCE-3561
Berkeley SWIM Benchmarkhttps://github.com/SWIMProjectUCB/SWIM/wiki
Intel HiBenchhttps://github.com/intel-hadoop/HiBench
PUMA Benchmarkinghttps://issues.apache.org/jira/browse/MAPREDUCE-5116
Yahoo Gridmix3http://yahoohadoop.tumblr.com/post/98294079296/gridmix3-emulating-production-workload-for
Deeplearning4j Benchmarkshttps://github.com/deeplearning4j/dl4j-benchmark
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#security
Apache Rangerhttp://ranger.apache.org/
Apache Eaglehttp://eagle.apache.org/
Apache Knox Gatewayhttp://knox.apache.org/
Apache Sentryhttp://incubator.apache.org/projects/sentry.html
BDAhttps://github.com/kotobukki/BDA/
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#system-deployment
Apache Ambarihttp://ambari.apache.org/
Apache Bigtophttp://bigtop.apache.org//
Apache Helixhttp://helix.apache.org/
Apache Mesoshttp://mesos.apache.org/
Apache Sliderhttps://github.com/apache/incubator-slider
Apache Whirrhttp://whirr.apache.org/
Apache YARNhttps://hortonworks.com/hadoop/yarn/
Brooklynhttp://brooklyncentral.github.io/
Buildoophttp://buildoop.github.io/
Cloudera HUEhttp://gethue.com/
Facebook Prismhttp://www.wired.com/2012/08/facebook-prism/
Google Borghttps://www.wired.com/2013/03/google-borg-twitter-mesos/all/
Google Omegahttps://www.youtube.com/watch?v=0ZFMlO98Jkc
Hortonworks HOYAhttps://hortonworks.com/blog/introducing-hoya-hbase-on-yarn/
Kuberneteshttps://kubernetes.io/
Marathonhttps://github.com/mesosphere/marathon
Linkishttps://github.com/WeBankFinTech/Linkis
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#applications
411https://github.com/etsy/411
Adobe Spindlehttps://github.com/adobe-research/spindle
Apache Metronhttp://metron.apache.org/
Apache Nutchhttp://nutch.apache.org/
Apache OODThttp://oodt.apache.org/
Apache Tikahttps://tika.apache.org/
Argushttps://github.com/salesforce/Argus
AthenaXhttps://github.com/uber/AthenaX
Atlashttps://github.com/Netflix/atlas
Countlyhttps://count.ly/
Dominohttps://www.dominodatalab.com/
Eclipse BIRThttp://www.eclipse.org/birt/
ElastAlerthttps://github.com/Yelp/elastalert
Eventhubhttps://github.com/Codecademy/EventHub
HASHhttps://hash.ai
Hermeshttps://github.com/allegro/hermes
Hunkhttps://www.splunk.com/en_us/download/hunk.html
Imhotephttp://opensource.indeedeng.io/imhotep/
Indicativehttps://www.indicative.com/
Jupyterhttps://jupyter.org/
MADlibhttp://madlib.incubator.apache.org/community/
Kapacitorhttps://github.com/influxdata/kapacitor
Kylinhttp://kylin.apache.org/
PivotalRhttps://github.com/pivotalsoftware/PivotalR
Rakamhttps://github.com/rakam-io/rakam
Qubolehttps://www.qubole.com/
SnappyDatahttps://github.com/SnappyDataInc/snappydata
Snowplowhttps://github.com/snowplow/snowplow
SparkRhttp://amplab-extras.github.io/SparkR-pkg/
Splunkhttps://www.splunk.com/
Sumo Logichttps://www.sumologic.com/
Talendhttp://www.talend.com/products/big-data/
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#search-engine-and-framework
Apache Lucenehttp://lucene.apache.org/
Apache Solrhttp://lucene.apache.org/solr/
Elassandrahttps://github.com/strapdata/elassandra
Elasticsearchhttps://www.elastic.co/
Enigma.iohttps://www.enigma.com/
Google Caffeinehttps://googleblog.blogspot.it/2010/06/our-new-search-index-caffeine.html
Google Percolatorhttps://research.google.com/pubs/pub36726.html
HBase Coprocessorhttps://blogs.apache.org/hbase/entry/coprocessor_introduction
Lily HBase Indexerhttp://ngdata.github.io/hbase-indexer/
LinkedIn Bobohttp://senseidb.github.io/bobo/
LinkedIn Cleohttps://github.com/linkedin/cleo
LinkedIn Galenehttps://engineering.linkedin.com/search/did-you-mean-galene
LinkedIn Zoiehttps://github.com/senseidb/zoie
MG4Jhttp://mg4j.di.unimi.it/
Sphinx Search Serverhttp://sphinxsearch.com/
Vespahttp://vespa.ai/
Facebook Faisshttps://github.com/facebookresearch/faiss
Annoyhttps://github.com/spotify/annoy
Weaviatehttps://github.com/semi-technologies/weaviate
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#mysql-forks-and-evolutions
Amazon RDShttps://aws.amazon.com/rds/
Drizzlehttp://www.drizzle.org/
Google Cloud SQLhttps://cloud.google.com/sql/docs/
MariaDBhttps://mariadb.org/
MySQL Clusterhttps://www.mysql.com/products/cluster/
Percona Serverhttps://www.percona.com/software/mysql-database/percona-server
ProxySQLhttps://github.com/renecannao/proxysql
TokuDBhttps://www.percona.com/
WebScaleSQLhttp://webscalesql.org/
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#postgresql-forks-and-evolutions
HadoopDBhttp://db.cs.yale.edu/hadoopdb/hadoopdb.html
IBM Netezzahttp://www-01.ibm.com/software/data/netezza/
Postgres-XLhttp://www.postgres-xl.org/
RecDBhttp://www-users.cs.umn.edu/~sarwat/RecDB/
Stadohttp://www.stormdb.com/community/stado
Yahoo Everesthttps://www.scribd.com/doc/3159239/70-Everest-PGCon-RT
TimescaleDBhttp://www.timescale.com/
PipelineDBhttps://www.pipelinedb.com/
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#memcached-forks-and-evolutions
Facebook McDipperhttps://www.facebook.com/notes/facebook-engineering/mcdipper-a-key-value-cache-for-flash-storage/10151347090423920
Facebook Memcachedhttps://www.facebook.com/notes/facebook-engineering/scaling-memcache-at-facebook/10151411410803920
Twemproxyhttps://github.com/twitter/twemproxy
Twitter Fatcachehttps://github.com/twitter/fatcache
Twitter Twemcachehttps://github.com/twitter/twemcache
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#embedded-databases
Actian PSQLhttp://www.actian.com/products/operational-databases/
BerkeleyDBhttps://www.oracle.com/database/berkeley-db/index.html
HanoiDBhttps://github.com/krestenkrab/hanoidb
LevelDBhttps://github.com/google/leveldb
LMDBhttps://symas.com/mdb/
RocksDBhttp://rocksdb.org/
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#business-intelligence
BIME Analyticshttps://www.bimeanalytics.com/?lang=en
Blazerhttps://github.com/ankane/blazer
Chartiohttps://chartio.com
Counthttps://count.co
datapinehttps://www.datapine.com/
Dekarthttps://dekart.xyz/
GoodDatahttps://www.gooddata.com/
Jaspersofthttps://www.jaspersoft.com/
Jedox Palohttps://www.jedox.com/en/
Jethrodatahttps://jethro.io/
intermix.iohttps://intermix.io/
Metabasehttps://github.com/metabase/metabase
Microsofthttp://www.microsoft.com/en-us/server-cloud/solutions/business-intelligence/default.aspx
MicroStrategyhttps://www.microstrategy.com/
Numeracyhttps://numeracy.co/
Pentahohttp://www.pentaho.com/
Qlikhttp://www.qlik.com/us/
Redashhttps://redash.io/
Saiku Analyticshttps://www.meteorite.bi/
Knowagehttps://www.knowage-suite.com/
SpagoBIhttp://www.spagobi.org/
SparklineData SNAPhttp://sparklinedata.com/
Tableauhttps://www.tableau.com/
Zoomdatahttps://www.zoomdata.com/
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#data-visualization
Airpalhttps://github.com/airbnb/airpal
AnyCharthttp://www.anychart.com
Arborhttps://github.com/samizdatco/arbor
Bananahttps://github.com/LucidWorks/banana
Bloomeryhttps://github.com/ufukomer/bloomery
Bokehhttp://bokeh.pydata.org/en/latest/
C3http://c3js.org/
CartoDBhttps://github.com/CartoDB/cartodb
chartdhttp://chartd.co/
Chart.jshttp://www.chartjs.org/
Chartist.jshttps://github.com/gionkunz/chartist-js
Crossfilterhttp://square.github.io/crossfilter/
Cubismhttps://github.com/square/cubism
Cytoscapehttp://cytoscape.github.io/
DC.jshttp://dc-js.github.io/dc.js/
D3https://d3js.org/
D3.composehttps://github.com/CSNW/d3.compose
D3Plushttp://d3plus.org
Dashhttps://github.com/plotly/dash
Dekarthttps://dekart.xyz/
DevExtreme React Charthttps://devexpress.github.io/devextreme-reactive/react/chart/
EChartshttps://github.com/ecomfe/echarts
Envisionjshttps://github.com/HumbleSoftware/envisionjs
FnordMetrichttps://metrictools.org/
Frappe Chartshttps://frappe.io/charts
Freeboardhttps://github.com/Freeboard/freeboard
Gephihttps://github.com/gephi/gephi
Google Chartshttps://developers.google.com/chart/
Grafanahttps://grafana.com/
Graphitehttp://graphiteapp.org/
Highchartshttps://www.highcharts.com/
IPythonhttp://ipython.org/
Kibanahttps://www.elastic.co/products/kibana
Lumifyhttp://lumify.io/
Matplotlibhttps://github.com/matplotlib/matplotlib
MetricsGraphics.jshttps://metricsgraphicsjs.org/
NVD3http://nvd3.org/
Peityhttps://github.com/benpickles/peity
Plot.lyhttps://plot.ly/
Plotly.jshttps://github.com/plotly/plotly.js
Reclinehttps://github.com/okfn/recline
Redashhttps://github.com/getredash/redash
ReChartshttp://recharts.org/
Shinyhttp://shiny.rstudio.com/
Sigma.jshttps://github.com/jacomyal/sigma.js
Supersethttps://github.com/apache/incubator-superset
Vegahttps://github.com/vega/vega
Zeppelinhttps://github.com/ZEPL/zeppelin
ZingCharthttps://www.zingchart.com/
DataSphere Studiohttps://github.com/WeBankFinTech/DataSphereStudio
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#internet-of-things-and-sensor-data
Apache Edgent (Incubating)http://edgent.apache.org/
Azure IoT Hubhttps://azure.microsoft.com/en-us/services/iot-hub/
TempoIQhttps://www.tempoiq.com/
2lemetryhttp://2lemetry.com/
Pubnubhttps://www.pubnub.com/
ThingWorxhttps://www.thingworx.com/
IFTTThttps://ifttt.com/
EVRYTHNGhttps://evrythng.com/
NetLyticshttps://github.com/marty90/netlytics/
Ablyhttps://ably.com/
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#interesting-readings
Big Data Benchmarkhttps://amplab.cs.berkeley.edu/benchmark/
NoSQL Comparisonhttps://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
Monitoring Kafka performancehttps://www.datadoghq.com/blog/monitoring-kafka-performance-metrics?ref=awesome
Monitoring Hadoop performancehttps://www.datadoghq.com/blog/monitor-hadoop-metrics?ref=awesome
Monitoring Cassandra performancehttps://www.datadoghq.com/blog/how-to-monitor-cassandra-performance-metrics/?ref=awesome
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#interesting-papers
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#2015---2016
2015http://www.vldb.org/pvldb/vol8/p1804-ching.pdf
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#2013---2014
2014http://infolab.stanford.edu/~ullman/mmds/book.pdf
2013https://amplab.cs.berkeley.edu/wp-content/uploads/2013/03/eurosys13-paper83.pdf
2013https://amplab.cs.berkeley.edu/wp-content/uploads/2013/01/dmx1.pdf
2013https://amplab.cs.berkeley.edu/wp-content/uploads/2013/02/shark_sigmod2013.pdf
2013https://amplab.cs.berkeley.edu/wp-content/uploads/2013/05/grades-graphx_with_fonts.pdf
2013http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/40671.pdf
2013http://research.microsoft.com/pubs/200169/now-vldb.pdf
2013http://static.druid.io/docs/druid.pdf
2013http://db.disi.unitn.eu/pages/VLDBProgram/pdf/industry/p764-rae.pdf
2013http://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/41344.pdf
2013http://db.disi.unitn.eu/pages/VLDBProgram/pdf/industry/p734-akidau.pdf
2013http://db.disi.unitn.eu/pages/VLDBProgram/pdf/industry/p767-wiener.pdf
2013http://db.disi.unitn.eu/pages/VLDBProgram/pdf/industry/p871-curtiss.pdf
2013https://www.usenix.org/system/files/conference/nsdi13/nsdi13-final170_update.pdf
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#2011---2012
2012http://vldb.org/pvldb/vol5/p1771_georgelee_vldb2012.pdf
2012https://amplab.cs.berkeley.edu/wp-content/uploads/2013/04/blinkdb_vldb12_demo.pdf
2012https://www.usenix.org/system/files/login/articles/zaharia.pdf
2012https://amplab.cs.berkeley.edu/wp-content/uploads/2012/03/mod482-xin1.pdf
2012https://www.usenix.org/legacy/event/nsdi11/tech/full_papers/Bolosky.pdf
2012http://research.microsoft.com/pubs/178045/ppaoxs-paper29.pdf
2012https://arxiv.org/pdf/1203.5485.pdf
2012http://vldb.org/pvldb/vol5/p1436_alexanderhall_vldb2012.pdf
2012http://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf
2011https://amplab.cs.berkeley.edu/wp-content/uploads/2011/06/euro118-ananthanarayanan.pdf
2011https://amplab.cs.berkeley.edu/wp-content/uploads/2011/06/Mesos-A-Platform-for-Fine-Grained-Resource-Sharing-in-the-Data-Center.pdf
2011http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/36971.pdf
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#2001---2010
2010https://www.usenix.org/legacy/event/osdi10/tech/full_papers/Beaver.pdf
2010https://amplab.cs.berkeley.edu/wp-content/uploads/2011/06/Spark-Cluster-Computing-with-Working-Sets.pdf
2010http://kowshik.github.io/JPregel/pregel_paper.pdf
2010http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/36726.pdf
2010http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/36632.pdf
2010http://leoneu.github.io/
2009http://www.cs.umd.edu/~abadi/papers/hadoopdb.pdf
2008https://cwiki.apache.org/confluence/download/attachments/120729877/chukwa_cca08.pdf?version=1&modificationDate=1562667399000&api=v2
2007http://www.read.seas.harvard.edu/~kohler/class/cs239-w08/decandia07dynamo.pdf
2006http://static.googleusercontent.com/media/research.google.com/en//archive/chubby-osdi06.pdf
2006http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//archive/bigtable-osdi06.pdf
2004http://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf
2003http://static.googleusercontent.com/media/research.google.com/en//archive/gfs-sosp2003.pdf
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#videos
Spark in Motionhttps://www.manning.com/livevideo/spark-in-motion
Machine Learning, Data Science and Deep Learning with Python https://www.manning.com/livevideo/machine-learning-data-science-and-deep-learning-with-python
Data warehouse schema design - dimensional modeling and star schemahttps://snir.dev/talks/data-warehouse-schema-design
Elasticsearch 7 and Elastic Stackhttps://www.manning.com/livevideo/elasticsearch-7-and-elastic-stack
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#books
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#streaming
Data Science at Scale with Python and Daskhttps://www.manning.com/books/data-science-at-scale-with-python-and-dask
Streaming Datahttps://www.manning.com/books/streaming-data
Storm Appliedhttps://www.manning.com/books/storm-applied
Fundamentals of Stream Processing: Application Design, Systems, and Analyticshttp://www.cambridge.org/us/academic/subjects/engineering/communications-and-signal-processing/fundamentals-stream-processing-application-design-systems-and-analytics
Stream Data Processing: A Quality of Service Perspectivehttp://www.springer.com/us/book/9780387710020
Unified Log Processinghttps://www.manning.com/books/event-streams-in-action
Kafka Streams in Actionhttps://www.manning.com/books/kafka-streams-in-action
Big Datahttps://www.manning.com/books/big-data
Spark in Actionhttps://www.manning.com/books/spark-in-action
Spark in Action 2nd Ed.https://www.manning.com/books/spark-in-action-second-edition
Kafka in Actionhttps://www.manning.com/books/kafka-in-action
Fusion in Actionhttps://www.manning.com/books/fusion-in-action
Reactive Data Handlinghttps://www.manning.com/books/reactive-data-handling
Azure Data Engineeringhttps://www.manning.com/books/azure-data-engineering
Grokking Streaming Systemshttps://www.manning.com/books/grokking-streaming-systems
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#distributed-systems
Distributed Systems for fun and profithttp://book.mixu.net/distsys/
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#graph-based-approach
Graph-Powered Machine Learninghttps://www.manning.com/books/graph-powered-machine-learning
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#data-visualization-1
The beauty of data visualizationhttps://www.youtube.com/watch?v=5Zg-C8AAIGg
Designing Data Visualizations with Noah Iliinskyhttps://www.youtube.com/watch?v=R-oiKt7bUU8
Hans Rosling's 200 Countries, 200 Years, 4 Minuteshttps://www.youtube.com/watch?v=jbkSRLYSojo
Ice Bucket Challenge Data Visualizationhttps://www.youtube.com/watch?v=qTEchen97rQ
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#other-awesome-lists
awesome-awesomenesshttps://github.com/bayandin/awesome-awesomeness
awesomehttps://github.com/sindresorhus/awesome
listshttps://github.com/jnv/lists
awesome-awesome-awesomehttps://github.com/t3chnoboy/awesome-awesome-awesome
awesome-analyticshttps://github.com/onurakpolat/awesome-analytics
awesome-public-datasetshttps://github.com/awesomedata/awesome-public-datasets
awesome-graph-classificationhttps://github.com/benedekrozemberczki/awesome-graph-classification
awesome-network-embeddinghttps://github.com/chihming/awesome-network-embedding
awesome-community-detectionhttps://github.com/benedekrozemberczki/awesome-community-detection
awesome-decision-tree-papershttps://github.com/benedekrozemberczki/awesome-decision-tree-papers
awesome-fraud-detection-papershttps://github.com/benedekrozemberczki/awesome-fraud-detection-papers
awesome-gradient-boosting-papershttps://github.com/benedekrozemberczki/awesome-gradient-boosting-papers
awesome-monte-carlo-tree-search-papershttps://github.com/benedekrozemberczki/awesome-monte-carlo-tree-search-papers
awesome-kafkahttps://github.com/monksy/awesome-kafka
Google Bigtablehttps://github.com/zrosenbauer/awesome-bigtable
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#contributing
pull requesthttps://github.com/exajobs/artificial-intelligence-collection/pulls
https://patch-diff.githubusercontent.com/exajobs/data-engineering-collection#license
http://creativecommons.org/publicdomain/zero/1.0/
Exajobshttps://github.com/exajobs
engineering https://patch-diff.githubusercontent.com/topics/engineering
data-science https://patch-diff.githubusercontent.com/topics/data-science
query https://patch-diff.githubusercontent.com/topics/query
big-data https://patch-diff.githubusercontent.com/topics/big-data
hadoop https://patch-diff.githubusercontent.com/topics/hadoop
bigdata https://patch-diff.githubusercontent.com/topics/bigdata
databases https://patch-diff.githubusercontent.com/topics/databases
data-visualization https://patch-diff.githubusercontent.com/topics/data-visualization
data-structures https://patch-diff.githubusercontent.com/topics/data-structures
database-migrations https://patch-diff.githubusercontent.com/topics/database-migrations
awesome-list https://patch-diff.githubusercontent.com/topics/awesome-list
data-scientists https://patch-diff.githubusercontent.com/topics/data-scientists
series-data https://patch-diff.githubusercontent.com/topics/series-data
streaming-data https://patch-diff.githubusercontent.com/topics/streaming-data
database-design https://patch-diff.githubusercontent.com/topics/database-design
big-data-analytics https://patch-diff.githubusercontent.com/topics/big-data-analytics
database-deployment https://patch-diff.githubusercontent.com/topics/database-deployment
database-development https://patch-diff.githubusercontent.com/topics/database-development
bigdata-module https://patch-diff.githubusercontent.com/topics/bigdata-module

Viewport: width=device-width

