gobblin
Data integrator
A distributed data integration framework for managing structured and byte-oriented data in heterogeneous ecosystems
A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and batch data ecosystems.
2k stars
165 watching
751 forks
Language: Java
Last commit: 7 days ago
Linked from 6 awesome lists
Tags: apache, data, ingestion, management, replication
Related projects:
Repository | Description | Stars |
---|---|---|
sonalgoyal/hiho | A tool for integrating data from various sources into a centralized repository on Hadoop | 91 |
usc-isi-i2/web-karma | An information integration tool for data modeling and transformation from various data sources into standardized RDF format. | 588 |
smooks/smooks | A Java framework for building flexible data integration pipelines | 397 |
379547990/tdengine_hivemq | A system that integrates data from various sources through MQTT and TDengine | 0 |
rudderlabs/airbyte | A platform for replicating data from various sources to different destinations, enabling flexible and customizable data integration. | 3 |
jsxlei/scalex | A tool for integrating single-cell data from heterogeneous sources into a common cell-space using deep learning techniques. | 72 |
apache/flink-connector-hbase | A connector for integrating HBase with the Apache Flink stream processing framework | 29 |
linkedpipes/etl | An ETL tool for integrating data from various sources into a centralized knowledge graph using RDF | 147 |
anishkny/integrify | Enforces referential and data integrity in Cloud Firestore using triggers to automate tasks such as attribute replication, reference deletion, and counting updates. | 109 |
cipher387/maltego-transforms-list | A curated list of tools that provide data processing and integration capabilities for the Maltego graphical analysis tool. | 226 |
okgrow/analytics | An integration package for Meteor that automatically records and sends user data and page view events to various analytics services. | 213 |
jazzband/django-analytical | An application that integrates multiple analytics services into Django projects in a generic and customizable way. | 1,201 |
googlecloudplatform/healthcare-data-harmonization | A mapping language and engine for converting complex data from one schema to another, applicable across domains. | 213 |
mravi/kafka-connect-hbase | A tool that enables real-time data integration between Apache Kafka and HBase using Java | 43 |
unipop-graph/unipop | An integration platform for graph data models using Gremlin and TinkerPop | 205 |