hiho

Data Integrator

A tool for integrating data from various sources into a centralized repository on Hadoop

Hadoop data integration with various databases, FTP servers, and Salesforce. Incrementally update, dedup, append, and merge your data on Hadoop.

GitHub

91 stars
11 watching
32 forks
Language: Java
last commit: over 11 years ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| apache/gobblin | A distributed data integration framework for managing structured and byte-oriented data in heterogeneous ecosystems | 2,232 |
| usc-isi-i2/web-karma | An information integration tool for data modeling and transformation from various data sources into standardized RDF format | 588 |
| 379547990/tdengine_hivemq | A system that integrates data from various sources through MQTT and TDengine | 0 |
| nextgenhealthcare/connect | An integration platform for healthcare data exchange between disparate systems | 933 |
| smooks/smooks | A Java framework for building flexible data integration pipelines | 398 |
| jsxlei/scalex | A tool for integrating single-cell data from heterogeneous sources into a common cell-space using deep learning techniques | 72 |
| cipher387/maltego-transforms-list | A curated list of tools that provide data processing and integration capabilities for the Maltego graphical analysis tool | 226 |
| apache/flink-connector-hbase | A connector for integrating HBase with the Apache Flink stream processing framework | 29 |
| linkedpipes/etl | An ETL tool for integrating data from various sources into a centralized knowledge graph using RDF | 147 |
| karamokoisrael/directus-hackathon-submission | Provides a set of extensions to integrate machine learning operations into Directus, simplifying data processing and model deployment | 12 |
| mravi/kafka-connect-hbase | A tool that enables real-time data integration between Apache Kafka and HBase using Java | 43 |
| joshwingreene/obsidian-jg-method | A tool to integrate Obsidian notes with an Alfred workflow for task management and knowledge base organization | 182 |
| inverse/hassio-addon-emoncms | An integration that allows Home Assistant to interact with Emoncms data processing and visualization capabilities | 13 |
| helgeho/hadoopconcatgz | Provides a custom input format for handling concatenated GZIP files in distributed processing systems like Hadoop | 9 |
| phaneesh/riemann-bundle | A Java bundle that simplifies integrating Dropwizard metrics with Riemann | 0 |