tez
data pipeline engine
A low-level execution engine that higher-level frameworks use to build flexible data processing pipelines
Apache Tez
482 stars
34 watching
424 forks
Language: Java
last commit: 11 months ago
Linked from 1 awesome list
apache, big-data, hadoop, java, tez
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A toolbox for industrial data analytics and stream processing | 614 |
| | A distributed stream processing framework for handling high-volume data streams with fault tolerance and durability guarantees | 817 |
| | A data pipeline engine designed to manage and process large volumes of security telemetry data at scale | 651 |
| | An analytics engine designed to handle large-scale data processing and analysis | 40,170 |
| | A high-performance real-time analytics database for fast queries and ingest | 13,548 |
| | Enables data processing and transformation in large files using a high-level language with compile-time optimizations for efficient execution on distributed computing frameworks | 682 |
| | A Java framework that simplifies Hadoop's MapReduce API to build efficient data processing pipelines | 57 |
| | A distributed system for streaming data between heterogeneous systems with high reliability and throughput at scale | 931 |
| | A workflow management system designed to efficiently run pipelines in various environments | 901 |
| | A workflow engine for unifying feature engineering and machine learning operations in data analysis pipelines | 1 |
| | Distributed query engine for Apache DataFusion applications | 1,580 |
| | A lightweight tool for managing and versioning large data alongside source code in data pipelines | 184 |
| | A distributed data pipeline service for collecting, aggregating, and dispatching large volumes of application events | 794 |
| | A tool for streaming data between Apache RocketMQ and other systems | 122 |
| | A software library of stochastic streaming algorithms, providing efficient data processing and analysis tools | 899 |