tez
data pipeline engine
A low-level execution engine that higher-level frameworks use to build flexible data processing pipelines
Apache Tez
482 stars
34 watching
424 forks
Language: Java
last commit: 2 months ago
Linked from 1 awesome list
apache, big-data, hadoop, java, tez
Related projects:
| Description | Stars |
|---|---|
| A toolbox for industrial data analytics and stream processing | 614 |
| A distributed stream processing framework for handling high-volume data streams with fault tolerance and durability guarantees | 817 |
| A data pipeline engine designed to manage and process large volumes of security telemetry data at scale | 651 |
| An analytics engine designed to handle large-scale data processing and analysis | 40,170 |
| A high-performance real-time analytics database for fast queries and ingest | 13,548 |
| A high-level language with compile-time optimizations for processing and transforming large files efficiently on distributed computing frameworks | 682 |
| A Java framework that simplifies Hadoop's MapReduce API to build efficient data processing pipelines | 57 |
| A distributed system for streaming data between heterogeneous systems with high reliability and throughput at scale | 931 |
| A workflow management system designed to efficiently run pipelines in various environments | 901 |
| A workflow engine for unifying feature engineering and machine learning operations in data analysis pipelines | 1 |
| Distributed query engine for Apache DataFusion applications | 1,580 |
| A lightweight tool for managing and versioning large data alongside source code in data pipelines | 184 |
| A distributed data pipeline service for collecting, aggregating, and dispatching large volumes of application events | 794 |
| A tool for streaming data between Apache RocketMQ and other systems | 122 |
| A software library of stochastic streaming algorithms, providing efficient data processing and analysis tools | 899 |