hudi
Data manager
Manages large analytical datasets on distributed storage systems by enabling incremental processing and snapshot isolation.
Upserts, Deletes And Incremental Processing on Big Data.
5k stars
1k watching
2k forks
Language: Java
last commit: about 19 hours ago
Linked from 2 awesome lists
apacheflink, apachehudi, apachespark, bigdata, data-integration, datalake, hudi, incremental-processing, stream-processing
Related projects:
Repository | Description | Stars |
---|---|---|
apache/hive | A software project that enables data warehousing and management of large datasets using SQL | 5,561 |
apache/kylin | An OLAP engine designed to handle Big Data with sub-second query latency and seamless integration with BI tools. | 3,655 |
apache/incubator-hugegraph | A graph database designed to handle large-scale data storage and querying | 2,655 |
apache/iotdb | A time-series data management system for industrial IoT applications | 5,625 |
apache/shardingsphere | A distributed SQL query and transaction engine for sharding, scaling, encryption, and more on any database | 19,985 |
apache/ignite | A distributed, in-memory database system for high-performance computing and data processing | 4,822 |
juicedata/juicefs | A distributed POSIX file system designed for cloud-native environments, providing high performance and compatibility with various storage engines. | 10,948 |
intel-bigdata/hibench | A set of benchmarking tools to evaluate big data frameworks' performance and resource utilization | 1,458 |
apache/hbase | A distributed, versioned column-oriented store modelled after Google Bigtable | 5,233 |
pulumi/examples | Demonstrates building and deploying cloud applications and infrastructure across multiple clouds and programming languages using Pulumi. | 2,394 |
hi-primus/optimus | A Python library that provides a simple API for data preparation and analysis on various big-data engines | 1,481 |
apache/iceberg | Enables reliable and simple access to huge analytic tables across multiple engines | 6,494 |
apache/flink | An open-source stream processing framework with powerful capabilities for handling high-throughput and low-latency data streams in various programming languages | 24,156 |
volfpeter/fastapi-htmx-tailwind-example | An IoT dashboard application showcasing the integration of FastAPI, HTMX, TailwindCSS, and MongoDB for an interactive frontend experience. | 37 |
apache/pinot | A distributed real-time analytics system with low latency | 5,523 |