delta
Storage framework
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
8k stars
217 watching
2k forks
Language: Scala
last commit: about 2 months ago
Linked from 4 awesome lists
acid, analytics, big-data, delta-lake, spark
Related projects:
Repository | Description | Stars |
---|---|---|
delta-io/delta-rs | A Rust library that provides low-level APIs and operations for working with the Delta Lake data storage format | 2,386 |
apache/kyuubi | An Apache project providing a distributed and multi-tenant gateway to enable serverless SQL on data warehouses and lakehouses | 2,116 |
delta-incubator/deltaray | A Python library that provides a Delta Lake table reader for the Ray open-source ML toolkit | 43 |
treeverse/lakefs | A tool for managing data lakes and versioning data transformations | 4,496 |
pypi/warehouse | The software behind the Python Package Index. | 3,617 |
databricks/koalas | A Python package that allows users to work with pandas DataFrames on top of Apache Spark | 3,343 |
deltares/pyflwdir | A Python package for fast and efficient hydrological and topographic data processing | 78 |
juicedata/juicefs | A distributed POSIX file system designed for cloud-native environments, providing high performance and compatibility with various storage engines. | 11,030 |
apache/spark | An analytics engine designed to handle large-scale data processing and analysis | 40,170 |
deltares/wflow.jl | A Julia package for simulating hydrological processes in various configurations | 122 |
helgeho/archivespark | A framework for efficient data processing and extraction from archival collections, enabling the transformation of raw data into more accessible formats. | 145 |
internetarchive/sparkling | A data processing library built on top of Apache Spark to handle temporal web data | 11 |
dandavison/delta | A tool for efficiently viewing and navigating version control output | 24,778 |
dropbox/pyhive | Provides interfaces to connect and interact with data sources like Hive and Presto using Python. | 1,676 |
deltares/ribasim | A water resources modeling framework built in Julia that simulates the behavior of complex hydrological systems. | 42 |