delta

Storage framework

An open-source storage framework for building a Lakehouse architecture, with support for compute engines including Spark, PrestoDB, Flink, Trino, and Hive, and with APIs

GitHub

8k stars
217 watching
2k forks
Language: Scala
Last commit: 6 days ago
Linked from 4 awesome lists

Topics: acid, analytics, big-data, delta-lake, spark

Related projects:

Repository | Description | Stars
delta-io/delta-rs | A Rust library that provides low-level APIs and operations for working with the Delta Lake data storage format | 2,324
apache/kyuubi | An Apache project providing a distributed and multi-tenant gateway to enable serverless SQL on data warehouses and lakehouses | 2,105
delta-incubator/deltaray | A Python library that provides a Delta Lake table reader for the Ray open-source ML toolkit | 43
treeverse/lakefs | A tool for managing data lakes and versioning data transformations | 4,458
pypi/warehouse | A software system that powers the package registry for Python packages | 3,601
databricks/koalas | A Python package that allows users to work with pandas DataFrames on top of Apache Spark | 3,336
deltares/pyflwdir | A Python package for fast and efficient hydrological and topographic data processing | 75
juicedata/juicefs | A distributed POSIX file system designed for cloud-native environments, providing high performance and compatibility with various storage engines | 10,904
apache/spark | An analytics engine designed to handle large-scale data processing and analysis | 39,916
deltares/wflow.jl | A Julia package for simulating hydrological processes in various configurations | 120
helgeho/archivespark | A framework for efficient data processing and extraction from archival collections, enabling the transformation of raw data into more accessible formats | 145
internetarchive/sparkling | A data processing library built on top of Apache Spark to handle temporal web data | 11
dandavison/delta | A tool for efficiently viewing and navigating version control output | 24,394
dropbox/pyhive | A Python library that provides interfaces to connect and interact with data sources like Hive and Presto | 1,671
deltares/ribasim | A water resources modeling framework built in Julia that simulates the behavior of complex hydrological systems | 42