lakeFS
Data lake manager
A tool for managing data lakes and versioning data transformations
lakeFS - Data version control for your data lake | Git for data
4k stars
44 watching
360 forks
Language: Go
Last commit: 1 day ago
Linked from 6 awesome lists
apache-spark, apache-sparksql, aws-s3, azure-blob-storage, azure-storage, data-engineering, data-lake, data-quality, data-version-control, data-versioning, datalake, datalakes, git-for-data, go, golang, google-cloud-storage, hadoop-filesystem, lakefs, object-storage
Related projects:
Repository | Description | Stars |
---|---|---|
juicedata/juicefs | A distributed POSIX file system designed for cloud-native environments, providing high performance and compatibility with various storage engines. | 11,030 |
delta-io/delta | An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs | 7,677 |
airbytehq/airbyte | A platform for building data integration pipelines between various data sources and destinations | 16,441 |
dotnet/efcore | A modern object-database mapper for .NET, supporting LINQ queries and schema migrations. | 13,838 |
cubefs/cubefs | A cloud-native file storage system designed to support large-scale data centers and hybrid cloud infrastructures | 4,724 |
netdata/netdata | A high-performance observability platform designed to simplify modern infrastructure monitoring and provide real-time insights into systems and applications. | 72,607 |
dolthub/dolt | A system that integrates version control with SQL databases, allowing developers to track changes and collaborate on database schema and data. | 18,052 |
databendlabs/databend | A high-performance, scalable data warehouse built on Rust, offering blazing-fast query execution and real-time analytics capabilities. | 7,978 |
microsoft/vfsforgit | A Windows-based virtual file system that optimizes Git performance by caching and managing files on demand. | 5,991 |
cloudquery/cloudquery | An open-source ELT framework that enables data movement between any source and destination using high-performance data ingestion and processing | 5,913 |
teevity/ice | An AWS usage and cost management tool that aggregates data from billing files to provide insights and enable informed decision-making for cloud resource allocation and reservations. | 2,861 |
git-lfs/git-lfs | Manages large files in version control systems like Git | 13,096 |
openmined/pysyft | Enables data scientists to perform analysis on private data without accessing the underlying data, using a secure and decentralized server architecture. | 9,557 |
rowyio/rowy | A web-based platform for managing data in Firestore using a spreadsheet-like interface and automating workflows with cloud functions. | 6,233 |
foundation/foundation-sites | A comprehensive front-end framework for building responsive sites and apps on various devices | 29,671 |