lakeFS
Data lake manager
A tool for managing data lakes and versioning data transformations
lakeFS - Data version control for your data lake | Git for data
4k stars
44 watching
355 forks
Language: Go
last commit: 4 days ago
Linked from 6 awesome lists
apache-spark, apache-sparksql, aws-s3, azure-blob-storage, azure-storage, data-engineering, data-lake, data-quality, data-version-control, data-versioning, datalake, datalakes, git-for-data, go, golang, google-cloud-storage, hadoop-filesystem, lakefs, object-storage
Related projects:
Repository | Description | Stars |
---|---|---|
juicedata/juicefs | A distributed POSIX file system designed for cloud-native environments, providing high performance and compatibility with various storage engines. | 10,904 |
delta-io/delta | An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive, and provides APIs | 7,593 |
airbytehq/airbyte | A platform for building data integration pipelines between various data sources and destinations | 16,184 |
dotnet/efcore | A modern object-database mapper for .NET, supporting LINQ queries and schema migrations. | 13,787 |
cubefs/cubefs | A cloud-native distributed storage system that enables scalable and high-performance data storage for various applications. | 4,672 |
netdata/netdata | An observability platform designed to monitor and analyze systems in real-time with automated anomaly detection and root cause analysis. | 72,075 |
dolthub/dolt | A system that integrates version control with SQL databases, allowing developers to track changes and collaborate on database schema and data. | 17,965 |
databendlabs/databend | An open-source cloud data warehouse written in Rust, with a focus on high-performance analytics and scalable storage | 7,856 |
microsoft/vfsforgit | A Windows-based virtual file system that optimizes Git performance by caching and managing files on demand. | 5,984 |
cloudquery/cloudquery | An open-source ELT framework that enables data movement between any source and destination using high-performance data ingestion and processing | 5,877 |
teevity/ice | An AWS usage and cost management tool that aggregates data from billing files to provide insights and enable informed decision-making for cloud resource allocation and reservations. | 2,856 |
git-lfs/git-lfs | Manages large files in version control systems like Git | 12,998 |
openmined/pysyft | Enables data scientists to perform analysis on private data without accessing the underlying data, using a secure and decentralized server architecture. | 9,516 |
rowyio/rowy | A low-code platform for managing Firestore databases and building cloud function workflows on the web. | 6,171 |
foundation/foundation-sites | A comprehensive front-end framework for building responsive sites and apps on various devices | 29,666 |