dataform
Data pipeline framework
A framework for managing data operations in BigQuery using SQL and software engineering best practices
856 stars
27 watching
166 forks
Language: TypeScript
Last commit: 2 days ago
Linked from 1 awesome list
Topics: analytics, business-intelligence, data-engineering, data-pipelines, elt, etl, hacktoberfest
Related projects:
Repository | Description | Stars |
---|---|---|
sitecore/data-exchange-framework-docs | A documentation project for an ETL tool used in Sitecore to exchange and process data | 1 |
vectaport/flowgraph | A software framework for building scalable, asynchronous data pipelines with explicit back-pressure management and logging capabilities. | 60 |
huggingface/datatrove | A platform-agnostic data processing framework for large-scale text data pipelines | 2,073 |
googlecloudplatform/dataflowtemplates | A collection of pre-implemented data pipelines using Google Cloud Dataflow and Apache Beam | 1,164 |
log2timeline/dftimewolf | A framework for orchestrating data collection, processing, and export | 296 |
giacbrd/smartpipeline | A framework for designing and executing concurrent data pipelines with a focus on simplicity and efficiency | 23 |
dagworks-inc/hamilton | Helps define and manage data transformations with a modular, self-documenting, and portable framework for directed acyclic graphs (DAGs) of data transformations. | 1,884 |
dataformsjs/dataformsjs | A minimal JavaScript framework for rapid development of high-quality websites and single-page applications using JSX, Web Components, and templating engines. | 191 |
ph200/cycle-react | An RxJS-based framework for building functional React applications with controlled data flow | 370 |
microsoft/chart-parts | A React-based data visualization framework that abstracts away common charting complexities. | 608 |
raftlib/raftlib | A C++ library providing a framework for implementing parallel and concurrent data processing pipelines. | 953 |
jexia/semaphore | Builds high-performance data flows that can be exposed through multiple protocols and integrates with existing systems. | 94 |
galaxyproject/galaxy | A platform for data-intensive scientific analysis and workflow management | 1,416 |
biocorecrg/bionextflow | A collection of reusable modules and sub-workflows for Nextflow pipelines in bioinformatics | 26 |
datacrypt-project/hitchhiker-tree | A data structure and application framework for building fast, persistent, and scalable databases. | 1,190 |