dataform

Data pipeline framework

A framework for managing data operations in BigQuery using SQL and software engineering best practices

Dataform is a framework for managing SQL-based data operations in BigQuery.

GitHub

860 stars
27 watching
167 forks
Language: TypeScript
Last commit: about 1 month ago
Linked from 1 awesome list

analytics, business-intelligence, data-engineering, data-pipelines, elt, etl, hacktoberfest

Related projects:

Repository | Description | Stars
sitecore/data-exchange-framework-docs | A documentation project for an ETL tool used in Sitecore to exchange and process data | 1
vectaport/flowgraph | A software framework for building scalable, asynchronous data pipelines with explicit back-pressure management and logging capabilities | 60
huggingface/datatrove | A platform-agnostic data processing framework for large-scale text data pipelines | 2,103
googlecloudplatform/dataflowtemplates | A collection of pre-implemented data pipelines using Google Cloud Dataflow and Apache Beam | 1,169
log2timeline/dftimewolf | A framework for orchestrating data collection, processing, and export | 299
giacbrd/smartpipeline | A framework for designing and executing concurrent data pipelines with a focus on simplicity and efficiency | 25
dagworks-inc/hamilton | Helps define and manage data transformations with a modular, self-documenting, and portable framework for directed acyclic graphs (DAGs) of data transformations | 1,900
dataformsjs/dataformsjs | A minimal JavaScript framework for rapid development of high-quality websites and single-page applications using JSX, Web Components, and templating engines | 191
ph200/cycle-react | An RxJS-based framework for building functional React applications with controlled data flow | 370
microsoft/chart-parts | A React-based data visualization framework that abstracts away common charting complexities | 609
raftlib/raftlib | A C++ library providing a framework for implementing parallel and concurrent data processing pipelines | 955
jexia/semaphore | A tool to manage and expose complex data flows through multiple protocols | 94
galaxyproject/galaxy | A platform for data-intensive scientific analysis and workflow management | 1,431
biocorecrg/bionextflow | A collection of reusable modules and sub-workflows for Nextflow pipelines in bioinformatics | 26
datacrypt-project/hitchhiker-tree | A data structure and application framework for building fast, persistent, and scalable databases | 1,191