dftimewolf

Data pipeline

A framework for orchestrating forensic collection, processing, and data export

GitHub

299 stars
27 watching
72 forks
Language: Python
Last commit: about 2 months ago
Linked from 3 awesome lists


Related projects:

| Repository | Description | Stars |
|---|---|---|
| giacbrd/smartpipeline | A framework for designing and executing concurrent data pipelines with a focus on simplicity and efficiency | 25 |
| databiosphere/toil | A workflow management system designed to efficiently run pipelines in various environments | 901 |
| huggingface/datatrove | A platform-agnostic data processing framework for large-scale text data pipelines | 2,103 |
| kevin-hanselman/dud | A lightweight tool for managing and versioning large data alongside source code in data pipelines | 184 |
| sitecore/data-exchange-framework-docs | A documentation project for an ETL tool used in Sitecore to exchange and process data | 1 |
| pdpipe/pdpipe | Provides a set of pre-defined data processing pipelines for pandas DataFrames | 718 |
| elastic/logstash | A real-time data processing pipeline that transforms and sends data to a storage system | 14,293 |
| intentmedia/mario | A library that enables the definition of complex data pipelines in a functional, typesafe, and efficient way using a declarative syntax | 139 |
| vectaport/flowgraph | A software framework for building scalable, asynchronous data pipelines with explicit back-pressure management and logging capabilities | 60 |
| d6t/d6tflow | A Python library to build and manage complex data science workflows efficiently | 953 |
| galaxyproject/galaxy | A platform for data-intensive scientific analysis and workflow management | 1,431 |
| druths/xp | A tool for creating flexible and self-documenting data science pipelines | 56 |
| tenzir/tenzir | A data pipeline engine designed to manage and process large volumes of security telemetry data at scale | 651 |
| vishalanandl177/drf-api-logger | Logs API requests and responses to a database or listens for signals to store data in various formats | 314 |
| dataform-co/dataform | A framework for managing data operations in BigQuery using SQL and software engineering best practices | 860 |