dftimewolf
Data pipeline
A framework for orchestrating forensic data collection, processing, and export
299 stars
27 watching
72 forks
Language: Python
Last commit: about 2 months ago
Linked from 3 awesome lists
Related projects:
| Repository | Description | Stars |
|---|---|---|
| giacbrd/smartpipeline | A framework for designing and executing concurrent data pipelines with a focus on simplicity and efficiency | 25 |
| databiosphere/toil | A workflow management system designed to efficiently run pipelines in various environments | 901 |
| huggingface/datatrove | A platform-agnostic data processing framework for large-scale text data pipelines | 2,103 |
| kevin-hanselman/dud | A lightweight tool for managing and versioning large data alongside source code in data pipelines | 184 |
| sitecore/data-exchange-framework-docs | A documentation project for an ETL tool used in Sitecore to exchange and process data | 1 |
| pdpipe/pdpipe | A set of pre-defined data processing pipelines for pandas DataFrames | 718 |
| elastic/logstash | A real-time data processing pipeline that transforms and sends data to a storage system | 14,293 |
| intentmedia/mario | A library that enables the definition of complex data pipelines in a functional, typesafe, and efficient way using a declarative syntax | 139 |
| vectaport/flowgraph | A software framework for building scalable, asynchronous data pipelines with explicit back-pressure management and logging capabilities | 60 |
| d6t/d6tflow | A Python library to build and manage complex data science workflows efficiently | 953 |
| galaxyproject/galaxy | A platform for data-intensive scientific analysis and workflow management | 1,431 |
| druths/xp | A tool for creating flexible and self-documenting data science pipelines | 56 |
| tenzir/tenzir | A data pipeline engine designed to manage and process large volumes of security telemetry data at scale | 651 |
| vishalanandl177/drf-api-logger | Logs API requests and responses to a database or listens for signals to store data in various formats | 314 |
| dataform-co/dataform | A framework for managing data operations in BigQuery using SQL and software engineering best practices | 860 |