galaxy

Data pipeline manager

An integrated framework for data-intensive scientific analysis and workflow management

Data-intensive science for everyone.

GitHub

1k stars
71 watching
1k forks
Language: Python
last commit: 4 days ago
Linked from 1 awesome list

Topics: bioinformatics, dna, genomics, hacktoberfest, ngs, pipeline, science, sequencing, usegalaxy, workflow, workflow-engine

Related projects:

Repository | Description | Stars
samapriya/planet-gee-pipeline-cli | A command-line tool for automating data processing and uploads from Planet's API to Google Earth Engine. | 42
bjpop/rubra | A bioinformatics pipeline system that supports running workflow stages on a distributed compute cluster. | 38
natcap/taskgraph | A Python library for managing and optimizing computational workflows with parallel processing and data reuse. | 21
kevin-hanselman/dud | A lightweight tool for managing and versioning large data alongside source code in data pipelines. | 183
databiosphere/toil | A workflow management system designed to efficiently run pipelines in various environments. | 901
prodmodel/prodmodel | A tool for managing data science pipelines by automating build, testing, and deployment processes while ensuring correctness and performance. | 59
linkedin/brooklin | A distributed system for streaming data between heterogeneous systems with high reliability and throughput at scale. | 920
scipipe/scipipe | A flexible and efficient way to write and run complex workflows in the Go programming language. | 1,075
giacbrd/smartpipeline | A framework for designing and executing concurrent data pipelines with a focus on simplicity and efficiency. | 23
montilab/pipeliner | A framework for defining and automating bioinformatics pipelines using Nextflow. | 44
lightforever/mlcomp | A distributed framework for building and managing complex machine learning pipelines with a user-friendly interface. | 188
pipefunc/pipefunc | Automates and simplifies the creation of function pipelines for efficient execution of scientific workflows. | 215
whitaker-io/machine | A library for creating data workflows that can be simple or complex, with features like recursion and memoization. | 158
formlio/forml | A framework for managing the lifecycle of data science projects from research to production. | 104
huggingface/datatrove | A platform-agnostic data processing framework for large-scale text data pipelines. | 2,043