butterfree

Feature pipeline builder

A Python library for building data pipelines to create and load features into a feature store using Apache Spark.

A tool for building feature stores.

GitHub

288 stars

195 watching

36 forks

Language: Python

Last commit: over 1 year ago

Linked from 2 awesome lists

data-engineering, data-science, etl, etl-framework, feature-store, package, pyspark, python

Related projects:

Repository	Description	Stars
amphi-ai/amphi-etl	A tool that enables data analysts to create and manage data pipelines with an intuitive interface, generating Python code for deployment anywhere.	933
jazzband/django-pipeline	An asset packaging library for Django that simplifies CSS and JavaScript concatenation and compression.	1,520
kubeflow-kale/kale	Simplifies the deployment of Kubeflow Pipelines workflows by providing a graphical interface for Data Scientists to define and deploy pipelines directly from JupyterLab.	632
druths/xp	A tool for creating flexible and self-documenting data science pipelines.	56
py-universe/django-rest-cli	A tool that speeds up the development of Django Rest APIs by automating repetitive tasks.	118
zorbash/opus	A framework for building pluggable business logic pipelines with a focus on modular and composable components.	362
johnsonc/lambdo	A workflow engine for unifying feature engineering and machine learning operations in data analysis pipelines.	1
minyus/pipelinex	A Python package to build and experiment with machine learning pipelines using Kedro, MLflow, and other tools.	226
pakoito/rxfunctions	A library for composing and chaining functions on Observables in RxJava to simplify complex data processing pipelines.	49
jackqqwang/pfedhr	A Python project implementing a novel approach to high-performance feature learning and dimensionality reduction in deep neural networks.	7
ypares/porcupine	A tool that enables data manipulation and analysis pipelines to be flexible, reusable, and reproducible in different environments.	89
bytehub-ai/bytehub	A Python-based feature store library with a simple, scalable, and flexible architecture for storing and managing data for machine learning applications.	58
datasalt/pangool	A Java framework that simplifies Hadoop's MapReduce API to build efficient data processing pipelines.	57
quixio/quix-streams	A Python framework for real-time data processing on Apache Kafka streams.	1,246
giacbrd/smartpipeline	A framework for designing and executing concurrent data pipelines with a focus on simplicity and efficiency.	25