feathr

Data pipeline management

Feathr – A scalable, unified data and AI engineering platform for enterprise

GitHub

2k stars
84 watching
260 forks
Language: Scala
Last commit: 10 months ago
Linked from 2 awesome lists

Topics: apache-spark, artificial-intelligence, azure, data-engineering, data-quality, data-science, feature-engineering, feature-governance, feature-management, feature-marketplace, feature-metadata, feature-platform, feature-store, machine-learning, mlops

Related projects:

Repository | Description | Stars
apache/tez | A system that enables flexible data processing pipelines using a low-level engine for higher-level frameworks | 482
floomai/floom | A platform for orchestrating and executing Generative AI pipelines, empowering developers to automate complex tasks | 37
combust/mleap | Enables deployment of machine learning pipelines from Spark and Scikit-Learn to production | 1,506
apache/streampipes | A toolbox for industrial data analytics and stream processing | 614
tenzir/tenzir | A data pipeline engine designed to manage and process large volumes of security telemetry data at scale | 651
linkedin/brooklin | A distributed system for streaming data between heterogeneous systems with high reliability and throughput at scale | 931
apache/spark | An analytics engine designed to handle large-scale data processing and analysis | 40,170
stratio/sparta | An Apache Spark-based platform for building real-time analytics workflows with a focus on simplicity and extensibility | 525
huo-ju/dfserver | A distributed backend AI pipeline server for building and managing GPU clusters to run various AI models | 349
kbrw/adap | A data augmentation pipeline built on top of Elixir, designed to process data streams and apply transformation rules in real time | 16
nessos/streams | A lightweight library for building efficient data pipelines using functional programming concepts | 383
cloud-cv/evalai | A platform for comparing and evaluating AI and machine learning algorithms at scale | 1,779
kevin-hanselman/dud | A lightweight tool for managing and versioning large data alongside source code in data pipelines | 184
databricks/tensorframes | Enables manipulation of Apache Spark DataFrames using TensorFlow programs | 749
webankfintech/dataspherestudio | A comprehensive platform for managing and developing data applications, providing tools for data exchange, analysis, visualization, and workflow management | 3,100