feathr

Data pipeline management

Feathr: a scalable, unified data and AI engineering platform for enterprise

GitHub

2k stars
84 watching
260 forks
Language: Scala
Last commit: 8 months ago
Linked from 2 awesome lists

apache-spark, artificial-intelligence, azure, data-engineering, data-quality, data-science, feature-engineering, feature-governance, feature-management, feature-marketplace, feature-metadata, feature-platform, feature-store, machine-learning, mlops

Related projects:

| Repository | Description | Stars |
|---|---|---|
| apache/tez | A system that enables flexible data processing pipelines using a low-level engine for higher-level frameworks | 479 |
| floomai/floom | A platform for orchestrating and executing Generative AI pipelines, empowering developers to automate complex tasks | 36 |
| combust/mleap | Enables deployment of machine learning data pipelines and algorithms to production | 1,504 |
| apache/streampipes | A toolbox for industrial data analytics and stream processing | 605 |
| tenzir/tenzir | A data pipeline engine designed to manage and process large volumes of security telemetry data at scale | 645 |
| linkedin/brooklin | A distributed system for streaming data between heterogeneous systems with high reliability and throughput at scale | 920 |
| apache/spark | An analytics engine designed to handle large-scale data processing and analysis | 39,916 |
| stratio/sparta | A real-time analytics platform built on Apache Spark and Kafka, allowing users to process large datasets in near-real time using declarative workflows | 525 |
| huo-ju/dfserver | A distributed backend AI pipeline server for building and managing GPU clusters to run various AI models | 348 |
| kbrw/adap | A data augmentation pipeline built on top of Elixir, designed to process data streams and apply transformation rules in real time | 16 |
| nessos/streams | A lightweight library for building efficient data pipelines using functional programming concepts | 383 |
| cloud-cv/evalai | A platform for comparing and evaluating AI and machine learning algorithms at scale | 1,771 |
| kevin-hanselman/dud | A lightweight tool for managing and versioning large data alongside source code in data pipelines | 183 |
| databricks/tensorframes | Enables manipulation of Apache Spark DataFrames using TensorFlow programs | 749 |
| webankfintech/dataspherestudio | A comprehensive platform for managing and developing data applications, providing tools for data exchange, analysis, visualization, and workflow management | 3,089 |