sparkplug

Data fixer

A Spark-based package to apply data fixes using rule-based SQL conditions

Spark package to "plug" holes in data using SQL-based rules ⚡️ 🔌
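
The idea of fixing data with rule-based SQL conditions can be sketched in plain Spark SQL. The following is only an illustration under stated assumptions, not sparkplug's actual API: the `FixRule` case class, the `applyRules` helper, and the sample columns (`title`, `price`, `tier`) are all hypothetical names introduced for this example. Each rule pairs a SQL condition with a column and the value to write where the condition holds.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, expr, lit, when}

// Hypothetical rule shape for illustration only; not sparkplug's real rule format.
final case class FixRule(name: String, condition: String, column: String, value: String)

object RuleSketch {
  // Apply the rules in order: where a rule's SQL condition is true,
  // overwrite the target column; otherwise keep the existing value.
  def applyRules(df: DataFrame, rules: Seq[FixRule]): DataFrame =
    rules.foldLeft(df) { (acc, rule) =>
      acc.withColumn(
        rule.column,
        when(expr(rule.condition), lit(rule.value)).otherwise(col(rule.column))
      )
    }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("rule-sketch")
      .getOrCreate()
    import spark.implicits._

    // Sample data with a "hole": every row has a placeholder tier.
    val products = Seq(("Table", 10.0, "unknown"), ("Chair", 2000.0, "unknown"))
      .toDF("title", "price", "tier")

    // One rule: rows matching the SQL condition get tier = "premium".
    val rules = Seq(FixRule("premium-tier", "price > 1000", "tier", "premium"))

    applyRules(products, rules).show()
    spark.stop()
  }
}
```

Keeping the condition as a SQL string means rules can live outside the code (for example in a config file) and be parsed by Spark with `expr` at run time; the actual package may structure its rules differently.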

GitHub

28 stars
7 watching
2 forks
Language: Scala
last commit: over 4 years ago
Topics: data, pipeline, spark, spark-sql

Related projects:

| Repository | Description | Stars |
|---|---|---|
| databricks/spark-xml | A library that parses and queries XML data in Apache Spark | 505 |
| apache/spark | An analytics engine designed to handle large-scale data processing and analysis | 39,916 |
| datastax/spark-cassandra-connector | A library that enables integration between Apache Spark and Apache Cassandra for fast data processing and analysis | 1,943 |
| willb/silex | A library of reusable code for building scalable Spark applications | 19 |
| datastax/spark-cassandra-stress | A tool for testing the performance and stability of data integration between Apache Spark and Cassandra databases | 25 |
| databricks/spark-csv | A library for parsing and querying CSV data with Apache Spark | 1,053 |
| sparklyr/sparklyr | An R interface to Apache Spark for distributed data analysis and machine learning | 957 |
| dotnet/spark | Provides high-performance APIs for using Apache Spark with .NET | 2,023 |
| wanwizard/sparks-datamapper | A PHP library that maps database tables into objects with automated relationship management and validation | 51 |
| mrpowers-io/spark-fast-tests | A testing helper library for Apache Spark applications | 436 |
| tubular/sparkly | A set of Python libraries and tools to simplify interactions with various data sources using Apache Spark | 60 |
| rougin/spark-plug | A tool that simplifies testing and development with Codeigniter 3 by providing an application instance as a single variable | 15 |
| sparkica/dbpedia-extension | An extension for Google Refine that adds columns to reconciled data from DBpedia | 39 |
| lucidworks/spark-solr | Provides tools to read data from Solr and write it to Spark DataFrames/RDDs, enabling integration with Solr | 445 |
| tdeckers/sparkcli | A command-line interface to Cisco Spark | 14 |