docker-spark-iceberg
Spark environment
A Docker-based quick-start environment for running Spark and Iceberg.
264 stars
13 watching
139 forks
Language: Jupyter Notebook
Last commit: 10 months ago

Related projects:

| Repository | Description | Stars |
|---|---|---|
| | An open source library that enables interactive development of applications using remote Spark clusters | 1,334 |
| | A Docker image with Apache Spark pre-installed and configured for easy deployment on YARN clusters | 765 |
| | A library that parses and queries XML data in Apache Spark | 504 |
| | A library for parsing and querying CSV data with Apache Spark | 1,052 |
| | A tool to simplify running Spark on Kubernetes | 181 |
| | A lightweight Python implementation of Spark's RDD and DStream interfaces for improved performance on small datasets | 262 |
| | An environment for experimenting with real-time data processing using Kafka, Spark streaming, and Cassandra | 97 |
| | Wraps Stanford CoreNLP annotators as Spark DataFrame functions for natural language processing tasks | 422 |
| | A Spark-based package to apply data fixes using rule-based SQL conditions | 28 |
| | An R interface to Apache Spark for distributed data analysis and machine learning | 955 |
| | An analytics engine designed to handle large-scale data processing and analysis | 40,170 |
| | Enables manipulation of Apache Spark DataFrames using TensorFlow programs | 749 |
| | An introductory Scala app using Apache Spark Streaming to process data from Kafka and write summaries to Cassandra | 23 |
| | A comprehensive guide to Docker development and deployment | 14 |
| | Automates deployment of an AWS EMR cluster and execution of Spark jobs | 8 |