spark-xml
XML parser
A library that parses and queries XML data in Apache Spark
XML data source for Spark SQL and DataFrames
504 stars
39 watching
226 forks
Language: Scala
Last commit: 4 months ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
databricks/spark-csv | A library for parsing and querying CSV data with Apache Spark | 1,052 |
apache/spark | An analytics engine designed to handle large-scale data processing and analysis | 40,170 |
indix/sparkplug | A Spark-based package to apply data fixes using rule-based SQL conditions | 28 |
databricks/tensorframes | Enables manipulation of Apache Spark DataFrames using TensorFlow programs | 749 |
databricks/spark-corenlp | Wraps Stanford CoreNLP annotators as Spark DataFrame functions for natural language processing tasks | 422 |
spiritlab/spark | A research-focused implementation of Apache Spark with homomorphic encryption support | 3 |
datastax/spark-cassandra-connector | A library that enables integration between Apache Spark and Apache Cassandra for fast data processing and analysis | 1,944 |
drmohundro/swxmlhash | A Swift wrapper around XML parsing APIs that provides a simple way to parse XML into a dictionary of arrays | 1,412 |
tugdualsarazin/spark-clustering | Implementations of clustering algorithms using Spark in Scala | 18 |
scalawilliam/xs4s | Utilities and API for parsing and streaming XML data in Scala | 60 |
dotnet/spark | Provides high-performance APIs for using Apache Spark with .NET | 2,032 |
yahoojapan/swiftyxmlparser | An XML parsing library implemented in Swift | 584 |
databricks/docker-spark-iceberg | A Docker-based environment for running Spark and Iceberg in a quick start scenario | 261 |
svenkreiss/pysparkling | A lightweight Python implementation of Spark's RDD and DStream interfaces for improved performance on small datasets | 262 |
spark-clustering-notebook/g-stream | An implementation of data stream clustering algorithms using Spark Streaming | 3 |