spark-xml

XML parser

A library that parses and queries XML data in Apache Spark

XML data source for Spark SQL and DataFrames

GitHub

504 stars
39 watching
226 forks
Language: Scala
Last commit: 4 months ago
Linked from 1 awesome list


Related projects:

Repository | Description | Stars
databricks/spark-csv | A library for parsing and querying CSV data with Apache Spark | 1,052
apache/spark | An analytics engine designed to handle large-scale data processing and analysis | 40,170
indix/sparkplug | A Spark-based package to apply data fixes using rule-based SQL conditions | 28
databricks/tensorframes | Enables manipulation of Apache Spark DataFrames using TensorFlow programs | 749
databricks/spark-corenlp | Wraps Stanford CoreNLP annotators as Spark DataFrame functions for natural language processing tasks | 422
spiritlab/spark | A research-focused implementation of Apache Spark with homomorphic encryption support | 3
datastax/spark-cassandra-connector | A library that enables integration between Apache Spark and Apache Cassandra for fast data processing and analysis | 1,944
drmohundro/swxmlhash | A Swift wrapper around XML parsing APIs, providing a simple way to parse XML into a dictionary of arrays | 1,412
tugdualsarazin/spark-clustering | Implementations of clustering algorithms using Spark in Scala | 18
scalawilliam/xs4s | Utilities and API for parsing and streaming XML data in Scala | 60
dotnet/spark | Provides high-performance APIs for using Apache Spark with .NET | 2,032
yahoojapan/swiftyxmlparser | An XML parsing library implemented in Swift | 584
databricks/docker-spark-iceberg | A Docker-based environment for running Spark and Iceberg in a quick start scenario | 261
svenkreiss/pysparkling | A lightweight Python implementation of Spark's RDD and DStream interfaces for improved performance on small datasets | 262
spark-clustering-notebook/g-stream | An implementation of data stream clustering algorithms using Spark Streaming | 3