crunch

Data processor

A toolkit for extracting insights from large datasets by parsing and processing semi-structured data

A fast-to-develop, fast-to-run, Go-based toolkit for ETL and feature extraction on Hadoop.

GitHub

214 stars
18 watching
16 forks
Language: Go
Last commit: about 10 years ago
Linked from 2 awesome lists


Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| pietern/goestools | A toolset for working with signals and files from GOES satellites | 372 |
| apache/spark | An analytics engine designed to handle large-scale data processing and analysis | 39,916 |
| elastic/logstash | A real-time data processing pipeline that transforms and sends data to a storage system | 75 |
| cube2222/jql | A JSON query processor with a custom syntax that simplifies complex queries by breaking them down into step-by-step operations | 896 |
| nikolaydubina/go-featureprocessing | A Go library for fast and simple feature engineering and machine learning data preprocessing | 121 |
| jason-kerney/peelandslice.java | A Java implementation of a self-contained, serverless, and zero-configuration data processing framework | 1 |
| snoyberg/conduit | A framework for handling and transforming streaming data in a consistent and efficient way | 903 |
| obrok/lens | A utility for working with nested data structures | 190 |
| benmack/eo-box | A toolbox for processing earth observation data with Python | 14 |
| yaa110/goterator | An iterator implementation providing map and reduce functionalities for data processing in Go | 16 |
| intake/intake | A package for describing, loading, and processing data in a declarative way | 1,013 |
| davidbyttow/govips | A fast image processing and resizing library for Go | 1,298 |
| utdemir/distributed-dataset | A Haskell-based framework for processing and distributing large datasets across multiple nodes in parallel | 116 |
| sinhashubham95/jsonic | A comprehensive set of utilities to handle JSON data in Go | 11 |
| deatil/go-array | A Go package for working with nested data structures like maps and slices | 20 |