bonobo

Data processor

A Python framework for parallelizing data transformations and processing

Extract Transform Load for Python 3.5+

GitHub

2k stars
58 watching
146 forks
Language: Python
last commit: over 1 year ago
Linked from 1 awesome list

automation, bonobo, data-processing, extract-transform-load, parallelization, python3

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| gbif/pygbif | A Python client for accessing biodiversity data from the GBIF API. | 111 |
| apache/dubbo-python | A Python implementation of a high-performance RPC framework with service discovery and traffic management features | 268 |
| reubano/meza | A lightweight toolkit for processing tabular data with a focus on functional programming and PyPy compatibility. | 416 |
| pytorch/data | A PyTorch project providing data loading utilities and scalable dataloading solutions | 1,133 |
| apache/pig | Enables data processing and transformation in large files using a high-level language with compile-time optimizations for efficient execution on distributed computing frameworks. | 681 |
| dr-leo/pandasdmx | Provides tools to access and manipulate SDMX-compliant data in various formats | 127 |
| mahmoud/glom | Provides a declarative way to handle nested data structures in Python | 1,917 |
| cgarciae/phi | A Python library for functional programming that aims to simplify the experience by providing a unified API and operator overloading for common data transformations and operations. | 134 |
| pyjanitor-devs/pyjanitor | A Python library providing a clean and expressive API for data cleaning by chaining multiple operations together in a logical order. | 1,364 |
| belgianbiodiversityplatform/python-dwca-reader | A tool to parse and retrieve biodiversity data from archived files | 45 |
| proycon/python-frog | A Python binding to a C++ NLP tool for Dutch language processing tasks | 47 |
| dano/aioprocessing | A Python library that integrates asyncio with multiprocessing for concurrent task execution | 655 |
| benmack/eo-box | A toolbox for processing earth observation data with Python. | 14 |
| bytewax/bytewax | A Python framework for stateful stream and event processing with built-in connectors and flexible dataflow capabilities. | 1,558 |
| h2oai/datatable | A Python package for manipulating 2-dimensional tabular data structures with an emphasis on speed and big data support. | 1,817 |