petastorm

Deep learning data loader

Enables training and evaluation of deep learning models from Apache Parquet datasets in various machine learning frameworks

The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can also be used from pure Python code.
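As a rough illustration of the pure-Python usage mentioned above, the following sketch counts the rows of an existing Parquet dataset. It assumes Petastorm is installed and uses its `make_batch_reader` entry point for plain Parquet stores; the helper names `to_dataset_url` and `count_rows` are hypothetical and not part of the library.

```python
from pathlib import Path


def to_dataset_url(path):
    """Turn a local directory path into a file:// URL (hypothetical helper)."""
    return Path(path).resolve().as_uri()


def count_rows(dataset_url):
    """Count the rows of a Parquet dataset by iterating over its batches."""
    # Imported lazily so that to_dataset_url works without Petastorm installed.
    from petastorm import make_batch_reader

    total = 0
    # num_epochs=1 makes the reader stop after a single pass over the data.
    with make_batch_reader(dataset_url, num_epochs=1) as reader:
        for batch in reader:
            # Each batch is a namedtuple holding one array per column;
            # the length of the first column is the batch's row count.
            total += len(batch[0])
    return total
```

Calling `count_rows(to_dataset_url("/path/to/parquet_dir"))` would then walk the dataset once; the path here is a placeholder. Framework-specific adapters for TensorFlow and PyTorch wrap the same kind of reader.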

GitHub

2k stars
40 watching
284 forks
Language: Python
Last commit: 12 months ago
Linked from 1 awesome list

Tags: deep-learning, machine-learning, parquet, parquet-files, pyarrow, pyspark, pytorch, sysml, tensorflow

Related projects:

Repository | Description | Stars
open-mmlab/mmengine | Provides a flexible and configurable framework for training deep learning models with PyTorch. | 1,179
uber/neuropod | A unified interface to run deep learning models from multiple frameworks using C++ and Python. | 936
erotemic/netharn | A PyTorch framework for managing and automating deep learning training loops with features like hyperparameter tracking and single-file deployments. | 39
eduardoleao052/js-pytorch | A JavaScript library that provides GPU-accelerated deep learning capabilities with automatic differentiation and neural network layers. | 1,084
baguasys/bagua | A framework for accelerating PyTorch deep learning training. | 877
dmmiller612/sparktorch | A PyTorch implementation on Apache Spark for distributed deep learning model training and inference. | 339
pyg-team/pytorch-frame | A deep learning framework for handling heterogeneous tabular data with diverse column types. | 543
zhanghang1989/pytorch-encoding | A Python framework for building deep learning models with optimized encoding layers and batch normalization. | 2,041
graal-research/poutyne | A PyTorch framework simplifying neural network training with automated boilerplate code and callback utilities. | 569
isht7/pytorch-deeplab-resnet | A deep learning model implementation of the DeepLab ResNet architecture for image segmentation tasks. | 602
xxradon/igcv3-pytorch | Reimplements MobileNet-V2 and IGCV3 using PyTorch for efficient deep learning. | 19
ramon-oliveira/aorun | A deep learning framework on top of PyTorch for building neural networks. | 61
rdspring1/pytorch_gbw_lm | Trains a large-scale PyTorch language model on the 1-Billion Word dataset. | 123
4uiiurz1/pytorch-res2net | Implementations of deep learning architectures using PyTorch for image classification tasks on various datasets. | 112
dmarnerides/pydlt | A PyTorch-based toolbox for building and training deep learning models with ease. | 204