nidaba
OCR pipeline
Automates an OCR pipeline for text digitization, converting raw images into citable texts.
An expandable and scalable OCR pipeline
86 stars
9 watching
12 forks
Language: Python
last commit: over 7 years ago
Linked from 1 awesome list
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | Customizes TEI XML for metadata from OCR processes to capture detailed layout and content information | 1 |
| | Provides a PyTorch implementation of several computer vision tasks, including object detection, segmentation, and parsing | 1,191 |
| | Performs OCR on images and scans them for matches to YARA rules | 40 |
| | Custom spaCy models and pipelines for scientific documents | 1,724 |
| | Provides a base image for creating Python CI pipelines with package manager support | 11 |
| | A decentralized task pipeline on Golem.network using Python | 5 |
| | An OCR library allowing developers to embed high-quality character recognition functionality in their products | 18 |
| | A bioinformatics pipeline system that supports running workflow stages on a distributed compute cluster | 38 |
| | Provides gold standard data for training and testing optical character recognition (OCR) engines | 15 |
| | An OCR API wrapper that enables concurrent execution using Python's threading module and releases the GIL | 2,026 |
| | A ROS package providing an image processing pipeline | 811 |
| | Generates ASCII art from images using deep learning-based convolutional neural networks | 1,524 |
| | An OCR engine with modular design and a command-line interface, providing pre-trained models and a Python API for customization | 1,056 |
| | A tool for creating flexible and self-documenting data science pipelines | 56 |