 nidaba
 OCR pipeline
 Automates an OCR pipeline for digitizing text and converting raw images into citable texts.
An expandable and scalable OCR pipeline
86 stars
 9 watching
 12 forks
 
Language: Python 
last commit: almost 8 years ago 
Linked from 1 awesome list
 Related projects:
| Repository | Description | Stars | 
|---|---|---|
|  | Customizes TEI XML for metadata from OCR processes to capture detailed layout and content information | 1 | 
|  | Provides a PyTorch implementation of several computer vision tasks including object detection, segmentation and parsing. | 1,191 | 
|  | Performs OCR on images and scans them for matches to Yara rules | 40 | 
|  | Custom spaCy models and pipelines for scientific documents | 1,724 | 
|  | Provides a base image for creating Python CI pipelines with package manager support | 11 | 
|  | A decentralized task pipeline on Golem.network using Python. | 5 | 
|  | An OCR library allowing developers to embed high-quality character recognition functionality in their products. | 18 | 
|  | A bioinformatics pipeline system that supports running workflow stages on a distributed compute cluster. | 38 | 
|  | Provides gold standard data for training and testing optical character recognition (OCR) engines. | 15 | 
|  | An OCR API wrapper that enables concurrent execution using Python's threading module and releases the GIL. | 2,026 | 
|  | A ROS package providing an image processing pipeline | 811 | 
|  | Generates ASCII art from images using deep learning-based convolutional neural networks | 1,524 | 
|  | An OCR engine with modular design and a command-line interface, providing pre-trained models and a Python API for customization. | 1,056 | 
|  | A tool for creating flexible and self-documenting data science pipelines | 56 |