nidaba
OCR pipeline
Automates an OCR pipeline for digitizing text and converting raw images into citable texts.
An expandable and scalable OCR pipeline
86 stars
9 watching
12 forks
Language: Python
Last commit: about 7 years ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
openphilology/tei-ocr | Customizes TEI XML for metadata from OCR processes to capture detailed layout and content information | 1 |
openseg-group/openseg.pytorch | Provides a PyTorch implementation of several computer vision tasks including object detection, segmentation and parsing. | 1,190 |
bandrel/ocyara | Performs OCR on images and scans them for matches to Yara rules | 40 |
allenai/scispacy | A collection of custom spaCy pipelines and models for analyzing scientific documents. | 1,709 |
seven45/pdm-ci | Provides a base image for creating Python CI pipelines with package manager support | 11 |
hhio618/golem-ci | A decentralized task pipeline on Golem.network using Python. | 5 |
hamdikahloun/windows_ocr | An OCR library allowing developers to embed high-quality character recognition functionality in their products. | 18 |
bjpop/rubra | A bioinformatics pipeline system that supports running workflow stages on a distributed compute cluster. | 38 |
openiti/ocr_gs_data | Provides gold standard data for training and testing optical character recognition (OCR) engines. | 15 |
sirfz/tesserocr | A Python wrapper for the Tesseract OCR API that releases the GIL during processing, enabling concurrent execution with Python's threading module. | 2,016 |
ros-perception/image_pipeline | A ROS package providing an image processing pipeline | 800 |
osciiart/deepaa | Generates ASCII art from images using deep learning-based convolutional neural networks | 1,522 |
calamari-ocr/calamari | An OCR engine with modular design and a command-line interface, providing pre-trained models and a Python API for customization. | 1,049 |
druths/xp | A tool for creating flexible and self-documenting data science pipelines | 56 |