nidaba

OCR pipeline

Automates an OCR pipeline for digitizing text and converting raw images into citable texts.

An expandable and scalable OCR pipeline

86 stars

9 watching

12 forks

Language: Python

Last commit: over 8 years ago

Linked from 1 awesome list

Backlinks from these awesome lists:

kba/awesome-ocr

Related projects:

Repository	Description	Stars
openphilology/tei-ocr	Customizes TEI XML for metadata from OCR processes to capture detailed layout and content information	1
openseg-group/openseg.pytorch	Provides a PyTorch implementation of several computer vision tasks including object detection, segmentation and parsing.	1,191
bandrel/ocyara	Performs OCR on images and scans them for matches to YARA rules	40
allenai/scispacy	Custom spaCy models and pipelines for scientific documents	1,724
seven45/pdm-ci	Provides a base image for creating Python CI pipelines with package manager support	11
hhio618/golem-ci	A decentralized task pipeline on Golem.network using Python.	5
hamdikahloun/windows_ocr	An OCR library allowing developers to embed high-quality character recognition functionality in their products.	18
bjpop/rubra	A bioinformatics pipeline system that supports running workflow stages on a distributed compute cluster.	38
openiti/ocr_gs_data	Provides gold standard data for training and testing optical character recognition (OCR) engines.	15
sirfz/tesserocr	An OCR API wrapper that releases the GIL, enabling concurrent execution with Python's threading module.	2,026
ros-perception/image_pipeline	A ROS package providing an image processing pipeline	811
osciiart/deepaa	Generates ASCII art from images using deep learning-based convolutional neural networks	1,524
calamari-ocr/calamari	An OCR engine with modular design and a command-line interface, providing pre-trained models and a Python API for customization.	1,056
druths/xp	A tool for creating flexible and self-documenting data science pipelines	56