historical-texts
Historical texts
A collection of English historical texts digitized and corrected by OCR
Collections of English historical texts and related data
18 stars
6 watching
6 forks
Language: Shell
last commit: over 3 years ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
chreul/ocr_testdata_earlyprintedbooks | Provides test data and models for training Optical Character Recognition (OCR) systems on historical printed books. | 10 |
tberg12/ocular | An OCR system designed to transcribe historical documents with high accuracy, handling various challenges such as font variation and code-switching. | 255 |
jbaiter/archiscribe-corpus | A repository of transcribed 19th century German texts from various sources. | 8 |
adzap/timeliness | A date/time parsing library with extensibility and control features. | 224 |
1history/1history | A command line tool to backup and visualize browser histories into a single file. | 456 |
mittagessen/kraken | An OCR system optimized for historical and non-Latin scripts. | 748 |
elte-dh/drama-corpus | A comprehensive annotated corpus of Hungarian drama texts, including structural annotations and grammatical features. | 1 |
pld-linux/apertium-dict-es-gl | A Spanish-Galician dictionary for the Apertium rule-based machine translation platform. | 1 |
apertium/apertium-en-gl | A machine translation system for translating English to Galician. | 0 |
agoldst/dfr-browser | A tool for browsing and visualizing topic models of text corpora. | 99 |
pld-linux/apertium-dict-en-gl | An English-Galician language translation dictionary for the Apertium platform. | 1 |
gaudard/scripts | A collection of scripts and tools for administrative tasks, penetration testing, and incident response. | 18 |
esamattis/jslibs | A curated collection of useful JavaScript libraries for building web applications. | 59 |
alvations/seedling | A corpus and API for human language data. | 11 |
cistern/catena | A storage engine for time series data. | 391 |