MathOCR

Document analyzer

A software project for recognizing and analyzing printed scientific documents, with a particular focus on mathematical expressions.

A scientific document recognition system

GitHub

167 stars
11 watching
41 forks
Language: Java
Last commit: about 2 years ago
Linked from 1 awesome list

latex, optical-character-recognition, scientific-document-recognition


Related projects:

Repository | Description | Stars
jsv4/opencontracts | A document analytics platform providing features for managing documents, extracting layout information and vector embeddings, annotating documents, and querying them using LlamaIndex. | 717
bobld/documentlayoutanalysis | Develops tools and algorithms for analyzing layout and structure of documents in PDF format | 583
icij/datashare | An application that helps investigative journalists analyze and search documents, using natural language processing and entity recognition techniques. | 597
mingyuan-xia/patdroid | An Android-specific toolkit for analyzing and understanding APK files | 118
open-korean-text/elasticsearch-analysis-openkoreantext | An Elasticsearch analyzer plugin for analyzing Korean text using the Open-Korean Text module. | 127
tingxueronghua/chartllama-code | A multimodal LLM for understanding and generating charts in various formats. | 196
wangqianwen0418/discrilens | A tool for analyzing and visualizing discrimination in machine learning models | 7
ddmcdonald/sparser | A model-driven language text analysis system with a rule-based approach to extract information from large volumes of text | 57
runem/web-component-analyzer | Analyzes web components and emits documentation in various formats | 506
uglytoad/pdfpig | A C# library for extracting and analyzing text from PDF files | 1,733
ohjeongwook/darungrim | Analyzes software patches to identify vulnerabilities and weaknesses | 359
johannesbuchner/languagecheck | A tool to analyze and improve the language of scientific papers before submission. | 97
tylabs/qs_old | A tool to analyze and extract malicious content from office documents and executables | 126
dlang-community/d-scanner | Analyzes D source code for syntax, style, and security issues | 242
ckorzen/pdf-text-extraction-benchmark | Evaluates PDF extraction tools' ability to extract meaningful text from scientific articles | 65