MathOCR
Document analyzer
A software project for recognizing and analyzing printed scientific documents, with a particular focus on mathematical expressions.
A scientific document recognition system
168 stars
11 watching
41 forks
Language: Java
Last commit: about 2 years ago
Linked from 1 awesome list
latex, optical-character-recognition, scientific-document-recognition
Related projects:
Repository | Description | Stars |
---|---|---|
jsv4/opencontracts | A document analytics platform providing features for managing documents, extracting layout information and vector embeddings, annotating documents, and querying them using LlamaIndex. | 728 |
bobld/documentlayoutanalysis | Develops tools and algorithms for analyzing the layout and structure of PDF documents | 591 |
icij/datashare | An application that helps investigative journalists analyze and search documents, using natural language processing and entity recognition techniques. | 601 |
mingyuan-xia/patdroid | An Android-specific toolkit for analyzing and understanding APK files | 118 |
open-korean-text/elasticsearch-analysis-openkoreantext | An Elasticsearch analyzer plugin for analyzing Korean text using the Open-Korean Text module. | 127 |
tingxueronghua/chartllama-code | A multimodal LLM for understanding and generating charts in various formats. | 202 |
wangqianwen0418/discrilens | A tool for analyzing and visualizing discrimination in machine learning models | 6 |
ddmcdonald/sparser | A model-driven, rule-based text analysis system that extracts information from large volumes of text | 57 |
runem/web-component-analyzer | Analyzes web components and emits documentation in various formats | 509 |
uglytoad/pdfpig | A C# library for extracting and analyzing text from PDF files | 1,794 |
ohjeongwook/darungrim | Analyzes software patches to identify vulnerabilities and weaknesses | 359 |
johannesbuchner/languagecheck | A tool to analyze and improve the language of scientific papers before submission. | 98 |
tylabs/qs_old | A tool to analyze and extract malicious content from office documents and executables | 126 |
dlang-community/d-scanner | Analyzes D source code for syntax, style, and security issues | 242 |
ckorzen/pdf-text-extraction-benchmark | Evaluates PDF extraction tools' ability to extract meaningful text from scientific articles | 65 |