mPLUG-DocOwl

Doc Analyst

A large language model designed to understand documents without OCR, focusing on document structure and content analysis.

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

GitHub

2k stars
30 watching
101 forks
Language: Python
last commit: about 2 months ago
Topics: chart-understanding, document-understanding, mllm, multimodal, multimodal-large-language-models, table-understanding

Related projects:

Repository | Description | Stars
x-plug/mplug-halowl | Evaluates and mitigates hallucinations in multimodal large language models | 79
bobld/documentlayoutanalysis | Develops tools and algorithms for analyzing layout and structure of documents in PDF format | 583
docopt/docopt.c | A C-code generator for parsing command-line arguments in the docopt language | 320
jsv4/opencontracts | A document analytics platform providing features for managing documents, extracting layout information and vector embeddings, annotating documents, and querying them using LlamaIndex. | 717
applieddatasciencepartners/xgboostexplainer | Provides tools to understand and interpret the decisions made by XGBoost models in machine learning | 252
fuxiaoliu/mmc | Develops a large-scale dataset and benchmark for training multimodal chart understanding models using large language models. | 84
uglytoad/pdfpig | A C# library for extracting and analyzing text from PDF files | 1,733
docopt/docopt.net | Automatically derives parsing logic from command-line help text in a .NET implementation | 355
melisgl/mgl-pax | A documentation system and browser for generating interactive code documentation from embedded docstrings. | 75
0xvavaldi/gramify | Analyzes text data to extract patterns of words or characters for password cracking and analysis purposes. | 28
amiremohamadi/duckx | A C++ library that allows reading and writing of Microsoft Office Word .docx files | 422
datamllab/xdeep | Provides tools for interpreting deep neural networks | 42
docopt/docopt.fs | A library that generates option parsers based on human-readable help messages. | 34
x-plug/cvalues | Evaluates and aligns the values of Chinese large language models with safety and responsibility standards | 477
tingxueronghua/chartllama-code | A multimodal LLM for understanding and generating charts in various formats. | 196