mPLUG-DocOwl

Doc Analyst

A multimodal large language model for OCR-free document understanding, focused on analyzing document structure and content.

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

GitHub

2k stars
33 watching
115 forks
Language: Python
Last commit: 4 months ago
chart-understanding, document-understanding, mllm, multimodal, multimodal-large-language-models, table-understanding

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| x-plug/mplug-halowl | Evaluates and mitigates hallucinations in multimodal large language models | 82 |
| bobld/documentlayoutanalysis | Develops tools and algorithms for analyzing layout and structure of documents in PDF format | 591 |
| docopt/docopt.c | A C-code generator for parsing command-line arguments in the docopt language | 320 |
| jsv4/opencontracts | A document analytics platform providing features for managing documents, extracting layout information and vector embeddings, annotating documents, and querying them using LlamaIndex | 728 |
| applieddatasciencepartners/xgboostexplainer | Provides tools to understand and interpret the decisions made by XGBoost models in machine learning | 253 |
| fuxiaoliu/mmc | Develops a large-scale dataset and benchmark for training multimodal chart understanding models using large language models | 87 |
| uglytoad/pdfpig | A C# library for extracting and analyzing text from PDF files | 1,794 |
| docopt/docopt.net | A .NET implementation of docopt to automatically derive command-line argument parsing logic from help text | 356 |
| melisgl/mgl-pax | A documentation system and browser for generating interactive code documentation from embedded docstrings | 75 |
| 0xvavaldi/gramify | Analyzes text data to extract patterns of words or characters for password cracking and analysis purposes | 28 |
| amiremohamadi/duckx | A C++ library that allows reading and writing of Microsoft Office Word .docx files | 430 |
| datamllab/xdeep | Provides tools for interpreting deep neural networks | 42 |
| docopt/docopt.fs | A library that generates option parsers based on human-readable help messages | 34 |
| x-plug/cvalues | Evaluates and aligns the values of Chinese large language models with safety and responsibility standards | 481 |
| tingxueronghua/chartllama-code | A multimodal LLM for understanding and generating charts in various formats | 202 |