pdfplumber

PDF parser

A tool for extracting detailed information from PDFs

Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
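As a rough illustration of that description, here is a minimal sketch of reading per-page objects, text, and tables with pdfplumber. The helper names `summarize_page` and `summarize_pdf` are hypothetical and are not part of the library. The sketch assumes pdfplumber is installed and that each page exposes `chars`, `rects`, `lines`, `extract_text()`, and `extract_tables()`.

```python
from typing import Any, Dict, List


def summarize_page(page: Any) -> Dict[str, Any]:
    """Collect object counts, text, and tables from one pdfplumber page.

    `page` is assumed to be a pdfplumber page; this helper is hypothetical.
    """
    return {
        "chars": len(page.chars),        # one dict per character
        "rects": len(page.rects),        # one dict per rectangle
        "lines": len(page.lines),        # one dict per line segment
        "text": page.extract_text() or "",   # guard against an empty page
        "tables": page.extract_tables(),     # list of tables, each a list of rows
    }


def summarize_pdf(path: str) -> List[Dict[str, Any]]:
    """Open a PDF and summarize every page (hypothetical wrapper)."""
    # Imported here so summarize_page can be used without pdfplumber installed.
    import pdfplumber

    with pdfplumber.open(path) as pdf:
        return [summarize_page(page) for page in pdf.pages]
```

Calling `summarize_pdf("example.pdf")` (the file name is a placeholder) would return one dictionary per page.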

GitHub

7k stars

93 watching

687 forks

Language: Python

last commit: over 1 year ago

Tags: pdf, pdf-parsing, table-extraction

Related projects:

Repository	Description	Stars
pdfminer/pdfminer.six	A Python-based tool for extracting information from PDF documents.	6,046
ocrmypdf/ocrmypdf	A tool that adds OCR text to scanned PDF files, allowing them to be searchable and copy-pasted.	14,363
py-pdf/pypdf	A Python library for manipulating and extracting data from PDF files.	8,524
jorisschellekens/borb	A Python library for creating and manipulating PDF documents in a JSON-like data structure.	3,413
pdfarranger/pdfarranger	An application that allows users to manipulate PDF documents by merging/splitting and rearranging pages.	3,653
jvns/pandas-cookbook	A comprehensive guide to getting started with Python's pandas library using real-world data examples.	6,697
hopding/pdf-lib	A JavaScript library for creating and modifying PDF documents in any environment.	7,089
jakevdp/pythondatasciencehandbook	An online guide and set of executable Jupyter notebooks providing an introduction to core libraries for data science in Python.	43,422
mozilla/pdf.js	A general-purpose PDF viewer built with HTML5, allowing parsing and rendering of Portable Document Format files.	49,009
uglytoad/pdfpig	A C# library for extracting and analyzing text from PDF files.	1,794
vikparuchuri/marker	Converts PDF documents to text formats with high accuracy and support for various document types.	18,618
deepdoctection/deepdoctection	An integrated framework for document AI tasks using deep learning models.	2,628
spotify/chartify	A Python library for creating charts with a consistent input data format and intuitive API.	3,546
foliojs/pdfkit	A JavaScript library for generating PDF documents with various features and functionalities.	9,970
leofcardoso/pdf2pdfocr	A tool to extract text from PDFs and add a searchable layer to them.	279