pdfplumber

PDF parser

A tool for extracting detailed information from PDFs

Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
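
As a rough illustration of the text and table extraction described above, here is a minimal sketch. It assumes pdfplumber is installed and uses its `pdfplumber.open`, `page.extract_text()` and `page.extract_table()` calls. The helper names `table_to_records` and `extract_first_page` are hypothetical, chosen for this example only, and treating the first table row as a header is an assumption that will not hold for every PDF.

```python
def table_to_records(table):
    """Convert a list-of-rows table into a list of dicts.

    Assumes the first row is a header row (an assumption of this sketch).
    Empty cells, which may arrive as None, become empty strings.
    """
    if not table:
        return []
    header = [(cell or "").strip() for cell in table[0]]
    records = []
    for row in table[1:]:
        cells = [(cell or "").strip() for cell in row]
        # zip stops at the shorter sequence, so ragged rows are truncated
        records.append(dict(zip(header, cells)))
    return records


def extract_first_page(path):
    """Return (text, records) for the first page of the PDF at `path`."""
    import pdfplumber  # third-party; imported lazily so the helper above works without it

    with pdfplumber.open(path) as pdf:
        page = pdf.pages[0]
        text = page.extract_text()
        table = page.extract_table()  # None when no table is detected
    return text, table_to_records(table)
```

Calling `extract_first_page("report.pdf")` (a placeholder path) would return the page text together with the first detected table as a list of dicts, or an empty list when no table is found.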

GitHub

7k stars
93 watching
687 forks
Language: Python
Last commit: 2 days ago
Topics: pdf, pdf-parsing, table-extraction

Related projects:

Repository | Description | Stars
pdfminer/pdfminer.six | A Python-based tool for extracting information from PDF documents | 6,046
ocrmypdf/ocrmypdf | A tool that adds an OCR text layer to scanned PDF files, making them searchable and their text copyable | 14,363
py-pdf/pypdf | A Python library for manipulating and extracting data from PDF files | 8,524
jorisschellekens/borb | A Python library for creating and manipulating PDF documents in a JSON-like data structure | 3,413
pdfarranger/pdfarranger | An application for merging, splitting, and rearranging the pages of PDF documents | 3,653
jvns/pandas-cookbook | A guide to getting started with Python's pandas library using real-world data examples | 6,697
hopding/pdf-lib | A JavaScript library for creating and modifying PDF documents in any environment | 7,089
jakevdp/pythondatasciencehandbook | An online guide and set of executable Jupyter notebooks introducing the core libraries for data science in Python | 43,422
mozilla/pdf.js | A general-purpose PDF viewer built with HTML5 that parses and renders Portable Document Format files | 49,009
uglytoad/pdfpig | A C# library for extracting and analyzing text from PDF files | 1,771
vikparuchuri/marker | Converts PDF documents to text formats with high accuracy and support for various document types | 18,452
deepdoctection/deepdoctection | An integrated framework for document AI tasks using deep learning models | 2,628
spotify/chartify | A Python library for creating charts with a consistent input data format and intuitive API | 3,546
foliojs/pdfkit | A JavaScript library for generating PDF documents | 9,970
leofcardoso/pdf2pdfocr | A tool to extract text from PDFs and add a searchable layer to them | 279