pdfplumber
PDF parser
A tool for extracting detailed information from PDFs
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
7k stars
93 watching
687 forks
Language: Python
Last commit: 2 days ago
Tags: pdf, pdf-parsing, table-extraction
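The description above mentions per-object detail (chars, rectangles, lines) alongside text and table extraction. A minimal sketch of what that could look like in practice follows. It assumes the `pdfplumber` package is installed and relies on its `open()` function, the `chars`, `rects`, and `lines` page properties, and the `extract_text()` and `extract_tables()` page methods. The helper names `summarize_page` and `summarize_pdf` are hypothetical and exist only for this illustration.

```python
from typing import Any, Dict, List


def summarize_page(page: Any) -> Dict[str, Any]:
    """Collect object counts, text, and tables from one pdfplumber page.

    `page` is expected to behave like a pdfplumber Page: it exposes
    `chars`, `rects`, and `lines` as lists of dicts, plus the
    `extract_text()` and `extract_tables()` methods.
    """
    return {
        "chars": len(page.chars),          # one dict per character
        "rects": len(page.rects),          # one dict per rectangle
        "lines": len(page.lines),          # one dict per line segment
        "text": page.extract_text() or "",  # fall back to "" if no text is returned
        "tables": page.extract_tables(),   # list of tables, each a list of rows
    }


def summarize_pdf(path: str) -> List[Dict[str, Any]]:
    """Open a PDF and summarize every page (hypothetical helper)."""
    # Imported lazily so summarize_page can be used without pdfplumber installed.
    import pdfplumber

    with pdfplumber.open(path) as pdf:
        return [summarize_page(page) for page in pdf.pages]
```

Calling `summarize_pdf("example.pdf")`, where `example.pdf` is a placeholder path, would return one summary dict per page. Keeping the per-page logic in its own function makes it easy to apply to a single page of interest instead of the whole document.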
Related projects:
Repository | Description | Stars |
---|---|---|
pdfminer/pdfminer.six | A Python-based tool for extracting information from PDF documents. | 6,046 |
ocrmypdf/ocrmypdf | A tool that adds OCR text to scanned PDF files, allowing them to be searchable and copy-pasted. | 14,363 |
py-pdf/pypdf | A Python library for manipulating and extracting data from PDF files. | 8,524 |
jorisschellekens/borb | A Python library for creating and manipulating PDF documents in a JSON-like data structure. | 3,413 |
pdfarranger/pdfarranger | An application that allows users to manipulate PDF documents by merging/splitting and rearranging pages. | 3,653 |
jvns/pandas-cookbook | A comprehensive guide to getting started with Python's pandas library using real-world data examples | 6,697 |
hopding/pdf-lib | A JavaScript library for creating and modifying PDF documents in any environment. | 7,089 |
jakevdp/pythondatasciencehandbook | An online guide and set of executable Jupyter notebooks providing an introduction to core libraries for data science in Python. | 43,422 |
mozilla/pdf.js | A general-purpose PDF viewer built with HTML5, allowing parsing and rendering of Portable Document Format files. | 49,009 |
uglytoad/pdfpig | A C# library for extracting and analyzing text from PDF files. | 1,771 |
vikparuchuri/marker | Converts PDF documents to text formats with high accuracy and support for various document types. | 18,452 |
deepdoctection/deepdoctection | An integrated framework for document AI tasks using deep learning models. | 2,628 |
spotify/chartify | A Python library for creating charts with a consistent input data format and intuitive API. | 3,546 |
foliojs/pdfkit | A JavaScript library for generating PDF documents with various features and functionalities. | 9,970 |
leofcardoso/pdf2pdfocr | A tool to extract text from PDFs and add a searchable layer to them. | 279 |