xeuledoc

Document scraper

A tool that fetches information about public Google documents across various Google services.

GitHub

Stars: 846
Watching: 38
Forks: 90
Language: Python
Last commit: about 1 year ago
malfrats, osint

Related projects:

Repository | Description | Stars
jjelosua/doga_scraper | A tool that extracts Galician Official Journal documents for a given input year and converts them to different formats. | 0
felipecsl/wombat | A Ruby-based web crawler and data extraction tool with an elegant DSL. | 1,315
meilisearch/docs-scraper | Automates scraping and indexing of documentation content into a search engine. | 290
needmorecowbell/giggity | A tool to scrape and store hierarchical data about GitHub organizations, users, or repositories. | 126
9b/malpdfobj | Generates a JSON object representing the structure of a malicious PDF file. | 52
medialab/minet | A command-line tool and Python library for extracting data from various web sources. | 286
oscarotero/embed | A PHP library to extract metadata and embeddable code from any web page using various protocols and scraping techniques. | 2,091
benibela/xidel | A tool to extract data from web pages using various query languages and selectors. | 686
gushonorato/mechanize | A web scraping and automation tool for Elixir. | 30
michaelhelmick/lassie | A library for retrieving basic content from websites. | 613
aantron/lambdasoup | A functional HTML scraping and manipulation library in OCaml. | 384
propublica/upton | A web scraping framework that handles repetitive tasks and provides options for efficient data retrieval. | 1,613
itteco/iframely | A service that extracts metadata and embeds from web pages. | 1,528
bendeaton/abaqus-documentation-scraper | Extracts keywords and parameters from Abaqus documentation for a syntax highlighting plugin. | 3
nerolation/ethereum-datafarm | A tool to harvest event data from Ethereum contracts without requiring an archive or node. | 63