DOGA_scraper

Document scraper

A tool that extracts Galician Official Journal (DOGA, Diario Oficial de Galicia) documents for a given input year and converts them to different formats.

Galician Official Journal scraper

GitHub

0 stars
3 watching
0 forks
Language: Ruby
Last commit: over 10 years ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
|---|---|---|
| jjelosua/parlamentogalicia | Extracts text from Galician Parliament session transcripts and conversations | 0 |
| jaimeiniesta/metainspector | A Ruby gem for web scraping and extracting metadata from web pages | 1,038 |
| jakopako/goskyr | A tool to simplify web scraping of list-like structured data from web pages | 36 |
| miyagawa/web-scraper | A Perl toolkit for extracting structured data from HTML documents using a DSL-like interface | 104 |
| meilisearch/docs-scraper | Automates scraping and indexing of documentation content into a search engine | 297 |
| fcannizzaro/jsoup-annotations | A Java library that provides annotations to simplify HTML scraping and processing with Jsoup | 239 |
| tjatse/node-readability | Automates web page scraping and text extraction to make any webpage readable | 343 |
| felipecsl/wombat | A Ruby-based web crawler and data extraction tool with an elegant DSL | 1,315 |
| davemolk/gogetjs | Tools for extracting and analyzing JavaScript files from web pages | 41 |
| malfrats/xeuledoc | A tool to fetch information about public Google documents from various services | 856 |
| benibela/xidel | A tool to extract data from web pages using various query languages and selectors | 690 |
| gushonorato/mechanize | A web scraping and automation tool for Elixir | 30 |
| eureka101v/weibospidergo | A tool for extracting data from the Weibo social media platform using the Go programming language and the Colly library | 66 |
| oscarotero/embed | A PHP library to retrieve metadata and embed code from any web page | 2,100 |
| yhat/scrape | A collection of utility functions and tools to simplify web scraping in Go | 1,513 |