DOGA_scraper

Document scraper

A tool that extracts Galician Official Journal (DOGA) documents for a given input year and converts them to different formats.

Galician Official Journal scraper

0 stars

3 watching

0 forks

Language: Ruby

Last commit: almost 12 years ago

Linked from 1 awesome list

Backlinks from these awesome lists:

richardlitt/low-resource-languages

Related projects:

Repository	Description	Stars
jjelosua/parlamentogalicia	Extracts text from Galician Parliament session transcripts and conversations	0
jaimeiniesta/metainspector	A Ruby gem for web scraping and extracting metadata from web pages.	1,038
jakopako/goskyr	A tool to simplify web scraping of list-like structured data from web pages	36
miyagawa/web-scraper	A Perl toolkit for extracting structured data from HTML documents using a DSL-like interface.	104
meilisearch/docs-scraper	Automates scraping and indexing of documentation content into a search engine	297
fcannizzaro/jsoup-annotations	A Java library that provides annotations to simplify HTML scraping and processing with Jsoup	239
tjatse/node-readability	Automates web page scraping and text extraction to make any webpage readable	343
felipecsl/wombat	A Ruby-based web crawler and data extraction tool with an elegant DSL.	1,315
davemolk/gogetjs	Tools for extracting and analyzing JavaScript files from web pages	41
malfrats/xeuledoc	A tool to fetch information about public Google documents from various services	856
benibela/xidel	A tool to extract data from web pages using various query languages and selectors.	690
gushonorato/mechanize	A web scraping and automation tool for Elixir.	30
eureka101v/weibospidergo	A tool for extracting data from the Weibo social media platform using the Go programming language and the Colly library	66
oscarotero/embed	A PHP library to retrieve metadata and embed code from any web page	2,100
yhat/scrape	A collection of utility functions and tools to simplify web scraping in Go.	1,513