metainspector
Web scraper
A Ruby gem for web scraping and extracting metadata from web pages.
Given a URL, it scrapes the page and returns its title, meta description, meta keywords, links, images, and more.
1k stars
26 watching
165 forks
Language: Ruby
Last commit: 5 months ago
Linked from 2 awesome lists
Related projects:
Repository | Description | Stars |
---|---|---|
felipecsl/wombat | A Ruby-based web crawler and data extraction tool with an elegant DSL. | 1,315 |
medialab/minet | A command line tool and Python library for extracting data from various web sources. | 286 |
joenorton/rubyretriever | A Ruby-based tool for web crawling and data extraction, aiming to be a replacement for paid software in the SEO space. | 143 |
jjelosua/doga_scraper | A tool that extracts Galician official journal documents for a given year and converts them to different formats. | 0 |
railsmachine/nagiosharder | A Ruby API for querying and managing Nagios installations. | 115 |
benibela/xidel | A tool to extract data from web pages using various query languages and selectors. | 681 |
laramies/metagoofil | Extracts metadata from public documents available on websites. | 1,028 |
thibaudgg/video_info | A Ruby gem that retrieves metadata from various video sharing platforms. | 429 |
tidyverse/rvest | A package for extracting data from web pages using HTML parsing and CSS/XPath selectors. | 1,492 |
gushonorato/mechanize | A web scraping and automation tool for Elixir. | 30 |
spider-rs/spider | A web crawler and scraper written in Rust, designed for flexible, configurable extraction of data from the web. | 1,140 |
miyagawa/web-scraper | A Perl toolkit for extracting structured data from HTML documents using a DSL-like interface. | 104 |
postmodern/spidr | A Ruby web crawling library that provides flexible and customizable methods to crawl websites. | 806 |
oscarotero/embed | A PHP library to extract metadata and embeddable code from any web page using various protocols and scraping techniques. | 2,091 |
slotix/dataflowkit | A framework for extracting structured data from web pages using CSS selectors. | 662 |