metainspector

Web scraper

A Ruby gem for web scraping and extracting metadata from web pages.

Given a URL, it scrapes the page and returns its title, meta description, meta keywords, links, images, and more.
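
To make the description above concrete, here is a minimal Ruby sketch of the kind of extraction it describes: title, meta description, meta keywords, links, and images. It uses only the standard library and a hypothetical `extract_metadata` helper built on simple regular expressions. It is not metainspector's API or implementation, and the sample HTML and URLs are made up for illustration.

```ruby
require "uri"

# Hypothetical helper, NOT part of the metainspector gem: pulls a few
# metadata fields out of an HTML string with simple regular expressions.
def extract_metadata(html, base_url)
  title = html[/<title[^>]*>(.*?)<\/title>/mi, 1]&.strip

  # Collect <meta name="..." content="..."> pairs.
  # Assumption: the name attribute comes directly before content.
  meta = {}
  html.scan(/<meta\s+name=["']([^"']+)["']\s+content=["']([^"']*)["']/i) do |name, content|
    meta[name.downcase] = content
  end

  # Resolve href/src values against the page URL so relative paths become absolute.
  links  = html.scan(/<a\s[^>]*href=["']([^"']+)["']/i).flatten.map { |h| URI.join(base_url, h).to_s }
  images = html.scan(/<img\s[^>]*src=["']([^"']+)["']/i).flatten.map { |s| URI.join(base_url, s).to_s }

  {
    title: title,
    description: meta["description"],
    keywords: meta["keywords"].to_s.split(",").map(&:strip),
    links: links,
    images: images
  }
end

# Made-up sample document standing in for a fetched page.
SAMPLE_HTML = <<~HTML
  <html>
    <head>
      <title>Example Page</title>
      <meta name="description" content="A sample page.">
      <meta name="keywords" content="ruby, scraping">
    </head>
    <body>
      <a href="/about">About</a>
      <a href="https://example.org/">Elsewhere</a>
      <img src="logo.png" alt="Logo">
    </body>
  </html>
HTML

page = extract_metadata(SAMPLE_HTML, "https://example.com/")
puts page[:title]
puts page[:description]
p page[:keywords]
p page[:links]
p page[:images]
```

A real scraper would fetch the page over HTTP and use a proper HTML parser. Regular expressions like these break on different attribute orders, unquoted attributes, and malformed markup, which is the kind of work a dedicated gem takes off your hands.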

GitHub

Stars: 1k
Watching: 26
Forks: 165
Language: Ruby
Last commit: 5 months ago
Linked from 2 awesome lists


Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| felipecsl/wombat | A Ruby-based web crawler and data extraction tool with an elegant DSL. | 1,315 |
| medialab/minet | A command line tool and Python library for extracting data from various web sources. | 286 |
| joenorton/rubyretriever | A Ruby-based tool for web crawling and data extraction, aiming to be a replacement for paid software in the SEO space. | 143 |
| jjelosua/doga_scraper | A tool that extracts and converts Galician Official journal documents to different formats based on input year. | 0 |
| railsmachine/nagiosharder | A Ruby API for querying and managing Nagios installations. | 115 |
| benibela/xidel | A tool to extract data from web pages using various query languages and selectors. | 681 |
| laramies/metagoofil | Extracts metadata from public documents available on websites. | 1,028 |
| thibaudgg/video_info | A Ruby gem that retrieves metadata from various video sharing platforms. | 429 |
| tidyverse/rvest | A package for extracting data from web pages using HTML parsing and CSS/XPath selectors. | 1,492 |
| gushonorato/mechanize | A web scraping and automation tool for Elixir. | 30 |
| spider-rs/spider | A web crawler and scraper built on top of Rust, designed to extract data from the web in a flexible and configurable manner. | 1,140 |
| miyagawa/web-scraper | A Perl toolkit for extracting structured data from HTML documents using a DSL-like interface. | 104 |
| postmodern/spidr | A Ruby web crawling library that provides flexible and customizable methods to crawl websites. | 806 |
| oscarotero/embed | A PHP library to extract metadata and embeddable code from any web page using various protocols and scraping techniques. | 2,091 |
| slotix/dataflowkit | A framework for extracting structured data from web pages using CSS selectors. | 662 |