x-ray

Web scraper

A flexible web scraping framework for extracting data from websites with customizable selectors and pagination support.

The next web scraper. See through the <html> noise.

GitHub

6k stars
110 watching
348 forks
Language: JavaScript
Last commit: 23 days ago
Linked from 3 awesome lists


Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| ruipgil/scraperjs | A versatile web scraping module with two scrapers for static and dynamic content extraction. | 3,710 |
| apify/crawlee | A tool for building reliable web scraping and browser automation pipelines in Node.js. | 15,604 |
| ionicabizau/scrape-it | A Node.js library and CLI tool for automating web page scraping and parsing. | 4,012 |
| spatie/crawler | A powerful web crawler written in PHP that can execute JavaScript and crawl multiple URLs concurrently. | 2,537 |
| yujiosaka/headless-chrome-crawler | A distributed crawling framework that leverages Headless Chrome to scrape dynamic websites. | 5,527 |
| s0md3v/photon | A fast and flexible web crawler designed to gather information from the internet. | 11,067 |
| unclecode/crawl4ai | A tool for web crawling and data extraction, designed to work with large language models. | 16,180 |
| benibela/xidel | A tool to extract data from web pages using various query languages and selectors. | 681 |
| rchipka/node-osmosis | A fast and flexible web scraping library using native libxml C bindings. | 4,116 |
| bda-research/node-crawler | A Node.js-based web crawler and spider that extracts data from websites. | 6,704 |
| gocolly/colly | A framework for extracting structured data from websites in a fast and elegant way. | 23,317 |
| justanotherarchivist/snscrape | A Python-based social media scraper that extracts data from various platforms. | 4,490 |
| feng19/spider_man | A high-level web crawling and scraping framework for Elixir. | 23 |
| philipjkim/goreadability | Extracts readable content from web pages using Open Graph and traditional readability rules. | 69 |
| samuelclay/newsblur | A personal news reader application utilizing multiple technologies to fetch, parse, and store news articles. | 6,907 |