robobrowser

Web scraper

A Python library for interacting with web pages without the need for a standalone browser

GitHub

4k stars
111 watching
337 forks
Language: Python
Last commit: over 4 years ago
Linked from 2 awesome lists


Related projects:

Repository | Description | Stars
ruipgil/scraperjs | A versatile web scraping module with two scrapers for static and dynamic content extraction. | 3,714
bubuanabelas/checkwebpeer | A tool to scan WebRTC peers from torrent trackers. | 19
pjkelly/robocop | A middleware that adds a meta tag to HTTP responses to instruct search engines on how to crawl the content. | 3
robostack/jupyter-ros | Provides ROS support for Jupyter notebooks to enable robotics developers to create interactive and dynamic visualizations of robot behavior. | 592
jaeles-project/gospider | A tool for web crawling and exploitation written in Go. | 2,598
emersonelectricco/boomerang | A tool designed to safely capture off-network web resources for network defense and security analysis. | 38
anorov/cloudflare-scrape | A tool to bypass Cloudflare's anti-bot page and access protected websites. | 3,406
brendonboshell/supercrawler | A web crawler designed to crawl websites while obeying robots.txt rules, rate limits and concurrency limits, with customizable content handlers for parsing and processing crawled pages. | 380
finic-ai/finic | Provides cloud-hosted browsers for automation and scraping tasks to avoid detection by websites. | 2,311
servo/servo | A Rust-based web browser engine with parallel rendering capabilities. | 28,713
spider-rs/spider | A tool for web data extraction and processing using Rust. | 1,234
mbrubeck/robinson | An educational toy web rendering engine to learn basic implementation techniques. | 1,562
medialab/minet | A command line tool and Python library for extracting data from various web sources. | 293
cobrateam/splinter | A Python test framework for automating web applications using Selenium and other tools. | 2,726
spatie/crawler | A powerful web crawler written in PHP that can execute JavaScript and crawl multiple URLs concurrently. | 2,552