Photon
Crawler
A fast and flexible web crawler designed to gather information from the internet
Incredibly fast crawler designed for OSINT.
11k stars
325 watching
2k forks
Language: Python
Last commit: 3 months ago
Linked from 2 awesome lists
Tags: crawler, information-gathering, osint, python, spider
Related projects:
Repository | Description | Stars |
---|---|---|
unclecode/crawl4ai | A tool for web crawling and data extraction, designed to work with large language models. | 16,180 |
gocolly/colly | A framework for extracting structured data from websites in a fast and elegant way. | 23,351 |
hakluke/hakrawler | A tool for automatically discovering and crawling web application endpoints and assets. | 4,502 |
apify/crawlee | A tool for building reliable web scraping and browser automation pipelines in Node.js. | 15,740 |
yujiosaka/headless-chrome-crawler | A distributed crawling framework that leverages Headless Chrome to scrape dynamic websites. | 5,527 |
internetarchive/heritrix3 | A web crawler designed to collect and preserve digital artifacts while respecting site policies and load constraints. | 2,833 |
dedsecinside/torbot | An OSINT tool for exploring and analyzing dark web sites using the Tor network. | 2,969 |
matthewmueller/x-ray | A flexible web scraping framework for extracting data from websites with customizable selectors and pagination support. | 5,878 |
jaeles-project/gospider | A tool for web crawling and exploitation written in Go. | 2,578 |
cobrateam/splinter | A Python test framework for automating web applications using Selenium and other tools. | 2,722 |
spatie/crawler | A powerful web crawler written in PHP that can execute JavaScript and crawl multiple URLs concurrently. | 2,537 |
finic-ai/finic | Provides cloud-hosted browsers for automation and scraping tasks to avoid detection by websites. | 2,302 |
nabla-c0d3/sslyze | An SSL/TLS scanning tool and Python library for assessing server security configurations. | 3,267 |
smicallef/spiderfoot | Automates information gathering and analysis from various data sources to support threat intelligence and cybersecurity efforts. | 13,156 |
geziyor/geziyor | A fast and flexible web crawling and scraping framework for extracting structured data from websites. | 2,629 |