dom-crawler
DOM parser
A PHP component for navigating and manipulating HTML and XML documents programmatically.
Eases DOM navigation for HTML and XML documents
4k stars
27 watching
123 forks
Language: PHP
last commit: 8 days ago
Linked from 1 awesome list
Tags: component, php, symfony, symfony-component
Related projects:
Repository | Description | Stars |
---|---|---|
turnersoftware/infinitycrawler | A web crawling library for .NET that allows customizable crawling and throttling of websites. | 248 |
meilisearch/docs-scraper | Automates scraping and indexing of documentation content into a search engine | 290 |
brendonboshell/supercrawler | A web crawler designed to crawl websites while obeying robots.txt rules, rate limits and concurrency limits, with customizable content handlers for parsing and processing crawled pages. | 378 |
yujiosaka/headless-chrome-crawler | A distributed crawling framework that leverages Headless Chrome to scrape dynamic websites | 5,527 |
dyweb/scrala | A web crawling framework written in Scala that lets users define a start URL and parse the response from it | 113 |
hominee/dyer | A fast and flexible web crawling tool with features like asynchronous I/O and event-driven design. | 133 |
amoilanen/js-crawler | A Node.js module for crawling web sites and scraping their content | 253 |
naufalardhani/domhttpx | A tool to discover and extract information from web pages using HTTP requests and Google search queries. | 68 |
feng19/spider_man | A high-level web crawling and scraping framework for Elixir. | 23 |
symfony/html-sanitizer | Provides an object-oriented API to sanitize untrusted HTML input | 238 |
iamstoxe/urlgrab | A tool to crawl websites by exploring links recursively with support for JavaScript rendering. | 330 |
symfony/finder | A PHP library that provides an intuitive interface to find files and directories in a file system. | 8,404 |
webrecorder/browsertrix-crawler | A containerized browser-based crawler system for capturing web content in a high-fidelity and customizable manner. | 652 |
symfony/process | Executes commands in sub-processes | 7,431 |
cocrawler/cocrawler | A versatile web crawler built with modern tools and concurrency to handle various crawl tasks | 187 |