spider_man

Crawler library

A high-level web crawling and scraping framework for Elixir.

SpiderMan, a fast high-level web crawling & scraping framework for Elixir, based on Broadway.

GitHub

23 stars

4 watching

4 forks

Language: Elixir

last commit: over 2 years ago

Linked from 1 awesome list

crawler, data-mining, elixir, erlang, framework, spider

Backlinks from these awesome lists:

h4cc/awesome-elixir

Related projects:

Repository	Description	Stars
elixir-crawly/crawly	A framework for extracting structured data from websites	994
fredwu/crawler	A high-performance web crawling and scraping solution with customizable settings and worker pooling.	945
hu17889/go_spider	A modular, concurrent web crawler framework written in Go.	1,827
postmodern/spidr	A Ruby web crawling library that provides flexible and customizable methods to crawl websites	809
matteoredaelli/ebot	An Erlang-based web crawler designed to be scalable and highly configurable	330
chenjiandongx/github-spider	A Python-based web crawler for scraping GitHub user and repository data.	264
elliotgao2/gain	A Python web crawling framework utilizing asyncio and aiohttp for efficient data extraction from websites.	2,037
spider-rs/spider	A tool for web data extraction and processing using Rust	1,234
howie6879/ruia	An async web scraping micro-framework built with asyncio and aiohttp to simplify URL crawling	1,753
antchfx/antch	A framework for building fast and efficient web crawlers and scrapers in Go.	261
turnersoftware/infinitycrawler	A web crawling library for .NET that allows customizable crawling and throttling of websites.	248
qinxuye/cola	A high-level framework for building distributed data extractors from web pages	1,501
dyweb/scrala	A web crawling framework written in Scala that allows users to define the start URL and parse response from it	113
gushonorato/mechanize	A web scraping and automation tool for Elixir.	30
xianhu/pspider	A Python web crawler framework with support for multi-threading and proxy usage.	1,828