spider_man

Crawler library

A high-level web crawling and scraping framework for Elixir.

SpiderMan, a fast, high-level web crawling and scraping framework for Elixir, based on Broadway.

GitHub

23 stars
4 watching
4 forks
Language: Elixir
Last commit: 9 months ago
Linked from 1 awesome list

crawler, data-mining, elixir, erlang, framework, spider

Related projects:

Repository | Description | Stars
elixir-crawly/crawly | A framework for extracting structured data from websites. | 987
fredwu/crawler | A high-performance web crawling and scraping solution with customizable settings and worker pooling. | 945
hu17889/go_spider | A modular, concurrent web crawler framework written in Go. | 1,826
postmodern/spidr | A Ruby web crawling library that provides flexible and customizable methods to crawl websites. | 806
matteoredaelli/ebot | An Erlang-based web crawler designed to be scalable and highly configurable. | 330
chenjiandongx/github-spider | A Python-based web crawler for scraping GitHub user and repository data. | 264
elliotgao2/gain | A Python web crawling framework utilizing asyncio and aiohttp for efficient data extraction from websites. | 2,035
spider-rs/spider | A web crawler and scraper built on top of Rust, designed to extract data from the web in a flexible and configurable manner. | 1,140
howie6879/ruia | An async web scraping micro-framework built with asyncio and aiohttp to simplify URL crawling. | 1,752
antchfx/antch | A framework for building fast and efficient web crawlers and scrapers in Go. | 260
turnersoftware/infinitycrawler | A web crawling library for .NET that allows customizable crawling and throttling of websites. | 248
qinxuye/cola | A high-level framework for building distributed data extractors from web pages. | 1,500
dyweb/scrala | A web crawling framework written in Scala that allows users to define the start URL and parse the response from it. | 113
gushonorato/mechanize | A web scraping and automation tool for Elixir. | 30
xianhu/pspider | A Python web crawler framework with support for multi-threading and proxy usage. | 1,827