WebCollector

Web crawler library

A Java-based framework for building web crawlers with automatic URL detection and content extraction capabilities.

WebCollector is an open-source web crawler framework written in Java. It provides simple interfaces for crawling the Web, and you can set up a multi-threaded web crawler in less than 5 minutes.
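To give a rough idea of what a multi-threaded crawl loop involves, here is a minimal sketch that uses only the Java standard library. It is not WebCollector's API: the class MiniCrawler, the methods fetchLinks and crawl, and the in-memory "site" map (which stands in for real HTTP fetching and link extraction) are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration; not part of WebCollector.
class MiniCrawler {

    // Stand-in for "download the page and extract its links":
    // looks up the outgoing links of a URL in a fixed in-memory map.
    static List<String> fetchLinks(Map<String, List<String>> site, String url) {
        return site.getOrDefault(url, List.of());
    }

    // Breadth-first crawl from the seed. Pages of one level are fetched
    // in parallel on a fixed thread pool; at most maxDepth levels are fetched.
    static Set<String> crawl(Map<String, List<String>> site, String seed,
                             int maxDepth, int threads) {
        Set<String> visited = new HashSet<>();
        visited.add(seed);
        List<String> frontier = List.of(seed);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
                // Submit one fetch task per URL in the current level.
                List<Future<List<String>>> pending = new ArrayList<>();
                for (String url : frontier) {
                    pending.add(pool.submit(() -> fetchLinks(site, url)));
                }
                // Collect results on the coordinating thread and skip URLs seen before.
                List<String> next = new ArrayList<>();
                for (Future<List<String>> result : pending) {
                    for (String link : result.get()) {
                        if (visited.add(link)) {
                            next.add(link);
                        }
                    }
                }
                frontier = next;
            }
        } catch (Exception e) {
            throw new IllegalStateException("crawl failed", e);
        } finally {
            pool.shutdown();
        }
        return new TreeSet<>(visited); // sorted for stable output
    }

    public static void main(String[] args) {
        Map<String, List<String>> site = Map.of(
            "/", List.of("/a", "/b"),
            "/a", List.of("/b", "/c"),
            "/b", List.of("/"),
            "/c", List.of("/d"));
        System.out.println(crawl(site, "/", 2, 4)); // prints [/, /a, /b, /c]
    }
}
```

With a depth of 2, the page "/c" is discovered but never fetched, so "/d" stays unknown. A real crawler would replace fetchLinks with an HTTP request and an HTML link extractor, and would typically add URL filtering and politeness delays.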

GitHub

3k stars
328 watching
1k forks
Language: Java
Last commit: 8 months ago
Linked from 1 awesome list


Related projects:

Repository | Description | Stars
pallets/quart | An async Python micro framework for building web applications. | 3,008
mtytel/helm | A polyphonic synthesizer with advanced modulation capabilities and visual feedback. | 2,377
ruanyf/weekly | A weekly publication featuring curated content on technology and science, written by enthusiasts for the tech community. | 47,834
yogeshojha/rengine | Automated reconnaissance framework for web applications with customizable workflow and data correlation. | 7,514
amyreese/zsh-take | Replication of the "take" functionality from Oh My Zsh, allowing users to create and navigate directories with a single command. | 3
joaomdmoura/machinery | A lightweight Elixir library for creating and managing state machines. | 535
kdomanski/iso9660 | A package for reading and creating ISO9660 images. | 264
jxub/barrel_ex | An Elixir client for interacting with the BarrelDB database system. | 0
batate/elixir-pipes | Provides macros to extend Elixir's pipe operator for more flexible function composition and error handling. | 327
fabiokiatkowski/exercism.plugin.zsh | A plugin to enhance the Oh My Zsh framework with features and functionality from Exercism.io for learning and improving shell scripting skills. | 10
jwhiteman/a-little-elixir-goes-a-long-way | Port of The Little Schemer to Elixir with exercises and algorithms in Scheme, along with unit tests and comparisons. | 348
hemanth/bangalore-startups | An extensive list of Bangalore-based startups organized by name. | 31
parroty/oauth2ex | An OAuth 2.0 client library for Elixir. | 57
antonmi/espec_phoenix | Provides an Elixir wrapper around ESpec to enable Behavior-Driven Development (BDD) for the Phoenix web framework. | 138
veelenga/aasm.cr | A simple finite state machine library for Crystal classes. | 51