dataflowkit

Web scraper

A framework for extracting structured data from web pages using CSS selectors.

Extracts structured data from websites by scraping their pages.

GitHub

667 stars
24 watching
80 forks
Language: Go
last commit: almost 2 years ago
Linked from 4 awesome lists

cdp, chrome-fetcher, crawling, extract-data, go, golang, golang-library, headless, scraper, scraping, scraping-websites

Related projects:

Repository | Description | Stars
miyagawa/web-scraper | A Perl toolkit for extracting structured data from HTML documents using a DSL-like interface. | 104
jakopako/goskyr | A tool to simplify web scraping of list-like structured data from web pages | 36
benibela/xidel | A tool to extract data from web pages using various query languages and selectors. | 690
davemolk/gogetjs | Tools for extracting and analyzing JavaScript files from web pages | 41
s0rg/crawley | A utility for systematically extracting URLs from web pages and printing them to the console. | 268
spider-rs/spider | A tool for web data extraction and processing using Rust | 1,234
yhat/scrape | A collection of utility functions and tools to simplify web scraping in Go. | 1,513
spekulatius/phpscraper | A web scraping utility for PHP that simplifies the process of extracting information from websites. | 544
elixir-crawly/crawly | A framework for extracting structured data from websites | 994
felipecsl/wombat | A Ruby-based web crawler and data extraction tool with an elegant DSL. | 1,315
the-markup/blacklight-collector | A tool for scraping website content and analyzing browser behavior | 205
stewartmckee/cobweb | A flexible web crawler that can be used to extract data from websites in a scalable and efficient manner | 226
propublica/upton | A web scraping framework that simplifies the process by handling repetitive tasks and provides options for efficient data retrieval | 1,612
joncanning/skyscraper | A framework for building asynchronous web scrapers and crawlers using async/await and Reactive Extensions. | 59
jaimeiniesta/metainspector | A Ruby gem for web scraping and extracting metadata from web pages. | 1,038