lambdasoup

HTML scraper

A functional HTML scraping and manipulation library

Functional HTML scraping and rewriting with CSS in OCaml

GitHub

383 stars
12 watching
31 forks
Language: OCaml
Last commit: 3 months ago
Linked from 1 awesome list

css, html, ocaml, scraping, soup

Related projects:

Repository | Description | Stars
aantron/markup.ml | A streaming HTML5 and XML parser that detects character encodings, emits signals, and provides error recovery. | 146
fcannizzaro/jsoup-annotations | A Java library that provides annotations to simplify HTML scraping and processing with Jsoup. | 239
miyagawa/web-scraper | A Perl toolkit for extracting structured data from HTML documents using a DSL-like interface. | 104
meilisearch/docs-scraper | Automates scraping and indexing of documentation content into a search engine. | 290
pharo-contributions/soup | An HTML parsing and scraping library for Pharo. | 6
egonschiele/handsomesoup | A Haskell library that simplifies HTML parsing by providing CSS selectors and attribute extraction functions. | 124
oscarotero/embed | A PHP library to extract metadata and embeddable code from any web page using various protocols and scraping techniques. | 2,091
felipecsl/wombat | A Ruby-based web crawler and data extraction tool with an elegant DSL. | 1,315
bendeaton/abaqus-documentation-scraper | Extracts keywords and parameters from Abaqus documentation for a syntax highlighting plugin. | 3
jakopako/goskyr | A tool to simplify web scraping of list-like structured data from web pages. | 35
the-markup/blacklight-collector | A tool for scraping website content and analyzing browser behavior. | 202
malfrats/xeuledoc | A tool to fetch information about public Google documents from various services. | 846
scrapy/scrapely | A pure-Python library for extracting structured data from HTML pages. | 1,863
michaelhelmick/lassie | A library for retrieving basic content from websites. | 613
laramies/metagoofil | Extracts metadata from public documents available on websites. | 1,028