goreadability

Web page summary extractor

Extracts readable content from web pages using Open Graph and traditional readability rules.

Webpage summary extractor using Facebook Open Graph and arc90's readability

GitHub

69 stars

7 watching

8 forks

Language: Go

Last commit: over 7 years ago

Linked from 2 awesome lists

opengraph readability scraper

Related projects:

Repository	Description	Stars
keepcosmos/readability	An Elixir library that extracts and curates primary readable content from web pages.	260
tjatse/node-readability	Automates web page scraping and text extraction to make any webpage readable	343
philipperemy/stanford-openie-python	Provides a Python interface to extract structured relation triples from plain text using CoreNLP's open information extraction system.	639
jonmagic/grim	A tool for extracting pages from PDFs and converting them to images and text strings.	216
erikriver/opengraph	A Python module to extract and parse metadata from web pages using the Open Graph Protocol.	230
cantino/ruby-readability	A Ruby port of a readability tool that extracts primary content from web pages.	927
foolin/pagser	A tool for automatically extracting structured data from HTML pages	105
neon-jungle/wagtail-readability	Analyzes the readability of text content in Wagtail's RichTextField	16
s0rg/crawley	A utility for systematically extracting URLs from web pages and printing them to the console.	268
peburrows/plot	A GraphQL parser and resolver for Elixir that aims to implement the full GraphQL spec.	32
vrothberg/vgrep	A user-friendly pager for text search and editing	669
itteco/iframely	A service that extracts metadata and embeds from web pages	1,537
steelthread/mimeograph	A CoffeeScript library for extracting text from PDF files and creating searchable documents with OCR capabilities	28
serpapi/nokolexbor	A high-performance HTML5 parser for Ruby based on Lexbor with support for CSS selectors and XPath.	327
plainas/tq	Tool that extracts content from HTML documents based on CSS selectors	236