python-readability
HTML parser
Extracts and cleans main body text and title from an HTML document
A fast Python port of arc90's Readability tool, updated to match the latest readability.js.
3k stars
95 watching
350 forks
Language: Python
last commit: 3 months ago
Linked from 2 awesome lists
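To make "extracts and cleans main body text and title" concrete, here is a minimal sketch of the general idea using only the Python standard library. It is not python-readability's API or algorithm: the class `TitleAndTextExtractor`, the function `extract`, and the list of skipped tags are hypothetical names and choices made for illustration. The real project uses a more elaborate scoring approach ported from readability.js.

```python
from html.parser import HTMLParser


class TitleAndTextExtractor(HTMLParser):
    """Toy extractor (illustrative only): keeps <title> text and <p> text,
    ignoring paragraphs nested in elements that are usually page chrome."""

    # Assumption: these tags rarely hold main article text.
    SKIPPED = {"script", "style", "nav", "footer", "aside"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraphs = []
        self._in_title = False
        self._in_p = False
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True
        elif tag == "p" and self._skip_depth == 0:
            self._in_p = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False
        elif tag == "p" and self._in_p:
            self._in_p = False
            # Collapse runs of whitespace left over from the markup.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_p and self._skip_depth == 0:
            self._buffer.append(data)


def extract(html):
    """Return (title, body_text) for an HTML string."""
    parser = TitleAndTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), "\n".join(parser.paragraphs)
```

Calling `extract(page_html)` returns the page title and the paragraph text joined by newlines. A production extractor would also need to handle pages that do not wrap their content in `<p>` tags, which this sketch does not attempt.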
Related projects:
Repository | Description | Stars |
---|---|---|
remarkjs/remark-rehype | Transforms markdown into HTML to support HTML processing plugins | 275 |
wcz-txp/unicode-url-for-textpattern | Automatically converts non-ASCII characters in text links to UTF-8 URLs for improved SEO and readability | 4 |
apostrophecms/sanitize-html | A JavaScript library for cleaning up and sanitizing user-submitted HTML, removing unwanted content while preserving whitelisted elements and attributes. | 3,867 |
webreflection/hyperhtml | A lightweight virtual DOM alternative built on top of HTML template literals | 3,071 |
remarkjs/remark | Tools for processing and transforming markdown text into various formats. | 7,778 |
haml/haml | A templating engine for HTML that uses a concise syntax and automatic indentation to simplify the process of writing and rendering HTML documents | 3,766 |
kkos/oniguruma | A flexible and modern regular expression library with support for various character encodings and APIs. | 2,331 |
overbryd/myhtmlex | Erlang/Elixir bindings for parsing and processing HTML documents | 14 |
rehypejs/rehype-remark | Transforms HTML into a Markdown syntax tree to support remark | 82 |
jhy/jsoup | A Java library for parsing and manipulating HTML and XML, with support for CSS selectors | 10,985 |
markdown-it/linkify-it | Library to recognize and normalize links with full unicode support | 670 |
lexborisov/myhtml | A fast HTML parsing library written in C | 1,657 |
github/markup | Converts raw markup to HTML for rendering on GitHub.com | 5,876 |
archakov06/codex-to-html | Converts JSON-blocks from EditorJS to HTML markup | 15 |
zzzprojects/html-agility-pack | An HTML parsing library that allows developers to parse and manipulate malformed HTML documents | 2,665 |