beagle

Phrase detector

A tool to identify keywords and phrases in streams of text documents

Beagle helps you identify keywords, phrases, regexes, and complex search queries of interest in streams of text documents.

GitHub

52 stars
4 watching
3 forks
Language: Clojure
Last commit: over 3 years ago
Linked from 1 awesome list

clojure, java, lucene, luwak, nlp, real-time-search, stemming, stored-query-engine, stream-search

Related projects:

Repository | Description | Stars
abitdodgy/words_counted | A Ruby library that tokenizes input and provides various statistical measures about the tokens | 159
bigbadbleucheese/kong | A .NET library that identifies characteristics of web browsers by parsing their User-Agent header strings. | 17
teknologi-umum/flourite | Automatically detects programming languages from given strings. | 38
c4n/pythonlexto | A Python wrapper around the Thai word segmentator LexTo, allowing developers to easily integrate it into their applications. | 1
abadojack/whatlanggo | A library for detecting and identifying languages in text | 643
pemistahl/lingua | An accurate language detection library for Java and the JVM suitable for both short and long text inputs. | 707
genericsteele/token_phrase | A Ruby gem that generates unique phrases by combining words from predefined dictionaries with user-specified separators and settings. | 100
pjhampton/woolly | A Text Mining & Natural Language Processing API for the Elixir programming language. | 51
ndmitchell/tagsoup | A Haskell library for parsing and extracting information from HTML/XML documents | 233
gueils/belugas-node | An engine for detecting Node.js application features through static analysis. | 1
turbopape/postagga | A Clojure-based natural language processing library for parsing and structuring text input into meaningful data. | 159
hashwin/scylla | A Ruby-based language detection tool that uses N-Gram based text categorization to identify the language of given text. | 36
google/dexmod | Tool to analyze and modify Android bytecode for security research and analysis | 49
tokenmill/timewords | A library for parsing date strings into Java Date objects | 30
gaul/modernizer-maven-plugin | Detects uses of legacy Java APIs in source code to recommend modern alternatives. | 371