scispacy

SciDoc Pipeline

A collection of custom spaCy pipelines and models for analyzing scientific and biomedical documents.

GitHub

2k stars
52 watching
228 forks
Language: Python
last commit: 25 days ago
Linked from 1 awesome list

bioinformatics, biomedical, custom-pipes, nlp, scientific-documents, spacy

Related projects:

| Repository | Description | Stars |
|---|---|---|
| explosion/spacy-stanza | Wraps the Stanza NLP library to use Stanford models with spaCy | 725 |
| scipipe/scipipe | A flexible and efficient way to write and run complex workflows using the Go programming language | 1,075 |
| explosion/spacy-lookups-data | Provides additional data resources for spaCy's natural language processing capabilities | 98 |
| openphilology/nidaba | Automates an OCR pipeline for text digitization and conversion of raw images into citable texts | 86 |
| spencerahill/aospy | Automates computations involving gridded climate data and manages results | 84 |
| allenai/scibert | A BERT model trained on scientific text for natural language processing tasks | 1,521 |
| pharmbio/sciluigi | A lightweight wrapper around Spotify's Luigi workflow library to simplify writing scientific workflows | 334 |
| ncbi-hackathons/spew | Automates the packaging and distribution of bioinformatics pipelines for seamless deployment on various workstations | 26 |
| aphp/eds-scikit | A Python library providing tools to process and analyze standardized clinical data from healthcare databases | 35 |
| johnsonc/lambdo | A workflow engine for unifying feature engineering and machine learning operations in data analysis pipelines | 1 |
| nlpatvcu/medacy | A framework for building medical NLP models using spaCy | 432 |
| weisscharlesj/scicompforchemists | A comprehensive textbook and teaching resource on programming in Python with applications to chemistry | 227 |
| quanteda/spacyr | An R wrapper around spaCy for natural language processing tasks | 251 |
| su-informatics-lab/dstg | Software implementation of a graph-based AI method for decomposing spatial transcriptomics data | 34 |
| silascutler/malpipe | An ingestion and processing framework for malware and indicator data from various feeds | 103 |