tape-neurips2019

Protein Embedding Framework

A software framework for evaluating protein embeddings and benchmarking semi-supervised learning tasks in protein biology

Tasks Assessing Protein Embeddings (TAPE) is a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology. (This repository is deprecated.)

118 stars
9 watching
34 forks
Language: Python
Last commit: over 3 years ago
Linked from 1 awesome list

benchmark, dataset, deep-learning, language-modeling, protein-sequences, protein-structure, semi-supervised-learning

Related projects:

Repository | Description | Stars
songlab-cal/tape | Provides pre-trained protein embeddings and benchmarking tools for semi-supervised learning tasks in protein biology | 662
tbepler/protein-sequence-embedding-iclr2019 | A framework for learning protein sequence and structure embeddings using deep learning models | 258
antoine77340/howto100m | Provides code and tools for learning joint text-video embeddings using the HowTo100M dataset | 250
largelymfs/topical_word_embeddings | A codebase implementing topical word embeddings using various NLP techniques as demonstrated in a paper accepted by AAAI'15 | 315
hicai-zju/promptprotein | An implementation of a protein language model that uses prompts to learn from multi-level structural information in proteins | 32
gao-lab/glue | A software framework for integrating multi-omics data from single cells | 382
princetonml/sif | A Python implementation of a sentence embedding algorithm using the Smooth Inverse Frequency weighting scheme | 1,083
vzhong/embeddings | Provides fast and efficient word embeddings for natural language processing | 223
ncbi-nlp/biosentvec | Pre-trained word and sentence embeddings for biomedical text analysis | 578
gink03/alt-i2v | An implementation of a deep learning-based image representation learning approach using a modified fully connected layer and transfer learning from VGG16 | 34
laion-ai/clap | A library for learning audio embeddings from text and audio data using contrastive language-audio pretraining | 1,415
kudkudak/word-embeddings-benchmarks | Provides methods for evaluating word embeddings on various benchmarks | 437
nvlabs/circuitops | A software framework providing a data infrastructure for generating datasets and deploying generative AI models in circuit optimization tasks | 72
materialsintelligence/mat2vec | Unsupervised word embeddings capture latent knowledge from materials science literature | 619
teichlab/gpfates | Software to model transcriptional cell fates as mixtures of Gaussian Processes | 19