tape

Protein benchmarks

Provides pre-trained protein embeddings and benchmarking tools for semi-supervised learning tasks in protein biology.

Tasks Assessing Protein Embeddings (TAPE) is a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology.

GitHub

662 stars
22 watching
129 forks
Language: Python
Last commit: almost 2 years ago
Linked from 1 awesome list

Topics: benchmark, dataset, deep-learning, language-modeling, protein-sequences, protein-structure, pytorch, semi-supervised-learning

Related projects:

Repository | Description | Stars
songlab-cal/tape-neurips2019 | A software framework for evaluating protein embeddings and benchmarking semi-supervised learning tasks in protein biology | 118
tbepler/protein-sequence-embedding-iclr2019 | A framework for learning protein sequence and structure embeddings using deep learning models | 258
hicai-zju/promptprotein | An implementation of a protein language model that uses prompts to learn from multi-level structural information in proteins | 32
cbcrg/benchfam | Generates a benchmark dataset for evaluating protein alignment programs | 3
pku-yuangroup/video-bench | Evaluates and benchmarks large language models' video understanding capabilities | 117
prosodylab/prosodylab.alignertools | A package of scripts to prepare data for use in Prosodylab-Aligner by cleaning and relabeling transcriptions and generating orthography-based dictionaries | 12
kudkudak/word-embeddings-benchmarks | Provides methods for evaluating word embeddings on various benchmarks | 437
antoine77340/howto100m | Provides code and tools for learning joint text-video embeddings using the HowTo100M dataset | 250
ys-zong/vl-icl | A benchmarking suite for multimodal in-context learning models | 28
ailab-cvc/seed-bench | A benchmark for evaluating large language models' ability to process multimodal input | 315
automl/hpobench | A collection of benchmark problems for hyperparameter optimization | 138
talwalkarlab/leaf | A benchmarking framework for federated machine learning tasks across various domains and datasets | 851
yandex/rep | A toolset for building and running reproducible machine learning experiments in Python | 689
jordipons/eusipco2017 | Research code for music auto-tagging using deep learning and feature extraction | 23
ncbi-nlp/biosentvec | Pre-trained word and sentence embeddings for biomedical text analysis | 578