sparse_attention

Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers"

GitHub

2k stars
45 watching
192 forks
Language: Python
last commit: about 4 years ago