sparse_attention
Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers"
2k stars
45 watching
192 forks
Language: Python
last commit: about 4 years ago