unnatural-instructions
Instruction dataset
A collection of automatically generated instructions for training language models.
175 stars
7 watching
10 forks
last commit: over 1 year ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
xuefuzhao/instructionwild | Creating a large-scale user-based instruction dataset for natural language processing research and development | 453 |
x2fd/lvis-instruct4v | A dataset of fine-grained visual instructions generated by prompting a large language model with images from another dataset | 131 |
allenai/natural-instructions | Creating a large collection of tasks and their natural language definitions/instructions to support the development of NLP models with generalization capabilities | 958 |
flagopen/flaginstruct | A collection of diverse instruction corpora for improving the development and tuning of Chinese Language Models | 173 |
vt-nlp/multiinstruct | A multimodal benchmark dataset designed to evaluate the performance of vision-language foundation models through instruction tuning | 133 |
ordinand/the-art-of-asking-chatgpt-for-high-quality-answers-a-complete-guide-to-prompt-engineering-technique | A comprehensive guide to optimizing chatbot responses using prompt engineering techniques | 984 |
zjunlp/mol-instructions | A dataset and tools package designed to support the training and evaluation of large language models for molecular biology tasks | 252 |
spro/nalgene | Generates training data for intent parsing systems by creating pairs of sentences and grammar trees from a template file | 55 |
rucaibox/comvint | Creating synthetic visual reasoning instructions to improve the performance of large language models on image-related tasks | 18 |
orhun/orhun | A collection of command-line tools and utilities for Linux system management, automation, and hobbyist projects | 74 |
russianpanda95/nop_plugin | An IDA plugin that removes unnecessary bytes from instructions | 12 |
igobronidze/hrs_training_data | Training data for a handwriting recognition system | 20 |
rondnelson99/opcode_count | Analyzes the frequency of instructions in Game Boy code to provide insights into coding patterns and optimization opportunities | 2 |
nimrodpar/labeled-elfs | Provides labeled ELF binaries for research and testing purposes | 86 |
mbzuai-nlp/bactrian-x | A collection of multilingual language models trained on a dataset of instructions and responses in various languages | 94 |