DeepSeek-V2
Mixture-of-Experts Language Model
A mixture-of-experts language model with strong performance and efficient inference.
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
4k stars
32 watching
171 forks
last commit: 3 months ago

Related projects:
Repository | Description | Stars |
---|---|---|
deepseek-ai/deepseek-coder | A code completion model trained on large amounts of programming language data to help developers write code more efficiently. | 6,987 |
deepseek-ai/deepseek-moe | A large language model with improved efficiency and performance compared to similar models | 1,024 |
confident-ai/deepeval | A framework for evaluating large language models | 4,003 |
microsoft/deepspeed | A deep learning optimization library that simplifies distributed training and inference on modern computing hardware. | 35,863 |
huggingface/text-generation-inference | A toolkit for deploying and serving Large Language Models (LLMs) for high-performance text generation | 9,456 |
coqui-ai/tts | A deep learning toolkit for generating human-like speech from text | 36,118 |
brightmart/text_classification | An NLP project offering various text classification models and techniques for deep learning exploration | 7,881 |
google/big-bench | A benchmark designed to probe large language models and extrapolate their future capabilities through a diverse set of tasks. | 2,899 |
deepseek-ai/deepseek-coder-v2 | A code intelligence model designed to generate and complete code in various programming languages | 2,322 |
deepseek-ai/deepseek-vl | A multimodal AI model that enables real-world vision-language understanding applications | 2,145 |
dair-ai/ml-papers-explained | An explanation of key concepts and advancements in the field of Machine Learning | 7,352 |
databrickslabs/dolly | An instruction-following large language model trained on the Databricks machine learning platform and licensed for commercial use | 10,820 |
huggingface/alignment-handbook | Provides recipes and guidelines for training language models to align with human preferences and AI goals | 4,800 |
qwenlm/qwen2.5 | A large language model series with various sizes and variants for text generation and understanding. | 10,959 |
tju-drl-lab/ai-optimizer | A next-generation deep reinforcement learning toolkit with libraries for multi-agent, self-supervised, offline, and transfer reinforcement learning | 4,848 |