llama.onnx

LLaMa inference toolset

A project providing ONNX models and tools for running inference with the LLaMa transformer model on various devices.

LLaMa/RWKV ONNX models, quantization, and test cases

GitHub

352 stars
13 watching
31 forks
Language: Python
Last commit: over 1 year ago
Topics: alpaca, llama, llm, onnx, onnxruntime, quantization, rwkv, transformer

Related projects:

Repository | Description | Stars
snunez1/llama.cl | A Common Lisp port of a Large Language Model (LLM) implementation | 35
linksoul-ai/chinese-llama-2-7b | A deep learning project providing an open-source implementation of the LLaMA2 model with Chinese and English text data | 2,228
xboot/libonnx | A lightweight ONNX inference engine for embedded devices with hardware acceleration support | 583
dicklesworthstone/swiss_army_llama | A FastAPI service that facilitates semantic text search using precomputed embeddings and advanced similarity measures | 941
shm007g/llama-cult-and-more | Provides insights and practical guides for building and using large language models | 427
andrewzhe/lawyer-llama | An AI model trained on legal data to provide answers and explanations in Chinese law | 851
run-llama/llamaindexts | A data framework for integrating large language models into applications with custom data | 1,937
melih-unsal/demogpt | A comprehensive toolset for building Large Language Model (LLM) based applications | 1,710
vhellendoorn/code-lms | A guide to using pre-trained large language models in source code analysis and generation | 1,782
openlmlab/openchinesellama | An incremental pre-trained Chinese large language model based on the LLaMA-7B model | 234
juncongmoo/chatllama | An open source implementation of LLaMA-based chatbots using reinforcement learning from human feedback for faster training and inference | 1,207
lxtgh/omg-seg | Develops an end-to-end model for multiple visual perception and reasoning tasks using a single encoder, decoder, and large language model | 1,300
jerry1993-tech/cornucopia-llama-fin-chinese | A Chinese finance-focused large language model fine-tuning framework | 589
microsoft/onnxruntime-inference-examples | Repository providing examples for using ONNX Runtime (ORT) to perform machine learning inferencing | 1,212
km1994/llmsninestorydemontower | Exploring various LLMs and their applications in natural language processing and related areas | 1,798