marlin

Matrix multiplier

An optimized FP16xINT4 matrix multiplication kernel for large language models

An FP16xINT4 LLM inference kernel that achieves near-ideal ~4x speedups over FP16 up to medium batch sizes of 16-32 tokens.
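The ~4x figure is consistent with weight storage: at small batch sizes the matrix multiplication is typically limited by how fast weights can be read from memory, and 4-bit weights take a quarter of the bytes of FP16 weights. The following NumPy sketch illustrates the general FP16xINT4 idea only. It is not Marlin's kernel, memory layout, or API. The function names, the symmetric per-group scheme, and the group size of 128 are all assumptions made for illustration.

```python
import numpy as np

GROUP_SIZE = 128  # assumed group size, chosen for illustration only


def quantize_int4(w, group_size=GROUP_SIZE):
    """Symmetric per-group 4-bit quantization along the input (K) dimension."""
    k, n = w.shape
    groups = w.astype(np.float32).reshape(k // group_size, group_size, n)
    # One FP16 scale per (group, output column), mapping max |w| to 7.
    scales = (np.abs(groups).max(axis=1, keepdims=True) / 7.0).astype(np.float16)
    scales = np.maximum(scales, np.finfo(np.float16).tiny)
    q = np.rint(groups / scales.astype(np.float32))
    q = np.clip(q, -8, 7).astype(np.int8).reshape(k, n)
    return q, scales.reshape(k // group_size, n)


def pack_int4(q):
    """Pack two signed 4-bit values per byte (rows 2i and 2i+1 share a byte)."""
    u = (q + 8).astype(np.uint8)  # shift -8..7 to unsigned 0..15
    return u[0::2] | (u[1::2] << 4)


def unpack_int4(packed):
    """Inverse of pack_int4: recover the signed 4-bit values as int8."""
    k2, n = packed.shape
    q = np.empty((2 * k2, n), dtype=np.int8)
    q[0::2] = (packed & 0x0F).astype(np.int8) - 8
    q[1::2] = (packed >> 4).astype(np.int8) - 8
    return q


def matmul_fp16_int4(a, packed, scales, group_size=GROUP_SIZE):
    """Reference FP16 x INT4 product: dequantize the weights, then multiply."""
    q = unpack_int4(packed).astype(np.float32)
    w = q * np.repeat(scales.astype(np.float32), group_size, axis=0)
    return (a.astype(np.float32) @ w).astype(np.float16)


rng = np.random.default_rng(0)
a = rng.standard_normal((16, 256)).astype(np.float16)   # batch of 16 tokens
w = rng.standard_normal((256, 64)).astype(np.float32)   # full-precision weights

q, scales = quantize_int4(w)
packed = pack_int4(q)
out = matmul_fp16_int4(a, packed, scales)

# Packed weights need a quarter of the bytes of their FP16 counterpart
# (the per-group scales add a small overhead on top).
ratio = w.astype(np.float16).nbytes / packed.nbytes
```

A real GPU kernel would dequantize on the fly inside the multiplication rather than materializing the full weight matrix as this reference does. The sketch only shows why the packed representation moves four times less weight data.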

GitHub

624 stars
15 watching
47 forks
Language: Python
last commit: 3 months ago
Tags: 4bit, kernel, llm, quantization

Related projects:

Repository | Description | Stars
microsoft/deepspeed-mii | A Python library designed to accelerate model inference with high throughput and low latency | 1,898
jalawson/ulinalg | A small matrix-handling module with linear algebra operations for MicroPython (Python 3) | 32
davidstutz/matlab-mnist-two-layer-perceptron | A MATLAB implementation of a two-layer perceptron to recognize handwritten digits from the MNIST dataset | 60
google/gemmlowp | A small C++ library for low-precision matrix multiplication | 1,779
janbednarik/micropython-matrix8x8 | A Python driver for an 8x8 LED matrix display using I2C communication | 15
microsoft/bitblas | A library for efficient mixed-precision matrix multiplications on GPUs for deep learning models | 420
mratsim/laser | A high-performance computing library providing optimized primitives for tensor and matrix operations | 278
jjjkkkjjj/matft | A NumPy-like library in Swift for multi-dimensional array and matrix operations | 133
iyassou/umatrix | A library providing basic matrix arithmetic operations and functions for MicroPython | 15
intel/neural-compressor | Tools and techniques for optimizing large language models on various frameworks and hardware platforms | 2,226
versilov/matrex | A fast and efficient matrix library for Elixir/Erlang with a C implementation using CBLAS | 478
akabe/slap | A linear algebra library with type-based static size checking for matrix operations | 88
tlk00/bitmagic | A C++ library for compact data structures and algorithms optimized for memory efficiency and high performance | 412
uncomplicate/neanderthal | A Clojure library providing optimized native libraries for fast matrix and linear algebra computations on CPU and GPU | 1,076
numpi/hm-toolbox | A toolbox implementing arithmetic operations for HODLR and HSS matrices in MATLAB | 43