lance
Data storage format
A modern columnar data format for machine learning and large language models.
Modern columnar data format for ML and LLMs, implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, a vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.
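The Parquet conversion mentioned in the description can be sketched roughly as below. This is a minimal sketch, assuming the `pylance` and `pyarrow` packages are installed; the helper name `convert_parquet_to_lance` and both paths are hypothetical and chosen only for illustration.

```python
def convert_parquet_to_lance(parquet_path: str, lance_path: str):
    """Hypothetical helper: copy a Parquet dataset into a Lance dataset."""
    # Imported lazily so that defining this helper needs neither package;
    # calling it assumes `pylance` and `pyarrow` are installed.
    import lance
    import pyarrow.dataset as ds

    # The two lines the description refers to:
    parquet = ds.dataset(parquet_path, format="parquet")  # open the Parquet source
    return lance.write_dataset(parquet, lance_path)       # write it out as Lance
```

Calling `convert_parquet_to_lance("/tmp/test.parquet", "/tmp/test.lance")` (placeholder paths) should return the written dataset, which can later be reopened with `lance.dataset("/tmp/test.lance")`.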
4k stars
44 watching
221 forks
Language: Rust
last commit: 5 days ago
Linked from 1 awesome list
Topics: apache-arrow, computer-vision, data-analysis, data-analytics, data-centric, data-format, data-science, dataops, deep-learning, duckdb, embeddings, llms, machine-learning, mlops, python, rust
Related projects:
Repository | Description | Stars |
---|---|---|
lancedb/lancedb | A serverless vector search and storage database built with Rust, enabling efficient similarity searches across multimodal data. | 4,757 |
wesm/feather | A binary data frame storage system that enables efficient and interoperable data sharing across multiple programming languages. | 2,742 |
apache/arrow | A toolkit for efficient data interchange and in-memory analytics in various languages | 14,590 |
aksnzhy/xlearn | A high-performance machine learning package with linear models and factorization machines. | 3,087 |
ml-tooling/opyrator | Automates conversion of machine learning code into production-ready microservices with web API and GUI. | 3,102 |
paradedb/pg_analytics | Enables direct querying of large data volumes from Postgres using a high-performance analytical query engine | 380 |
mlpack/mlpack | A C++ machine learning library with bindings for multiple programming languages. | 5,113 |
root-project/root | A software package for analyzing and visualizing large scientific data sets | 2,707 |
ericlbuehler/mistral.rs | A fast and flexible LLM inference platform supporting various models and devices | 4,466 |
ponyorm/pony | An object-relational mapper that allows Python developers to write database queries using Python code | 3,650 |
qdrant/qdrant | A high-performance vector search engine and database for efficient similarity searches in machine learning applications. | 20,607 |
vaexio/vaex | A high-performance Python library for streaming and exploring large tabular datasets. | 8,297 |
postgresml/postgresml | An open-source Postgres extension for machine learning and AI operations directly within the database. | 6,033 |
infiniflow/infinity | A high-performance database designed to support fast search and retrieval of dense vector, sparse vector, tensor, and full-text data | 2,641 |
ayush1997/visualize_ml | A Python package for data analysis and visualization in machine learning | 200 |