MuVI
Multi-view modeler
A software framework for multi-view latent variable modeling with domain-informed structured sparsity, designed for integrating noisy feature sets.
27 stars
5 watching
2 forks
Language: Python
Last commit: 2 months ago
Linked from 1 awesome list
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | An evaluation framework for multimodal language models' visual capabilities using image and question benchmarks. | 296 |
| | An end-to-end image captioning system that uses large multi-modal models and provides tools for training, inference, and demo usage. | 1,849 |
| | Tool for validating and standardizing chemical structures to improve data quality and facilitate comparisons. | 163 |
| | A large multimodal language model designed to process and analyze video, image, text, and audio inputs in real-time. | 1,005 |
| | Develops high-resolution multimodal LLMs by combining vision encoders and various input resolutions. | 549 |
| | Evaluates the capabilities of large multimodal models using a set of diverse tasks and metrics. | 274 |
| | An evaluation platform for comparing multi-modality models on visual question-answering tasks. | 478 |
| | Develops large multimodal models for high-resolution understanding and analysis of text, images, and other data types. | 143 |
| | A large multimodal model for visual question answering, trained on a dataset of 2.1B image-text pairs and 8.2M instruction sequences. | 78 |
| | A large vision-language model using a mixture-of-experts architecture to improve performance on multi-modal learning tasks. | 2,023 |
| | Develops a multimodal task and dataset to assess vision-language models' ability to handle interleaved image-text inputs. | 33 |
| | Evaluates and compares the performance of multimodal large language models on various tasks. | 56 |
| | An implementation of a unified architecture for multi-modal multi-task learning using PyTorch. | 515 |
| | A benchmarking framework for evaluating Large Multimodal Models by providing rigorous metrics and an efficient evaluation pipeline. | 22 |
| | A multimodal LLM designed to handle text-rich visual questions. | 270 |