FDA
Navigation assistant
This project proposes a novel data augmentation technique to enhance visual-textual matching in vision-and-language navigation tasks.
Official implementation of Frequency-enhanced Data Augmentation for Vision-and-Language Navigation (NeurIPS 2023)
13 stars
3 watching
0 forks
Language: Python
Last commit: almost 2 years ago

Related projects:
| Repository | Description | Stars |
|---|---|---|
| | This project proposes a novel method for calibrating attention distributions in multimodal models to improve contextualized representations of image-text pairs. | 30 |
| | Improves performance of vision language tasks by integrating computer vision capabilities into large language models | 314 |
| | Generates data for CARLA's visual navigation system using raw camera images and instructions. | 8 |
| | An AR navigation aid for visually impaired individuals. | 26 |
| | Extends pretrained models to handle multiple modalities by aligning language and video representations. | 751 |
| | A research project aimed at building large language and vision models for biomedical applications with capabilities comparable to GPT-4. | 1,622 |
| | Improves image restoration performance by converting global operations to local ones during inference | 231 |
| | A tool for annotating and labeling data for autonomous driving applications using semi-supervised machine learning | 1 |
| | Improves the performance of large language models by intervening in their internal workings to reduce hallucinations | 83 |
| | Provides a framework for learning with noisy labels using data augmentation strategies. | 113 |
| | Provides pre-trained ELMo representations for multiple languages to improve NLP tasks. | 1,462 |
| | Develops robot navigation policies in crowded spaces using reinforcement learning and attention mechanisms. | 607 |
| | An AI-powered conversational assistant built on top of a Raspberry Pi. | 747 |
| | Trains and deploys large language models on computer vision tasks using region-of-interest inputs. | 517 |
| | Provides sentence embeddings for Arabic using pre-trained word embeddings and the Smooth Inverse Frequency algorithm. | 5 |