lip-reading-deeplearning

Lip reader

Deep learning-based system for recognizing speech from lip movements using 3D convolutional neural networks.

Lip Reading - Cross Audio-Visual Recognition using 3D Architectures
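As a rough illustration of what "3D" means here, the following is a minimal sketch of a single-channel 3D convolution sliding over a short clip of mouth-region frames, so that each output value mixes information across both space and time. It is not taken from the repository: the function name `conv3d_valid`, the toy clip, and the averaging kernel are all hypothetical, and a real model would use a deep learning framework with learned multi-channel kernels rather than these pure-Python loops.

```python
def conv3d_valid(clip, kernel):
    """Naive single-channel 3D 'valid' convolution (implemented as
    cross-correlation, the usual deep learning convention) over a clip
    indexed as [time][row][col]."""
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):          # slide along time
        plane = []
        for y in range(H - kh + 1):      # slide along rows
            row = []
            for x in range(W - kw + 1):  # slide along columns
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


# Hypothetical toy input: 4 frames of a 5x5 mouth crop, every pixel set to 1.0.
clip = [[[1.0] * 5 for _ in range(5)] for _ in range(4)]
# Hypothetical 2x3x3 averaging kernel that spans two consecutive frames.
kernel = [[[1.0 / 18] * 3 for _ in range(3)] for _ in range(2)]

features = conv3d_valid(clip, kernel)
shape = (len(features), len(features[0]), len(features[0][0]))
print(shape)
```

Because the kernel extends over two frames, each output value depends on motion between consecutive frames as well as on the spatial pattern within a frame, which is the property that makes 3D convolutions a natural fit for lip movement.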

GitHub

2k stars
55 watching
321 forks
Language: Python
Last commit: about 2 years ago
Linked from 1 awesome list

3d-convolutional-network, computer-vision, deep-learning, speech-recognition, tensorflow

Related projects:

Repository | Description | Stars
astorfi/3d-convolutional-speaker-recognition | Develops deep learning models using 3D convolutional neural networks for speaker verification tasks | 783
vipl-audio-visual-speech-understanding/lipreading-densenet3d | A software implementation of a deep learning model designed to understand lip movements in videos | 117
keunwoochoi/auralisation | Reconstructs audio features learned by convolutional neural networks into audible sounds | 42
canjie-luo/moran_v2 | A deep learning framework for scene text recognition with rectification and attention mechanisms | 636
laion-ai/clap | A library for learning audio embeddings from text and audio data using contrastive language-audio pretraining | 1,427
astorfi/speechpy | Provides tools and libraries for extracting speech features from audio data | 880
nicholas-leonard/dp | A deep learning library for streamlining research and development using the Torch7 distribution | 343
matlab-deep-learning/deepspeech | Enables speech-to-text transcription using a pre-trained Deep Speech model in MATLAB | 7
engineering-course/lip_ssl | A deep learning framework for human parsing that learns to detect human structures without explicit joint labeling | 229
vita-epfl/monoloco | A library for 3D vision tasks using 2D keypoints | 428
ayoolaolafenwa/pixellib | A deep learning library for image segmentation and object detection using PyTorch | 1,049
imodpasteur/lutorpy | A Python library that enables seamless interaction between deep learning frameworks and Lua/Torch libraries | 233
tinghuiz/sfmlearner | An unsupervised learning framework for depth and ego-motion estimation from monocular videos | 1,967
millionintegrals/vel | A collection of modular deep learning components that can be easily configured and reused in various applications | 276
thelegendali/deeplab-context | An implementation of a deep learning system for semantic image segmentation using a combination of convolutional neural networks and conditional random fields | 239