3D-convolutional-speaker-recognition
Speaker Verifier
Develops deep learning models using 3D convolutional neural networks for speaker verification tasks
Deep Learning & 3D Convolutional Neural Networks for Speaker Verification
783 stars
58 watching
274 forks
Language: Python
Last commit: over 5 years ago
Linked from 1 awesome list
3d, convolutional-neural-networks, deep-learning, speaker-recognition
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | Deep learning-based system for recognizing speech from lip movements using 3D convolutional neural networks. | 1,840 |
| | This project provides an implementation of a deep learning framework to classify audio signals and offers insights into the model's decision-making process using Explainable Artificial Intelligence (AI) techniques. | 351 |
| | Reconstructs audio features learned by convolutional neural networks into audible sounds | 42 |
| | A software implementation of a deep learning model designed to understand lip movements in videos | 117 |
| | A framework for unsupervised depth and ego-motion estimation from monocular videos using deep learning | 1,977 |
| | An implementation of a 3D convolutional neural network based on the AlexNet architecture for image recognition in 3D data. | 42 |
| | A deep learning framework for 3D object detection from RGB-D data | 1,598 |
| | A software framework for 3D vision and computer vision tasks using deep learning and 2D keypoints. | 431 |
| | Provides tools and libraries for extracting speech features from audio data. | 881 |
| | Enables speech-to-text transcription using a pre-trained neural network model in MATLAB. | 7 |
| | A large language model enabling speech, audio event perception and music inputs to achieve multilingual capabilities | 1,091 |
| | An audio processing toolkit using PyTorch convolutional neural networks to generate spectrograms from raw audio data | 1,036 |
| | Enables speech-to-text transcription using a pre-trained Deep Speech model in MATLAB. | 7 |
| | Deep learning models for semantic segmentation of images | 101 |
| | A tool that enables language workers to build speech recognition models using multiple systems, including Kaldi and Huggingface Transformers. | 152 |