ConvolutionaNeuralNetworksToEnhanceCodedSpeech

In this work we propose two postprocessing approaches applying convolutional neural networks (CNNs) either in the time domain or the cepstral domain to enhance the coded speech without any modification of the codecs. The time domain approach follows an end-to-end fashion, while the cepstral domain approach uses analysis-synthesis with cepstral domain features. The proposed postprocessors in both domains are evaluated for various narrowband and wideband speech codecs in a wide range of conditions. The proposed postprocessor improves speech quality (PESQ) by up to 0.25 MOS-LQO points for G.711, 0.30 points for G.726, 0.82 points for G.722, and 0.26 points for adaptive multirate wideband codec (AMR-WB). In a subjective CCR listening test, the proposed postprocessor on G.711-coded speech exceeds the speech quality of an ITU-T-standardized pos
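As a rough illustration of the cepstral-domain analysis-synthesis idea mentioned in the description above, the following minimal sketch processes a single frame with NumPy. It is not the repository's implementation. The function name `cepstral_postprocess`, the `enhance` callback (a hypothetical stand-in for a trained CNN), the use of the real cepstrum via an inverse FFT, and the reuse of the coded frame's phase are all assumptions made for this example.

```python
import numpy as np


def cepstral_postprocess(frame, enhance=lambda c: c):
    """Analysis-synthesis on one frame (illustrative sketch only).

    `enhance` is a hypothetical placeholder for a trained CNN that maps
    cepstral coefficients of coded speech to enhanced coefficients.
    The default is the identity, which should reproduce the input frame.
    """
    n = len(frame)

    # Analysis: spectrum -> log magnitude -> real cepstrum.
    spectrum = np.fft.rfft(frame)
    magnitude = np.maximum(np.abs(spectrum), 1e-12)  # floor avoids log(0)
    phase = np.angle(spectrum)
    cepstrum = np.fft.irfft(np.log(magnitude), n=n)

    # Enhancement in the cepstral domain (assumed placeholder for the CNN).
    cepstrum_hat = enhance(cepstrum)

    # Synthesis: cepstrum -> log magnitude -> spectrum with original phase.
    log_mag_hat = np.fft.rfft(cepstrum_hat).real
    spectrum_hat = np.exp(log_mag_hat) * np.exp(1j * phase)
    return np.fft.irfft(spectrum_hat, n=n)


# Usage: with the identity "enhancer", the frame passes through unchanged
# up to floating-point error.
rng = np.random.default_rng(0)
coded_frame = rng.standard_normal(256)
restored = cepstral_postprocess(coded_frame)
```

In a real system the frames would be windowed and overlap-added, and `enhance` would be replaced by a model trained on pairs of coded and clean speech; both are left out here to keep the sketch short.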

1d-convolution cnn gan generative-adversarial-network keras mfcc post-processing speech-enhancement speech-processing speech-reconstruction

Created: 2018-03-22T04:38:30

Updated: 2024-01-24T21:28:55

https://ansleliu.github.io/CNN.html

Related projects

Annotated_deep_learning_paper_implementations

attention

60+ implementations/tutorials of deep learning papers with side-by-side notes, including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans (cyclegan, stylegan2, ...), reinforcement learning (ppo, dqn), capsnet, distillation, ...

59726

1 week ago

+12 today