This is the code repository for the neural speech codec presented in the EMNLP 2024 paper ESC: Efficient Speech Coding with Cross-Scale Residual Vector Quantized Transformers [paper] Training a base ...
Abstract: Previous objective speech quality assessment models, such as bark spectral distortion (BSD), the perceptual speech quality measure (PSQM), and measuring normalizing blocks (MNB), have been ...