I've spent some time on generative error correction recently

Speech Technology

I've spent some time on generative error correction recently. More numbers and results on it later; meanwhile, here is the paper:

https://arxiv.org/abs/2409.09785

Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition

Chao-Han Huck Yang, Taejin Park, Yuan Gong, Yuanchao Li, Zhehuai Chen, Yen-Ting Lin, Chen Chen, Yuchen Hu, Kunal Dhawan, Piotr Żelasko, Chao Zhang, Yun-Nung Chen, Yu Tsao, Jagadeesh Balam, Boris Ginsburg, Sabato Marco Siniscalchi, Eng Siong Chng, Peter Bell, Catherine Lai, Shinji Watanabe, Andreas Stolcke

Given recent advances in generative AI technology, a key question is how large language models (LLMs) can enhance acoustic modeling tasks using text decoding results from a frozen, pretrained automatic speech recognition (ASR) model. To explore new capabilities in language modeling for speech processing, we introduce the generative speech transcription error correction (GenSEC) challenge. This challenge comprises three post-ASR language modeling tasks: (i) post-ASR transcription correction, (ii) speaker tagging, and (iii) emotion recognition. These tasks aim to emulate future LLM-based agents handling voice-based interfaces while remaining accessible to a broad audience by utilizing open pretrained language models or agent-based APIs. We also discuss insights from baseline evaluations, as well as lessons learned for designing future evaluations.

Another, older paper is here:

https://www.tg-me.com/us/Speech Technology/com.speechtech/1962

1.3K views, edited Mar 10 at 00:48

tg-me.com/speechtech/2079

Create: 2025-03-10
Last Update: 2025-07-05 02:49:11

BY Speech Technology
