How to find word error rate of spoken sentence for regression based model?

Question

Shilpa Sonawane on 21 Oct 2023

Direct link to this question

https://es.mathworks.com/matlabcentral/answers/2036661-how-to-find-word-error-rate-of-spoken-sentence-for-regression-based-model

Commented: Shilpa Sonawane on 14 Dec 2023

I am working on visual speech synthesis using the GRID dataset, which consists of short sentences. The model I developed is regression-based: it takes silent video as input and generates a speech signal. My aim is to compute the word error rate of the output speech signal. I don't know how to separate the words in the input and output signals in order to compute the word error rate.

Kindly guide me about this.

1 comment

Shilpa Sonawane on 26 Oct 2023

Thank you so much

Answer 1

Drew on 25 Oct 2023

Direct link to this answer

https://es.mathworks.com/matlabcentral/answers/2036661-how-to-find-word-error-rate-of-spoken-sentence-for-regression-based-model#answer_1340396

Word Error Rate (WER) is a widely used metric for evaluating Automatic Speech Recognition (ASR). To calculate WER for a visual speech synthesis (VSS) system, you need two word-level transcriptions for each utterance: a reference and a hypothesis. Standard WER alignment is then performed on the two transcriptions, and WER is the number of substituted, deleted, and inserted words divided by the number of words in the reference. The comparison is made on text, so the speech signals themselves do not have to be cut into words by hand.

These transcriptions can be obtained in various ways. For example, the reference transcription might come from the dataset labels. The hypothesis transcription might come from the VSS system itself (if the system has an intermediate representation in words), or from running ASR on the synthesized speech. For a model that maps video directly to a speech signal, the second option is the likely route.

Note that although WER is widely used, it does not capture every aspect of visual speech synthesis quality. Objective measures such as Perceptual Evaluation of Speech Quality (PESQ), as well as subjective listening studies, can be used to assess the system from other perspectives, including audio-visual synchronization, intelligibility, naturalness, and the overall usefulness of the synthesized speech.
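As a rough illustration of the alignment step, here is a minimal sketch of WER computed with a word-level edit distance. It is written in Python only for brevity, and the same logic ports to MATLAB. The function names (`edit_distance`, `word_error_rate`, `corpus_wer`) are hypothetical, the example sentences are merely illustrative of short command-style utterances, and the hypothesis strings are assumed to come from whatever ASR system you run on the synthesized audio. Lowercasing and splitting on whitespace is a simplifying assumption about text normalization.

```python
from typing import Iterable, List, Tuple


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Minimum number of word substitutions, deletions and insertions
    needed to turn the hypothesis into the reference (Levenshtein distance)."""
    # prev[j] is the distance between the reference words processed so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1]


def normalize(text: str) -> List[str]:
    """Assumed normalization: lowercase, split on whitespace."""
    return text.lower().split()


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER for a single utterance: edits / number of reference words."""
    ref = normalize(reference)
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, normalize(hypothesis)) / len(ref)


def corpus_wer(pairs: Iterable[Tuple[str, str]]) -> float:
    """WER over a test set: total edits / total reference words."""
    total_edits = 0
    total_words = 0
    for reference, hypothesis in pairs:
        ref = normalize(reference)
        total_edits += edit_distance(ref, normalize(hypothesis))
        total_words += len(ref)
    if total_words == 0:
        raise ValueError("references must contain at least one word")
    return total_edits / total_words


if __name__ == "__main__":
    # Illustrative strings only; the second one has a single substituted word.
    print(word_error_rate("bin blue at f two now", "bin blue at s two now"))
```

When reporting a single number for a test set, pooling edits and reference words across utterances (as in `corpus_wer`) is usually preferable to averaging per-utterance WER, because short utterances would otherwise be weighted as heavily as long ones. Note also that WER can exceed 1 when the hypothesis contains many inserted words.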

If this answer helps you, please remember to accept the answer.

1 comment

Shilpa Sonawane on 14 Dec 2023

Thank you.
