AI Speech vs Human Speech

Question

0 votes

Is it possible to use matlab to detect whether a human or AI voice is talking? If so, can someone give me links to assist.

3 comments

Walter Roberson on 17 Apr 2019

Not if it is a sufficiently good AI program.

But until then:

Synthesized speech is usually cleaner (less noise) than human speech.
Synthesized speech usually says the same word the same way each time. Human speech seldom does.
Human speech does much more blending: modification of the initial sounds of a word depending on the sounds at the end of the previous word. Some of this is just that smooth movement between sounds is easier than sudden movement, but humans also tend to modify the sounds themselves, in ways that you can notice if you really listen but which you might have trouble expressing.
If you can get the voice to say "Merry Mary, marry", and you can clearly understand which word is which, then probably it is AI. If two of the words come out exactly the same, then probably it is AI. If some of the words come out almost but not quite exactly the same and you have trouble saying what the difference is, then the voice might be human. (There are large regional differences in how the words get said, but it takes speech synthesis to make them exactly the same.)
Try it on homographs (words that are spelled the same but pronounced differently). For example, I recently told Alexa to play one of Elton John's albums, and it said that it was going to play "Live in Australia" with a short i (the verb form, as in "I live in Canada") instead of the long i form (as in "filmed in front of a live audience").

Brantley on 17 Apr 2019

How would you use matlab to determine if the AI or human is talking?

Walter Roberson on 17 Apr 2019

The first two items I posted are obviously actionable:

Measure noise in the signal. More noise would tend to imply human.
Find copies of the same word and compare them to see how similar they are. You might use mfcc to recognize words and, once they are recognized, isolate them from the stream and compare them with xcorr. High cross-correlation makes it more likely that the voice is AI. You might also have a look at dynamic time warping: the less warping that is needed, the more likely it is that the voice is AI-generated, since AI is less likely to have micro-changes in timing.
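
To make the two checks above concrete, here is a minimal, language-neutral sketch written in plain Python rather than MATLAB, so treat it as an illustration of the arithmetic and not as a tested detector. The function names (estimate_snr_db, max_normalized_xcorr, dtw_warp) and the frame length are hypothetical choices for this example. The inputs are assumed to be non-empty lists of samples for words that have already been isolated. The noise estimate is deliberately crude: it just compares the loudest frames with the quietest ones.

```python
import math

def estimate_snr_db(signal, frame_len=160):
    """Crude SNR estimate: loudest 10% of frames vs quietest 10%, in dB."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        mean_sq = sum(s * s for s in frame) / frame_len
        energies.append(10.0 * math.log10(mean_sq + 1e-12))  # avoid log(0)
    if not energies:
        raise ValueError("signal is shorter than one frame")
    energies.sort()
    k = max(1, len(energies) // 10)
    noise_floor = sum(energies[:k]) / k   # quietest frames: assumed noise only
    speech_level = sum(energies[-k:]) / k  # loudest frames: assumed speech
    return speech_level - noise_floor

def max_normalized_xcorr(a, b):
    """Peak of the normalized cross-correlation over all lags (1.0 = same shape)."""
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if norm == 0.0:
        return 0.0
    best = 0.0
    for lag in range(-(len(b) - 1), len(a)):
        total = 0.0
        for j, y in enumerate(b):
            i = j + lag
            if 0 <= i < len(a):
                total += a[i] * y
        best = max(best, total / norm)
    return best

def dtw_warp(a, b):
    """Dynamic time warping: return (distance, number of non-diagonal path steps)."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Walk the optimal path backwards; every non-diagonal step is a timing warp.
    i, j, off_diagonal = n, m, 0
    while i > 0 and j > 0:
        diag, up, left = cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1]
        if diag <= up and diag <= left:
            i, j = i - 1, j - 1
        elif up <= left:
            i, off_diagonal = i - 1, off_diagonal + 1
        else:
            j, off_diagonal = j - 1, off_diagonal + 1
    return cost[n][m], off_diagonal
```

Under these assumptions, a higher estimate_snr_db, a max_normalized_xcorr close to 1 between two utterances of the same word, and a small off-diagonal count from dtw_warp would each lean toward "synthesized". Any thresholds would have to be tuned on real recordings. In practice the comparison would more likely run on feature vectors such as MFCCs than on raw samples.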

Answer 1

Gagan Agarwal on 30 May 2024

0 votes

Hi Brantley

Yes, it's possible to use MATLAB to detect whether a sound is produced by a human or an AI-generated voice. This task falls under the broader category of audio analysis and machine learning.

Here's a high-level overview of how you might approach this problem:

Collect a dataset that includes both human and AI-generated voices. The dataset should be large and diverse enough to train a robust model.
Audio data generally requires preprocessing before it can be used to train a model. This might involve converting the audio files to a uniform format, resampling them to a common sampling rate, and so on.
Choose a deep learning model to train.
After training, evaluate the performance of the model.
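
As a rough illustration of the preprocessing and evaluation steps above, here is a minimal sketch in plain Python rather than MATLAB. Everything in it is an assumption made for the example: the helper names (resample_linear, peak_normalize, confusion_counts, accuracy), the "ai" and "human" label strings, and the use of linear interpolation, which has no anti-aliasing filter and only stands in for a proper resampler. Model selection and training are left out because the answer does not specify them.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample by linear interpolation (illustration only: no anti-aliasing)."""
    if not samples:
        return []
    n_out = max(1, int(round(len(samples) * dst_rate / src_rate)))
    last = len(samples) - 1
    out = []
    for k in range(n_out):
        pos = k * src_rate / dst_rate      # position in the source signal
        i = min(int(pos), last)
        frac = pos - int(pos)
        nxt = samples[min(i + 1, last)]    # hold the last sample at the end
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out

def peak_normalize(samples):
    """Scale so the largest absolute sample is 1.0 (silence is left unchanged)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    return [s / peak for s in samples]

def confusion_counts(true_labels, predicted_labels, positive="ai"):
    """Count true/false positives and negatives, treating `positive` as the target."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(true_labels, predicted_labels):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts

def accuracy(counts):
    """Fraction of clips classified correctly."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else 0.0
```

For example, confusion_counts(["ai", "ai", "human", "human"], ["ai", "human", "human", "ai"]) counts one clip in each cell, so accuracy is 0.5. Reporting the false positives and false negatives separately is usually more informative than accuracy alone, especially when the dataset holds more human clips than AI clips, or the reverse.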

I hope it helps!
