Data and Modeling in AI-Powered Signal Processing Applications

Chapter 3 Improving Quality and Quantity of Training Data

When is noise in your data a good thing? When it accurately reflects real-world conditions.

For speech and voice applications, large existing data sets are typically recorded under conditions that differ from real application scenarios. If your application needs to recognize a spoken trigger word, it must cope with poor microphones, specific types of reverberation, and background noise.

You can add these and other effects artificially to grow a training data set. Established signal processing methods and domain-specific applications support two approaches:

Data augmentation
Data synthesis

Signals can be difficult to measure or observe consistently enough to build a large data set, so this chapter looks at techniques for creating more training data. Data synthesis creates new signals from models or simulations. Data augmentation is a specific type of data synthesis that creates new variations of your existing data.

The sections below cover each approach in turn, starting with data augmentation.

Data Augmentation

Starting from existing labeled samples, augmentation generates:

Training data that is similar to your high-quality validation data
Variations of the available data that the system may encounter in real-world scenarios

Augmentation effects are often domain specific. Common augmentation effects for audio, speech, and acoustic data include time stretching, pitch shifting, volume control, and many more.
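As a rough illustration of how such effects generate variations of one recording, the sketch below applies a random gain and a random speed change to a test tone. It uses only the Python standard library. The function names, the parameter ranges, and the test tone are assumptions made for illustration. The speed change is a naive resampling that alters duration and pitch together, so it is a simpler stand-in for, not an implementation of, pitch-preserving time stretching.

```python
import math
import random


def apply_gain(signal, gain_db):
    """Volume control: scale a signal by a gain given in decibels."""
    factor = 10.0 ** (gain_db / 20.0)
    return [factor * x for x in signal]


def change_speed(signal, rate):
    """Resample by linear interpolation; rate > 1 shortens the signal.

    This changes duration AND pitch together. It is a simple stand-in,
    not a pitch-preserving time stretch.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if not signal:
        return []
    n_out = max(1, int(round(len(signal) / rate)))
    out = []
    for i in range(n_out):
        pos = i * rate
        lo = int(pos)
        if lo >= len(signal) - 1:
            out.append(signal[-1])  # hold the last sample past the end
        else:
            frac = pos - lo
            out.append((1.0 - frac) * signal[lo] + frac * signal[lo + 1])
    return out


def augment(signal, rng):
    """Create one variation; the ranges below are illustrative assumptions."""
    gain_db = rng.uniform(-6.0, 6.0)
    rate = rng.uniform(0.9, 1.1)
    return apply_gain(change_speed(signal, rate), gain_db)


fs = 8000  # sample rate in Hz (assumed)
tone = [math.sin(2 * math.pi * 440 * n / fs) for n in range(fs)]
rng = random.Random(0)  # seeded so the variations are reproducible
variants = [augment(tone, rng) for _ in range(4)]
```

Each call to `augment` keeps the original label, which is what lets augmentation grow a labeled data set without new recordings.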

Audio examples: kitchen reverberation, washing machine noise.
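Reverberation and background noise, like the kitchen and washing machine cases named above, can be approximated with two standard signal processing steps: convolving the clean signal with a room impulse response, then mixing in noise at a chosen signal-to-noise ratio (SNR). The sketch below is a minimal version under stated assumptions: the impulse response is exponentially decaying random noise rather than a measured room, the noise is white Gaussian rather than a recorded appliance, and all names and values are hypothetical.

```python
import math
import random


def mean_power(x):
    """Average power of a real-valued signal."""
    return sum(v * v for v in x) / len(x)


def convolve(x, h):
    """Direct-form linear convolution (fine for short signals)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def synthetic_room_response(fs, rt60_s, length_s, rng):
    """Decaying noise whose envelope drops 60 dB after rt60_s seconds.

    A crude stand-in for a measured room impulse response.
    """
    n = int(fs * length_s)
    response = [
        rng.gauss(0.0, 1.0) * 10.0 ** (-3.0 * (i / fs) / rt60_s)
        for i in range(n)
    ]
    response[0] = 1.0  # direct path
    return response


def add_noise_at_snr(clean, noise, snr_db):
    """Mix noise into clean so the result has the requested SNR in dB."""
    tiled = [noise[i % len(noise)] for i in range(len(clean))]  # loop the noise
    scale = math.sqrt(
        mean_power(clean) / (mean_power(tiled) * 10.0 ** (snr_db / 10.0))
    )
    return [c + scale * n for c, n in zip(clean, tiled)]


rng = random.Random(1)
fs = 4000  # sample rate in Hz (assumed, kept low so the example runs quickly)
speech_like = [
    math.sin(2 * math.pi * 200 * n / fs) * math.exp(-3.0 * n / fs)
    for n in range(fs // 2)
]
room = synthetic_room_response(fs, rt60_s=0.3, length_s=0.1, rng=rng)
reverberant = convolve(speech_like, room)[: len(speech_like)]
machine_noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]
augmented = add_noise_at_snr(reverberant, machine_noise, snr_db=10.0)
```

Sweeping `rt60_s` and `snr_db` over ranges that match the target environment gives many labeled variations from a single clean recording.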

Synthesis

Data synthesis includes generating training data from scratch using AI generative models, simulations, or a combination of the two.

A few examples of domain-specific data synthesis include:

1: Speech

The text2speech function in MATLAB can help you generate high-quality synthetic voice signals using cloud-based services from IBM®, Microsoft®, or Google®, including Google’s well-known WaveNet network.

MATLAB Central File Exchange entry for the text2speech app from the MathWorks audio toolbox team.

2: Radar

One radar example shows how to classify pedestrians and bicyclists based on their micro-Doppler characteristics using a deep learning network and time-frequency analysis. The movements of the different parts of an object in front of a radar produce micro-Doppler signatures that can be used to identify the object.

Two graphs: one is a plot of bicyclist trajectory, represented in dots forming a person on a bike. The other is a plot of speed on the y-axis versus time on the x-axis.
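To make the idea of a micro-Doppler signature concrete, the sketch below synthesizes labeled radar returns from a single point scatterer whose range combines steady body motion with a sinusoidal limb or pedal oscillation. This is a far simpler stand-in than a full radar simulation. The 24 GHz carrier, the class parameter ranges, and all function names are assumptions chosen for illustration, not values from the example described above.

```python
import cmath
import math
import random

WAVELENGTH = 0.0125  # metres; assumes a 24 GHz carrier (3e8 / 24e9)

# Hypothetical parameter ranges per class (speed in m/s, amplitude in m, rate in Hz).
CLASS_RANGES = {
    "pedestrian": {"speed": (1.0, 2.0), "limb_amp": (0.05, 0.15), "limb_rate": (1.5, 2.5)},
    "bicyclist": {"speed": (3.0, 6.0), "limb_amp": (0.05, 0.10), "limb_rate": (1.0, 1.5)},
}


def simulate_return(v_body, limb_amp, limb_rate, prf, duration):
    """Baseband return exp(-j*4*pi*R(t)/lambda) sampled at the pulse rate."""
    samples = []
    for k in range(int(prf * duration)):
        t = k / prf
        range_m = 10.0 + v_body * t + limb_amp * math.sin(2 * math.pi * limb_rate * t)
        samples.append(cmath.exp(-1j * 4 * math.pi * range_m / WAVELENGTH))
    return samples


def doppler_track(samples, prf):
    """Instantaneous Doppler in Hz from the phase step between pulses."""
    return [
        cmath.phase(b * a.conjugate()) * prf / (2 * math.pi)
        for a, b in zip(samples, samples[1:])
    ]


def make_dataset(n_per_class, rng, prf=4000.0, duration=0.5):
    """Return a list of (signal, label) pairs drawn from CLASS_RANGES."""
    data = []
    for label, r in CLASS_RANGES.items():
        for _ in range(n_per_class):
            signal = simulate_return(
                rng.uniform(*r["speed"]),
                rng.uniform(*r["limb_amp"]),
                rng.uniform(*r["limb_rate"]),
                prf,
                duration,
            )
            data.append((signal, label))
    return data


dataset = make_dataset(3, random.Random(0))
```

With a receding target, the steady component of the track sits at -2v/λ, and the oscillating component traces the periodic pattern that a time-frequency analysis would show. Because every signal is generated from known parameters, the label comes for free.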

3: Communications

Communication signals are also very difficult to record off the air in the field and then label. The WLAN Router Impersonation Detection example simulates realistic signals for RF fingerprinting. Once the algorithm is in place, you can train and test the same system on actual data collected from a software-defined radio.

The figure shows three known routers, as well as the observer that collects non-high-throughput (non-HT) beacon signals and unknown router data.
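One way to picture simulated data for RF fingerprinting is to pass the same known preamble through transmitter-specific impairments and label each received frame with its source. The sketch below assumes a random QPSK preamble, a 20 MHz sample rate, and three hypothetical routers that differ only in carrier frequency offset and DC offset. None of these values or names come from the WLAN example, and a realistic simulation would also model the channel and further hardware effects.

```python
import cmath
import math
import random

FS = 20e6  # sample rate in Hz (assumed)

# Hypothetical per-router impairments: (carrier frequency offset in Hz, DC offset).
ROUTERS = {
    "router_A": (1000.0, 0.01 + 0.02j),
    "router_B": (-2500.0, -0.02 + 0.01j),
    "router_C": (4000.0, 0.015 - 0.01j),
}


def apply_impairments(symbols, cfo_hz, fs, dc_offset, snr_db, rng):
    """Rotate by the frequency offset, add DC offset, then add complex noise.

    The noise level assumes the input symbols have unit average power.
    """
    noise_std = math.sqrt(10.0 ** (-snr_db / 10.0) / 2.0)
    received = []
    for n, s in enumerate(symbols):
        rotated = s * cmath.exp(2j * math.pi * cfo_hz * n / fs)
        noise = complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
        received.append(rotated + dc_offset + noise)
    return received


def make_dataset(preamble, frames_per_router, snr_db, rng):
    """Return (frame, label) pairs, one label per simulated router."""
    data = []
    for label, (cfo_hz, dc_offset) in ROUTERS.items():
        for _ in range(frames_per_router):
            frame = apply_impairments(preamble, cfo_hz, FS, dc_offset, snr_db, rng)
            data.append((frame, label))
    return data


rng = random.Random(0)
# Unit-power QPSK symbols standing in for a known preamble.
preamble = [
    cmath.exp(1j * math.pi / 4 * (2 * rng.randrange(4) + 1)) for _ in range(64)
]
dataset = make_dataset(preamble, frames_per_router=5, snr_db=20.0, rng=rng)
```

A classifier trained on such frames learns the impairment pattern rather than the payload, which is the basic premise of RF fingerprinting.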
