How to fill in missing data?

J K on 26 Nov 2016
Answered: Star Strider on 7 Oct 2024
Hello everybody,
I have a dataset (a txt file) that contains some missing values (represented by 0). I would like to replace all of these zeros with numbers. How can I do this?

Answers (2)

Ayush on 7 Oct 2024
Hi,
One way to replace missing values currently represented as 0 is to use interpolation. Start by identifying the indices of the zeros in each column, then use the "interp1" function to estimate values at those indices from the non-zero entries. The code below shows the idea:
% Step 1: Load the data
data = readmatrix('your_dataset.txt');

% Step 2: Interpolate to replace zeros, one column at a time
x = (1:size(data, 1))';   % Row indices, used as the sample points
for col = 1:size(data, 2)
    y = data(:, col);     % Data values in this column
    % Find the non-zero (known) and zero (missing) elements
    nonZeroIndices = y ~= 0;
    zeroIndices = y == 0;
    % interp1 needs at least two known points to interpolate
    if nnz(nonZeroIndices) >= 2
        % Estimate the missing values from the known ones;
        % 'extrap' also fills zeros at the start or end of the column
        yInterpolated = interp1(x(nonZeroIndices), y(nonZeroIndices), x(zeroIndices), 'linear', 'extrap');
        % Replace zeros with interpolated values
        y(zeroIndices) = yInterpolated;
    end
    % Update the column in the dataset
    data(:, col) = y;
end

% Step 3: Save the modified data back to a file (optional)
writematrix(data, 'modified_dataset.txt');
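To see what the interpolation step does, here is a minimal sketch on a single made-up column. The values and the variable name isMissing are illustrative assumptions, not taken from the original dataset:

```matlab
% Toy column vector; zeros mark the missing samples (made-up values)
y = [2; 0; 6; 0; 0; 12];
x = (1:numel(y))';          % Row indices used as sample points
isMissing = (y == 0);       % Logical mask of the entries to fill

% Linearly interpolate at the missing positions from the known ones
y(isMissing) = interp1(x(~isMissing), y(~isMissing), x(isMissing), 'linear', 'extrap');

disp(y')                    % Show the filled vector as a row
```

Each zero is replaced by the value on the straight line joining its nearest non-zero neighbours. Note that this treats every 0 as missing, so it is only appropriate if 0 is never a genuine measurement in your data.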
For more information on the "interp1" function, please refer to its MATLAB documentation.

Star Strider on 7 Oct 2024
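If you have R2016b or later, use the fillmissing function (introduced in R2016b). Note that the Name=Value argument syntax used here requires R2021a or later; in earlier releases, pass the options as 'Name',Value pairs instead: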
T1 = array2table(randi([0 9],10,5)) % Original
T1 = 10x5 table
    Var1    Var2    Var3    Var4    Var5
    ____    ____    ____    ____    ____
     2       8       7       7       1
     8       7       5       6       7
     8       4       4       5       3
     4       3       3       1       3
     4       6       5       5       7
     0       4       7       0       4
     8       2       2       7       2
     9       7       3       0       9
     5       0       5       8       3
     2       2       6       3       4

loc = table2array(T1) == 0 % Logical Matrix Of '0' Locations
loc = 10x5 logical array
    0   0   0   0   0
    0   0   0   0   0
    0   0   0   0   0
    0   0   0   0   0
    0   0   0   0   0
    1   0   0   1   0
    0   0   0   0   0
    0   0   0   1   0
    0   1   0   0   0
    0   0   0   0   0

T1 = fillmissing(T1, 'linear', MissingLocations=loc, EndValues='nearest') % 'linear' Interpolation Of Missing Values With 'nearest' For End Values
T1 = 10x5 table
    Var1    Var2    Var3    Var4    Var5
    ____    ____    ____    ____    ____
     2       8       7       7       1
     8       7       5       6       7
     8       4       4       5       3
     4       3       3       1       3
     4       6       5       5       7
     6       4       7       6       4
     8       2       2       7       2
     9       7       3       7.5     9
     5       4.5     5       8       3
     2       2       6       3       4
There are several methods to fill (interpolate) the missing values. See the documentation for those and other options.
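As a rough illustration of how the fill method changes the result, the sketch below first converts the 0 placeholders to NaN with standardizeMissing and then applies three of the fillmissing methods. The matrix A and the output variable names are made up for this example:

```matlab
% Toy matrix; zeros mark missing entries (made-up values)
A = [1 10; 0 20; 3 0; 0 40; 5 50];

% Convert the 0 placeholders to NaN so fillmissing treats them as missing
A = standardizeMissing(A, 0);

filledLinear   = fillmissing(A, 'linear');    % Straight line between neighbours
filledPrevious = fillmissing(A, 'previous');  % Carry the last known value forward
filledNext     = fillmissing(A, 'next');      % Use the next known value

disp(filledLinear)
disp(filledPrevious)
disp(filledNext)
```

Converting to NaN first makes the missing entries explicit, so no MissingLocations argument is needed. Which method is suitable depends on the data: 'linear' assumes the values vary smoothly between rows, while 'previous' and 'next' simply repeat a neighbouring value.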