Imported data becomes NaN

155 visualizaciones (últimos 30 días)
Robin Nilsson
Robin Nilsson el 17 de Oct. de 2022
Comentada: dpb el 17 de Oct. de 2022
I'm trying to import data from an excel document with many rows and columns.
When I import it in any way Matlab gives Nan.
Ex. 1. I import from excel as Table (also tried column vectors, column vectors and numeric matrix), If I open it in the workspace I see numbers. When loading in matlab command window it becomes NaN all of it.
Ex 2. I save the excel into txt files and other various file types and try to use "loadtable", "readcell", "load" but it becomes "NaN" or "1x1 missing".
Does anyone have an idea of what I'm doing wrong?
Update, I can now see the reason for my troubles are that the imported file gives many NaN and I didn't realise I had so many NaN in the end. All columns are not the same length. Now I just need to remove remove all NaN. I assume I need to take each column and remove nan for that column.
  4 comentarios
Stephen23
Stephen23 el 17 de Oct. de 2022
@Robin Nilsson: upload the actual data file by clicking the paperclip button.
Robin Nilsson
Robin Nilsson el 17 de Oct. de 2022
Sure. Here it is.

Iniciar sesión para comentar.

Respuesta aceptada

dpb
dpb el 17 de Oct. de 2022
opt=detectImportOptions(websave('CA_dvs.xlsx','https://www.mathworks.com/matlabcentral/answers/uploaded_files/1159783/CA_dvs.xlsx'));
tCA=readtable(websave('CA_dvs.xlsx','https://www.mathworks.com/matlabcentral/answers/uploaded_files/1159783/CA_dvs.xlsx'));
head(tCA)
CA10 CA10abs CA20 CA20abs CA30 CA30abs CA40 CA40abs CA50 CA50abs CA60 CA60abs CA70 CA70abs CA80 CA80abs CA70d CA70absd CA60d CA60absd CA50d CA50absd CA40d CA40absd CA30d CA30absd CA20d CA20absd CA10d CA10absd CA0d CA0absd ____ _________ ____ _______ ____ _______ ____ __________ ____ _________ ____ _______ ____ _______ ____ _______ _____ ________ _____ _________ _____ ________ _____ ________ _____ ________ _____ ________ _____ ________ ____ _______ 10 NaN 20 NaN 30 NaN 40 NaN 50 NaN 60 NaN 70 NaN 80 NaN 70 NaN 60 NaN 50 NaN 40 NaN 30 NaN 20 NaN 10 NaN 0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.02 -0.001441 0.02 0 0.02 0 0.02 0 0.02 0 0.01 0 0.02 0 0.02 0 0.02 0 0.01 0 0.02 0 0.02 0 0.02 0 0.01 0 0.02 0 0.01 0 0.04 0 0.04 0 0.03 0 0.04 -0.0014059 0.03 0 0.03 0 0.04 0 0.03 0 0.04 0 0.03 0 0.03 0 0.03 0 0.04 0 0.03 0 0.04 0 0.03 0 0.05 0 0.05 0 0.05 0 0.05 0 0.05 0 0.05 0 0.05 0 0.05 0 0.06 0 0.05 0 0.05 0 0.05 0 0.06 0 0.05 0 0.06 0 0.05 0 0.07 -0.001441 0.07 0 0.07 0 0.07 0 0.07 -0.001395 0.07 0 0.07 0 0.07 0 0.07 0 0.06 0 0.07 0 0.07 0 0.08 0 0.07 0 0.07 0 0.07 0 0.09 -0.001441 0.09 0 0.09 0 0.09 0 0.09 -0.001395 0.08 0 0.09 0 0.09 0 0.09 0 0.08 0.0013491 0.08 0 0.09 0 0.09 0 0.08 0 0.09 0 0.08 0
Looks ok excepting more than likely there are additional header lines in the file besides the first...let's see what the file really, really contains...
CA=readcell(websave('CA_dvs.xlsx','https://www.mathworks.com/matlabcentral/answers/uploaded_files/1159783/CA_dvs.xlsx'));
CA(1:5,:)
ans = 5×32 cell array
{'CA10' } {'CA10abs'} {'CA20' } {'CA20abs'} {'CA30' } {'CA30abs'} {'CA40' } {'CA40abs'} {'CA50' } {'CA50abs'} {'CA60' } {'CA60abs'} {'CA70' } {'CA70abs'} {'CA80' } {'CA80abs'} {'CA70d' } {'CA70absd'} {'CA60d' } {'CA60absd'} {'CA50d' } {'CA50absd'} {'CA40d' } {'CA40absd'} {'CA30d' } {'CA30absd'} {'CA20d' } {'CA20absd'} {'CA10d' } {'CA10absd'} {'CA0d' } {'CA0absd'} {[ 10]} {'%' } {[ 20]} {'%' } {[ 30]} {'%' } {[ 40]} {'%' } {[ 50]} {'%' } {[ 60]} {'%' } {[ 70]} {'%' } {[ 80]} {'%' } {[ 70]} {'%' } {[ 60]} {'%' } {[ 50]} {'%' } {[ 40]} {'%' } {[ 30]} {'%' } {[ 20]} {'%' } {[ 10]} {'%' } {[ 0]} {'%' } {'Time' } {'Abs [%]'} {'Time' } {'Abs [%]'} {'Time' } {'Abs [%]'} {'Time' } {'Abs [%]'} {'Time' } {'Abs [%]'} {'Time' } {'Abs [%]'} {'Time' } {'Abs [%]'} {'Time' } {'Abs [%]'} {'Time' } {'Abs [%]' } {'Time' } {'Abs [%]' } {'Time' } {'Abs [%]' } {'Time' } {'Abs [%]' } {'Time' } {'Abs [%]' } {'Time' } {'Abs [%]' } {'Time' } {'Abs [%]' } {'Time' } {'Abs [%]'} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[ 0]} {[0.0200]} {[-0.0014]} {[0.0200]} {[ 0]} {[0.0200]} {[ 0]} {[0.0200]} {[ 0]} {[0.0200]} {[ 0]} {[0.0100]} {[ 0]} {[0.0200]} {[ 0]} {[0.0200]} {[ 0]} {[0.0200]} {[ 0]} {[0.0100]} {[ 0]} {[0.0200]} {[ 0]} {[0.0200]} {[ 0]} {[0.0200]} {[ 0]} {[0.0100]} {[ 0]} {[0.0200]} {[ 0]} {[0.0100]} {[ 0]}
And, indeed, that's what we see -- there's another pair of lines prior to the actual data that is metadata apparently; one has to recognize what is in the file one is trying to read and adjust expectations to fit reality.
In this case, I'd suggest modifying the optons object slightly first before importing -- since the percentages are embedded in the first header row, one can extract them later and then just begin reading the real data after skipping the additional two header lines. That would look like--
opt.DataRange='A4'; % keeping the opt object already read, just fixup where data starts
tCA=readtable(websave('CA_dvs.xlsx','https://www.mathworks.com/matlabcentral/answers/uploaded_files/1159783/CA_dvs.xlsx'),opt);
head(tCA)
CA10 CA10abs CA20 CA20abs CA30 CA30abs CA40 CA40abs CA50 CA50abs CA60 CA60abs CA70 CA70abs CA80 CA80abs CA70d CA70absd CA60d CA60absd CA50d CA50absd CA40d CA40absd CA30d CA30absd CA20d CA20absd CA10d CA10absd CA0d CA0absd ____ _________ ____ _______ ____ _______ ____ __________ ____ _________ ____ _______ ____ _______ ____ _______ _____ ________ _____ _________ _____ __________ _____ ________ _____ ________ _____ ________ _____ ________ ____ _______ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.02 -0.001441 0.02 0 0.02 0 0.02 0 0.02 0 0.01 0 0.02 0 0.02 0 0.02 0 0.01 0 0.02 0 0.02 0 0.02 0 0.01 0 0.02 0 0.01 0 0.04 0 0.04 0 0.03 0 0.04 -0.0014059 0.03 0 0.03 0 0.04 0 0.03 0 0.04 0 0.03 0 0.03 0 0.03 0 0.04 0 0.03 0 0.04 0 0.03 0 0.05 0 0.05 0 0.05 0 0.05 0 0.05 0 0.05 0 0.05 0 0.05 0 0.06 0 0.05 0 0.05 0 0.05 0 0.06 0 0.05 0 0.06 0 0.05 0 0.07 -0.001441 0.07 0 0.07 0 0.07 0 0.07 -0.001395 0.07 0 0.07 0 0.07 0 0.07 0 0.06 0 0.07 0 0.07 0 0.08 0 0.07 0 0.07 0 0.07 0 0.09 -0.001441 0.09 0 0.09 0 0.09 0 0.09 -0.001395 0.08 0 0.09 0 0.09 0 0.09 0 0.08 0.0013491 0.08 0 0.09 0 0.09 0 0.08 0 0.09 0 0.08 0 0.11 -0.001441 0.1 0 0.1 0 0.11 0 0.1 -0.001395 0.1 0 0.11 0 0.1 0 0.11 0 0.1 0.0013491 0.1 -0.0013611 0.1 0 0.11 0 0.1 0 0.11 0 0.1 0 0.12 -0.001441 0.12 0 0.12 0 0.12 0 0.12 -0.001395 0.12 0 0.12 0 0.12 0 0.13 0 0.12 0.0013491 0.12 -0.0013611 0.12 0 0.13 0 0.12 0 0.12 0 0.12 0
And, voila! joy ensues...to get the percentages for each pair of observations is easy enough...
pct=str2double(extract(tCA.Properties.VariableNames(1:2:end),digitsPattern))
pct = 1×16
10 20 30 40 50 60 70 80 70 60 50 40 30 20 10 0
  1 comentario
Robin Nilsson
Robin Nilsson el 17 de Oct. de 2022
Thank you for the very detailed help. I noticed that my columns for the different data columns are different length which causes allot of empty NaN which was what I was observing along with the text NaN in the first rows.

Iniciar sesión para comentar.

Más respuestas (1)

Kevin Holly
Kevin Holly el 17 de Oct. de 2022
I would suggest using the Import Data button found on the toolstrip.
After selecting the Excel spreadsheet to open, it should give you a preview and show what cells are unimportable. By default these cells are converted to NaN values. This could help you resolve the problem.
  3 comentarios
Robin Nilsson
Robin Nilsson el 17 de Oct. de 2022
Thank you, I realised that I have many NaN due to some of the columns being much longer than the others.
dpb
dpb el 17 de Oct. de 2022
The import options object has ways to handle those as well; you can either use the default missing value and get the NaN or have any records with missing data skipped entirely, or a plethora of other options.

Iniciar sesión para comentar.

Productos

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by