readtable reads a numeric column as strings

When using the function
tr_data = readtable(...)
on the train.csv file attached here, the last column (Weightfirst) is read as strings. However, when creating this CSV (with Python) I made sure that it contained only numerical data; I have double-checked and can confirm that the column contains only float values (and NaNs).
On top of that, when reading test.csv, which I have also attached here, the column is correctly read as numeric. Both test.csv and train.csv come from the same code; the only difference is the data they contain.
I have spent hours trying to understand what differs between the two CSVs, but I can't find anything, and I have no idea why train.csv is not being read correctly.
I have tried running the element-wise function
if ischar(x)
    o = str2double(x);
else
    o = x;
end
But the column still contains only strings. I have read about some similar problems where the solution was to add the offending characters to "TreatAsEmpty"; the problem is that I'm not even sure any particular character is causing the problem here.
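For reference, the conversion attempted above can also be applied to a whole column in one call, because str2double accepts a cell array of character vectors. This is only a sketch: the values in col are hypothetical and stand in for what readtable might return for the column.

```matlab
% Hypothetical stand-in for the column as returned by readtable:
% a cell array of character vectors, with an empty entry in the first row.
col = {''; '80.5'; 'NaN'};

% str2double converts the whole cell array at once; empty or
% non-numeric text becomes NaN.
num = str2double(col);
```

On a table, the result has to be assigned back to the variable, e.g. tr_data.Weightfirst = str2double(tr_data.Weightfirst); otherwise the column keeps its original type.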
Any insights on what's going on here?

4 Comments

Stephen23 on 11 Jun 2018
Edited: Stephen23 on 11 Jun 2018
My suspicion is that the difference is on line two: the last column is empty there, so for some reason readtable possibly treats the entire column as char. Try entering (by hand) one numeric value in the last column of row two of train.csv, and let us know what happens... then hopefully we can figure out a solution.
Luca Rizzello on 11 Jun 2018
Indeed, I manually set the value of the last column of row two to a number and now the column is correctly read as numeric, thanks.
However, I'm going to use this code on multiple CSVs, so modifying them by hand every time would be tedious. Is there any way of doing this automatically?
Stephen23 on 11 Jun 2018
Edited: Stephen23 on 11 Jun 2018
See my answer below.
Luca Rizzello on 11 Jun 2018
Edited: Luca Rizzello on 11 Jun 2018
Alright, I've added some code to set the format manually and now it works. I might send a bug report later.
Thanks a lot for your help; if you want to add that solution as an answer, I'll accept it.
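Setting the format manually could look roughly like the sketch below. It is not the code actually used in this thread: it assumes every column in the file is numeric, and it builds a small stand-in file with made-up values (the file name train_demo.csv and the Id and Height columns are hypothetical; only Weightfirst comes from the question).

```matlab
% Build a tiny stand-in for train.csv (hypothetical data): the first data
% row has an empty last field, like the file discussed above.
fname = fullfile(tempdir, 'train_demo.csv');
fid = fopen(fname, 'w');
fprintf(fid, 'Id,Height,Weightfirst\n1,1.70,\n2,1.82,80.5\n3,1.65,NaN\n');
fclose(fid);

% Count the columns from the header line.
fid = fopen(fname, 'r');
header = fgetl(fid);
fclose(fid);
numberOfColumns = numel(strsplit(header, ','));

% One %f per column asks readtable to parse every column as double,
% so no type detection is involved.
fmt = repmat('%f', 1, numberOfColumns);
tr_data = readtable(fname, 'Format', fmt, 'Delimiter', ',');
```

If a file also contains text columns, the corresponding %f entries would need to be replaced with a text specifier such as %s or %q.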


Accepted Answer

Stephen23 on 11 Jun 2018
"any way of doing this automatically?"
I would start by setting the format. You could do this manually via the 'Format' option of readtable, e.g.
repmat('%f',1,numberOfColumns)
or call detectImportOptions and see what options you have there...
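The detectImportOptions route might look like the following sketch. It creates a small stand-in file with made-up values (the file name train_demo2.csv and the Id and Height columns are hypothetical; only Weightfirst comes from the question) and then overrides the detected type of that one column with setvartype.

```matlab
% Build a tiny stand-in for train.csv (hypothetical data): the first data
% row has an empty last field.
fname = fullfile(tempdir, 'train_demo2.csv');
fid = fopen(fname, 'w');
fprintf(fid, 'Id,Height,Weightfirst\n1,1.70,\n2,1.82,80.5\n3,1.65,NaN\n');
fclose(fid);

% Let MATLAB detect the import options, then force the type of the
% problematic column instead of relying on the detected one.
opts = detectImportOptions(fname);
opts = setvartype(opts, 'Weightfirst', 'double');

% Read the file using the adjusted options.
tr_data = readtable(fname, opts);
```

Inspecting opts.VariableTypes before calling setvartype shows which type was detected for each column, which may help when checking several CSV files in a loop.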
I think this is worth making a bug report for.

Release

R2018a

Asked: 11 Jun 2018

Edited: 11 Jun 2018
