Import of tables from R where the first line describing the column names is one element shorter

Many files exported from R look something like this:
"var1" \t "var2"
"row1" \t val1 \t val2
"row2" \t val2 \t val2
The problem is that the line describing the variables is one element shorter than the data lines, which readtable doesn't like much. Is there any way I can make that work? Editing the input file by changing the first row to
\t "var1" \t "var2"
fixes the problem.
I'm trying to read it with the line
f = readtable(filename, 'ReadVariableNames',true, 'ReadRowNames', true, 'Delimiter', '\t');
This should be a standard thing, but I just cannot make it work. I don't want to have to edit the input files all the time.

Accepted Answer

Guillaume on 15 Sep 2019
Edited: Guillaume on 15 Sep 2019
Yes, readtable expects the variable name line to have a placeholder (DimensionName) for the row name column. I suggest you raise an enhancement request with MathWorks.
Here is a roundabout way to get it to work:
% First grab the variable names. MATLAB should add an extra variable name at the end of the list to match the number of data columns.
% Ignore row names for now.
opts = detectImportOptions(yourfile, 'ReadVariableNames', true, 'VariableNamesLine', 1);
varnames = opts.VariableNames;
% Then tell MATLAB that there are row names. That messes up the variable names, so take them from the previous opts.
opts = detectImportOptions(yourfile, 'ReadVariableNames', true, 'VariableNamesLine', 1, 'ReadRowNames', true);
opts.VariableNames = ['RowNames', varnames(1:end-1)]; % Still need a name for the row-name column.
opts.DataLines = [2, Inf]; % That is also messed up, so point it back at line 2 onwards.
result = readtable(yourfile, opts)
It works on the file I've tested, but because of the complex heuristics of detectImportOptions it may break on more complex files.
Tested on R2019b. Not sure how it behaves with R2016b, where detectImportOptions may not be as sophisticated.

More Answers (1)

Johan Gustafsson on 15 Sep 2019
Thanks, but this does not work for me. I suspect that the ReadVariableNames parameter of detectImportOptions is something that comes with R2019b; is that so? I tried upgrading to R2018b, but it didn't help. I get the following error:
Error using detectImportOptions
'ReadVariableNames' is not a recognized parameter. For a list of valid name-value pair arguments, see the
documentation for detectImportOptions.
Is there another trick I could use? I was thinking I could do something using fgetl and a regexp, but that seems like a messy way to do it.
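
For reference, the fgetl idea mentioned above can be kept fairly short. The following is a minimal sketch, not tested on R2016b or R2018b: it reads the header line by itself, then lets readtable read the data lines with the header skipped, and finally assigns the names. It relies only on readtable's 'HeaderLines', 'ReadVariableNames' and 'ReadRowNames' parameters rather than on detectImportOptions. The sample file contents and the variable names (filename, headerLine, varnames, result) are assumptions made for illustration.

```matlab
% Write a small sample file in the R layout (hypothetical data, illustration only):
% the header has two fields, each data line has three.
filename = [tempname, '.txt'];
fid = fopen(filename, 'w');
fprintf(fid, '"var1"\t"var2"\n"row1"\t1\t2\n"row2"\t3\t4\n');
fclose(fid);

% 1) Read the header line on its own.
fid = fopen(filename, 'r');
assert(fid ~= -1, 'Could not open %s', filename);
headerLine = fgetl(fid);
fclose(fid);

% 2) Split on tabs, then strip whitespace and the surrounding double quotes.
varnames = strtrim(strsplit(headerLine, '\t', 'CollapseDelimiters', false));
varnames = regexprep(varnames, '^"|"$', '');
% Older releases require valid MATLAB identifiers as table variable names.
varnames = matlab.lang.makeValidName(varnames);

% 3) Read the data lines only; the first column becomes the row names.
result = readtable(filename, 'FileType', 'text', 'Delimiter', '\t', ...
    'HeaderLines', 1, 'ReadVariableNames', false, 'ReadRowNames', true);

% 4) Attach the names from step 1, after checking that the counts agree.
assert(numel(varnames) == width(result), ...
    'Header has %d names but the data has %d columns.', ...
    numel(varnames), width(result));
result.Properties.VariableNames = varnames;
```

Because the header is parsed separately, readtable never sees the short line, so no placeholder has to be added to the file. If a header name is not a valid identifier, matlab.lang.makeValidName will alter it, which may or may not be what you want.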

Release

R2016b
