Reading .txt file in MATLAB with issue in formatting

Tulkkas on 23 Feb 2022
Answered: Jeremy Hughes on 24 Feb 2022
I am using the readtable function in MATLAB R2021b to read the following text file:
ISSUERID|FISCAL_YEAR|FIELD_ID|VALUE|PUBLISHED_DATE|SOURCE|DATA_TYPE|ADDITIONAL_INFO
IID000000002137286||DIVERSITY_DISCLOSURE_ETHNICITY_SOURCE|"https://www.cubesmart.com/about-us/corporate-responsibility/\""||{}||{}
The separator is the | (bar) character. As you can see, the "https://www.cubesmart.com/about-us/corporate-responsibility/\"" field value ends with a \" sequence, which breaks the import. I am trying to use the 'Whitespace' option to ignore it, but for some reason it does not work. The code I am running is:
T_equ = readtable(file_name, 'FileType', 'text', 'Delimiter', {'|'}, 'Whitespace', '\"');
where file_name is just the path to the .txt file.
The result of the import is an empty table. I could understand this if \" were being read as a special character, but as far as I can tell the 'Whitespace', '\"' name-value pair should force readtable to ignore it. What am I missing here?
3 comments
Tulkkas on 23 Feb 2022
I did try with a double slash, but it does not work either. How would you read the text without interpreting the formatting, and then do the parsing?
Rik on 23 Feb 2022
For example with my readfile function (which you can get from the FEX or with the AddOn manager), or with the readlines function.
You could use the split function to split based on the | character (or even use regexp).
The result will not be a table yet, but it should be easy to convert it to what you need.
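As a rough sketch of the readlines and split route, assuming every row has the same number of |-separated fields as the header and that no field contains a | of its own: the temporary file below only holds the two sample lines from the question so the snippet can run on its own, and the variable names (rawLines, header, data, T) are just placeholders.

```matlab
% Write the two sample lines from the question to a temporary file so the
% sketch is self-contained; in practice file_name is the path to your own file.
file_name = [tempname '.txt'];
fid = fopen(file_name, 'w');
fprintf(fid, '%s\n', 'ISSUERID|FISCAL_YEAR|FIELD_ID|VALUE|PUBLISHED_DATE|SOURCE|DATA_TYPE|ADDITIONAL_INFO');
fprintf(fid, '%s\n', 'IID000000002137286||DIVERSITY_DISCLOSURE_ETHNICITY_SOURCE|"https://www.cubesmart.com/about-us/corporate-responsibility/\""||{}||{}');
fclose(fid);

% Read the file as raw text: one string per line, no quote handling at all.
rawLines = readlines(file_name);
rawLines = rawLines(strlength(rawLines) > 0);   % drop empty lines (e.g. after a trailing newline)

% The first line holds the column names.
header = split(rawLines(1), "|")';              % 1-by-N string row
nCols  = numel(header);

% Split every remaining line on the bar character.
data = strings(numel(rawLines) - 1, nCols);
for k = 2:numel(rawLines)
    data(k - 1, :) = split(rawLines(k), "|")';  % errors if a row has a different field count
end

% Convert the string matrix to a table; every column is still text.
T = array2table(data, 'VariableNames', header);
```

With this approach the quotes and the backslash stay in the VALUE column as literal characters, so you would still need to strip them and convert any numeric columns yourself.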

Answers (1)

Jeremy Hughes on 24 Feb 2022
The issue is that \" is not how CSV files (and thus readtable) escape double quotes. To escape a quote inside a quoted field, the file should use "".
Like this:
X|Y|Z|"And something in ""quotes""."
Otherwise, readtable will keep reading after \"" until it finds a lone double-quote character. I would guess that's what you're seeing.
The only way I can think of to resolve this is to read the file, replace \" with "", and then write the data back out. There's no way to get readtable to treat \" as an escaped quote.
txt = fileread(fn);              % fn is the path to the .txt file
txt = replace(txt,'\"','""');    % turn \" into the CSV-style escaped quote ""
fid = fopen(fn,'w');             % or use a new file name if you don't want to overwrite it
fwrite(fid,txt);
fclose(fid);
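For anyone who would rather not overwrite the original, here is a sketch of the same idea that writes the cleaned text to a separate file and then calls readtable on it. The sample file and the name fixed_fn are made up for illustration, and what the resulting table looks like (variable types, how the empty fields come out) depends on readtable's import detection, so treat it as a starting point rather than a verified result.

```matlab
% Recreate the sample from the question in a temporary file; in practice fn
% is the path to your own .txt file.
fn = [tempname '.txt'];
fid = fopen(fn, 'w');
fprintf(fid, '%s\n', 'ISSUERID|FISCAL_YEAR|FIELD_ID|VALUE|PUBLISHED_DATE|SOURCE|DATA_TYPE|ADDITIONAL_INFO');
fprintf(fid, '%s\n', 'IID000000002137286||DIVERSITY_DISCLOSURE_ETHNICITY_SOURCE|"https://www.cubesmart.com/about-us/corporate-responsibility/\""||{}||{}');
fclose(fid);

% Replace the backslash-escaped quote with the doubled quote readtable expects.
txt = fileread(fn);
txt = replace(txt, '\"', '""');

% Write the cleaned text to a new file (hypothetical name) and leave fn untouched.
fixed_fn = [tempname '.txt'];
fid = fopen(fixed_fn, 'w');
assert(fid ~= -1, 'Could not open %s for writing.', fixed_fn);
fwrite(fid, txt);
fclose(fid);

% Same call as in the question, minus the 'Whitespace' option.
T_equ = readtable(fixed_fn, 'FileType', 'text', 'Delimiter', '|');
```

After the replacement the VALUE field reads "https://www.cubesmart.com/about-us/corporate-responsibility/""", that is, a quoted field whose content ends in one escaped double quote.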

Release

R2021b