Fixed-width data import with textread()
Nikolay Rodionov
on 21 Dec 2013
Hi guys,
Thanks for your help in advance. I'm having issues importing some data using textread(). I am trying to use it to import a fixed-width txt dataset, but the function is blending some data columns when there is no whitespace between string and numerical datatypes.
Specifically, I am trying to import a data extract from a gromacs .gro file. You can find information about the data structure here: http://manual.gromacs.org/online/gro.html
My code:
[resnum resname atomname atomnr x y z a b c] ...
= textread('conf-mod.gro','%5d%5s%-5s%5d%8.3f%8.3f%8.3f%8.4f%8.4f%8.4f');
It works for data lines like:
    1ALA     CA    3  56.249  52.119  83.467  0.0000  0.0000  0.0000
But it blends 'atomname' and 'atomnr' data into the 'atomname' column for lines like this:
14195ASP    OD119731  55.954  54.890  95.494  0.0000  0.0000  0.0000
Note: 'atomname' should equal OD here, and 'resname' comes out fine, strangely enough. I don't understand why this is happening, because I've clearly specified the fixed-width format of the dataset. I've tried changing %-5s to %-5c, but it did not help.
Any suggestions?
Accepted Answer
dpb
on 21 Dec 2013
Edited: dpb
on 21 Dec 2013
Unfortunately, you can't do it with the standard MATLAB I/O format strings; they just don't honor fixed-width fields unless adjacent fields are separated by at least one blank. Sad and pathetic and, IMO, utterly unacceptable, but that's just the way it is.
If you can't write the data files in another format that has delimiters, you've one of several choices...
a) read the whole file as a character array, do character substitution to insert delimiters, and then parse the modified array (with textscan, say),
b) read a line at a time and parse the individual fields with sscanf or the like. Something like
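l=fgetl(fid); % read one line as a character vector
resnum=[resnum;sscanf(l(1:5),'%d')]; % columns 1-5: residue number
resname=[resname;{strtrim(l(6:10))}]; % columns 6-10: residue name, kept in a cell array so lengths may differ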
...etc., ...
When you get past the character data you can then use an array and parse the six numeric fields together. Or, lastly,
c) see if regular expressions will actually honor a field width. I'm not conversant enough with them to know off the top of my head...
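On option (c): a regular expression can count characters, because a quantified group such as (.{5}) matches exactly five characters, blanks included. The following is only a sketch under stated assumptions, not a tested importer. The sample line is rebuilt with sprintf to the widths 5, 5, 5, 5, 8, 8, 8 from the format string in the question, and the names pat, tok and xyz are made up for illustration. Under those widths the third field comes out as 'OD1' and the fourth as 19731.

```matlab
% Rebuild the problem record at the assumed fixed widths so the spacing is exact.
l = sprintf('%5d%-5s%5s%5d%8.3f%8.3f%8.3f%8.4f%8.4f%8.4f', ...
            14195, 'ASP', 'OD1', 19731, 55.954, 54.890, 95.494, 0, 0, 0);

% Each (.{n}) group captures exactly n characters, whether or not they are blank.
pat = '^(.{5})(.{5})(.{5})(.{5})(.{8})(.{8})(.{8})';
tok = regexp(l, pat, 'tokens', 'once');   % 1x7 cell array of character vectors

% Convert each captured slice; strtrim removes the padding from the text fields.
resnum   = str2double(tok{1});
resname  = strtrim(tok{2});
atomname = strtrim(tok{3});
atomnr   = str2double(tok{4});
xyz      = str2double(tok(5:7));          % 1x3 numeric row
```

The same pattern can be applied to every line of the file in a loop; the three velocity columns could be captured by adding three more (.{8}) groups.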
Finally, complain to TMW through official support that they need to find a solution for fixed-width input parsing... although they seem not to want to admit it, such files do exist and aren't going away regardless, and it's absurd that one can't read them easily in MATLAB.
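For reference, option (b) written out in full might look like the sketch below. It makes several assumptions: the file holds only atom records (no title, atom-count or box lines), the columns follow the widths 5, 5, 5, 5 and then blank-separated numbers, and the six numeric fields never touch each other. The temporary file and the variable names fmt, fname and xyzv are illustrative only; with a real file you would open 'conf-mod.gro' instead.

```matlab
% Write two sample records to a temporary file so the sketch is self-contained.
fmt = '%5d%-5s%5s%5d%8.3f%8.3f%8.3f%8.4f%8.4f%8.4f\n';
fname = [tempname '.gro'];
fid = fopen(fname, 'w');
fprintf(fid, fmt, 1, 'ALA', 'CA', 3, 56.249, 52.119, 83.467, 0, 0, 0);
fprintf(fid, fmt, 14195, 'ASP', 'OD1', 19731, 55.954, 54.890, 95.494, 0, 0, 0);
fclose(fid);

% Numeric columns grow as arrays, text columns as cell arrays.
resnum = []; atomnr = []; xyzv = [];
resname = {}; atomname = {};

fid = fopen(fname, 'r');
l = fgetl(fid);                       % returns -1 (not char) at end of file
while ischar(l)
    resnum(end+1,1)   = sscanf(l(1:5),   '%d');   % columns 1-5
    resname{end+1,1}  = strtrim(l(6:10));         % columns 6-10
    atomname{end+1,1} = strtrim(l(11:15));        % columns 11-15
    atomnr(end+1,1)   = sscanf(l(16:20), '%d');   % columns 16-20
    xyzv(end+1,:)     = sscanf(l(21:end), '%f').'; % six blank-separated numbers
    l = fgetl(fid);
end
fclose(fid);
delete(fname);                        % remove the temporary sample file
```

Slicing by column position is what keeps 'OD1' and 19731 apart here: the boundaries come from the indices, not from whitespace. Growing arrays inside the loop is slow for large files, so preallocating from a known line count would be a sensible refinement.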