One more thing, the entries in the file are tab-separated, but I can change the format if needed, as long as I get the data imported.
Error importing file using dataset
I need to import a file into dataset, here's a file snippet:
User Starting time(up) Starting time(down) Duration(up) Duration(down) Bytes(up) Bytes(down) Packets(up) Packets(down)
41.224.9.214 1366669568254 1366669568254 195 94 488 502 10 8
I need the first column as string, 2:5 columns as int, and the rest as double. I've tried using ds=dataset('File', cur_file, 'Format', 's%u64%u64%u64%u64%d%d%d%d', 'HeaderLines', 1, 'Delimiter', '\t');
But get the error: Error using dataset/readFile (line 165) Variable lengths must all be the same. You may have specified the format string, delimiter, or number of header lines incorrectly.
Error in dataset (line 347) a = readFile(a,fileArg,otherArgs);
I've also tried
ds=dataset('File', cur_file, 'Delimiter', '\t', 'ReadObsNames',true);
That gives: Error using dataset/readFile (line 195) The number of variable names read from ~/matlab/data/ContinuousUserServiceProfiles/ctu_cmp_download/147.32.86.92:443:6 does not match the number of data columns. You may have specified the format string, delimiter, or the number of header lines incorrectly.
Error in dataset (line 347) a = readFile(a,fileArg,otherArgs);
What can I do?
Answers (2)
Tom Lane
on 11 Sep 2013
Would you explain some more? Your quoted line starts "41.224.9.214" but you say you want "4" as string, "1.22" as integer, and the rest as double including apparently "4.9.214".
If I make a file that consists of a header line followed by multiple copies of the line you quoted, I can read it like this:
fmt = '%1s%5f%2f %f %f %f %f %f %f %f %f %f';
dataset('file','deleteme.txt','delimiter',' ','headerlines',1,'readvar',false,'format',fmt)
I hope this gives you an idea of how to proceed.
Peter Perkins
on 12 Sep 2013
Like Tom, I'm a little unclear on what you're asking for. I'm going to assume from your format string that you want to read the tab-separated line
41.224.9.214 1366669568254 1366669568254 195 94 488 502 10 8
as
- the string (IP address?) "41.224.9.214"
- the integers 1366669568254, 1366669568254, 195, 94
- the doubles 488, 502, 10, 8
So first, you want %f ("floating"), not %d ("decimal", I think). But that's not the problem. Based on the two error messages you're seeing, I'd have to guess that you either have some stray tabs at the ends of some of the lines in your file, or some lines that are short. You might experiment with the first few lines of your file to get the format string working, and then look through your file to try to find the bad line.
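A minimal sketch of that experiment, assuming the format string is handled textscan-style (so it calls textscan directly on a single row instead of going through dataset). The row is the one quoted in the question, rebuilt with explicit tabs; the names sample_line, fmt and C are only illustrative.

```matlab
% One data row from the question, with explicit tab separators.
tab = sprintf('\t');
sample_line = sprintf(['41.224.9.214\t1366669568254\t1366669568254\t' ...
                       '195\t94\t488\t502\t10\t8']);

% Column 1 as string, columns 2-5 as unsigned 64-bit integers,
% columns 6-9 as doubles (%f rather than %d).
fmt = '%s%u64%u64%u64%u64%f%f%f%f';

% textscan accepts a character vector as input, so no file is needed here.
C = textscan(sample_line, fmt, 'Delimiter', tab);

% A clean row should produce exactly one entry in each of the 9 cells.
disp(cellfun(@numel, C))
disp(class(C{2}))
```

If every cell comes back with the same length on the first few rows, the format string itself is probably fine, and the mismatch is more likely caused by a malformed line further down the file.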
If you're up for an adventure, you should be able to use MATLAB's debugger to figure out the line in your data file that caused the problem, by setting a breakpoint at line 165 of dataset/readFile.m. The easiest way to get there is just to click on the "line 165" in the error message in the command window; it's a hyperlink. Set the breakpoint, run your ds=dataset(...) command, and then when execution stops at your breakpoint, take a look at the variable called "raw". It should be a 1x9 cell array, and the lengths of the contents of each cell will tell you how far the import got before it failed. Just type "raw" at the command line and you should see the contents' sizes.
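As an alternative to the debugger, the stray-tab and short-line guess can be checked by counting tabs on every line. This is only a sketch under assumptions: it writes a small hypothetical sample file (tmp_file, with shortened header names) so that it runs on its own, and it assumes 9 columns, so 8 tabs per line. To use it on real data, drop the file-writing part and point fopen at your own file.

```matlab
% Hypothetical sample file: header, one good row, one row with a stray
% trailing tab, and one short row.
tab = sprintf('\t');
tmp_file = [tempname '.txt'];
fid = fopen(tmp_file, 'w');
fprintf(fid, 'User\tStart(up)\tStart(down)\tDur(up)\tDur(down)\tBytes(up)\tBytes(down)\tPkts(up)\tPkts(down)\n');
fprintf(fid, '41.224.9.214\t1366669568254\t1366669568254\t195\t94\t488\t502\t10\t8\n');
fprintf(fid, '41.224.9.214\t1366669568254\t1366669568254\t195\t94\t488\t502\t10\t8\t\n');
fprintf(fid, '41.224.9.214\t1366669568254\t1366669568254\t195\n');
fclose(fid);

expected_tabs = 8;   % 9 columns means 8 tab separators per line
bad_lines = [];      % line numbers whose tab count is off

fid = fopen(tmp_file, 'r');
line_no = 0;
line = fgetl(fid);   % fgetl returns -1 (not a char) at end of file
while ischar(line)
    line_no = line_no + 1;
    if sum(line == tab) ~= expected_tabs
        bad_lines(end+1) = line_no; %#ok<AGROW>
    end
    line = fgetl(fid);
end
fclose(fid);
delete(tmp_file);

disp(bad_lines)
```

Any line number reported here has too many or too few separators and is a candidate for the "Variable lengths must all be the same" error.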