How to determine the number of rows in an Excel sheet without loading it into MATLAB

Hi,
I have a very large CSV file and I want to determine how many rows it contains without actually loading it into MATLAB. How can I do that?

Answers (2)

Sulaymon Eshkabilov on 17 Jun 2021
To see the overall size of the data in a *.csv file without importing it, uiimport() can be used, although this function is quite slow. It shows the size of the data that would be imported.

Walter Roberson on 17 Jun 2021
filename = 'YourFile.csv';
lines_in_file = -inf;   % sentinel: stays -inf if the count cannot be determined
if isunix()   % true on both macOS and Linux
    cmd = sprintf('wc -l "%s"', filename);
    [status, msg] = system(cmd);
    if status == 0
        % wc prints the count followed by the file name; read only the first integer
        lines_in_file = sscanf(msg, '%d', 1);
    else
        warning('could not determine number of lines in file "%s"', filename);
    end
else   % Windows does not have a wc executable
    [fid, msg] = fopen(filename, 'rt');
    if fid < 0
        warning('could not open file "%s": %s', filename, msg);
    else
        linecount = 0;
        lines_in_file = 0;
        while ~feof(fid)
            thisline = fgetl(fid);
            if ~ischar(thisline); break; end   % fgetl returns -1 at end of file
            linecount = linecount + 1;
            if any(~isspace(thisline))
                lines_in_file = linecount;   % remember the last line with visible content
            end
        end
        fclose(fid);
    end
end
As a design decision, the Windows code deliberately does not count any trailing lines that contain only whitespace. The wc -l branch, by contrast, counts every newline character in the file.
When you import a file that has trailing whitespace-only lines, what happens depends on exactly how you do the importing: some import methods discard those lines, and some treat them as missing data.
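The line-by-line loop in the Windows branch can be slow on a very large file. As a rough cross-platform sketch (not part of the original answer), the file can instead be read in binary chunks and its newline bytes counted, which mirrors what wc -l reports. The function name count_newlines and the 1 MiB chunk size are assumptions chosen for illustration; save the function as count_newlines.m.

```matlab
function n = count_newlines(filename)
% COUNT_NEWLINES  Count LF bytes (value 10) in a file, like "wc -l".
% Hypothetical helper, not a MATLAB built-in. Only one chunk of the file
% is held in memory at a time.
chunk_size = 2^20;                        % 1 MiB per read; arbitrary choice
[fid, msg] = fopen(filename, 'r');        % binary mode: no newline translation
if fid < 0
    error('could not open "%s": %s', filename, msg);
end
cleanup = onCleanup(@() fclose(fid));     % close the file even if an error occurs
n = 0;
while true
    bytes = fread(fid, chunk_size, '*uint8');
    if isempty(bytes); break; end         % nothing left to read
    n = n + nnz(bytes == 10);             % count LF bytes in this chunk
end
end
```

Calling count_newlines('YourFile.csv') returns the number of newline characters. Unlike the Windows loop above, it does not skip trailing whitespace-only lines, and it does not count a final line that lacks a terminating newline.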
It is quite common for there to be an empty line at the end of a file. This arises from the question of whether newline is a line separator or a line terminator. If it is a line terminator, then the last line of a "text" file has to end with a newline, and end of file comes immediately after that. If it is a line separator, then the last line of a "text" file might simply end with end of file.
The C standard leaves it implementation-defined whether the last line of a text stream requires a terminating newline, and POSIX, although it defines a line as ending with a newline, gives the name "incomplete line" to an unterminated fragment at the end of a file. So in practice files can just... end... with no trailing newline. This has the advantage that if you just "end" a file, you can continue writing more data on the same line by seeking to end of file and writing there. But using newline as a routine line terminator has the advantage that if you have reason to believe that the line terminator is there, then to write additional lines you can just seek to the end of file and write the new data.
But when you treat newline as a line separator instead of a line terminator, you read the "last" line, see the newline, stop processing... and you have not seen end-of-file yet. So you ask to read another line, and an empty line gets returned (there is emptiness before the end of file).
So there are technical challenges in deciding what the "number of lines" in a file is. Is the last (empty) line between the final newline and end of file to be counted? What if the user simply has some blank lines there?
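To make the ambiguity concrete, here is a small sketch (an illustration only, not part of the original answer) that writes two temporary CSV files holding the same two rows, one ending with a newline and one not, and counts their lines under each convention. The file contents and variable names are made up for the demonstration, and fileread is used only because the demo files are tiny.

```matlab
% Two hypothetical test files with identical data rows.
with_nl    = [tempname '.csv'];
without_nl = [tempname '.csv'];
fid = fopen(with_nl, 'w');    fprintf(fid, 'a,b\n1,2\n'); fclose(fid);
fid = fopen(without_nl, 'w'); fprintf(fid, 'a,b\n1,2');   fclose(fid);

files  = {with_nl, without_nl};
counts = zeros(numel(files), 3);   % columns: terminator view, separator view, content lines
LF = char(10);
for k = 1:numel(files)
    txt = fileread(files{k});      % fine for tiny demo files only
    n_newlines = nnz(txt == LF);
    has_partial_last = ~isempty(txt) && txt(end) ~= LF;
    n_terminated = n_newlines;                     % terminator view (what wc -l reports)
    n_separated  = n_newlines + 1;                 % separator view: pieces between newlines
    n_content    = n_newlines + has_partial_last;  % terminated lines plus an unterminated last line
    counts(k, :) = [n_terminated, n_separated, n_content];
    fprintf('terminated=%d separated=%d content=%d\n', counts(k, :));
    delete(files{k});
end
```

For the file with a trailing newline the three counts are 2, 3 and 2; for the file without one they are 1, 2 and 2. Only the last convention gives the same answer for both files, which is the number of rows a user would normally expect.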
