
How to split huge .csv files into multiple .csv files based on size?

Anjan on 20 Nov 2018
Edited: Walter Roberson on 13 Feb 2021
I am trying to split huge .csv files (11 GB) that contain a combination of text and numbers into multiple files based on size (0.5 GB each). I tried some of the answers from the MATLAB community, but had no luck.
I hope someone can help!
4 comments
Walter Roberson on 21 Nov 2018
Edited: Walter Roberson on 13 Feb 2021
Which OS? The easiest way to do this is with the Unix split command, which makes it quite easy. For Windows I would look at https://www.gdgsoft.com/gsplit/
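For anyone who wants to drive split from inside MATLAB, here is a minimal sketch. It assumes a Unix-like system with the GNU coreutils version of split on the PATH (the -C option, which keeps whole lines together, is not available in every split implementation). The function name splitWithGnuSplit and the output prefix are made up for illustration.

```matlab
function splitWithGnuSplit(inFile, sizeSpec, prefix)
% Hypothetical helper: call GNU "split" so that each piece holds whole
% lines and at most sizeSpec bytes (for example '500M').
% Pieces are named <prefix>aa, <prefix>ab, ... by split itself.
cmd = sprintf('split -C %s "%s" "%s"', sizeSpec, inFile, prefix);
[status, msg] = system(cmd);          % run the shell command
if status ~= 0
    error('split failed: %s', msg);   % surface the tool's own message
end
end
```

A call such as splitWithGnuSplit('originalfile.csv', '500M', 'fileNew_') could be put in a loop over dir('*.csv') when many files have to be split.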
Anjan on 25 Nov 2018
Thanks Walter. It does work, but I had to split many files. The code that user "dpb" recommended in the answer below works great if you want to split files in MATLAB.


Accepted Answer

dpb on 21 Nov 2018
fidR=fopen('originalfile.csv','r');     % open the big file to read
NFiles=20;                              % number of files to create
Nrows=0;                                % count the records in a first pass
while ischar(fgets(fidR)), Nrows=Nrows+1; end
frewind(fidR);                          % back to the start for the copy pass
NPerFile=ceil(Nrows/NFiles);            % number of records / file, rounded up
for i=1:NFiles
  fidW=fopen(num2str(i,'fileNew%02d.csv'),'w');  % open a file to write
  for j=1:NPerFile
    l=fgets(fidR);                      % next line, terminator included
    if ~ischar(l), break, end           % fgets returns -1 at end of file
    fwrite(fidW,l);                     % transcribe lines verbatim
  end
  fidW=fclose(fidW);                    % close that one
end
fidR=fclose(fidR);
is a poor-man's split tossed off at the console here; rounding the number of records per file up should catch the whole file, and the end-of-file check should only trigger while writing the last file(s), if my logic is right.
And, yes, unfortunately, we can't always make others do sensible things about how they collect data we're subsequently given, granted...I recognized that was likely the case, hence the smiley.
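The loop above splits by record count, while the question asked for pieces of a given size. A minimal sketch of a size-based variant follows. It assumes single-byte (ASCII) text, so that the character count of a line equals its byte count, and the function name splitCsvBySize is made up for illustration.

```matlab
function nFiles = splitCsvBySize(inFile, maxBytes, outPattern)
% Hypothetical helper: copy inFile line by line into pieces of at most
% maxBytes each (a single line longer than maxBytes gets its own piece).
% outPattern is a sprintf format such as 'fileNew%02d.csv'.
fidR = fopen(inFile, 'r');
assert(fidR > 0, 'Cannot open %s', inFile);
closeR = onCleanup(@() fclose(fidR));   % close the input even on error
nFiles = 0;                             % pieces created so far
fidW = -1;                              % no output file open yet
written = 0;                            % bytes written to current piece
l = fgets(fidR);                        % line including its terminator
while ischar(l)
    % start a new piece if none is open or this line would overflow it
    if fidW < 0 || (written > 0 && written + numel(l) > maxBytes)
        if fidW > 0, fclose(fidW); end
        nFiles = nFiles + 1;
        fidW = fopen(sprintf(outPattern, nFiles), 'w');
        written = 0;
    end
    fwrite(fidW, l);                    % transcribe the line verbatim
    written = written + numel(l);
    l = fgets(fidR);
end
if fidW > 0, fclose(fidW); end
end
```

For 0.5 GB pieces the call would be something like splitCsvBySize('originalfile.csv', 0.5*2^30, 'fileNew%02d.csv'). Because it works in a single pass, it does not need the row count up front.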
4 comments
Tanumaya Bhowmik on 13 Feb 2021
How will this script work on a .csv file with a header? Would it be better to remove the header from the big file, do the split, and then add the header as the top line of each of the split files? How can I concatenate headers as the first line of an entire directory of .csv files? Thanks.
Walter Roberson on 13 Feb 2021
compose() or the undocumented sprintfc() are good at creating separate outputs.
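One way to handle the header question is to read the header once and write it at the top of every piece during the split, rather than patching the files afterwards. The following is only a sketch under that assumption: the function name splitCsvWithHeader is made up, the file is assumed to have exactly one header line, and compose() is used to build the output names in one call.

```matlab
function splitCsvWithHeader(inFile, NFiles, NPerFile)
% Hypothetical helper: repeat the first line of inFile at the top of
% every piece, followed by up to NPerFile data lines. The last piece
% takes whatever is left so that no data line is dropped.
names = compose("fileNew%02d.csv", (1:NFiles)');  % one name per piece
fidR = fopen(inFile, 'r');
hdr = fgets(fidR);                  % header line, terminator included
l = fgets(fidR);                    % first data line
i = 0;
while ischar(l) && i < NFiles
    i = i + 1;
    fidW = fopen(names(i), 'w');
    fwrite(fidW, hdr);              % header goes first in every piece
    n = 0;
    while ischar(l) && (n < NPerFile || i == NFiles)
        fwrite(fidW, l);            % transcribe data lines verbatim
        n = n + 1;
        l = fgets(fidR);
    end
    fclose(fidW);
end
fclose(fidR);
end
```

Pieces are only created while data lines remain, so a short input produces fewer than NFiles files instead of header-only ones.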


More Answers (0)
