Best way to deal with large data for deep learning?

Mona
Mona on 22 Jun 2016
Answered: Mona on 22 Jun 2016
Hi, I have been trying image classification with CNNs. I have about 350,000 images that I read and stored in a 4-D array of size 170 x 170 x 3 x 350,000 in a data.mat file. I used matfile to keep appending new images to data.mat. The resulting file is almost 20 GB.
The problem now is that I cannot access the saved images because I run out of memory.
Does anyone have suggestions for more efficient ways to build large data sets for deep learning?
One solution I could apply is to split the data and train two networks, one with its weights initialized from the other's final weights, but I don't want to take that route!
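For context, the in-memory size of the array described above can be estimated with a few lines of arithmetic. This is only a rough sketch: it assumes the images are stored as uint8 (1 byte per element), which the post does not state, and shows the single case (4 bytes per element) for comparison.

```matlab
% Rough in-memory size of a 170 x 170 x 3 x 350,000 image array.
% Assumption: uint8 storage (1 byte/element); single would need 4 bytes/element.
dims = [170 170 3 350000];
numElements = prod(dims);          % total number of pixel values
bytesUint8  = numElements * 1;
bytesSingle = numElements * 4;

fprintf('uint8 : %.1f GB\n', bytesUint8  / 1e9);
fprintf('single: %.1f GB\n', bytesSingle / 1e9);
```

Under the uint8 assumption the full array needs roughly 30 GB of RAM, so loading it in one piece fails on most workstations regardless of how the file was written.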
  2 comments
KSSV
KSSV on 22 Jun 2016
Do you want to process the whole data set (170 x 170 x 3 x 350,000) at once, or are you using only one image (170 x 170 x 3) at each step?
Mona
Mona on 22 Jun 2016
Yes, I am classifying the images using a CNN:
trainNetwork(Xtrain, Ytrain, layers, opt)
where Xtrain is supposed to contain all the training examples. So yes, I want to pass the entire 170 x 170 x 3 x 350,000 array to the network!
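As a side note on the "one piece at a time" option raised in this exchange: a matfile object can read a slice of a variable without loading the whole array. The sketch below is illustrative only. The file name demo_data.mat and the variable name X are hypothetical stand-ins for the real data.mat, and reading slices this way does not by itself let trainNetwork consume the data in batches.

```matlab
% Small demo file standing in for data.mat (hypothetical variable name "X").
fname = fullfile(tempdir, 'demo_data.mat');
X = randi([0 255], 170, 170, 3, 10, 'uint8');
save(fname, 'X', '-v7.3');       % v7.3 format supports efficient partial loading

m  = matfile(fname);             % creates a handle; no image data is loaded yet
sz = size(m, 'X');               % query the array size without reading it
batch = m.X(:, :, :, 1:4);       % read only the first 4 images from disk

disp(sz)
disp(size(batch))
```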


Accepted Answer

Mona
Mona on 22 Jun 2016
OK, I found a way around it. Instead of reading and writing the images to .mat files, I used imageDatastore.
I processed all my images (resized them to 200 x 200, then took random crops of 170 x 170) and wrote the processed images to .jpg files.
Then I created the datastore:
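The preprocessing step described above might look roughly like the following for a single image. This is a sketch, not the exact code used: the input is a synthetic image, the class folder name classA and the output file name are hypothetical, and imresize requires the Image Processing Toolbox.

```matlab
% Synthetic stand-in for one source image (assumption: RGB uint8 input).
img = randi([0 255], 240, 320, 3, 'uint8');

% Resize to 200 x 200 (imresize needs the Image Processing Toolbox).
resized = imresize(img, [200 200]);

% Take a random 170 x 170 crop; valid top-left offsets are 0..30.
cropSize  = 170;
maxOffset = 200 - cropSize;
r = randi([0 maxOffset]);        % random row offset
c = randi([0 maxOffset]);        % random column offset
crop = resized(r+1:r+cropSize, c+1:c+cropSize, :);

% Hypothetical output layout: one subfolder per class label.
outDir = fullfile(tempdir, 'All_train_images', 'classA');
if ~exist(outDir, 'dir'), mkdir(outDir); end
imwrite(crop, fullfile(outDir, 'img_000001.jpg'));
```

Writing each class into its own subfolder is what later allows the labels to be taken from the folder names.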
imds = imageDatastore('F:\All_train_images','IncludeSubfolders',true,...
'FileExtensions','.jpg','LabelSource', 'foldernames');
and finally trained the network with
trainNetwork(imds,layers,opt)
It turned out that writing the images to .jpg files is even faster and takes less disk space than saving them in .mat files.
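Before training, it can help to check that the datastore picked up the files and labels as expected. The sketch below builds a tiny throwaway folder tree so it can run on its own; the folder name imds_demo, the class names cats and dogs, and the 75/25 split are all made up for illustration.

```matlab
% Build a tiny demo folder tree: one subfolder per (hypothetical) class.
root = fullfile(tempdir, 'imds_demo');
classes = {'cats', 'dogs'};
for k = 1:numel(classes)
    d = fullfile(root, classes{k});
    if ~exist(d, 'dir'), mkdir(d); end
    for n = 1:4
        imwrite(randi([0 255], 170, 170, 3, 'uint8'), ...
            fullfile(d, sprintf('img_%d.jpg', n)));
    end
end

% Same options as in the answer, pointed at the demo folder.
imds = imageDatastore(root, 'IncludeSubfolders', true, ...
    'FileExtensions', '.jpg', 'LabelSource', 'foldernames');

tbl = countEachLabel(imds);      % table with the image count per label
disp(tbl)

% Optional: hold out 25% of each class for validation.
[imdsTrain, imdsVal] = splitEachLabel(imds, 0.75, 'randomized');

img = readimage(imds, 1);        % images are read from disk only on demand
disp(size(img))
```

Because the datastore holds only file paths and labels, its memory footprint stays small no matter how many images are on disk.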
Thanks Dr. Siva Srinivas Kolukula for attempting to help!
