Best way to process large data from Dropbox

Shri Chand on 27 Mar 2023
Commented: Walter Roberson on 30 Mar 2023
I have a very large permanent Dropbox link that contains 6 folders. Each folder is organized as shown below. I spell out the layout for Folder 1 only: "-" marks a folder, "*" marks a file, and three lines of dots mean the same pattern continues. All of the main folders and subfolders are organized the same way.
- Folder 1
  - 12390r3398
    - 20253023432
      * A2308432.edf
      * calibration.xlsx
      * EventList.xlsx
      * Stage.csv
    - 20253023789
      * A2308449.edf
      * calibration.xlsx
      * EventList.xlsx
      * Stage.csv
    .
    .
    .
    - 202530243808
      * A23086903.edf
      * calibration.xlsx
      * EventList.xlsx
      * Stage.csv
  - 12390r4490
    - 20258900023
      * A23489001.edf
      * calibration.xlsx
      * EventList.xlsx
      * Stage.csv
    .
    .
    .
    - 20258978290
      * A23489876.edf
      * calibration.xlsx
      * EventList.xlsx
      * Stage.csv
  .
  .
  .
- Folder 2
  .
  .
  .
- Folder 3
  .
  .
  .
- Folder 4
  .
  .
  .
- Folder 5
  .
  .
  .
- Folder 6
  .
  .
  .
I wrote a function that I will call myfunction. Its inputs are a .edf file and a .xlsx file. Its output is a 6-column array whose number of rows depends on the input files.
I want to apply myfunction to the file pair in each of the smallest subfolders in this Dropbox link. For example, I want arr1 = myfunction('A2308432.edf', 'EventList.xlsx') for the file pair located in subfolder '20253023432' above, then arr2 = myfunction('A2308449.edf', 'EventList.xlsx') for the file pair located in subfolder '20253023789', and so on up until the last smallest subfolder in Folder 3. Please note that although every subfolder contains a file named 'EventList.xlsx', each of these Excel files is actually different. I cannot directly download this Dropbox link, or any of the 3/6 main folders, or even a single subfolder of a main folder, because they are too large (the entire Dropbox has about 100 GB of data).
Does anyone know how I can do this in MATLAB, either by calling myfunction on all of these file pairs directly from the Dropbox link, or by iteratively downloading a file pair from Dropbox, running my function on it, and deleting it before moving on to the next one? If you can provide code to help, I would very much appreciate it, as I do not have a systems background. Thanks.
  4 comments
Walter Roberson on 27 Mar 2023
I mean writing code in MATLAB that would do the downloading.
Do you have "Dropbox Plus"? That would allow you to install Dropbox Desktop to get a mountpoint such as ~/Dropbox that can be used to refer to files stored on Dropbox.
If you do not have the "Plus" subscription then Dropbox Desktop would want to copy all of the files to your local system.
Shri Chand on 27 Mar 2023
@Walter Roberson I do not have Dropbox Plus, but I would be willing to pay for it for this purpose. Just to clarify: if I get Dropbox Plus and install Dropbox Desktop, the files in this Dropbox folder won't actually be downloaded to my computer, but I will have a way to download them through MATLAB?
If this is true, do you have any reference for how I could write code to download and then delete these files from Dropbox Desktop?


Answers (1)

Walter Roberson on 27 Mar 2023
Moved: Walter Roberson on 27 Mar 2023
I will use macOS / Linux syntax for file paths here.
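DBdir = "~/Dropbox/Fab403/Infrared/Tests";   % root of the synced Dropbox tree
EL = "EventList.xlsx";
Tdir = tempname() + ".cache";                % scratch folder for local copies
mkdir(Tdir);
% dir takes a single pattern; '**' makes the search recursive
edfinfo = dir(fullfile(DBdir, '**', '*.edf'));
edfnames = fullfile({edfinfo.folder}, {edfinfo.name});
numedf = numel(edfnames);
results = cell(numedf, 1);                   % one output array per file pair
for K = 1 : numedf
    thisedf = edfnames{K};
    [thisfolder, thisbase, thisext] = fileparts(thisedf);
    thisEL = fullfile(thisfolder, EL);
    Tedfname = fullfile(Tdir, [thisbase thisext]);
    TELname = fullfile(Tdir, EL);
    try
        % copyfile takes one source per call
        copyfile(thisedf, Tedfname);
        copyfile(thisEL, TELname);
        results{K} = myfunction(Tedfname, TELname);
    catch ME
        warning("Skipping %s: %s", thisedf, ME.message);
    end
    % remove the local copies before moving on to the next pair
    if isfile(Tedfname), delete(Tedfname); end
    if isfile(TELname), delete(TELname); end
end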
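The pairing step can be tried without touching Dropbox at all. The following is a minimal sketch that builds a small throwaway folder tree and pairs each .edf file with the EventList.xlsx next to it. The folder names (Folder1, A, 001, ...) and file names (rec1.edf, ...) are made up for illustration, and the files are empty stand-ins, so nothing here calls myfunction.

```matlab
% Build a throwaway tree mimicking the layout in the question.
% All folder and file names here are hypothetical.
root = tempname();
subs = {fullfile('Folder1', 'A', '001'), ...
        fullfile('Folder1', 'A', '002'), ...
        fullfile('Folder2', 'B', '003')};
for k = 1:numel(subs)
    d = fullfile(root, subs{k});
    mkdir(d);                                   % creates parent folders as needed
    fclose(fopen(fullfile(d, sprintf('rec%d.edf', k)), 'w'));
    fclose(fopen(fullfile(d, 'EventList.xlsx'), 'w'));
end

% Recursive search: '**' matches any depth of subfolders.
info = dir(fullfile(root, '**', '*.edf'));

% Pair each .edf with the EventList.xlsx in the same folder.
pairs = cell(numel(info), 2);
for k = 1:numel(info)
    pairs{k, 1} = fullfile(info(k).folder, info(k).name);
    pairs{k, 2} = fullfile(info(k).folder, 'EventList.xlsx');
end

allExist = all(cellfun(@isfile, pairs(:)));     % every paired file is present
disp(pairs);

rmdir(root, 's');                               % clean up the throwaway tree
```

Because the pairing relies only on each .edf file's own folder, the identically named EventList.xlsx files in different subfolders never get mixed up.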
  3 comments
Shri Chand on 28 Mar 2023
Edited: Shri Chand on 28 Mar 2023
@Walter Roberson Update: I think there is a problem with actually copying the files here. To clarify, even though my entire Dropbox file organization appears on my local computer, each file has a size of zero bytes and is only downloaded when I double-click it.
The code edit I posted above works up until I apply myfunction to TELname and Tedfname. Even though Tedfname is a temporary file path to where I "copied" my EDF file, applying myfunction to it produces an error. Specifically, my function calls edfread(Tedfname), and edfread fails with "Array indices must be positive integers or logical values." I think this is because, even though Tedfname is a temporary file path, we did not actually download the file we are trying to copy to it. I verified that copying the files did not actually download them (open(TELname) and open(Tedfname) produce empty files), but if I manually download a set of files in Dropbox Desktop and run this code on them, it works.
Any advice on how I can fix this downloading/copying issue?
Walter Roberson on 30 Mar 2023
Sorry, I do not have Dropbox Plus to test this with (it will not work with regular Dropbox), and I am not willing to pay for Dropbox Plus.
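The zero-byte symptom described above can at least be detected before myfunction is called, so that an empty copy produces a clear message instead of an obscure edfread error. The following is a minimal sketch, assuming that a file which was not really downloaded shows up with a size of 0 bytes, as reported in the comment above. Whether dir reports Dropbox placeholder files that way has not been confirmed in this thread. The helper name hasContent and the file names are hypothetical.

```matlab
% Hypothetical helper: true only if the file exists and has nonzero size.
hasContent = @(f) isfile(f) && getfield(dir(f), 'bytes') > 0;

% Demonstration on two throwaway files (names are made up).
emptyFile = [tempname() '.edf'];     % stands in for an undownloaded copy
dataFile  = [tempname() '.edf'];     % stands in for a real copy
fclose(fopen(emptyFile, 'w'));
fid = fopen(dataFile, 'w');
fwrite(fid, uint8(1:16));
fclose(fid);

okEmpty = hasContent(emptyFile);     % zero-byte file is rejected
okData  = hasContent(dataFile);      % file with content is accepted

if ~okEmpty
    warning("Skipping %s: file is empty, it was probably not downloaded.", emptyFile);
end

delete(emptyFile);
delete(dataFile);
```

In the copy loop, such a check would go right after the copyfile calls. This only reports the problem; it does not make Dropbox download the file.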


Release

R2022a
