MATLAB stuck at "Opening Large Spreadsheet"
9 visualizaciones (últimos 30 días)
Mostrar comentarios más antiguos
Brandon Flores
el 29 de Oct. de 2023
Comentada: Drew
el 3 de Nov. de 2023
I have been trying to import an excel file to MATLAB. However, it does not work at all. The green box from the import tool just keeps running. I also tried using the commands readtable and such, as well as installing and uninstalling multiple versions of MATLAB. I want to import the whole table to use it in regression learner.
1 comentario
Walter Roberson
el 29 de Oct. de 2023
How large is it? approximate number of rows and columns? Are the columns mostly numeric or are there a bunch of text columns as well? If so are do they have a lot of text in them?
Respuesta aceptada
Drew
el 3 de Nov. de 2023
Perhaps solve the problem with the divide-and-conquer method. You could set this up programatically with code based on the readtable function. This approach could also identify if there are particular rows that are causing issues. You could potentially also explore whether using csv versus xlsx as the source makes a difference.
2 comentarios
Walter Roberson
el 3 de Nov. de 2023
it appears to be an xlsx with a single sheet. The internal representation of that sheet would be a large XML file containing all of the numeric data, and possibly some of the text data, along with references to some of the text data that had been moved to a different XML file.
I do not know about the current code, but the previous readtable strategy was to parse and convert the entire sheet, and then afterwards after the whole sheet had been returned, to trim off the unwanted parts.
For efficiency reasons, unless perhaps you build a finite state machine to do the parsing (such as using yacc and Lex), at the very least you are going to parse a row at a time and convert the row and column number references in the XML text. Potentially the row/column could be tested against the range to determine whether it is worth converting the text representation of the cell contents.
The point here is that splitting the reading by range might not solve much. There is no way in xlsx files to skip directly to a particular row. There is no marginal index kept that might permit quick consultation to skip more directly to a given row.
Drew
el 3 de Nov. de 2023
I agree that always starting with the same large xlsx may not yield any improvement if MATLAB begins by parsing the entire file. Divide and conquer can be tried in various ways. Keep in mind that the issue may be caused by size, or by a particular problem with a paricular header, or a particular data entry. Some ideas:
(1) Use Excel to re-save a subset of the file with a portion of the data. Start with just the header and one data row. Test to see if readtable works.
(2) Use Excel to re-save the file with half the data, or one tenth of the data. See if readtable works. Observe memory use.
(3) Use Excel to re-save as csv. See if readtable works.
Más respuestas (0)
Ver también
Categorías
Más información sobre Spreadsheets en Help Center y File Exchange.
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!