Bug in xlsread on linux

Question

0 votos

I am trying to read an XLSX file with xlsread on Linux, but I get an error

>> xlsread('test.xlsx', 'MySheet', 'A:A')
Error using xlsread (line 247)
No right hand side value for assignment.

The XLSX file I have was generated from a Python program and appears to be a valid XLSX file in that I can open it in both LibreOffice Calc and Excel and Python as well as MATLAB on Windows. Following through the code, the error occurs on line 11 of toolbox/matlab/iofun/private/getSheetNames.m

     sheetIDs = regexp(workbook_xml_rels, ...
                   '<Relationship Id="rId(\d+?)" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet" Target="worksheets/.+?.xml"/>', ...
                   'tokens' );

The issue appears to be that the "relationship" file xl/_rels/workbook.xml.rels has an unexpected format. Specifically, the lines look like

<Relationship Id="rId1" Target="worksheets/sheet1.xml" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet" />

It seems like MATLAB wants the line to be (Id Type Target)|, but the line in the problem files have (Id Target Type). I would like to replace the regex on line 11 of getSheetNames.m with something that will work for both (Id Type Target) and (Id Target Type), but my regex foo is too weak.

0 comentarios
Mostrar -2 comentarios más antiguos Ocultar -2 comentarios más antiguos

Iniciar sesión para comentar.

Iniciar sesión para responder a esta pregunta.

Iniciar sesión para seguir la actividad

Answer 1

Ken Atwell el 16 de Jul. de 2014

1 voto

xlsread is tested against files created by Excel -- it looks like you found a case that is technically valid but not something Excel would typically create. Can you try:

Open the file in Excel
"Save as" the file to a new file, a "deep copy"

Can MATLAB open this new file?

3 comentarios
Mostrar 1 comentario más antiguo Ocultar 1 comentario más antiguo

Ken Atwell el 17 de Jul. de 2014

Abrir en MATLAB Online

You can try replacing the failing regexp with the following try/catch code to try an alternative should the original regexp fail:

    try
    sheetIDs = regexp(workbook_xml_rels, ...
                 '<Relationship Id="rId(\d+?)" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet" Target="worksheets/.+?.xml"/>', ...
                 'tokens' );
    catch
    sheetIDs = regexp(workbook_xml_rels, ...
                 '<Relationship Id="rId(\d+?)" Target="worksheets/.+?.xml" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet" />', ...
                 'tokens' );
    end

However, there is nothing to guarantee that you will not run into another issue if you manage to get past this one.

Daniel Shub el 23 de Jul. de 2014

This gets me over the first hurdle, but there is more parsing of the XML files to be done. I just switched to Python for this bit since it can read XLSX files correctly. TMW have acknowledged that it is a bug in their response to me, but apparently this is one of those bugs that they don't feel belongs on the official list of bugs.

Iniciar sesión para comentar.

Answer 2

Gabriel el 3 de Dic. de 2014

Abrir en MATLAB Online

0 votos

Ken's fix is not enough for me but I managed to get it to work for my file (no idea what application created it) with just two more edits.

2 files to edit:

 <matlabroot>/toolbox/matlab/iofun/private/getSheetNames.m
 <matlabroot>/toolbox/matlab/iofun/private/xlsreadXLSX.m

In getSheetNames.m: replace line 12 with (this is equivalent to Ken's solution):

 '<Relationship Id="rId(\d+?)" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet" Target="worksheets/.*\.xml"\s*/>', ...

and line 19 with:

 ['<sheet name=".*(?<=<sheet name=")(?<sheetName>.+?)(?=" sheetId=".*" r:id="rId' num2str(sheetIDs(i)) '"\s*/>)'], ...

In xlsreadXLSX.m: replace line 196 with:

 span = regexp(sheetData, '<dimension\s+ref="(?<start>[A-Z]+\d+)(?<end>:[A-Z]+\d+)?"\s*/>', 'names', 'once');

Note: all these do is add some "\s*" in the regexp to authorize possible space characters before the closing tags "/>". As it is compatible with the original version, I doubt it could introduce more issues.

1 comentario
Mostrar -1 comentarios más antiguos Ocultar -1 comentarios más antiguos

Gabriel el 3 de Dic. de 2014

Sorry, I just realized that this does not answer your question at all. I should have read it more carefully.

My problem was only with spaces. You seem to have another one related with field ordering. So, in your case, Kens' solution is the way to go but as with my case, you will probably have to fix a few more regexps.

It seems that there is a long way to go before xlsread can be fully compatible with non-Excel sources.

Iniciar sesión para comentar.

Bug in xlsread on linux

0 comentarios
Mostrar -2 comentarios más antiguos Ocultar -2 comentarios más antiguos

Respuestas (2)

3 comentarios
Mostrar 1 comentario más antiguo Ocultar 1 comentario más antiguo

1 comentario
Mostrar -1 comentarios más antiguos Ocultar -1 comentarios más antiguos

Categorías

Etiquetas

Community Treasure Hunt

Bug in xlsread on linux

0 comentarios Mostrar -2 comentarios más antiguos Ocultar -2 comentarios más antiguos

Respuestas (2)

3 comentarios Mostrar 1 comentario más antiguo Ocultar 1 comentario más antiguo

1 comentario Mostrar -1 comentarios más antiguos Ocultar -1 comentarios más antiguos

Categorías

Etiquetas

Ver también

Community Treasure Hunt

0 comentarios
Mostrar -2 comentarios más antiguos Ocultar -2 comentarios más antiguos

3 comentarios
Mostrar 1 comentario más antiguo Ocultar 1 comentario más antiguo

1 comentario
Mostrar -1 comentarios más antiguos Ocultar -1 comentarios más antiguos