How to read an MS Word file (*.doc/*.docx)

kevin levin on 14 Jul 2017
Commented: Ioonut on 30 Apr 2020
I have to import data from an MS Word file and treat the imported content as a string, but in some cases the file contains many pictures, images, tables, etc. Is it possible to do that in MATLAB, or would I have to switch to another language? Thank you.

Answers (1)

Guillaume on 14 Jul 2017
Edited: Guillaume on 14 Jul 2017
The issue is not MATLAB. You would face exactly the same difficulties in any language, because a Word document is not a simple string. You can certainly extract the various sections of text from a Word document in any language, but you'll have to work for it.
All the functions for navigating a Word document are documented on MSDN. The most important object for you is the Document object.
This could be a start:
word = actxserver('Word.Application');
wdoc = word.Documents.Open('C:\somewhere\somefile.docx');
% wdoc is the Document object, which you can query and navigate.
sometext = wdoc.Content.Text;
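As a rough illustration of what "navigating" the Document object can look like, the sketch below walks the Paragraphs and Tables collections instead of reading everything as one string. It assumes Windows with Word installed, reuses the placeholder path from above, and assumes the first table (if any) is a regular grid without merged cells. The variable names are made up for the example.

```matlab
% Sketch only: requires Windows with Microsoft Word installed.
word = actxserver('Word.Application');
wdoc = word.Documents.Open('C:\somewhere\somefile.docx');  % placeholder path

% Collect the text of each paragraph in a cell array.
nPar = wdoc.Paragraphs.Count;
paragraphs = cell(nPar, 1);
for k = 1:nPar
    % Range.Text ends with a paragraph mark (char(13)); strtrim drops it.
    paragraphs{k} = strtrim(wdoc.Paragraphs.Item(k).Range.Text);
end

% Read the first table, if there is one (assumes no merged cells).
if wdoc.Tables.Count > 0
    tbl = wdoc.Tables.Item(1);
    nRows = tbl.Rows.Count;
    nCols = tbl.Columns.Count;
    cells = cell(nRows, nCols);
    for r = 1:nRows
        for c = 1:nCols
            txt = tbl.Cell(r, c).Range.Text;
            % Cell text ends with an end-of-cell marker (char(13), char(7)).
            cells{r, c} = regexprep(txt, '[\r\x07]+$', '');
        end
    end
end

wdoc.Close;   % close the document
word.Quit;    % end the Word application
```

Note that paragraphs inside tables also appear in the Paragraphs collection, so the two results overlap; which view is more useful depends on the document.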
3 comments
Tobias Huth on 26 Nov 2018
Thanks for the help!
One should not forget to close the objects:
wdoc.Close; % close document
word.Quit; % end application
Greetings,
Tobias
Ioonut on 30 Apr 2020
Do you know how to extract the pictures from a .doc file that has, for example, 10 pictures inserted?
I understood from MSDN that wdoc.InlineShapes should contain them, but I don't know how to get a variable like 'picture' or how to index it:
picture = wdoc.something(index)
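One possible way to approach this, offered as a sketch rather than a confirmed answer from the thread: Word collections such as InlineShapes expose Count and Item(index), so the k-th inline picture can be fetched with Item(k). Pictures that float rather than sit in line with the text live in wdoc.Shapes instead. Getting the actual image data out through COM is less direct, so the second half uses a different technique that applies only to .docx (not legacy .doc): a .docx file is a ZIP archive, and embedded images are normally stored under word/media. The path is the placeholder used earlier, and the file order in word/media is not guaranteed to match the order in the document.

```matlab
% Sketch only: requires Windows with Microsoft Word installed.
docxFile = 'C:\somewhere\somefile.docx';   % placeholder path
word = actxserver('Word.Application');
wdoc = word.Documents.Open(docxFile);

% Part 1: index the inline pictures through the COM collection.
shapes = wdoc.InlineShapes;
nPics = shapes.Count;
for k = 1:nPics
    picture = shapes.Item(k);   % k-th InlineShape object
    fprintf('Inline shape %d: %.1f x %.1f points\n', ...
            k, picture.Width, picture.Height);
end

wdoc.Close;
word.Quit;

% Part 2 (.docx only): unzip the file and list the embedded image files.
outDir = tempname;              % temporary folder for the extracted parts
unzip(docxFile, outDir);
media = dir(fullfile(outDir, 'word', 'media'));
media = media(~[media.isdir]);  % drop '.' and '..'
for k = 1:numel(media)
    imgFile = fullfile(outDir, 'word', 'media', media(k).name);
    fprintf('Embedded image file: %s\n', imgFile);
    % img = imread(imgFile);    % works for PNG/JPEG; may fail for EMF/WMF
end
```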
