Traversing Text Document Matlab
Please provide guidance on this particular inquiry. All responses are highly valued and will be used to further my knowledge (I'm not just looking for a copy-and-paste solution). I am attempting to read a Microsoft Word dictionary into MATLAB. From there I would like to be able to traverse it, extract words of a specific length (say, four-letter words), and put them into an array. Then I would like to select random words from the array and put them into a matrix.
Answers (1)
Adam Danz
on 17 Nov 2019
Edited: Adam Danz
on 17 Nov 2019
Reading from word doc
Here's the general approach to reading a Microsoft Word document. It relies on COM automation through actxserver, so it needs Windows with Word installed.
directory = 'C:\Users\AOC\Documents\MATLAB';
file = 'myDocFile.docx';
% Full path to the MS Word file
filePath = fullfile(directory,file);
% Read MS Word file using actxserver function
word = actxserver('Word.Application');
wdoc = word.Documents.Open(filePath);
txt = wdoc.Content.Text;
Quit(word)
delete(word)
The variable txt is a char array containing the text in your document.
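If opening the document fails partway through, the Word process started by actxserver can be left running in the background. The sketch below is one way to guard against that, not a required pattern: it skips the import when the platform or the file is not usable, and it quits Word even when reading throws an error. The folder and file name are the same placeholders used above.

```matlab
% Placeholder location; replace with the real folder and file name.
filePath = fullfile('C:\Users\AOC\Documents\MATLAB', 'myDocFile.docx');
txt = '';   % stays empty if the document cannot be read

if ~ispc
    % actxserver talks to COM, which only exists on Windows.
    warning('actxserver needs Windows. Skipping the Word import.');
elseif exist(filePath, 'file') ~= 2
    warning('File not found: %s', filePath);
else
    word = actxserver('Word.Application');
    try
        wdoc = word.Documents.Open(filePath);
        txt = wdoc.Content.Text;
    catch err
        % Shut Word down before passing the error on.
        Quit(word); delete(word);
        rethrow(err)
    end
    Quit(word); delete(word);
end
```

After this runs, an empty txt means nothing was read, so later steps can check isempty(txt) before searching for words.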
Extracting 4-letter words
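There are several approaches you could use. This one is fast and doesn't require segmenting the text into words and counting the length of each one. Instead, it uses a regular expression to search for this pattern:
[not preceded by a letter],[4 letters],[not followed by a letter]
The lookbehind and lookahead assertions only check the neighboring characters; they are not part of the match. That way adjacent words separated by a single space are all found, words at the very start or end of the text are included, and no punctuation or white space needs to be trimmed afterwards.
% Extract 4-letter words.
s = regexp(txt, '(?<![a-zA-Z])[a-zA-Z]{4}(?![a-zA-Z])', 'match');
s is a 1xn cell array of 4-letter words as character arrays.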
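To see the extraction on something small, here is a sketch using a made-up sentence in place of the text read from Word. It uses lookaround assertions, which avoid two pitfalls of a pattern that consumes the surrounding non-letter characters: skipping a word that directly follows another match, and leaving punctuation attached to the result. It also shows the segment-and-count alternative mentioned above, for comparison. The sample text and the names allWords and s2 are illustrative only.

```matlab
% Made-up sample standing in for the txt read from the Word document.
txt = 'Tiny word list: that, this. Also blue! But not apple or cat.';

% Lookarounds check the neighbors of each 4-letter run without
% consuming them, so delimiters never end up in the match.
s = regexp(txt, '(?<![a-zA-Z])[a-zA-Z]{4}(?![a-zA-Z])', 'match');

% Alternative: segment every word first, then keep those of length 4.
allWords = regexp(txt, '[a-zA-Z]+', 'match');
s2 = allWords(cellfun(@numel, allWords) == 4);

disp(s)   % Tiny, word, list, that, this, Also, blue
```

Both approaches return the same seven words here. The second one is handy when you want several word lengths from a single pass over the text.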
Randomly select words
You can't put text into a numeric matrix, but you can put it into a cell array. The example below chooses n random words from the extracted words.
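n = 10;
if n > numel(s)
    error('There are only %d words available. You selected %d words.', numel(s), n)
end
% randperm returns n unique indices, so no word is picked twice.
% Use randi(numel(s),1,n) instead if repeats are acceptable.
randIdx = randperm(numel(s), n);
randWords = s(randIdx); % Here is your random selection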
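Since the question asks for a matrix: text can live in a char matrix as long as every row has the same length, which holds for a set of 4-letter words. The sketch below is one possible way to go from the random selection to either a char matrix (one word per row) or a 2-D cell array (one word per cell). The word list, the seed, and the names wordMatrix and wordGrid are made up for illustration.

```matlab
rng(0);   % fixed seed, only so the example is repeatable

% Stand-in for the extracted 4-letter words.
s = {'tiny','word','list','that','this','also', ...
     'blue','kind','make','both','each','from'};

n = 6;
randIdx = randperm(numel(s), n);   % n unique positions in s
randWords = s(randIdx);

% Every word has 4 letters, so the selection stacks cleanly into an
% n-by-4 char matrix with one word per row.
wordMatrix = char(randWords);

% Or keep whole words as elements and arrange them in a 2-by-3 grid.
wordGrid = reshape(randWords, 2, 3);
```

With the char matrix, wordMatrix(3,:) is the third word and wordMatrix(:,1) is the column of first letters. With the cell grid, wordGrid{2,3} returns the whole word at that position.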