How to draw a co-occurrence network using only nouns in MATLAB Text Analytics Toolbox?

Hello,
I am having trouble with text analytics in MATLAB.
I want to 1) draw a co-occurrence network diagram using only the 100 most frequent nouns, and 2) draw a frequency table or bar plot of the most frequent nouns.
My code is below. I have run the part-of-speech (POS) tagging, but I can't work out how to proceed from there.
Thanks in advance!
T = readtable('D:/OneDrive/evpostridereview.csv');
t.desc = T.review;
cleanedDocuments = tokenizedDocument(t.desc); % crashed once, worked on the second attempt
cleanedDocuments = addPartOfSpeechDetails(cleanedDocuments);
% Remove a list of stop words then lemmatize the words. To improve
% lemmatization, first use addPartOfSpeechDetails.
cleanedDocuments = removeStopWords(cleanedDocuments); % ran successfully
stopwords =["전기차","하이브리드","현대","기아","아이오닉","쏘나타","카렌스","sm5","소나타","아이오","테슬라","를","의","이","중고차","휴게소","자동차"];
cleanedDocuments = removeWords(cleanedDocuments,stopwords);
cleanedDocuments = normalizeWords(cleanedDocuments,'Style','lemma'); % crashed at first, then worked on retry
% Erase punctuation.
cleanedDocuments = erasePunctuation(cleanedDocuments); % succeeded on the first try
% Remove words with 2 or fewer characters, and words with 15 or more
% characters.
cleanedDocuments = removeShortWords(cleanedDocuments,2); % succeeded on the first try
cleanedDocuments = removeLongWords(cleanedDocuments,15);
tdetails = tokenDetails(cleanedDocuments);
head(tdetails)
% Extract nouns
nouns = tdetails.Token(tdetails.PartOfSpeech=='noun');
% Wordcloud for nouns
figure
wordcloud(nouns)
title("EVPost 전기차 주행기 워드 클라우드")
% Co-Occurrence Network
bag = bagOfWords(cleanedDocuments);
counts = bag.Counts;
cooccurence=counts.'*counts;
figure
G = graph(cooccurence,bag.Vocabulary,'omitselfloops');
LWidths = 5*G.Edges.Weight/max(G.Edges.Weight);
plot(G,'LineWidth',LWidths)
title("Co-occurence Network")
% Center Keyword Setting
word = "디자인"
idx = find(bag.Vocabulary == word);
nbrs = neighbors(G,idx);
bag.Vocabulary(nbrs)'
H = subgraph(G,[idx; nbrs]);
LWidths = 5*H.Edges.Weight/max(H.Edges.Weight);
plot(H,'LineWidth',LWidths)
title("Co-occurence Network - Word: """ + word + """");
2 Comments
Piyush Dubey on 26 Jun 2023
The code looks algorithmically sound. Can you elaborate on the issue you are facing when creating the co-occurrence network?
상원 음 on 27 Jun 2023
Dear Piyush, as you mentioned, this code does produce a co-occurrence network. However, I want the network to consider only nouns, and only the 100 most frequent of them.


Answers (1)

Saksham on 18 Aug 2023
Hi 상원 음,
I understand that you already have code for a co-occurrence network and want to build the network from only the 100 most frequent nouns.
I also see that the code already extracts the nouns into the variable “nouns”. After the comment “% Co-Occurrence Network”, please pass the variable “nouns” to the bagOfWords function.
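One caveat worth checking here: “nouns” as extracted in the question is a flat list of tokens, while a co-occurrence matrix needs counts per document. The sketch below is one possible way to keep the document boundaries, using the DocumentNumber column that tokenDetails returns. The toy sentences and the names nounTokens, nounDocs, nounVocab, and nounCounts are illustrative assumptions, not part of the original code; in the real script, cleanedDocuments would take the place of documents.

```matlab
% Toy stand-in for the cleaned documents from the question (assumption:
% in the real script, use cleanedDocuments instead of documents).
documents = tokenizedDocument([
    "The car has a nice design and a quiet motor."
    "The design of the battery limits the range."
    "A long range makes the car practical."]);
documents = addPartOfSpeechDetails(documents);

% tokenDetails returns one row per token, including the document it came from.
tdetails = tokenDetails(documents);

% Keep only the rows tagged as nouns.
isNoun = tdetails.PartOfSpeech == 'noun';
nounTokens = tdetails.Token(isNoun);
nounDocs = double(tdetails.DocumentNumber(isNoun));

% Map each noun to a column index, then count occurrences per document.
% sparse() sums the entries that share the same (row, column) pair.
[nounVocab, ~, wordIdx] = unique(nounTokens);
nounCounts = sparse(nounDocs, wordIdx, 1, numel(documents), numel(nounVocab));
```

Here nounCounts plays the same role as bag.Counts in the question (documents in rows, words in columns), but it only has columns for nouns.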
To find the 100 most frequent nouns, you can compute the frequency of each word and then filter the words accordingly. To learn more about counting word frequency, please follow the link below:
I hope the suggestion and resource above are useful to you.
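As a rough illustration of the frequency and filtering steps, the sketch below ranks nouns by total count, builds a frequency table and bar plot, and draws the co-occurrence network for the top nouns only. It assumes a documents-by-nouns count matrix is already available; the small nounVocab and nounCounts defined here are made-up stand-ins so the block runs on its own, and topN, freqTable, and keep are illustrative names.

```matlab
% Hypothetical stand-ins: nounVocab is a string vector of nouns and
% nounCounts is a documents-by-nouns count matrix (4 documents, 5 nouns).
nounVocab = ["battery"; "car"; "design"; "motor"; "range"];
nounCounts = sparse([1 1 0 1 0; 1 0 1 0 1; 0 1 0 0 1; 0 1 1 0 0]);

topN = 100;  % number of nouns requested in the question

% Total count of each noun across all documents, sorted in descending order.
totals = full(sum(nounCounts, 1)).';
[sortedTotals, order] = sort(totals, 'descend');
keep = order(1:min(topN, numel(order)));  % guard against fewer than topN nouns

% Frequency table and bar plot of the most frequent nouns.
freqTable = table(nounVocab(keep), sortedTotals(1:numel(keep)), ...
    'VariableNames', {'Noun', 'Count'});
disp(freqTable)

figure
bar(freqTable.Count)
xticks(1:height(freqTable))
xticklabels(freqTable.Noun)
xtickangle(90)
ylabel("Count")
title("Most frequent nouns")

% Co-occurrence network restricted to the selected nouns.
topCounts = nounCounts(:, keep);
cooccurrence = topCounts.' * topCounts;
G = graph(cooccurrence, nounVocab(keep), 'omitselfloops');
LWidths = 5*G.Edges.Weight/max(G.Edges.Weight);

figure
plot(G, 'LineWidth', LWidths)
title("Co-occurrence Network (top nouns)")
```

The network part follows the same counts.'*counts construction as the code in the question, so the center-keyword subgraph step from the question should carry over by searching nounVocab(keep) instead of bag.Vocabulary.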
Sincerely,
Saksham


Release

R2022b
