How to draw a co-occurrence network using only nouns in MATLAB Text Analytics Toolbox?
Hello,
I am having trouble with a text analytics task in MATLAB.
I want to 1) draw a co-occurrence network diagram using only the 100 most frequent nouns, and 2) draw a frequency table or bar plot of the most frequent nouns.
My code is as follows. I have done the part-of-speech (POS) tagging, but I can't work out how to proceed from there.
Thanks in advance!
T = readtable('D:/OneDrive/evpostridereview.csv');
t.desc = T.review;
cleanedDocuments = tokenizedDocument(t.desc); % crashed once, worked on the second attempt
cleanedDocuments = addPartOfSpeechDetails(cleanedDocuments);
% Remove a list of stop words then lemmatize the words. To improve
% lemmatization, first use addPartOfSpeechDetails.
cleanedDocuments = removeStopWords(cleanedDocuments); % ran successfully
stopwords =["전기차","하이브리드","현대","기아","아이오닉","쏘나타","카렌스","sm5","소나타","아이오","테슬라","를","의","이","중고차","휴게소","자동차"];
cleanedDocuments = removeWords(cleanedDocuments,stopwords);
cleanedDocuments = normalizeWords(cleanedDocuments,'Style','lemma'); % crashed, then worked on retry
% Erase punctuation.
cleanedDocuments = erasePunctuation(cleanedDocuments); % succeeded on the first try
% Remove words with 2 or fewer characters, and words with 15 or more
% characters.
cleanedDocuments = removeShortWords(cleanedDocuments,2); % succeeded on the first try
cleanedDocuments = removeLongWords(cleanedDocuments,15);
tdetails = tokenDetails(cleanedDocuments);
head(tdetails)
% Extract the tokens tagged as nouns
nouns = tdetails.Token(tdetails.PartOfSpeech=='noun');
% Wordcloud for nouns
figure
wordcloud(nouns)
title("EVPost 전기차 주행기 워드 클라우드")
% Co-Occurrence Network
bag = bagOfWords(cleanedDocuments);
counts = bag.Counts;
cooccurence=counts.'*counts;
figure
G = graph(cooccurence,bag.Vocabulary,'omitselfloops');
LWidths = 5*G.Edges.Weight/max(G.Edges.Weight);
plot(G,'LineWidth',LWidths)
title("Co-occurrence Network")
% Set the center keyword (디자인 means "design")
word = "디자인"
idx = find(bag.Vocabulary == word);
nbrs = neighbors(G,idx);
bag.Vocabulary(nbrs)'
H = subgraph(G,[idx; nbrs]);
LWidths = 5*H.Edges.Weight/max(H.Edges.Weight);
plot(H,'LineWidth',LWidths)
title("Co-occurrence Network - Word: """ + word + """");
2 comments
Piyush Dubey
on 26 Jun 2023
The code seems algorithmically fine. Can you elaborate on what issue you are facing while creating the co-occurrence network?
Answers (1)
Saksham
on 18 Aug 2023
Hi 상원 음,
I understand that you already have code for a co-occurrence network and want to build the network from only the 100 most frequent nouns.
I also see that the code extracts the nouns into the “nouns” variable. After the comment “% Co-Occurrence Network”, please pass the “nouns” variable to the “bagOfWords” function.
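One possible reading of this suggestion, as a rough sketch rather than a definitive implementation: as far as I recall from the documentation, a plain string array passed to bagOfWords is treated as a single document, which would lose the per-review counts that the co-occurrence matrix depends on. The helper below (the name nounOnlyBag is hypothetical, not a toolbox function) instead builds the bag from the tokenized documents and then drops every vocabulary word that is never tagged as a noun. It assumes the documents already carry part-of-speech details.

```matlab
function nounBag = nounOnlyBag(documents)
% nounOnlyBag  Hypothetical helper: bag-of-words model restricted to nouns.
%   documents must be a tokenizedDocument array that already has
%   part-of-speech details (for example after addPartOfSpeechDetails).

    details = tokenDetails(documents);

    % Words tagged as a noun at least once anywhere in the corpus.
    nounWords = unique(details.Token(details.PartOfSpeech == "noun"));

    % Build the full bag (one row of counts per document), then remove
    % every vocabulary entry that is not in the noun list.
    nounBag = bagOfWords(documents);
    isNoun  = ismember(nounBag.Vocabulary, nounWords);
    nounBag = removeWords(nounBag, nounBag.Vocabulary(~isNoun));
end
```

With the question's variables this would be called as `bag = nounOnlyBag(cleanedDocuments);` in place of `bag = bagOfWords(cleanedDocuments);`. One caveat: a word that is tagged as a noun in one review and as something else in another keeps all of its occurrences, because the filtering works on the vocabulary and not on individual tokens.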
To find the 100 most frequent nouns, you can compute the frequency of each word and then filter the words accordingly. To learn more about counting word frequency, please see the link below:
I hope the suggestion and resource shared above are useful to you.
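For the frequency step, here is a minimal sketch under a few assumptions: the input is a bagOfWords model (ideally one already restricted to nouns), topkwords is used for the counting, and the helper name topWordNetwork is hypothetical. It returns the frequency table and the graph, and draws both the bar plot and the co-occurrence network in the same way as the question's code.

```matlab
function [G, tbl] = topWordNetwork(bag, k)
% topWordNetwork  Hypothetical helper: frequency table, bar plot and
%   co-occurrence graph for the k most frequent words in a bagOfWords model.

    % Frequency table with the variables Word and Count, most frequent first.
    tbl = topkwords(bag, k);

    % Keep only the count columns that belong to the top-k words.
    [~, cols] = ismember(tbl.Word, bag.Vocabulary);
    counts = bag.Counts(:, cols);

    % Same co-occurrence construction as in the question.
    cooccurrence = counts.' * counts;
    G = graph(cooccurrence, tbl.Word, 'omitselfloops');

    % Bar plot of the word frequencies.
    figure
    bar(tbl.Count)
    xticks(1:height(tbl))
    xticklabels(tbl.Word)
    xtickangle(90)
    ylabel("Count")
    title("Most frequent words")

    % Network plot, with line width scaled by co-occurrence weight.
    figure
    if numedges(G) > 0
        LWidths = 5*G.Edges.Weight/max(G.Edges.Weight);
        plot(G, 'LineWidth', LWidths)
    else
        plot(G)  % no co-occurring pairs, so draw the nodes only
    end
    title("Co-occurrence Network")
end
```

A call such as `[G, tbl] = topWordNetwork(bag, 100);` would then give the table in `tbl` and the graph in `G`. With 100 nodes the plot can get crowded, so a smaller k may be easier to read at first.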
Sincerely,
Saksham