How can i extract data from data from table using a string of words.

6 visualizaciones (últimos 30 días)
Jose Mendoza Garcia
Jose Mendoza Garcia el 9 de Sept. de 2022
Editada: dpb el 9 de Sept. de 2022
I am trying to extract data from a file but using a nested loop and if statement has not worked. The main file has column 1 as genes ranging form (1:60000) its a very large file. These genes have names like DUOX1 and so on. Now each column has the patient name a total of 45. Each gene has an expression number for each patient. What i want to do is to extract a certain amount of genes by using their names to extract all the expression for all patients.
the loop that i have been trying has been:
%16 TDS Genes
data_table=readtable("Normal and Tumor TPM.xlsx","Sheet","Normal Expression","VariableNamingRule","preserve");
tds_genes_table=readtable("16 TDS.xlsx","Sheet","Normal");
genes_16 = string(tds_genes_table.Properties.VariableNames);
n = max(size(data_table)) %gives total amount of genes in the main file "data". Each has a name like DUOX1
m = max(size(genes_16)) %contains gene names that i want to extract from data
%loop to match 16 tds genes with patient and expression
for i = 1:n
for j = 1:m
if i=j
expression(j,:) = data_table(i,:)
end
end
end
  6 comentarios
Jose Mendoza Garcia
Jose Mendoza Garcia el 9 de Sept. de 2022
thats a good representation of how the file looks on excel
Jose Mendoza Garcia
Jose Mendoza Garcia el 9 de Sept. de 2022
what i want to do is extract these 16 genes form the large file with all the expression information
this are the 16 genes:
DIO1 DIO2 DUOX1 DUOX2 FOXE1 GLIS3 NKX2-1 PAX8 SLC26A4 SLC5A5 SLC5A8 TG THRA THRB TPO TSHR

Iniciar sesión para comentar.

Respuestas (2)

Cris LaPierre
Cris LaPierre el 9 de Sept. de 2022
I think the Access Data in a Table page is what you want. You might also be interested in the height and width functions for tables. Also, I think this is a scenario where it is best to set the TextType to string.
I think the code might look something like this.
data_table=readtable("Normal and Tumor TPM.xlsx","Sheet","Normal Expression","VariableNamingRule","preserve","TextType","string")
tds_genes_table=readtable("16 TDS.xlsx","Sheet","Normal","TextType","string")
genes_16 = string(tds_genes_table.Properties.VariableNames)
idx = ismember(data_table.Gene_ID,genes_16)
expression = data_table(idx,:)

dpb
dpb el 9 de Sept. de 2022
Editada: dpb el 9 de Sept. de 2022
data_table=readtable("Normal and Tumor TPM.xlsx","Sheet","Normal Expression","VariableNamingRule","preserve");
geneslist=strtrim({'DIO1';'DIO2';'DUOX1';'DUOX2';'FOXE1';'GLIS3';'NKX2-1';'PAX8';'SLC26A4 ';'SLC5A5 ';'SLC5A8 ';'TG ';'THRA ';'THRB ';'TPO ';'TSHR'});
data_table.GENEVARIABLENAME=categorical(data_table.GENEVARIABLENAME); % turn to categorical if isn't; you don't tell us what the variable names are...
tWantedByGene=data_table(matches,data_table.GENEVARIABLENAME,geneslist),:); % select rows with matching genes
should be all you're looking for...
As always, attaching a representative section of the data file would facilitate writing actual code to address any problems.

Etiquetas

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by