Creating vectors showing the non-zero values

Question

Jwana on 25 Oct 2012

Direct link to this question

https://es.mathworks.com/matlabcentral/answers/51840-creating-vectors-showing-the-non-zero-values

Hi all,

I have code that generates three vectors (GO_Terms, is_a_relations and part_of_relations) from a text file. The code works as follows: the text file contains paragraphs, and each paragraph starts with the word [Term]. From each [Term] paragraph I need to take the values of GO_Terms, is_a_relations and part_of_relations. Some [Term] paragraphs don't contain is_a_relations and some don't contain part_of_relations (but all of them have a GO_Terms value).

My question is: how can I put a zero in a vector when a paragraph has no such value (for example, when there is no is_a_relations entry in a [Term] paragraph)? Right now my code puts only the values that exist into the vectors, which doesn't work for me, because I need the vectors to have equal lengths so that I can put them in an array and process them.

My code is:

    % read the whole file into s, one cell per line
    s = {};
    fid = fopen('gos.txt');
    tline = fgetl(fid);
    while ischar(tline)
        s = [s; tline];
        tline = fgetl(fid);
    end
    fclose(fid);  % release the file handle once all lines are read
    % find the first line of every [Term] section in s; the extra
    % numel(s)+1 entry marks the end of the last section
    terms = [find(~cellfun('isempty', regexp(s, '\[Term\]'))); numel(s)+1];
    % for every [Term] section, run the previously implemented regexps
    % and save the results into a map - a cell array with 3 columns
    GO_Terms = [];
    is_a_relations = [];
    part_of_relations = [];
    %map = cell(0,3);
    for term = 1:numel(terms)-1
        % extract the lines of a single [Term] section
        s_term = s(terms(term):terms(term+1)-1);
        % match regexps
        % generate the GO_Terms vector from the text file
        tok = regexp(s_term, '^id: (GO:\w*)', 'tokens');
        idx = ~cellfun('isempty', tok);
        GO_Terms = [GO_Terms, cellfun(@(x)x{1}, {tok{idx}})];
        % generate the is_a relations vector from the text file
        tok = regexp(s_term, '^is_a: (GO:\w*)', 'tokens');
        idx = ~cellfun('isempty', tok);
        is_a_relations = [is_a_relations, cellfun(@(x)x{1}, {tok{idx}})];
        % generate the part_of relations vector from the text file
        tok = regexp(s_term, '^relationship: part_of (GO:\w*)', 'tokens');
        idx = ~cellfun('isempty', tok);
        part_of_relations = [part_of_relations, cellfun(@(x)x{1}, {tok{idx}})];
        %part_of_relations(cellfun(@isempty, part_of_relations)) = [0];
        % map. note the end+1 - here we create a new map row. Only once!
        %map{end+1,1} = GO_Terms;
        %map{end,  2} = is_a_relations;
        %map{end,  3} = part_of_relations;
    end
    % transpose to column vectors (no semicolon, so they are displayed)
    GO_Terms = GO_Terms'
    is_a_relations = is_a_relations'
    part_of_relations = part_of_relations'

The code produces the following results:

GO_Terms = 
      'GO:0008150'
      'GO:0016740'
      'GO:0016787'
      'GO:0006810'
      'GO:0006412'
      'GO:0004672'
      'GO:0016779'
      'GO:0004386'
      'GO:0003774'
      'GO:0016298'
      'GO:0016192'
      'GO:0005215'
      'GO:0030533'
is_a_relations = 
      'GO:0008150'
      'GO:0016740'
      'GO:0016787'
      'GO:0008150'
      'GO:0016740'
      'GO:0016740'
      'GO:0016787'
      'GO:0016787'
      'GO:0016787'
      'GO:0006810'
      'GO:0006412'
      'GO:0004672'
part_of_relations = 
      'GO:0008150'
      'GO:0008150'
      'GO:0006810'
      'GO:0016192'
      'GO:0006810'
      'GO:0005215'
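
One possible way to get equal-length vectors, offered as a sketch rather than a definitive answer: preallocate one slot per [Term] paragraph, filled with 0, and overwrite a slot only when the paragraph actually contains the value. The sample lines in s below are hypothetical stand-ins for the contents of gos.txt, and the names nTerms, patterns and hit are introduced only for this illustration. The sketch also assumes that keeping just the first match per paragraph is acceptable.

```matlab
% Hypothetical sample lines standing in for the contents of gos.txt.
s = { '[Term]'; 'id: GO:0008150'; ...
      '[Term]'; 'id: GO:0016740'; 'is_a: GO:0008150'; ...
      '[Term]'; 'id: GO:0006810'; 'relationship: part_of GO:0008150' };

% First line of every [Term] section, plus a sentinel one past the last line.
terms  = [find(~cellfun('isempty', regexp(s, '\[Term\]'))); numel(s) + 1];
nTerms = numel(terms) - 1;

% One regexp per output column: id, is_a, part_of.
patterns = {'^id: (GO:\w*)', ...
            '^is_a: (GO:\w*)', ...
            '^relationship: part_of (GO:\w*)'};

% One row per [Term]; every cell starts as 0, meaning "no value found".
map = repmat({0}, nTerms, numel(patterns));

for term = 1:nTerms
    % Lines belonging to this [Term] section only.
    s_term = s(terms(term):terms(term+1) - 1);
    for col = 1:numel(patterns)
        % With 'once', each cell of tok holds the tokens of that line's
        % first match, or is empty when the line does not match.
        tok = regexp(s_term, patterns{col}, 'tokens', 'once');
        hit = find(~cellfun('isempty', tok), 1);  % first matching line, if any
        if ~isempty(hit)
            map{term, col} = tok{hit}{1};         % replace the 0 placeholder
        end
    end
end

% Equal-length column vectors, one entry per [Term].
GO_Terms          = map(:, 1);
is_a_relations    = map(:, 2);
part_of_relations = map(:, 3);
```

With these sample lines, every vector has three entries, and the entries without a match stay 0. If a paragraph can hold several is_a or part_of lines and all of them matter, each slot could instead store a cell array of every match, at the cost of a nested structure.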