Remove intermittent text when reading in a table from a .dat file
L. Borealis
on 19 Feb 2021
Commented: L. Borealis
on 25 Feb 2021
Hi,
I am trying to use readtable to read in a .dat file. The file looks like this, where each block can hold anywhere from one to very many data rows (the rows that start with a "1" here).
# NetMHCIIpan version 4.0
# Input is in PEPTIDE format
# Prediction Mode: EL+BA
# Threshold for Strong binding peptides (%Rank) 2%
# Threshold for Weak binding peptides (%Rank) 10%
# Allele: HLA-DPA10103-DPB10101
--------------------------------------------------------------------------------------------------------------------------------------------
Pos MHC Peptide Of Core Core_Rel Identity Score_EL %Rank_EL Exp_Bind Score_BA Affinity(nM) %Rank_BA BindLevel
--------------------------------------------------------------------------------------------------------------------------------------------
1 HLA-DPA10103-DPB10101 AAAAAAAAAAAAAAA 3 AAAAAAAAA 0.380 Sequence 0.020745 81.44 NA 0.366182 951.24 32.45
--------------------------------------------------------------------------------------------------------------------------------------------
Number of strong binders: 2 Number of weak binders: 0
--------------------------------------------------------------------------------------------------------------------------------------------
# Allele: HLA-DPA10103-DPB10201
--------------------------------------------------------------------------------------------------------------------------------------------
Pos MHC Peptide Of Core Core_Rel Identity Score_EL %Rank_EL Exp_Bind Score_BA Affinity(nM) %Rank_BA BindLevel
--------------------------------------------------------------------------------------------------------------------------------------------
1 HLA-DPA10103-DPB10201 BBBBBBBBBBBBBBBB 2 BBBBBBBBB 0.960 Sequence 0.491911 1.02 NA 0.712020 22.55 0.27 <=SB
--------------------------------------------------------------------------------------------------------------------------------------------
Number of strong binders: 2 Number of weak binders: 0
--------------------------------------------------------------------------------------------------------------------------------------------
# Allele: HLA-DPA10103-DPB10202
--------------------------------------------------------------------------------------------------------------------------------------------
Pos MHC Peptide Of Core Core_Rel Identity Score_EL %Rank_EL Exp_Bind Score_BA Affinity(nM) %Rank_BA BindLevel
--------------------------------------------------------------------------------------------------------------------------------------------
1[.......]
Further rows would then start with 2, 3, 4, [...]. I successfully use
opts = detectImportOptions('filename.dat');
opts.DataLines = [16 Inf];
opts.VariableNamesLine = 14;
readtable(fullfile('path','filename.dat'), opts, 'ReadVariableNames', true);
for files with a large number of rows between the dashed lines, e.g.
# Allele: HLA-DPA10103-DPB10101
--------------------------------------------------------------------------------------------------------------------------------------------
Pos MHC Peptide Of Core Core_Rel Identity Score_EL %Rank_EL Exp_Bind Score_BA Affinity(nM) %Rank_BA BindLevel
--------------------------------------------------------------------------------------------------------------------------------------------
1 HLA-DPA10103-DPB10101 AAAAAAAAAAAAAAA 3 AAAAAAAAA 0.380 Sequence 0.020745 81.44 NA 0.366182 951.24 32.45
2 HLA-....
3 ....
....
....
50 HLA....
--------------------------------------------------------------------------------------------------------------------------------------------
Number of strong binders: 2 Number of weak binders: 0
--------------------------------------------------------------------------------------------------------------------------------------------
However, this does not work for short "fillings" and my code very much depends on being robust in either scenario.
I tried playing with the opts but did not get it to work. I would be very grateful for any advice! Maybe a method other than readtable (readtext?) is needed and then a conversion to a table? In the end I will need a table like this:
Thank you very much for your advice! I have spent a long time developing the code around this, and this is the final part that keeps breaking...
Accepted Answer
Vimal Rathod
on 22 Feb 2021
Hi,
Please refer to the following similar question which could be helpful to you.
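As a rough illustration of the "read as text, then convert to a table" idea raised in the question, here is a minimal sketch that does not depend on fixed line numbers. It keeps only lines that begin with an integer position followed by a second token, so it should behave the same for blocks with one row or fifty. Everything here is an assumption for illustration: parseBindingRows is a hypothetical helper name, the column names Rank_EL, Affinity_nM and Rank_BA are renamed from %Rank_EL, Affinity(nM) and %Rank_BA to be valid MATLAB identifiers, and the code assumes no field contains internal whitespace and that BindLevel is the only column that may be missing.

```matlab
function T = parseBindingRows(txt)
% parseBindingRows  Hypothetical helper: build a table from NetMHCIIpan-style
% text by filtering lines by content instead of by line number.
%   txt is the whole file as one char vector, e.g. from fileread.

    % Split the raw text into lines (handles both \n and \r\n endings).
    lines = regexp(txt, '\r?\n', 'split');

    % Data rows start with an integer followed by whitespace and another
    % token. Comment lines (#), dashed rules, the "Pos MHC ..." header and
    % the "Number of strong binders" summary do not match this pattern.
    isData = ~cellfun(@isempty, regexp(lines, '^\s*\d+\s+\S+', 'once'));
    rows = lines(isData);

    % Assumed column names (renamed to be valid MATLAB identifiers).
    names = {'Pos','MHC','Peptide','Of','Core','Core_Rel','Identity', ...
             'Score_EL','Rank_EL','Exp_Bind','Score_BA','Affinity_nM', ...
             'Rank_BA','BindLevel'};

    % Pre-fill with empty text so a missing trailing BindLevel stays ''.
    C = repmat({''}, numel(rows), numel(names));
    for k = 1:numel(rows)
        tok = strsplit(strtrim(rows{k}));      % split on runs of whitespace
        m = min(numel(tok), numel(names));     % tokens beyond 14 are dropped
        C(k, 1:m) = tok(1:m);
    end
    T = cell2table(C, 'VariableNames', names);

    % Convert the numeric columns; text such as 'NA' becomes NaN.
    numCols = {'Pos','Of','Core_Rel','Score_EL','Rank_EL','Exp_Bind', ...
               'Score_BA','Affinity_nM','Rank_BA'};
    for c = numCols
        T.(c{1}) = str2double(T.(c{1}));
    end
end
```

With the function saved as parseBindingRows.m, something like T = parseBindingRows(fileread(fullfile('path','filename.dat'))) should return one row per prediction across all allele blocks. The MHC column already carries the allele name, so the separate "# Allele:" lines are not needed to tell the blocks apart.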