gethmmprof

Retrieve hidden Markov model (HMM) profile from PFAM database

Syntax

HMMStruct = gethmmprof(PFAMName)
HMMStruct = gethmmprof(PFAMNumber)

HMMStruct = gethmmprof(..., 'ToFile', ToFileValue, ...)
HMMStruct = gethmmprof(..., 'Mode', ModeValue, ...)
HMMStruct = gethmmprof(..., 'Mirror', MirrorValue, ...)

Input Arguments

PFAMName

String specifying a protein family name (unique identifier) of an HMM profile record in the PFAM database. For example, '7tm_2'.

PFAMNumber

Integer specifying a protein family number of an HMM profile record in the PFAM database. For example, 2 is the protein family number for the protein family with accession number 'PF00002'.

ToFileValue

String specifying a file name or a path and file name for saving the data. If you specify only a file name, the file is saved in the MATLAB® Current Folder.

ModeValue

String that specifies the returned alignment mode. Choices are:

  • 'ls' — Default. Global alignment mode.

  • 'fs' — Local alignment mode.

MirrorValue

String that specifies a Web database. Choices are:

  • 'Sanger' (default)

  • 'Janelia'

Output Arguments

HMMStruct

MATLAB structure containing information for an HMM profile retrieved from the PFAM database.

Description

    Note:   gethmmprof retrieves information from PFAM-HMM profiles, from file format version HMMER2.0 to HMMER3/b.

HMMStruct = gethmmprof(PFAMName) searches the PFAM database for the record represented by PFAMName (a protein family name), retrieves the HMM profile information, and stores it in HMMStruct, a MATLAB structure containing the following fields corresponding to parameters of an HMM profile.

Name

The protein family name (unique identifier) of the HMM profile record in the PFAM database.

PfamAccessionNumber

The protein family accession number of the HMM profile record in the PFAM database.

ModelDescription

Description of the HMM profile.

ModelLength

The length of the profile (number of MATCH states).

Alphabet

The alphabet used in the model, 'AA' or 'NT'.

    Note:   AlphaLength is 20 for 'AA' and 4 for 'NT'.

MatchEmission

Symbol emission probabilities in the MATCH states.

The format is a matrix of size ModelLength-by-AlphaLength, where each row corresponds to the emission distribution for a specific MATCH state.

InsertEmission

Symbol emission probabilities in the INSERT states.

The format is a matrix of size ModelLength-by-AlphaLength, where each row corresponds to the emission distribution for a specific INSERT state.

NullEmission

Symbol emission probabilities in the MATCH and INSERT states for the NULL model.

The format is a 1-by-AlphaLength row vector.

    Note:   NULL probabilities are also known as the background probabilities.

BeginX

BEGIN state transition probabilities.

Format is a 1-by-(ModelLength + 1) row vector:

[B->D1 B->M1 B->M2 B->M3 ... B->Mend]

MatchX

MATCH state transition probabilities.

Format is a 4-by-(ModelLength - 1) matrix:

[M1->M2 M2->M3 ... M[end-1]->Mend;
 M1->I1 M2->I2 ... M[end-1]->I[end-1];
 M1->D2 M2->D3 ... M[end-1]->Dend;
 M1->E  M2->E  ... M[end-1]->E  ]

InsertX

INSERT state transition probabilities.

Format is a 2-by-(ModelLength - 1) matrix:

[ I1->M2 I2->M3 ... I[end-1]->Mend;
  I1->I1 I2->I2 ... I[end-1]->I[end-1] ]

DeleteX

DELETE state transition probabilities.

Format is a 2-by-(ModelLength - 1) matrix:

[ D1->M2 D2->M3 ... D[end-1]->Mend ;
  D1->D2 D2->D3 ... D[end-1]->Dend ]

FlankingInsertX

Transition probabilities of the flanking insert states (N and C), used for local profile alignment.

Format is a 2-by-2 matrix:

[N->B  C->T ;
 N->N  C->C]

LoopX

Loop state transition probabilities, used for multiple-hit alignment.

Format is a 2-by-2 matrix:

[E->C  J->B ;
 E->J  J->J]

NullX

Null model transition probabilities, used so that state transitions can also be scored as log-odds values.

Format is a 2-by-1 column vector:

[G->F ; G->G]
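
As a rough illustration of how the emission fields relate to each other, the following sketch builds a small stand-in structure with made-up numbers (it does not contact the PFAM database) and derives per-state log-odds scores and a consensus string from it. The variable names toy, aaOrder, and dominant are hypothetical, and the residue order in aaOrder is an assumption about how the emission columns are arranged, not something this page states.

```matlab
% Toy stand-in for a gethmmprof result: 3 MATCH states, 'AA' alphabet.
% All numbers below are invented for illustration only.
aaOrder     = 'ARNDCQEGHILKMFPSTWYV';   % ASSUMED column order of the emission matrices
alphaLength = numel(aaOrder);           % 20 for 'AA'

toy.ModelLength  = 3;
toy.Alphabet     = 'AA';
toy.NullEmission = ones(1, alphaLength) / alphaLength;   % uniform background

% Give each MATCH state one dominant residue (C, W, G).
dominant = 'CWG';
toy.MatchEmission = zeros(toy.ModelLength, alphaLength);
for k = 1:toy.ModelLength
    row = repmat(0.01, 1, alphaLength);      % every residue starts at 0.01
    row(aaOrder == dominant(k)) = 0.81;      % dominant residue: 1 - 19*0.01
    toy.MatchEmission(k, :) = row;
end

% Each row is an emission distribution, so it should sum to 1.
rowSums = sum(toy.MatchEmission, 2);

% Log-odds (in bits) of each MATCH emission against the NULL (background) model.
background = repmat(toy.NullEmission, toy.ModelLength, 1);
logOdds    = log2(toy.MatchEmission ./ background);

% Most probable residue in each MATCH state, read off as a consensus string.
[~, idx]  = max(toy.MatchEmission, [], 2);
consensus = aaOrder(idx.');
disp(consensus)   % displays CWG
```

With a real HMMStruct, the same two steps would apply to its MatchEmission and NullEmission fields, provided the column order assumption holds.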

HMMStruct = gethmmprof(PFAMNumber) determines a protein family accession number from PFAMNumber (an integer), searches the PFAM database for the associated record, retrieves the HMM profile information, and stores it in HMMStruct, a MATLAB structure.
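
This page does not say how gethmmprof derives the accession number internally. Judging only from the pairing of 2 with 'PF00002' given above, the mapping appears to be a zero-padded, five-digit number after the prefix 'PF', which can be sketched as follows (pfamNumber and accession are illustrative variable names):

```matlab
% Build an accession-style string from a protein family number.
% The 'PF' prefix plus five zero-padded digits is inferred from the
% 2 <-> 'PF00002' example; it is an assumption, not documented behavior.
pfamNumber = 2;
accession  = sprintf('PF%05d', pfamNumber);
disp(accession)   % displays PF00002
```

The retrieved record can carry a version suffix as well, as in the PfamAccessionNumber value 'PF00002.14' in the example output on this page.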

HMMStruct = gethmmprof(..., 'PropertyName', PropertyValue, ...) calls gethmmprof with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:

HMMStruct = gethmmprof(..., 'ToFile', ToFileValue, ...) saves the data returned from the PFAM database in a file specified by ToFileValue.

    Note:   You can read an HMM-formatted file back into the MATLAB software using the pfamhmmread function.

HMMStruct = gethmmprof(..., 'Mode', ModeValue, ...) specifies the returned alignment mode. Choices are:

  • 'ls' (default) — Global alignment mode.

  • 'fs' — Local alignment mode.

HMMStruct = gethmmprof(..., 'Mirror', MirrorValue, ...) specifies a Web database. Choices are:

  • 'Sanger' (default)

  • 'Janelia'

You can reach other mirror sites by passing the complete URL to the pfamhmmread function.

    Note:   These mirror sites are maintained separately and may have slight variations.

For more information about the PFAM database, see:

http://pfam.sanger.ac.uk
http://pfam.janelia.org/

For more information on HMM profile models, see HMM Profile Model.

Examples

To retrieve a hidden Markov model (HMM) profile for the global alignment of the 7-transmembrane receptor protein in the secretin family, enter either of the following:

hmm = gethmmprof(2)

hmm = gethmmprof('7tm_2')

hmm = 

                   Name: '7tm_2'
    PfamAccessionNumber: 'PF00002.14'
       ModelDescription: [1x42 char]
            ModelLength: 296
               Alphabet: 'AA'
          MatchEmission: [296x20 double]
         InsertEmission: [296x20 double]
           NullEmission: [1x20 double]
                 BeginX: [297x1 double]
                 MatchX: [295x4 double]
                InsertX: [295x2 double]
                DeleteX: [295x2 double]
        FlankingInsertX: [2x2 double]
                  LoopX: [2x2 double]
                  NullX: [2x1 double]
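
The array sizes in this listing can be checked against ModelLength without downloading anything. The sketch below uses placeholder arrays of zeros with the sizes shown above. Because the listing shows the transition arrays transposed relative to the formats given in the Description section (for example, 297x1 for BeginX rather than 1-by-(ModelLength + 1)), it compares element counts and not orientation. The variable names checks and allConsistent are illustrative.

```matlab
% Placeholder arrays with the sizes from the listing (zeros, not real data).
hmm.ModelLength    = 296;
hmm.MatchEmission  = zeros(296, 20);
hmm.InsertEmission = zeros(296, 20);
hmm.NullEmission   = zeros(1, 20);
hmm.BeginX         = zeros(297, 1);
hmm.MatchX         = zeros(295, 4);
hmm.InsertX        = zeros(295, 2);
hmm.DeleteX        = zeros(295, 2);

L           = hmm.ModelLength;
alphaLength = numel(hmm.NullEmission);   % 20 for the 'AA' alphabet

% One logical entry per field; counts only, so either orientation passes.
checks = [ ...
    isequal(size(hmm.MatchEmission),  [L alphaLength]); ...
    isequal(size(hmm.InsertEmission), [L alphaLength]); ...
    numel(hmm.BeginX)  == L + 1; ...
    numel(hmm.MatchX)  == 4 * (L - 1); ...
    numel(hmm.InsertX) == 2 * (L - 1); ...
    numel(hmm.DeleteX) == 2 * (L - 1)];

allConsistent = all(checks);
```

Substituting a structure returned by gethmmprof for the placeholder hmm should let the same checks run on real data.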