HMMER is a powerful bioinformatics tool used for searching sequence databases for homologs and for making sequence alignments. It is based on probabilistic models known as profile hidden Markov models (profile HMMs). These models are particularly effective in detecting remote homologs, which are sequences that have diverged significantly from their common ancestors but still retain functional or structural similarities.
The strength of HMMER lies in its underlying probability models, which allow it to detect homologs with high sensitivity. Historically, this computational power came with a significant cost in terms of processing time. However, with the release of HMMER3, the tool has become as fast as BLAST, a widely used program for sequence comparison.
HMMER can be used in conjunction with profile databases such as Pfam or those that participate in InterPro. It can also work with individual query sequences, similar to BLAST, using commands like phmmer for searching a protein query sequence against a database, or jackhmmer for iterative searches.
The tool is available for download and can be installed as a command-line tool on your own hardware. It is also accessible to the scientific community through search servers at the European Bioinformatics Institute, where users can search against the latest UniProt databases.
For more detailed information and guidance, the HMMER User's Guide is available in PDF format, and ongoing discussions about the tool can be found on the blog Cryptogenomicon.
To install HMMER, you will need to download the source code from the official HMMER website. The latest version available is v3.4, but archived older versions are also accessible. The installation process typically involves compiling the source code on your system, which requires a C compiler and possibly other development tools, depending on your operating system.
The installation instructions are detailed in the HMMER User's Guide, which provides step-by-step guidance for different platforms. It is important to follow these instructions carefully to ensure that the tool is installed correctly and all necessary dependencies are met.
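After installation, it can help to confirm that the HMMER programs are actually on your PATH before running any searches. The Python sketch below is one possible check. The helper name find_hmmer_programs is hypothetical, and the program list simply mirrors the commands discussed in this guide.

```python
import shutil

# Programs discussed in this guide; extend the tuple as needed.
HMMER_PROGRAMS = ("phmmer", "jackhmmer", "hmmbuild", "hmmsearch", "hmmscan")


def find_hmmer_programs(programs=HMMER_PROGRAMS):
    """Map each program name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in programs}


if __name__ == "__main__":
    for name, path in find_hmmer_programs().items():
        print(f"{name:10s} {path or 'NOT FOUND'}")
```

A None entry most likely means the install directory is not on your PATH or the build did not complete.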
Once HMMER is installed, you can begin using it to search sequence databases for homologs or to create sequence alignments. The tool comes with a variety of commands, each designed for specific tasks. For a quick start, you can use the phmmer command to search a protein sequence against a database or jackhmmer for iterative searches that can identify more distant homologs.
The basic syntax for using phmmer is as follows:
phmmer query.fasta database.fasta
This command takes the query sequence in query.fasta and searches it against the sequences in database.fasta. Both files are positional arguments: the query file comes first and the target database second. The results list potential homologs ranked by E-value, with the most significant hits first.
For jackhmmer, the syntax is similar:
jackhmmer query.fasta database.fasta
jackhmmer performs multiple rounds of searching. After each round it builds a profile from the significant hits and uses that profile as the query for the next round, which lets it find more distant homologs.
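If you drive these searches from a script, one option is to assemble the command as an argument list. The sketch below assumes the positional form given in the HMMER User's Guide (options first, then the query file, then the target database). The function name build_search_command and its parameters are hypothetical and exist only for illustration.

```python
def build_search_command(program, query, database, cpu=None, iterations=None, output=None):
    """Return the argument list for a phmmer or jackhmmer run.

    Options come first, followed by the two positional arguments:
    the query sequence file and the target sequence database.
    """
    if program not in ("phmmer", "jackhmmer"):
        raise ValueError(f"unsupported program: {program}")
    cmd = [program]
    if cpu is not None:
        cmd += ["--cpu", str(cpu)]  # number of worker threads
    if iterations is not None:
        if program != "jackhmmer":
            raise ValueError("-N only applies to jackhmmer")
        cmd += ["-N", str(iterations)]  # maximum number of iterations
    if output is not None:
        cmd += ["-o", str(output)]  # send the main output to a file
    return cmd + [str(query), str(database)]


if __name__ == "__main__":
    # File names are placeholders, not real data files.
    print(" ".join(build_search_command("phmmer", "query.fasta", "database.fasta")))
```

The returned list can be passed to subprocess.run(cmd, check=True) once HMMER is installed and the input files exist.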
Here are five popular commands used in HMMER and examples of how to use them:
phmmer: Search a protein sequence against a database.
phmmer --cpu 4 -o results.out query.fasta database.fasta
This command uses 4 parallel worker threads to search the query sequence in query.fasta against the database database.fasta, with the results written to results.out.
jackhmmer: Perform iterative searches to find distant homologs.
jackhmmer --cpu 4 -N 5 -o results.out query.fasta database.fasta
This command performs at most 5 iterations (-N 5) of searching using 4 parallel worker threads, stopping earlier if the search converges, and writes the results to results.out.
hmmbuild: Build a profile HMM from a multiple sequence alignment.
hmmbuild -n mymodel profile.hmm alignment.sto
This command creates a profile HMM named mymodel and saves it to profile.hmm, using the alignment in alignment.sto.
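hmmsearch: Search a sequence database with a profile HMM.
hmmsearch --tblout hits.table profile.hmm database.fasta
This command searches the database database.fasta with the profile HMM profile.hmm and writes a tabular, one-line-per-target summary of the hits to hits.table. The regular human-readable report is still printed to standard output unless you redirect it with -o.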
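hmmscan: Scan a sequence against a database of profile HMMs.
hmmscan --domtblout domains.table pfam_db.hmm query.fasta
This command scans the query sequence in query.fasta against a database of profile HMMs, pfam_db.hmm, and writes per-domain hits to domains.table. The HMM database must first be prepared with hmmpress (for example, hmmpress pfam_db.hmm), which creates the binary index files that hmmscan needs.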
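The files written by --tblout and --domtblout are whitespace-delimited text, so they are easy to read from a script. The sketch below is a minimal parser, assuming the column layout described in the HMMER User's Guide: 18 fixed columns for --tblout and 22 for --domtblout, each followed by a free-text description. The function name and the sample line are made up for illustration and are not real search output.

```python
# Number of fixed, whitespace-separated columns that precede the
# free-text description column in each format (assumed from the User's Guide).
FIXED_COLUMNS = {"tblout": 18, "domtblout": 22}


def parse_hmmer_table(lines, kind):
    """Yield one list of fields per hit from --tblout or --domtblout output.

    Comment lines start with '#'. The last field is the target description,
    which may contain spaces, so the split is limited to the fixed columns.
    """
    n_fixed = FIXED_COLUMNS[kind]
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, n_fixed)
        if len(fields) < n_fixed:
            raise ValueError(f"expected at least {n_fixed} columns: {line!r}")
        yield fields


# Made-up --tblout style line, for illustration only.
SAMPLE_TBLOUT = [
    "# target name  accession  query name  accession  E-value ...",
    "seq1 - mymodel - 1.2e-30 100.5 0.1 2.0e-30 99.8 0.1 1.0 1 0 0 1 1 1 1 hypothetical protein",
]

if __name__ == "__main__":
    for hit in parse_hmmer_table(SAMPLE_TBLOUT, "tblout"):
        # In --tblout, field 0 is the target name and field 4 the full-sequence E-value.
        print(hit[0], float(hit[4]), hit[18])
```

For real files, pass an open file handle, for example parse_hmmer_table(open("hits.table"), "tblout").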
These commands represent just a fraction of HMMER's capabilities, but they are among the most commonly used for sequence analysis tasks. Each command comes with a variety of options and flags that can be used to customize the analysis, and users are encouraged to consult the HMMER User's Guide for comprehensive documentation on each command.
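As one illustration of how these commands fit together, the sketch below formats a tiny alignment in Stockholm format (the format suggested by the alignment.sto file name) and assembles the hmmbuild and hmmsearch command lines from the examples above. The sequences are made up, and the helper names format_stockholm and build_and_search_commands are hypothetical.

```python
def format_stockholm(aligned):
    """Format aligned sequences (name -> gapped sequence) as Stockholm text."""
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("need at least one sequence, all of the same aligned length")
    width = max(len(name) for name in aligned)
    lines = ["# STOCKHOLM 1.0"]
    lines += [f"{name.ljust(width)}  {seq}" for name, seq in aligned.items()]
    lines.append("//")  # end-of-alignment marker
    return "\n".join(lines) + "\n"


def build_and_search_commands(alignment, hmm, database, name, tblout):
    """Return the hmmbuild and hmmsearch argument lists, in run order."""
    return [
        ["hmmbuild", "-n", name, str(hmm), str(alignment)],
        ["hmmsearch", "--tblout", str(tblout), str(hmm), str(database)],
    ]


if __name__ == "__main__":
    # Made-up toy alignment, for illustration only.
    print(format_stockholm({"seq1": "ACDE-", "seq22": "ACDEF"}), end="")
    for cmd in build_and_search_commands(
        "alignment.sto", "profile.hmm", "database.fasta", "mymodel", "hits.table"
    ):
        print(" ".join(cmd))
```

Writing the formatted text to alignment.sto and running the two commands in order would build the profile and then search with it, provided HMMER is installed and database.fasta exists.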
In conclusion, HMMER is a versatile and powerful tool for bioinformatics research, providing researchers with the ability to detect and analyze sequence homologs with high sensitivity and speed. Whether you are working with individual sequences or large databases, HMMER offers a range of commands to suit your research needs.