ehmmalign
Usage:
ehmmalign [options] hmmfile seqfile outfile
The outfile parameter is new to EMBASSY HMMER. The multiple sequence alignment is always written to outfile. The name of outfile is specified by the -o option as normal.
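hmmalign reads an HMM profile from hmmfile and a set of sequences from seqfile, aligns the sequences to the profile, and writes a multiple sequence alignment.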
Nearly all options documented as "expert" in the original hmmer user guide are given in ACD as "advanced" options (-options must be specified on the command line in order to be prompted for a value for them).
ehmmalign reads any normal sequence USA.
Please read the 'Notes' section below for a description of the differences between the original and EMBASSY HMMER, particularly which application command line options are supported.
Please report all bugs to the EMBOSS bug team (emboss-bug@emboss.open-bio.org), not to the original author.
Jon Ison
This program is an EMBASSY wrapper to a program written by Sean Eddy as part of his hmmer package.
Please report any bugs to the EMBOSS bug team in the first instance, not to Sean Eddy.
Algorithm
Please read the Userguide.pdf distributed with the original HMMER and included in the EMBASSY HMMER distribution under the DOCS directory.
Usage
Here is a sample session with ehmmalign
% ehmmalign ../ehmmcalibrate-ex-keep/globino.hmm globins630.fa globins630.ali
Align sequences to an HMM profile
hmmalign - align sequences to an HMM profile
HMMER 2.3.2 (Oct 2003)
Copyright (C) 1992-2003 HHMI/Washington University School of Medicine
Freely distributed under the GNU General Public License (GPL)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: ../ehmmcalibrate-ex-keep/globino.hmm
Sequence file: ./ehmmalign-1234567890.1234
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Alignment saved in file globins630.ali
/shared/software/bin/hmmalign --informat FASTA --outformat A2M -o globins630.ali ../ehmmcalibrate-ex-keep/globino.hmm ./ehmmalign-1234567890.1234
Command line arguments
Where possible, the same command-line qualifier names and parameter order are used as in the original hmmer. There are, however, several unavoidable differences, and these are documented in the "Notes" section below.
Align sequences to an HMM profile
Version: EMBOSS:6.4.0.0
Standard (Mandatory) qualifiers:
[-hmmfile] infile File containing a HMM profile
[-seqfile] seqset File containing a (set of) sequence(s)
[-o] align [*.ehmmalign] Multiple sequence alignment
output file. (default -aformat fasta)
Additional (Optional) qualifiers:
-m boolean [N] Include in the alignment only those
symbols aligned to match states. Do not show
symbols assigned to insert states.
-q boolean [N] Quiet; suppress all output except the
alignment itself. Useful for piping or
redirecting the output.
Advanced (Unprompted) qualifiers:
-mapali infile Reads an alignment from file and aligns it
as a single object to the HMM; e.g. the
alignment in file is held fixed. This allows
you to align sequences to a model with
hmmalign and view them in the context of an
existing trusted multiple alignment. The
alignment to the alignment is defined by a
'map' kept in the HMM, and so is fast and
guaranteed to be consistent with the way the
HMM was constructed from the alignment. The
alignment in the file must be exactly the
alignment that the HMM was built from.
Compare the -withali option.
-withali infile Reads an alignment from file and aligns it
as a single object to the HMM; e.g. the
alignment in file is held fixed. This allows
you to align sequences to a model with
hmmalign and view them in the context of an
existing trusted multiple alignment. The
alignment to the alignment is done with a
heuristic (nonoptimal) dynamic programming
procedure, which may be somewhat slow and is
not guaranteed to be completely consistent
with the way the HMM was constructed (though
it should be quite close). However, any
alignment can be used, not just the
alignment that the HMM was built from.
Compare the -mapali option.
Associated qualifiers:
"-seqfile" associated qualifiers
-sbegin2 integer Start of each sequence to be used
-send2 integer End of each sequence to be used
-sreverse2 boolean Reverse (if DNA)
-sask2 boolean Ask for begin/end/reverse
-snucleotide2 boolean Sequence is nucleotide
-sprotein2 boolean Sequence is protein
-slower2 boolean Make lower case
-supper2 boolean Make upper case
-sformat2 string Input sequence format
-sdbname2 string Database name
-sid2 string Entryname
-ufo2 string UFO features
-fformat2 string Features format
-fopenfile2 string Features file name
"-o" associated qualifiers
-aformat3 string Alignment format
-aextension3 string File name extension
-adirectory3 string Output directory
-aname3 string Base file name
-awidth3 integer Alignment width
-aaccshow3 boolean Show accession number in the header
-adesshow3 boolean Show description in the header
-ausashow3 boolean Show the full USA in the alignment
-aglobal3 boolean Show the full sequence in alignment
General qualifiers:
-auto boolean Turn off prompts
-stdout boolean Write first file to standard output
-filter boolean Read first file from standard input, write
first file to standard output
-options boolean Prompt for standard and additional values
-debug boolean Write debug output to program.dbg
-verbose boolean Report some/full command line options
-help boolean Report command line options and exit. More
information on associated and general
qualifiers can be found with -help -verbose
-warning boolean Report warnings
-error boolean Report errors
-fatal boolean Report fatal errors
-die boolean Report dying program messages
-version boolean Report version number and exit
Qualifier | Type | Description | Allowed values | Default

Standard (Mandatory) qualifiers
[-hmmfile] (Parameter 1) | infile | File containing a HMM profile | Input file | Required
[-seqfile] (Parameter 2) | seqset | File containing a (set of) sequence(s) | Readable set of sequences | Required
[-o] (Parameter 3) | align | Multiple sequence alignment output file. (default -aformat fasta) | | <*>.ehmmalign

Additional (Optional) qualifiers
-m | boolean | Include in the alignment only those symbols aligned to match states. Do not show symbols assigned to insert states. | Boolean value Yes/No | No
-q | boolean | Quiet; suppress all output except the alignment itself. Useful for piping or redirecting the output. | Boolean value Yes/No | No

Advanced (Unprompted) qualifiers
-mapali | infile | Reads an alignment from file and aligns it as a single object to the HMM; e.g. the alignment in file is held fixed. This allows you to align sequences to a model with hmmalign and view them in the context of an existing trusted multiple alignment. The alignment to the alignment is defined by a 'map' kept in the HMM, and so is fast and guaranteed to be consistent with the way the HMM was constructed from the alignment. The alignment in the file must be exactly the alignment that the HMM was built from. Compare the -withali option. | Input file | Required
-withali | infile | Reads an alignment from file and aligns it as a single object to the HMM; e.g. the alignment in file is held fixed. This allows you to align sequences to a model with hmmalign and view them in the context of an existing trusted multiple alignment. The alignment to the alignment is done with a heuristic (nonoptimal) dynamic programming procedure, which may be somewhat slow and is not guaranteed to be completely consistent with the way the HMM was constructed (though it should be quite close). However, any alignment can be used, not just the alignment that the HMM was built from. Compare the -mapali option. | Input file | Required

Associated qualifiers
"-seqfile" associated seqset qualifiers
-sbegin2, -sbegin_seqfile | integer | Start of each sequence to be used | Any integer value | 0
-send2, -send_seqfile | integer | End of each sequence to be used | Any integer value | 0
-sreverse2, -sreverse_seqfile | boolean | Reverse (if DNA) | Boolean value Yes/No | N
-sask2, -sask_seqfile | boolean | Ask for begin/end/reverse | Boolean value Yes/No | N
-snucleotide2, -snucleotide_seqfile | boolean | Sequence is nucleotide | Boolean value Yes/No | N
-sprotein2, -sprotein_seqfile | boolean | Sequence is protein | Boolean value Yes/No | N
-slower2, -slower_seqfile | boolean | Make lower case | Boolean value Yes/No | N
-supper2, -supper_seqfile | boolean | Make upper case | Boolean value Yes/No | N
-sformat2, -sformat_seqfile | string | Input sequence format | Any string |
-sdbname2, -sdbname_seqfile | string | Database name | Any string |
-sid2, -sid_seqfile | string | Entryname | Any string |
-ufo2, -ufo_seqfile | string | UFO features | Any string |
-fformat2, -fformat_seqfile | string | Features format | Any string |
-fopenfile2, -fopenfile_seqfile | string | Features file name | Any string |
"-o" associated align qualifiers
-aformat3, -aformat_o | string | Alignment format | Any string | fasta
-aextension3, -aextension_o | string | File name extension | Any string |
-adirectory3, -adirectory_o | string | Output directory | Any string |
-aname3, -aname_o | string | Base file name | Any string |
-awidth3, -awidth_o | integer | Alignment width | Any integer value | 0
-aaccshow3, -aaccshow_o | boolean | Show accession number in the header | Boolean value Yes/No | N
-adesshow3, -adesshow_o | boolean | Show description in the header | Boolean value Yes/No | N
-ausashow3, -ausashow_o | boolean | Show the full USA in the alignment | Boolean value Yes/No | N
-aglobal3, -aglobal_o | boolean | Show the full sequence in alignment | Boolean value Yes/No | N

General qualifiers
-auto | boolean | Turn off prompts | Boolean value Yes/No | N
-stdout | boolean | Write first file to standard output | Boolean value Yes/No | N
-filter | boolean | Read first file from standard input, write first file to standard output | Boolean value Yes/No | N
-options | boolean | Prompt for standard and additional values | Boolean value Yes/No | N
-debug | boolean | Write debug output to program.dbg | Boolean value Yes/No | N
-verbose | boolean | Report some/full command line options | Boolean value Yes/No | Y
-help | boolean | Report command line options and exit. More information on associated and general qualifiers can be found with -help -verbose | Boolean value Yes/No | N
-warning | boolean | Report warnings | Boolean value Yes/No | Y
-error | boolean | Report errors | Boolean value Yes/No | Y
-fatal | boolean | Report fatal errors | Boolean value Yes/No | Y
-die | boolean | Report dying program messages | Boolean value Yes/No | Y
-version | boolean | Report version number and exit | Boolean value Yes/No | N
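As a rough illustration of how the qualifiers listed above combine on a command line, the following Python sketch assembles an argument list for ehmmalign. The helper name build_ehmmalign_args and the file names are hypothetical. Only qualifiers documented above are used, and the sketch assumes that a boolean qualifier is switched on simply by naming it.

```python
from typing import List, Optional


def build_ehmmalign_args(
    hmmfile: str,
    seqfile: str,
    outfile: str,
    match_only: bool = False,
    quiet: bool = False,
    mapali: Optional[str] = None,
    withali: Optional[str] = None,
    auto: bool = True,
) -> List[str]:
    """Assemble an ehmmalign argument list (hypothetical helper)."""
    # The three standard parameters are positional, in this order.
    args = ["ehmmalign", hmmfile, seqfile, outfile]
    if match_only:
        args.append("-m")  # keep only symbols aligned to match states
    if quiet:
        args.append("-q")  # suppress everything except the alignment
    if mapali is not None:
        args += ["-mapali", mapali]
    if withali is not None:
        args += ["-withali", withali]
    if auto:
        args.append("-auto")  # turn off interactive prompts
    return args


example_args = build_ehmmalign_args(
    "globino.hmm", "globins630.fa", "globins630.ali", match_only=True
)
print(" ".join(example_args))
```

The resulting list can be passed to subprocess.run on a system where ehmmalign is installed; the sketch itself only builds the list and does not run anything.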
Input file format
Alignment and sequence formats
Input and output of alignments and sequences is limited to the formats that the original hmmer supports. These include Stockholm, SELEX, MSF, Clustal, Phylip and A2M/aligned FASTA (alignments) and FASTA, GENBANK, EMBL, GCG and PIR (sequences). It would be fairly straightforward to adapt the code to support all EMBOSS-supported formats.
Compressed input files
Automatic processing of gzipped files is not supported.
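Because gzipped input is not read automatically, a compressed sequence file has to be unpacked before it is handed to ehmmalign. A minimal Python sketch of that step, using only the standard library, might look like this. The function name gunzip_to_plain and the file names in the usage note are hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_to_plain(gz_path: str, plain_path: str) -> Path:
    """Decompress gz_path into plain_path and return the new path."""
    # Stream the data so that large sequence files are not held in memory.
    with gzip.open(gz_path, "rb") as src, open(plain_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return Path(plain_path)
```

For example, gunzip_to_plain("globins630.fa.gz", "globins630.fa") would produce a plain file that can then be given as seqfile.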
Input files for usage example
File: ../ehmmcalibrate-ex-keep/globino.hmm
HMMER2.0 [2.3.2]
NAME globins50
LENG 143
ALPH Amino
RF no
CS no
MAP yes
COM /shared/software/bin/hmmbuild -n globins50 --pbswitch 1000 --archpri 0.850000 --idlevel 0.620000 --swentry 0.500000 --swexit 0.500000 --wgsc -A -F globin.hmm ../../data/hmmnew/globins50.msf
COM /shared/software/bin/hmmcalibrate --mean 350.000000 --num 5000 --sd 350.000000 --seed 1 ../ehmmbuild-ex-keep/globin.hmm
NSEQ 50
DATE Fri Jul 15 12:00:00 2011
CKSUM 9858
XT -8455 -4 -1000 -1000 -8455 -4 -8455 -4
NULT -4 -8455
NULE 595 -1558 85 338 -294 453 -1158 197 249 902 -1085 -142 -21 -313 45 531 201 384 -1998 -644
EVD -35.959286 0.267496
HMM A C D E F G H I K L M N P Q R S T V W Y
m->m m->i m->d i->m i->i d->m d->d b->m m->e
-450 * -1900
1 591 -1587 159 1351 -1874 -201 151 -1600 998 -1591 -693 389 -1272 595 42 -31 27 -693 -1797 -1134 14
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 -450 *
2 -926 -2616 2221 2269 -2845 -1178 -325 -2678 -300 -2596 -1810 220 -1592 939 -974 -671 -939 -2204 -2785 -1925 15
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
3 -638 -1715 -680 497 -2043 -1540 23 -1671 2380 -1641 -840 -222 -1595 437 1040 -564 -523 -1363 2124 -1313 16
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
4 829 -1571 -37 660 -1856 -873 152 -1578 894 -1573 -678 769 -1273 1284 58 224 447 -1175 -1782 -1125 17
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
5 369 -433 -475 286 -974 -1312 -19 -412 664 398 406 1030 -1394 388 -214 -261 85 -166 -1227 -725 18
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
6 -1291 -884 -3696 -3261 -1137 -3425 -2802 2322 -3066 111 19 -3028 -3275 -2855 -3100 -2670 -1269 2738 -2450 -2062 19
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
7 157 -413 -236 316 -1387 -1231 89 -863 1084 -431 -348 910 -1319 635 297 15 704 -483 -1497 -922 20
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
8 770 -1431 -43 459 -1751 -340 78 -1449 440 -1497 -631 866 -1302 825 -51 953 364 -1076 -1750 -1121 21
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
9 420 -186 -2172 -1577 8 -1818 -694 1477 -1281 760 614 -1299 -1867 -1001 -1262 -189 -12 1401 -722 -364 22
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
10 -961 -879 -2277 -1821 1366 -2213 -204 -399 -1500 -130 -39 -1427 -2266 -1186 -1511 -159 -913 -367 4721 1177 23
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
11 -48 -1782 809 844 -2073 1456 8 -1811 315 -1803 -932 180 -1365 921 -218 173 -115 -1399 -2018 -1327 24
[Part of this file has been deleted for brevity]
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
128 -415 -1926 1575 1399 -2219 -1163 17 -1983 527 -1929 -1039 341 -1367 1597 -212 257 -222 -1536 -2109 -1387 144
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
129 -529 -1434 -629 -143 -1926 -626 -171 -1460 2679 -1597 -839 -309 -1599 207 317 -530 -510 -130 -1840 -1369 145
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
130 811 -397 -2389 -1807 1883 -2039 -907 594 -1512 1077 687 -1532 -2065 -1201 -1483 -1125 -465 1067 -843 -472 146
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
131 -241 -102 -2327 -1710 724 -1767 -616 650 -1363 1074 1765 -718 -1809 -1026 -1252 -842 -181 1331 -541 695 147
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
132 723 95 385 823 -1820 -1168 167 -1540 875 -1362 -644 320 -1261 810 246 693 -67 -1141 -1753 -1098 148
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
133 551 -430 -1049 -481 -442 469 -241 465 -313 133 947 -411 -1543 197 -587 -146 202 522 -843 -429 149
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
134 -1086 -777 -3351 -2800 816 -2898 -1861 1501 -2515 1149 586 -2483 -2775 -2108 -2400 -2046 -1030 2380 -1511 -1216 150
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
135 1393 1409 -876 -345 -997 -525 -315 -590 -198 -847 -109 -420 -1441 -97 412 766 -130 139 -1306 -858 151
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
136 98 -1299 36 365 -1495 -1211 1241 -404 523 -952 -426 1174 -1303 511 -18 347 882 -853 -1566 -970 152
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
137 1308 -787 564 -132 -966 -1332 -203 -362 -49 -395 -57 -305 -1481 49 -437 -190 -182 1020 -1282 -802 153
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
138 -1746 -1358 -3897 -3341 -216 -3621 -2478 1774 -3040 2442 1157 -3189 -3229 -2422 -2853 -2824 -1659 392 -1720 -1647 154
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
139 1176 -1289 -179 534 -1606 -607 34 -1278 734 -1372 -534 44 -1325 433 -89 521 826 -941 -1666 -1072 155
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6528 -7571 -894 -1115 -701 -1378 * *
140 602 -1500 -135 850 -1753 -1214 1951 -1452 838 -1484 431 118 -1306 555 347 489 -153 -1085 -1723 -1092 156
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -22 -6602 -7644 -894 -1115 -701 -1378 * *
141 351 -1646 -165 546 -1976 -498 46 -1667 2193 -1662 -798 35 -1405 476 311 -73 -306 -1287 -1859 -1254 157
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -23 -6561 -7603 -894 -1115 -701 -1378 * *
142 -1995 -1606 -3095 -2870 1739 -3015 -98 -1012 -2520 -730 655 -1990 -2962 -1884 -2326 -2167 -1915 -1128 548 4089 158
- -149 -500 233 43 -381 399 106 -626 210 -466 -720 275 394 45 96 359 117 -369 -294 -249
- -25 -6455 -7497 -894 -1115 -701 -1378 * *
143 -253 -1373 -267 301 -911 -565 1956 -450 1188 -1330 -497 33 -1352 502 1358 -205 -184 -941 -1604 -1026 159
- * * * * * * * * * * * * * * * * * * * *
- * * * * * * * * 0
//
File: globins630.fa
> BAHG_VITSP
MLDQQTINIIKATVPVLKEHGVTITTTFYKNLFAKHPEVRPLFDMGRQESLEQPKALAM
TVLAAAQNIENLPAILPAVKKIAVKHCQAGVAAAHYPIVGQELLGAIKEVLGDAATDDIL
DAWGKAYGVIADVfiqveadLYAQAVE
> GLB1_ANABR
PSVQGAAAQLTADVKKDLRDSWKVIGSDKKGNGVALMTTLFADNQETIGYFKRLGNVSQ
GMANDKLRGHSITLMYALQNFIDQLDNTDDLVCVVEKFAVNHITRKISAAEFGKINGPIK
KVLASKNFGDKYANAWAKLVAVVQAAL
> GLB1_ARTSX
ERVDPITGLSGLEKNAILDTWGKVRGNLQEVGKATFGKLFAAHPEYQQMFRFFQGVQLA
FLVQSPKFAAHTQRVVSALDQTLLALNRPSDQFVYMIKELGLDHINRGTDRSFVEYLKES
LGDSVDEFTVQSFGEVIVNFLNEGLRQA
> GLB1_CALSO
VSANDIKNVQDTWGKLYDQWDAVHAsKFYNKLFKDSEDISEAFVKAGTGSGIAMKRQAL
VFGAILQEFVANLNDPTALTLKIKGLCATHKTRGITNMELFAFALADLVAYMGTtISFTA
AQKASWTAVNDVILHQMSSYFATVA
> GLB1_CHITH
GPSGDQIAAAKASWNTVKNNQVDILYAVFKANPDIQTAFSQFAGKDLDSIKGTPDFSKH
AGRVVGLFSEVMDLLGNDANTPTILAKAKDFGKSHKSRASPAQLDNFRKSLVVYLKGATK
WDSAVESSWAPVLDFVFSTLKNEL
> GLB1_GLYDI
GLSAAQRQVIAATWKDIAGADNGAGVGKDCLIKFLSAHPQMAAVFGFSGASDPGVAALG
AKVLAQIGVAVSHLGDEGKMVAQMKAVGVRHKGYGNKHIKAQYFEPLGASLLSAMEHRIG
GKMNAAAKDAWAAAYADISGALISGLQS
> GLB1_LUMTE
ECLVTEGLKVKLQWASAFGHAHQRVAFGLELwkgILREHPEIKAPFSRVRGDNIYSPQF
GAHSQRVLSGLDITISMLDTPDmLAAQLAHLKVQHVERNLKPEFFDIFLKHLLHVLGDRL
GTHFDFGAWHDCVDQIIDGIKDI
> GLB1_MORMR
PIVDSGSVSPLSDAEKNKIRAAWDLVYKDYEKTGVDILVKFFTGTPAAQAFFPKFKGLT
TADDLKQSSDVRWHAERIINAVNDAVKSMDDTEKMSMKLKELSIKHAQSFYVDRQYFKVL
AGIIADTTAPGDAGFEKLMSMICILLSSAY
> GLB1_PARCH
GGTLAIQSHGDLTLAQKKIVRKTWHQLMRNKTSFVTDLFIRIFAYDPAAQNKFPQMAGM
SASQLRSSRQMQAHAIRVSSIMSEYIEELDSDILPELLATLARTHDLNKVGPAHYDLFAK
VLMEALQAELGSDFNQKTRDSWAKAFSIVQAVLLVKHG
> GLB1_PETMA
PIVDSGSVPALTAAEKATIRTAWAPVYAKYQSTGVDILIKFFTSNPAAQAFFPKFQGLT
SADQLKKSMDVRWHAERIINAVNDAVVAMDDTEKMSLKLRELSGKHAKSFQVDPQYFKVL
AAVIVDTVLPGDAGLEKLMSMICILLRSSY
> GLB1_PHESE
DCNTLKRFKVKHQWQQVFSGEhHRTEFSLHFWKEFLHDHPDLVSLFKRVQGENIYSPEF
QAHGIRVLAGLDSVIGVLDEDDTFTVQLAHLKAQHTERGTKPEYFDLFGTQLFDILGDKL
GTHFDQAAWRDCYAVIAAGIKP
> GLB1_SCAIN
PSVYDAAAQLTADVKKDLRDSWKVIGSDKKGNGVALMTTLFADNQETIGYFKRLGNVSQ
GMANDKLRGHSITLMYALQNFIDQLDNPDDLVCVVEKFAVNHITRKISAAEFGKINGPIK
KVLASKNFGDKYANAWAKLVAVVQAAL
> GLB1_TYLHE
TDCGILQRIKVKQQWAQVYSVGESRTDFAIDVFNNFFRTNPDRSLFNRVNGDNVYSPEF
[Part of this file has been deleted for brevity]
GLSDAEWQLVLNVWGKVEADLAGHGQEVLIRLFHTHPETLEKFDKFKHLKSEDEMKASE
DLKKHGNTVLTALGAILKKKGHHEAEIKPLAQSHATKHKIPVKYLEFISEAIIHVLHSKH
PGDFGADAQAAMSKALELFRNDIAAQYKELGFQG
> MYG_ROUAE
GLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASE
DLKKHGATVLTALGGILKKKGQHEAQLKPLAQSHATKHKIPVKYLEFISEVIIQVLQSKH
PGDFGADAQGAMGKALELFRNDIAAKYKELGFQG
> MYG_SAISC
GLSDGEWQLVLNIWGKVEADIPSHGQEVLISLFKGHPETLEKFDKFKHLKSEDEMKASE
ELKKHGTTVLTALGGILKKKGQHEAELKPLAQSHATKHKIPVKYLELISDAIVHVLQKKH
PGDFGADAQGAMKKALELFRNDMAAKYKELGFQG
> MYG_SHEEP
GLSDGEWQLVLNAWGKVEADVAGHGQEVLIRLFTGHPETLEKFDKFKHLKTEAEMKASE
DLKKHGNTVLTALGGILKKKGHHEAEVKHLAESHANKHKIPVKYLEFISDAIIHVLHAKH
PSNFGADAQGAMSKALELFRNDMAAEYKVLGFQG
> MYG_SPAEH
GLSDGEWQLVLNVWGKVEGDLAGHGQEVLIKLFKNHPETLEKFDKFKHLKSEDEMKGSE
DLKKHGNTVLTALGGILKKKGQHAAEIQPLAQSHATKHKIPIKYLEFISEAIIQVLQSKH
PGDFGADAQGAMSKALELFRNDIAAKYKELGFQG
> MYG_TACAC
GLSDGEWQLVLKVWGKVETDITGHGQDVLIRLFKTHPETLEKFDKFKHLKTEDEMKASA
DLKKHGGVVLTALGSILKKKGQHEAELKPLAQSHATKHKISIKFLEFISEAIIHVLQSKH
SADFGADAQAAMGKALELFRNDMATKYKEFGFQG
> MYG_THUAL
ADFDAVLKCWGPVEADYTTMGGLVLTRLFKEHPETQKLFPKFAGIAQADIAGNAAISAH
GATVLKKLGELLKAKGSHAAILKPLANSHATKHKIPINNFKLISEVLVKVMHEKAGLDAG
GQTALRNVMGIIIADLEANYKELGFSG
> MYG_TUPGL
GLSDGEWQLVLNVWGKVEADVAGHGQEVLIRLFKGHPETLEKFDKFKHLKTEDEMKASE
DLKKHGNTVLSALGGILKKKGQHEAEIKPLAQSHATKHKIPVKYLEFISEAIIQVLQSKH
PGDFGADAQAAMSKALELFRNDIAAKYKELGFQG
> MYG_TURTR
GLSDGEWQLVLNVWGKVEADLAGHGQDVLIRLFKGHPETLEKFDKFKHLKTEADMKASE
DLKKHGNTVLTALGAILKKKGHHDAELKPLAQSHATKHKIPIKYLEFISEAIIHVLHSRH
PAEFGADAQGAMNKALELFRKDIAAKYKELGFHG
> MYG_VARVA
GLSDEEWKKVVDIWGKVEPDLPSHGQEVIIRMFQNHPETQDRFAKFKNLKTLDEMKNSE
DLKKHGTTVLTALGRILKQKGHHEAEIAPLAQTHANTHKIPIKYLEFICEVIVGVIAEKH
SADFGADSQEAMRKALELFRNDMASRYKELGFQG
> MYG_VULCH
GLSDGEWQLVLNIWGKVETDLAGHGQEVLIRLFKNHPETLDKFDKFKHLKTEDEMKGSE
DLKKHGNTVLTALGGILKKKGHHEAELKPLAQSHATKHKIPVKYLEFISDAIIQVLQSKH
SGDFHADTEAAMKKALELFRNDIAAKYKELGFQG
> MYG_ZALCA
GLSDGEWQLVLNIWGKVEADLVGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKRSE
DLKKHGKTVLTALGGILKKKGHHDAELKPLAQSHATKHKIPIKYLEFISEAIIHVLQSKH
PGDFGADTHAAMKKALELFRNDIAAKYRELGFQG
> MYG_ZIPCA
GLSEAEWQLVLHVWAKVEADLSGHGQEILIRLFKGHPETLEKFDKFKHLKSEAEMKASE
DLKKHGHTVLTALGGILKKKGHHEAELKPLAQSHATKHKIPIKYLEFISDAIIHVLHSRH
PSDFGADAQAAMTKALELFRKDIAAKYKELGFHG
Output file format
ehmmalign writes a multiple sequence alignment to outfile. The default alignment format is fasta (-aformat fasta).
Output files for usage example
File: globins630.ali
>BAHG_VITSP
.................mldqQTINIIKATV.PV...L.....K...E...H...GV.T.
....I...TTTFYKN..LFAKHPEVRPLFDMGRQESL-------..EQPKALAMTVLAAA
QN-IENLPA......ILPAVKKIAVKH..CQAG-..VAAAHYPIV.GQELLGAIKEV.L.
G.D.AATDDILDAWGKAYGVIADVFIQVEAdlyaqave
>GLB1_ANABR
.........psvqgaaaqltaDVKKDLRDSW.KV...I.....G...S...D...KK.G.
....N...GVALMTT..LFADNQETIGYFKRLGNVSQ---GMAN..DKLRGHSITLMYAL
QNFIDQLDNt...ddLVCVVEKFAVNH..ITR-K..ISAAEFGKI.NGPIKKVLAS-.-.
-.K.NFGDKYANAWAKLVAVVQAAL-----........
>GLB1_ARTSX
..........ervdpitglsgLEKNAILDTW.GK...V.....R...G...N...LQ.E.
....V...GKATFGK..LFAAHPEYQQMFRFFQGVQL-AFLVQS..PKFAAHTQRVVSAL
DQTLLALNRps..dqFVYMIKELGLDH..INRGT..-DRSFVEYL.KESL-----GD.S.
V.D.EFT------VQSFGEVIVNFLNEGLRqa......
>GLB1_CALSO
..................vsaNDIKNVQDTW.GK...L.....Y...D...Q...WD.A.
....V..hASKFYNK..LFKDSEDISEAFVKAGT---GSGIAMK..RQALVFGA-ILQEF
VANLNDPTA......LTLKIKGLCATH..KTRGI..TNMELFAFA.LADLVAYMGTT.I.
S.-.-FTAAQKASWTAVNDVILHQMSSYFAtva.....
>GLB1_CHITH
.................gpsgDQIAAAKASW.NT...V.....-...-...-...-K.N.
....N...QVDILYA..VFKANPDIQTAFSQFAG-KDLDSIKGT..PDFSKHAGRVVGLF
SEVMDLLGNdantptILAKAKDFGKSH..--KSR..ASPAQLDNF.RKSLVVYLKGA.-.
-.T.KWDSAVESSWAPVLDFVFSTLKNEL-........
>GLB1_GLYDI
.................glsaAQRQVIAATW.KD...I.....A...Ga.dN...GA.G.
....V...GKDCLIK..FLSAHPQMAAVFGFSGA--------SD..PGVAALGAKVLAQI
GVAVSHLGDe...gkMVAQMKAVGVRHkgYGNKH..IKAQYFEPL.GASLLSAMEHR.I.
G.G.KMNAAAKDAWAAAYADISGALISGLQs.......
>GLB1_LUMTE
.............eclvteglKVKLQWASAF.GH...A.....-...H...Q...RV.A.
....F...GLELWKG..ILREHPEIKAPFSRVRG-DN----IYS..PQFGAHSQRVLSGL
DITISMLDTp...dmLAAQLAHLKVQH..VER-N..LKPEFFDIF.LKHLLHVLGDR.L.
G.T.HFDF---GAWHDCVDQIIDGIKD--I........
>GLB1_MORMR
........pivdsgsvsplsdAEKNKIRAAW.DL...V.....Y...K...D...YE.K.
....T...GVDILVK..FFTGTPAAQAFFPKFKGLTTADDLKQS..SDVRWHAERIINAV
NDAVKSMDDt...ekMSMKLKELSIKH..AQSFY..VDRQYFKVL.AGII-------.-.
-.A.DTTAPGDAGFEKLMSMICILLSSAY-........
>GLB1_PARCH
.......ggtlaiqshgdltlAQKKIVRKTW.HQ...L.....M...R...N...KT.S.
....F...VTDLFIR..IFAYDPAAQNKFPQMAGMSA-SQLRSS..RQMQAHAIRVSSIM
SEYIEELDSd....iLPELLATLARTH..-DLNK..VGPAHYDLF.AKVLMEALQAE.L.
G.S.DFNQKTRDSWAKAFSIVQAVLLVKHG........
>GLB1_PETMA
........pivdsgsvpaltaAEKATIRTAW.AP...V.....Y...A...K...YQ.S.
....T...GVDILIK..FFTSNPAAQAFFPKFQGLTSADQLKKS..MDVRWHAERIINAV
NDAVVAMDDt...ekMSLKLRELSGKH..AKSFQ..VDPQYFKVL.AAVI-------.-.
-.V.DTVLPGDAGLEKLMSMICILLRSSY-........
[Part of this file has been deleted for brevity]
P.G.DFGADAQGAMKKALELFRNDMAAKYKelgfqg..
>MYG_SHEEP
.................glsdGEWQLVLNAW.GK...V.....E...A...D...VA.G.
....H...GQEVLIR..LFTGHPETLEKFDKFKHLKTEAEMKAS..EDLKKHGNTVLTAL
GGILKKKGH......HEAEVKHLAESH..ANKHK..IPVKYLEFI.SDAIIHVLHAK.H.
P.S.NFGADAQGAMSKALELFRNDMAAEYKvlgfqg..
>MYG_SPAEH
.................glsdGEWQLVLNVW.GK...V.....E...G...D...LA.G.
....H...GQEVLIK..LFKNHPETLEKFDKFKHLKSEDEMKGS..EDLKKHGNTVLTAL
GGILKKKGQ......HAAEIQPLAQSH..ATKHK..IPIKYLEFI.SEAIIQVLQSK.H.
P.G.DFGADAQGAMSKALELFRNDIAAKYKelgfqg..
>MYG_TACAC
.................glsdGEWQLVLKVW.GK...V.....E...T...D...IT.G.
....H...GQDVLIR..LFKTHPETLEKFDKFKHLKTEDEMKAS..ADLKKHGGVVLTAL
GSILKKKGQ......HEAELKPLAQSH..ATKHK..ISIKFLEFI.SEAIIHVLQSK.H.
S.A.DFGADAQAAMGKALELFRNDMATKYKefgfqg..
>MYG_THUAL
.....................ADFDAVLKCW.GP...V.....E...A...D...YT.T.
....M...GGLVLTR..LFKEHPETQKLFPKFAGIAQ-ADIAGN..AAISAHGATVLKKL
GELLKAKGS......HAAILKPLANSH..ATKHK..IPINNFKLI.SEVLVKVMHEK.A.
G.-.-LDAGGQTALRNVMGIIIADLEANYKelgfsg..
>MYG_TUPGL
.................glsdGEWQLVLNVW.GK...V.....E...A...D...VA.G.
....H...GQEVLIR..LFKGHPETLEKFDKFKHLKTEDEMKAS..EDLKKHGNTVLSAL
GGILKKKGQ......HEAEIKPLAQSH..ATKHK..IPVKYLEFI.SEAIIQVLQSK.H.
P.G.DFGADAQAAMSKALELFRNDIAAKYKelgfqg..
>MYG_TURTR
.................glsdGEWQLVLNVW.GK...V.....E...A...D...LA.G.
....H...GQDVLIR..LFKGHPETLEKFDKFKHLKTEADMKAS..EDLKKHGNTVLTAL
GAILKKKGH......HDAELKPLAQSH..ATKHK..IPIKYLEFI.SEAIIHVLHSR.H.
P.A.EFGADAQGAMNKALELFRKDIAAKYKelgfhg..
>MYG_VARVA
.................glsdEEWKKVVDIW.GK...V.....E...P...D...LP.S.
....H...GQEVIIR..MFQNHPETQDRFAKFKNLKTLDEMKNS..EDLKKHGTTVLTAL
GRILKQKGH......HEAEIAPLAQTH..ANTHK..IPIKYLEFI.CEVIVGVIAEK.H.
S.A.DFGADSQEAMRKALELFRNDMASRYKelgfqg..
>MYG_VULCH
.................glsdGEWQLVLNIW.GK...V.....E...T...D...LA.G.
....H...GQEVLIR..LFKNHPETLDKFDKFKHLKTEDEMKGS..EDLKKHGNTVLTAL
GGILKKKGH......HEAELKPLAQSH..ATKHK..IPVKYLEFI.SDAIIQVLQSK.H.
S.G.DFHADTEAAMKKALELFRNDIAAKYKelgfqg..
>MYG_ZALCA
.................glsdGEWQLVLNIW.GK...V.....E...A...D...LV.G.
....H...GQEVLIR..LFKGHPETLEKFDKFKHLKSEDEMKRS..EDLKKHGKTVLTAL
GGILKKKGH......HDAELKPLAQSH..ATKHK..IPIKYLEFI.SEAIIHVLQSK.H.
P.G.DFGADTHAAMKKALELFRNDIAAKYRelgfqg..
>MYG_ZIPCA
.................glseAEWQLVLHVW.AK...V.....E...A...D...LS.G.
....H...GQEILIR..LFKGHPETLEKFDKFKHLKSEAEMKAS..EDLKKHGHTVLTAL
GGILKKKGH......HEAELKPLAQSH..ATKHK..IPIKYLEFI.SDAIIHVLHSR.H.
P.S.DFGADAQAAMTKALELFRKDIAAKYKelgfhg..
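The example output can be post-processed with a few lines of Python. The sketch below assumes the usual A2M convention, in which upper-case letters and '-' occupy match columns while lower-case letters and '.' belong to insert columns. The function names read_a2m and match_columns_only are hypothetical. Dropping the insert columns gives a view comparable to what the -m qualifier describes, although the sketch does not claim to reproduce that output exactly.

```python
from typing import Dict, Optional


def read_a2m(text: str) -> Dict[str, str]:
    """Parse FASTA-style A2M text into {name: aligned sequence}."""
    records: Dict[str, str] = {}
    name: Optional[str] = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            # Sequence lines are wrapped, so join them back together.
            records[name] += line
    return records


def match_columns_only(aligned: str) -> str:
    """Drop insert columns: lower-case residues and '.' padding."""
    return "".join(ch for ch in aligned if ch.isupper() or ch == "-")


sample = ">seq1\n..abCD-E\nfg.\n>seq2\n....CDAE\n..h\n"
for seq_name, aligned in read_a2m(sample).items():
    print(seq_name, match_columns_only(aligned))
```

When applied to a complete alignment file, every sequence reduced in this way should have the same length, since each one then contains exactly one symbol per match column. Note that the listing above is abbreviated, so it would need the original file rather than the excerpt.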
Data files
None.
Notes
1. Command-line arguments
The following original HMMER options are not supported:
-h : Use -help to get help information instead.
--informat : All common sequence file formats are supported automatically.
--oneline : Alignment format output is specified via the ACD file and/or the -aformat command line qualifier.
--outformat : Alignment format output is specified via the ACD file and/or the -aformat command line qualifier.
2. Installing EMBASSY HMMER
The EMBASSY HMMER package contains "wrapper" applications providing an EMBOSS-style interface to the applications in the original HMMER package version 2.3.2 developed by Sean Eddy. Please read the file INSTALL in the EMBASSY HMMER package distribution for installation instructions.
3. Installing original HMMER
To use EMBASSY HMMER, you will first need to download and install the original HMMER package. Please read the file 00README in the original HMMER package distribution for installation instructions:
WWW home: http://hmmer.wustl.edu/
Distribution: ftp://ftp.genetics.wustl.edu/pub/eddy/hmmer/
4. Setting up HMMER
For the EMBASSY HMMER package to work, the directory containing the original HMMER executables *must* be in your path. For example, if your executables were installed to "/usr/local/hmmer/bin", then type:
set path=(/usr/local/hmmer/bin/ $path)
rehash
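Whether the path has been set up correctly can be checked from a script before any wrapper is run. The following Python sketch looks up the executables with the standard library function shutil.which. The helper name locate_programs is hypothetical, and the default program names are simply the two mentioned in this document.

```python
import shutil
from typing import Dict, Optional, Sequence


def locate_programs(
    names: Sequence[str] = ("hmmalign", "ehmmalign"),
) -> Dict[str, Optional[str]]:
    """Map each program name to its full path, or None if not on the path."""
    return {name: shutil.which(name) for name in names}


for program, location in locate_programs().items():
    print(f"{program}: {location or 'not found on PATH'}")
```

If hmmalign is reported as not found, the wrapper has no original executable to call and the path should be corrected first.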
5. Getting help
Please read the Userguide.pdf distributed with the original HMMER and included in the EMBASSY HMMER distribution under the DOCS directory. The first 3 chapters (Introduction, Installation and Tutorial) are particularly useful.
References
None.
Warnings
Types of input data
HMMER 2.3.2, and therefore EMBASSY HMMER, is only recommended for use with protein sequences. If you provide a non-protein sequence you will be reprompted for a protein sequence. To accept nucleic acid sequences you must change the instances of < type: "protein" > in the application ACD files.
Environment variables
The original hmmer uses BLAST environment variables (below), if defined, to locate files. EMBASSY HMMER does not.
BLASTDB : location of sequence databases to be searched
BLASTMAT : location of substitution matrices
HMMERDB : location of HMMs
Sequence input
ehmmalign makes a temporary local copy of its input sequence data. You must ensure there is sufficient disk space for this in the directory in which ehmmalign is run.
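A script can make a rough check of the available space before starting a run. The Python sketch below compares the size of the input file with the free space in the working directory. The function name enough_space_for_copy is hypothetical, and the safety_factor of 2.0 is an arbitrary assumption rather than a figure taken from EMBASSY HMMER, since the temporary copy may differ in size from the original file.

```python
import os
import shutil


def enough_space_for_copy(
    seqfile: str, workdir: str = ".", safety_factor: float = 2.0
) -> bool:
    """Return True if workdir has room for seqfile times safety_factor."""
    # safety_factor is an assumed margin, not a documented requirement.
    needed = os.path.getsize(seqfile) * safety_factor
    free = shutil.disk_usage(workdir).free
    return free >= needed
```

The check is only advisory: other processes may consume the space between the check and the run.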
Diagnostic Error Messages
None.
Exit status
It always exits with status 0.
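Since the exit status is always 0, a calling script cannot use it to detect a failed run. One possible workaround, sketched here in Python, is to inspect the output file instead. The function name alignment_written is hypothetical, and the test for a leading '>' assumes the default FASTA-style alignment output; other -aformat settings would need a different check.

```python
import os


def alignment_written(outfile: str) -> bool:
    """Return True if outfile exists, is non-empty and starts with '>'."""
    if not os.path.isfile(outfile) or os.path.getsize(outfile) == 0:
        return False
    with open(outfile, "r") as handle:
        # FASTA-style alignments begin with a '>' header line.
        return handle.read(1) == ">"
```

A script might call alignment_written("globins630.ali") after the run and treat a False result as a failure.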
Known bugs
None.
See also
Program name | Description
ehmmbuild | Build a profile HMM from an alignment
ehmmcalibrate | Calibrate HMM search statistics
ehmmconvert | Convert between profile HMM file formats
ehmmemit | Generate sequences from a profile HMM
ehmmfetch | Retrieve an HMM from an HMM database
ehmmindex | Create a binary SSI index for an HMM database
ehmmpfam | Search one or more sequences against an HMM database
ehmmsearch | Search a sequence database with a profile HMM
libgen | Generate discriminating elements from alignments
ohmmalign | Align sequences with an HMM
ohmmbuild | Build HMM
ohmmcalibrate | Calibrate a hidden Markov model
ohmmconvert | Convert between HMM formats
ohmmemit | Extract HMM sequences
ohmmfetch | Extract HMM from a database
ohmmindex | Index an HMM database
ohmmpfam | Align single sequence with an HMM
ohmmsearch | Search sequence database with an HMM
Author(s)
This program is an EMBOSS conversion of a program written by Sean Eddy
as part of his HMMER package.
European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
History
Target users
This program is intended to be used by everyone and everything, from naive users to embedded scripts.