The most important post-installation step is to set your operating system environment so that it knows where to find the EMBOSS applications. Assuming that you followed our suggestion and configured EMBOSS using --prefix=/usr/local/emboss
then you need to add the directory /usr/local/emboss/bin
to your PATH
. How to do this will depend on your operating system and the command shell you use. You can find out which shell you are using by typing:
env | grep SHELL
For users of the sh
or bash
shells (or derivatives) the PATH
is altered using the following lines.
PATH="$PATH:/usr/local/emboss/bin"
export PATH
If you want to make these definitions available for all users then you would typically add the lines to the system /etc/profile
file. If you just want to use EMBOSS yourself then you can add the lines to (e.g.) the .bashrc
file in your home directory.
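Because .bashrc is read by every new interactive shell, an unconditional append can leave the same directory in PATH several times. The following is a minimal sketch of a guarded append for sh-type shells. The helper name path_append and the variable EMBOSS_BIN are illustrative and not part of EMBOSS, and the directory assumes the --prefix suggested above.

```shell
# Assumed install location, matching --prefix=/usr/local/emboss.
EMBOSS_BIN="/usr/local/emboss/bin"

# Illustrative helper: append a directory to PATH only if it is
# not already listed there.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already on PATH, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

path_append "$EMBOSS_BIN"
export PATH
```

Calling path_append a second time with the same directory leaves PATH unchanged.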
For users of csh
or tcsh
shells the PATH
is altered using the following line.
set path=($path /usr/local/emboss/bin)
If you want to make these definitions available for all users then you would typically add the lines to the system /etc/csh.cshrc
file. If you just want to use EMBOSS yourself then you can add the line to (e.g.) the .cshrc
file in your home directory.
You may have to log out and log back in again for the changes to your PATH
to take effect.
An easy way to check that all is working is to use the EMBOSS application embossversion.
% embossversion
Writes the current EMBOSS version number
6.1.0
If the version number of EMBOSS is not printed similarly to the above then all is not well; if it is printed then celebrate appropriately.
The most common error is Command not found
whenever you type in an EMBOSS application name. This is caused by incorrectly setting up the PATH
(see above). Double-check that you set up the PATH
correctly and, if necessary, take advice from someone familiar with the operating system you're using.
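One way to narrow down a Command not found error is to ask the shell where it would look. The sketch below is for sh-type shells; the function name check_emboss is illustrative only. It reports the location of embossversion if the shell can find it, and otherwise prints each PATH entry on its own line so that a missing or misspelled directory is easier to spot.

```shell
# Report whether an EMBOSS program can be found on the current PATH.
check_emboss() {
    if command -v embossversion >/dev/null 2>&1; then
        echo "embossversion found: $(command -v embossversion)"
    else
        echo "embossversion not found; PATH is currently:" >&2
        # One PATH entry per line
        printf '%s\n' "$PATH" | tr ':' '\n' >&2
        return 1
    fi
}

check_emboss || echo "Add the EMBOSS bin directory to PATH and try again."
```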
The second most common error is a report by the program that it cannot find the libnucleus
library. This is one of the EMBOSS libraries and, if you followed our suggestion, it will be found in the /usr/local/emboss/lib
directory after the installation phase. As long as you have set up your PATH
correctly then EMBOSS should always be able to find its libraries. It has, however, been reported that some systems (notably SuSE Linux variants) have problems. In this case there are a few solutions.
With [Open]SuSE this error often happens if you have not specified a --prefix
option or have otherwise installed EMBOSS at the root of the /usr/local
directory tree such that the EMBOSS libraries are in the /usr/local/lib
directory. [Open]SuSE maintains a cache of the contents of that directory which you will need to rebuild by typing as the superuser:
ldconfig
Do this also for other operating systems that maintain such a cache. If the error happens on other operating systems or distributions then you could do one of the following:
Add the path to the EMBOSS libraries to the LD_LIBRARY_PATH
environment variable. For example:
LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/fu/bar/lib"
export LD_LIBRARY_PATH
Or, for csh
shells:
setenv LD_LIBRARY_PATH "${LD_LIBRARY_PATH}:/fu/bar/lib"
Add the path to the EMBOSS libraries system-wide. This is perhaps the preferable way. For example, under Linux you could add the following line to the /etc/ld.so.conf
file:
/fu/bar/lib
and then type:
ldconfig
For other operating systems, check the manual pages to see how to do the equivalent operations.
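If LD_LIBRARY_PATH is unset or empty, appending with a leading colon creates an empty entry, which the glibc dynamic linker treats as the current directory. The sketch below, for sh-type shells, avoids that and also skips directories that are already listed. The helper name lib_path_append and the variable EMBOSS_LIB are illustrative, and the directory assumes the --prefix suggested earlier.

```shell
# Assumed library directory for an install made with
# --prefix=/usr/local/emboss; adjust to your own prefix.
EMBOSS_LIB="/usr/local/emboss/lib"

# Illustrative helper (not part of EMBOSS): add a directory to
# LD_LIBRARY_PATH once, without creating an empty leading entry
# when the variable is unset or empty.
lib_path_append() {
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$1:"*) ;;   # already listed
        *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$1" ;;
    esac
}

lib_path_append "$EMBOSS_LIB"
export LD_LIBRARY_PATH
```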
If you wish to use the restriction mapping, domain recognition and amino acid index applications in EMBOSS then you will need to download the following databases from the Internet; all are relatively small. Download them all to a temporary directory.
REBASE is available from ftp://ftp.neb.com/pub/rebase/
You need the withrefm
and proto
files from that directory. A common error is to download the withref
file by mistake - it must be the withrefm
file. The file extensions for these files change on the server every month to reflect the date.
Then type:
rebaseextract
and follow the prompts.
AAINDEX is available from ftp://ftp.genome.ad.jp/pub/db/community/aaindex/
You need the aaindex1
file from that directory.
Then type:
aaindexextract
and follow the prompts.
PRINTS is available from ftp://ftp.ebi.ac.uk/pub/databases/prints/
You need the prints.dat
file from that directory.
Then type:
printsextract
and follow the prompts.
PROSITE is available from ftp://ftp.ebi.ac.uk/pub/databases/prosite/release/
You need the prosite.dat
and the prosite.doc
files from that directory.
Then type:
prosextract
and follow the prompts.
You can now delete the data files you downloaded.
JASPAR is available from http://jaspar.genereg.net/html/DOWNLOAD/
You need the Archive.zip
file. Uncompress it and then run:
jaspextract
and specify the all_data/FlatFileDir
directory in response to the prompt. You can now delete the source directory contents.
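Before running the extraction programs it can help to confirm that the temporary directory holds the right files, in particular withrefm rather than withref. The sketch below checks for the REBASE, AAINDEX, PRINTS and PROSITE files named above; JASPAR is left out because its archive is unpacked separately. The function name check_downloads is illustrative, and the trailing wildcard on the REBASE names allows for the date extension that changes each month.

```shell
# Illustrative helper (not part of EMBOSS): report which of the
# expected download files are missing from directory $1.
check_downloads() {
    dir=${1:-.}
    status=0
    for pattern in 'withrefm*' 'proto*' 'aaindex1' \
                   'prints.dat' 'prosite.dat' 'prosite.doc'; do
        # Expand the pattern inside the directory; if nothing
        # matches, the first word is not an existing file.
        set -- "$dir"/$pattern
        if [ ! -e "${1:-}" ]; then
            echo "missing: $pattern" >&2
            status=1
        fi
    done
    return $status
}

# Example (directory name is a placeholder):
#   check_downloads /tmp/emboss_downloads
```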
If you followed our advice and gave a --prefix
option to the configure
command, thereby specifying a directory where EMBOSS alone would be installed, then there are two methods for deleting EMBOSS.
If you've kept the source code tree from which you'd done the make install
.
In this case, deleting the installation is easy. Just type:
make uninstall
This has the advantage that it will delete EMBOSS but will not delete any configuration files you have spent ages developing for your system. This is useful if you wish to reinstall a new version of EMBOSS after the deletion.
If you didn't keep the source code tree.
As long as you specified a suitable --prefix
option to the configure
command then you can use a UNIX rm -rf directoryname
command to delete the EMBOSS installation tree.
If you didn't specify a --prefix
option to the configure
command but did do a make install
then you'll have to clean EMBOSS out of the /usr/local
directory tree manually or, better, reinstall the same version of EMBOSS on top of itself and then use the make uninstall
method.
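Since rm -rf is unforgiving, a small guard can reduce the risk of deleting the wrong tree. The sketch below is only appropriate when EMBOSS was installed alone under its own --prefix directory. The function name remove_emboss_tree is illustrative, and the test for bin/embossversion is an assumed marker that the directory really is an EMBOSS installation. Shared locations such as /usr/local are refused outright.

```shell
# Illustrative helper (not part of EMBOSS): delete an EMBOSS
# installation tree only if it looks like one.
remove_emboss_tree() {
    prefix=${1:-}
    case "$prefix" in
        ''|/|/usr|/usr/|/usr/local|/usr/local/)
            echo "refusing to remove '$prefix'" >&2
            return 1 ;;
    esac
    if [ ! -x "$prefix/bin/embossversion" ]; then
        echo "refusing: $prefix/bin/embossversion not found" >&2
        return 1
    fi
    rm -rf -- "$prefix"
}

# Example, assuming the prefix suggested earlier:
#   remove_emboss_tree /usr/local/emboss
```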
From time to time, bugfixes or new functionalities are provided, which can be applied to the version of EMBOSS you have installed. At such times new source code files will appear on our FTP server in the directory:
ftp://emboss.open-bio.org/pub/EMBOSS/fixes/
Usually these source code files are replacements for files that came with the EMBOSS distribution. You should read the README.fixes
file in the above directory to see what the file fixes and whereabouts in the EMBOSS source directory it lives.
To apply the fixes, copy the source code file to its correct location, return to the top level of your EMBOSS source code tree and type:
make clean
make install
This is, of course, another very good reason for not deleting your EMBOSS source code tree.
A more convenient way of applying all the fixes from the above directory is to use the patch file in the subdirectory:
ftp://emboss.open-bio.org/pub/EMBOSS/fixes/patches/
The patch files are of the form patch-1-n.gz, where n refers to the latest source code correction in the README.fixes file in the directory above. So, if there are ten corrections in the latter file then the patch file would be called patch-1-10.gz.
The patch files are applied using the UNIX patch
command e.g.:
gunzip EMBOSS-6.1.0.tar.gz
tar xf EMBOSS-6.1.0.tar
cd EMBOSS-6.1.0
gunzip -c patch-1-10.gz | patch -p1
Or, if the file has been uncompressed in transit:
patch -p1 < patch-1-10
You should always start with freshly extracted EMBOSS source code, as above, before applying a patch. This allows you to see any errors more easily. On rare occasions the developers will provide a patch file that contains fixes to a binary file. Some operating systems (e.g. FreeBSD) cannot handle binary patches and will report that such a patch file is malformed. In those circumstances follow the instructions in the nonbinary
directory.
A new version of EMBOSS is released at least once per year, typically on St Swithun's Day (15th July). Before installing the new version you should either delete the existing EMBOSS version (if installing to the same directory) or install EMBOSS in a new location. Do not install a new version of EMBOSS on top of an existing installation as files from previous versions may cause compatibility problems.
If you changed any system library or execution paths when you first installed EMBOSS then make sure you update these as necessary. A new version of EMBOSS is unlikely to work if new executables are trying to access older versions of the EMBOSS libraries.
EMBOSS includes two files that are used to configure the package, particularly for defining databases and for making global settings that influence the behaviour of all EMBOSS programs.
The file emboss.default
is used for site-wide configuration. Template files are included:
Stable release (.../share/EMBOSS/emboss.default.template)
CVS releases (.../emboss/emboss/emboss.default.template)
The file .embossrc
, which you can create in your personal home directory, is used for user-specific customisation. Typically you might test, for example, database definitions in your own ~/.embossrc
file before adding them to emboss.default
.
Blank lines are ignored. Comments start with a '#
' character in the first position of a line. For example:
# this is a comment
INCLUDE
allows you to include a subsidiary file as part of the text of the main emboss.default
or .embossrc
file at the position of the INCLUDE
command. This is useful for keeping the configuration files tidy. For example, to include the contents of the file project_databases.def
:
INCLUDE "project_databases.def"
Variables may be set with the keyword SETENV
, (usually shortened to SET
or ENV
- either is ok), followed by the variable name, then the value to which you wish it set. For example:
SET dbdir /data/sequencedbs
This variable may now be used in the rest of the file emboss.default
by preceding it with a $
. For example:
file: $dbdir/data.dat
The name of the variable is case-insensitive when used within emboss.default
or .embossrc
.
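The elements described above (comments, SET and INCLUDE) can be combined in one file. As a hedged illustration, the sketch below writes a small example configuration to a scratch file rather than to ~/.embossrc, so that nothing existing is overwritten. The scratch file name is arbitrary, and the paths and the included file name are the placeholder values used in the examples above.

```shell
# Scratch location for the example; not a file EMBOSS reads.
rc_example="${TMPDIR:-/tmp}/embossrc.example.$$"

# The quoted 'EOF' stops the shell from expanding anything in the
# text, so it is written exactly as shown.
cat > "$rc_example" <<'EOF'
# Personal EMBOSS settings (placeholder paths)

# Define a variable once; later lines can refer to it as $dbdir
SET dbdir /data/sequencedbs

# Pull in database definitions kept in a separate file
INCLUDE "project_databases.def"
EOF

echo "Example written to $rc_example"
```

Once the contents look right they can be copied into ~/.embossrc for testing.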
When maintaining EMBOSS for multiple users, more than one configuration might be required, for example to provide access to different sets of databases or data directories. It can be time consuming and error prone to maintain a series of individual .embossrc
files in each user directory, or to force users to work in the same directory.
An alternative is to maintain one central copy of each of the different configuration files (.embossrc
) in its own directory. All the user then need do is set the environment variable EMBOSSRC
in their .cshrc
(csh
) or .profile
(bash
) file to point to the appropriate directory.
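For sh-type shells, pointing EMBOSSRC at one of the central directories might look like the sketch below. The function name use_emboss_config and the example directory are hypothetical. The check for a .embossrc file is an added safeguard so that a mistyped directory is reported instead of being silently exported.

```shell
# Illustrative helper (not part of EMBOSS): select a central
# configuration directory, but only if it contains a .embossrc.
use_emboss_config() {
    if [ -f "$1/.embossrc" ]; then
        EMBOSSRC=$1
        export EMBOSSRC
    else
        echo "no .embossrc found in $1" >&2
        return 1
    fi
}

# Example line for a user's .profile (directory is a placeholder):
#   use_emboss_config /site/emboss/config/projectA
```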
It is possible to make EMBOSS unusable if you adjust the global variables. For example:
SET EMBOSS_HELP 1
will make all EMBOSS programs only display their help when they are run.
Environment Variable | Description | Type | Default value |
---|---|---|---|
EMBOSS_ACDCOMMANDLINELOG | Log file for full commandline, used to convert QA test definitions into memory leak test command lines | string | "" |
EMBOSS_ACDFILENAME | Use filename rather than sequence name as default for file naming | boolean | N |
EMBOSS_ACDLOG | Log ACD processing to file program.acdlog to debug ACD processing | boolean | N |
EMBOSS_ACDPROMPTS | Number of times to prompt for a value interactively | integer | 1 |
EMBOSS_ACDROOT | EMBOSS root directory for finding files | string | |
EMBOSS_ACDUTILROOT | EMBOSS source directory for finding files | string | |
EMBOSS_ACDWARNRANGE | Warn if a number is out of range and fixed to be within limits | boolean | N |
EMBOSS_AUTO | Run with all default values unless -noauto is on the command line | boolean | N |
EMBOSS_CACHESIZE | Cache size to use for database indexing | integer | 2048 |
EMBOSS_DATA | EMBOSS directory for finding data files | string | |
EMBOSS_DEBUG | Write debug output to program.dbg unless -nodebug is on the command line | boolean | N |
EMBOSS_DEBUGBUFFER | Buffer debug output to save I/O time but risk losing output on a crash | boolean | N |
EMBOSS_DIE | Print program abort messages to standard error | boolean | Y |
EMBOSS_DOCROOT | EMBOSS directory for finding application documentation | string | |
EMBOSS_EMBOSSRC | Directory to search for an additional .embossrc file | string | |
EMBOSS_FEATWARN | Print warning messages when parsing feature table input | boolean | Y |
EMBOSS_FILTER | By default read standard input and write to standard output unless -nofilter is on the command line | boolean | Y |
EMBOSS_FORMAT | Input sequence format | string | unknown |
EMBOSS_GRAPHICS | Default graphics output device | string | x11 |
EMBOSS_HOMERC | Read the .embossrc file in the user's home directory | boolean | Y |
EMBOSS_HTTPVERSION | HTTP version | string | 1.1 |
EMBOSS_LANGUAGE | (Obsolete) Language used for the codes.language file | string | english |
EMBOSS_LOGFILE | System statistics log file | string | "" |
EMBOSS_MYEMBOSSACDROOT | MYEMBOSS package source directory for user's uninstalled utility ACD files | string | |
EMBOSS_NAMDEBUG | Write log messages to standard error while processing .embossrc and emboss.default | string | N |
EMBOSS_NAMVALID | Detailed validation while processing .embossrc and emboss.default | string | N |
EMBOSS_OPTIONS | Prompt for optional command line values unless -nooptions is on the command line | boolean | N |
EMBOSS_OUTDIRECTORY | Directory used to write output | string | |
EMBOSS_OUTFEATFORMAT | Output feature format | string | gff |
EMBOSS_OUTFORMAT | Output sequence format | string | fasta |
EMBOSS_PAGER | Application to use for pages output to screen | string | more |
EMBOSS_PAGESIZE | Page size to use for database indexing | integer | 2048 |
EMBOSS_PROXY | HTTP proxy server address in the form proxy.xyz.ac.uk:7890 | string | "" |
EMBOSS_RCHOME | Process the .embossrc file in the home directory | boolean | Y |
EMBOSS_SEQWARN | Print warning messages when parsing standard sequence characters | boolean | N |
EMBOSS_STDOUT | By default write to standard output unless -nostdout is on the command line | boolean | Y |
EMBOSS_TIMETODAY | Date and time to override the current date - used to give a standard date and time for test runs | string | |
EMBOSS_VERBOSE | Print verbose help output | boolean | N |
EMBOSS_WARNOBSOLETE | Print warning messages when ACD file declares an application as 'obsolete' | boolean | Y |
Environment Variable | Description | Type | Default value |
---|---|---|---|
EMBOSS_ERROR | Print error messages to standard error | boolean | Y |
EMBOSS_FATAL | Print fatal error messages to standard error | boolean | Y |
EMBOSS_WARNING | Print warning messages to standard error | boolean | Y |
Environment variable | Description | Type | Default value |
---|---|---|---|
EMBOSS_CLUSTALW | Name or path to launch clustalw | string | clustalw |
EMBOSS_PRIMER3_CORE | Name or path to launch primer3_core | string | primer3_core |
EMBOSS_HMMALIGN | Name or path to launch hmmalign | string | hmmalign |
EMBOSS_HMMBUILD | Name or path to launch hmmbuild | string | hmmbuild |
EMBOSS_HMMCALIBRATE | Name or path to launch hmmcalibrate | string | hmmcalibrate |
EMBOSS_HMMCONVERT | Name or path to launch hmmconvert | string | hmmconvert |
EMBOSS_HMMEMIT | Name or path to launch hmmemit | string | hmmemit |
EMBOSS_HMMFETCH | Name or path to launch hmmfetch | string | hmmfetch |
EMBOSS_HMMINDEX | Name or path to launch hmmindex | string | hmmindex |
EMBOSS_HMMPFAM | Name or path to launch hmmpfam | string | hmmpfam |
EMBOSS_HMMSEARCH | Name or path to launch hmmsearch | string | hmmsearch |
EMBOSS_MAST | Name or path to launch mast | string | mast |
EMBOSS_MEME | Name or path to launch meme | string | meme |
EMBOSS_MIRA | Name or path to launch mira | string | mira |
EMBOSS_MIRAEST | Name or path to launch miraEST | string | miraEST |
EMBOSS_BLASTPGP | Name or path to launch blastpgp | string | blastpgp |
EMBOSS_FORMATDB | Name or path to launch formatdb | string | formatdb |
EMBOSS_MODELFROMALIGN | Name or path to launch modelfromalign | string | modelfromalign |
EMBOSS_NACCESS | Name or path to launch naccess | string | naccess |
EMBOSS_RPSBLAST | Name or path to launch rpsblast | string | rpsblast |
EMBOSS_STAMP | Name or path to launch stamp | string | stamp |
EMBOSS_STRIDE | Name or path to launch stride | string | stride |
EMBOSS defines various environment variables. They include global variables used to control the behaviour of all EMBOSS programs, and variables to set the location of system files or directories, specify default values etc. There is normally no need to set the environment variables, but you may do so to customise the behaviour of your instance of EMBOSS.
Environment variables are useful for simplifying maintenance of your .embossrc
file. If, for example, you specify the location of your databases as an environment variable, then if you move the databases you only have to update one line in the configuration file. For example, for the data directory:
/data/databases/flatfiles/
you might have something like this:
set EMBOSS_database_dir /data/databases/flatfiles
SET EMBOSS_embldir $EMBOSS_database_dir/embl
The second line sets another variable to the directory:
/data/databases/flatfiles/embl
Global environment variables must have UPPERCASE names and usually have Boolean values; they can be turned on by setting them to "1" or "Y" (most are off by default). The global variables can also be set in the UNIX session by defining an environment variable with the commands:
setenv EMBOSS_NAME value (csh type shells)
export EMBOSS_NAME=value (sh type shells)
where NAME is the name of the variable and value is the value you wish to set it to.
EMBOSS includes several global qualifiers (see the EMBOSS Users Guide) that are available to all the applications. They are typically used by advanced users (who use -options
or -verbose
) or by developers (who use -debug
, -acdlog
). They may be set as follows:
set EMBOSS_QUALIFIER 1
where QUALIFIER
is one of the global qualifiers. The value above is 1
but can be:
1 or Y for true.
0 or N for false.
Setting the qualifier value to true has the effect of running every program with that qualifier set. Qualifiers, when set, will work in the same way as if you used them when running the program. For example you can:
set EMBOSS_VERBOSE Y
and the program will run normally, but when the program is run with the -help
qualifier, the output will be in verbose form.
Other program options that can be set include
EMBOSS_FORMAT
EMBOSS_ACDROOT
EMBOSS_DATA
The value of EMBOSS_FORMAT
determines which default sequence format to use for input; EMBOSS_OUTFORMAT sets the default for output. For example, if you are running EMBOSS alongside GCG you may wish to have the following entries in your .embossrc
:
set EMBOSS_FORMAT gcg
set EMBOSS_OUTFORMAT gcg
which has the effect of using GCG format for input and output by default.
If you wish to use a different directory for the ACD files then this can be set:
set EMBOSS_ACDROOT /path/to/acd
If you wish to maintain a separate data directory then use:
set EMBOSS_DATA /path/to/data
System administrators may wish to make use of the logging facilities of EMBOSS. Setting the variable EMBOSS_LOGFILE
forces the system to keep a log of which programs are used when and by whom:
set EMBOSS_LOGFILE /site/log/emboss.log
The log file structure is very simple. Three tab-separated fields are stored, program name, user name, and the date and time:
prettyplot joeuser Wed Aug 02 14:29:13 2000
The file defined by EMBOSS_LOGFILE
should be world writable. The following command ensures logging can occur:
chmod o+w /site/log/emboss.log
All settings can be overridden in a user's .embossrc
file by redefining the relevant variables. So to prevent your system usage being logged you can redefine EMBOSS_LOGFILE
by putting the following entry in your .embossrc
file:
set EMBOSS_LOGFILE /dev/null
This behaviour may change in the future to prevent users redefining some system settings.
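Because the log has three tab-separated fields, with the program name first, standard UNIX tools are enough to summarise it. The sketch below counts how often each program appears in a log file and lists the most used first. The function name summarise_emboss_log is illustrative, and the log path in the example is the one used above.

```shell
# Illustrative helper (not part of EMBOSS): print "count<TAB>program"
# for each program in the log file $1, most frequent first.
summarise_emboss_log() {
    awk -F '\t' '
        NF >= 3 { count[$1]++ }      # field 1 is the program name
        END {
            for (p in count) printf "%d\t%s\n", count[p], p
        }
    ' "$1" | sort -k1,1nr -k2,2
}

# Example:
#   summarise_emboss_log /site/log/emboss.log
```

Counting by the second field instead of the first would give usage per user.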
Descriptions of the environment variables are stored in the EMBOSS system file variables.standard
which is stored and installed in the application ACD file directory. An excerpt of this file is shown below:
acdcommandlinelog string "" "Log file for full commandline, used to convert QA test definitions into memory leak test command lines"
acdlog boolean "N" "Log ACD processing to file program.acdlog"
acdprompts integer "1" "Number of times to prompt for a value when interactive"
acdroot string "(install directory)" "EMBOSS root directory for finding files"
acdutilroot string "(source directory)" "EMBOSS source directory for finding files"
EMBOSS data files are included in the distribution and stored in the standard EMBOSS data directory, which can be defined by the EMBOSS environment variable EMBOSS_DATA
.
If you built EMBOSS using make install
, EMBOSS will by default install the data files, including those installed with rebaseextract, prosextract, printsextract, aaindexextract or cutgextract, in the directory:
share/EMBOSS/data
under the install directory, which is defined by the --prefix
when you configured the package (see the EMBOSS Users Guide). Typically this is:
/usr/local/emboss/share/EMBOSS/data
If EMBOSS was not installed using make install
but just compiled using make
, then by default the data files are in:
emboss/data
under the directory where EMBOSS was built.
If you want to keep your data files somewhere else, or have a set of datafiles you want to keep separate from those distributed with the package, then you can set the EMBOSS_DATA
environment variable in your emboss.default
or .embossrc
file.
To see the available EMBOSS data files, run:
embossdata -showall
To fetch one of the data files into your current directory for you to inspect or modify, run:
embossdata -fetch -file EDatafileName.dat
where EDatafileName.dat is the name of the data file.
Users can provide their own data files in their own directories. Project specific files can be put in the current directory or, for tidier directory listings, in a subdirectory called ".embossdata"
. Similarly, for files to be accessible to all EMBOSS applications, invoked from any location, they can be put in your home directory, or in a subdirectory under it called ".embossdata"
.
The directories are searched in the following order:
* .
(your current directory)
* .embossdata
(under your current directory)
* ~/
(your home directory)
* ~/.embossdata
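To see which personal copy of a data file would take precedence, the search order listed above can be mirrored in a few lines of shell. This is only a sketch of the order as described here, not of how EMBOSS performs the lookup internally. The function name find_user_datafile is illustrative, and EDatafileName.dat is a placeholder file name.

```shell
# Illustrative helper (not part of EMBOSS): print the first match
# for data file $1 in the user-level directories, in the order
# listed above.
find_user_datafile() {
    name=$1
    for dir in . ./.embossdata "$HOME" "$HOME/.embossdata"; do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Example:
#   find_user_datafile EDatafileName.dat
```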