The steps taken to wrap the HMMER package are described below. It is impossible to give entirely generic guidelines because the requirements depend on the software being wrapped. Nonetheless, all the basic steps you are likely to take are illustrated here. If you need help at any stage you should contact the EMBOSS developers (Section 1.4, “Project Mailing Lists”).
The basic steps are listed here and explained in more detail below:
Planning and design
ACD file development
C source code development
Quality assurance tests
Documentation
Integration
The steps taken were:
Download the source code and documentation
Read the documentation
Decide which options to keep in the EMBOSS version
Decide if new parameters are needed, e.g. for application output (normally to stdout)
Documentation. HMMER includes an excellent User's Guide. It was necessary to read the Introduction, work through the Tutorial and then work through the manual pages for each application in turn. Not all applications and packages are documented to the same high standard! It's essential that you familiarise yourself with the package as a whole, and in particular identify all of the possible parameters for all the applications and their interactions. You should not start coding until you have this information.
Application options. The first design step is to decide which application options to keep in the EMBOSS version. An option should be discarded if it is:
Redundant to inbuilt EMBOSS functionality
Sensibly subsumed by a new EMBOSS qualifier
Always set so need not be defined in the ACD file
You should familiarise yourself with the functionality that is built into EMBOSS (see Section 3.1, “EMBOSS Programming”) to help decide which options are redundant. For example, the HMMER help option -h is not needed because -help is an inbuilt qualifier for all EMBOSS applications.
One or more options might sensibly be covered by a single EMBOSS qualifier. For example, the five options in hmmbuild for setting sequence weighting are handled by a single weighting option in the equivalent EMBASSY wrapper.
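The idea of folding several native options into one qualifier can be sketched in C as a small lookup table. This is only an illustration and not the EMBASSY source: the menu codes, the struct and the function name are hypothetical, and the three hmmbuild flags shown are an illustrative subset assumed from the HMMER 2 manual page, so check them against the version you are wrapping.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One value of the single EMBOSS-side qualifier mapped to the native
 * option it stands for. The codes on the left are hypothetical; the
 * flags on the right are assumed from the HMMER 2 hmmbuild manual page. */
struct weight_choice {
    const char *code;   /* value chosen for the single weighting qualifier */
    const char *flag;   /* native hmmbuild option passed on the command line */
};

static const struct weight_choice weight_table[] = {
    { "blosum",  "--wblosum"  },
    { "voronoi", "--wvoronoi" },
    { "none",    "--wnone"    }
};

/* Return the native flag for a qualifier value, or NULL if the value
 * is not in the table. */
const char *weighting_flag(const char *code)
{
    size_t i;

    assert(code != NULL);
    for (i = 0; i < sizeof weight_table / sizeof weight_table[0]; i++) {
        if (strcmp(weight_table[i].code, code) == 0)
            return weight_table[i].flag;
    }
    return NULL;
}
```

Keeping the mapping in one table means the ACD file exposes a single choice while the wrapper alone knows the native spellings.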
Certain options should always be set in the EMBOSS version and so needn't be defined in the ACD file, for example the -F option to force overwrite of files is always set.
New options. The second step is to decide whether any new parameters are required. Typically a parameter for an output file is needed to catch output written to stdout by default.
The key things to consider are:
Application name
Application short description
Documentation for program options
Qualifier names
Validating and reformatting the ACD
Application name. For HMMER the original application names were used except that the EMBOSS versions are prefixed with an 'e'. You should use the original names or some simple derivative except in unavoidable cases, for example because an EMBOSS or system application with that name already exists.
Application short description. The application short description was taken directly from the User's Guide and pasted into the documentation: attribute in the application definition. A description and documentation for each option were again taken from the User's Guide and pasted into the help: and information: attributes as appropriate. This is vital documentation and cannot be omitted.
Application options. The qualifier names chosen were identical to the option names in the original wherever possible. This was even true for single character options, on the arguable grounds that consistency with the original is more important than consistency with EMBOSS. It's recommended that you do the same.
ACD File. The ACD file was tested and reformatted using the EMBOSS utilities acdc and acdpretty. You should routinely use these tools when developing ACD files.
More information is available on ACD file development (Chapter 5, C Programming) and on the ACD utilities (Section 4.6, “ACD Utilities”).
The application C source code was implemented in the following order:
Application header documentation
main() function
Variables to handle ACD data items
Call to embInitP
Calls to ajAcdGet* functions to retrieve objects for ACD data definitions
Code to reformat input files (if necessary)
Code to construct and call the HMMER command line
Code to reformat output files (if necessary)
Code to clean up the ACD variables
Application header. The application header documentation (Appendix D, Code Documentation Standards) was pasted in from another EMBOSS application. Then an empty main() function and variables to handle ACD data items were added. A call to embInitP was added to process the ACD file and the ajAcdGet* functions were used to retrieve ACD values.
File reformatting and housekeeping. Code was added, where necessary and possible, to reformat the input files by using temporary files. Code to reformat the output files, also by way of temporary files, was likewise added where necessary. Finally, code to clean up memory for the ACD variables was added.
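The temporary-file pattern described above might look roughly like the following sketch in standard C. It is not taken from the EMBASSY code: both function names are hypothetical, and stripping carriage returns merely stands in for whatever real format conversion a wrapper needs. Note that tmpfile() creates an unnamed file, whereas a real wrapper would need a named temporary file that can be passed on the wrapped program's command line.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for a real format conversion: copy `in` to `out`, dropping
 * every carriage return. Returns the number of characters written,
 * or -1 on a read or write error. */
long strip_carriage_returns(FILE *in, FILE *out)
{
    long written = 0;
    int c;

    assert(in != NULL && out != NULL);
    while ((c = getc(in)) != EOF) {
        if (c == '\r')
            continue;               /* the "reformatting" step */
        if (putc(c, out) == EOF)
            return -1;
        written++;
    }
    return ferror(in) ? -1 : written;
}

/* Reformat `in` into a temporary file and return that file rewound,
 * ready for reading. Returns NULL on failure. The file created by
 * tmpfile() is removed automatically when it is closed. */
FILE *reformat_to_temp(FILE *in)
{
    FILE *tmp = tmpfile();

    if (tmp == NULL)
        return NULL;
    if (strip_carriage_returns(in, tmp) < 0) {
        fclose(tmp);
        return NULL;
    }
    rewind(tmp);
    return tmp;
}
```

The same two-step shape (convert into a scratch file, then hand the scratch file on) applies to output files, with the conversion running after the wrapped program has finished.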
Command line generation. The hardest part of the code was constructing the HMMER command line, although this becomes straightforward once all the options are properly understood. A few tricky issues cropped up in generating the command line and you'll see these in the code later. These were documented in the code to save others time in the future. You should always document such tricky steps in your own code.
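As a rough illustration of command line construction, the sketch below assembles an hmmbuild call in plain C. It is a simplified stand-in, not the EMBASSY implementation, which would normally use EMBOSS string handling: the function name and its parameters are hypothetical, the argument order (HMM file, then alignment file) is assumed from the HMMER 2 usage, and the file names are inserted without any shell quoting. It shows two decisions from the design stage: -F is always set, and output that would go to stdout is redirected to a file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of EMBOSS or HMMER.
 * Writes "hmmbuild -F [extra] <hmmfile> <alignfile> > <outfile>" into buf.
 * Returns 0 on success, -1 if the command does not fit in buf.
 * File names are inserted as-is, so they must not need shell quoting. */
int build_hmmbuild_command(char *buf, size_t size,
                           const char *extra,
                           const char *hmmfile,
                           const char *alignfile,
                           const char *outfile)
{
    int has_extra = (extra != NULL && extra[0] != '\0');
    int n;

    assert(buf != NULL && size > 0);
    assert(hmmfile != NULL && strlen(hmmfile) > 0);
    assert(alignfile != NULL && strlen(alignfile) > 0);
    assert(outfile != NULL && strlen(outfile) > 0);

    /* -F is hard-wired because the wrapper always forces overwrite;
     * the trailing redirect captures what would go to stdout. */
    n = snprintf(buf, size, "hmmbuild -F %s%s%s %s > %s",
                 has_extra ? extra : "",
                 has_extra ? " " : "",
                 hmmfile, alignfile, outfile);

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Called with extra set to "--wblosum" and the placeholder file names out.hmm, in.msf and log.txt, the buffer holds `hmmbuild -F --wblosum out.hmm in.msf > log.txt`, which could then be handed to system(). Checking the return value guards against a silently truncated command.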
QA tests (Chapter 7, Quality Assurance) were written for the applications. These were based on the examples in the tutorial, which use files from the HMMER distribution. Where test data are not already available, they would have to be collected.
The documentation consisted of a README file for the package distribution and the standard EMBASSY package and application documentation that you'll be familiar with from EMBOSS.
A README file for the package distribution was written to cover all the basics and included the following topics:
How to download the original and EMBASSY versions of HMMER
Where to get installation instructions and documentation
Requirements, caveats etc
Differences in the application between the two versions (see below)
A note was made for each application to describe:
Which HMMER options are supported as ACD qualifiers
Any new qualifiers and parameters in the EMBOSS version
Whether the order of parameters was changed
Formal documentation for the package was then generated following the guidelines (Section 8.1, “Application Documentation Standards”). For many of the sections in the application documentation, text could be pasted in directly from the original documentation. No new documentation, other than the README file already described, was written. Once the text was inserted the EMBOSS-provided scripts were used to generate full documentation files automatically.
The last steps were to commit the new package code to the EMBOSS CVS server (see Section 1.5, “Contributing Software to EMBOSS”) and update the EMBOSS FTP and web sites. Such integration issues are handled by the EMBOSS developers.