10.4. How To Wrap Third Party Applications

The steps taken to wrap the HMMER package are described below. Entirely generic guidelines cannot be given because the requirements depend on the software being wrapped. Nonetheless, all the basic steps you are likely to take are illustrated here. If you need help at any stage, contact the EMBOSS developers (Section 1.4, “Project Mailing Lists”).

The basic steps, each explained in more detail below, are:

  1. Planning and design

  2. ACD file development

  3. C source code development

  4. Quality assurance tests

  5. Documentation

  6. Integration

10.4.1. Planning and Design

The steps taken were:

  1. Download the source code and documentation

  2. Read the documentation

  3. Decide which options to keep in the EMBOSS version

  4. Decide whether new parameters are needed, e.g. for application output (normally written to stdout)

Documentation. HMMER includes an excellent User's Guide. It was necessary to read the Introduction, work through the Tutorial and then work through the manual pages for each application in turn. Not all applications and packages are documented to the same high standard! It's essential that you familiarise yourself with the package as a whole, and in particular identify all of the possible parameters for all the applications and their interactions. You should not start coding until you have this information.

Application options. The first design step is to decide which application options to keep in the EMBOSS version. An option should be discarded if it is:

  • Redundant to inbuilt EMBOSS functionality

  • Sensibly subsumed by a new EMBOSS qualifier

  • Always set, so it need not be defined in the ACD file

You should familiarise yourself with the functionality that is built into EMBOSS (see Section 3.1, “EMBOSS Programming”) to help decide what options are redundant. For example the HMMER help option -h is not needed because -help is an inbuilt qualifier for all EMBOSS applications.

One or more options might sensibly be covered by a single EMBOSS qualifier. For example, the 5 options in hmmbuild for setting sequence weighting are handled by a single weighting option in the equivalent EMBASSY wrapper.
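The weighting example can be sketched in plain C as a lookup table that turns the value of a single qualifier into the native flag. This is only an illustration, not the code of the EMBASSY wrapper: the one-letter codes and the function name weighting_flag are hypothetical, the table is deliberately partial, and the flags shown are hmmbuild weighting options as documented for HMMER 2, which you should check against the version you are wrapping.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row per choice offered by the single EMBOSS qualifier.
 * The codes ("B", "G", "N") are hypothetical; the flags are hmmbuild
 * weighting options from the HMMER 2 documentation and should be
 * checked against the version being wrapped. */
struct weight_choice {
    const char *code;  /* value selected through the EMBOSS qualifier */
    const char *flag;  /* native option passed to hmmbuild */
};

static const struct weight_choice weight_choices[] = {
    { "B", "--wblosum" },
    { "G", "--wgsc"    },
    { "N", "--wnone"   },
};

/* Return the native flag for a qualifier value, or NULL if unknown. */
const char *weighting_flag(const char *code)
{
    size_t i;
    size_t n = sizeof weight_choices / sizeof weight_choices[0];

    assert(code != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(code, weight_choices[i].code) == 0)
            return weight_choices[i].flag;
    }
    return NULL;
}
```

Keeping the mapping in one table means that adding or dropping a weighting scheme touches a single place in the wrapper.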

Certain options should always be set in the EMBOSS version and so need not be defined in the ACD file. For example, the -F option to force overwriting of files is always set.

New options. The second step is to decide whether any new parameters are required. Typically a parameter for an output file is needed to capture output that the original application writes to stdout by default.
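One simple way to capture stdout in a named output file is to append a shell redirection to the native command. The sketch below assumes the command is run through a shell; the function name redirect_stdout is hypothetical and not part of EMBOSS or HMMER, as are the file names used in the usage note.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<native command> > <outfile>" in buf.
 * Returns 0 on success, -1 if the output file name contains a space
 * or the result would not fit in buf. No shell quoting is attempted,
 * so names with spaces are refused rather than emitted unquoted. */
int redirect_stdout(char *buf, size_t size,
                    const char *command, const char *outfile)
{
    int n;

    assert(buf != NULL && command != NULL && outfile != NULL);
    if (strchr(outfile, ' ') != NULL)
        return -1;
    n = snprintf(buf, size, "%s > %s", command, outfile);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

For instance, the command "hmmsearch globin.hmm seqs.fa" with the output file "results.txt" would yield "hmmsearch globin.hmm seqs.fa > results.txt".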

10.4.2. ACD File Development

The key things to consider are:

  • Application name

  • Application short description

  • Documentation for program options

  • Qualifier names

  • Validating and reformatting the ACD

Application name. For HMMER the original application names were used except that the EMBOSS versions are prefixed with an 'e'. You should use the original names or some simple derivative except in unavoidable cases, for example because an EMBOSS or system application with that name already exists.
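The naming rule is simple enough to sketch in a few lines of C. The helper names wrapper_name and name_is_taken are hypothetical, and the list of existing names would in practice come from your own EMBOSS installation and system path.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the wrapper name ("e" + original name) into buf.
 * Returns 0 on success, -1 if buf is too small. */
int wrapper_name(char *buf, size_t size, const char *original)
{
    int n;

    assert(buf != NULL && original != NULL);
    n = snprintf(buf, size, "e%s", original);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Return 1 if name already appears in the list of existing names. */
int name_is_taken(const char *name, const char *const *existing, size_t count)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < count; i++) {
        if (strcmp(name, existing[i]) == 0)
            return 1;
    }
    return 0;
}
```

If name_is_taken reports a clash, that is one of the unavoidable cases in which a different derivative of the original name has to be chosen.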

Application short description. The application short description was taken directly from the User's Guide and pasted into the documentation: attribute in the application definition. A description and documentation for each option was likewise taken from the User's Guide and pasted into the help: and information: attributes as appropriate. This is vital documentation and must not be omitted.

Qualifier names. The qualifier names chosen were identical to the option names in the original wherever possible. This was true even for single-character options, on the arguable grounds that consistency with the original is more important than consistency with EMBOSS. It's recommended that you do the same.

ACD File. The ACD file was tested and reformatted using the EMBOSS utilities acdc and acdpretty. You should routinely use these tools when developing ACD files.

More information is available on ACD file development (Chapter 4, ACD File Development) and on the ACD utilities (Section 4.6, “ACD Utilities”).

10.4.3. C Source Code Development

The application C source code was implemented in the following order:

  1. Application header documentation

  2. main() function

  3. Variables to handle ACD data items

  4. Call to embInitP

  5. Calls to ajAcdGet* functions to retrieve objects for ACD data definitions

  6. Code to reformat input files (if necessary)

  7. Code to construct and call the HMMER command line

  8. Code to reformat output files (if necessary)

  9. Code to clean up the ACD variables

Application header. The application header documentation (Appendix D, Code Documentation Standards) was pasted in from another EMBOSS application. Then an empty main() function and variables to handle ACD data items were added. A call to embInitP was added to process the ACD file and the ajAcdGet* functions used to retrieve ACD values.

File reformatting and housekeeping. Where necessary and possible, code was added to reformat the input files by using temporary files. Code to reformat the output files, also by using temporary files, was added where necessary. Finally, code to clean up memory for the ACD variables was added.
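The reformatting step amounts to reading one file and writing a converted copy for the other program to use. The sketch below is a stand-in only: it merely drops blank lines, whereas the real conversion depends entirely on the formats involved, and the function name reformat_stream is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy text from in to out, dropping blank lines. This stands in for
 * whatever reformatting the native program actually requires.
 * Returns the number of lines written, or -1 on a write error.
 * Lines are assumed to be shorter than the buffer. */
long reformat_stream(FILE *in, FILE *out)
{
    char line[1024];
    long written = 0;

    assert(in != NULL && out != NULL);
    while (fgets(line, sizeof line, in) != NULL) {
        if (strcmp(line, "\n") == 0)
            continue;                  /* skip blank lines */
        if (fputs(line, out) == EOF)
            return -1;
        written++;
    }
    return written;
}
```

In a wrapper the output stream would normally be a named temporary file, because the native program expects a path on its command line, and that file should be removed once the native program has finished.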

Command line generation. The hardest part of the code was constructing the call to the HMMER command line, although this becomes straightforward once all the options are properly understood. A few tricky issues cropped up in generating the command line and you'll see these in the code later. These were documented in the code to save others time in the future. You should always document such tricky steps in your own code.
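As a rough illustration of command line construction, the sketch below assembles an hmmbuild call from values a wrapper might hold after reading its ACD qualifiers. It uses only the standard C library so that it stands alone, where a real EMBOSS application would use the EMBOSS string handling instead. The structure, its field names and the helper functions are hypothetical. The -F and -n options and the argument order follow the HMMER 2 hmmbuild usage and should be checked against the version being wrapped.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Values a wrapper might hold after reading its ACD qualifiers.
 * All field names here are hypothetical. */
struct build_opts {
    const char *weight_flag;  /* e.g. "--wblosum", or NULL for the default */
    const char *model_name;   /* value for -n, or NULL to omit it */
    const char *hmmfile;      /* output HMM file */
    const char *alignfile;    /* input alignment file */
};

/* Append text to a NUL-terminated buffer; return -1 if it will not fit. */
static int append(char *buf, size_t size, const char *text)
{
    size_t used = strlen(buf);

    if (strlen(text) >= size - used)
        return -1;
    strcpy(buf + used, text);
    return 0;
}

/* Build: hmmbuild -F [weight flag] [-n name] <hmmfile> <alignfile>
 * Returns 0 on success, -1 if the command does not fit in buf. */
int build_command(char *buf, size_t size, const struct build_opts *o)
{
    assert(buf != NULL && size > 0 && o != NULL);
    assert(o->hmmfile != NULL && o->alignfile != NULL);

    buf[0] = '\0';
    /* -F (force overwrite) is always set, so it has no ACD qualifier. */
    if (append(buf, size, "hmmbuild -F") != 0)
        return -1;
    if (o->weight_flag != NULL) {
        if (append(buf, size, " ") != 0 ||
            append(buf, size, o->weight_flag) != 0)
            return -1;
    }
    if (o->model_name != NULL) {
        if (append(buf, size, " -n ") != 0 ||
            append(buf, size, o->model_name) != 0)
            return -1;
    }
    /* Positional arguments go last, in the order the native program expects. */
    if (append(buf, size, " ") != 0 || append(buf, size, o->hmmfile) != 0 ||
        append(buf, size, " ") != 0 || append(buf, size, o->alignfile) != 0)
        return -1;
    return 0;
}
```

Optional flags are emitted only when the corresponding value was set, which keeps the native program's own defaults in force. The sketch does no shell quoting, so file names are assumed to contain no spaces or shell metacharacters.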

10.4.4. Quality Assurance Testing

QA tests (Chapter 7, Quality Assurance) were written for the applications. These were based on the examples in the tutorial, which use files from the HMMER distribution. Where test data are not already available, they will have to be collected.

10.4.5. Documentation

The documentation consisted of a README file for the package distribution and the standard EMBASSY package and application documentation that you'll be familiar with from EMBOSS.

The README file was written to cover all the basics and included the following topics:

  • How to download the original and EMBASSY versions of HMMER

  • Where to get installation instructions and documentation

  • Requirements, caveats etc

  • Differences in the application between the two versions (see below)

A note was made for each application to describe:

  • Which HMMER options are supported as ACD qualifiers

  • Any new qualifiers and parameters in the EMBOSS version

  • Whether the order of parameters was changed

Formal documentation for the package was then generated following the guidelines (Section 8.1, “Application Documentation Standards”). For many of the sections in the application documentation, text could be pasted in directly from the original documentation. No new documentation, other than the README file already described, was written. Once the text was inserted the EMBOSS-provided scripts were used to generate full documentation files automatically.

10.4.6. Integration

The last steps were to commit the new package code to the EMBOSS CVS server (see Section 1.5, “Contributing Software to EMBOSS”) and update the EMBOSS FTP and web sites. Such integration issues are handled by the EMBOSS developers.