7.2. Application Quality Assurance

Each EMBOSS application is run on test data to ensure that it works as advertised. The test runs are performed nightly to ensure that the applications have not been broken, for example, by recent changes to the library code. The test data consists of one or more sets of input files, application parameters and the corresponding output files. Each set corresponds to a single run of the application. As many sets of test data as possible should be provided, especially for unusual input conditions, to make the test as robust as possible.

The QA test definitions are used to generate include files for 3 sections of the application documentation: usage, input and output. These are created for any QA test that has the suffix -ex (example), -keep (output reused by another test) or -check. All other test names are for QA testing only and do not become part of the application documentation.

The tests are defined in the file:

.../emboss/test/qatest.dat

An example of an entry in qatest.dat is shown below:

ID myprogram-ex
AB myemboss
AA myprogram
IN
FI stderr
FC = 2
FP 0 /Warning: /
FP 0 /Error: /
FP 0 /Died: /
FI P10932.myprogram
FC = 5
FP /^Usa: tembl-id:P10932\n/
FP /^Length: 2167\n/
//

The test includes information about the parameters (command line) and what is expected in the output files. Documentation at the start of qatest.dat describes the records used to define a test, which are also described below (Section 7.2.1, “Test Records”).
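
As an illustration of the record layout, the following Python sketch splits an entry such as the one above into record code and value pairs. It is not part of EMBOSS (the real parsing is done by the qatest script); the function name parse_entry is hypothetical, and the sketch assumes that each record is a two-letter code followed by a space and its value, with // closing the entry.

```python
def parse_entry(text):
    """Split one qatest.dat entry into (code, value) pairs, stopping at '//'."""
    records = []
    for line in text.splitlines():
        line = line.rstrip()
        if not line:
            continue                      # skip blank lines between records
        if line == "//":                  # end-of-entry marker
            break
        code, _, value = line.partition(" ")
        records.append((code, value.strip()))
    return records


entry = """ID myprogram-ex
AB myemboss
AA myprogram
IN
FI stderr
FC = 2
FP 0 /Warning: /
//
"""
records = parse_entry(entry)
for code, value in records:
    print(code, repr(value))
```

A blank IN record is kept as an empty value, which matters because each IN line answers exactly one prompt.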

7.2.1. Test Records

The records used in the file qatest.dat are summarised in the table (Table 7.1, “Records used in qatest.dat”) and described below.

Table 7.1. Records used in qatest.dat

Record                  Short Description                                      Requirement
----------------------  -----------------------------------------------------  -----------
ID ApplicationName-Tag  Test identifier                                        Mandatory
AA ApplicationName      Application name for EMBASSY applications              Mandatory for EMBASSY applications
AB PackageName          EMBASSY package name                                   Mandatory where an AA record is used
AP ApplicationName      Application name for standard EMBOSS applications      Mandatory for standard EMBOSS applications
AQ ApplicationName      Application name for "make check" EMBOSS applications  Mandatory for "make check" EMBOSS applications
CC comment              A comment                                              Optional
CL command              Command line parameters                                Optional
DI directory            Output subdirectory                                    Mandatory where output subdirectories are used
DF FileName             Name of file in output subdirectory                    Optional
DL directive            Directive for results directory on test completion     Optional
ER ErrorCode            Error return code                                      Optional
FI FileName             Name of output file in results directory               Mandatory for all files created in the main results directory
FC [<=>] NumberLines    Test number of lines in the preceding FI file          Mandatory for all files other than stdout and stderr if an FZ record is not defined, otherwise optional
FP count / regexp /     Test for pattern in the preceding FI file              Mandatory for all files other than stdout and stderr
FZ [<=>] FileSize       Test size of the preceding FI file                     Mandatory for all files other than stdout and stderr if an FC record is not defined, otherwise optional
IC text                 Annotation on the input                                Optional
IN UserInput            Value to use in response to an EMBOSS prompt           Mandatory for each application option that will be prompted for by the application
OC text                 Annotation on the output                               Optional
PP command              Pre-processor command                                  Optional
QQ command              Post-processor command                                 Optional
RQ RequiredApps         Required external programs                             Mandatory for applications that require other applications to run, e.g. application wrappers
TI seconds              Time limit for test                                    Optional
UC text                 Annotation on the usage examples                       Optional
##                      Comment                                                Optional

7.2.1.1. ID ApplicationName-Tag

The identifier has the general form ApplicationName-Tag, where ApplicationName is the name of the application being tested and Tag indicates what sort of test is being performed.

When a test is run, a directory is created under /emboss/test/qa/ to hold the output of the test run. This includes all the application output files, plus files holding any output written by the application to the UNIX file streams stdout and stderr. The identifier defines the name of the directory. For example, if the identifier was myprogram-ex then a directory called test/qa/myprogram-ex would be created.

Tag may be any of the following:

-ex

The test is used to create the first usage example in the application HTML documentation. A number following -ex may be given to indicate the second and subsequent tests (see below).

-ex2, -ex3 etc.

Further tests that are used for usage examples in the documentation. In general, all such examples should correspond to a test defined in qatest.dat to ensure the documentation is correct.

-keep

Used where the test is used in the documentation and the test output is retained once the test completes. The test should include a DL keep record. This is used where the test output is read by a subsequent test, as is the case for database indexing. It is important that this test appears in the test file before any tests that use the results.

-keep2, -keep3 etc.

A number following -keep may be given to indicate the second and subsequent tests.

-fail

The test is for a failure condition. Such tests typically have an ER record to indicate the expected error return code.

7.2.1.2. AA ApplicationName

This is used to generate the command line and for reporting statistics. If an AA record is given, the EMBASSY package must be identified with an AB record in the test definition. The application (ApplicationName) must be built by using make on the package. If the application is not found, qatest will assume this EMBASSY package was not installed and will not run the test.

7.2.1.3. AB PackageName

This informs the qatest script that a test depends on installation of an EMBASSY package (PackageName) and may fail to find the binary if the package is not installed.

7.2.1.4. AP ApplicationName

This is used to generate the command line and for reporting statistics. The application (ApplicationName) must be built by using make. The test is always performed by the qatest script.

7.2.1.5. AQ ApplicationName

This is used to generate the command line and for reporting statistics. The application (ApplicationName) must be built by using make check. The qatest script will look in the build directory for the executable as this application will not be installed. If the application is not found, qatest will assume make check was not run and omit the test.

7.2.1.6. CC comment

Used in commenting on failed tests. The comment (comment) will appear in the test output on failure as a guide to an acceptable failure condition. In such cases the text of the error message would be validated using e.g. FP records (see below).

The record ## (see below) is used for more general comments within the test definition.

7.2.1.7. CL command

This is the command line (command) without the application name. The command line may span multiple CL records, in which case text following the CL tokens is appended and a space inserted between the text from each line.
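
The joining rule can be sketched in Python as follows. This is only an illustration of the behaviour described above, not the code used by the qatest script; the function name build_command and the option names in the example are hypothetical.

```python
def build_command(app_name, cl_records):
    """Join the text of successive CL records with single spaces
    and prefix the application name taken from the AA/AP/AQ record."""
    args = " ".join(text.strip() for text in cl_records if text.strip())
    return f"{app_name} {args}".strip()


# Two CL records for one test (option names are made up for illustration).
command = build_command(
    "myprogram",
    ["-sequence tembl-id:P10932", "-outfile P10932.myprogram"],
)
print(command)
```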

7.2.1.8. DI directory

This is the name of a directory (directory) created in the results directory (defined by the ID record) that may be used to hold output where an application opens a new output directory. Currently, tests check that each file is named in a DF line, but tests cannot be defined directly for file contents in those subdirectories. However, a QQ post-processing command (see below) could be used to copy the contents to another file (in the main results directory), which could then be tested, e.g. for size and patterns.

7.2.1.9. DF FileName

This is the name of a file (FileName) created in a subdirectory of the results directory. The subdirectory the file belongs to is identified by the preceding DI record; DF records must be listed under the appropriate DI record. The files must be written for the test to pass.

7.2.1.10. DL directive

This controls what happens to the results directory on completion of the test. directive can be one of:

success

Delete the test directory on success. This is the default behaviour.

all

Delete the test directory always.

keep

Keep the test results directory always. This is used where another test reuses the results, as is the case in database indexing.
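
The three directives amount to a small decision rule, sketched here in Python. The function name should_delete is hypothetical and the sketch only restates the behaviour described above; it is not taken from the qatest script.

```python
def should_delete(directive, passed):
    """Decide whether the results directory is removed after a test.

    directive -- value of the DL record ("success" when no DL record is given)
    passed    -- True if the test succeeded
    """
    if directive == "keep":
        return False          # always keep, e.g. for reuse by a later test
    if directive == "all":
        return True           # always delete
    if directive == "success":
        return passed         # default: delete only when the test passed
    raise ValueError(f"unknown DL directive: {directive!r}")


for directive in ("success", "all", "keep"):
    print(directive, should_delete(directive, True), should_delete(directive, False))
```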

7.2.1.11. ER ErrorCode

The expected error code (ErrorCode). This is used for tests of failure conditions. The test will report an error if the application does not return the specified error code. Applications that succeed return 0; those that fail return 1.

7.2.1.12. FI FileName

Currently, an FI record can only reference a file in the main results directory and not in a subdirectory of the results directory. The file name (FileName) needn't be given for stdout and stderr, which are assumed to exist and to be empty unless defined otherwise.

7.2.1.13. FC [<=>] NumberLines

For the test to pass, the number of lines (NumberLines) may be checked as follows:

=

There must be NumberLines lines.

<

There must be fewer than NumberLines lines.

>

There must be more than NumberLines lines.

7.2.1.14. FP count / regexp /

For the test to pass, the Perl regular expression regexp given between forward-slashes must be found in the preceding FI file. The expression may be preceded by an optional integer (count), in which case the pattern must match exactly count times. A count of 0 therefore tests that the pattern is absent.

This record can be used to check stderr (which often contains the user prompts) for the absence of exception messages (warning, error or fatal).
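
The following Python sketch illustrates the logic of an FP test. It is an approximation only: the qatest script uses Perl regular expressions, whereas this uses Python's re module with the MULTILINE flag so that ^ matches at the start of each line (an assumption about how the patterns are applied). The function name pattern_passes and the sample file contents are invented for the example.

```python
import re


def pattern_passes(text, regexp, count=None):
    """Return True if regexp is found in text, or, when count is given,
    if it is found exactly count times."""
    matches = len(re.findall(regexp, text, flags=re.MULTILINE))
    if count is None:
        return matches > 0
    return matches == count


# Invented contents standing in for stderr and for an output file.
stderr_text = "Display basic information\nInput sequence: \n"
report_text = "Name: P10932\nLength: 2167\n"

print(pattern_passes(stderr_text, r"Error: ", count=0))    # absence test
print(pattern_passes(report_text, r"^Length: 2167\n"))     # presence test
```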

7.2.1.15. FZ [<=>] FileSize

For the test to pass, the size (FileSize) of the file may be checked as follows:

=

File must be exactly FileSize.

<

File must be smaller than FileSize.

>

File must be bigger than FileSize.

There is an implicit test that stdout and stderr must be of size zero unless otherwise stated.
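
FC and FZ tests share the same comparison form, which can be sketched in Python as below. The sketch assumes the operators carry their usual arithmetic meaning (= equal to, < less than, > greater than); the function name check_value is hypothetical and the code is not taken from the qatest script.

```python
import operator

# Map each operator symbol to the comparison it is assumed to denote.
_OPERATORS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def check_value(test, actual):
    """Evaluate a test such as '= 5' or '> 100' against an actual
    line count (FC) or file size (FZ)."""
    test = test.strip()
    symbol, number = test[0], int(test[1:])
    return _OPERATORS[symbol](actual, number)


output_text = "line 1\nline 2\n"
print(check_value("= 2", len(output_text.splitlines())))   # FC = 2
print(check_value("> 100", len(output_text)))              # FZ > 100
print(check_value("= 0", len("")))                         # implicit test for an empty stdout
```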

7.2.1.16. IC text

This is annotation (text) for the input and is used by makeexample.pl when creating the HTML documentation files. The information is not used in QA testing.

7.2.1.17. IN UserInput

The value UserInput is supplied in response to a prompt from the application. If there is nothing on the line then an empty line is input to the application, which will use the default value for the option (if defined). Care is needed where options are conditionally prompted for.
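
The effect of IN records can be pictured as one line of standard input per prompt. The Python sketch below illustrates this with a small stand-in program that reads two responses and falls back to a default when a response is blank. The stand-in program and the function name run_with_responses are invented for the example; how the qatest script actually passes input to an application is not shown here.

```python
import subprocess
import sys

# Stand-in for an application that prompts twice; it is not an EMBOSS program.
CHILD = (
    "import sys\n"
    "first = sys.stdin.readline().strip() or 'default-input'\n"
    "second = sys.stdin.readline().strip() or 'default-output'\n"
    "print(first, second)\n"
)


def run_with_responses(argv, responses):
    """Feed one line per IN record to the program's standard input."""
    stdin_text = "".join(response + "\n" for response in responses)
    result = subprocess.run(argv, input=stdin_text, capture_output=True, text=True)
    return result.returncode, result.stdout


# Two IN records: the first gives a value, the second is blank (accept the default).
code, output = run_with_responses(
    [sys.executable, "-c", CHILD], ["tembl-id:P10932", ""]
)
print(code, output.strip())
```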

7.2.1.18. OC text

This is annotation (text) for the output and is used by makeexample.pl when creating the HTML documentation files. The information is not used in QA testing.

7.2.1.19. PP command

This is a command executed (by /bin/sh) before the test is run. Each PP line defines a single command: long commands may not be concatenated over multiple lines as is allowed for the CL records.

A typical use is to set an environment variable required by the application. The variable should always be exported (this is /bin/sh), for example:

PP EMBOSS_ACDROOT=../../acd
PP export EMBOSS_ACDROOT

7.2.1.20. QQ command

This is a command executed (by /bin/sh) after the test is run. Each QQ line defines a single command: long commands may not be concatenated over multiple lines as is allowed for the CL records.

This is not used at present but the most likely application is to list the contents of a directory to another file which can then be tested for size and patterns.

7.2.1.21. RQ RequiredApps

A single required "helper" application should be given per RQ record. For example, srs is required for tests that use getz, or clustalw for emma.

7.2.1.22. TI seconds

This is the time limit (seconds) after which the test times out. The default is 60 seconds. Some examples can take longer on a heavily loaded system.
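
A time limit of this kind can be sketched in Python with the timeout argument of subprocess.run. This is only an illustration of the idea; the function name run_with_limit is hypothetical, the commands run are stand-ins rather than EMBOSS applications, and the qatest script's own mechanism may differ.

```python
import subprocess
import sys


def run_with_limit(argv, seconds=60):
    """Run a command and return its exit code, or None if it exceeded
    the time limit (the child process is killed in that case)."""
    try:
        return subprocess.run(argv, capture_output=True, timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        return None


quick = run_with_limit([sys.executable, "-c", "pass"], seconds=30)
slow = run_with_limit([sys.executable, "-c", "import time; time.sleep(5)"], seconds=0.5)
print(quick, slow)
```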

7.2.1.23. UC text

This is annotation (text) for the usage example and is used by makeexample.pl when creating the HTML documentation files. The information is not used in QA testing.

7.2.1.24. ##

This is a general comment in the test definition and is not reported. You should use CC records to comment on tests for failure.

7.2.2. Writing an Application QA Test

QA tests for new applications must be added to the appropriate place in the file qatest.dat. For example, if you were writing a test for an application in the EMBASSY package myemboss you would search for the line "AB myemboss" and add the test under there. The test, as a minimum, should include the following records:

  • An ID record with an identifier code for the test

  • An AA (EMBASSY) or an AP or AQ (EMBOSS) record for the application name, and an AB record with the name of the EMBASSY package (if appropriate)

  • A CL record giving any options on the command line for the test and/or one or more IN lines to give responses to any requests for input from the program. One IN record, which may be blank if the default response is acceptable, is required for each prompt

  • An FI for the name of each output file

  • An FC record under each FI file to test for correct line count or an FZ record to test for file size

  • One or more FP tests under each FI record to test for file contents

  • One or more DI records for output subdirectories where they are used

  • An RQ record for the name of required applications if there are any

  • An FI, FZ (or FC) and an FP record should be given for each of stderr and stdout
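
Part of the checklist above can be expressed as a small Python sketch that reports which of the minimum records are missing from a test definition. It takes a list of (code, value) pairs and covers only the ID, AA/AB/AP/AQ, FI, FC/FZ and FP rules. The function name missing_records is hypothetical and the check is not something the qatest script is known to perform in this form.

```python
def missing_records(records):
    """Return a list of problems found in one test definition,
    given as (code, value) pairs in file order."""
    problems = []
    codes = [code for code, _ in records]
    if "ID" not in codes:
        problems.append("no ID record")
    if not any(code in codes for code in ("AA", "AP", "AQ")):
        problems.append("no AA, AP or AQ record")
    if "AA" in codes and "AB" not in codes:
        problems.append("AA record without AB record")

    # Group FC/FZ/FP records under the FI record they follow.
    current, seen = None, {}
    for code, value in records:
        if code == "FI":
            current = value
            seen[current] = set()
        elif code in ("FC", "FZ", "FP") and current is not None:
            seen[current].add(code)

    for name, found in seen.items():
        if name in ("stdout", "stderr"):
            continue                      # tests on these streams are implicit
        if not found & {"FC", "FZ"}:
            problems.append(f"{name}: no FC or FZ record")
        if "FP" not in found:
            problems.append(f"{name}: no FP record")
    return problems


good = [("ID", "myprogram-ex"), ("AB", "myemboss"), ("AA", "myprogram"),
        ("IN", ""), ("FI", "P10932.myprogram"), ("FC", "= 5"),
        ("FP", "/^Length: 2167\\n/")]
bad = [("ID", "myprogram-ex"), ("AA", "myprogram"), ("FI", "P10932.myprogram")]
print(missing_records(good))
print(missing_records(bad))
```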

7.2.2.1. Location of test data

Any test data, i.e. the input files required by the application, should be added to the directory:

emboss/test/data

Where many input files are required then, to keep things tidy, these may be put under a subdirectory of the above directory. In either case, you should not create files unnecessarily: it is possible, likely even, that a file already exists under emboss/test/data that is suitable for your needs. More information on the contents of this directory is given below (Section 7.2.4, “Useful Files and Directories”).

7.2.3. Running an Application QA Test

To perform tests, you must edit your .embossrc file (in your home directory) or the emboss.default file to set the EMBOSS environment variable emboss_qadata to the test directory, e.g.

/home/auser/emboss/emboss/test

To run a test you must run the script:

/emboss/scripts/qatest.pl

from the directory /emboss/test/qa. The command has the following form:

qatest.pl TestIdentifier

where TestIdentifier is the test identifier given on the ID record of the appropriate entry in qatest.dat.

7.2.3.1. Example test

Let's assume you want to run the example entry in qatest.dat shown previously:

ID myprogram-ex
AB myemboss
AA myprogram
IN
FI stderr
FC = 2
FP 0 /Warning: /
FP 0 /Error: /
FP 0 /Died: /
FI P10932.myprogram
FC = 5
FP /^Usa: tembl-id:P10932\n/
FP /^Length: 2167\n/
//

Let's also assume you are in the directory emboss/test. To run the test myprogram-ex you would type:

cd qa
../../scripts/qatest.pl myprogram-ex -keep

If the output files are to be retained once the test completes, for tests that include no DL keep record, then qatest.pl must be invoked with the -keep qualifier. If it is not, the output files will be deleted. So, had the test included these lines:

ID myprogram-keep
DL keep

it would be invoked thus:

../../scripts/qatest.pl myprogram-keep

In either case, output files are created in the main results directory. If the test identifier line is

ID myprogram-ex

then the results directory will be:

emboss/test/qa/myprogram-ex

For other tests, files might also be written to subdirectories in the main results directory (see DI and DF records).

If qatest.pl is run on something not defined in qatest.dat it will report:

Tests total: 0.

If the test succeeds, all files are deleted unless the test entry included a DL keep line, or -keep was specified on the command line.

If it fails, it will report why and all results will be saved in the results directory for inspection. You would check, for example, myprogram-ex, then identify the problem, update the test definition and try again until it works.

A typical session looks something like this:

../../scripts/qatest.pl -without=srs
Tests total: 1586 pass: 1586 fail: 0
Skipped: 12 check: 1 embassy: 0 requirements: 11
Missing documentation html: 0 text: 0 sourceforge: 0
Time: 677 seconds

7.2.4. Useful Files and Directories

There are some useful files held under /emboss/test/. The directories are as follows:

acd

ACD files for test applications. These are used for testing ACD file parsing. You will not need this directory unless you extend ACD.

data

The directory for test data that has already been mentioned. Your test data input files should go in here, or in a subdirectory beneath it.

gb

Some GenBank data files in NBRF/GCG database format.

qa

The directory from which all tests must be run. Application output files are written to their own results directory underneath this, and are deleted on successful test completion unless otherwise stated.

rc

This directory is used for database and resource definition tests. You will not need it.

swnew

Files from the SwissProt database used to make the tsw test database.

embl

Files from the EMBL database used to make the tembl test database.

genbank

Some GenBank data files used to make the tgenbank test database.

memtest

Files for tracing memory leaks: you will not need this.

pir

Data files from the PIR database used to make the tpir test database.

swiss

Data files from the SwissProt database.

wormpep

Data files from the WormPep (worm peptide) database.