Operations are used within ACD files to achieve flexible and precise control over the command line interface. Currently the following types of operation are supported:

Operations to retrieve the value of a data definition

Operations to retrieve the values of attributes and calculated attributes of a data definition

Arithmetic operations

Tests for equality

Boolean logical operations

Calculations that use the above operations

Unary and ternary conditional statements using the above operations

An operation must be enclosed in a pair of parentheses `()` and preceded by the "at" symbol (`@`):

`@(operation)`

If the operation contains white space, the whole token should be enclosed in double quotes (`" "`):

`"@(operation with white space)"`

Operations can be nested, in which case each operation must be enclosed in parentheses:

`"@(@(operation) nested operation)"`

An operation typically contains a reference to the value or attribute of a data definition or variable. It may also contain arithmetic, logical and conditional operators.

`@(a+b)` [Addition]

`@(a-b)` [Subtraction]

`@(a*b)` [Multiplication]

`@(a/b)` [Division]

`@(token1==token2)` [Equality]

`@(token1!=token2)` [Non-equality]

`@(token1<token2)` [Less-than]

`@(token1>token2)` [Greater-than]

The attribute values for a given data definition can depend on the values from other data definitions. It is possible to retrieve the value of:

A data definition (application parameter)

An attribute of a data definition

A calculated attribute of a data definition

An ACD variable definition

Such values are retrieved using the ACD "get the value of" syntax, which consists of a term (*ParameterName*.`AttributeName`) surrounded by parentheses with a dollar sign (`$`) at the front:

`$(ParameterName.AttributeName)`

If only the value of the data definition is to be retrieved then the `AttributeName` component is omitted:

`$(ParameterName)`

Of course a variable name may also be given. Variables do not have attributes, therefore the syntax is:

`$(VariableName)`

For example, in the following ACD file excerpt two integer inputs are defined:

    integer: gappenalty [
      standard: "Y"
      default: "10"
    ]
    integer: gapextpenalty [
      standard: "Y"
      default: "$(gappenalty)"
    ]

The default for the standard qualifier `gapextpenalty` is set to the value of qualifier `gappenalty`.

As another example, assume a sequence input and a window over that sequence which should not be longer than the sequence itself. There is also a subwindow which (by default) is one residue shorter than the main window, and an array of numbers that is the same size as the subwindow. The following ACD code would suffice:

    sequence: sequence [
      parameter: "Y"
    ]
    integer: window [
      standard: "Y"
      default: "10"
      maximum: "$(sequence.length)"
    ]
    integer: subwindow [
      standard: "y"
      default: "@($(window)-1)"
      maximum: "@($(window.maximum)-1)"
    ]
    array: values [
      standard: "Y"
      size: "$(subwindow)"
    ]

`sequence.length` is a calculated attribute and is the length of the sequence. The `$(sequence.length)` value is calculated during ACD processing once the input sequence has been read in. In contrast, `window.maximum` simply refers to the `maximum:` attribute of the `window` data definition.

Of course you need to be careful that the type of the retrieved value matches that of the attribute which is being calculated. For instance, if you have defined a `toggle` datatype:

toggle: dovalues [ standard: "Y" ]

then the following definition is perfectly valid because the `standard:` global attribute has a boolean value, which is the same type as the `toggle` datatype.

    array: values [
      standard: "$(dovalues)"
      size: "$(subwindow)"
    ]

The following definition is *not* valid and an error will be generated during ACD processing:

    array: values [
      standard: "Y"
      size: "$(dovalues)"  /* This is not valid ! */
    ]

It is invalid because the `size:` attribute of the `array` datatype has an integer value, whereas the `toggle` datatype has a boolean value.

Calculations can be performed in ACD using the `@` syntax. A rather silly, but legal, calculation would be:

@(5 + 9)

which equates to the value 14. Calculations can be used to add, subtract, multiply or divide, or test for equality, inequality, "greater than" or "less than". The test values can be integers, floats and strings. You can use calculations with most attributes of datatypes where they make sense. In the following example a start condition is set to have a maximum value of the sequence length minus a window size value plus one:

    sequence: sequence [
      parameter: "Y"
      type: pureprotein
    ]
    integer: window [
      standard: "Y"
    ]
    integer: start [
      standard: "Y"
      maximum: "@(@($(sequence.length) - $(window)) + 1)"
    ]

Note that there are two separate calculations here so each needs to be surrounded by the `@()` syntax. Long calculations can get messy. If you need to use them then you possibly need to rethink your ACD logic. If they can't be avoided then they can be tidied up with the use of variables (see below).
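To make the evaluation order concrete, the fragment below repeats the `start` definition with each resolution step annotated in comments. The sequence length of 100 and the window size of 10 are assumed values chosen purely for illustration.

```acd
# Assumed inputs: sequence length 100, window 10
#
# Step 1, inner operation: @($(sequence.length) - $(window)) -> @(100 - 10) -> 90
# Step 2, outer operation: @(90 + 1) -> 91
integer: start [
  standard: "Y"
  maximum: "@(@($(sequence.length) - $(window)) + 1)"
]
```

Under these assumptions the largest accepted value of `start` would be 91, the last position at which a window of 10 residues still fits inside the sequence.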

The supported arithmetic operations are addition, subtraction, multiplication and division. The standard characters for the arithmetic operations are used: `+`, `-`, `*` and `/`.

    @(a+b)    (Addition)
    @(a-b)    (Subtraction)
    @(a*b)    (Multiplication)
    @(a/b)    (Division)

The operands `a` and `b` *must* resolve to a numeric value (integer or floating point); otherwise the result is undefined and will most probably cause an error during ACD processing.

Only a single arithmetic operation is allowed per set of parentheses. For instance, the code below is *not* valid:

    integer: a []
    integer: b []
    integer: c []
    integer: n [
      default: "@(a + b + c)"  /* This is not valid ! */
    ]

It could however be rewritten using nested operations. The following is perfectly valid:

integer: n [ default: "@(@(a + b)+c)" ]

Note that two sets of parentheses are required: the first around the addition of `a` and `b`, which gives `@(a + b)`, and the second around the addition of `@(a + b)` to `c`, which gives `@(@(a + b)+c)`.

Where more than one arithmetic operation is required, however, one would typically use an internal ACD variable to hold the intermediate results.
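As a hedged sketch of that approach, the invalid three-term sum can be split across a variable. The variable name `aplusb` is hypothetical, and the operands are written with the `$()` retrieval syntax on the assumption that `a`, `b` and `c` are the integer data definitions declared earlier.

```acd
# Hypothetical variable holding the first partial sum
variable: aplusb "@($(a) + $(b))"

# Each operation now contains a single arithmetic operator
integer: n [
  default: "@($(aplusb) + $(c))"
]
```

Each `@()` still holds only one arithmetic operation, but the intermediate result now has a name that later definitions can reuse.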

The supported equality tests (symbols in parentheses) are "equality" (`==`), "non-equality" (`!=`), "less than" (`<`) and "greater than" (`>`):

    @(token1 == token2)    (Equality)
    @(token1 != token2)    (Non-equality)
    @(token1 < token2)     (Less-than)
    @(token1 > token2)     (Greater-than)

The above equality tests can also be used on strings, in which case the lexicographical sorting order of the strings is used.

The supported boolean operations are logical `AND`, logical `OR` and logical `NOT`. Again, the standard characters are used: `&`, `|` and `!`:

    @(!a)     (NOT)
    @(a|b)    (OR)
    @(a&b)    (AND)

In the following ACD code snippet:

    integer: fubar [
      standard: "Y"
      default: 5
      etc
    ]
    integer: rtfm [
      standard: "@(@($(fubar)==3) | @($(fubar)==7))"
      etc
    ]

The integer `rtfm` will only be prompted for if the value of `fubar` is either 3 or 7. Each of the equality tests is a calculation and the boolean test is another calculation. There are therefore three instances of `@()`.
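The `AND` and `NOT` operators can be used in the same way. The following sketch is illustrative only: the `dowindow` toggle and the `step` qualifier are hypothetical names, and `sequence.protein` is assumed to resolve to a boolean once the sequence has been read.

```acd
sequence: sequence [
  parameter: "Y"
  type: "gapany"
]

toggle: dowindow [
  standard: "Y"
]

# Prompt for the window only if the sequence is protein AND the toggle is set
integer: window [
  standard: "@($(sequence.protein) & $(dowindow))"
  default: "10"
]

# Prompt for the step size only if the toggle is NOT set
integer: step [
  standard: "@(!$(dowindow))"
  default: "1"
]
```

Both operands of `&` are already boolean values, so no nested `@()` is needed here. Nesting would only be required if an operand were itself a calculation, such as an equality test.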

There are three kinds of conditional statements in ACD: unary, ternary and case-type.

A typical use for unary conditionals is to switch prompts on or off. Assume that a window size should only be prompted for if the sequence turns out to be a protein. The ACD to achieve this would look as follows:

    sequence: sequence [
      parameter: "Y"
      type: gapany
    ]
    integer: window [
      standard: "$(sequence.protein)"
      etc
    ]

If the sequence is a protein then the `standard:` attribute is equivalent to:

standard: "Y"

and the prompt is switched on. If the sequence is nucleic the statement is equivalent to:

standard: "N"

This will effectively disable the prompt. Controlling prompting is described in detail elsewhere (Section 4.5, “Controlling the Prompt”).

Ternary conditional statements have the general form:

@(conditional ? value-if-true : value-if-false)

They are useful when setting up the application for two distinct modes of usage, for example when setting gap penalty values differently for proteins and nucleic acids in alignment programs. The example below will set the penalty to 14 for proteins and 16 for nucleic acids:

    integer: penalty [
      standard: "N"
      default: "@($(sequence.protein) ? 14 : 16)"
      etc
    ]
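The same pattern applies to any attribute whose value should switch between two alternatives. The fragment below is a minimal sketch: the `wordsize` qualifier and its values of 3 and 11 are assumptions made for illustration and are not taken from any EMBOSS application.

```acd
sequence: sequence [
  parameter: "Y"
  type: "any"
]

# Assumed values: short words for protein, longer words for nucleic acid
integer: wordsize [
  standard: "Y"
  minimum: "1"
  default: "@($(sequence.protein) ? 3 : 11)"
]
```

Because the condition refers to `sequence`, the default can only be resolved after the sequence input has been read.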

In the case-type operation, the test value is compared with a list of possible values. If a match is found then the operation resolves to the result associated with that possible value. The test value, which is parsed as a string, is followed by an equals sign (`=`), which in turn is followed by one or more pairs of possible and associated values separated by a colon (`:`). If none of the possible values match then the operation will resolve to the default result that is associated with the keyword `else`.

The `else :` default value pair is not mandatory. If none of the possible values match in an operation without a default value then the operation will resolve to `NULL`.

This is formalised as follows:

`@(testval = poss_valA : ass_valA poss_valB : ass_valB else : default_val)`

For example:

    string: matrix [
      default: "@($(sequence.type) = protein : BLOSUM62 dna : dnamat rna : rnamat else : unknown)"
    ]

The `$(sequence.type)` variable is a string value that holds the sequence type of the ACD data item named `sequence`. If the type is `protein`, the operation resolves to `BLOSUM62`; if the type is `dna` it resolves to `dnamat`, etc. If the type is not in this list, the operation resolves to `unknown`.
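A case-type operation can supply values other than strings, and several possible values may share the same associated value. The sketch below is an assumption-laden illustration: the `ktuple` qualifier and all of the numbers are hypothetical.

```acd
# Hypothetical qualifier; dna and rna deliberately map to the same value
integer: ktuple [
  default: "@($(sequence.type) = protein : 2 dna : 6 rna : 6 else : 4)"
]
```

As with the ternary form, the associated values must be of a type that the receiving attribute accepts, here an integer.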

If the test value cannot unambiguously be assigned to a single associated value then the operation will resolve to the LAST associated value that matches its possible value.

Variables are useful for holding partial calculations or values. The general syntax for them is:

`variable: VariableName "Variable value"`

Note that, as a variable only has a single value and no attributes, square brackets are not used.

As an example, here is a calculation to determine the maximum size of a sequence window:

    integer: start [
      standard: "Y"
      maximum: "@(@($(sequence.length) - $(window)) + 1)"
    ]

This can be tidied by storing one of the calculations in a partial result as follows:

    variable: lminusw "@($(sequence.length) - $(window))"
    integer: start [
      standard: "Y"
      maximum: "@($(lminusw) + 1)"
    ]

In the following ACD code, an internal ACD variable `protlen` is used to store an intermediate result. The value of the variable `$(protlen)` is calculated from the length of the input sequence (`sequence` datatype) and used in the definition of the maximum size of the `window` parameter:

    variable: protlen "@( $(sequence.length) / 3 )"
    integer: window [
      maximum: "@($(protlen)-50)"
      default: 50
    ]

The same result could be achieved using nested operations as shown below:

    integer: window [
      maximum: "@( @( $(sequence.length) / 3) - 50)"
      default: 50
    ]

Here the `maximum:` attribute of the `window` parameter is calculated directly from the `sequence.length` calculated attribute, with the division nested within a separate subtraction operation.

Variables may be used to simplify the ACD file making it easier to read and parse. An ACD file can use a variable definition to define a result once only, and then to refer to the variable by name in all later ACD data type definitions. The use of variables however might indicate that there is some complexity in the ACD definitions. When a variable is used, or when a conditional operation refers to another ACD value, the application might logically be regarded as two or more separate applications forked by the conditions resolved.