Operations are used within ACD files to achieve flexible and precise control over the command line interface. Currently the following types of operation are supported:
Operations to retrieve the value of a data definition
Operations to retrieve the values of attributes and calculated attributes of a data definition
Tests for equality
Boolean logical operations
Calculations that use the above operations
Unary and ternary conditional statements using the above operations
An operation must be enclosed in a pair of parentheses () and preceded by the "at" symbol (@):
    @(operation)
If the operation contains white space, the whole token should be enclosed in double quotes (" "):
    "@(operation with white space)"
Operations can be nested, in which case each operation must be enclosed in its own parentheses, for example:
    @(@(a + b)+c)
An operation typically contains a reference to the value or attribute of a data definition or variable. It may also contain arithmetic, logical and conditional operators.
The attribute values for a given data definition can depend on the values from other data definitions. It is possible to retrieve the value of:
A data definition (application parameter)
An attribute of a data definition
A calculated attribute of a data definition
An ACD variable definition
Such values are retrieved using the ACD "get the value of" syntax, which consists of a term surrounded by parentheses with a dollar sign ($) at the front:
    $(name.attribute)
If only the value of the data definition is to be retrieved then the .attribute component is omitted:
    $(name)
A variable name may also be given. Variables do not have attributes, therefore the syntax is:
    $(variablename)
For example, in the following ACD file excerpt two integer inputs are defined:
    integer: gappenalty [
      standard: "Y"
      default: "10"
    ]

    integer: gapextpenalty [
      standard: "Y"
      default: "$(gappenalty)"
    ]
The default for the standard qualifier gapextpenalty is set to the value of qualifier gappenalty.
As another example, assume a sequence input and a window over that sequence which should not be longer than the sequence itself. There is also a subwindow which (by default) is one residue shorter than the main window, and an array of numbers that is the same size as the subwindow. The following ACD code would suffice:
    sequence: sequence [
      parameter: "Y"
    ]

    integer: window [
      standard: "Y"
      default: "10"
      maximum: "$(sequence.length)"
    ]

    integer: subwindow [
      standard: "y"
      default: "@($(window)-1)"
      maximum: "@($(window.maximum)-1)"
    ]

    array: values [
      standard: "Y"
      size: "$(subwindow)"
    ]
sequence.length is a calculated attribute and is the length of the sequence. The $(sequence.length) value is calculated during ACD processing once the input sequence has been read in. In contrast, window.maximum simply refers to the maximum: attribute of the window data definition.
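The "get the value of" substitution described above can be modelled in a few lines of Python. This is only an illustrative sketch, not how EMBOSS implements ACD processing: the `VALUES` table, the `resolve` function and the numbers in it are hypothetical stand-ins for whatever is known once the inputs have been read.

```python
import re

# Hypothetical stand-in for values known during ACD processing.
# Keys without a dot are data definition (or variable) values;
# keys with a dot are attributes or calculated attributes.
VALUES = {
    "window": "10",
    "window.maximum": "250",
    "sequence.length": "250",
}

# Matches $(name) or $(name.attribute).
REFERENCE = re.compile(r"\$\(([A-Za-z_]\w*(?:\.\w+)?)\)")

def resolve(text, values):
    """Replace every $(name) or $(name.attribute) in text with its value."""
    def lookup(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"unresolved reference: $({key})")
        return values[key]
    return REFERENCE.sub(lookup, text)

print(resolve("$(sequence.length)", VALUES))      # 250
print(resolve("@($(window.maximum)-1)", VALUES))  # @(250-1)
```

Note that the second call only substitutes the reference; the surrounding @() calculation is left for a later evaluation step.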
You need to be careful that the type of the retrieved value matches that of the attribute which is being calculated. For instance, if you have defined a toggle:
    toggle: dovalues [
      standard: "Y"
    ]
then the following definition is perfectly valid because the standard: global attribute has a boolean value, which is the same type as the toggle datatype:
    array: values [
      standard: "$(dovalues)"
      size: "$(subwindow)"
    ]
The following definition is not valid and an error will be generated during ACD processing:
    array: values [
      standard: "Y"
      size: "$(dovalues)" /* This is not valid ! */
    ]
It is invalid because the size: attribute of the array datatype has an integer value, whereas the toggle datatype has a boolean value.
Calculations can be performed in ACD using the @ syntax. A rather silly, but legal, calculation would be:
@(5 + 9)
which equates to the value 14. Calculations can be used to add, subtract, multiply or divide, or test for equality, inequality, "greater than" or "less than". The test values can be integers, floats and strings. You can use calculations with most attributes of datatypes where they make sense. In the following example a start condition is set to have a maximum value of the sequence length minus a window size value plus one:
    sequence: sequence [
      parameter: "Y"
      type: pureprotein
    ]

    integer: window [
      standard: "Y"
    ]

    integer: start [
      standard: "Y"
      maximum: "@(@($(sequence.length) - $(window)) + 1)"
    ]
Note that there are two separate calculations here so each needs to be surrounded by the @() syntax. Long calculations can get messy. If you need to use them then you possibly need to rethink your ACD logic. If they can't be avoided then they can be tidied up with the use of variables (see below).
The supported arithmetic operations are addition, subtraction, multiplication and division. The standard characters for the arithmetic operations are used:
    @(a+b)    (Addition)
    @(a-b)    (Subtraction)
    @(a*b)    (Multiplication)
    @(a/b)    (Division)
Both a and b must resolve to a numeric value (integer or floating point); otherwise the result is undefined and will most probably cause an error during ACD processing.
Only a single arithmetic operation is allowed per set of parentheses. For instance, the code below is not valid:
    integer: a
    integer: b
    integer: c
    integer: n [
      default: "@(a + b + c)" /* This is not valid ! */
    ]
It could however be rewritten using nested operations. The following is perfectly valid:
integer: n [ default: "@(@(a + b)+c)" ]
Note that two sets of parentheses are required: the first around the addition of a and b, which gives @(a + b); the second around the addition of @(a + b) to c, which gives @(@(a + b)+c).
Where more than one arithmetic operation is required, however, one would typically use an internal ACD variable to hold the intermediate results.
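The "one operation per set of parentheses" rule, and the way nesting works around it, can be illustrated with a toy evaluator. This is a sketch under stated assumptions rather than the EMBOSS parser: `apply_one` and `evaluate` are hypothetical names, operands are assumed to be already-resolved numbers, and floating-point division is assumed because the rounding behaviour is not described here.

```python
import operator
import re

OPERATORS = {"+": operator.add, "-": operator.sub,
             "*": operator.mul, "/": operator.truediv}

NUMBER = r"-?\d+(?:\.\d+)?"
# Exactly one binary arithmetic operation between two numbers.
BINARY = re.compile(rf"\s*({NUMBER})\s*([-+*/])\s*({NUMBER})\s*")
# An operation that contains no further parentheses.
INNERMOST = re.compile(r"@\(([^()]*)\)")

def apply_one(body):
    """Evaluate a single 'a op b' body; reject anything longer."""
    match = BINARY.fullmatch(body)
    if match is None:
        raise ValueError(f"expected exactly one operation: {body!r}")
    left, symbol, right = match.groups()
    result = OPERATORS[symbol](float(left), float(right))
    # Show whole numbers without a trailing ".0".
    return str(int(result)) if result.is_integer() else str(result)

def evaluate(expression):
    """Repeatedly replace the innermost @(...) with its result."""
    while True:
        match = INNERMOST.search(expression)
        if match is None:
            return expression
        expression = (expression[:match.start()]
                      + apply_one(match.group(1))
                      + expression[match.end():])

print(evaluate("@(5 + 9)"))            # 14
print(evaluate("@(@(250 - 10) + 1)"))  # 241
```

In this model `evaluate("@(1 + 2 + 3)")` raises `ValueError`, mirroring the invalid example above, while the nested form `@(@(1 + 2) + 3)` evaluates normally.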
The supported equality tests (symbols in parentheses) are "equality" (==), "non-equality" (!=), "less than" (<) and "greater than" (>):
    @(token1 == token2)    (Equality)
    @(token1 != token2)    (Non-equality)
    @(token1 < token2)     (Less-than)
    @(token1 > token2)     (Greater-than)
The above equality tests can be used on strings, in which case the lexicographical sorting order of the strings is used.
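The difference between numeric and lexicographical comparison is easy to overlook, so a small Python sketch may help. The `compare` function is hypothetical, and the rule it uses (compare as numbers when both tokens parse as numbers, otherwise as strings) is an assumption made for illustration; the text above does not say how ACD chooses between the two.

```python
import operator

TESTS = {"==": operator.eq, "!=": operator.ne,
         "<": operator.lt, ">": operator.gt}

def compare(left, symbol, right):
    """Compare two tokens numerically if possible, else as strings."""
    try:
        a, b = float(left), float(right)
    except ValueError:
        # At least one token is not a number: fall back to
        # lexicographical (character by character) ordering.
        a, b = left, right
    return TESTS[symbol](a, b)

print(compare("9", "<", "10"))        # True  (numeric: 9 < 10)
print(compare("dna", "<", "protein")) # True  ("d" sorts before "p")
print(compare("dna", "==", "rna"))    # False
```

Compared purely as strings, "9" would sort after "10", which is why the numeric case is handled separately in this sketch.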
The supported boolean operations are logical AND, logical OR and logical NOT. Again, the standard characters are used:
    @(!a)     (NOT)
    @(a|b)    (OR)
    @(a&b)    (AND)
In the following ACD code snippet:
    integer: fubar [
      standard: "Y"
      default: 5
      etc
    ]

    integer: rtfm [
      standard: "@(@($(fubar)==3) | @($(fubar)==7))"
      etc
    ]
rtfm will only be prompted for if the value of fubar is either 3 or 7. Each of the equality tests is a calculation and the boolean test is another calculation. There are therefore three instances of @().
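The prompting logic of the fubar and rtfm snippet can be restated in Python to make the three separate calculations visible. The function name `rtfm_is_prompted` is hypothetical and the code only models the boolean outcome, not ACD processing itself.

```python
def rtfm_is_prompted(fubar):
    """Model of standard: "@(@($(fubar)==3) | @($(fubar)==7))"."""
    first = (fubar == 3)     # first inner calculation:  @($(fubar)==3)
    second = (fubar == 7)    # second inner calculation: @($(fubar)==7)
    return first or second   # outer calculation:        @(first | second)

for value in (3, 5, 7):
    print(value, rtfm_is_prompted(value))
# 3 True
# 5 False
# 7 True
```

With the default value of 5 shown in the snippet, rtfm would therefore not be prompted for.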
There are three kinds of conditional statements in ACD: unary, ternary and case-type.
A typical use for unary conditionals is to switch prompts on or off. Assume that a window size should only be prompted for if the sequence turns out to be a protein. The ACD to achieve this would look as follows:
    sequence: sequence [
      parameter: "Y"
      type: gapany
    ]

    integer: window [
      standard: "$(sequence.protein)"
      etc
    ]
If the sequence is a protein then the standard: attribute is equivalent to:
    standard: "Y"
and the prompt is switched on. If the sequence is nucleic the attribute is equivalent to:
    standard: "N"
This will effectively disable the prompt. Controlling prompting is described in detail elsewhere (Section 4.5, “Controlling the Prompt”).
Ternary conditional statements have the general form:
@(conditional ? value-if-true : value-if-false)
They are useful when setting up the application for two distinct modes of usage, for example when setting gap penalty values differently for proteins and nucleic acids in alignment programs. The example below will set the penalty to 14 for proteins and 16 for nucleic acids:
    integer: penalty [
      standard: "N"
      default: "@($(sequence.protein) ? 14 : 16)"
      etc
    ]
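For readers more familiar with general-purpose languages, the ternary form corresponds to an ordinary conditional expression. The following Python sketch mirrors the penalty example; `ternary` and `default_penalty` are hypothetical names used only for illustration.

```python
def ternary(condition, value_if_true, value_if_false):
    """Model of @(conditional ? value-if-true : value-if-false)."""
    return value_if_true if condition else value_if_false

def default_penalty(sequence_is_protein):
    """Model of default: "@($(sequence.protein) ? 14 : 16)"."""
    return ternary(sequence_is_protein, 14, 16)

print(default_penalty(True))   # 14
print(default_penalty(False))  # 16
```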
In the case-type operation, the test value is compared with a list of possible values. If a match is found then the operation resolves to the result associated with that possible value. The test value, which is parsed as a string, is followed by an equals sign (=), which in turn is followed by one or more pairs of possible and associated values separated by a colon (:). If none of the possible values match then the operation will resolve to the default result that is associated with the keyword else. The else : default value pair is not mandatory and if none of the possible values match in an operation without the default value then the operation will resolve to a
This is formalised as follows:
    @(testvalue = possible1 : result1 possible2 : result2 else : default)
For example:
    string: matrix [
      default: "@($(sequence.type) = protein : BLOSUM62 dna : dnamat rna : rnamat else : unknown)"
    ]
The $(sequence.type) variable is a string value that holds the sequence type of the ACD data item named sequence. If the type is protein, the operation resolves to BLOSUM62; if the type is dna it resolves to dnamat, etc. If the type is not in this list, the operation resolves to unknown.
If the test value cannot unambiguously be assigned to a single associated value then the operation will resolve to the LAST associated value that matches its possible value.
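The case-type rules, including the "last match wins" behaviour, can be sketched in Python as follows. This is a model under assumptions, not the EMBOSS implementation: `case_operation` is a hypothetical name, matching is assumed to be exact string equality, an ambiguous test is modelled as a possible value that appears more than once, and `None` is used purely as a placeholder for the case with no match and no else pair.

```python
def case_operation(test_value, pairs, default=None):
    """Model of @(test = possible : result ... else : default).

    pairs is a list of (possible, result) tuples in written order.
    default models the optional 'else' result; None is only a
    placeholder for "no else given" in this sketch.
    """
    result = default
    for possible, associated in pairs:
        if str(test_value) == possible:
            result = associated  # keep going, so the LAST match wins
    return result

MATRIX_PAIRS = [("protein", "BLOSUM62"), ("dna", "dnamat"), ("rna", "rnamat")]

print(case_operation("protein", MATRIX_PAIRS, "unknown"))  # BLOSUM62
print(case_operation("dna", MATRIX_PAIRS, "unknown"))      # dnamat
print(case_operation("other", MATRIX_PAIRS, "unknown"))    # unknown
print(case_operation("x", [("x", "first"), ("x", "second")]))  # second
```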
Variables are useful for holding partial calculations or values. The general syntax for them is:
    variable: name "value"
Note that, as a variable only has a single value and no attributes, square brackets are not used.
As an example, here is a calculation to determine the maximum size of a sequence window:
    integer: start [
      standard: "Y"
      maximum: "@(@($(sequence.length) - $(window)) + 1)"
    ]
This can be tidied by storing one of the calculations in a partial result as follows:
    variable: lminusw "@($(sequence.length) - $(window))"

    integer: start [
      standard: "Y"
      maximum: "@($(lminusw) + 1)"
    ]
In the following ACD code, an internal ACD variable protlen is used to store an intermediate result. The value of the variable $(protlen) is calculated from the length of the input sequence (sequence datatype) and used in the definition of the maximum size of the window parameter:
    variable: protlen "@( $(sequence.length) / 3 )"

    integer: window [
      maximum: "@($(protlen)-50)"
      default: 50
    ]
The same result could be achieved using nested operations as shown below:
    integer: window [
      maximum: "@( @( $(sequence.length) / 3) - 50)"
      default: 50
    ]
Here the maximum size of the window parameter is calculated directly from the sequence.length calculated attribute, using a division nested within a separate subtraction operation.
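That the variable form and the nested form give the same result can be checked with a short Python model. The function names are hypothetical, and ordinary floating-point division is assumed because the rounding behaviour of ACD division is not described here; the example length of 300 is chosen so that the question does not arise.

```python
def window_maximum_with_variable(sequence_length):
    """Model of: variable: protlen "@( $(sequence.length) / 3 )"
    followed by maximum: "@($(protlen)-50)"."""
    protlen = sequence_length / 3  # intermediate result held in a variable
    return protlen - 50

def window_maximum_nested(sequence_length):
    """Model of: maximum: "@( @( $(sequence.length) / 3) - 50)"."""
    return (sequence_length / 3) - 50

length = 300  # hypothetical sequence length
print(window_maximum_with_variable(length) == window_maximum_nested(length))  # True
```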
Variables may be used to simplify the ACD file making it easier to read and parse. An ACD file can use a variable definition to define a result once only, and then to refer to the variable by name in all later ACD data type definitions. The use of variables however might indicate that there is some complexity in the ACD definitions. When a variable is used, or when a conditional operation refers to another ACD value, the application might logically be regarded as two or more separate applications forked by the conditions resolved.