4.4. Operations

4.4.1. Types of Operation

Operations are used within ACD files to achieve flexible and precise control over the command line interface. Currently the following types of operation are supported:

  • Operations to retrieve the value of a data definition

  • Operations to retrieve the values of attributes and calculated attributes of a data definition

  • Arithmetic operations

  • Tests for equality

  • Boolean logical operations

  • Calculations that use the above operations

  • Unary, ternary and case-type conditional statements using the above operations

4.4.2. General Operation Syntax

An operation must be enclosed in a pair of parentheses () and preceded by the "at" symbol ('@'):

@(operation)

If the operation contains white space, the whole token should be enclosed in double quotes (" "):

"@(operation with white space)"

Operations can be nested, in which case each operation must be enclosed in parentheses:

"@(operation @(nested operation))"

An operation typically contains a reference to the value or attribute of a data definition or variable. It may also contain arithmetic, logical and conditional operators.

4.4.2.1. Arithmetic Operators

  • @(a+b) [Addition]

  • @(a-b) [Subtraction]

  • @(a*b) [Multiplication]

  • @(a/b) [Division]

4.4.2.2. Logical Operators

  • @(!a) [Not boolean]

  • @(a|b) [Or]

  • @(a&b) [And]

4.4.2.3. Equality Operators

  • @(token1==token2) [Equality]

  • @(token1!=token2) [Non-equality]

  • @(token1<token2) [Less-than]

  • @(token1>token2) [Greater-than]

4.4.2.4. Conditional Operators

  • @(boolval ? iftrue : iffalse) [If]

  • @(testval = A : 1 B : 2 else : 0) [Case]

4.4.3. Retrieving Data Values in ACD Files

The attribute values for a given data definition can depend on the values from other data definitions. It is possible to retrieve the value of:

  • A data definition (application parameter)

  • An attribute of a data definition

  • A calculated attribute of a data definition

  • An ACD variable definition

Such values are retrieved using the ACD "get the value of" syntax, which consists of a ParameterName.AttributeName term surrounded by parentheses and preceded by a dollar sign ($):

$(ParameterName.AttributeName)

If only the value of the data definition is to be retrieved then the AttributeName component is omitted:

$(ParameterName)

A variable name may also be given. Variables do not have attributes, so the syntax is:

$(VariableName)

For example, in the following ACD file excerpt two integer inputs are defined:

integer: gappenalty    
[
    standard: "Y"
    default:  "10"
]

integer: gapextpenalty 
[
    standard: "Y"
    default:  "$(gappenalty)"
]

The default for the standard qualifier gapextpenalty is set to the value of qualifier gappenalty.

As another example, assume a sequence input and a window over that sequence which should not be longer than the sequence itself. There is also a subwindow which (by default) is one residue shorter than the main window, and an array of numbers that is the same size as the subwindow. The following ACD code would suffice:

sequence: sequence
[
  parameter: "Y"
]

integer: window 
[
  standard: "Y"
  default: "10"
  maximum: "$(sequence.length)"
]

integer: subwindow
[
  standard: "Y"
  default: "@($(window)-1)"
  maximum: "@($(window.maximum)-1)"
]

array: values
[
  standard: "Y"
  size: "$(subwindow)"
]

sequence.length is a calculated attribute and is the length of the sequence. The $(sequence.length) value is calculated during ACD processing once the input sequence has been read in. In contrast window.maximum simply refers to the maximum: attribute of the window data definition.

You need to take care that the type of the retrieved value matches that of the attribute being calculated. For instance, if you have defined a toggle datatype:

toggle: dovalues
[
  standard: "Y"
]

then the following definition is perfectly valid, because the standard: global attribute takes a boolean value, which is the same type as the value of the toggle datatype.

array: values
[
  standard: "$(dovalues)"
  size: "$(subwindow)"
]

The following definition is not valid and an error will be generated during ACD processing:

array: values
[
  standard: "Y"
  size: "$(dovalues)"     /* This is not valid ! */
]

It is invalid because the size: attribute of the array datatype has an integer value, whereas the toggle datatype has a boolean value.

4.4.4. Calculations and Tests

Calculations can be performed in ACD using the @ syntax. A rather silly, but legal, calculation would be:

@(5 + 9)

which evaluates to 14. Calculations can be used to add, subtract, multiply or divide, or to test for equality, inequality, "greater than" or "less than". The test values can be integers, floats or strings. Calculations can be used with most datatype attributes where they make sense. In the following example the start position is given a maximum value of the sequence length minus the window size plus one:

sequence: sequence 
[
  parameter: "Y"
  type: pureprotein
]

integer: window 
[
  standard: "Y"
]

integer: start 
[
  standard: "Y"
  maximum:  "@(@($(sequence.length) - $(window)) + 1)"
]

Note that there are two separate calculations here, so each needs to be surrounded by the @() syntax. Long calculations can get messy. If you need them, it may be worth rethinking your ACD logic. If they cannot be avoided, they can be tidied up with the use of variables (Section 4.4.6, “Use of Variables”).

4.4.4.1. Arithmetic Operations

The supported arithmetic operations are addition, subtraction, multiplication and division. The standard characters for the arithmetic operations are used: +, -, * and /.

@(a+b)   (Addition)
@(a-b)   (Subtraction)
@(a*b)   (Multiplication)
@(a/b)   (Division)

The operands a and b must resolve to a numeric value (integer or floating point). Otherwise the result is undefined and will most probably cause an error during ACD processing.

Only a single arithmetic operation is allowed per set of parentheses. For instance, the code below is not valid:

integer: a []
integer: b []
integer: c []
integer: n
[
  default: "@(a + b + c)"    /* This is not valid ! */
]

It could however be rewritten using nested operations. The following is perfectly valid:

integer: n
[
  default: "@(@(a + b)+c)"
]

Note that two sets of parentheses are required: the first around the addition of a and b, which gives @(a + b), and the second around the addition of @(a + b) to c, which gives @(@(a + b)+c).

Where more than one arithmetic operation is required, however, one would typically use an internal ACD variable to hold the intermediate results.
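
As a minimal sketch of that approach, the invalid example above could be rewritten with a variable. The variable name aplusb is hypothetical, and the values of a, b and c are assumed to be retrieved with the $() syntax described in Section 4.4.3:

# aplusb is a hypothetical variable holding the intermediate result a + b.
# It assumes the integers a, b and c have already been defined.
variable: aplusb "@($(a) + $(b))"

integer: n
[
  default: "@($(aplusb) + $(c))"
]

Each @() then contains a single arithmetic operation, so no nesting is needed.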

4.4.4.2. Tests for Equality

The supported equality tests (symbols in parentheses) are "equality" (==), "non-equality" (!=), "less than" (<) and "greater than" (>):

@(token1 == token2)     (Equality) 
@(token1 != token2)     (Non-equality) 
@(token1 < token2)   (Less-than) 
@(token1 > token2)   (Greater-than)

The above equality tests can be used on strings, in which case the lexicographical sorting order of the strings is used.
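
For illustration, the result of a test should be usable wherever a boolean value is expected. The sketch below assumes two hypothetical integer qualifiers, window and step. The qualifier step is only prompted for when the value of window is greater than 1:

integer: window
[
  standard: "Y"
  default: "10"
]

# step is a hypothetical qualifier; the test resolves to a boolean value
integer: step
[
  standard: "@($(window) > 1)"
  default: "1"
]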

4.4.4.3. Boolean Tests

The supported boolean operations are logical AND, logical OR and logical NOT. Again, the standard characters are used: &, | and !:

@(!a)  (NOT)
@(a|b) (OR)
@(a&b) (AND)

In the following ACD code snippet:

integer: fubar 
[
   standard: "Y"
   default: 5
   etc
]

integer: rtfm 
[
   standard: "@(@($(fubar)==3) | @($(fubar)==7))"
   etc
]

The integer rtfm will only be prompted for if the value of fubar is either 3 or 7. Each of the equality tests is a calculation and the boolean test is another calculation. There are therefore three instances of @().
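
The NOT operator can be sketched in the same way. In the hypothetical fragment below, a toggle named plot selects graphical output, and the output file is only prompted for when plot is false. The names plot and outfile are assumptions made for illustration:

toggle: plot
[
  standard: "Y"
  default: "N"
]

# Prompted for only when the hypothetical toggle plot is false
outfile: outfile
[
  standard: "@(!$(plot))"
]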

4.4.5. Conditional Statements

There are three kinds of conditional statements in ACD: unary, ternary and case-type.

4.4.5.1. Unary Conditional Statements

A typical use for unary conditionals is to switch prompts on or off. Assume that a window size should only be prompted for if the sequence turns out to be a protein. The ACD to achieve this would look as follows:

sequence: sequence 
[
  parameter: "Y"
  type: gapany
]

integer: window 
[
  standard: "$(sequence.protein)"
  etc
]

If the sequence is a protein then the standard: statement is equivalent to:

standard: "Y"

and the prompt is switched on. If the sequence is nucleic the statement is equivalent to:

standard: "N"

This will effectively disable the prompt. Controlling prompting is described in detail elsewhere (Section 4.5, “Controlling the Prompt”).

4.4.5.2. Ternary Conditional

Ternary conditional statements have the general form:

@(conditional ? value-if-true : value-if-false)

They are useful when setting up the application for two distinct modes of usage, for example when setting gap penalty values differently for proteins and nucleic acids in alignment programs. The example below will set the penalty to 14 for proteins and 16 for nucleic acids:

integer: penalty 
[
  standard: "N"
  default: "@($(sequence.protein) ? 14 : 16)"
  etc
]
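
Following the nesting rules given earlier, the condition should itself be able to come from a nested test. As a sketch, the hypothetical qualifier overlap below (an assumed name, with illustrative values) takes a smaller default for short sequences by using a less-than test as the condition:

integer: overlap
[
  standard: "N"
  # The inner @() resolves to a boolean, which the outer @() then tests
  default: "@(@($(sequence.length) < 100) ? 2 : 4)"
]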

4.4.5.3. Case Conditional

In the case-type operation, the test value is compared with a list of possible values. If a match is found then the operation resolves to the result associated with that possible value. The test value, which is parsed as a string, is followed by an equals sign (=), which in turn is followed by one or more pairs of possible and associated values separated by a colon (:). If none of the possible values match then the operation will resolve to the default result that is associated with the keyword else.

The else : default value pair is not mandatory. If none of the possible values match in an operation without the default value, the operation will resolve to NULL.

This is formalised as follows:

@(testval = poss_valA : ass_valA 
            poss_valB : ass_valB 
            else : default_val)

For example:

string: matrix 
[
    default: "@($(sequence.type) =
          protein : BLOSUM62
          dna : dnamat
          rna : rnamat
          else : unknown)"
]

The $(sequence.type) value is a string that holds the sequence type of the ACD data item named sequence. If the type is protein, the operation resolves to BLOSUM62; if the type is dna, it resolves to dnamat, and so on. If the type is not in this list, the operation resolves to unknown.

If the test value cannot unambiguously be assigned to a single associated value then the operation will resolve to the LAST associated value that matches its possible value.
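
A case-type operation can also be written on a single line and can return numeric results, as in the general syntax summary. In the following sketch the qualifier name wordsize and the values 2, 6 and 4 are purely illustrative:

integer: wordsize
[
  # protein resolves to 2, dna to 6, anything else to 4
  default: "@($(sequence.type) = protein : 2 dna : 6 else : 4)"
]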

4.4.6. Use of Variables

Variables are useful for holding partial calculations or values. The general syntax for them is:

variable: VariableName "Variable value"

Note that, as a variable only has a single value and no attributes, square brackets are not used.

As an example, here is a calculation to determine the maximum size of a sequence window:

integer: start 
[
  standard: "Y"
  maximum:  "@(@($(sequence.length) - $(window)) + 1)"
]

This can be tidied by storing one of the calculations in a partial result as follows:

variable: lminusw "@($(sequence.length) - $(window))"

integer: start 
[
  standard: "Y"
  maximum: "@($(lminusw) + 1)"
]

In the following ACD code, an internal ACD variable protlen is used to store an intermediate result. The value of the variable $(protlen) is calculated from the length of the input sequence (sequence datatype) and used in the definition of the maximum size of the window parameter:

variable: protlen "@( $(sequence.length) / 3 )"

integer: window  
[
    maximum: "@($(protlen)-50)"
    default: 50
]

The same result could be achieved using nested operations as shown below:

integer: window 
[
    maximum: "@( @( $(sequence.length) / 3) - 50)"
    default: 50
]

The maximum value of the window parameter is calculated directly from the sequence.length calculated attribute by using the divide arithmetic operation nested within a separate subtraction operation.

Variables may be used to simplify the ACD file, making it easier to read and parse. An ACD file can use a variable definition to define a result once only, and then refer to the variable by name in all later ACD data type definitions. The use of variables, however, might indicate that there is some complexity in the ACD definitions. When a variable is used, or when a conditional operation refers to another ACD value, the application might logically be regarded as two or more separate applications forked by the conditions resolved.
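
As a sketch of defining a result once and reusing it, the hypothetical variable maxwin below is referenced by two later definitions. The names maxwin, window and shift are assumptions made for illustration:

# maxwin is a hypothetical variable, defined once and reused below
variable: maxwin "@($(sequence.length) - 1)"

integer: window
[
  standard: "Y"
  maximum: "$(maxwin)"
]

integer: shift
[
  standard: "Y"
  maximum: "$(maxwin)"
]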

4.4.6.1. Automatic Variables

Currently there is just one of these, acdprotein, which is set to true if the first sequence read is a protein and to false otherwise.
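
Assuming the automatic variable is retrieved like any other variable, with $(acdprotein), a minimal sketch of its use in a ternary conditional is shown below. The qualifier name and the matrix names are illustrative only:

string: matrix
[
  # Resolves to BLOSUM62 if the first sequence read is a protein
  default: "@($(acdprotein) ? BLOSUM62 : dnamat)"
]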