A.1. Introduction to ACD Syntax


A.1.1. General Syntax

The AJAX Command Definition (ACD) language was designed for writing ACD files for EMBOSS applications. Every application in EMBOSS or EMBASSY has an ACD file. The ACD syntax allows for very flexible descriptions of an application's parameters and its command line interface. It can specify everything that can appear on the command line or in another interface such as a web page.

ACD files are plain ASCII text files and must have the extension .acd. Typically they have the same name as the application, but this is not mandatory.

A.1.1.1. Whitespace

During ACD file parsing, the entire file contents are effectively treated as a single string which is parsed into tokens delimited by space characters. At least one space is required between individual tokens; any extra whitespace is ignored.

A.1.1.2. Comments

Comments begin with "#" and continue to the end of the line.
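
As a rough illustration of the whitespace and comment rules above, the sketch below writes the same hypothetical data definition twice, once spread over several lines and once on a single line. The parameter name wordsize and the default attribute are used only for illustration, and the two forms are alternatives: a real ACD file would contain only one of them.

```acd
# A comment line: everything from "#" to the end of the line is ignored.

# Form 1: tokens spread over several lines with extra indentation.
integer: wordsize
[
   default: "4"
]

# Form 2: the same tokens separated by single spaces on one line.
integer: wordsize [ default: "4" ]
```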

A.1.2. ACD Definitions

An ACD file contains a single application definition and a data definition for each parameter. The application definition is given first, followed by the data definitions. Data definitions are organised into sections (see Section A.1.6, “ACD File Sections”).

Application and data definitions have the following general form: a single text token followed by a colon ':' (or '=') and whitespace, followed by a second token. The definition body follows: one or more attributes enclosed in a mandatory pair of square brackets [ ], which can span multiple lines. Each attribute is a name: value pair with the attribute value given between quotes (" "):

Either:

token: token 
[
   Attribute1Name: "Attribute1Value"
   Attribute2Name: "Attribute2Value"
]

Or:

token=token 
[
   Attribute1Name: "Attribute1Value"
   Attribute2Name: "Attribute2Value"
]

The first token is either application: (for the application definition) or an AJAX datatype (e.g. sequence) for data definitions. The second token is either the name of the application (e.g. seqret) or the name of the parameter (e.g. asequence).

Application definition:

application: ApplicationName 
[
   ApplicationAttribute1Name: "ApplicationAttribute1Value"
   ApplicationAttribute2Name: "ApplicationAttribute2Value"
]

Data definition:

Datatype: ParameterName 
[
   DataAttribute1Name: "DataAttribute1Value"
   DataAttribute2Name: "DataAttribute2Value"
]

The application token and the tokens for datatype and attribute names can be abbreviated up to the point where they are not ambiguous. Such abbreviations are not recommended, however, because they tend to make the ACD file more difficult to read.

Attribute values are normally enclosed in double quotes, although this is only mandatory for values (typically strings) which include whitespace.
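
The sketch below contrasts an abbreviated definition with its full form, and shows a quoted value that contains whitespace next to an unquoted single-token value. It assumes that int, info and def are unambiguous abbreviations of integer, information and default in the installed EMBOSS version, which should be checked before relying on them. The parameter name is hypothetical, and the two definitions are alternatives rather than parts of one file.

```acd
# Abbreviated form (not recommended): shorter but harder to read.
int: wordsize
[
   info: "Word size"
   def: 4
]

# Full form (recommended). "Word size" must be quoted because it
# contains whitespace; 4 may be written with or without quotes.
integer: wordsize
[
   information: "Word size"
   default: "4"
]
```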

A.1.2.1. Application Definition

The application definition must be the first definition in the file:

application: ApplicationName 
[
   ApplicationAttribute1Name: "ApplicationAttribute1Value"
   ApplicationAttribute2Name: "ApplicationAttribute2Value"
]

The application name is arbitrary but is typically the same as that used for the ACD file name. It is the ACD file name (not ApplicationName, if different) that's used from within the application C source code to associate it with an ACD file. This allows multiple ACD files (and therefore command line interfaces) to be developed for a single file of application C source code.

For a complete description of the available application attributes see Section A.3, “Application Attributes”.
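
For instance, an application definition for a hypothetical program named demoapp, stored in demoapp.acd, might look like the sketch below. The attribute names documentation and groups, and the group value shown, are assumed here from typical EMBOSS ACD files; Section A.3 is the authority on which application attributes exist.

```acd
# First definition in the hypothetical file demoapp.acd
application: demoapp
[
   documentation: "Count words in a sequence (hypothetical example)"
   groups: "Nucleic:Composition"
]
```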

A.1.2.2. Data Definition

All application parameters must have a data definition. Data definitions follow the application definition and must be placed in an appropriate file section (see Section A.1.6, “ACD File Sections”):

Datatype: ParameterName 
[
   DataAttribute1Name: "DataAttribute1Value"
   DataAttribute2Name: "DataAttribute2Value"
]

Datatype must be a valid ACD datatype. For a complete description of the available datatypes see Section A.2, “Datatypes”.

ParameterName is the name of the parameter. It is a string that must conform to certain conventions (Section A.1.3, “Parameter Naming Conventions”). This name is used to refer to the data definition from the command line and from within the C source code (see Section 6.3, “Handling ACD Files”).

For a complete description of the available attributes see:

Global attributes (Section A.4, “Global Attributes”)

Datatype-specific attributes (Section A.5, “Datatype-specific Attributes”)

Calculated attributes (Section A.6, “Calculated Attributes”)
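
As a minimal sketch of the general form above, the fragment below shows two data definitions for a hypothetical application. The parameter: and standard: attributes are described in this appendix; type, default, minimum and information are assumed from common EMBOSS usage and should be checked against Sections A.4 and A.5. The enclosing file sections are omitted for brevity.

```acd
# A sequence input declared as a parameter.
sequence: asequence
[
   parameter: "Y"
   type: "any"
]

# An integer option declared as a standard qualifier.
integer: wordsize
[
   standard: "Y"
   default: "4"
   minimum: "2"
   information: "Word size"
]
```

Under these assumptions, the second definition would be referred to by the name wordsize, both on the command line and from the C source code.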

A.1.3. Parameter Naming Conventions

A.1.3.1. General Conventions

The general conventions for parameter and qualifier names are as follows:

Must not contain whitespace characters
Should not normally be single characters
Should be meaningful words and indicate the function of the option so far as possible
Are not case sensitive

A.1.3.2. Datatype-specific Conventions

The conventions for parameter names that apply for individual datatypes are given in the table below.

Where more than one instance of a datatype is specified in an ACD file, the characters a, b etc. can be prefixed to the name: asequence, bsequence etc. This is indicated in the table by an asterisk in the parameter name, for example *sequence (see Table A.1, “Parameter and Qualifier Naming Conventions”).

Table A.1. Parameter and Qualifier Naming Conventions
Datatype	Name	Usage
`sequence`	`sequence, *sequence`	Primary input sequence, generally required
`seqall`	`sequence, *sequence, seqall`	Primary input sequence database, generally required
`seqset`	`sequence, *sequence, sequences`	Primary input sequences, generally required
`seqsetall`	`sequence, *sequence, sequences`	Primary input sequences, generally required
`seqout`, `seqoutset`, `seqoutall`	`outseq, *outseq, outfile`	Primary output sequence, generally required. Generally should default to the primary input sequence name; the extension defaults to the name of the output sequence format.
`outfile`	`outfile, *file`	Primary output non-sequence results file, generally required. The file extension should be allowed to default to the application name. `outfile` should be used for the first output file. `outfile` or `*file` is acceptable for the second and subsequent output files.
`report`	`outfile, *file`	Report output file. `outfile` should be used for the first report file. `outfile` or `*file` is acceptable for the second and subsequent report files.
`align`	`outfile, *file`	Alignment output file. `outfile` should be used for the first output alignment. `outfile` or `*file` is acceptable for the second and subsequent output alignments.
`infile`	`infile, *file`	Primary input non-sequence file
`infile`	`data`	Primary auxiliary input data file, generally optional
`infile`	`patterns`	File of patterns to search for in sequence
`integer`	`minlen`	Minimum length of sequence feature to be found
`integer`	`maxlen`	Maximum length of sequence feature to be found
`integer`	`wordsize`	Word size for hash tables etc. Generally minimum value = 2 for protein, 4 for DNA
`integer`	`window`	Window length for calculating dotplots/features/etc.
`integer`	`shift`	Amount by which window is shifted in each iteration
`boolean`	`consensus`	Flag for whether consensus sequence should be output
`float`	`gap`	Gap penalty
`float`	`gapext`	Gap extension penalty
`integer`	`from`	Position of start of input sequence to specify for an operation (e.g. deletion), defaults to start of sequence, minimum value = 1, maximum value = <sequence length>
`integer`	`to`	Position of end of input sequence to specify for an operation (e.g. deletion), defaults to the `from` value, minimum value = `from` value, maximum value = <sequence length>
`float` or `integer`	`threshold`	Threshold for various operations
`boolean`	`left`	Operation should be done at the start of the sequence
`boolean`	`right`	Operation should be done at the end of the sequence
`string`	`pattern`	Pattern to search for in sequence
`graph`	`graph`	Graphical output
`xygraph`	`graph`	Graphical output
`directory`	`directory, dir, path`	Directory of files
`outdir`	`outdir, *outdir`	Output directory of files
`dirlist`	`directory`	Directory of files
`filelist`	`*files`	List of files
`matrix`	`matrix`	Matrices
`datafile`	`datafile`	Datafiles
`feature`	`feature, *feature`	Feature input
`featout`	`outfeat, *outfeat`	Feature output
`regexp`	`pattern`	Regular expressions
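
The naming conventions in Table A.1 can be sketched with a hypothetical pairwise alignment application that reads two sequence inputs. The names asequence, bsequence, gap and outfile follow the table; the attribute values are illustrative assumptions, and the enclosing file sections are omitted.

```acd
# Two sequence inputs distinguished by the prefixes a and b.
sequence: asequence
[
   parameter: "Y"
]

seqall: bsequence
[
   parameter: "Y"
]

# A conventional name from Table A.1 for a float option.
float: gap
[
   default: "10.0"
   information: "Gap penalty"
]

# The first alignment output uses the standard name outfile.
align: outfile
[
   parameter: "Y"
]
```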

A.1.3.3. Validated Parameter Names

For some datatypes, conventions are more strongly enforced: a warning will be generated during ACD processing if a standard name is not used for the following datatypes:

Sequence inputs (any data definition of the type sequence, seqall, seqsetall or seqset) and sequence outputs (seqout, seqoutall and seqoutset datatypes)
Feature inputs (any data definition of the type feature) and feature outputs (featout datatype)
Alignments (align datatype)
File inputs and outputs (any data definition of the type infile, filelist, directory, dirlist or outfile)
Report output (report datatype)

A.1.4. Types of Attributes

Application attributes may be defined for an application definition (Section A.3, “Application Attributes”).

There are three basic types of attributes that may be defined for a data definition:

Global attributes (Section A.4, “Global Attributes”)
Datatype-specific attributes (Section A.5, “Datatype-specific Attributes”)
Calculated attributes (Section A.6, “Calculated Attributes”)

Additionally, there are various "datatype associated" command line qualifiers (or simply "associated qualifiers") that are built in for certain ACD datatypes and may also be defined as attributes in the appropriate data definition. These are listed in the datatype descriptions (Section A.2, “Datatypes”).

A.1.5. Parameters and Qualifiers

Every data definition in the ACD file can be defined via an appropriate attribute to be one of the following:

Parameter
Standard Qualifier
Additional Qualifier

with the default being:

Advanced Qualifier

They differ in terms of how they are prompted for, how they may be specified on the command line and whether help information for them appears.

This behaviour is summarised in the table below (Table A.2, “Behaviour of Command line Parameters and Qualifiers”). "Flag" indicates whether the flag (parameter or qualifier name) must be given on the command line. "Prompt" indicates whether a value will be prompted for if one is not specified on the command line. Additional qualifiers will only be prompted for if -options is specified. "Help" indicates where the information from the built-in -help qualifier is shown. For more information, see Section 4.5, “Controlling the Prompt”.

Table A.2. Behaviour of Command line Parameters and Qualifiers
Type	Attribute	Flag	Prompt	Help
parameter	`parameter: "Y"`	No	Yes	Required section
standard	`standard: "Y"`	Yes	Yes	Required section
additional	`additional: "Y"`	Yes	Yes (with `-options`) or No (default needed)	Advanced section
advanced (default)	No attribute needed	Yes	No	Advanced section
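
The four cases in Table A.2 can be sketched as one data definition each. The parameter names and default values below are hypothetical, the default attribute is assumed from common EMBOSS usage, and the enclosing file sections are omitted.

```acd
# Parameter: the flag may be omitted; the value is prompted for if missing.
sequence: sequence
[
   parameter: "Y"
]

# Standard qualifier: the flag is required; the value is prompted for.
integer: wordsize
[
   standard: "Y"
   default: "4"
]

# Additional qualifier: the flag is required; the value is prompted
# for only when -options is given, so a default is needed.
integer: window
[
   additional: "Y"
   default: "10"
]

# Advanced qualifier (the default): the flag is required and the
# value is never prompted for.
boolean: consensus
[
   default: "N"
]
```

With these definitions, a hypothetical invocation such as "demoapp myseq.fasta -wordsize 6" would supply the parameter by position and the standard qualifier by flag, leaving window and consensus at their defaults.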

A.1.6. ACD File Sections

Any data definitions in an ACD file must be contained within an appropriate section and given in the correct order. The sections must appear in this order:

Input
Required
Additional
Advanced
Output

Subsections with arbitrary names can also be defined. They can appear in any order but must be nested in a major section.

Sections and subsections have the following general syntax:

section: SectionName 
[
  information: "SectionName section"
  type: "page"
]
.
. (data definitions go here)
.
   section: NestedSectionName 
   [
   information: "NestedSectionName section"
   type: "page"
   ]
   .
   . (data definitions go here)
   .
   endsection: NestedSectionName
.
endsection: SectionName

For example:

section: input 
[
  information: "Input section"
  type: "page"
]
.
. (input data definitions go here)
.
   section: inputsubsection 
   [
   information: "Input sub-section"
   type: "page"
   ]
   .
   . (input sub-section data definitions go here)
   .
   endsection: inputsubsection

endsection: input

The contents of each section are summarised in the table below (Table A.3, “ACD File Sections”).

Table A.3. ACD File Sections
Section name	Description
Input	Simple input values and any ACD type that will read input, including `infile`, `sequence`, `seqset`, `seqall`, `matrix`, `fmatrix` and `codon`. Any other parameters and qualifiers related to input can also be placed in this section. At present `datafile` is also included.
Required	Parameters and Standard Qualifiers, including any whose `standard:` attribute can be true but depends on a conditional operation. Any `toggle:` definitions that are used by the Parameters and Standard Qualifiers. Note that input and output parameters and qualifiers must be in their respective sections.
Additional	Additional Qualifiers, including any whose `additional:` attribute can be true but depends on a conditional operation. Any `toggle:` definitions that are used by Additional Qualifiers. Input and output parameters and qualifiers must be in their respective sections.
Advanced	Any qualifiers (except input and output qualifiers) which have no `standard:` or `additional:` attribute defined.
Output	Any data type that will write output, including any `outfile`, `outdata`, `seqout`, `seqoutall`, `seqoutset` and `outtree`. Other qualifiers related to output can also be placed in this section. This is the last section to be defined.
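
Putting the section rules together, a complete ACD file for a hypothetical application demoapp might be laid out as in the sketch below, with the five major sections in the required order. The application and parameter names are invented for illustration, and the attributes documentation, groups, type, default and minimum are assumed from typical EMBOSS ACD files rather than taken from this section.

```acd
application: demoapp
[
   documentation: "Hypothetical example application"
   groups: "Nucleic:Composition"
]

section: input
[
   information: "Input section"
   type: "page"
]

   sequence: sequence
   [
      parameter: "Y"
      type: "any"
   ]

endsection: input

section: required
[
   information: "Required section"
   type: "page"
]

   integer: wordsize
   [
      standard: "Y"
      default: "4"
      minimum: "2"
      information: "Word size"
   ]

endsection: required

section: additional
[
   information: "Additional section"
   type: "page"
]

   integer: window
   [
      additional: "Y"
      default: "10"
      information: "Window length"
   ]

endsection: additional

section: advanced
[
   information: "Advanced section"
   type: "page"
]

   boolean: consensus
   [
      default: "N"
      information: "Output a consensus sequence"
   ]

endsection: advanced

section: output
[
   information: "Output section"
   type: "page"
]

   outfile: outfile
   [
      parameter: "Y"
   ]

endsection: output
```

Note that the input and output parameters sit in the Input and Output sections even though they carry parameter: "Y", as Table A.3 requires.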

A.1.6.1. Validation of Sections

Restrictions on the order of sections and what data definitions can appear in what sections are defined in the EMBOSS system file sections.standard (see Section 4.1, “Introduction to ACD File Development”). The restrictions are enforced during ACD processing and an error will be generated in the following circumstances:

If major sections appear in the wrong order
If subsections appear in the wrong major sections
If a parameter (data definition with a parameter: "Y" attribute) or a standard qualifier (standard: "Y" attribute) occurs in the "Advanced" or "Additional" sections
If an additional qualifier (additional: "Y" attribute) occurs in the "Advanced" or "Required" sections
If an advanced qualifier (no parameter: "Y", standard: "Y" or additional: "Y" attribute) occurs in the "Additional" or "Required" sections
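
As a sketch of the third rule above, the fragment below places a standard qualifier in the "Advanced" section, which would be expected to produce an error during ACD processing. The parameter name and the default attribute are hypothetical; the exact wording of the error is not shown because it depends on the EMBOSS version.

```acd
section: advanced
[
   information: "Advanced section"
   type: "page"
]

   # Misplaced: a standard qualifier is not allowed in "Advanced".
   integer: wordsize
   [
      standard: "Y"
      default: "4"
   ]

endsection: advanced
```

Either moving the definition into the "Required" section or removing the standard: "Y" attribute (making it an advanced qualifier) would bring the fragment in line with the rules.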