6.3. Handling ACD Files

6.3.1. Introduction

All EMBOSS and EMBASSY applications require some basic housekeeping code. This code processes the command line and the application ACD file, handles user input, retrieves the AJAX objects corresponding to data definitions in the ACD file, and exits cleanly.

6.3.2. Program Initialisation

Every application must process the ACD file and user input. It must:

  • Read in local database definitions

  • Find the right ACD file to use and parse it

  • Parse the command line

  • Prompt the user for required values not specified on the command line

  • Validate user input and reprompt for any incorrect values

  • Allocate memory for an AJAX object for each ACD data definition

  • Open input and output files

  • Read input files (the first sequence from any input sequence stream is read)

  • Initialise the AJAX objects (from reading the input files)

A single call is made to handle all of the above.

For EMBOSS applications:

embInit("ApplicationName", argc, argv);

For EMBASSY applications:

embInitP("ApplicationName", argc, argv, "PackageName");

All applications must call one of these functions at the very start of main. ApplicationName is the name of the ACD file to parse (ApplicationName.acd). PackageName is the name of the EMBASSY package, for example "myemboss". argc and argv are passed from main because the functions process the command line. Once they return, no further interaction with the user occurs: all input has been read and is held in memory before the application proper begins, and an AJAX object has been allocated for each ACD data definition.

For a simple program with no ACD data definitions, the first few lines of the program would look like this:

int main(int argc, char **argv)
{
    embInit("helloworld", argc, argv);

6.3.3. Retrieving ACD Values

The ajAcdGet* family of functions return AJAX objects for data definitions in the application ACD file. They are defined in ajacd.h/c and have the general name:

ajAcdGetDatatype

where Datatype is one of the supported ACD datatypes (Section A.2, “Datatypes”).

A function with the prefix ajAcdGet is provided for each ACD datatype. These are not constructor functions as such; instead they return a pointer to an appropriate AJAX object that has already been allocated by the call to embInit or embInitP. For example, when retrieving an ACD string, ajAcdGetString returns a pointer to the string (an AjPStr object) created by embInit. Attributes in the data definition and user input gathered at the command line are used to initialise the object. Memory for any objects retrieved in this way must be freed later on in main().

Table 6.2. ACD Data Retrieval Functions

ACD datatype    AJAX datatype (return value)  AJAX Function
--------------  ----------------------------  ----------------------------
align           AjPAlign                      ajAcdGetAlign
array           AjPFloat                      ajAcdGetArray
boolean         AjBool                        ajAcdGetBoolean
codon           AjPCod                        ajAcdGetCodon
cpdb            AjPFile                       ajAcdGetCpdb
datafile        AjPFile                       ajAcdGetDatafile
directory       AjPDir                        ajAcdGetDirectory
                AjPStr                        ajAcdGetDirectoryName
dirlist         AjPList                       ajAcdGetDirlist
discretestates  AjPPhyloState*                ajAcdGetDiscretestates
                AjPPhyloState                 ajAcdGetDiscretestatesSingle
distances       AjPPhyloDist*                 ajAcdGetDistances
                AjPPhyloDist                  ajAcdGetDistancesSingle
featout         AjPFeattabOut                 ajAcdGetFeatout
features        AjPFeattable                  ajAcdGetFeatures
filelist        AjPList                       ajAcdGetFilelist
float           float                         ajAcdGetFloat
                double                        ajAcdGetFloatDouble
frequencies     AjPPhyloFreq                  ajAcdGetFrequencies
graph           AjPGraph                      ajAcdGetGraph
graphxy         AjPGraph                      ajAcdGetGraphxy
infile          AjPFile                       ajAcdGetInfile
int             ajint                         ajAcdGetInt
                ajlong                        ajAcdGetIntLong
list            AjPStr*                       ajAcdGetList
                AjPStr                        ajAcdGetListSingle
matrix          AjPMatrix                     ajAcdGetMatrix
matrixf         AjPMatrixf                    ajAcdGetMatrixf
outcodon        AjPOutfile                    ajAcdGetOutcodon
outcpdb         AjPOutfile                    ajAcdGetOutcpdb
outdata         AjPOutfile                    ajAcdGetOutdata
outdir          AjPDirout                     ajAcdGetOutdir
                AjPStr                        ajAcdGetOutdirName
outdiscrete     AjPOutfile                    ajAcdGetOutdiscrete
outdistance     AjPOutfile                    ajAcdGetOutdistance
outfile         AjPFile                       ajAcdGetOutfile
outfileall      AjPFile                       ajAcdGetOutfileall
outfreq         AjPOutfile                    ajAcdGetOutfreq
outmatrix       AjPOutfile                    ajAcdGetOutmatrix
outmatrixf      AjPOutfile                    ajAcdGetOutmatrixf
outproperties   AjPOutfile                    ajAcdGetOutproperties
outscop         AjPOutfile                    ajAcdGetOutscop
outtree         AjPOutfile                    ajAcdGetOuttree
pattern         AjPPatlistSeq                 ajAcdGetPattern
properties      AjPPhyloProp                  ajAcdGetProperties
range           AjPRange                      ajAcdGetRange
regexp          AjPPatlistRegex               ajAcdGetRegexp
                AjPRegexp                     ajAcdGetRegexpSingle
report          AjPReport                     ajAcdGetReport
scop            AjPFile                       ajAcdGetScop
select          AjPStr*                       ajAcdGetSelect
                AjPStr                        ajAcdGetSelectSingle
seq             AjPSeq                        ajAcdGetSeq
seqall          AjPSeqall                     ajAcdGetSeqall
seqout          AjPSeqout                     ajAcdGetSeqout
seqoutall       AjPSeqout                     ajAcdGetSeqoutall
seqoutset       AjPSeqout                     ajAcdGetSeqoutset
seqset          AjPSeqset                     ajAcdGetSeqset
seqsetall       AjPSeqset*                    ajAcdGetSeqsetall
                AjPSeqset                     ajAcdGetSeqsetallSingle
string          AjPStr                        ajAcdGetString
toggle          AjBool                        ajAcdGetToggle
tree            AjPPhyloTree*                 ajAcdGetTree
                AjPPhyloTree                  ajAcdGetTreeSingle

It is recommended that variables holding ACD values have the same name as the parameter or qualifier in question, i.e. the name given in the ACD data definition. This is not strictly required but it makes the code much easier to understand. For the same reason, all calls to ajAcdGet* functions should be grouped in a single block of code.

Example. Consider the following ACD file:

application: example 
[
    documentation: "Example application."
]

string: astring
[
    default: "String to be printed to screen."
]

Here is the C source code to print astring to the screen:

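#include "emboss.h"

int main(int argc, char **argv)
{
    AjPStr astring = NULL;

    /* Process the ACD file and command line; allocates the string object */
    embInit("example", argc, argv);

    /* Retrieve the object created by embInit for the "astring" definition */
    astring = ajAcdGetString("astring");

    ajFmtPrint("%S\n", astring);

    /* Free the retrieved object once it is no longer needed */
    ajStrDel(&astring);

    embExit();

    return 0;
}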

The code declares an AJAX string object (AjPStr) and calls embInit to invoke ACD file processing. embInit allocates memory for the string object which is why the above code does not call a string constructor function explicitly. Nonetheless a string object was created by embInit and should be freed once you are done with it. That is why ajStrDel is called.
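The hand-over of ownership described above can be modelled in plain C without the EMBOSS libraries. The sketch below is only an illustration: DemoStr, demoInit, demoAcdGet and demoStrDel are hypothetical stand-ins, not part of AJAX, and the real library may differ in its details. It shows the pattern of an initialisation step that allocates, a retrieval step that merely returns a pointer, and a destructor that takes the address of the caller's pointer so that it can free the object and clear the pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an AJAX string object; NOT the real AjPStr. */
typedef struct DemoStr
{
    char *text;
} DemoStr;

/* Object created during initialisation and held until it is retrieved. */
static DemoStr *held = NULL;

/* Models embInit: allocates the object for a single data definition.
   Returns 0 on success and -1 if memory could not be allocated. */
static int demoInit(const char *value)
{
    DemoStr *s;

    assert(value != NULL);

    s = malloc(sizeof *s);
    if (s == NULL)
        return -1;

    s->text = malloc(strlen(value) + 1);
    if (s->text == NULL)
    {
        free(s);
        return -1;
    }

    strcpy(s->text, value);
    held = s;

    return 0;
}

/* Models ajAcdGetString: allocates nothing, just hands the object over.
   In this model the caller owns the object from now on. */
static DemoStr *demoAcdGet(void)
{
    DemoStr *ret = held;

    held = NULL;

    return ret;
}

/* Models ajStrDel: frees the object and sets the caller's pointer to NULL.
   Calling it on an already cleared pointer does nothing. */
static void demoStrDel(DemoStr **pstr)
{
    if (pstr == NULL || *pstr == NULL)
        return;

    free((*pstr)->text);
    free(*pstr);
    *pstr = NULL;
}
```

A caller in this model would invoke demoInit once, retrieve the object with demoAcdGet, use it, and then call demoStrDel(&astring), mirroring the embInit, ajAcdGetString and ajStrDel sequence in the example above.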

6.3.4. Alternative ACD Retrieval Functions

There are several alternative ACD retrieval functions. In all cases token is the name of the ACD data definition (the name of the parameter or qualifier):

AjPStr         ajAcdGetDirectoryName (const char *token);
AjPPhyloState  ajAcdGetDiscretestatesSingle (const char *token);
AjPStr         ajAcdGetListSingle (const char *token);
AjPStr         ajAcdGetOutdirName (const char *token);
AjPRegexp      ajAcdGetRegexpSingle (const char *token);
AjPStr         ajAcdGetSelectSingle (const char *token);
AjPSeqset      ajAcdGetSeqsetallSingle (const char *token);
AjPPhyloTree   ajAcdGetTreeSingle (const char *token);

In contrast to the standard retrieval functions, these return a value derived from the ACD datatype, such as the first sequence set from an array of sequence sets or the name of a directory. For example, ajAcdGetOutdirName returns an AjPStr holding the name of an output directory, whereas the standard retrieval function ajAcdGetOutdir returns an AjPDirout, i.e. the output directory object itself. Their use is explained in the appropriate programming guide (Appendix B, Libraries Reference).

The alternative functions are provided for convenience where the full object is not required. ACD takes care of the memory management for any objects that have not been passed to the main program. Alternative functions with the suffix Single return one element of the object that the standard retrieval function would normally return. This saves the calling program from stepping through a list of values when only one value can be selected from the list (i.e. the ACD file defines a minimum and maximum of 1 value to be returned). You only need to free the single value that was returned. Similarly, alternative functions with the suffix Name return an entirely new string. You need only free this string later; ACD takes care of freeing the full object when the program exits.
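The difference between a standard retrieval and a Single variant can be sketched in plain C. This is a model under stated assumptions, not the AJAX implementation: demoCopy, demoGetList and demoGetListSingle are hypothetical names, and the list is represented as a NULL-terminated array of C strings. The standard-style function hands back the whole array, which the caller must walk and free; the Single-style function hands back only the first element and releases everything else itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated copy of a C string. */
static char *demoCopy(const char *s)
{
    char *copy;

    assert(s != NULL);

    copy = malloc(strlen(s) + 1);
    if (copy != NULL)
        strcpy(copy, s);

    return copy;
}

/* Models a standard list retrieval: returns a NULL-terminated array.
   The caller must free every element and then the array itself. */
static char **demoGetList(void)
{
    char **list = calloc(2, sizeof *list);   /* one value plus terminator */

    if (list == NULL)
        return NULL;

    list[0] = demoCopy("one");
    if (list[0] == NULL)
    {
        free(list);
        return NULL;
    }

    return list;
}

/* Models a Single variant: returns only the first element.
   Any further elements and the array are freed here, so the caller
   has exactly one string to free. */
static char *demoGetListSingle(void)
{
    char **list = demoGetList();
    char *first;
    size_t i;

    if (list == NULL)
        return NULL;

    first = list[0];

    if (first != NULL)
        for (i = 1; list[i] != NULL; i++)
            free(list[i]);

    free(list);

    return first;
}
```

With the Single-style function the calling code needs one variable and one call to free, whereas the array-returning function obliges the caller to loop over the elements even when only one value can ever be present.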

Consider the following ACD file:

application: example 
[
    documentation: "Example application."
]

directory: dir
[
    help: "Directory for reading."
]

The program below would print the name of the directory:

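#include "emboss.h"

int main(int argc, char **argv)
{
    AjPStr name = NULL;

    embInit("example", argc, argv);

    /* "dir" is a directory definition, so use the Directory variant */
    name = ajAcdGetDirectoryName("dir");

    ajFmtPrint("Directory name is %S\n", name);

    /* Only the returned string needs freeing */
    ajStrDel(&name);

    embExit();

    return 0;
}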

6.3.5. Exiting Cleanly

Your application must exit cleanly. In other words all memory that has been allocated must be freed and an appropriate code returned to the operating system.

Memory management is covered in detail elsewhere (Section 5.5, “Programming with Objects”). In brief, memory is allocated by:

  • embInit or embInitP, which allocate memory for an AJAX object for each ACD data definition; a pointer to each object is returned by the ajAcdGet* functions

  • embInit or embInitP, which also allocate some memory for housekeeping purposes

  • Explicit calls to memory allocation macros

  • Explicit calls to constructor functions

  • Implicit calls to constructor functions, which are made by some functions as a failsafe mechanism where an object is required but an unallocated object pointer was passed

All allocation macros must be matched to a corresponding freeing macro. All constructor calls, explicit or implicit, including calls to ajAcdGet* functions, must be matched to a corresponding destructor function. To free the memory allocated by EMBOSS for housekeeping you must call one of:

void embExit (void);
void embExitBad (void);

These functions are defined in embexit.h/c. embExit returns the success code (0) to the operating system whereas embExitBad returns a non-zero failure code.

The last two lines of most EMBOSS applications are therefore:

    embExit();

    return 0;
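The pairing rule in this section (every allocation matched by a free on every path, followed by a status code) can be checked in a small plain C model. The names liveObjects, demoNew, demoDel and demoRun are hypothetical and are not EMBOSS functions; the counter is an illustrative device for confirming that nothing is left allocated, not something the library is known to provide.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical bookkeeping: number of objects currently allocated. */
static int liveObjects = 0;

/* Stand-in constructor: each successful call must be paired with demoDel. */
static int *demoNew(int value)
{
    int *obj = malloc(sizeof *obj);

    if (obj != NULL)
    {
        *obj = value;
        liveObjects++;
    }

    return obj;
}

/* Stand-in destructor: frees the object and clears the caller's pointer. */
static void demoDel(int **pobj)
{
    assert(pobj != NULL);

    if (*pobj == NULL)
        return;

    free(*pobj);
    *pobj = NULL;
    liveObjects--;
}

/* Models the body of main: clean up on every path, then report status.
   failMidway simulates an error detected after allocation. */
static int demoRun(int failMidway)
{
    int *a = demoNew(1);
    int *b = demoNew(2);
    int status = EXIT_SUCCESS;

    if (a == NULL || b == NULL || failMidway)
        status = EXIT_FAILURE;

    /* Destructors run whether or not an error occurred */
    demoDel(&a);
    demoDel(&b);

    return status;
}
```

Routing both the success and failure paths through the same destructor calls keeps the count of live objects at zero however the function ends, which is the property a clean exit requires.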