6.17. Handling Arrays

6.17.1. Introduction

Array handling is greatly simplified in EMBOSS. AJAX provides dynamic array objects which grow automatically as required, saving you the bother of managing the memory yourself. This makes your code cleaner and less error-prone. Objects are provided for handling arrays of various dimensions (1D, 2D or 3D) of the following C datatypes:

  • Character (char)

  • Short integer (ajshort)

  • Unsigned integer (ajuint)

  • Integer (ajint)

  • Floating point number (float)

  • Long integer (ajlong)

  • Double-precision floating point number (double)

The objects can be allocated to a default or a specified reserved size. Each object keeps track of the reserved space available (in case a function is called which must extend the array) and the actual length of the array. Arrays of higher dimensions can nest arrays of lower dimensions.

Functions are provided for returning the current length and for getting and setting array elements. The array elements should only ever be accessed using these functions. If you assign a value to an element which is beyond the current size of the array, then the array will be dynamically reallocated to a sufficient size to accommodate the new element. In contrast, if you try to retrieve an element beyond the current size of the array an error will be generated.
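
The behaviour described above can be pictured with a small stand-alone model. This is only a sketch: DemoIntArr and the demoInt* functions are hypothetical names invented for illustration and are not part of AJAX, the growth policy (doubling) is an assumption, and an out-of-range read is reported by a return value here rather than by an AJAX error.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a 1D integer array object; NOT the real AjPInt */
typedef struct DemoIntArr {
    size_t res;   /* reserved slots */
    size_t len;   /* elements currently in use */
    int   *ptr;   /* storage */
} DemoIntArr;

/* Create an empty array with a small default reserve (4 is an assumption) */
DemoIntArr *demoIntNew(void)
{
    DemoIntArr *arr = malloc(sizeof *arr);
    assert(arr != NULL);
    arr->res = 4;
    arr->len = 0;
    arr->ptr = calloc(arr->res, sizeof *arr->ptr);
    assert(arr->ptr != NULL);
    return arr;
}

/* Set element n, extending the storage first if n is beyond the reserve */
void demoIntPut(DemoIntArr *arr, size_t n, int value)
{
    if (n >= arr->res) {
        size_t newres = arr->res;
        size_t i;
        int *grown;

        while (newres <= n)
            newres *= 2;
        grown = realloc(arr->ptr, newres * sizeof *grown);
        assert(grown != NULL);
        for (i = arr->res; i < newres; i++)
            grown[i] = 0;               /* zero the newly added slots */
        arr->ptr = grown;
        arr->res = newres;
    }
    if (n >= arr->len)
        arr->len = n + 1;
    arr->ptr[n] = value;
}

/* Read element n; returns 0 (and leaves *out alone) if n is out of range */
int demoIntGet(const DemoIntArr *arr, size_t n, int *out)
{
    if (n >= arr->len)
        return 0;
    *out = arr->ptr[n];
    return 1;
}

/* Free the array and reset the caller's pointer to NULL */
void demoIntDel(DemoIntArr **parr)
{
    if (parr == NULL || *parr == NULL)
        return;
    free((*parr)->ptr);
    free(*parr);
    *parr = NULL;
}
```

In this model, writing to index 10 of a new array extends it to length 11, whereas reading index 11 afterwards is refused.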

The C-type array may be retrieved from any of the array objects should this be required. This allows standard array notation to be used to access array elements. Functions are provided to sort C-type arrays of integers, unsigned integers and floating point numbers in various ways.

Array objects are typically created directly within the C source code. A floating point array can however be specified in the ACD file (the array ACD datatype) and a pointer to a corresponding AJAX object (AjPFloat) retrieved by the program.

6.17.2. AJAX Library Files

AJAX library files for handling arrays are listed in the table (Table 6.29, “AJAX Library Files for Handling Arrays”). Library file documentation, including a complete description of datatypes and functions, is available at:

http://emboss.open-bio.org/rel/dev/libs/

Table 6.29. AJAX Library Files for Handling Arrays

Library File Documentation    Description
ajarr                         General array handling
ajsort                        Array sorting

ajarr.h/c

Defines the various array objects and functions for handling dynamic arrays. It also contains static functions for handling them at a low level. You are unlikely to need these unless you plan to extend the core functionality of the library.

ajsort.h/c

Functions for sorting arrays (Section 6.17.10, “Sorting Arrays”).

6.17.3. AJAX Datatypes

For handling an input array defined in the ACD file use:

AjPFloat

Array of floating point numbers (for array ACD datatype).

AJAX provides dynamic array objects for the fundamental C datatypes (Section 5.1, “Basic Datatypes”) below.

6.17.3.1. char arrays

AjPChar

1D character array.

6.17.3.2. short arrays

AjPShort

1D short array.

AjPShort2d

2D short array.

AjPShort3d

3D short array.

6.17.3.3. unsigned int arrays

AjPUint

1D unsigned integer array.

AjPUint2d

2D unsigned integer array.

AjPUint3d

3D unsigned integer array.

6.17.3.4. int arrays

AjPInt

1D integer array.

AjPInt2d

2D integer array.

AjPInt3d

3D integer array.

6.17.3.5. float arrays

AjPFloat

1D float array.

AjPFloat2d

2D float array.

AjPFloat3d

3D float array.

6.17.3.6. long arrays

AjPLong

1D long array.

AjPLong2d

2D long array.

AjPLong3d

3D long array.

6.17.3.7. double arrays

AjPDouble

1D double array.

AjPDouble2d

2D double array.

AjPDouble3d

3D double array.

6.17.4. ACD Datatypes

The ACD datatype for handling array input is:

array

List of either integer or floating point numbers.

6.17.5. ACD Data Definition

A typical ACD definition for array input:

array: thresholds
[
    information: "Values to represent 'identical',  'similar' and 'related'"
    default: "-1.5,0.0,1.5"
    minimum: "0.0"
    size: "3"
    sum: "0"
    sumtest: "Y"
]

6.17.5.1. Parameter Name

All data definitions for array input should have an intuitive name (Section A.1.3, “Parameter Naming Conventions”) although no standard name scheme is enforced.

6.17.5.2. Common Attributes

Attributes that are typically specified are summarised below. They are datatype-specific (Section A.5, “Datatype-specific Attributes”) unless they are indicated as being global attributes (Section A.4, “Global Attributes”).

default: A global attribute that specifies a default value.

minimum: Specifies the minimum permitted value.

size: Specifies the permissible number of elements in an array data definition.

sum: Specifies the total of all values in an array data definition and is tested for unless the sumtest: attribute is false.

sumtest: A boolean attribute which, if turned off, turns off testing for the sum: attribute for an array data definition.
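
To make the interplay of these attributes concrete, the checks they describe can be sketched as a stand-alone C function. Everything here is an illustrative assumption rather than ACD code: DemoArrayRule and demoArrayValid are hypothetical names, size: is treated as an exact element count, and the tolerance field used when comparing the sum is invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the array attributes; not part of ACD or AJAX */
typedef struct DemoArrayRule {
    size_t size;       /* required number of elements (size:) */
    double minimum;    /* smallest permitted value (minimum:) */
    double sum;        /* required total (sum:) */
    int    sumtest;    /* non-zero to test the total (sumtest:) */
    double tolerance;  /* assumed slack allowed when comparing the total */
} DemoArrayRule;

/* Return 1 if the values satisfy the rule, 0 otherwise */
int demoArrayValid(const double *values, size_t n, const DemoArrayRule *rule)
{
    double total = 0.0;
    double diff;
    size_t i;

    if (n != rule->size)
        return 0;

    for (i = 0; i < n; i++) {
        if (values[i] < rule->minimum)
            return 0;
        total += values[i];
    }

    /* The sum is only checked when sumtest is switched on */
    if (rule->sumtest) {
        diff = total - rule->sum;
        if (diff < 0.0)
            diff = -diff;
        if (diff > rule->tolerance)
            return 0;
    }

    return 1;
}
```

With sumtest switched off, a set of values whose total differs from sum: is still accepted, provided the size and minimum checks pass.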

6.17.6. ACD File Handling

Datatypes and functions for handling arrays via the ACD file are shown below (Table 6.30, “Datatypes and Functions for Array Input and Output”).

Table 6.30. Datatypes and Functions for Array Input and Output

To read an array
ACD datatype            array
AJAX datatype           AjPFloat
To retrieve from ACD    ajAcdGetArray

Your application code will call embInit to process the ACD file and command line (see Section 6.3, “Handling ACD Files”). All values from the ACD file are read into memory and files are opened as necessary. You have a handle on the files and memory through the ajAcdGet* family of functions which return pointers to appropriate objects.

6.17.6.1. Input Array Retrieval

To retrieve an input array an object pointer is declared and then initialised using ajAcdGetArray:

    AjPFloat thresholds=NULL;

    thresholds = ajAcdGetArray("thresholds");

6.17.6.2. Processing Command line Options and ACD Attributes

Currently there are no functions for this.

6.17.6.3. Memory Management

It is your responsibility to free up memory at the end of the program. You must call the default destructor function (see below) on any array objects returned by calls to ajAcdGetArray.

6.17.7. Names of Functions

Functions for handling arrays have consistent names with a prefix of the following general form:

ajTypeDim

Type is the type of the array and is one of:

  • Chararr (character array)

  • Short (ajshort array)

  • Uint (ajuint array)

  • Int (ajint array)

  • Float (float array)

  • Long (ajlong array)

  • Double (double array)

Dim is the dimensionality of the array. It is not given for 1D arrays:

  • 2d (2-dimensional array)

  • 3d (3-dimensional array)

For example, all functions for handling 2-dimensional arrays of unsigned integers have the prefix ajUint2d whereas all functions for handling 1-dimensional arrays of floats have the prefix ajFloat.
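
The naming rule can be written down as a short helper that assembles a prefix from its parts. This is purely an illustration of the scheme: demoArrayPrefix is a hypothetical function and does not exist in AJAX.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "aj" + Type + Dim into buf.  A 1D array gets no Dim suffix.
** Returns 1 on success, 0 if the combination is not available or the
** buffer is too small.  Hypothetical helper, for illustration only. */
int demoArrayPrefix(const char *type, int dim, char *buf, size_t buflen)
{
    int written;

    if (dim < 1 || dim > 3)
        return 0;

    /* Character arrays are 1D only */
    if (strcmp(type, "Chararr") == 0 && dim != 1)
        return 0;

    if (dim == 1)
        written = snprintf(buf, buflen, "aj%s", type);
    else
        written = snprintf(buf, buflen, "aj%s%dd", type, dim);

    return written > 0 && (size_t) written < buflen;
}
```

Calling it with "Uint" and 2 yields ajUint2d, and with "Float" and 1 yields ajFloat, matching the prefixes given above.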

Functions are provided for handling 1D, 2D and 3D arrays of every type except char, for which only 1D arrays are supported.

6.17.8. Array Object Memory Management

6.17.8.1. Default Object Construction

To use an array object that is not defined in the ACD file you must first instantiate the appropriate object pointer. The default construction functions have the suffix New:

AjPChar      ajChararrNew (void);
AjPShort     ajShortNew (void);
AjPShort2d   ajShort2dNew (void);
AjPShort3d   ajShort3dNew (void);
AjPUint      ajUintNew (void);
AjPUint2d    ajUint2dNew (void);
AjPUint3d    ajUint3dNew (void);
AjPInt       ajIntNew (void);
AjPInt2d     ajInt2dNew (void);
AjPInt3d     ajInt3dNew (void);
AjPFloat     ajFloatNew (void);
AjPFloat2d   ajFloat2dNew (void);
AjPFloat3d   ajFloat3dNew (void);
AjPLong      ajLongNew (void);
AjPLong2d    ajLong2dNew (void);
AjPLong3d    ajLong3dNew (void);
AjPDouble    ajDoubleNew (void);
AjPDouble2d  ajDouble2dNew (void);
AjPDouble3d  ajDouble3dNew (void);

All constructors return the address of a new object. The pointers do not need to be initialised to NULL beforehand but it is good practice to do so. All the functions above are used in the same way:

    AjPShort2d       shorts = NULL;

    shorts    = ajShort2dNew();

    /* The objects are instantiated and ready for use */

6.17.8.2. Default Object Destruction

You must free the memory for an object once you are finished with it. The default destructor functions have the suffix Del and take the address of the appropriate object pointer as an argument:

void  ajChararrDel  (AjPChar* Parr);
void  ajShortDel    (AjPShort* Parr);
void  ajShort2dDel  (AjPShort2d* Parr);
void  ajShort3dDel  (AjPShort3d* Parr);
void  ajUintDel     (AjPUint* Parr);
void  ajUint2dDel   (AjPUint2d* Parr);
void  ajUint3dDel   (AjPUint3d* Parr);
void  ajIntDel      (AjPInt* Parr);
void  ajInt2dDel    (AjPInt2d* Parr);
void  ajInt3dDel    (AjPInt3d* Parr);
void  ajFloatDel    (AjPFloat* Parr);
void  ajFloat2dDel  (AjPFloat2d* Parr);
void  ajFloat3dDel  (AjPFloat3d* Parr);
void  ajLongDel     (AjPLong* Parr);
void  ajLong2dDel   (AjPLong2d* Parr);
void  ajLong3dDel   (AjPLong3d* Parr);
void  ajDoubleDel   (AjPDouble* Parr);
void  ajDouble2dDel (AjPDouble2d* Parr);
void  ajDouble3dDel (AjPDouble3d* Parr);

For example, to free memory allocated by array input from the ACD file use:

void  ajFloatDel (AjPFloat* Parr);

It is used as follows:

    AjPFloat thresholds=NULL;

    thresholds = ajAcdGetArray("thresholds");

    /* Do something with the array */

    ajFloatDel(&thresholds);

It is the responsibility of the calling function to destroy any objects once it has finished with them:

    AjPShort2d       shorts = NULL;

    shorts    = ajShort2dNew();

    /* The object is instantiated and ready for use */

    /* Do something with the instantiated object */

    ajShort2dDel(&shorts);

    /* The memory is freed and the pointer is reset to NULL, ready for re-use. */

    shorts    = ajShort2dNew();

    /* Do something else with the new object.  The pointer variable now refers to a freshly allocated array. */

    ajShort2dDel(&shorts);

    /* The object is done with so the memory is freed. */
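
The reason the destructors take the address of the object pointer, rather than the pointer itself, is that they reset the caller's variable to NULL as well as freeing the memory. A minimal stand-alone sketch of that pattern follows. DemoArr, DemoPArr and the demoArr* functions are hypothetical names that only mimic the shape of the AJAX calls, and the tolerance of repeated deletion is a property of this sketch, not a statement about AJAX.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical array object and its pointer type (plays the role of AjP...) */
typedef struct DemoArr {
    int *ptr;
} DemoArr;

typedef DemoArr *DemoPArr;

/* Constructor: returns the address of a new object */
DemoPArr demoArrNew(void)
{
    DemoPArr arr = malloc(sizeof *arr);
    assert(arr != NULL);
    arr->ptr = calloc(4, sizeof *arr->ptr);
    assert(arr->ptr != NULL);
    return arr;
}

/* Destructor: frees the object and resets the caller's pointer to NULL.
** Because the pointer is reset, calling it twice in this sketch is harmless. */
void demoArrDel(DemoPArr *parr)
{
    if (parr == NULL || *parr == NULL)
        return;
    free((*parr)->ptr);
    free(*parr);
    *parr = NULL;
}
```

Passing the pointer by value would leave the caller holding a dangling address after the free; passing its address lets the destructor clear it.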

6.17.8.3. Alternative Object Construction and Loading

An array can be constructed with a fixed reserve size by using functions with the suffix NewRes. Such functions take an integer argument (size) for the reserved size but are otherwise identical to the default constructor. Only a few of the available functions are shown below:

AjPChar     ajChararrNewRes (ajint size);
AjPShort    ajShortNewRes (ajint size);
AjPShort2d  ajShort2dNewRes (ajint size);
AjPShort3d  ajShort3dNewRes (ajint size);
AjPUint     ajUintNewRes (ajint size);
AjPUint2d   ajUint2dNewRes (ajint size);
...

6.17.9. Getting and Setting Array Elements

If you assign a value to an array element which is beyond the current size of the array, then the array will be dynamically reallocated to a sufficient size to accommodate the new element. If you try to retrieve an array element beyond the current size of the array, an error will be generated.

It is therefore handy to know the size of an array in advance. Functions for retrieving the size of an array have the suffix Len and either return the array length (for 1D arrays) or write the lengths by argument (2D and 3D arrays). Again, just a few functions are shown:

ajint  ajChararrLen (const AjPChar arr); 
ajint  ajShortLen (const AjPShort arr);
void   ajShort2dLen (const AjPShort2d arr, ajint* Pn1, ajint* Pn2);
void   ajShort3dLen (const AjPShort3d arr,ajint* Pn1, ajint* Pn2, ajint* Pn3);
ajint  ajUintLen (const AjPUint arr);
void   ajUint2dLen (const AjPUint2d arr, ajint* Pn1, ajint* Pn2);
...

For example, to get the sizes of a 3D double array:

    AjPDouble3d arr = NULL;
    ajint dim1;
    ajint dim2;
    ajint dim3;

    arr = ajDouble3dNew();

    /* Fill the array, then query its current lengths */
    ajDouble3dLen(arr, &dim1, &dim2, &dim3);
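
For 2D and 3D arrays the lengths are written through pointer arguments because there is more than one value to report. The sketch below models a 2D array as a set of nested 1D rows and reports both lengths in that style. It is not AJAX code: DemoRow, Demo2d and demo2dLen are hypothetical, and reporting the second length as that of the longest row is an assumption made for this model, so check the ajarr library documentation for what the real functions report.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ROWS 3

/* A nested 1D array; only its length matters for this sketch */
typedef struct DemoRow {
    size_t len;
} DemoRow;

/* A 2D array holding pointers to 1D rows; unused rows are NULL */
typedef struct Demo2d {
    size_t len;
    const DemoRow *rows[DEMO_ROWS];
} Demo2d;

/* Write the first-dimension length to *pn1 and the longest row length to *pn2 */
void demo2dLen(const Demo2d *arr, size_t *pn1, size_t *pn2)
{
    size_t i;

    *pn1 = arr->len;
    *pn2 = 0;

    for (i = 0; i < arr->len; i++) {
        if (arr->rows[i] != NULL && arr->rows[i]->len > *pn2)
            *pn2 = arr->rows[i]->len;
    }
}
```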

Functions to retrieve an element of an array have the suffix Get. They take the index of the element to retrieve and return the appropriate datatype:

char    ajChararrGet (const AjPChar arr, ajint n);
short   ajShortGet (const AjPShort arr, ajint n);
short   ajShort2dGet (const AjPShort2d arr, ajint n1, ajint n2);
short   ajShort3dGet (const AjPShort3d arr, ajint n1, ajint n2, ajint n3);
ajuint  ajUintGet (const AjPUint arr, ajint n);
ajuint  ajUint2dGet (const AjPUint2d arr, ajint n1, ajint n2);
...

For example, to retrieve the very first element of a 3-dimensional float array:

    AjPFloat3d arr = NULL;
    float    val;

    arr = ajFloat3dNew();

    /* ... fill the array so that element (0, 0, 0) exists ... */

    val = ajFloat3dGet(arr, 0, 0, 0);

Functions to set an element of an array have the suffix Put. They take the address of an array, the index of the element to set and of course the value to set. They return ajTrue if the element was set successfully:

AjBool  ajChararrPut (const AjPChar* Parr, ajint n, char chr);
AjBool  ajShortPut (const AjPShort* Parr, ajint n, short i);
AjBool  ajShort2dPut (const AjPShort2d* Parr, ajint n1, ajint n2, short i);
AjBool  ajShort3dPut (const AjPShort3d* Parr, ajint n1, ajint n2, ajint n3, short i);
AjBool  ajUintPut (const AjPUint* Parr, ajint n, ajuint i);
AjBool  ajUint2dPut (const AjPUint2d* Parr, ajint n1, ajint n2, ajuint i);
...

For example, to set the very first element of a 3-dimensional float array:

    AjPFloat3d arr = NULL;
    float    val = 1.0;

    arr = ajFloat3dNew();

    ajFloat3dPut(&arr, 0, 0, 0, val);

6.17.10. Sorting Arrays

Functions are provided in ajsort.h/c to sort C-type arrays of integers, unsigned integers and floating point numbers. Arrays may be sorted in the following ways:

  • Basic sorting of a single array

  • Sorting a single array by the order of elements in another array

  • Sorting of two arrays simultaneously

  • Sorting in ascending or descending order

Functions for array sorting have the prefix ajSort followed by the datatype (Int, Uint or Float) of the array(s) being sorted and then Inc (for sorting in ascending order) or Dec (for sorting in descending order). The array size must be provided.

The basic sort functions include:

extern void  ajSortIntDec (ajint *a, ajuint n);
extern void  ajSortUintDec (ajuint *a, ajuint n);
extern void  ajSortFloatDec (float *a, ajuint n);
extern void  ajSortIntInc (ajint *a, ajuint n);
extern void  ajSortUintInc (ajuint *a, ajuint n);
extern void  ajSortFloatInc (float *a, ajuint n);

There are functions for sorting a single array (p) by the order of elements in another array (a):

void  ajSortFloatDecI (const float *a, ajuint *p, ajuint n);
void  ajSortIntDecI (const ajint *a, ajuint *p, ajuint n);
void  ajSortUintDecI (const ajuint *a, ajuint *p, ajuint n);
void  ajSortFloatIncI (const float *a, ajuint *p, ajuint n);
void  ajSortIntIncI (const ajint *a, ajuint *p, ajuint n);
void  ajSortUintIncI (const ajuint *a, ajuint *p, ajuint n);
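
The idea behind sorting one array by the order of another can be sketched as follows, on the assumption that p holds indices into a and that only p is rearranged. demoSortFloatDecI is a hypothetical stand-alone function written for illustration; it uses a simple insertion sort and makes no claim about the algorithm used in ajsort.c.

```c
#include <assert.h>
#include <stddef.h>

/* Reorder the index array p so that a[p[0]] >= a[p[1]] >= ... ;
** the key array a itself is left untouched. */
void demoSortFloatDecI(const float *a, unsigned int *p, unsigned int n)
{
    unsigned int i;

    for (i = 1; i < n; i++) {
        unsigned int idx = p[i];
        unsigned int j = i;

        /* Shift indices whose keys are smaller than the key at idx */
        while (j > 0 && a[p[j - 1]] < a[idx]) {
            p[j] = p[j - 1];
            j--;
        }
        p[j] = idx;
    }
}
```

Starting from p = {0, 1, 2} and keys a = {0.5, 2.0, 1.0}, the index array becomes {1, 2, 0}: the largest key is at position 1, the next at position 2, and the smallest at position 0.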

Functions for sorting two arrays simultaneously:

void  ajSorttwoIntIncI (ajint *a, ajuint n, ajuint *p);
void  ajSorttwoUintIncI (ajuint *a, ajuint n, ajuint *p);
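
Sorting two arrays simultaneously means that every move made while ordering the first array is mirrored in the second, so paired elements stay together. A minimal sketch, assuming that a supplies the sort keys and p is carried along with it: demoSortTwoIntInc is a hypothetical function that borrows the argument order shown above and uses an insertion sort purely for clarity.

```c
#include <assert.h>
#include <stddef.h>

/* Sort a into ascending order and apply the same moves to p */
void demoSortTwoIntInc(int *a, unsigned int n, unsigned int *p)
{
    unsigned int i;

    for (i = 1; i < n; i++) {
        int key = a[i];
        unsigned int tag = p[i];
        unsigned int j = i;

        /* Shift larger keys, and their partners in p, one slot to the right */
        while (j > 0 && a[j - 1] > key) {
            a[j] = a[j - 1];
            p[j] = p[j - 1];
            j--;
        }
        a[j] = key;
        p[j] = tag;
    }
}
```

With a = {30, 10, 20} and p = {0, 1, 2}, the result is a = {10, 20, 30} and p = {1, 2, 0}, so p still records where each value originally sat.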

6.17.11. Retrieving a C-type Array

The C-type array may be retrieved from any of the array objects using the following functions. Each function name is the usual prefix followed by the type again (for example, ajShort2dShort returns the C array held by an AjPShort2d). Again, not all variants are shown:

char*     ajChararrChararr (const AjPChar arr);   

short*    ajShortShort (const AjPShort arr); 
short**   ajShort2dShort (const AjPShort2d arr);    
short***  ajShort3dShort (const AjPShort3d arr); 

ajuint*   ajUintUint (const AjPUint arr); 
ajuint**  ajUint2dUint (const AjPUint2d arr); 
...

Once the C-type array is retrieved then standard array notation can be used to access array elements. It is of course your responsibility to ensure you do not try to access an array element that is out of bounds. To help avoid doing so you can use the *Len functions (see above) to return the array size(s).