AJAX provides objects for implementing doubly-linked lists and iterating over them. All the basic functionality that you would expect is provided. This includes:
Node retrieval without removal (peeking)
List iteration to move backwards and forwards through a list
List querying to return nodes identified by a provided query function
List editing to add (push) and remove (pop) nodes from a list, to push an entire list onto another and reverse the node order
Garbage collection to remove nodes identified by a provided query function
List sorting on up to three fields (C data structure elements) of the node data
List to array conversion
List processing invoking a provided function on the list nodes
The list and list iterator objects are created and handled directly from the C source code. The list
ACD datatype is not for generic lists; it is used to handle selections from a menu.
AJAX library files for handling lists are listed in the table (Table 6.28, “AJAX Library Files for Handling Lists”). Library file documentation, including a complete description of datatypes and functions, is available at:
http://emboss.open-bio.org/rel/dev/libs/ |
Library File Documentation | Description |
---|---|
ajlist | General list handling |
ajlist.h/c
. All the functions you are likely to ever need for the creation and control of linked lists. They define the basic list object (AjPList
) which includes the list node object (AjPListNode
) as a nested substructure, and a list iteration object (AjIList
). It also includes static data structures and functions for handling lists at a low level. You are unlikely to need the low-level code unless you plan to implement code to extend the core functionality of the library.
There is no dedicated ACD datatype for handling general lists, the list
ACD datatype being used to handle selections from a menu (see Section 6.19, “Handling Menus”). For numeric arrays the array
datatype can be used (see Section 6.17, “Handling Arrays”).
There are three AJAX datatypes for handling lists:
AjPList
Basic linked list.
AjPListNode
Node in a linked list, included as a nested substructure in AjPList
.
AjIList
Used for list iteration.
The functions handle lists of nodes which include a pointer variable to the node data. The generic list handling functions have void *
parameters which must be cast to the correct type when an argument is passed. For convenience, a dedicated set of functions for handling string lists are provided that are cast to the appropriate type. Such functions all have the prefix ajListstr
to distinguish them from the general functions (which have the prefix ajList
).
To use a list you must first instantiate the list object pointer. The default constructor functions allocate an empty list (no nodes or node data):
/* Default generic list constructor */ AjPList ajListNew (void); /* Default string list constructor */ AjPList ajListstrNew (void);
All constructors return the address of a new list. The pointers do not need to be initialised to NULL
but it is good practice to do so:
AjPList list = NULL; AjPList liststr = NULL; list = ajListNew(); liststr = ajListstrNew(); /* The objects are instantiated and ready for use */
You must free the memory for a list once you are finished with it. The default destructor functions will free the list and the nodes but not the node data itself:
/* Default generic list destructor */ AjPList ajListFree (AjPList* Plist); /* Default string list destructor */ AjPList ajListstrFree (AjPList* Plist);
Typical code for object destruction is as follows:
AjPList list = NULL; AjPList liststr = NULL; list = ajListNew(); liststr = ajListstrNew(); /* The objects are instantiated and ready for use ... do something with the list */ ajListFree(&list); ajListFree(&liststr); /* The lists are freed and the pointers reset to NULL, ready for reuse. list = ajListNew(); liststr = ajListstrNew(); /* The pointer variable is reallocated. Do something else with the new lists. */ ajListFree(&list); ajListstrFree(&liststr); /* Free the lists once you are done with them. */
In practice you will allocate data to the nodes in a list which must be freed once you are done with them. You must be careful not to lose your handle (C pointers) on the node data in your program, or to forgot to free the nodes, or to try and free the node data twice. These are all easy mistakes to make. In the following example, two strings (str1
and str2
) are pushed onto both a generic list (list
) and a string list (liststr
) using ajListPushAppend
and ajListstrPushAppend
respectively. This is for illustrative purposes only. In practice you would always use the list string functions (ajListstr*
) to handle lists of strings:
AjPList list = NULL; AjPList liststr = NULL; AjPStr str1 = NULL; AjPStr str2 = NULL; AjPStr strtmp = NULL; AjIList iter = NULL; /* Create the strings and lists */ str1 = ajStrNewC("String 1"); str2 = ajStrNewC("String 2"); list = ajListNew(); liststr = ajListstrNew(); /* Push the strings onto the generic list. The strings must be cast to void *. */ ajListPushAppend(list, (void *)str1); ajListPushAppend(list, (void *)str2); /* Push the same strings onto the string list. The cast is not needed. */ ajListstrPushAppend(liststr, str1); ajListstrPushAppend(liststr, str2);
As the same two strings are pushed onto both lists, two string objects and both the lists need freeing. For string lists only, there is a destructor to free both the list and the node data (strings) referenced by the list:
/* String list and nodes destructor */ AjPList ajListstrFreeData (AjPList* Plist);
Therefore the memory allocated by the code above could be freed as follows:
/* Free the string list and strings using ajListstrFreeData. Free the generic list using ajListFree */ ajListFree(&list); ajListstrFreeData(&liststr); /* All memory is freed now and the list pointers are set to NULL and ready for reuse. */
For generic lists, you must free the node data manually. The following code is typical and uses the list iterator object (iter
) to iterate through a list list
) using ajListIterGet
calling the destructor function (in this case a string destructor) on each node in turn. Instead of using ajListstrFreeData
the memory allocated by the code above could be freed as follows:
/* Iterate through the list deleting the node data */ iter = ajListIterNew(list); while(strtmp = (AjPStr) ajListIterGet(iter)) ajStrDel(&strtmp); /* Free the list and list iterator */ ajListIterDel(&iter); ajListFree(&list);
Putting all this together:
AjPList list = NULL; AjPList liststr = NULL; AjPStr str1 = NULL; AjPStr str2 = NULL; AjPStr strtmp = NULL; AjIList iter = NULL; /* Create the strings and lists */ str1 = ajStrNewC("String 1"); str2 = ajStrNewC("String 2"); list = ajListNew(); liststr = ajListstrNew(); /* Push the strings onto the generic list. The strings must be cast to void *. */ ajListPushAppend(list, (void *)str1); ajListPushAppend(list, (void *)str2); /* Push the same strings onto the string list. The cast is not needed. */ ajListstrPushAppend(liststr, str1); ajListstrPushAppend(liststr, str2); /* Free the string list and strings using ajListstrFreeData. Free the generic list using ajListFree */ ajListFree(&list); ajListstrFreeData(&liststr); /* All memory is freed now and the list pointers are set to NULL and ready for reuse. */ /* Re-allocate the strings and push them onto the generic list. */ str1 = ajStrNewC("String 1"); str2 = ajStrNewC("String 2"); list = ajListNew(); ajListPushAppend(list, (void *)str1); ajListPushAppend(list, (void *)str2); /* Iterate through the list deleting the node data */ iter = ajListIterNew(list); while(strtmp = (AjPStr) ajListIterGet(iter)) ajStrDel(&strtmp); /* Free the list and list iterator */ ajListFree(&list); ajListIterDel(&iter); /* All memory is freed now */
Various alternative constructor and destructor functions are provided. For the constructors you must know whether only the nodes are copied by a function (the node data being shared between two lists) or the nodes and node data are physically copied from one list to another. Similarly, for the destructors, you must distinguish whether just the list or the list and the node data are deleted.
Functions to copy a list of nodes but not the node data itself include:
/* Copy a generic list */ AjPList ajListNewListref (const AjPList list); /* Copy a string list */ ajListstrNewListref (const AjPList list);
In the following example, ajListstrNewListref
is used to copy a list:
AjPList list = NULL; AjPList listcopy = NULL; AjPStr str1 = NULL; AjPStr str2 = NULL; /* Create a list of strings */ list = ajListNew(); str1 = ajStrNewC("String 1"); str2 = ajStrNewC("String 2"); ajListPushAppend(list, (void *)str1); ajListPushAppend(list, (void *)str2); /* Copy the list */ listcopy = ajListstrNewListref(list); /* Do something with the lists */ /* Free the original list and node data */ ajListstrFreeData(&list); /* Free the copy of the list */ ajListstrFree(&listcopy);
Step through the data and use ajListPushAppend
calls.
For example, to create a list from two sequences:
AjPList list = NULL; AjPSeq seq1 = NULL; AjPSeq seq2 = NULL; AjPSeq seqtmp = NULL; AjIList iter = NULL; /* Create the sequences */ list = ajListNew(); seq1 = ajSeqNewNameC("AAAAAAAAAA"); seq2 = ajSeqNewNameC("LLLLLLLLLL"); /* Create the list */ list = ajListNew; ajListPushAppend(list, seq1); ajListPushAppend(list, seq2); /* Do something with the lists */ /* Free the list and node data */ iter = ajListIterNew(list); while(seqtmp = (AjPSeq) ajListIterGet(iter)) ajSeqDel(&seqtmp); ajListFree(&list);
Functions are provided to return a node from a list without removing it. The current node, first node, last node or a numbered node may be returned:
/* Retrieve the current node in a generic list */ AjBool ajListPeek (const AjPList list, void** Pptr); /* Retrieve the current node in a string list */ AjBool ajListstrPeek (const AjPList list, AjPStr* Pstr); /* Retrieve the nth node in a generic list */ AjBool ajListPeekNumber (const AjPList list, void** Pptr, ajint n); /* Retrieve the first node in a generic list */ AjBool ajListPeekFirst (const AjPList list, void** Pptr); /* Retrieve the last node in a generic list */ AjBool ajListPeekLast (const AjPList list, void** Pptr);
All the functions return ajTrue
if the node was retrieved successfully and use an appropriate object pointer to pick up the retrieved node. For example, here the first node in a list of sequences is retrieved. Code to create the list of sequences is not shown:
AjPSeq seqtmp = NULL; AjPList list = NULL; ... /* Code to allocate list of sequences */ if(ajListPeekFirst(list, &seqtmp)) /* Do something with list */; else /* Error handling */;
To iterate through a list you must first construct a list iteration object. There are four functions for this:
/* Default list iterator constructor, start to end iteration. */ AjIList ajListIterNew (AjPList list); /* Start to end iteration through read-only list */ AjIList ajListIterNewread (const AjPList list); /* End to start iteration */ AjIList ajListIterNewBack (AjPList list); /* End to start iteration through read-only list */ AjIList ajListIterNewreadBack (const AjPList list);
You have already come across ajListIterNew
. The other functions are used in exactly the same way except they set up the iterator for reading from a read-only list (ajListIterNewread
), to iterate from the end to the start of a list ajListIterNewBack
or to iterate from end to start through a read-only list (ajListIterNewreadBack
).
Flexible control statements can be constructed for list iteration. There are three functions for iterating through a list from start to end:
/* Returns the next item in a list */ void* ajListIterGet (const AjIList iter); /* Returns true if iteration is complete */ AjBool ajListIterDone (const AjIList iter);
You have already come across ajListIterGet
to return the next item in a list (or NULL
if the end of the list is reached). ajListIterDone
will return ajTrue
if iteration has completed.
Another three functions have the same functionality but are for iteration from the end to the start of a list:
/* As above, but end to start iteration */ void* ajListIterGetBack (const AjIList iter); /* As above, but end to start iteration */ AjBool ajListIterDoneBack (const AjIList iter);
To search through a list and determine if any nodes are matches according to a given function use:
/* Search a generic list */ AjBool ajListMap (const AjPList list, AjBool apply(void** Pptr), void* cl); /* Search a string list */ AjBool ajListstrMap (const AjPList list, AjBool apply(AjPStr* Pstr), void* cl);
The search function must be provided and is of the same type as the comparison function used for garbage collection (see below). The cl
argument is usually NULL, but can be used to pass external data to the function. Using the same example, if the match function used is:
AjBool Compare(const void * ptr, void* cl) { if(ajSeqGetLen((AjPSeq)ptr) > 1000) return ajTrue; return ajFalse; }
then ajListMap
will return ajTrue
if any of the nodes is a sequence whose length is greater than 1000:
if(ajListMap(list, Compare, NULL)) /* List contains a sequence > 1000 residues */; else /* List does not contain such a sequence */;
Several basic list editing functions are provided. These include:
Popping a node from the start or end of a list
Pushing a node onto the start or end of a list
Push an entire list of nodes onto another list
Add or remove a list node at any point in the list
Reverse the order of the nodes in a list
The following functions add a node to a list (equivalent functions for string lists are available):
/* Push a node onto the start of a list */ void ajListPush (AjPList list, void* ptr); /* Push a node onto the end of a list */ void ajListPushAppend (AjPList list, void* ptr); /* Push a list of nodes onto the start of a list */ void ajListPushlist (AjPList list, AjPList* Plist); /* Insert a node into a list using an iterator */ void ajListIterInsert (AjIList iter, void* ptr);
ajListPushlist
deletes the list from which the nodes were pushed, so you must be careful not to try and delete the node data twice. In future code refactoring it may be renamed to reflect its destructor nature. In the following example, all memory is freed correctly:
AjPList list = NULL; AjPList listnew = NULL; AjPSeq seq1 = NULL; AjPSeq seq2 = NULL; AjPSeq seqtmp = NULL; AjIList iter = NULL; /* Create a list of sequences */ list = ajListNew(); seq1 = ajSeqNewNameC("AAAAAAAAAA", "seqA"); seq2 = ajSeqNewNameC("CCCCCCCCCC", "seqC"); ajListPushAppend(list, seq1); ajListPushAppend(list, seq2); /* Create a new list and add the nodes to it (the original list is deleted) */ listnew = ajListNew(); ajListPushlist(list, &listnew); /* Free the list and node data */ iter = ajListIterNew(listnew); while(seqtmp = (AjPSeq) ajListIterGet(iter)) ajSeqDel(&seqtmp); ajListFree(&listnew);
ajListIterInsert
will insert a node into a list at the current position of the list iterator that is passed. In this example a new sequence is inserted into the middle of the list:
AjPList list = NULL; AjPSeq seq1 = NULL; AjPSeq seq2 = NULL; AjPSeq seq3 = NULL; AjIList iter = NULL; /* Create a list from seq1 and seq3 */ list = ajListNew(); seq1 = ajSeqNewNameC("AAAAAAAAAA"); seq3 = ajSeqNewNameC("CCCCCCCCCC"); ajListPushAppend(list, seq1); ajListPushAppend(list, seq3); iter = ajListIterNew(list); while(seqtmp = (AjPSeq) ajListIterGet(iter)) { /* Insert a new node once node with sequence "CCCCCCCCCC" is found */ if(ajStrMatchC(ajSeqGetSeqS(seqtmp), "CCCCCCCCCC")) { seq2 = ajSeqNewNameC("BBBBBBBBBB"); ajListIterInsert(iter, (void *) seq2); break; } } /* The order of sequences in the list is now seq1, seq2, seq3 */
The following functions remove a node from a list (equivalent functions for string lists are available):
/* Pop a node from the start of a list */ AjBool ajListPop (AjPList list, void** Pptr); /* Pop a node from the end of a list */ AjBool ajListPopLast (AjPList list, void** Pptr); /* Remove current node at a list iterator position */ void ajListIterRemove (AjIList iter);
ajListIterRemove
simply removes the node at the current position of the list iterator that is passed. You must be careful not to lose your handle on the node data and inadvertently create a memory leak, which is why in this example, which follows from the code above, the sequence seq3
is deleted explicitly:
iter = ajListIterNew(list); while(seqtmp = (AjPSeq) ajListIterGet(iter)) { /* Simply delete the node */ if(ajStrMatchC(ajSeqGetSeqS(seqtmp), "CCCCCCCCCC")) { ajListIterRemove(iter); /* Delete the node */ ajSeqDel(&seqtmp); /* Delete the node data to avoid a memory leak */ break; }
In contrast, when a node is popped using ajListPop
or ajListPopLast
the data is caught by the second argument, which should be freed once you are done with it:
... ajListPopLast(list, &seqtmp); /* Do something with the popped sequence before deleting it */ ajSeqDel(&seqtmp);
To reverse the order of nodes in a list use:
void ajListReverse (AjPList list); /* Generic list */ void ajListstrReverse (AjPList list); /* String list */
Garbage collection may be performed on a list to remove nodes whose node data satisfy a certain constraint. ajListPurge
is used and takes three arguments: the list to be cleaned (list
), a comparison function (test
) to check the constraint and a destructor function (nidedelete
) to delete the node data:
void ajListPurge (AjPList list, AjBool (*test)(const void *), void (*nodedelete)(void **));
The comparison and destructor functions must be provided. As you can see from the prototype the comparison function returns a AjBool
(which is ajTrue
if the constraint is matched) and takes a generic void *
argument whereas the destructor function must be of void
return type and take a generic void **
argument.
Let us assume you have a list of sequence (AjPSeq
) objects you want to garbage collect, removing all sequences from the list that are longer than 1000 characters. The comparison function would be as follows:
AjBool Compare(const void * ptr) { if(ajSeqGetLen((AjPSeq)ptr) > 1000) return ajTrue; return ajFalse; }
We cannot use the default sequence destructor function because its parameter is of the wrong type:
void ajSeqDel (AjPSeq* Pseq);
A const
pointer is required, which means you must provide a wrapper function around the default destructor function:
void mySeqDelWrap(const void **ptr) { ajSeqDel((AjPSeq *) ptr); return; }
To perform garbage collection on a list of sequences (list
) you would call:
void ajListPurge (list, Compare, mySeqDelWrap);
A list may be sorted on up to three fields (elements of the C data structure) of the node data. Where more than one field is used the list is first sorted by field 1. Nodes with the same sort order for field 1 are then re-sorted by field 2 such that the relative sort order from field 1 is maintained. Similarly, if a third field is used, nodes with the same sort order for fields 1 and 2 are re-sorted by field 3 such that the relative sort order from sorting by fields 1 and 2 is preserved. The functions are :
void ajListSort (AjPList list,int (*sort1) (const void*, const void*)); void ajListSortTwo (AjPList list,int (*sort1) (const void*, const void*), int (*sort2) (const void*, const void*)); void ajListSortTwoThree (AjPList list,int (*sort1) (const void*, const void*), int (*sort2) (const void*, const void*), int (*sort3) (const void*, const void*));
Lists may also be sorted (by up to two fields) and duplicate nodes (nodes with the same sort order) removed. The functions are:
void ajListSortUnique (AjPList list, int (*compar) (const void*, const void*), void nodedelete (void**, void*)); void ajListSortTwoUnique(AjPList list, int (*sort1) (const void*, const void*), int (*sort2) (const void*, const void*), void nodedelete (void**, void*));
One or more comparison functions sort*
must be provided. These take two generic void *
arguments for the two objects passed, and return the sort order of the objects as an integer: which should be -1
if the first object sorts higher than the second and +1
otherwise. For the ajListSortUnique*
functions, a destructor function is the same as for ajListMap
(see above): it must be of void
return type and take a generic void **
argument and a (usually NULL) second argument which can be used to pass external data to the function.
Let us assume you have a list of sequences (AjPSeq
objects) you want to sort by length (shortest first) and make unique, removing any sequences whose length is the same as another sequence in the list such that the final list of sequences have unique lengths. The comparison function would be as follows:
int Compare(const void *ptr1, const void * ptr2) { if(ajSeqGetLen((AjPSeq)ptr1)) < ajSeqGetLen((AjPSeq)ptr2)) return -1; return 1; }
The destructor function is the same as for list deletion collection above.
To sort the list and remove sequences of duplicate length you would call:
void ajListSortUnique (list, Compare, mySeqDelWrap);
Various string comparison functions are provided in ajstr.c
and are useful for sorting lists of strings. The functions perform case-sensitive and case-insensitive comparison, with and without wildcard characters. Variants of the function support string objects (AjPStr
) and C-type (char *
) strings (not all shown):
/* String to C-type string */ int ajStrCmpC (const AjPStr thys, const char *text); /* String to string */ int ajStrCmpS (const AjPStr str, const AjPStr str2); /* Case-insensitive string to string */ int ajStrCmpCaseS (const AjPStr str1, const AjPStr str2); /* String to C-type string using the first len characters only */ ajint ajStrCmpLenC (const AjPStr thys, const char *text, ajuint len); /* String to string using the first len characters only */ int ajStrCmpLenS (const AjPStr str, const AjPStr str2, ajuint len); /* String to C-type string with wildcards.*/ int ajStrCmpWildC (const AjPStr thys, const char* text); /* String to string with wildcards.*/ int ajStrCmpWildS (const AjPStr thys, const AjPStr str); /* Case-insensitive string to C-type string with wildcards.*/ int ajStrCmpWildCaseC (const AjPStr thys, const char* text); /* Case-insensitive string to string with wildcards.*/ int ajStrCmpWildCaseS (const AjPStr thys, const AjPStr str); /* Same as ajStrCmpS but strings are cast as void */ int ajStrVcmp (const void* str1, const void* str2);
There are also comparison functions for handling C-type strings:
/* Case-insensitive C-type string to C-type string */ int ajCharCmpCase (const char* txt1, const char* txt2); /* Case-insensitive C-type string to C-type string using the first len characters only */ int ajCharCmpCaseLen (const char* txt1, const char* txt2, ajuint len); /* C-type string to C-type string using wildcards. */ ajint ajCharCmpWild (const char* txt1, const char* txt2); /* Case-insensitive C-type string to C-type string using wildcards. */ ajint ajCharCmpWildCase (const char* txt1, const char* txt2);
The functions return an integer representing the sort order which is -1
if the first string should sort before the second, +1
if the second string should sort first, or 0
if they are identical.
A list may be converted to an array by using:
ajint ajListToarray (const AjPList list, void*** PParray); ajint ajListstrToarray (const AjPList list, AjPStr** PParray);
A string list may be appended to an existing array by using:
ajint ajListstrToarrayAppend (const AjPList list, AjPStr** PParray);
The functions return the size of the array and in all cases memory for the array is (re)allocated as needed and the original list left untouched. The node data is not duplicated, so once these functions return, the list, the array and the node data must be freed. In the example below, code to create the list of sequences is not shown:
AjPSeq *seqarray = NULL; AjPList list = NULL; AjPSeq seqtmp = NULL; AjIList iter = NULL; ... /* Code to allocate list of sequences */ ajListToarray(list, &seqarray); /* Do something with the array */ /* Free the list and node data */ iter = ajListIterNew(list); while(seqtmp = (AjPSeq) ajListIterGet(iter)) ajSeqDel(&seqtmp); ajListFree(&listnew); /* Delete the iterator */ ajListIterDel(&iter); /* Free the array */ AJFREE(seqarray);
To invoke a function on all the nodes in a list use one of:
void ajListMap (AjPList list, void apply(void** Pptr, void* ptr), void* cl); void ajListstrMap (AjPList list, void apply(AjPStr* Pstr, void* ptr), void* cl); void ajListMapread (const AjPList list, void apply(void* ptr, void* ptr), void* cl); void ajListstrMapread (const AjPList list, void apply(AjPStr str, void* ptr), void* cl);
The function that is applied is again of the same type as the comparison function used for garbage collection (see above). The *MapRead
functions ensure that the function (apply
) that's applied does not modify the list elements and, as such, takes a list that is const
. In contrast, any type of operation is permitted with ajListMap
and ajListstrMap
.