9.2. The C Source Code (seqret.c)

The application C source code is very simple. It is shown in full below and then described step by step in the sections that follow:

/* @source seqret application
**
** Return a sequence
**
** @author Copyright (C) Peter Rice
** @@
**
** This program is free software; you can redistribute it and/or
** modify it under the terms of the GNU General Public License
** as published by the Free Software Foundation; either version 2
** of the License, or (at your option) any later version.
**
** This program is distributed in the hope that it will be useful,
** but WITHOUT ANY WARRANTY; without even the implied warranty of
** MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
** GNU General Public License for more details.
**
** You should have received a copy of the GNU General Public License
** along with this program; if not, write to the Free Software
** Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
******************************************************************************/

#include "emboss.h"




/* @prog seqret ***************************************************************
**
** Reads and writes (returns) sequences
**
******************************************************************************/

int main(int argc, char **argv)
{
    AjPSeqall seqall = NULL;
    AjPSeqout seqout = NULL;
    AjPSeq seq = NULL;
    AjBool firstonly;

    embInit("seqret", argc, argv);

    seqout = ajAcdGetSeqoutall("outseq");
    seqall = ajAcdGetSeqall("sequence");
    firstonly = ajAcdGetBoolean("firstonly");

    while(ajSeqallNext(seqall, &seq))
    {
        ajSeqoutWriteSeq(seqout, seq);
        if(firstonly)
            break;
    }

    ajSeqoutClose(seqout);

    embExit();

    return 0;
}

9.2.1. Variable Declarations

The first block of code in main() declares variables for holding values from the ACD file:

AjPSeqall seqall = NULL;
AjPSeqout seqout = NULL;
AjPSeq seq = NULL;
AjBool firstonly;

The types beginning with AjP (AjPSeqall, AjPSeqout and AjPSeq) are all C pointers to EMBOSS objects, that is, to C data structures of the corresponding types. Many EMBOSS objects, for both complex biological and other types, are defined in the AJAX and NUCLEUS programming libraries. AjBool, in contrast, is the EMBOSS datatype for a simple Boolean variable. You'll notice that the pointer variables are initialised to NULL. It is good practice to always initialise pointers to NULL where they are declared. For an explanation of why, and detailed information on programming with objects, see Section 5.3, “Objects (C Data Structures)”.
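
As a stand-alone illustration of why initialising pointers to NULL helps, consider the sketch below. It does not use the EMBOSS libraries: the DemoPSeq type and the demo_seq_new and demo_seq_del functions are hypothetical names invented for this example. The point is only that a destructor can test for NULL and do nothing, whereas an uninitialised pointer holds an indeterminate value that cannot be tested safely.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical object type, standing in for an EMBOSS object.
** Not part of AJAX or NUCLEUS. */
typedef struct DemoSSeq
{
    char *name;
} DemoOSeq;

typedef DemoOSeq *DemoPSeq;

/* Hypothetical constructor: returns a new object, or NULL on failure */
DemoPSeq demo_seq_new(const char *name)
{
    DemoPSeq seq = malloc(sizeof(*seq));

    if(!seq)
        return NULL;

    seq->name = malloc(strlen(name) + 1);
    if(!seq->name)
    {
        free(seq);
        return NULL;
    }
    strcpy(seq->name, name);

    return seq;
}

/* Hypothetical destructor: takes the address of the pointer so that it
** can reset the pointer to NULL. Calling it on a NULL pointer, or
** calling it twice on the same pointer, does nothing harmful. */
void demo_seq_del(DemoPSeq *pseq)
{
    if(!pseq || !*pseq)
        return;

    free((*pseq)->name);
    free(*pseq);
    *pseq = NULL;
}
```

With this pattern, a pointer declared as DemoPSeq seq = NULL; can be passed to demo_seq_del(&seq) at any point, whether or not an object was ever created for it.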

AjPSeq is for single sequence input, AjPSeqall for multiple sequence input and AjPSeqout for sequence output. In the seqret application, AjPSeqall seqall is used for the input sequence stream, AjPSeq seq holds the data for a single sequence from that stream and AjPSeqout seqout is used for the output sequence stream. AjBool firstonly holds the value of the firstonly option from the ACD file.

You'll notice that no variable is required for the feature ACD definition. Its value is applied to the input sequence stream within the ACD file itself, by the ACD code features: "$(feature)". In other words, whether or not feature information is included with the sequences is set within the ACD file and stored within the seqall object, so no additional variable is needed for it in the C code.

9.2.2. ACD File and Command line Processing

The ACD file and the command line are processed by a single call:

embInit("seqret", argc, argv);

embInit handles all of the user input processing, which is why it is called first. It reads in local database definitions, finds the right ACD file to use (the first argument is "seqret", so it looks for seqret.acd in the ACD directory), reads the ACD file and processes the command line using argc and argv from main.

By the time embInit returns, the input sequence stream (sequence) will be opened for reading and the first sequence read into memory, the boolean variables feature and firstonly will have received values (possibly the default), and an output file will be opened for outseq. Memory is allocated for these objects and is available for use by the program.

embInit handles prompting of the user for values that are not entered on the command line, including functionality such as re-prompting the user for values that are out of range.
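
The re-prompting behaviour can be pictured with the following sketch. It is not how embInit is implemented and it uses no EMBOSS calls: demo_first_in_range is a hypothetical helper that walks through a list of candidate replies, standing in for successive answers from the user, and accepts the first one that parses as an integer within the permitted range.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not an EMBOSS function.
** Tries each reply in turn, as if the user had been prompted again
** after every rejected value. Returns the index of the accepted reply
** and stores its value in *value, or returns -1 if none is acceptable. */
int demo_first_in_range(const char *const replies[], size_t nreplies,
                        long min, long max, long *value)
{
    size_t i;

    for(i = 0; i < nreplies; i++)
    {
        char *end = NULL;
        long parsed;

        errno = 0;
        parsed = strtol(replies[i], &end, 10);

        /* Reject empty input, trailing characters and overflow */
        if(end == replies[i] || *end != '\0' || errno == ERANGE)
            continue;

        /* Reject values outside the permitted range */
        if(parsed < min || parsed > max)
            continue;

        *value = parsed;
        return (int) i;
    }

    return -1;
}
```

For a permitted range of 1 to 100 and the replies "abc", "500" and "42", the first two replies are rejected and the third is accepted.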

9.2.3. Retrieving Values from the ACD File

To retrieve C pointers to the sequence output and input objects created by embInit, the following code is used:

seqout = ajAcdGetSeqoutall("outseq");
seqall = ajAcdGetSeqall("sequence");

Similarly, to retrieve the value of the simple data type variable firstonly:

firstonly = ajAcdGetBoolean("firstonly");

You can see that the argument to the ajAcdGet* functions is the name of the ACD definition which is to be retrieved.
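
The idea of retrieving a value by the name of its ACD definition can be sketched as a lookup in a table keyed by name. The DemoOAcdEntry type and the demo_acd_get_boolean function are hypothetical and say nothing about how the ajAcdGet* functions are actually implemented; they only show the calling pattern of passing a definition name and getting its value back.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a named boolean value, standing in for an
** ACD boolean definition that has already been processed. */
typedef struct DemoSAcdEntry
{
    const char *name;
    int value;
} DemoOAcdEntry;

/* Hypothetical lookup, not an EMBOSS function.
** Finds the entry called 'name' and copies its value into *value.
** Returns 1 if the name was found and 0 otherwise. */
int demo_acd_get_boolean(const DemoOAcdEntry *table, size_t nentries,
                         const char *name, int *value)
{
    size_t i;

    for(i = 0; i < nentries; i++)
    {
        if(strcmp(table[i].name, name) == 0)
        {
            *value = table[i].value;
            return 1;
        }
    }

    return 0;
}
```

A table holding the entries "feature" and "firstonly" would then be queried with demo_acd_get_boolean(table, 2, "firstonly", &value), in the same spirit as the call to ajAcdGetBoolean above.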

9.2.4. Sequence Handling

To iterate through the input sequences and load a sequence into memory the following code is used:

while(ajSeqallNext(seqall, &seq))
{
    ajSeqoutWriteSeq(seqout, seq);
    if(firstonly)
        break;
}

ajSeqallNext is called in a loop to retrieve the sequences from the input stream one at a time. The second argument (&seq) is the address of the seq pointer, which ajSeqallNext sets to the current sequence in the stream. ajSeqoutWriteSeq then writes this sequence to the output stream seqout. If firstonly has been set, the loop terminates after the first sequence; otherwise it continues until the input stream is exhausted.
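
The shape of this loop, an iterator that returns true while items remain and hands back the current item through a pointer argument, can be shown without the EMBOSS libraries. In the sketch below, DemoOSeqall, demo_seqall_next and demo_copy are hypothetical names invented for the example, and plain strings stand in for sequence objects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical input stream: an array of names plus a read position */
typedef struct DemoSSeqall
{
    const char *const *names;
    size_t count;
    size_t next;
} DemoOSeqall;

/* Hypothetical iterator, not an EMBOSS function.
** Sets *current to the next item and returns 1, or returns 0 at the
** end of the stream, leaving *current unchanged. */
int demo_seqall_next(DemoOSeqall *seqall, const char **current)
{
    if(seqall->next >= seqall->count)
        return 0;

    *current = seqall->names[seqall->next++];

    return 1;
}

/* Mirrors the seqret loop: visits each item in turn, stopping after
** the first one if firstonly is set. Returns the number of items
** visited. */
size_t demo_copy(DemoOSeqall *seqall, int firstonly)
{
    const char *current = NULL;
    size_t written = 0;

    while(demo_seqall_next(seqall, &current))
    {
        written++;          /* stands in for writing the sequence */
        if(firstonly)
            break;
    }

    return written;
}
```

Passing the address of current lets the iterator update the caller's pointer while keeping the return value free to signal whether another item was available.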

9.2.5. Exiting Cleanly

The output stream is closed by calling ajSeqoutClose(seqout). The application then terminates cleanly by calling embExit() before returning 0 to the operating system:

    ajSeqoutClose(seqout);
    embExit();

    return 0;
}