The application C source code (see below) is very simple. Its basic functions are:
Declare variables for holding values from the ACD file (AjP* type declarations)
Process the ACD file and command line (embInit)
Read the values (input and output sequence streams) from the ACD file into memory (ajAcdGet* family of functions)
Iterate through the input sequences and load a sequence into memory (ajSeqallNext)
Write the sequence out (ajSeqoutWriteSeq)
Close the output file (ajSeqoutClose)
Exit cleanly (embExit)
The source code is shown below:
/* @source seqret application
**
** Return a sequence
**
** @author Copyright (C) Peter Rice
** @@
**
** This program is free software; you can redistribute it and/or
** modify it under the terms of the GNU General Public License
** as published by the Free Software Foundation; either version 2
** of the License, or (at your option) any later version.
**
** This program is distributed in the hope that it will be useful,
** but WITHOUT ANY WARRANTY; without even the implied warranty of
** MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
** GNU General Public License for more details.
**
** You should have received a copy of the GNU General Public License
** along with this program; if not, write to the Free Software
** Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
******************************************************************************/

#include "emboss.h"

/* @prog seqret ***************************************************************
**
** Reads and writes (returns) sequences
**
******************************************************************************/

int main(int argc, char **argv)
{
    AjPSeqall seqall;
    AjPSeqout seqout;
    AjPSeq seq = NULL;
    AjBool firstonly;

    embInit("seqret", argc, argv);

    seqout = ajAcdGetSeqoutall("outseq");
    seqall = ajAcdGetSeqall("sequence");

    firstonly = ajAcdGetBoolean("firstonly");

    while(ajSeqallNext(seqall, &seq))
    {
        ajSeqoutWriteSeq(seqout, seq);
        if(firstonly)
            break;
    }

    ajSeqoutClose(seqout);

    embExit();

    return 0;
}
The first block of code in main() declares variables for holding values from the ACD file:
    AjPSeqall seqall = NULL;
    AjPSeqout seqout = NULL;
    AjPSeq seq = NULL;
    AjBool firstonly;
The variables whose types begin with AjP are all C pointers to EMBOSS objects (C data structures for the corresponding types). These types include AjPSeqall, AjPSeqout and AjPSeq. Many EMBOSS objects, for both complex biological and other types, are defined in the AJAX and NUCLEUS programming libraries. AjBool, in contrast, is the EMBOSS datatype for a simple Boolean variable. You'll notice that the pointer variables are initialised to NULL. It is good practice to always initialise pointers to NULL where they're first defined. For an explanation of why, and detailed information on programming with objects, see Section 5.3, “Objects (C Data Structures)”.
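One general C reason for the NULL habit can be sketched as follows. A function that receives the address of a pointer can test it against NULL to decide whether an object already exists; an uninitialised pointer holds an arbitrary value and defeats that test. The DemoSeq type and the demoSeqAssign and demoSeqDel functions are hypothetical stand-ins written for this illustration. They are not part of AJAX or NUCLEUS and make no claim about how the EMBOSS libraries are implemented.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an EMBOSS-style object; not an AJAX type. */
typedef struct DemoSeq
{
    char name[32];
} DemoSeq;

typedef DemoSeq *DemoPSeq;

/* Store a name in the object, allocating it first if the caller's
** pointer is still NULL. Returns 1 on success, 0 if allocation fails. */
int demoSeqAssign(DemoPSeq *pseq, const char *name)
{
    if (*pseq == NULL)
    {
        *pseq = calloc(1, sizeof(DemoSeq));
        if (*pseq == NULL)
            return 0;
    }

    /* Copy at most 31 characters and always terminate the string. */
    strncpy((*pseq)->name, name, sizeof((*pseq)->name) - 1);
    (*pseq)->name[sizeof((*pseq)->name) - 1] = '\0';

    return 1;
}

/* Free the object and reset the caller's pointer to NULL, so that a
** second call (or a later demoSeqAssign) is safe. */
void demoSeqDel(DemoPSeq *pseq)
{
    if (pseq == NULL || *pseq == NULL)
        return;

    free(*pseq);
    *pseq = NULL;
}
```

A caller would declare DemoPSeq seq = NULL; and pass &seq to both functions. Without the initialisation, the first NULL test in demoSeqAssign would read an indeterminate value.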
AjPSeq is for single sequence input, AjPSeqall for multiple sequence input and AjPSeqout for sequence output. For the seqret application, AjPSeqall seqall is used for the input sequence stream, AjPSeq seq is used to hold the data for a single sequence from that stream and AjPSeqout seqout is used for the output sequence stream. AjBool firstonly is used to hold the value of the firstonly control attribute from the ACD file.
You'll notice that no variable is required for the feature data definition. The value for this option is set on the input sequence stream, within the ACD file itself, by the ACD code features: "$(feature)". In other words, whether feature information is included with the sequences is set within the ACD file and stored within the seqall object, so no additional variable is needed for it in the C code.
The code:
embInit("seqret", argc, argv);
embInit is used to process the ACD file and command line. It handles all of the user input processing, which is why it's called first. embInit reads in local database definitions, finds the right ACD file to use (the first argument is "seqret" so it looks for seqret.acd in the ACD directory), reads the ACD file and processes the command line using argc and argv from main.
By the time embInit returns, the input sequence stream (sequence) will be opened for reading and the first sequence read into memory, the boolean variables feature and firstonly will have received values (possibly the default), and an output file will be opened for outseq. Memory is allocated for these objects and is available for use by the program.
embInit also prompts the user for any values that are not entered on the command line, and re-prompts for values that are out of range.
To retrieve C pointers to these data items the following code is used:
    seqout = ajAcdGetSeqoutall("outseq");
    seqall = ajAcdGetSeqall("sequence");
Similarly, to retrieve the value of the simple data type variable firstonly:
firstonly = ajAcdGetBoolean("firstonly");
You can see that the argument to the ajAcdGet* functions is the name of the ACD definition which is to be retrieved.
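The idea of retrieving a value by the name of its definition can be illustrated with a small, self-contained sketch. It is only an analogy: DemoAcdItem, demoItems and demoAcdGetBoolean are hypothetical names invented here, the stored values are arbitrary, and the real ajAcdGet* functions work on the parsed ACD file rather than a fixed table.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record pairing a definition name with a boolean value. */
typedef struct DemoAcdItem
{
    const char *name;
    int value;          /* 1 for true, 0 for false */
} DemoAcdItem;

/* Assumed values, standing in for what processing the ACD file and
** command line might have produced. */
static const DemoAcdItem demoItems[] =
{
    {"feature", 0},
    {"firstonly", 1}
};

/* Look up a boolean by definition name. Sets *found to 1 and returns
** the stored value if the name exists; otherwise sets *found to 0 and
** returns 0. */
int demoAcdGetBoolean(const char *name, int *found)
{
    size_t i;
    size_t n = sizeof(demoItems) / sizeof(demoItems[0]);

    for (i = 0; i < n; i++)
    {
        if (strcmp(demoItems[i].name, name) == 0)
        {
            *found = 1;
            return demoItems[i].value;
        }
    }

    *found = 0;
    return 0;
}
```

The point of the sketch is that the string passed by the caller must match the name used in the definition exactly, just as "firstonly" in the C code must match the name in seqret.acd.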
To iterate through the input sequences and load a sequence into memory the following code is used:
    while(ajSeqallNext(seqall, &seq))
    {
        ajSeqoutWriteSeq(seqout, seq);
        if(firstonly)
            break;
    }
ajSeqallNext is called in a loop to retrieve consecutive sequences in turn from the input stream. The second argument (&seq) is the address of the seq pointer, which ajSeqallNext sets to point to the current sequence in the stream. ajSeqoutWriteSeq is used to write this sequence to the output stream seqout. The loop will terminate after the first sequence if firstonly has been set.
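The shape of this loop, a "next" function that returns true while items remain and hands each one back through a pointer argument, can be sketched without the EMBOSS libraries. Everything here is hypothetical and written for illustration: DemoSeqall, demoSeqallNext and demoCopy are not AJAX names, and plain strings stand in for sequence objects.

```c
#include <stddef.h>

/* Hypothetical input stream over an array of strings. */
typedef struct DemoSeqall
{
    const char **items;   /* the "sequences" in the stream */
    size_t count;         /* how many there are */
    size_t pos;           /* index of the next one to return */
} DemoSeqall;

/* Set *item to the next entry and return 1, or return 0 when the
** stream is exhausted (leaving *item unchanged). */
int demoSeqallNext(DemoSeqall *all, const char **item)
{
    if (all->pos >= all->count)
        return 0;

    *item = all->items[all->pos++];

    return 1;
}

/* Copy entries from the stream into out, stopping after the first one
** if firstonly is set. Returns the number of entries copied. */
size_t demoCopy(DemoSeqall *all, int firstonly,
                const char **out, size_t outmax)
{
    const char *item = NULL;
    size_t n = 0;

    /* The capacity test comes first so that no entry is consumed
    ** from the stream once out is full. */
    while (n < outmax && demoSeqallNext(all, &item))
    {
        out[n++] = item;
        if (firstonly)
            break;
    }

    return n;
}
```

Because the caller passes &item, the function can change which entry the caller's pointer refers to on every pass through the loop, which is the same reason seqret passes &seq rather than seq.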