6.9. Handling Features

6.9.1. Introduction

A feature is a region of interest in a molecular sequence. Features include things like restriction enzyme cut sites, protein secondary structure prediction states, exon positions, regions of motif matches and so on. EMBOSS supports, for input and output, most of the common sequence feature formats (see the EMBOSS Users Guide) that were developed for the major sequence databases and for input of features into the genome databases. The name of a features file and the format of the features in the file are specified on the command line using a Uniform Feature Object (UFO) (see the EMBOSS Users Guide).

Many applications do not read and write features directly. Features are also used to store the results of sequence analysis, and can be written out as 'reports' where a report format defined through ACD is used to write out a feature table and (for some formats) the original sequence (see Section 6.15, “Handling Application Reports”).

Features are annotations of simple ranges in a sequence (start and end) or of a numbered group of features which have a 'join' (to combine exons in a coding sequence) or some other combination ('group' or 'one-of' in the EMBL/GenBank feature table). These complex features are stored as a parent feature with a set of simpler features for each component. Currently these are stored in the same feature table. In a future release these may become subfeatures to simplify sorting operations.
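The flat storage of a parent and its components can be pictured with a small plain-C model. This is only a sketch: it does not use the AJAX library, and DemoPart, its fields and demo_group_length are hypothetical names chosen for illustration. It assumes that each component of a join carries the same group number as its parent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one row in a flat feature table (not an AJAX type). */
typedef struct DemoPart {
    int start;          /* first position, 1-based */
    int end;            /* last position, inclusive */
    unsigned group;     /* group number shared by a parent and its parts */
    int is_child;       /* 1 for a component (e.g. an exon), 0 for the parent */
} DemoPart;

/* Sum the lengths of the child components that belong to one group. */
int demo_group_length(const DemoPart *table, size_t n, unsigned group)
{
    size_t i;
    int total = 0;

    assert(table != NULL || n == 0);

    for (i = 0; i < n; i++)
        if (table[i].is_child && table[i].group == group)
            total += table[i].end - table[i].start + 1;

    return total;
}
```

For a parent spanning 10..99 with child exons 10..29 and 60..99 in group 1, the function returns 60, the combined length of the two exons.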

The feature types need to be standardised to allow interconversion of formats. EMBOSS uses a set of data files installed in the share/EMBOSS/data/ directory to define types and tag names for each input/output format, and for internal use. The master internal naming files are Efeatures.emboss and Etags.emboss for nucleotide features, and Efeatures.protein and Etags.protein for protein features. These include the files for the major feature format definitions so that most feature types and tags (where there is no clash between formats) will be stored and returned unchanged. For any type or tag that does not appear in these files, the first name defined is used as a default ('misc_feature' for nucleotide type, 'polypeptide region' for protein type, 'note' for a tag name).
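The fallback rule (unknown names map to the first name defined) can be sketched in plain C. The vocabulary array and the function name demo_normalise_type are hypothetical; EMBOSS reads the real vocabulary from the data files named above, and its internal lookup is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory vocabulary. The first entry acts as the default. */
static const char *const demo_nuc_types[] = {
    "misc_feature", "CDS", "exon", "intron"
};

/* Return the vocabulary entry matching `type`, or the first entry
   (the default) when `type` is NULL or not in the vocabulary. */
const char *demo_normalise_type(const char *const *vocab, size_t n,
                                const char *type)
{
    size_t i;

    assert(vocab != NULL && n > 0);

    if (type != NULL)
        for (i = 0; i < n; i++)
            if (strcmp(vocab[i], type) == 0)
                return vocab[i];

    return vocab[0];
}
```

With this table, "CDS" is returned unchanged while an unlisted key such as "my_new_key" comes back as "misc_feature".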

A feature object is modelled on the GFF3 feature data format, where a feature is described by:

Start and end position
Name describing the feature
The strand direction (in a nucleic sequence)
A score

A feature object also holds data on:

A second start and end position for features where the start or end is wider than a single base or residue.
The source, which records the name of the program or database from which the feature was derived.
The feature type (feature key), using an internal name derived from the Sequence Ontology (SO) and defined in the Efeatures.emboss (nucleotide) or Efeatures.protein (protein) data files, which include all EMBL/GenBank and UniProt feature types.
A list of tag names and values. Tag names are defined in the Etags.emboss (nucleotide) or Etags.protein (protein) data files, which include all EMBL/GenBank and UniProt feature qualifiers.
Frame 1..3, -1..-3 for coding nucleotide features or 0 for non-coding or protein features.
Flag bit mask for EMBL location to record features between bases (11^12), types of join/order/one-of, and other attributes.
Group number for the individual exons and the parent of join/order/one-of feature locations in the EMBL/GenBank feature table.
Remote ID where the feature location (e.g. for an exon used by a join) is in another entry in the same database.
A label for the location of the feature in another entry.
Exon number.
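As a rough picture of the data listed above, the following plain-C struct collects the main fields in one place. It is a sketch only: DemoFeature, the two flag constants and the validity checks are hypothetical and do not reproduce the layout or flag values of the real AjPFeature object. The assumption that start is never greater than end is part of the model, not a documented AJAX rule.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; the real AJAX flag values are not shown here. */
enum {
    DEMO_FLAG_BETWEEN = 1 << 0,  /* location lies between two bases, e.g. 11^12 */
    DEMO_FLAG_CHILD   = 1 << 1   /* component of a join/order/one-of */
};

/* Hypothetical model of a feature (not the AJAX structure). */
typedef struct DemoFeature {
    const char *source;   /* program or database of origin */
    const char *type;     /* feature key, e.g. "CDS" */
    int start, end;       /* primary range, 1-based inclusive */
    int start2, end2;     /* second positions; 0 when unused (assumption) */
    float score;
    char strand;          /* '+' or '-' for nucleotide features */
    int frame;            /* 1..3 or -1..-3 if coding, otherwise 0 */
    unsigned flags;       /* bit mask of DEMO_FLAG_* values */
    unsigned group;       /* group number for join/order/one-of */
    unsigned exon;        /* exon number */
} DemoFeature;

/* Frame must lie in the documented range -3..3. */
bool demo_frame_ok(int frame)
{
    return frame >= -3 && frame <= 3;
}

/* Minimal sanity check on positions and frame. */
bool demo_feature_ok(const DemoFeature *f)
{
    return f != NULL && f->start >= 1 && f->start <= f->end
           && demo_frame_ok(f->frame);
}
```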

A feature table is simply a group of features and is stored in one of three contexts:

As part of a sequence file
As part of a database entry
As a raw feature table (a file that does not contain the sequence the features refer to)

Most feature table definitions have a controlled vocabulary, i.e. there is a specified list of feature key names and feature tag names that can be used. This means that you cannot edit feature tables to add in features with new keys. If you edit a feature table you must stick to the allowed set of feature keys.

'Named note' tags are a way to store feature tag names that are not in the allowed set. The default (note) feature tag is stored with a value that begins with '*name' followed by the value. This preserves the annotation in a readable form when features are written out using a standard format such as EMBL or GFF3.
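The encoding can be illustrated with two small helpers written in plain C. They are hypothetical (not AJAX functions) and assume that a single space separates the '*name' prefix from the value; the exact separator used by EMBOSS is not stated here, so treat that detail as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Build "*name value" in `out`. Returns false if it does not fit. */
bool demo_named_note_make(char *out, size_t outsize,
                          const char *name, const char *value)
{
    int n;

    assert(out != NULL && name != NULL && value != NULL);

    n = snprintf(out, outsize, "*%s %s", name, value);
    return n >= 0 && (size_t) n < outsize;
}

/* If `note` is a named note for `name`, return a pointer to its value
   part inside `note`; otherwise return NULL. */
const char *demo_named_note_value(const char *note, const char *name)
{
    size_t len;

    assert(note != NULL && name != NULL);

    len = strlen(name);
    if (note[0] != '*')
        return NULL;
    if (strncmp(note + 1, name, len) != 0)
        return NULL;
    if (note[1 + len] != ' ')      /* assumed separator */
        return NULL;

    return note + len + 2;
}
```

An ordinary note value that does not start with '*' is left alone by demo_named_note_value, so plain notes and named notes can share the same tag.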

Features to be read or written by an application are defined in the application ACD file, although it is possible to create feature tables directly if this is required.

A set of command line qualifiers are available for features. These allow you to set such things as file name and format and the region of the sequence containing the features of interest. These qualifiers may be "hard-coded" as attributes in the ACD file (see Section A.5.2.8, “features” and Section A.5.3.2, “featout”).

AJAX provides comprehensive functionality for handling features, including:

Reading and writing features directly, as an alternative to ACD processing
Retrieving or setting elements of the feature objects directly
Handling feature tags
Querying the properties of features and feature tables
Processing features and feature tables

6.9.2. AJAX Library Files

AJAX library files for handling features are listed in the table (Table 6.13, “AJAX Library Files for Handling Features”). Library file documentation, including a complete description of datatypes and functions, is available at:

http://emboss.open-bio.org/rel/dev/libs/

Table 6.13. AJAX Library Files for Handling Features
Library File Documentation	Description
ajfeat	General sequence feature handling
ajfeatdata	Feature datatypes

ajfeat.h/c. Most of the functions you will ever need for general feature handling. They also contain static functions for handling features at a low level. You are unlikely to need these unless you plan to implement code to extend the core functionality of the library.

ajfeatdata.h/c. Basic feature objects (AjPFeattable, AjPFeature and AjPFeattabOut) for general use, e.g. retrieving features via ACD file processing. It also defines a feature input object (AjPFeattabIn) used for low level feature input handling.

6.9.3. ACD Datatypes

A feature table (not individual features) may be specified for input or output in an ACD file.

The datatype for handling feature table input is:

features: Feature table input.

The datatype for handling feature table output is:

featout: Feature table output.

Features can also be read from an input sequence and written alongside an output sequence if the features: ACD attribute is set. If set then the sequence output will include feature information either in the same file (if the sequence format supports it) or in a separate file (by default in GFF format).

ACD datatypes for sequence input include:

sequence: Read a single sequence.
seqall: Read multiple sequences sequentially, one at a time.
seqset: Read multiple sequences as a single set.
seqsetall: Read multiple sequences as multiple sets.

ACD datatypes for sequence output include:

seqout: Write a single sequence.
seqoutset: Write multiple sequences as a single set.
seqoutall: Write multiple sequences sequentially, one at a time.

6.9.4. ACD Data Definition

A typical ACD definition for feature input:

features: features
[
    parameter: "Y"
    type:      "protein"
]

A typical ACD definition for feature output:

featout: outfeat
[
    parameter: "Y"
    type:      "protein"
    multiple:  "N" 
]

A typical ACD definition for sequence input with features:

# single input sequence
sequence: sequence  
[
    parameter: "Y"
    type:      "gapany"
    features:  "Y"
]

A typical ACD definition for sequence output with features:

# single sequence
seqout: outseq 
[
    parameter: "Y"
    type:      "gapany"
    features:  "Y"
]

The use of the sequence datatype is for illustrative purposes; the other sequence input and output types could also have been given.

6.9.4.1. Parameter Name

All data definitions for feature input and output should have standard parameter names. These include:

features for any feature inputs
outfeat for any outputs
Alternatives and variations (e.g. afeatures, bfeatures for multiple inputs) are allowed

For more information see Appendix A, ACD Syntax Reference.

6.9.4.2. Common Attributes

Attributes that are typically specified are summarised below. They are datatype-specific (Section A.5, “Datatype-specific Attributes”) unless they are indicated as being global attributes (Section A.4, “Global Attributes”).

parameter: Features are typically the primary input or output of an EMBOSS application and, as such, should be defined as parameters by using the global attribute parameter: "Y".

type: Specifies the type of the sequence (protein or nucleotide) to which the features pertain and is used for validation purposes.

multiple: A boolean attribute that can be set for a featout data definition to specify the feature annotation is for multiple sequences.

offormat: GFF format is used by default for the output feature(s). The format is normally set at the command line but a default may be hardcoded with offormat:. All common feature formats are supported (see the EMBOSS Users Guide).

6.9.5. AJAX Datatypes

For handling feature tables, including input feature tables defined in the ACD file, use:

AjPFeattable: Feature table which includes a list of AjPFeature objects (for the features ACD datatype).

For handling feature table outputs defined in the ACD file use:

AjPFeattabOut: Feature table output (for the featout ACD datatype).

There is also a basic object for handling individual features:

AjPFeature: Biological feature.

There is a datatype for low level feature input beyond that provided by the static datatypes in the various library files:

AjPFeattabIn: Low level feature table input.

You are unlikely to need AjPFeattabIn unless you plan to implement code to support new feature formats for EMBOSS. For advice on how to do this ask the EMBOSS developers.

All sequence objects can include a feature table. On input through an ACD datatype features will be read if the features: attribute is true in the ACD definition.

In developing applications, feature tables are most likely to be used as part of a report output. A sequence is read, analysis results are generated as features, and both are output as a report format (see Section 6.15, “Handling Application Reports”).

6.9.6. ACD File Handling

Datatypes and functions for handling features via the ACD file are shown below (Table 6.14, “Datatypes and Functions for Feature Input and Output”).

Table 6.14. Datatypes and Functions for Feature Input and Output
	To read features	To write features
ACD datatype	`features`	`featout`
AJAX datatype	`AjPFeattable`	`AjPFeattabOut`
To retrieve from ACD	`ajAcdGetFeatures`	`ajAcdGetFeatout`

Your application code will call embInit to process the ACD file and command line (see Section 6.3, “Handling ACD Files”). All values from the ACD file are read into memory and files are opened as necessary. You have a handle on the files and memory through the ajAcdGet* family of functions which return pointers to appropriate objects.

6.9.6.1. Input Feature Retrieval

To retrieve input features an object pointer is declared and then initialised using ajAcdGetFeatures:

    AjPFeattable features=NULL;

    features = ajAcdGetFeatures("features");

6.9.6.2. Output Feature Retrieval

To retrieve an output feature stream an object pointer is declared and initialised using ajAcdGetFeatout:

    AjPFeattabOut outfeat=NULL;

    outfeat = ajAcdGetFeatout("outfeat");

6.9.6.3. Processing Command line Options and ACD Attributes

6.9.6.3.1. Setting the Features Range

The features input datatype has various inbuilt command line qualifiers (see ) including -fbegin and -fend which specify a start and end position for the features, and -freverse to reverse the orientation of nucleotide features.

When a feature table is read the feature values are held in the appropriate feature table object. Regardless of the range values you still get the entire table loaded into memory. The functions ajFeattableTrim or ajFeattableTrimOff are used to trim the features to the region defined by -fbegin and -fend:

AjPFeattable ftable=NULL;

ftable = ajAcdGetFeatures("features");

ajFeattableTrim(ftable);
/* ajFeattableTrimOff(ftable,0,ajFeattableGetLen(ftable));*/

When a sequence is read the feature values are held in a feature table in the appropriate sequence object. Regardless of the sequence range values you still get the entire sequence loaded into memory. The function ajSeqTrim (or ajSeqsetTrim for an AjPSeqset object) is used to trim the sequence and features to the region defined by -sbegin and -send:

AjPSeq seq=NULL;

seq = ajAcdGetSeq("sequence");

ajSeqTrim(seq);

6.9.6.4. Memory Management

It is your responsibility to free up memory at the end of the program. You must call the default destructor function (see below) on any objects returned by calls to ajAcdGet*.

Additionally you must call embExit to clean up internal memory including that allocated for the housekeeping of feature tables:

embExit();

6.9.7. Memory Management

6.9.7.1. Default Object Construction

To use a feature table object that is not defined in the ACD file you must first instantiate the appropriate object pointer. The default construction function ajFeattableNew is provided for this. ajFeattableNew leaves the type of feature table uninitialised; the type is set when a feature is added to the table. The type-specific functions ajFeattableNewDna and ajFeattableNewProt create feature tables for nucleotide and protein features respectively. The function ajFeattableNewSeq creates a feature table with the type and length matching a sequence object.

Feature table output objects are typically loaded from ACD file processing (see above). In the unlikely event you need to create one manually you can use the default constructor ajFeattabOutNew, or the functions ajFeattabOutNewCSF or ajFeattabOutNewSSF, which use an existing output file and a specified type and sequence name.

To create a feature object (usually to be stored within a feature table), a similar set of functions is available. ajFeatNew creates a feature with all attributes including the type. ajFeatNewProt creates a feature with all attributes required by protein features (no strand or frame). ajFeatNewII and ajFeatNewIIRev are generic constructors requiring only the start and end values. The feature type will default to a "miscellaneous feature" or "polypeptide region" value. ajFeatNewFeat is a copy constructor making a feature object from an existing feature.

/* Feature Object */
AjPFeature  ajFeatNew (AjPFeattable thys,
                       const AjPStr source,
                       const AjPStr type,
                       ajint Start, 
                       ajint End, 
                       float score,
                       char strand, 
                       ajint frame);

AjPFeature  ajFeatNewProt(AjPFeattable thys,
                          const AjPStr source,
                          const AjPStr type,
                          ajint Start, ajint End,
                          float score);

AjPFeature  ajFeatNewII (AjPFeattable thys,
                         ajint Start, ajint End);

AjPFeature  ajFeatNewIIRev (AjPFeattable thys,
                           ajint Start, ajint End);

AjPFeature  ajFeatNewFeat (const AjPFeature orig);

/* General Feature Table Object */
AjPFeattable  ajFeattableNew (const AjPStr name);
AjPFeattable  ajFeattableNewDna (const AjPStr name);
AjPFeattable  ajFeattableNewProt (const AjPStr name);
AjPFeattable  ajFeattableNewSeq (const AjPSeq seq);

/* Output Feature Table Object */
AjPFeattabOut  ajFeattabOutNew (void);
AjPFeattabOut  ajFeattabOutNewCSF (const char* fmt, const AjPStr name,
				   const char* type, AjPFile file);
AjPFeattabOut  ajFeattabOutNewSSF (const AjPStr fmt, const AjPStr name,
				   const char* type, AjPFile file);

The parameters to ajFeatNew are as follows:

thys: Pointer to the feature table to which the new feature is added
source: Analysis basis for feature
type: Type of feature (e.g. exon)
Start: Start position of the feature
End: End position of the feature
score: Analysis score for the feature
strand: Strand of the feature
frame: Frame of the feature

All constructors return the address of a new object. The pointers do not need to be initialised to NULL but it is good practice to do so.

6.9.7.2. Default Object Destruction

You must free the memory for an object once you are finished with it. The functions are:

void  ajFeatDel (AjPFeature *pthis);
void  ajFeattableDel (AjPFeattable *pthis) ;
void  ajFeattabOutDel (AjPFeattabOut* pthis);

6.9.7.3. Example

In the example below, a feature table and individual features are built manually using the default constructor functions. The features are written out to a feature table retrieved from ACD processing:

    AjPFeattable feattable;
    AjPStr name   = NULL;
    AjPStr source = NULL;
    AjPStr type   = NULL;
    char strand   = '+';
    ajint frame   = 0;
    AjPFeature feature = NULL;
    AjPFeattabOut output = NULL;
    ajint i;
    float score = 0.0;

    embInit("demofeatures", argc, argv);

    output      = ajAcdGetFeatout("outfeat");

    ajStrAssignC(&name,"seq1");

    feattable = ajFeattableNew(name);

    ajStrAssignC(&source,"demofeature");
    score = 1.0;

    for(i=1;i<11;i++)
    {
        if(i & 1)
            ajStrAssignC(&type,"CDS");
        else
            ajStrAssignC(&type,"misc_feature");

        feature = ajFeatNew(feattable, source, type, i, i+10, score, strand,
                            frame);
    }

    ajFeattableWrite(output, feattable);

    ajStrDel(&source);
    ajStrDel(&name);
    ajStrDel(&type);
    ajFeattableDel(&feattable);
    ajFeattabOutDel(&output);

    embExit();

6.9.7.4. Alternative Object Construction and Loading

6.9.7.4.1. Single feature

There are a variety of alternative ways to create a feature object. The start and end position of the features may be specified:

AjPFeature  ajFeatNewII (AjPFeattable thys, ajint Start, ajint End);
AjPFeature  ajFeatNewIIRev (AjPFeattable thys, ajint Start, ajint End);
AjPFeature  ajFeatNewProt (AjPFeattable thys,
                           const AjPStr source, const AjPStr type,
                           ajint Start, ajint End, float score);

ajFeatNewIIRev sets features to be on the reverse strand whereas ajFeatNewProt is for protein features.

For cases where a copy of a feature is required that can be safely changed and/or deleted you can use ajFeatNewFeat:

AjPFeature  ajFeatNewFeat (const AjPFeature orig);

6.9.7.4.2. Feature table

There are a variety of alternative ways to create a feature table object, either by name or from an existing sequence object:

/* DNA feature table     */
AjPFeattable  ajFeattableNewDna (const AjPStr name);

/* Protein feature table */
AjPFeattable  ajFeattableNewProt (const AjPStr name);

/* From existing sequence; type is determined by the sequence type. */
AjPFeattable  ajFeattableNewSeq (const AjPSeq seq);

For cases where a copy of a feature table is required that can be safely changed and/or deleted you can use:

/* Copy whole feature table */      
AjPFeattable  ajFeattableNewFtable (const AjPFeattable orig);

/* Copy limited number of features */
AjPFeattable  ajFeattableNewFtableLimit (const AjPFeattable orig, ajint limit);

A feature table may be retrieved from a sequence object using these functions (defined in ajseq.c):

AjPFeattable  ajSeqGetFeatCopy (const AjPSeq thys);
const AjPFeattable  ajSeqGetFeat (const AjPSeq thys);

6.9.7.4.3. Editing a feature table

To add a new feature (AjPFeature) to a feature table (AjPFeattable) call:

void  ajFeattableAdd (AjPFeattable thys, AjPFeature feature);

To clear a feature table of all features call:

void  ajFeattableClear (AjPFeattable thys);

To clear an output feature table of all features call:

void  ajFeattabOutClear (AjPFeattabOut *thys);

6.9.8. Reading Features

Features may be read directly, using feature table input objects (these are the functions used by ACD processing):

AjPFeattable  ajFeattableNewRead  (AjPFeattabIn ftin);
AjPFeattable  ajFeattableNewReadUfo (AjPFeattabIn tabin,
                                     const AjPStr Ufo);

ajFeattableNewReadUfo will parse a UFO, open an input file and read a feature table. ajFeattableNewRead is a generic interface function for reading in features from a feature table input object.

6.9.9. Writing Features

Features may be written directly i.e. without using ACD processing (which uses either GFF3 format by default or the format defined by the environment variable EMBOSS_OUTFEATFORMAT). Features may be written in any format defined in the data structure FeatOOutFormat defined in ajfeat.c:

AjBool  ajFeatOutFormatDefault (AjPStr* pformat);

AjBool  ajFeattableWriteUfo (AjPFeattabOut tabout, const AjPFeattable thys,
                             const AjPStr Ufo);

AjBool  ajFeattableWrite (const AjPFeattable ft, AjPFeattabOut ftout);

AjBool  ajFeattableWrite (AjPFeattable thys, const AjPStr ufo);

ajFeatOutFormatDefault sets the default output format which is "gff" unless the EMBOSS_OUTFEATFORMAT variable is defined or a format is passed in the pformat parameter.
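The choice of a default format can be sketched as a simple precedence rule in plain C. The function names are hypothetical and the order of precedence (an explicitly passed format first, then the environment variable, then "gff") is an assumption made for this illustration; it is not taken from the ajFeatOutFormatDefault source.

```c
#include <stdlib.h>

/* Pick an output format name. A non-empty `requested` value wins, then a
   non-empty environment value, then the built-in default "gff". */
const char *demo_out_format(const char *requested, const char *env_value)
{
    if (requested != NULL && requested[0] != '\0')
        return requested;

    if (env_value != NULL && env_value[0] != '\0')
        return env_value;

    return "gff";
}

/* Convenience wrapper that reads EMBOSS_OUTFEATFORMAT from the environment. */
const char *demo_out_format_env(const char *requested)
{
    return demo_out_format(requested, getenv("EMBOSS_OUTFEATFORMAT"));
}
```

Keeping the environment lookup in a separate wrapper makes the precedence rule easy to test without changing the process environment.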

ajFeattableWriteUfo and ajFeattableWrite are the writing counterparts of ajFeattableNewReadUfo and ajFeattableNewRead respectively. ajFeattableWriteUfo will parse a UFO, open an output file and write a feature table to it. ajFeattableWrite is a generic interface function for writing features to a file given the file handle, class of map, data format of output and possibly other associated data.

The following functions write the feature table in the indicated format:

/* DDBJ format      */
AjBool  ajFeattableWriteDdbj (const AjPFeattable features,  AjPFile file);   

/* EMBL format      */
AjBool  ajFeattableWriteEmbl (const AjPFeattable features, AjPFile file);    

/* Genbank format   */
AjBool  ajFeattableWriteGenbank (const AjPFeattable features, AjPFile file); 

/* GFF2 format      */
AjBool  ajFeattableWriteGff2 (const AjPFeattable features, AjPFile file);    

/* GFF3 format      */
AjBool  ajFeattableWriteGff3 (const AjPFeattable features, AjPFile file);    

/* PIR format       */ 
AjBool  ajFeattableWritePir (const AjPFeattable features, AjPFile file);     

/* SwissProt format */
AjBool  ajFeattableWriteSwiss (const AjPFeattable features, AjPFile file);

Feature tables may be written to an application report using the following functions defined in ajreport.h/c:

void  ajReportSetType (AjPReport thys, const AjPFeattable ftable, const AjPSeq seq);
AjBool  ajReportWrite (AjPReport thys,  const AjPFeattable ftable,  const AjPSeq seq);
void  ajReportWriteHeader (AjPReport thys, const AjPFeattable ftable, const AjPSeq seq);
void  ajReportWriteTail (AjPReport thys, const AjPFeattable ftable, const AjPSeq seq);

For more information on reports, see Section 6.15, “Handling Application Reports”.

6.9.10. Output Feature Table Functions

Functions described here are for manipulating an output feature table object.

To open the output file call:

AjBool  ajFeattabOutOpen (AjPFeattabOut thys, const AjPStr ufo);

Elements of the output feature table object may be retrieved and queried using:

AjPFile  ajFeattabOutFile (const AjPFeattabOut thys);
AjPStr  ajFeattabOutFilename (const AjPFeattabOut thys);

/* These functions are used internally to test whether the output file 
   has been opened and used */
AjBool  ajFeattabOutIsLocal (const AjPFeattabOut thys);
AjBool  ajFeattabOutIsOpen (const AjPFeattabOut thys);

Elements of an output feature table are set with:

/* sets the UFO (format and filename) for feature output */
AjBool  ajFeattabOutSet (AjPFeattabOut thys, const AjPStr ufo);

/* sets the base file name (.format) for feature output */
void  ajFeattabOutSetBasename (AjPFeattabOut thys, const AjPStr basename);

/* sets the feature table type 'any', 'N' 'nucleotide' or 'P' 'protein' */
AjBool  ajFeattabOutSetType (AjPFeattabOut thys, const AjPStr type);
AjBool  ajFeattabOutSetTypeC (AjPFeattabOut thys, const char* type);

6.9.11. Retrieving Elements of a Feature Object

The elements of a feature object may be retrieved using the following:

/* End position */
ajuint  ajFeatGetEnd (const AjPFeature thys);

/* Direction (ajTrue for a forward direction, ajFalse for reverse) */
AjBool  ajFeatGetForward (const AjPFeature thys); 

/* Reading frame */
ajint  ajFeatGetFrame (const AjPFeature thys);   

/* Sequence length */
ajuint  ajFeatGetLength(const AjPFeature thys);   

/* Finds a named note tag (a general tag value with a *name prefix) */
AjBool  ajFeatGetNoteS (const AjPFeature thys,  AjPStr* val, const AjPStr name);   
AjBool  ajFeatGetNoteSI (const AjPFeature thys, AjPStr* val, const AjPStr name, ajint count);
AjBool  ajFeatGetNoteC (const AjPFeature thys,  AjPStr* val, const char* name);    
AjBool  ajFeatGetNoteCI (const AjPFeature thys, AjPStr* val, const char* name, ajint count);   

/* Returns the nth value of a named feature tag. If not found as a tag, also searches for a named note tag */
AjBool  ajFeatGetTagC (const AjPFeature thys, const char* tname, ajint num,
                       AjPStr* Pval);
AjBool  ajFeatGetTagS (const AjPFeature thys, const AjPStr name, ajint num,
                       AjPStr* val);   

/* Score */
float  ajFeatGetScore (const AjPFeature thys);  

/* Source name */
const AjPStr  ajFeatGetSource (const AjPFeature thys);

/* Start position */
ajuint  ajFeatGetStart (const AjPFeature thys);  

/* Strand */
char  ajFeatGetStrand (const AjPFeature thys); 

/* Returns the type (key) of a feature object. */
const AjPStr  ajFeatGetType (const AjPFeature thys);

Caution

Note that ajFeatGetType returns a copy of the pointer to the type (key) of the specified feature object. The key is still owned by the feature and should not be destroyed!

6.9.12. Retrieving Elements of a Feature Table Object

The elements of a feature table object may be retrieved using the following:

/* Returns the feature table start position, or 1 if no start has been set. */
ajint  ajFeattableGetBegin (const AjPFeattable thys);            

/* Returns the feature table end position, or the feature table length if no end has been set. */
ajint  ajFeattableGetEnd (const AjPFeattable thys);        

/* Returns the name of a feature table object. */ 
const AjPStr  ajFeattableGetName (const AjPFeattable thys);       

/* Returns the type of a feature table object. */
const char*   ajFeattableGetTypeC (const AjPFeattable thys);      
const AjPStr  ajFeattableGetTypeS (const AjPFeattable thys);       

/* Returns the sequence length of a feature table */
ajint  ajFeattableGetLen (const AjPFeattable thys);         

/* Returns the number of features */
ajint  ajFeattableGetSize (const AjPFeattable thys);

Caution

ajFeattableGetName, ajFeattableGetTypeC and ajFeattableGetTypeS return a copy of the pointer to the name or type (key). This is still owned by the feature table and so should not be destroyed.

6.9.13. Setting Elements of a Feature Object

The elements (indicated in comments) of a feature object may be set using the following:

/* Description */
void  ajFeatSetDesc (AjPFeature thys, const AjPStr desc);        

/* Append to description */
void  ajFeatSetDescApp (AjPFeature thys, const AjPStr desc);     

/* Score */
void  ajFeatSetScore (AjPFeature thys, float score);             

/* Strand */
void  ajFeatSetStrand (AjPFeature thys, AjBool rev);

ajFeattableSetDefname provides a feature table with a name that is unique for the current program run.

6.9.14. Setting Elements of a Feature Table Object

The elements of a feature table object may be set using the following:

/* Name */
void  ajFeattableSetDefname (AjPFeattable thys, const AjPStr setname);      

/* Sequence length */
void  ajFeattableSetLength (AjPFeattable thys, ajuint len);

/* Type to nucleotide */
void  ajFeattableSetNuc (AjPFeattable thys);                               

/* Type to protein */
void  ajFeattableSetProt (AjPFeattable thys);                              

/* Begin and end range  */
void  ajFeattableSetRange (AjPFeattable thys, ajint fbegin, ajint fend) ;

6.9.15. Functions for Handling Feature Tags

Feature tags (names and values) are stored as pairs in a list. Tags can be returned as arrays or iterated over. When values are added (annotating a feature) they are usually simply defined as a name and value pair which replaces any existing value with the same tag name. Some tag names allow multiple values (for example the EMBL/Genbank feature 'note' used for general annotation and for 'named note tags' with a '*name' prefix to the value). These can be added as extra values using the ajFeatTagAdd functions. The tag values are validated against the most recent EMBL/GenBank feature table documentation. Warning messages are generated if variable EMBOSS_FEATWARN is set true, but turned off by default to avoid excessive warnings on data from other sources. Functions for handling feature tags include:

/* Sets a feature tag value, creating a new feature tag even if one already exists. */
AjBool  ajFeatTagAdd (AjPFeature thys, const AjPStr tag, const AjPStr value);
AjBool  ajFeatTagAddC (AjPFeature thys, const char* tag, const AjPStr value);
AjBool  ajFeatTagAddCC (AjPFeature thys, const char* tag, const char* value);

/* Sets a feature tag value */
AjBool  ajFeatTagSet (AjPFeature thys,  const AjPStr tag, const AjPStr value);
AjBool  ajFeatTagSetC (AjPFeature thys, const char* tag, const AjPStr value);

/* Returns an iterator over all feature tag-value pairs */
AjIList  ajFeatTagIter (const AjPFeature thys);

/* Returns the tag-value pairs of a feature object */
AjBool  ajFeatTagval (AjIList iter, AjPStr* tagnam, AjPStr* tagval);

/* Traces (to the debug file) the tag-value pairs of a feature object */
void  ajFeatTagTrace (const AjPFeature thys);
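The difference between setting and adding a tag value can be shown with a toy tag list in plain C. This is a sketch of the behaviour described above, not of the AJAX implementation: DemoTags, demo_tag_set and demo_tag_add are hypothetical, the list has a fixed capacity, and strings are borrowed rather than copied.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define DEMO_MAX_TAGS 16

typedef struct DemoTag {
    const char *name;
    const char *value;
} DemoTag;

typedef struct DemoTags {
    DemoTag item[DEMO_MAX_TAGS];
    int count;
} DemoTags;

/* Always append a new name/value pair, even if the name is already present. */
bool demo_tag_add(DemoTags *t, const char *name, const char *value)
{
    assert(t != NULL && name != NULL && value != NULL);

    if (t->count >= DEMO_MAX_TAGS)
        return false;

    t->item[t->count].name = name;
    t->item[t->count].value = value;
    t->count++;
    return true;
}

/* Replace the value of the first pair with this name, or append if absent. */
bool demo_tag_set(DemoTags *t, const char *name, const char *value)
{
    int i;

    assert(t != NULL && name != NULL && value != NULL);

    for (i = 0; i < t->count; i++) {
        if (strcmp(t->item[i].name, name) == 0) {
            t->item[i].value = value;
            return true;
        }
    }
    return demo_tag_add(t, name, value);
}
```

Setting "gene" twice leaves one pair holding the second value, whereas adding "note" twice leaves two separate pairs.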

Functions for handling the FeattabIn object are available but not covered here as you will not normally need to use this object.

6.9.16. Querying Properties of Features

Functions are available to examine complex feature locations to process joins and their child (exon) features. Functions are provided to test whether a feature has a remote id defined (the feature refers to another sequence) and to test whether the base range is within the range required for processing or for output. The properties of features may be queried using the following:

/* Tests whether the feature is a child member of a join */
AjBool  ajFeatIsChild (const AjPFeature gf);  

/* Tests whether the feature is a member of a complement around a multiple (join, etc.) */
AjBool  ajFeatIsCompMult (const AjPFeature gf); 

/* Tests whether the feature is a member of a join, group, order or one_of */
AjBool  ajFeatIsMultiple (const AjPFeature gf); 

/* Tests whether the feature is local to the sequence (ajFalse if it is in another, remote id, sequence) */
AjBool  ajFeatIsLocal (const AjPFeature gf); 

/* ... and tests the location is within a specified range */
AjBool  ajFeatIsLocalRange (const AjPFeature gf, ajuint start, ajuint end);

6.9.17. Querying Properties of Feature Tables

The type (nucleotide or protein) of a feature table may be queried using:

/* Returns ajTrue if nucleotide */
AjBool  ajFeattableIsNuc (const AjPFeattable thys);  

/* Returns ajTrue if protein */
AjBool  ajFeattableIsProt (const AjPFeattable thys);

6.9.18. Processing Features

There are a couple of functions for processing features:

void  ajFeatReverse (AjPFeature thys, ajint ilen) ;           
AjBool  ajFeatTrimOffRange (AjPFeature ft, ajuint ioffset,
                            ajuint begin, ajuint end,
                            AjBool dobegin, AjBool doend);

ajFeatReverse will reverse a feature by reversing all positions and strand data.
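The coordinate arithmetic behind a reversal can be sketched in plain C. DemoRange and demo_reverse are hypothetical names, and the model assumes 1-based inclusive positions with start no greater than end; it illustrates the idea rather than reproducing ajFeatReverse.

```c
#include <assert.h>
#include <stddef.h>

typedef struct DemoRange {
    int start;      /* 1-based, inclusive */
    int end;        /* 1-based, inclusive */
    char strand;    /* '+' or '-' */
} DemoRange;

/* Map a range onto the reverse complement of a sequence of length seqlen
   and flip its strand. */
void demo_reverse(DemoRange *r, int seqlen)
{
    int old_start;

    assert(r != NULL && seqlen > 0);

    old_start = r->start;
    r->start = seqlen - r->end + 1;
    r->end = seqlen - old_start + 1;

    if (r->strand == '+')
        r->strand = '-';
    else if (r->strand == '-')
        r->strand = '+';
}
```

Applying the function twice with the same length restores the original range, which is a convenient check when experimenting with the arithmetic.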

ajFeatTrimOffRange trims a feature using the specified begin and end values. ajFeatTrimOffRange is called where a sequence has been trimmed, so it is necessary to specify any missing sequence positions at the start (ioffset).
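A simplified model of trimming one range may help to show how the begin, end and offset values interact. The code below is plain C with hypothetical names (DemoSpan, demo_trim_off_range). It assumes that a range lying wholly outside the limits is rejected, that a partly overlapping range is clipped, and that the offset is subtracted afterwards; the real function may differ in these details.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct DemoSpan {
    int start;   /* 1-based, inclusive */
    int end;     /* 1-based, inclusive */
} DemoSpan;

/* Clip *s to [begin, end] (each limit applied only when requested), then
   shift it left by ioffset. Returns false if nothing of the span remains. */
bool demo_trim_off_range(DemoSpan *s, int ioffset, int begin, int end,
                         bool dobegin, bool doend)
{
    assert(s != NULL && s->start <= s->end);

    if (dobegin) {
        if (s->end < begin)
            return false;          /* wholly before the kept region */
        if (s->start < begin)
            s->start = begin;
    }

    if (doend) {
        if (s->start > end)
            return false;          /* wholly after the kept region */
        if (s->end > end)
            s->end = end;
    }

    /* Renumber relative to the trimmed sequence. */
    s->start -= ioffset;
    s->end -= ioffset;
    return true;
}
```

For example, a span 5..20 trimmed to 10..15 with an offset of 9 becomes 1..6 in the coordinates of the trimmed sequence.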

6.9.19. Processing Feature Tables

There are a few functions for processing whole feature tables. All features in a feature table may be reversed or trimmed using:

/* Reverse the features in a feature table by iterating through and reversing all positions and strands. */
void  ajFeattableReverse (AjPFeattable  thys) ;

/* Trim a feature table using the Begin and Ends. */
AjBool  ajFeattableTrimOff (AjPFeattable thys, ajuint ioffset, ajuint ilen);

There are functions to convert a position (start or end value) in a feature table into a true position in the source sequence, using any offset information from trimming the feature table within a set range:

ajuint  ajFeattablePos (const AjPFeattable thys, ajint ipos);
ajuint  ajFeattablePosI (const AjPFeattable thys, ajuint imin, ajint ipos);
ajuint  ajFeattablePosII (ajuint ilen, ajuint imin, ajint ipos);

If ipos is negative, it is counted from the end of the string rather than the beginning. For strings the result can go off the end to the terminating NULL. For sequences the maximum is the last base.
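The handling of negative positions can be sketched as follows. The function demo_pos is hypothetical plain C, modelled loosely on the parameter list of ajFeattablePosII; the clamping to the range imin..ilen and the treatment of zero as position 1 are assumptions made for this illustration.

```c
#include <assert.h>

/* Convert a user-supplied position to a 1-based position in a sequence of
   length ilen. Negative values count back from the end (-1 is the last
   position). The result is clamped to the range imin..ilen. */
unsigned demo_pos(unsigned ilen, unsigned imin, int ipos)
{
    long jpos;

    assert(ilen > 0);

    if (ipos < 0)
        jpos = (long) ilen + ipos + 1;
    else if (ipos == 0)
        jpos = 1;
    else
        jpos = ipos;

    if (jpos > (long) ilen)
        jpos = (long) ilen;
    if (jpos < (long) imin)
        jpos = (long) imin;

    return (unsigned) jpos;
}
```

With a length of 10 and a minimum of 1, a position of -1 maps to 10 and -3 maps to 8, while 15 is clamped to 10.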

Finally, features in a feature table may be sorted using:

/* End position */
void    ajFeatSortByEnd (AjPFeattable Feattab);    

/* Start position */
void    ajFeatSortByStart (AjPFeattable Feattab);  

/* Type */
void    ajFeatSortByType (AjPFeattable Feattab);
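The three sort orders can be pictured with the standard qsort function applied to a plain array. This sketch does not touch AJAX lists: DemoRow and the demo_sort_* functions are hypothetical, and the tie-breaking rules (end position after start, start position after type) are assumptions rather than documented behaviour.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct DemoRow {
    int start;
    int end;
    const char *type;
} DemoRow;

static int demo_cmp_int(int a, int b)
{
    return (a > b) - (a < b);
}

/* Order by start position, then by end position. */
static int demo_cmp_start(const void *pa, const void *pb)
{
    const DemoRow *a = pa;
    const DemoRow *b = pb;
    int c = demo_cmp_int(a->start, b->start);

    return c ? c : demo_cmp_int(a->end, b->end);
}

/* Order by type name, then by position. */
static int demo_cmp_type(const void *pa, const void *pb)
{
    const DemoRow *a = pa;
    const DemoRow *b = pb;
    int c = strcmp(a->type, b->type);

    return c ? c : demo_cmp_start(pa, pb);
}

void demo_sort_by_start(DemoRow *rows, size_t n)
{
    assert(rows != NULL || n == 0);
    if (n > 1)
        qsort(rows, n, sizeof rows[0], demo_cmp_start);
}

void demo_sort_by_type(DemoRow *rows, size_t n)
{
    assert(rows != NULL || n == 0);
    if (n > 1)
        qsort(rows, n, sizeof rows[0], demo_cmp_type);
}
```

Explicit tie-breakers make the order reproducible, since qsort itself is not guaranteed to be stable.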

6.9.20. Miscellaneous Functions

There are a few miscellaneous functions for handling features:

/* Returns a sequence from a feature. */
AjBool  ajFeatGetSeq(const AjPFeature thys, const AjPFeattable table,
                     const AjPSeq seq, AjPStr* Pseqstr);
