10.8. Reducing The Length Of The Command Line

A problem can arise when wrapping applications that have many possible command line qualifiers. An example of this is the MIRA package, which has on the order of a hundred. For a package like this, one cannot sensibly construct a command line that specifies every possible qualifier; the command line might be too long for the shell and it would certainly be hard to read and debug.

Two approaches are taken by the MIRA wrapper to avoid command line clutter. The first is to make use of the ajAcdIsUserdefinedC library function. This is demonstrated by the code snippet below:

    ...
    AjPStr cl    = NULL;
    AjPStr squal = NULL;

    ...

    if(ajAcdIsUserdefinedC("genome"))
    {
        squal = ajAcdGetListSingle("genome");
        ajFmtPrintAppS(&cl," -genome%S",squal);
        ajStrDel(&squal);
    }

The ajAcdIsUserdefinedC function call checks whether a user has typed anything in response to the named ACD qualifier name or has specified a value for that qualifier on the command line. If either case is true then the code above retrieves the value from ACD and adds the qualifier and value to the command line. If the user has not provided a value then the (correct) assumption is that the MIRA program will use a default value and so the command line is left unchanged.

That approach works quite nicely but there is a drawback. The ajAcdIsUserdefinedC call will return a true value if the user has typed anything; that includes typing the default value held in ACD for a given qualifier. It would be preferable if the command line was kept clear of unnecessarily specified default values. That is what the rest of the code, the second approach, in the MIRA wrapper does. It is a useful exercise to study the code.
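To make the difference between the two approaches concrete, here is a minimal sketch in plain C. It makes no AJAX calls; the DemoOQual structure, its userdefined flag and both function names are hypothetical and exist only to model the two tests described above.

```c
#include <string.h>

/* Hypothetical model of one qualifier as the wrapper sees it */
typedef struct DemoSQual
{
    const char* name;    /* qualifier name */
    const char* value;   /* value held after user interaction */
    const char* def;     /* default value */
    int userdefined;     /* non-zero if the user typed or supplied a value */
} DemoOQual;

/* First approach: add the qualifier whenever the user supplied something */
static int demo_emit_if_userdefined(const DemoOQual* q)
{
    return q->userdefined != 0;
}

/* Second approach: add the qualifier only if it differs from the default */
static int demo_emit_if_changed(const DemoOQual* q)
{
    return strcmp(q->value, q->def) != 0;
}
```

For a qualifier where the user has explicitly typed the default value, the first test says "add it" and the second says "leave it out", which is the behaviour the second approach is after.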

The code is very slightly complicated by the fact that MIRA allows a user to optionally supply a prefix which can be attached to a command line qualifier. For example, the qualifier -project can also be specified as -GE:project as it belongs to a 'GEneral' class of input options. The wrapper therefore first lists all the qualifiers and their prefixes in a static array. It then loads them all into an AjPTable lookup table as one of the first jobs within main() (the code is trivial and therefore not shown here).

typedef struct MiraSPrefix
{
    const char* qname;
    const char* prefix;
} MiraOPrefix;
#define MiraPPrefix MiraOPrefix*

static MiraOPrefix miraprefix[] = {
    {"project", "GE:"},
    {"lj", "GE:"},
    {"fo", "GE:"},
    {"mxti", "GE:"},
    {"rns", "GE:"},
    {"eq", "GE:"},
    {"eqo", "GE:"},
    {"droeqe", "GE:"},
    {"uti", "GE:"},
    {"ess", "GE:"},
    {"ps", "GE:"},
    ...
    {NULL, NULL}
};
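The wrapper's table-loading code is not reproduced in this section. As a rough stand-in, the lookup it provides can be sketched in plain C with a linear scan over an array of the same shape. This is not the wrapper's AjPTable code; the Demo* names and the two-entry array are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with the same shape as the wrapper's prefix array */
typedef struct DemoSPrefix
{
    const char* qname;
    const char* prefix;
} DemoOPrefix;

static const DemoOPrefix demoprefix[] = {
    {"project", "GE:"},
    {"lj", "GE:"},
    {NULL, NULL}
};

/* Return the prefix listed for qname, or "" if the qualifier has none */
static const char* demo_prefix_lookup(const char* qname)
{
    size_t i;

    for(i = 0; demoprefix[i].qname; ++i)
        if(!strcmp(demoprefix[i].qname, qname))
            return demoprefix[i].prefix;

    return "";
}
```

A linear scan is adequate for a sketch; a hash table such as the one the wrapper uses avoids rescanning the array for every qualifier.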

Having explained that complication, the strategy used by the wrapper can now be described. The wrapper programmer looks through the ACD file and, for the C code, clusters all the different datatypes into groups. In other words, the wrapper will deal with all the string datatypes as one code section, then all the input file datatypes, and so on. Taking the string datatype as an example, the wrapper defines these in another static block.

typedef struct MiraSQuals
{
    const char* qname;
    const char* mname;
    const char* def;
} MiraOQuals;
#define MiraPQuals MiraOQuals*

static MiraOQuals mirastrings[] =
{
    {"project", "project", "mira"},
    {"bsn", "bsn", ""},
    {"np", "np", "mira"},
    {"gapfda", "gap4da", "gap4da"},
    {"log", "log", "miralog"},
    {"co", "co", "mira_out.caf"},
    ...
    {NULL, NULL, NULL}
};

The MiraSQuals structure is used for the definition blocks of all the other datatypes as well as the strings. For each definition block, qname is the qualifier name specified in ACD, mname is the name to be printed out on the command line (as you can see, they're different for gapfda) and def is the default value for the qualifier as specified in the ACD file. Therein lies the slight drawback to this method: you have to specify the default value both within the ACD file and in the C code. You should also make sure that they match, although no great harm will arise if they don't; you'd just get an unexpected qualifier appearing on the command line. The advantages outweigh the disadvantage in this case. All that then needs to be done is to write a simple function to handle each of the datatypes and to call that function from within main(). The function for handling strings is shown here:

static void emira_dostrings(AjPStr *cl, AjPTable table)
{
    ajuint i;
    AjPStr squal = NULL;
    AjPStr prefix = NULL;
    AjPStr key    = NULL;
    AjPStr value  = NULL;
    
    prefix = ajStrNew();
    key    = ajStrNew();
    
    i = 0;

    while(mirastrings[i].qname)
    {
        squal = ajAcdGetString(mirastrings[i].qname);
        ajStrAssignC(&key,mirastrings[i].qname);
        ajStrAssignC(&prefix,"");

        value = ajTableFetch(table, key);

        if(value)
            ajStrAssignS(&prefix,value);

        if(!ajStrMatchC(squal,mirastrings[i].def))
            ajFmtPrintAppS(cl," -%S%s=%S",prefix,mirastrings[i].mname,squal);

        ajStrDel(&squal);
        ++i;
    }

    ajStrDel(&key);
    ajStrDel(&prefix);
    
    return;
}

The code looks sequentially through the static table of defined strings. First the code retrieves the associated value from ACD. It then performs a lookup in the qualifier prefix table for any associated prefix value and sets a variable accordingly (this need not be done for most packages so the code can be simplified). It then compares the value retrieved from ACD to the default value given in the static definition block and only adds the qualifier to the command line if the two values don't match.
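The effect of the loop can be tried without an EMBOSS installation using the plain C sketch below. It is a simplified model, not the wrapper's code: the values array is a hypothetical replacement for the ACD retrieval, the prefix lookup is omitted, and a fixed-size buffer stands in for the AjPStr command line.

```c
#include <stdio.h>
#include <string.h>

typedef struct DemoSQuals
{
    const char* qname;   /* qualifier name in ACD */
    const char* mname;   /* name printed on the command line */
    const char* def;     /* default value */
} DemoOQuals;

static const DemoOQuals demostrings[] = {
    {"project", "project", "mira"},
    {"gapfda", "gap4da", "gap4da"},
    {"log", "log", "miralog"},
    {NULL, NULL, NULL}
};

/* Append every non-default qualifier to cl. values[i] is the value for
   demostrings[i]; this array is an assumed stand-in for the ACD calls. */
static void demo_dostrings(char* cl, size_t size, const char* const* values)
{
    size_t i;
    size_t len;

    for(i = 0; demostrings[i].qname; ++i)
    {
        /* Value equals the default: leave the command line unchanged */
        if(!strcmp(values[i], demostrings[i].def))
            continue;

        len = strlen(cl);

        if(len < size)
            snprintf(cl + len, size - len, " -%s=%s",
                     demostrings[i].mname, values[i]);
    }
}
```

Starting from a buffer holding "mira" and the values "mira", "mygap" and "miralog", only the second value differs from its default, so the buffer ends up as "mira -gap4da=mygap".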

The result of this code is a nice clean command line. The code also has the advantage that it avoids having to define a long unsightly block of ajAcdGet* function calls at the start of the wrapper. Though you may consider this approach to be over-engineered for wrapping packages with simpler interfaces it is nevertheless worth considering as the resulting wrapper is much easier to maintain.
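The handlers for the other datatypes are not shown in this section. One way the default comparison might carry over to an integer qualifier, assuming the default is still stored as text in the definition block, is to format the value and compare the strings. The function below is a hypothetical plain C sketch of that idea and not the wrapper's code.

```c
#include <stdio.h>
#include <string.h>

/* Return non-zero if an integer value differs from its textual default.
   Assumes the default is written in plain decimal form, e.g. "5". */
static int demo_int_changed(int value, const char* def)
{
    char buf[32];

    snprintf(buf, sizeof buf, "%d", value);

    return strcmp(buf, def) != 0;
}
```

A textual comparison treats defaults written as "05" or "+5" as different from the value 5, so under this scheme the defaults in the definition block would need to be written in the same form that the formatting produces.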