Bugs are problems with code that cause it to crash or operate in an unexpected way. They arise through erroneous use of syntax (which is not always caught by compilers) and errors in the code logic. This section gives some practical hints for debugging EMBOSS code.
Very broadly, debugging proceeds in four stages:
Fixing bugs that prevent the program from compiling
Fixing bugs that cause the program to crash
Fixing bugs that cause the program to operate incorrectly
Fixing bugs that manifest under extensive test conditions
With experience, most bugs are obvious from visual inspection of the code. It is highly recommended that, before you compile your code, you read then re-read it until you're satisfied it should work as expected. Be scrupulous when writing the code itself. Avoid the temptation to code too quickly; the extra time spent avoiding errors in the first place will be very well rewarded later.
A simple and powerful debugging method is to use ajFmtPrint and fflush(stdout) statements to report the values of key variables at different stages of execution, allowing you to trace and identify problems. ajFmtPrint prints the variables, and fflush is called immediately afterwards to flush the output buffer. This is important because the output buffer might still hold content at the point of a crash, and that content only reaches the screen if fflush(stdout) has been called. Most bugs are easily squashed using this method.
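The print-and-flush approach described above can be sketched as follows. This is a minimal illustration, not EMBOSS code: the function trace_sum is hypothetical, and printf stands in for ajFmtPrint so that the example builds without the AJAX library.

```c
#include <stdio.h>

/* Hypothetical helper: sums an array, tracing every step.
   printf() is a stand-in for ajFmtPrint() so the sketch
   compiles without the EMBOSS headers. */
int trace_sum(const int *values, int n)
{
    int total = 0;
    int i;

    for (i = 0; i < n; i++)
    {
        total += values[i];

        /* Report the key variables at this stage of execution */
        printf("trace_sum: i=%d value=%d total=%d\n", i, values[i], total);

        /* Flush straight away so the line is visible even if the
           program crashes on the next statement */
        fflush(stdout);
    }

    return total;
}
```

If the program crashes partway through the loop, the last line printed shows the values in use just before the failure.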
When writing your code, there will be many stages where you know in advance what value(s) a variable should (or should not) have, particularly when checking a function's arguments or return values. For instance, the value of a pointer used by a function should in most cases not be NULL. At all such places, at least in early versions of the code, code should be added to trap errors and raise appropriate warnings, alerting you to potential bugs before they manifest.
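A minimal sketch of trapping a NULL argument is given below. The function checked_length is hypothetical, and fprintf to stderr is used as a stand-in for an AJAX message function so that the example is self-contained.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical function: returns the length of a name,
   or -1 if it was passed a NULL pointer.
   fprintf(stderr, ...) stands in for an AJAX message call. */
long checked_length(const char *name)
{
    /* Trap the error before the pointer is dereferenced */
    if (name == NULL)
    {
        fprintf(stderr, "Warning: checked_length called with NULL name\n");
        return -1;
    }

    return (long) strlen(name);
}
```

The check costs one comparison per call and turns a likely crash into a message that names the function at fault.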
AJAX functions in ajmess.h provide various levels of error handling. Each of the following formats and outputs an exception message (provided as a string):
ajUser
Report an informative message
ajWarn
Report a warning message
ajErr
Report an error message
ajExit
Report a message then exit
ajDie
Report a message then crash (kill) the application
ajDebug
Report a general debugging message to the file programname.dbg if the switch -debug was given on the command line
Messages go to stdout or stderr (in both cases usually the screen) or, in the case of ajDebug, to the file programname.dbg. The EMBOSS code makes extensive use of ajDebug so that bugs reported by users can easily be traced. The typical way to debug applications is therefore to produce a debug trace using -debug. In practice, much of the error-trapping code can, for purposes of speed, be commented out or removed once extensive testing of the code is complete. This is especially true for library function code where speed is paramount.
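One way to remove error-reporting code without deleting it from the source is conditional compilation. The sketch below is an assumption about how this could be done, not an EMBOSS convention: the macro names TRAP_ERRORS and TRAP and the function mean_value are hypothetical.

```c
#include <stdio.h>

/* TRAP_ERRORS is a hypothetical macro name. Define it while testing
   (for example with -DTRAP_ERRORS) and leave it undefined for a
   speed-critical build, where TRAP() then compiles to nothing. */
#ifdef TRAP_ERRORS
#define TRAP(cond, msg) \
    do { \
        if (cond) \
            fprintf(stderr, "Error: %s (%s:%d)\n", (msg), __FILE__, __LINE__); \
    } while (0)
#else
#define TRAP(cond, msg) do { } while (0)
#endif

/* Hypothetical library function: mean of n values, 0.0 on bad input */
double mean_value(const double *values, int n)
{
    double sum = 0.0;
    int i;

    /* Reporting only; removed entirely when TRAP_ERRORS is undefined */
    TRAP(values == NULL, "values is NULL");
    TRAP(n <= 0, "n must be positive");

    /* The guard itself stays so the function remains safe */
    if (values == NULL || n <= 0)
        return 0.0;

    for (i = 0; i < n; i++)
        sum += values[i];

    return sum / n;
}
```

Keeping the checks in the source means they can be switched back on when a bug is reported later.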
For the vast majority of applications, ajDebug and the other functions above will suffice. In special circumstances, however, you might need to write your own exception handling functions. An AJAX library file ajmess is provided for this (see Section 6.23, “Handling Exception Messages”).
Most AJAX library files include functions for debugging the code in that library file. These usually call ajDebug. Some libraries provide more comprehensive debugging functions than others, but typically functions are provided to report on the internal state of data structures defined in that library file. Generally, debugging functions are organised under their own "Debug" section in the library C source code and online documentation. In some cases special debugging files are provided. For example, there is a "debug" output sequence format used when debugging sequence output, and a "trace feature table" used when debugging report formats.
For more information, see the library documentation for AJAX and NUCLEUS. See also the programming guides (Section 3.1, “EMBOSS Programming”) for individual library files.
The default behaviour of EMBOSS is not to report debugging information generated by calls to ajDebug. The global command line qualifier -debug can be used to turn debugging on for any EMBOSS application. For example, if you think you have found a bug when the following command is issued:
seqret sequence.seq
then debugging can be turned on as follows:
seqret sequence.seq -debug
This will create a debug file called seqret.dbg. Debugging could be explicitly turned off by prepending the qualifier with no:
seqret sequence.seq -nodebug
but there's normally no need to do this as the default is false (no debugging) anyway. It could however be useful if debugging was turned on by default in the EMBOSS configuration files or by an environment variable. Debugging can be globally switched on using the EMBOSS environment variable:
EMBOSS_DEBUG
If this is set to TRUE, all programs act as if they have -debug set on the command line. They create a file called programname.dbg containing debugging information.
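The general idea of switching behaviour on an environment variable can be sketched in C as follows. This is not the EMBOSS implementation: the helpers value_is_true and debug_enabled are hypothetical, and the sketch assumes that only the literal value TRUE turns debugging on.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: 1 if value is exactly "TRUE", otherwise 0.
   Assumption: no other spellings are accepted in this sketch. */
int value_is_true(const char *value)
{
    return value != NULL && strcmp(value, "TRUE") == 0;
}

/* Hypothetical helper: 1 if the named environment variable is set
   to "TRUE", otherwise 0 (including when it is not set at all). */
int debug_enabled(const char *varname)
{
    return value_is_true(getenv(varname));
}
```

A program could call debug_enabled("EMBOSS_DEBUG") once at start-up and store the result, rather than reading the environment each time a message is written.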
Logging of the processing of the EMBOSS configuration files emboss.default and .embossrc can be turned on using:
EMBOSS_NAMDEBUG
This processing takes place before the -debug command line switch is processed. The functions that are called are described in ajnam.c. The debugging information includes:
A report of all defined databases, variables and environment variables
A report of defined attributes for a database definition.
A report of defined attributes for a resource definition.
Some bugs might not be obvious from visual inspection or easily traced using ajFmtPrint and fflush(stdout). For these, it is worth using specialised debugging software. A debugger executes the bugged program and traces its internal state to allow problems with the code to be rapidly identified and fixed. The functions available depend on the debugger used. Most give control over the program, allowing it to be executed in a stepwise manner, variables to be given values and so on, providing a quicker and more reliable means of stepping through the logic of the code than doing this mentally. Most debuggers should provide at least the following information:
The line of code and statement the program crashed on
If the error occurred within a function, the line the function was invoked from and the arguments
The values of variables (local to a function or global variables) at a particular point during execution of the program
The result of a particular expression in a program
The most popular UNIX debugger is GDB, the GNU debugger (see http://sourceware.org/gdb/). It includes powerful features for tracing and altering the execution of a program, but you'll only need to use the very basics for it to be extremely useful.
If you intend to use GDB to debug EMBOSS code, you must configure the package using:
--enable-debug
before you build the package (see Section 1.2, “Installation of CVS (Developers) Release”).
Bear in mind that the output of debuggers cannot always be entirely trusted. Rare cases can arise where the behaviour of the executable compiled for debugging is subtly different from that of the standard executable. For this reason, ajFmtPrint and fflush statements (see above), which are less invasive as far as the executable is concerned, remain the tool of last resort.
Some bugs evade identification by direct means or debuggers. Such bugs are usually caused by serious programming errors, including the invalid use of pointers, memory corruption, memory leaks, or invalid memory access such as reading or writing beyond the bounds of an array. These errors can be extremely difficult to trace but can easily arise in C programming, especially where C pointers are used extensively. Fortunately, there are several excellent programs available that can trap and identify such problems. Even if your code appears to be working correctly, it is well worth using a memory checker to confirm that memory is not being violated. Using one can avoid hours of frustration later on and will help ensure your code remains stable in different use cases. Popular programs include:
A suite of tools for debugging and profiling Linux programs. It detects many memory management and threading bugs. It also performs profiling which helps you to identify ways to speed up and reduce memory usage of your programs. Valgrind runs on X86/Linux, AMD64/Linux, PPC32/Linux and PPC64/Linux.
Commercial software for memory corruption detection, memory leak detection, application performance profiling and code coverage analysis.
Powerful commercial software checking the integrity of memory usage and detecting potential defects and inefficiencies in memory usage.
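To make the kinds of memory error described above concrete, the sketch below shows a correctly written string copy and marks, in a comment, the off-by-one mistake that would cause an out-of-bounds write. The function copy_text is hypothetical and is not part of AJAX.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of text, or NULL on
   failure. The caller owns the result and must free() it;
   forgetting to do so is a memory leak. */
char *copy_text(const char *text)
{
    size_t len;
    char *copy;

    if (text == NULL)
        return NULL;

    len = strlen(text);

    /* Allocating only len bytes would leave no room for the
       terminating '\0'. The memcpy below would then write one byte
       past the end of the block: an invalid memory access that may
       go unnoticed in normal runs. */
    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;

    /* Copy the characters and the terminator */
    memcpy(copy, text, len + 1);

    return copy;
}
```

Errors of this type often produce no visible symptom until the code is run on different data or a different platform, which is why a memory checker is worth running even on code that appears to work.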