Software development and maintenance under EMBOSS is made easy. EMBOSS has powerful inbuilt functionality that any application can make use of with little or no additional coding. This includes support for many simple and complex (biological) data types, common file formats and simple database configuration. Generic mechanisms are in place for sequence and sequence feature specification and for qualifiers controlling program behaviour. Depending on your particular requirements, this might save you a great deal of effort. Furthermore, when new input and output data formats are added to EMBOSS, for example, your applications will automatically be able to use them; no application code needs to change. Well-defined processes are in place for key aspects such as quality assurance testing, installation, maintenance and support. General aspects are handled by the EMBOSS developers, leaving you to support only the parts specific to your own software.
Your application will use the EMBOSS command line, which is consistent across all applications. AJAX Command Definition (ACD) files define the command line interface and the datatype and permissible values for all application parameters. The processing and validation of the command line and user input are handled automatically at startup, before the application proper starts. This includes, for example, producing a sensible prompt and reprompting for values that are out of range. There is a clean separation of the user-interface handling from the core functionality of the code: a single function call is used to process the command line and ACD file. The ACD syntax also makes the wrapping of third party applications under EMBOSS simple.
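The idea of declaring a parameter's permissible values once and letting generic code validate user input can be sketched in plain C. This is a minimal illustration under stated assumptions, not ACD syntax and not the AJAX API: the type IntParamDef and the function int_param_accept are hypothetical names invented for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical declarative description of one integer parameter.
   It mirrors the idea of a parameter definition; it is not ACD
   syntax and not part of the AJAX library. */
typedef struct {
    const char *name;     /* parameter name as typed on the command line */
    const char *prompt;   /* text shown when the value must be requested */
    long minimum;         /* smallest permitted value */
    long maximum;         /* largest permitted value */
} IntParamDef;

/* Parse 'text' and check it against the definition.
   Returns 1 and stores the value in *value on success.
   Returns 0, leaving *value untouched, if the text is not a clean
   integer or lies outside the permitted range. A front end would
   reprompt at that point. */
int int_param_accept(const IntParamDef *def, const char *text, long *value)
{
    char *end = NULL;
    long parsed;

    assert(def != NULL && text != NULL && value != NULL);

    errno = 0;
    parsed = strtol(text, &end, 10);

    if (end == text || *end != '\0' || errno == ERANGE)
        return 0;                       /* not a clean integer */
    if (parsed < def->minimum || parsed > def->maximum)
        return 0;                       /* outside the permitted range */

    *value = parsed;
    return 1;
}
```

With a definition whose range is 2 to 10, the text "6" is accepted, while "42" and "abc" are rejected. The application code that consumes the value never needs to repeat the range check, which is the separation of concerns described above.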
EMBOSS includes extensive C programming libraries (AJAX and NUCLEUS) for low-level and higher-level tasks respectively. These provide a robust toolkit for developing new bioinformatics applications and workflows and for extending the core library functionality. The application programming interface (API) is comprehensive and consistent. A developer need not know the internals to use the libraries: how to call each function, the input data it requires and the output it produces are all clearly documented. All code is in ANSI standard C and there are defined standards for coding and documentation.
Memory management under EMBOSS is greatly simplified. Memory for all data defined in the ACD file is allocated automatically. Dynamic memory management for programming objects such as strings, sequences and arrays is handled automatically. Memory is allocated and freed as necessary by the library functions, saving the application programmer a good deal of effort.
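To make concrete what "handled automatically" means for a dynamic object such as a string, here is a minimal sketch of a growable string in plain C. It only illustrates the general technique, where the library functions (not the caller) allocate, enlarge and free the buffer. The type DynStr and the functions dynstr_new, dynstr_append and dynstr_del are hypothetical names for this example and are not the AJAX string implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string object; not the AJAX string type. */
typedef struct {
    char  *text;      /* NUL-terminated character data */
    size_t length;    /* characters in use, excluding the NUL */
    size_t reserved;  /* bytes currently allocated for text */
} DynStr;

/* Create an empty string. Returns NULL if allocation fails. */
DynStr *dynstr_new(void)
{
    DynStr *str = malloc(sizeof *str);

    if (str == NULL)
        return NULL;
    str->reserved = 16;
    str->length = 0;
    str->text = malloc(str->reserved);
    if (str->text == NULL) {
        free(str);
        return NULL;
    }
    str->text[0] = '\0';
    return str;
}

/* Append C text, enlarging the buffer when needed.
   Returns 1 on success, 0 if memory could not be obtained. */
int dynstr_append(DynStr *str, const char *extra)
{
    size_t extra_len;
    size_t needed;

    assert(str != NULL && extra != NULL);

    extra_len = strlen(extra);
    needed = str->length + extra_len + 1;   /* +1 for the NUL */

    if (needed > str->reserved) {
        size_t grown = str->reserved;
        char *bigger;

        while (grown < needed)
            grown *= 2;                     /* double until it fits */
        bigger = realloc(str->text, grown);
        if (bigger == NULL)
            return 0;                       /* old buffer is still valid */
        str->text = bigger;
        str->reserved = grown;
    }
    memcpy(str->text + str->length, extra, extra_len + 1);
    str->length += extra_len;
    return 1;
}

/* Free the string and clear the caller's pointer so that it
   cannot be used by accident afterwards. */
void dynstr_del(DynStr **pstr)
{
    if (pstr == NULL || *pstr == NULL)
        return;
    free((*pstr)->text);
    free(*pstr);
    *pstr = NULL;
}
```

The caller only ever creates, appends to and deletes the object. Buffer sizes and reallocation stay inside the library functions, which is the kind of effort the paragraph above says the application programmer is spared.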
The EMBOSS source code is well documented and is indexed as an SRS database to allow easy search and navigation. The documentation is generated automatically from structured comments in the code, which are validated to ensure correctness and consistency, for example, that functions and function parameters have standardised names. The EMBOSS Developers Guide includes programming guides for most library files, with example code illustrating their use. Mailing lists for discussions about development and for reporting bugs have a good response time.
AJAX. AJAX is the core low-level library used by all EMBOSS applications and provides a comprehensive set of basic objects and functions. It includes standard data structures for strings, sequences, features, structures, file handles, tables, lists, trees, dynamic arrays and so on. It also includes algorithms for string handling, pattern-matching, sorting, iteration and very fast database indexing, among others. AJAX is licensed under the GNU LGPL.
NUCLEUS. NUCLEUS includes higher-level code and algorithms, mostly for common molecular sequence analysis tasks. Functions for sequence comparisons, translation, codon usage and annotation are included. In comparison to the AJAX library and the EMBOSS applications, some parts of NUCLEUS are not as well developed or documented. In future code refactoring, the libraries will be consolidated and the documentation improved. NUCLEUS, like AJAX, is licensed under the GNU LGPL.