9.8. Utilities

In addition to the 200+ EMBOSS and EMBASSY programs that can be used through the Jemboss interface, there are three utilities that offer graphical analysis of certain results.

9.8.1. Jemboss Alignment Editor (JAE)

JAE is a multiple sequence viewer written and supported by the EMBOSS team. It is a tool attached to the Jemboss interface and is also available as a standalone utility from http://emboss.open-bio.org/Jemboss/. It reads any sequence in FASTA or MSF format.

Any number of JAE windows may be opened at any one time.

  • Type str into the Jemboss GoTo box to select the stretcher application. Enter uni:bgal_ecoli into the first Sequence Filename field and uni:bgal1_entcl into the second. LOAD SEQUENCE ATTRIBUTES for both sequences. Alter the align format drop down menu to read fasta (scroll up) and hit GO. Save the results to a local file (Section 9.3.5.1, “Saving Analysis Results”) as bgal_align.fasta and close the Saved Results window.

  • If using the standalone version downloaded from the above mentioned OpenBio page: Select the File menu and the Open option and navigate to the bgal_align.fasta file. Open the file.

The Open option does not default to the specified local home directory location (Section 9.7.1, “Directory location”) and must be navigated to.

  • From the main Jemboss window, it is also possible to use the Tools menu and select the Multiple Sequence Editor - Jemboss utility. Use the File/Open menus to retrieve the alignment file.
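
To inspect the saved alignment outside the editor, a minimal FASTA reader can be sketched in Python. This is an illustration only and is not how JAE itself parses files; the function name read_fasta is hypothetical, and the sketch assumes that the first word after ">" is the sequence identifier.

```python
def read_fasta(lines):
    """Parse FASTA records from an iterable of lines into {identifier: sequence}.

    Illustrative sketch only: gap characters are kept as they appear, and the
    first word after '>' is assumed to be the sequence identifier.
    """
    records = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            words = line[1:].split()
            name = words[0] if words else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


# Example use with the file saved in the step above:
# with open("bgal_align.fasta") as handle:
#     alignment = read_fasta(handle)
```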

9.8.1.1. Sequence Manipulation

The alignment can be manipulated by dragging residues or nucleotides to the right or left to create or close gaps.

  • Highlight both sequences by clicking on each in turn and hit the Lock button at the bottom left of the viewer.

If a sequence is highlighted in error then a further mouse click will un-highlight it.

  • Mouse over the fifteenth position of the E. coli sequence and drag the arginine residue to the right. The other sequence also moves at the same point.

  • To unlock the sequences, hit the same button (it now says unlock). Alternatively, use the Edit/Unlock All Sequences option.

The sequence text size can be altered using the drop down menu to the right of the Calculate menu on the toolbar.

9.8.1.2. Sequence Manipulation Menu

This can be accessed by selecting a residue and clicking with the right hand mouse button. In addition to the menu options, the sequence containing the selected residue is reported.

9.8.1.3. Delete

The delete option removes the entire sequence from the display. Currently this operation cannot be undone without reloading the entire alignment back into the JAE.

9.8.1.4. Reverse Complement

This is for nucleotide sequences only. Further options to reverse and/or complement the sequence separately are available from the menu.
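
What the three operations do to a nucleotide string can be illustrated with a short Python sketch. The function names are hypothetical and are not Jemboss code; only the unambiguous bases are handled, and gaps or other characters are passed through unchanged.

```python
# Hypothetical helpers for illustration, not part of Jemboss.
# Only A, C, G, T and U are complemented; any other character (gaps, N) is kept.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def reverse(seq: str) -> str:
    """Return the sequence read in the opposite direction."""
    return seq[::-1]


def complement(seq: str) -> str:
    """Return the complementary base at every position."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Reverse the sequence, then complement it."""
    return complement(reverse(seq))
```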

9.8.1.5. Trim Sequences

Sequences can be trimmed at both the 5' and 3' ends to accommodate alignments between sequences with loosely configured termini.

  • Select the Edit menu and the Trim Sequences option. Alter the start to 15 and the end to 1016 to remove the sequence termini. Hit the OK button.

There is currently no way to reinstate the trimmed termini without re-loading the whole alignment.
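
The effect of trimming can be sketched in Python as follows. The sketch assumes that the start and end values are 1-based, inclusive alignment positions, which this guide does not state explicitly; the function name trim_alignment is hypothetical.

```python
def trim_alignment(alignment: dict, start: int, end: int) -> dict:
    """Keep alignment columns start..end in every sequence.

    Assumption: positions are 1-based and inclusive, as entered in a dialog.
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return {name: seq[start - 1:end] for name, seq in alignment.items()}
```

With the values used in the step above, trim_alignment(alignment, 15, 1016) would keep columns 15 to 1016 of each sequence and discard the rest.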

9.8.1.6. Insert Annotation Sequence

The Insert Annotation Sequence option is accessed from the Edit menu and allows a sequence to be annotated with further data which can be Read from File or entered via Cut and Paste into the entry box. Any sequences must be pasted using the <CONTROL>+V shortcut. Accepted formats are fasta, msf, clustal and jpred.

9.8.1.7. Display Manipulation

The View menu offers various ways to display the alignment data. The text, or the white area surrounding the text, can be coloured using the draw and colour boxes options respectively.

9.8.1.8. Find Pattern

  • Select the View menu at the top of the Editor and then the Find Pattern option. Enter LILC into the field and hit the Find button to reveal the area immediately before the active site glutamate. Close the Find Pattern window.

The last pattern to be searched will remain in the entry field until either a new pattern is selected or the JAE session is halted. Find Pattern is not case sensitive. The wrap around option is selected by default and ensures the pattern is searched for within the whole sequence and not simply the portion of the sequence following the cursor. The same pattern may be used to search in multiple places in the sequence.
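
The case-insensitive, wrap-around behaviour described above can be sketched in Python. This is a simplified illustration, not the JAE implementation; the function name find_pattern and the 0-based cursor are assumptions, and gap characters are treated like any other character.

```python
def find_pattern(seq: str, pattern: str, cursor: int = 0, wrap: bool = True) -> int:
    """Return the 0-based index of the next match at or after cursor, or -1."""
    haystack, needle = seq.upper(), pattern.upper()  # case-insensitive search
    hit = haystack.find(needle, cursor)
    if hit == -1 and wrap:
        hit = haystack.find(needle)  # wrap around: search again from the start
    return hit
```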

9.8.1.9. Matrix Display

Displays all protein and nucleotide alignment matrices.

  • Select the View menu and Matrix Display option to reveal the default Blosum62 scoring matrix grid. Highlight the EPAM250 matrix from the right hand pane and hit the Set button. The new matrix will be reported in the main JAE window.

The new scoring matrix will then apply to calculations of the consensus sequence (Section 9.8.1.16, “Consensus Sequence”). The current matrix is reported at the bottom left of the alignment viewer. The set matrix display will be remembered for the duration of the JAE session.

  • Close the matrix display box.

9.8.1.10. Colour Display

Specifies the colour for a given amino acid.

  • Access the Colour Display option from the View menu. Right click on the colour box against the arginine (R) amino acid and select the colour option that appears. Select light blue (R128, G192, B255 on the tooltip) and confirm selection with a left mouse click.

The changes are effected in the alignment viewer.

  • Close the Colour window.

9.8.1.11. Colour by Property

Alters the display to colour amino acids according to selected properties. The default selection is Residue Colour and the white area surrounding the text is coloured with the appropriate hue. This can be turned off by deselecting the colour boxes option at the bottom of the View menu but coloured boxes reappear when a different scheme is selected. The selected colour scheme is reported at the bottom of the alignment display.

The Colour by Property option indicates which property is represented by which colour. This selection may be useful for identifying regions of functional or structural similarity.

9.8.1.12. Colour Identical matches

  • Select the View menu and Colour Identical/Matches.

    The colour selection for each match can be altered by left clicking on the default colour and selecting a new one from the available palette. The default box selection can be altered to change the display image of the alignment.

    Retain the Identity Number (Section 9.8.1.13, “Identity Number”) as 2. Alter the Threshold for positive matches (Section 9.8.1.14, “Threshold for Positive Matches”) to 2.

    The Identity Colour represents the actual text and the Background Colour represents the area around the text.

  • Hit the Set button to effect the changes in the Viewer. Close the Colour Matches box.

  • De-select the Box option to remove the black line drawing around areas of identity and similarity.

9.8.1.13. Identity Number

This selects which matches are coloured as identical and that selection is based on how many of the amino acid residues in a specific position are identical. The default is always the number of sequences in the alignment.

Access the Identity Table option from the Calculate menu to see that the current alignment affords a 64% identity between the two sequences. Move several residues to the right (Section 9.8.1.1, “Sequence Manipulation”) and re-select the Identity Table option. Note the difference in the identity.
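
How an identity number selects columns can be illustrated with a small Python sketch. It assumes that a column counts as identical when its most common residue occurs at least as many times as the identity number, and that gap characters are ignored; the function name identical_columns is hypothetical and this is not the JAE source code.

```python
from collections import Counter


def identical_columns(seqs: list, identity_number: int) -> list:
    """Flag each alignment column whose most common residue occurs
    at least identity_number times (gaps '-' and '.' are ignored)."""
    flags = []
    for column in zip(*seqs):
        counts = Counter(c.upper() for c in column if c not in "-.")
        top = counts.most_common(1)
        flags.append(bool(top) and top[0][1] >= identity_number)
    return flags
```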

9.8.1.14. Threshold for Positive Matches

A positive match is any score greater than zero according to the values in the selected scoring matrix.

9.8.1.15. Data Display

Options in the Calculate menu are designed to offer an alternative calculation of the information contained in the alignment display.

9.8.1.16. Consensus Sequence

  • Select the File menu on the Jemboss Alignment Editor and open the new alignment file.

  • Access the Calculate menu and select the Calculate Consensus option.

Once the initial calculation has been done, the menu option subsequently reads Recalculate consensus.

  • Scroll to residue 64 to see the start of the consensus sequence VTYDG-SL.

This calculation is based on the EMBOSS program cons. It may be affected by altering the scoring matrix (Section 9.8.1.9, “Matrix Display”) and the Consensus Options (see below) and thus should be recalculated after every change to these parameters. The consensus sequence may be removed from the display at any time using the delete option on the Sequence Manipulation Menu (Section 9.8.1.2, “Sequence Manipulation Menu”).

  • Select the Set consensus options from the Calculate menu to show the Consensus Options box.

The consensus sequence needn't be displayed before these options are set as the display will then appear on screen.

  • Alter the Minimum positive match score value for there to be a consensus option to 6. The residue G disappears from this consensus run with a plurality value of 6.

The Minimum positive match score represents the plurality value. A consensus sequence will be displayed if the resulting score*weight calculation of the sequence equals or exceeds this value. The default is 0.5 per sequence in the alignment, that is, half of the default weight of 1 given to each sequence. Values to several decimal places will be accepted in this field.

  • Alter the plurality value back to 4.5 and alter the Minimum number of identities for there to be a consensus to 6. Hit the Calculate Consensus button. The T and S residues disappear from the sequence as there are only 5 and 4 of them in positions 64 and 69 respectively.

Only whole numbers may be entered into this field. This parameter is based on the number of identities in a specific position, but both this and the plurality value requirements must be fulfilled before a consensus is shown.

  • Re-enter 6 in the plurality value field. The consensus run now appears as V-YD---L.

  • Increase the value of the Threshold positive match score for setting the consensus to uppercase to 6 and hit the Calculate Consensus button. The remaining consensus run is now displayed in lower case characters. Alter the value to 5 to reinstate upper case letters.

Residues in the consensus will be set in uppercase only if their plurality value exceeds the Threshold positive match score. Values to several decimal places will be accepted in this field.

Any alterations in the Consensus Options box will be deleted once the window is closed, although the altered consensus sequence will remain in the display until it is removed or the alignment window is closed.

The consensus sequence may be saved using the File/Save Consensus options. The consensus sequence must be visible on the display before it can be saved.
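
The interplay of the three consensus options can be sketched in Python. This is a deliberately simplified illustration and not the algorithm used by cons or JAE: the scoring function toy_score is a hypothetical stand-in for a real scoring matrix, every sequence has a weight of 1 unless weights are supplied, and all function and parameter names are assumptions.

```python
def toy_score(a: str, b: str) -> int:
    """Hypothetical stand-in for a scoring matrix: 1 for identity, otherwise -1."""
    return 1 if a == b else -1


def consensus(seqs, plurality, identity=0, setcase=0.0, weights=None, score=toy_score):
    """Simplified consensus: for each column pick the residue with the largest
    summed weight of positively scoring sequences, then apply the thresholds."""
    weights = list(weights) if weights is not None else [1.0] * len(seqs)
    out = []
    for column in zip(*(s.upper() for s in seqs)):
        best = None  # (positive weight, identical count, residue)
        for residue in sorted(set(column) - {"-", "."}):
            positive = sum(w for c, w in zip(column, weights)
                           if c not in "-." and score(residue, c) > 0)
            identical = sum(1 for c in column if c == residue)
            if best is None or (positive, identical) > best[:2]:
                best = (positive, identical, residue)
        # Both the plurality and the identity requirements must be met.
        if best is None or best[0] < plurality or best[1] < identity:
            out.append("-")
        else:
            # Uppercase only if the plurality value exceeds the setcase threshold.
            out.append(best[2] if best[0] > setcase else best[2].lower())
    return "".join(out)
```

For four sequences the default plurality would be 0.5 × 4 = 2. Raising either threshold above what a column can reach replaces that position with a gap character, and raising the uppercase threshold to or above a column's plurality value turns its residue lowercase.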

9.8.1.17. Consensus Plot

  • Access the Calculate menu and select the Calculate Consensus plot option. The plot will appear underneath the alignment.

This plot is based on the calculations used in the EMBOSS program plotcon which plots the graph based on the percentage identity of each sequence in the alignment.

Once calculated the menu option reads Recalculate Consensus Plot and this must be selected if the matrix used for the alignment is altered. Calculated consensus plots can be removed from the display by right clicking the consensus sequence and then selecting the Delete option.

9.8.1.18. Identity Table

Accessed from the Calculate menu. Calculates pairwise percentage identity between each sequence in the alignment. This calculation is a simple percentage score of identical matches within the multiple alignment as displayed.
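
A pairwise identity table of this kind can be sketched in Python. The guide does not state which columns form the denominator, so the sketch assumes that only columns in which neither sequence has a gap are compared; the function names are hypothetical and the result may differ from the figures reported by JAE.

```python
from itertools import combinations


def percent_identity(a: str, b: str, gaps: str = "-.") -> float:
    """Percentage of identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in gaps or y in gaps:
            continue
        compared += 1
        matches += x == y
    return 100.0 * matches / compared if compared else 0.0


def identity_table(alignment: dict) -> dict:
    """Map each pair of sequence names to its percentage identity."""
    return {(m, n): percent_identity(alignment[m], alignment[n])
            for m, n in combinations(alignment, 2)}
```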

9.8.1.19. Sort by ID

This option rearranges the alignment display to list the sequences in the alignment in alphabetical order.

9.8.1.20. Save

  • Use the File menu and Save As option to save the manipulated alignment to a local file. The Save Consensus option stores the consensus sequence only to a local file.

9.8.1.21. Print

The alignment may be printed to paper or as a graphic.

  • Select File menu and the Print Preview option. Select single page and alter the Residues per line to 25 and view the results. Return to File menu and select Print and Print Image Files from the submenu. Hit OK on the Page Setup box and in the subsequent Save box, alter the Residues per line on the right hand side to 25 and select JPG from the drop down menu. Alter the file name to bgal_display and Save to an appropriate location. Five images have now been saved.

  • Should the graphic need to appear on paper, it will be printed on five sheets. Select the Print Postscript option and alter the page setup and number of residues per line as requested in the two subsequent boxes. Alter any printer properties before printing. There may be a slight delay between the Page Setup box disappearing and the Options box appearing.

Both the Page Setup and Options boxes appear when the Print Preview for Multiple Pages is selected. Anything in these boxes may be altered and previewed before committing the graphic to paper.

The single image may be printed or previewed. In this case, the only option offered is the number of Residues per line.

9.8.2. DNA Editor

This is a basic drawing package that offers a graphical representation of the features within a nucleotide sequence. It is accessible from the Tools menu on the Jemboss main window. This utility is also available as a standalone option from http://emboss.open-bio.org/Jemboss/. Jemboss itself does not need to be downloaded in order for this to function.

If this editor is opened in error then a graphical display will result when the editor is closed again. This display should simply be closed (and thereby deleted) if it is not required.

9.8.2.1. Data Input

9.8.2.1.1. Create New DNA Display

Use the mouse or the TAB key to move around to various entry fields as use of the return key will create the graphic before all the features have been added.

  • Access the DNA Editor from the Jemboss Tools menu.

alternatively:

  • Accept the security filter (if launching from the website) and hit OK on the Jemboss DNA Viewer wizard to create a new DNA display. In the DNA Attributes section, alter the type of graphic to linear and enter 7477 in the stop entry field. Alter the Tick Interval on the right hand side to 1000 and the Minor Tick Interval to 0.

  • In the Genetic Feature section, enter 79 in the start and 1161 in the stop fields (overwriting the previous numbers) and hit the Add Feature button. Double click on CDS in the Label column of the Genetic Feature table and alter the text to LacI. Hit return to confirm text entry.

  • Now enter 1284 and 4358 in the start and stop fields. Hit the red square and alter the colour to green, confirming with the OK button on the colour palette. Hit the Add feature button and alter the Label text in the Genetic Feature table to lacZ and confirm by hitting return. Alter the line width to 7 using the slide bar.

Should a feature have been added in error, there is currently no way of deleting it from the table using the editor wizard; it must instead be deleted after the graphic has been created (Section 9.8.2.6, “Genetic Features”).

Line width may also be entered directly into the field to the right of the colour square.

  • In the Restriction Enzymes section, enter Cac8I and 159 in the Label and Position fields respectively. Hit the Add RE button.

  • Hit the OK button at the bottom of the wizard to produce the graphical output.
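
The data entered in the steps above can be summarised as a small Python data model. This is only a sketch of the information the wizard collects and is not a Jemboss API or the lindna file format: all class and field names are hypothetical, a start position of 1 is assumed, and major ticks are assumed to fall on multiples of the tick interval.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    label: str
    start: int
    stop: int
    colour: str = "red"  # the wizard's colour square is red until changed


@dataclass
class RestrictionSite:
    label: str
    position: int


@dataclass
class DnaDisplay:
    stop: int
    start: int = 1  # assumption: the guide only specifies the stop value
    circular: bool = False
    tick_interval: int = 1000
    features: list = field(default_factory=list)
    sites: list = field(default_factory=list)

    def add_feature(self, feature: Feature) -> None:
        if not (self.start <= feature.start <= feature.stop <= self.stop):
            raise ValueError("feature lies outside the displayed sequence")
        self.features.append(feature)

    def add_site(self, site: RestrictionSite) -> None:
        if not (self.start <= site.position <= self.stop):
            raise ValueError("site lies outside the displayed sequence")
        self.sites.append(site)

    def major_ticks(self) -> list:
        # Assumption: ticks are drawn at multiples of the interval.
        first = -(-self.start // self.tick_interval) * self.tick_interval
        return list(range(first, self.stop + 1, self.tick_interval))


# The linear display built in the steps above.
display = DnaDisplay(stop=7477, circular=False, tick_interval=1000)
display.add_feature(Feature("LacI", 79, 1161))
display.add_feature(Feature("lacZ", 1284, 4358, colour="green"))
display.add_site(RestrictionSite("Cac8I", 159))
```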

9.8.2.1.2. Read in Data File

The Editor can read files created by the programs lindna and cirdna and write them out to the DNA Viewer (Section 9.8.2.2, “DNA Viewer”). On opening the DNA Editor, this option is offered as an alternative to creating a new DNA display (Section 9.8.2.1.1, “Create New DNA Display”).

9.8.2.1.3. Edit Data File

This option is only available from the DNA Viewer (Section 9.8.2.2, “DNA Viewer”) and allows a return to the DNA Wizard in order to alter the original data.

9.8.2.2. DNA Viewer

Contains the graphical output created by the DNA Editor and offers all options included in the wizard and some additional features to edit the graphic. The graphic can be viewed as a larger or smaller image using the View menu and Zoom In and Out options.

Once this Viewer is closed, the data is lost unless it has been saved (Section 9.8.2.8, “Print”) or the original data was available as a lindna or cirdna file (Section 9.8.2.1.2, “Read in Data File”).

9.8.2.3. DNA Wizard

Used as above, and accessed from the options menu. Options to create a new file (Section 9.8.2.1.1, “Create New DNA Display”), read in a datafile (Section 9.8.2.1.2, “Read in Data File”) and edit currently displayed data (Section 9.8.2.1.3, “Edit Data File”) are available.

9.8.2.4. DNA Attributes

Contains all DNA attributes found in the DNA Editor. To alter a feature in the graphical output, enter the relevant value in any of the fields and hit the return button to effect the change in the graphic.

9.8.2.5. Tick Marks

Contains the same tick mark information as the DNA editor. Alter the text and press the Set button to effect the change in the graphic.

9.8.2.6. Genetic Features

Contains the same Genetic Features information as the DNA editor. Text may be altered in the table by double clicking on the relevant entry field, altering the text and hitting return to confirm input. The colour of a feature may, as for the DNA editor, be altered within the table. Select the current colour and alter it using any one of the three colour palette tabs offered.

Should a feature have been mistakenly added, highlight it and delete using the Tools/Delete Selected Row option. This delete option is only available in the DNA Viewer.

9.8.2.7. Restriction Enzymes

Contains the same restriction enzyme information as the DNA editor. New restriction positions can be added or existing positions can be deleted.

9.8.2.8. Print

There are options under the File menu to print the final graphic to paper (Print Postscript option) or to an image file. If the Print Postscript option is chosen, the print wizard sends the data to the printer you select. An actual PostScript file may be saved by selecting the Print to File option on the print wizard (provided a PostScript printer is installed and selected).

  • Once the graphic is complete, hit the File menu and select the Print option. Select the Print png/jpeg image option. Click OK in the Page Setup box. Select an appropriate folder and rename the image in the Save box. Hit the Save button.

The saved image can be used in subsequent documents.

  • Close the DNA Viewer.

9.8.3. JALVIEW

This is third party software, written originally by Michele Clamp. All downloads and documentation are available from http://www.jalview.org/. This multiple sequence viewer can be accessed in Jemboss via the Tools menu on the main Jemboss toolbar. Any errors found within this software cannot be fixed by the EMBOSS team, although they will pass any reports on to the relevant people.