[EMBOSS] Nthseq issue
pmr at ebi.ac.uk
Thu Jan 22 03:35:50 EST 2009
Scott Hazelhurst wrote:
> I don't know whether this is a bug or a feature, but I discovered that
> nthseq skips empty sequences in its counting. So if you have 10 sequences
> and the fifth is empty, then nthseq -number 6 actually returns the 7th
> sequence. It does print out a warning that the sequence is empty but not
> that it's skipping (and also if you are putting this in a pipeline you
> wouldn't see it). I couldn't see any documentation on this.
> I found this problem in a data set from some collaborators; we ran dust and
> then used biosed to remove Ns. Obviously this makes some sequences not
> usable. While it is understandable why nthseq behaves in the way it does,
> the problem is that in an automated set up it may be difficult do the
We will take a look. Zero-length sequences are routinely ignored in
EMBOSS. We will check whether it is possible to use an alternative method
for counting in nthseq and any other application that counts input sequences.
Of course, if the nth sequence is empty, nthseq would have to report a
failure to read it.
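
To make the alternative counting concrete, here is a rough sketch in Python.
It is not EMBOSS code and does not reflect how nthseq is implemented; the
names read_fasta and nth_record are made up for this illustration, and plain
FASTA input is assumed. Empty records keep their position in the count, and
asking for an empty record is reported as an error rather than silently
returning the next sequence.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) for every record, including empty ones."""
    header: Optional[str] = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, even if it had no residues.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def nth_record(lines: Iterable[str], number: int) -> Tuple[str, str]:
    """Return the record at 1-based position `number`, counting empty records."""
    for position, (header, seq) in enumerate(read_fasta(lines), start=1):
        if position == number:
            if not seq:
                # Fail loudly instead of skipping ahead to the next sequence.
                raise ValueError(f"sequence {number} ({header}) is empty")
            return header, seq
    raise IndexError(f"fewer than {number} sequences in input")


# Hypothetical three-record input in which the second record is empty.
fasta = """>s1
ACGT
>s2
>s3
GGCC
""".splitlines()

third = nth_record(fasta, 3)  # ("s3", "GGCC"): s2 still occupies position 2
```

With this input, nth_record(fasta, 2) raises ValueError, so a pipeline would
see a failure instead of quietly receiving a different sequence.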