[EMBOSS] FW: Forthcoming change in the EMBL flatfile format
rls at ebi.ac.uk
Wed Apr 26 11:46:51 EDT 2006
> -----Original Message-----
> From: owner-seq-dbg at ebi.ac.uk
> [mailto:owner-seq-dbg at ebi.ac.uk] On Behalf Of Carola Kanz
> Sent: 26 April 2006 16:29
> To: seq-dbg at ebi.ac.uk
> Subject: Forthcoming change in the EMBL flatfile format
> Dear all,
> if you are working with the EMBL flatfile format and you are
> not yet aware of the format change we are going to introduce
> with the next release, please have a look at the following
> Dear colleagues,
> We would like to announce the following important change in
> the EMBL database in June this year.
> At the time of release 87 (available from JUN-2006) the
> format of the EMBL flat file will undergo a change: the ID
> line will have a different structure (see below) and the SV
> line will be removed.
> The changes affecting the ID line structure are:
> * All tokens will be separated by a semicolon.
> * The entry name will not be displayed; in its place there will be
>   the primary accession number.
> * The sequence version will be indicated.
> * The topology will be a separate token and will be indicated for
>   both circular and linear molecules.
> * Both the data class and the taxonomic divisions will be displayed.
> This is an example of the new ID line:
> ID   CD789012; SV 4; linear; genomic DNA; HTG; MAM; 500 BP.
>      (1)       (2)   (3)     (4)          (5)  (6)  (7)
> The tokens represent:
> 1. Primary accession number.
> 2. 'SV' + sequence version number.
> 3. Topology: 'circular' or 'linear'.
> 4. Molecule type.
> 5. Data class (ANN, CON, PAT, EST, GSS, HTC, HTG, MGA, WGS, TPA,
>    STS, STD); "normal" entries will have STD, for standard.
> 6. Taxonomic division (HUM, MUS, ROD, PRO, MAM, VRT, FUN, PLN, ENV,
>    INV, SYN, UNC, VRL, PHG).
> 7. Sequence length + 'BP.'.
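
For anyone updating a parser, the seven tokens listed above could be
split out roughly as follows. This is only a minimal sketch based on
the single example line in the announcement; the names EmblId and
parse_new_id_line are made up for illustration and are not part of
EMBOSS or of any EBI tool, and real entries may need more lenient
handling.

```python
import re
from typing import NamedTuple


class EmblId(NamedTuple):
    """Hypothetical container for the seven tokens of a new-style ID line."""
    accession: str
    version: int
    topology: str
    molecule_type: str
    data_class: str
    division: str
    length: int


def parse_new_id_line(line: str) -> EmblId:
    """Split a new-style EMBL ID line into its seven tokens."""
    if not line.startswith("ID"):
        raise ValueError("not an ID line")
    # Drop the 'ID' line code, then split on the semicolon separators.
    tokens = [t.strip() for t in line[2:].strip().split(";")]
    if len(tokens) != 7:
        raise ValueError(f"expected 7 tokens, found {len(tokens)}")
    accession, sv, topology, molecule, data_class, division, length = tokens

    # Token 2 is 'SV' followed by the sequence version number.
    sv_match = re.fullmatch(r"SV\s+(\d+)", sv)
    if sv_match is None:
        raise ValueError(f"bad sequence version token: {sv!r}")

    # Token 3 must be one of the two topologies named in the announcement.
    if topology not in ("circular", "linear"):
        raise ValueError(f"bad topology token: {topology!r}")

    # Token 7 is the sequence length followed by 'BP.'.
    len_match = re.fullmatch(r"(\d+)\s+BP\.", length)
    if len_match is None:
        raise ValueError(f"bad length token: {length!r}")

    return EmblId(accession, int(sv_match.group(1)), topology, molecule,
                  data_class, division, int(len_match.group(1)))


example = "ID   CD789012; SV 4; linear; genomic DNA; HTG; MAM; 500 BP."
entry = parse_new_id_line(example)
# With the SV line gone, accession and version both come from the ID line.
print(f"{entry.accession}.{entry.version}", entry.topology, entry.length)
```

The sketch does not check the data class or taxonomic division against
the lists above, so codes outside those lists would pass through
unchanged.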
> The entry name will not be displayed any more in the ID line.
> Since EMBL release 3 (Dec 1983) the stable identifier of an
> entry has been the primary accession number.
> A mapping file (entry name to accession number) will be
> provided with the next release for those entries where the
> entry name doesn't coincide with the accession number.
> To give users a test dataset, one file with new-style ID
> lines called new_id_line.test.gz was provided together with
> the March release of the EMBL database:
> Feedback from users is sought; please use the "Contact us"
> link at the bottom of the EBI home page and specify "EMBL" in
> the feedback form.
> Note: this information was first made available on our
> "Forthcoming changes" page (
> l#0606 ) and in the EMBL database release notes.