Class ABIFParser

  • Direct Known Subclasses:
    ABIFChromatogram.Parser

    public class ABIFParser
    extends java.lang.Object
    A general base parser for files produced by ABI software. This includes chromatograms derived from ABI sequencers and potentially other data files as well. The format was described by Clark Tibbetts in his paper "Raw Data File Formats, and the Digital and Analog Raw Data Streams of the ABI PRISM 377 DNA Sequencer", available online at http://www-2.cs.cmu.edu/afs/cs/project/genome/WWW/Papers/clark.html

    Briefly, the format consists of a set of named, fixed-length "tagged data records", each of which may contain data itself or a pointer to data elsewhere in the file. This class reads these records and exposes them to subclasses through the getDataRecord(java.lang.String, int) method. The attributes of the records as described in Tibbetts' paper are exposed through public (final) fields of ABIFParser.TaggedDataRecord instances.

    If a record contains only a pointer to the desired data (see ABIFParser.TaggedDataRecord.hasOffsetData), subclasses may get at the raw data by using ABIFParser.TaggedDataRecord.offsetData.
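To make the inline-versus-pointer distinction concrete, the following is a minimal sketch that decodes one directory entry from a ByteBuffer. It is not the BioJava implementation. The 28-byte big-endian layout (4-byte tag name, 4-byte tag number, 2-byte element type, 2-byte element length, 4-byte element count, 4-byte record length, 4-byte data field, 4-byte handle) and the rule that a record longer than four bytes stores an offset rather than the data itself are assumptions taken from the commonly published ABIF layout. RecordSketch, its field names, and the entry helper are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a tagged data record; not the BioJava class. */
final class RecordSketch {
    final String tagName;
    final int tagNumber;
    final int elementType;
    final int elementLength;
    final int numberOfElements;
    final int recordLength;
    final int dataField; // raw 4-byte field: inline data or a file offset

    /** Reads one 28-byte entry starting at the buffer's current position. */
    RecordSketch(ByteBuffer entry) {
        ByteBuffer b = entry.duplicate().order(ByteOrder.BIG_ENDIAN);
        byte[] name = new byte[4];
        b.get(name);
        tagName = new String(name, StandardCharsets.US_ASCII);
        tagNumber = b.getInt();
        elementType = b.getShort();
        elementLength = b.getShort();
        numberOfElements = b.getInt();
        recordLength = b.getInt();
        dataField = b.getInt();
        // The trailing 4-byte handle is not needed for this sketch.
    }

    /** Assumed rule: data longer than 4 bytes cannot fit in the data field. */
    boolean hasOffsetData() {
        return recordLength > 4;
    }

    /** Builds a synthetic entry so the sketch can be tried without a file. */
    static ByteBuffer entry(String name, int number, int type, int elemLen,
                            int count, int length, int data) {
        ByteBuffer b = ByteBuffer.allocate(28); // big-endian by default
        b.put(name.getBytes(StandardCharsets.US_ASCII));
        b.putInt(number).putShort((short) type).putShort((short) elemLen);
        b.putInt(count).putInt(length).putInt(data).putInt(0);
        b.flip();
        return b;
    }
}
```

With an entry whose record length is 2000, hasOffsetData() is true and dataField would be read as a position in the file; with a record length of 2, the value sits in dataField directly.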

    This parser provides methods and classes for dealing with the files as streams or local files (local files being more memory-efficient).

    Author:
    Rhett Sutphin (UI CBCB), Richard Holland
    • Constructor Detail

      • ABIFParser

        public ABIFParser​(java.io.File f)
                   throws java.io.IOException
        Creates a new ABIFParser for a file.
        Throws:
        java.io.IOException
      • ABIFParser

        public ABIFParser​(java.io.InputStream in)
                   throws java.io.IOException
        Creates a new ABIFParser for an input stream. Note that the stream will be wrapped in a CachingInputStream if it isn't one already. If it already is a CachingInputStream, it will be repositioned to offset 0.
        Throws:
        java.io.IOException
      • ABIFParser

        public ABIFParser​(ABIFParser.DataAccess toParse)
                   throws java.io.IOException
        Creates a new ABIFParser for the specified ABIFParser.DataAccess object. If you need to read from something other than a file or a stream, you'll have to write a class that implements ABIFParser.DataAccess around your source and then pass an instance of it to this constructor.
        Throws:
        java.io.IOException
    • Method Detail

      • getDataAccess

        public final ABIFParser.DataAccess getDataAccess()
        Returns the accessor for the raw data being parsed by this parser.
      • decodeDNAToken

        public static Symbol decodeDNAToken​(char token)
                                     throws IllegalSymbolException
        Decodes a character into a Symbol in the DNA alphabet. Uses a definition of characters that is compatible with the ABI format.
        Parameters:
        token - the character to decode
        Throws:
        IllegalSymbolException - when token isn't in { a, A, c, C, g, G, t, T, n, N, - }
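The accepted character set can be illustrated with a small stand-alone sketch. It does not use BioJava: BaseSketch is a hypothetical enum standing in for Symbol, and IllegalArgumentException stands in for IllegalSymbolException. Treating n/N as an unknown base and - as a gap is an assumption; the documentation above only lists which characters are accepted.

```java
/** Hypothetical stand-in for DNA symbols; not BioJava's Symbol type. */
enum BaseSketch {
    A, C, G, T, N, GAP;

    /** Maps one character to a base, case-insensitively, or rejects it. */
    static BaseSketch decode(char token) {
        switch (Character.toLowerCase(token)) {
            case 'a': return A;
            case 'c': return C;
            case 'g': return G;
            case 't': return T;
            case 'n': return N;   // assumed meaning: unknown base
            case '-': return GAP; // assumed meaning: gap
            default:
                throw new IllegalArgumentException(
                        "Not an accepted DNA token: '" + token + "'");
        }
    }
}
```

Any character outside the listed set, such as x or a digit, is rejected rather than silently mapped.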
      • getDataRecord

        public ABIFParser.TaggedDataRecord getDataRecord​(java.lang.String tagName,
                                                         int tagNumber)
                                                  throws java.lang.IllegalArgumentException,
                                                         java.lang.IllegalStateException
        Gets the entry from the file's table of contents (TOC) with the given tag name and tag number.
        Parameters:
        tagName - the four-character string name of the desired data record
        tagNumber - which one of the tags with this name to return (must be positive)
        Returns:
        the requested data record, or null if no such record exists
        Throws:
        java.lang.IllegalArgumentException - if tagName is the wrong length or tagNumber is 0 or negative
        java.lang.IllegalStateException - if the initial parsing is not complete
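The contract described above (four-character name, positive number, null for a missing record, an exception before parsing completes) can be sketched without the library. RecordLookupSketch is hypothetical, record values are plain strings for brevity, and both the order of the checks and the internal key format are assumptions rather than documented behaviour.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical lookup table mirroring the documented getDataRecord contract. */
final class RecordLookupSketch {
    private final Map<String, String> records = new HashMap<>();
    private boolean parsed = false;

    void put(String tagName, int tagNumber, String value) {
        records.put(tagName + tagNumber, value);
    }

    /** Marks the (simulated) initial parse as finished. */
    void finishParsing() {
        parsed = true;
    }

    /** Returns the stored value, or null if no such record exists. */
    String get(String tagName, int tagNumber) {
        if (!parsed) {
            throw new IllegalStateException("initial parsing is not complete");
        }
        if (tagName == null || tagName.length() != 4) {
            throw new IllegalArgumentException("tagName must be four characters");
        }
        if (tagNumber < 1) {
            throw new IllegalArgumentException("tagNumber must be positive");
        }
        return records.get(tagName + tagNumber);
    }
}
```

Because a missing record yields null rather than an exception, callers should check the result before dereferencing it.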
      • getAllDataRecords

        public java.util.Map getAllDataRecords()
        Obtain all data records. Keys of the map are strings consisting of tag names with tag numbers concatenated immediately afterwards. Values are TaggedDataRecord objects. The map has no particular order and so cannot be relied on to iterate over records in the same order they were read from the file.
        Returns:
        the map of all data records.
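Since the map is unordered, code that needs a repeatable iteration order has to impose one. The sketch below builds and splits keys of the documented form (tag name followed immediately by tag number) and sorts them by name, then numerically by number. RecordKeySketch and its methods are hypothetical, and it assumes every tag name is exactly four characters. The result is a deterministic order, not the order in which records appeared in the file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helpers for keys such as "DATA9" (tag name + tag number). */
final class RecordKeySketch {
    static String key(String tagName, int tagNumber) {
        return tagName + tagNumber;
    }

    static String nameOf(String key) {
        return key.substring(0, 4);
    }

    static int numberOf(String key) {
        return Integer.parseInt(key.substring(4));
    }

    /** Returns the map's keys sorted by tag name, then by tag number. */
    static List<String> sortedKeys(Map<String, ?> records) {
        List<String> keys = new ArrayList<>(records.keySet());
        keys.sort((x, y) -> {
            int byName = nameOf(x).compareTo(nameOf(y));
            return byName != 0
                    ? byName
                    : Integer.compare(numberOf(x), numberOf(y));
        });
        return keys;
    }
}
```

Sorting on the parsed number matters: a plain string sort would place "DATA10" before "DATA9".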