Class SamFileHeaderMerger


  • public class SamFileHeaderMerger
    extends Object
    Merges SAMFileHeaders that have the same sequences into a single merged header object while providing read group translation for cases where read groups clash across input headers.
    • Constructor Detail

      • SamFileHeaderMerger

        public SamFileHeaderMerger​(SAMFileHeader.SortOrder sortOrder,
                                   Collection<SAMFileHeader> headers,
                                   boolean mergeDictionaries)
        Create SAMFileHeader with additional information. This is the preferred constructor.
        Parameters:
        sortOrder - sort order new header should have
        headers - sam file headers to combine
        mergeDictionaries - If true, merge sequence dictionaries in new header. If false, require that all input sequence dictionaries be identical.
    • Method Detail

      • positiveFourDigitBase36Str

        public static String positiveFourDigitBase36Str​(int leftOver)
        Converts an integer to base 36. Exposed solely for testing.
        Parameters:
        leftOver - Both the initial value and the running quotient
        Returns:
        A four digit string composed of base 36 symbols
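        The idea of a fixed-width base 36 string can be illustrated with a minimal, self-contained sketch. This is not htsjdk's implementation: the class name Base36Sketch, the helper toFourDigitBase36, the left zero-padding, the upper-case digits, and the wrap-around at 36^4 are all assumptions made for illustration.

```java
public class Base36Sketch {

    // Number of distinct four-digit base 36 strings: 36^4 = 1,679,616.
    private static final int MODULUS = 36 * 36 * 36 * 36;

    /**
     * Hypothetical helper (not part of htsjdk): renders a non-negative int
     * as exactly four base 36 symbols, 0-9 followed by A-Z.
     * Values of 36^4 or more wrap around (an assumption of this sketch).
     */
    public static String toFourDigitBase36(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        // Integer.toString(int, radix) yields lower-case digits for radix 36.
        String digits = Integer.toString(value % MODULUS, 36).toUpperCase();
        StringBuilder padded = new StringBuilder();
        for (int i = digits.length(); i < 4; i++) {
            padded.append('0'); // left-pad to a fixed width of four
        }
        return padded.append(digits).toString();
    }

    public static void main(String[] args) {
        System.out.println(toFourDigitBase36(0));       // prints 0000
        System.out.println(toFourDigitBase36(35));      // prints 000Z
        System.out.println(toFourDigitBase36(36));      // prints 0010
        System.out.println(toFourDigitBase36(1679615)); // prints ZZZZ
    }
}
```

        The digit order and padding of the real method may differ; check positiveFourDigitBase36Str itself before relying on a particular output format.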
      • getReadGroupId

        public String getReadGroupId​(SAMFileHeader header,
                                     String originalReadGroupId)
        Returns the read group ID that should be used for the given input header and original read group ID.
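        The translation this method performs can be pictured as a per-header lookup table from original ID to merged ID. The following is a simplified conceptual model, not htsjdk code: the class ReadGroupIdTranslationSketch, the use of a string label to stand in for a SAMFileHeader, and the numeric ".N" renaming scheme are all hypothetical. It also treats any repeated ID as a clash, whereas a real merger may decide that two read groups with the same ID are the same group.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ReadGroupIdTranslationSketch {

    // header label -> (original read group ID -> ID used in the merged header)
    private final Map<String, Map<String, String>> translation = new LinkedHashMap<>();
    // every ID already claimed in the merged header
    private final Set<String> usedIds = new HashSet<>();
    private boolean collisions = false;

    /** Registers one input header, renaming any ID that is already taken. */
    public void addHeader(String headerLabel, List<String> readGroupIds) {
        Map<String, String> perHeader = new LinkedHashMap<>();
        for (String id : readGroupIds) {
            String merged = id;
            int suffix = 1;
            while (usedIds.contains(merged)) {
                collisions = true;
                merged = id + "." + suffix++; // hypothetical renaming scheme
            }
            usedIds.add(merged);
            perHeader.put(id, merged);
        }
        translation.put(headerLabel, perHeader);
    }

    /** Looks up the merged ID for an ID as it appeared in one input header. */
    public String getReadGroupId(String headerLabel, String originalReadGroupId) {
        Map<String, String> perHeader = translation.get(headerLabel);
        if (perHeader == null) {
            throw new IllegalArgumentException("unknown header: " + headerLabel);
        }
        return perHeader.get(originalReadGroupId);
    }

    /** True if at least one ID had to be renamed. */
    public boolean hasReadGroupCollisions() {
        return collisions;
    }
}
```

        In this model, registering header "a" with read group "rg1" and then header "b" with "rg1" and "rg2" leaves "a"/"rg1" unchanged, maps "b"/"rg1" to "rg1.1", and reports a collision.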
      • getProgramGroupId

        @Deprecated
        public String getProgramGroupId​(SamReader reader,
                                        String originalProgramGroupId)
        Parameters:
        reader - one of the input files
        originalProgramGroupId - a program group ID from the above input file
        Returns:
        new ID from the merged list of program groups in the output file
      • getProgramGroupId

        public String getProgramGroupId​(SAMFileHeader header,
                                        String originalProgramGroupId)
        Parameters:
        header - one of the input headers
        originalProgramGroupId - a program group ID from the above input header
        Returns:
        new ID from the merged list of program groups in the output file
      • hasReadGroupCollisions

        public boolean hasReadGroupCollisions()
        Returns true if there are read group duplicates within the merged headers.
      • hasProgramGroupCollisions

        public boolean hasProgramGroupCollisions()
        Returns true if there are program group duplicates within the merged headers.
      • hasMergedSequenceDictionary

        public boolean hasMergedSequenceDictionary()
        Returns:
        true if the sequence dictionaries have been merged
      • getMergedHeader

        public SAMFileHeader getMergedHeader()
        Returns the merged header that should be written to any output merged file.
      • getHeaders

        public Collection<SAMFileHeader> getHeaders()
        Returns the collection of headers that this header merger is working with.
      • getMergedSequenceIndex

        @Deprecated
        public Integer getMergedSequenceIndex​(SamReader reader,
                                              Integer oldReferenceSequenceIndex)
        Returns the new mapping for a specified reader, given its old sequence index.
        Parameters:
        reader - the reader
        oldReferenceSequenceIndex - the old sequence (also called reference) index
        Returns:
        the new index value
      • getMergedSequenceIndex

        public Integer getMergedSequenceIndex​(SAMFileHeader header,
                                              Integer oldReferenceSequenceIndex)
        Another mechanism for getting the new sequence index, for situations in which the reader is not available. Note that if the SAMRecord has already had its header replaced with the merged header, this won't work.
        Parameters:
        header - The original header for the input record in question.
        oldReferenceSequenceIndex - The original sequence index.
        Returns:
        the new index value that is compatible with the merged sequence index.
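        Sequence index remapping can be modelled as one lookup table per input header, built while the merged dictionary is assembled. The sketch below is a conceptual illustration only, not htsjdk's algorithm: the class SequenceIndexRemapSketch, the string label standing in for a SAMFileHeader, and the "append unseen names in order of first appearance" merge rule are all assumptions. A real merger must also check that the input dictionaries are compatible.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SequenceIndexRemapSketch {

    // sequence names of the merged dictionary, in merged order
    private final List<String> merged = new ArrayList<>();
    // header label -> table where table[oldIndex] == newIndex
    private final Map<String, int[]> remap = new HashMap<>();

    /** Registers one input dictionary and records its old-to-new index table. */
    public void addDictionary(String headerLabel, List<String> sequenceNames) {
        int[] table = new int[sequenceNames.size()];
        for (int oldIndex = 0; oldIndex < sequenceNames.size(); oldIndex++) {
            String name = sequenceNames.get(oldIndex);
            int newIndex = merged.indexOf(name);
            if (newIndex < 0) {
                merged.add(name);             // first time this sequence is seen
                newIndex = merged.size() - 1;
            }
            table[oldIndex] = newIndex;
        }
        remap.put(headerLabel, table);
    }

    /** Translates an index from one input dictionary into the merged dictionary. */
    public int getMergedSequenceIndex(String headerLabel, int oldIndex) {
        int[] table = remap.get(headerLabel);
        if (table == null) {
            // The lookup is keyed by the ORIGINAL header, so a header that was
            // never registered (for example, the merged one) cannot be resolved.
            throw new IllegalArgumentException("unknown header: " + headerLabel);
        }
        return table[oldIndex];
    }
}
```

        The failure branch mirrors the caveat above: because the table is keyed by the original header, a record whose header has already been swapped for the merged header no longer carries the key needed for the lookup.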