Class GeneFeatureHelper


  • public class GeneFeatureHelper
    extends java.lang.Object
    Author:
    Scooter Willis
    • Constructor Detail

      • GeneFeatureHelper

        public GeneFeatureHelper()
    • Method Detail

      • loadFastaAddGeneFeaturesFromUpperCaseExonFastaFile

        public static java.util.LinkedHashMap<java.lang.String, ChromosomeSequence> loadFastaAddGeneFeaturesFromUpperCaseExonFastaFile(java.io.File fastaSequenceFile,
                java.io.File uppercaseFastaFile,
                boolean throwExceptionGeneNotFound)
                throws java.lang.Exception
        Throws:
        java.lang.Exception
      • outputFastaSequenceLengthGFF3

        public static void outputFastaSequenceLengthGFF3(java.io.File fastaSequenceFile,
                java.io.File gffFile)
                throws java.lang.Exception
        Writes a GFF3 feature file that gives the length of each scaffold or chromosome in the FASTA file. GBrowse uses this file to determine sequence lengths.
        Parameters:
        fastaSequenceFile - the FASTA file whose sequence lengths are reported
        gffFile - the GFF3 file to write
        Throws:
        java.lang.Exception
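        The idea behind this method can be illustrated with a small standalone sketch. It is not the BioJava implementation: the class name FastaLengthGff3Sketch, the "sketch" source column, and the "region" feature type are assumptions chosen for illustration, and the actual output of outputFastaSequenceLengthGFF3 may use different values. The sketch sums the residues under each FASTA header and emits one nine-column GFF3 row per sequence spanning positions 1 to its length.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of BioJava.
final class FastaLengthGff3Sketch {

    /** Returns sequence lengths keyed by the first word of each FASTA header, in file order. */
    public static Map<String, Integer> sequenceLengths(BufferedReader reader) throws IOException {
        Map<String, Integer> lengths = new LinkedHashMap<>();
        String current = null;
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith(">")) {
                // Use the first whitespace-delimited token of the header as the sequence id.
                current = line.substring(1).trim().split("\\s+")[0];
                lengths.put(current, 0);
            } else if (current != null) {
                lengths.merge(current, line.length(), Integer::sum);
            }
        }
        return lengths;
    }

    /** Formats one GFF3 row per sequence; source and type values are assumed, not BioJava's. */
    public static String toGff3(Map<String, Integer> lengths) {
        StringBuilder sb = new StringBuilder("##gff-version 3\n");
        for (Map.Entry<String, Integer> e : lengths.entrySet()) {
            sb.append(String.join("\t",
                    e.getKey(), "sketch", "region", "1", String.valueOf(e.getValue()),
                    ".", ".", ".", "ID=" + e.getKey()))
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String fasta = ">scaffold_1 demo\nACGTAC\nGT\n>scaffold_2\nTTTT\n";
        BufferedReader reader = new BufferedReader(new StringReader(fasta));
        System.out.print(toGff3(sequenceLengths(reader)));
    }
}
```

        A genome browser can read such rows to learn the extent of each reference sequence without loading the FASTA file itself.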
      • loadFastaAddGeneFeaturesFromGeneIDGFF2

        public static java.util.LinkedHashMap<java.lang.String, ChromosomeSequence> loadFastaAddGeneFeaturesFromGeneIDGFF2(java.io.File fastaSequenceFile,
                java.io.File gffFile)
                throws java.lang.Exception
        Loads a FASTA file and a GFF2 feature file generated by the geneid prediction algorithm.
        Parameters:
        fastaSequenceFile - the FASTA file containing the sequences
        gffFile - the GFF2 feature file generated by geneid
        Returns:
        the chromosome sequences with gene features added
        Throws:
        java.lang.Exception
      • addGeneIDGFF2GeneFeatures

        public static void addGeneIDGFF2GeneFeatures(java.util.LinkedHashMap<java.lang.String, ChromosomeSequence> chromosomeSequenceList,
                FeatureList listGenes)
                throws java.lang.Exception
        Maps the features of a GFF2 file generated by the geneid prediction algorithm onto the chromosome sequences.
        Parameters:
        chromosomeSequenceList - the chromosome sequences to annotate
        listGenes - the gene features to map onto the sequences
        Throws:
        java.lang.Exception
      • getChromosomeSequenceFromDNASequence

        public static java.util.LinkedHashMap<java.lang.String, ChromosomeSequence> getChromosomeSequenceFromDNASequence(java.util.LinkedHashMap<java.lang.String, DNASequence> dnaSequenceList)
      • loadFastaAddGeneFeaturesFromGmodGFF3

        public static java.util.LinkedHashMap<java.lang.String, ChromosomeSequence> loadFastaAddGeneFeaturesFromGmodGFF3(java.io.File fastaSequenceFile,
                java.io.File gffFile,
                boolean lazyloadsequences)
                throws java.lang.Exception
        GFF3 allows many variations in the ontology terms and descriptors it uses, so a GFF3 file generated or used by a specific application requires a custom parser. This could probably be abstracted out, but for now custom code is the easier way to handle GFF3 elements that are not included in the file but can be extracted from other data elements.
        Parameters:
        fastaSequenceFile - the FASTA file containing the sequences
        gffFile - the GFF3 feature file
        lazyloadsequences - if true, the FASTA file is parsed for accession IDs only, and sequences are read from disk when needed to save memory
        Returns:
        the chromosome sequences with gene features added
        Throws:
        java.lang.Exception
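        The memory trade-off behind lazyloadsequences can be pictured with a standalone sketch. It is not how BioJava implements lazy loading: the class LazyFastaSketch and its methods are hypothetical. The sketch scans the FASTA file once, keeps only each accession and the byte offset where its sequence starts, and reads the residues from disk when a sequence is requested.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of BioJava.
final class LazyFastaSketch {
    private final Path file;
    // Accession -> byte offset of the first sequence line after the header.
    private final Map<String, Long> offsets = new LinkedHashMap<>();

    /** Indexes the headers only; no sequence data is kept in memory. */
    public LazyFastaSketch(Path file) throws IOException {
        this.file = file;
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            String line;
            while ((line = raf.readLine()) != null) {
                if (line.startsWith(">")) {
                    String accession = line.substring(1).trim().split("\\s+")[0];
                    offsets.put(accession, raf.getFilePointer());
                }
            }
        }
    }

    public Set<String> accessions() {
        return offsets.keySet();
    }

    /** Reads one sequence from disk on demand, stopping at the next header or end of file. */
    public String sequence(String accession) throws IOException {
        Long offset = offsets.get(accession);
        if (offset == null) {
            throw new IllegalArgumentException("Unknown accession: " + accession);
        }
        StringBuilder sb = new StringBuilder();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(offset);
            String line;
            while ((line = raf.readLine()) != null && !line.startsWith(">")) {
                sb.append(line.trim());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("lazy-sketch", ".fasta");
        Files.write(tmp, ">s1 demo\nACGT\nAC\n>s2\nTT\n".getBytes(StandardCharsets.US_ASCII));
        LazyFastaSketch index = new LazyFastaSketch(tmp);
        System.out.println(index.accessions());
        System.out.println(index.sequence("s1"));
        Files.delete(tmp);
    }
}
```

        RandomAccessFile.readLine is unbuffered, so a production reader would index with a buffered stream; the sketch favours brevity over speed.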
      • addGmodGFF3GeneFeatures

        public static void addGmodGFF3GeneFeatures(java.util.LinkedHashMap<java.lang.String, ChromosomeSequence> chromosomeSequenceList,
                FeatureList listGenes)
                throws java.lang.Exception
        Adds GFF3 gene features to the chromosome sequences, using mRNA as the gene feature because not all GFF3 files are complete.
        Parameters:
        chromosomeSequenceList - the chromosome sequences to annotate
        listGenes - the GFF3 features to map onto the sequences
        Throws:
        java.lang.Exception
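        Treating mRNA as the gene feature can be sketched as follows, for a GFF3 file that has mRNA and CDS rows but no gene rows. This is a standalone illustration, not the BioJava parser: the class MrnaAsGeneSketch and its methods are hypothetical, and it assumes that every CDS names its mRNA in the Parent attribute.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of BioJava.
final class MrnaAsGeneSketch {

    /** Returns the value of one key in a GFF3 attributes column, or null if the key is absent. */
    public static String attribute(String attributes, String key) {
        for (String pair : attributes.split(";")) {
            String[] kv = pair.trim().split("=", 2);
            if (kv.length == 2 && kv[0].equals(key)) {
                return kv[1];
            }
        }
        return null;
    }

    /** Groups CDS coordinates ("start-end") under the mRNA ID named by each CDS Parent attribute. */
    public static Map<String, List<String>> cdsByMrna(List<String> gffLines) {
        Map<String, List<String>> genes = new LinkedHashMap<>();
        for (String line : gffLines) {
            if (line.startsWith("#")) {
                continue; // directive or comment
            }
            String[] col = line.split("\t");
            if (col.length < 9) {
                continue; // not a complete nine-column GFF3 row
            }
            if (col[2].equals("mRNA")) {
                String id = attribute(col[8], "ID");
                if (id != null) {
                    genes.putIfAbsent(id, new ArrayList<>());
                }
            } else if (col[2].equals("CDS")) {
                String parents = attribute(col[8], "Parent");
                if (parents == null) {
                    continue;
                }
                // Parent may list several comma-separated IDs; assumed here to be mRNA IDs.
                for (String parent : parents.split(",")) {
                    genes.computeIfAbsent(parent, k -> new ArrayList<>())
                         .add(col[3] + "-" + col[4]);
                }
            }
        }
        return genes;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "##gff-version 3",
                "chr1\tdemo\tmRNA\t1\t300\t.\t+\t.\tID=mrna1;Name=example",
                "chr1\tdemo\tCDS\t1\t100\t.\t+\t0\tID=cds1;Parent=mrna1",
                "chr1\tdemo\tCDS\t201\t300\t.\t+\t2\tID=cds2;Parent=mrna1");
        System.out.println(cdsByMrna(lines));
    }
}
```

        Keying on mRNA rows means a transcript still becomes a usable gene model when the file omits the enclosing gene row.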
      • loadFastaAddGeneFeaturesFromGlimmerGFF3

        public static java.util.LinkedHashMap<java.lang.String, ChromosomeSequence> loadFastaAddGeneFeaturesFromGlimmerGFF3(java.io.File fastaSequenceFile,
                java.io.File gffFile)
                throws java.lang.Exception
        Throws:
        java.lang.Exception
      • addGlimmerGFF3GeneFeatures

        public static void addGlimmerGFF3GeneFeatures(java.util.LinkedHashMap<java.lang.String, ChromosomeSequence> chromosomeSequenceList,
                FeatureList listGenes)
                throws java.lang.Exception
        Throws:
        java.lang.Exception
      • loadFastaAddGeneFeaturesFromGeneMarkGTF

        public static java.util.LinkedHashMap<java.lang.String, ChromosomeSequence> loadFastaAddGeneFeaturesFromGeneMarkGTF(java.io.File fastaSequenceFile,
                java.io.File gffFile)
                throws java.lang.Exception
        Throws:
        java.lang.Exception
      • addGeneMarkGTFGeneFeatures

        public static void addGeneMarkGTFGeneFeatures(java.util.LinkedHashMap<java.lang.String, ChromosomeSequence> chromosomeSequenceList,
                FeatureList listGenes)
                throws java.lang.Exception
        Throws:
        java.lang.Exception
      • getProteinSequences

        public static java.util.LinkedHashMap<java.lang.String, ProteinSequence> getProteinSequences(java.util.Collection<ChromosomeSequence> chromosomeSequences)
                throws java.lang.Exception
        Throws:
        java.lang.Exception
      • getGeneSequences

        public static java.util.LinkedHashMap<java.lang.String, GeneSequence> getGeneSequences(java.util.Collection<ChromosomeSequence> chromosomeSequences)
                throws java.lang.Exception
        Throws:
        java.lang.Exception
      • main

        public static void main(java.lang.String[] args)
                throws java.lang.Exception
        Throws:
        java.lang.Exception