Package org.biojava.bio.structure.io
Class PDBFileParser
- java.lang.Object
-
- org.biojava.bio.structure.io.PDBFileParser
-
public class PDBFileParser extends java.lang.Object
This class implements the actual PDB file parsing. Do not access it directly; use the PDBFileReader class instead.
Parsing
During PDB file parsing, several flags can be set:
setParseCAOnly(boolean)
- parse only the ATOM records for C-alpha atoms.
setParseSecStruc(boolean)
- whether the secondary structure information from the PDB file (the author's assignment) should be parsed. If true, the assignment can be accessed through AminoAcid.getSecStruc().
setAlignSeqRes(boolean)
- whether the amino acid sequences from the SEQRES and ATOM records of a PDB file should be aligned (default: yes).
To prevent excessive memory usage for large PDB files, there is the ATOM_CA_THRESHOLD. If more atoms than this threshold are parsed from a PDB file, the parser automatically switches to a C-alpha-only representation.
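The threshold rule described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the BioJava implementation: the class `ThresholdSketch`, the method `keepAtoms`, and the tiny limit `CA_THRESHOLD` are hypothetical, and the sketch assumes that atoms already collected are reduced to C-alpha atoms once the limit is exceeded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; this is not the BioJava implementation.
class ThresholdSketch {
    // Hypothetical limit chosen for the demo; the real ATOM_CA_THRESHOLD
    // value is listed under "Constant Field Values".
    static final int CA_THRESHOLD = 5;

    /** Returns the atom names kept after applying the threshold rule. */
    static List<String> keepAtoms(List<String> atomNames) {
        List<String> kept = new ArrayList<>();
        boolean caOnly = false;
        int parsed = 0;
        for (String name : atomNames) {
            parsed++;
            if (!caOnly && parsed > CA_THRESHOLD) {
                // Assumption: switching also strips non-CA atoms collected so far.
                caOnly = true;
                kept.removeIf(n -> !n.equals("CA"));
            }
            if (!caOnly || name.equals("CA")) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> twoResidues =
                Arrays.asList("N", "CA", "C", "O", "N", "CA", "C", "O");
        System.out.println(keepAtoms(twoResidues));
    }
}
```

With an input below the limit, every atom is kept; once the limit is crossed, only the CA entries survive.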
The result of parsing the PDB file is a new Structure object.
For more documentation on how to work with the Structure API, please see http://biojava.org/wiki/BioJava:CookBook#Protein_Structure
Example
Q: How can I get a Structure object from a PDB file?
A:
import java.io.IOException;
import org.biojava.bio.structure.Structure;
import org.biojava.bio.structure.io.PDBFileReader;

public Structure loadStructure(String pathToPDBFile) {
    // The PDBFileParser is wrapped by the PDBFileReader
    PDBFileReader pdbreader = new PDBFileReader();
    Structure structure = null;
    try {
        // Read and parse the file, then print a summary of the result
        structure = pdbreader.getStructure(pathToPDBFile);
        System.out.println(structure);
    } catch (IOException e) {
        e.printStackTrace();
    }
    return structure;
}
- Since:
- 1.4
- Author:
- Andreas Prlic, Jules Jacobsen
-
-
-
Field Summary
Fields
| Modifier and Type | Field | Description |
|---|---|---|
| static int | ATOM_CA_THRESHOLD | The maximum number of atoms that will be parsed before the parser switches to a CA-only representation of the PDB file. |
| static java.lang.String | HELIX | Helix secondary structure assignment. |
| static int | MAX_ATOMS | The maximum number of atoms we will add to a structure. This protects against memory overflows in the few really big protein structures. |
| boolean | parseCAOnly | Set the flag to only read in C-alpha (CA) atoms. This is useful for parsing large structures like 1htq. |
| static java.lang.String | PDB_AUTHOR_ASSIGNMENT | Secondary structure assigned by the PDB author. |
| static java.lang.String | STRAND | Strand secondary structure assignment. |
| static java.lang.String | TURN | Turn secondary structure assignment. |
-
Constructor Summary
Constructors
| Constructor | Description |
|---|---|
| PDBFileParser() | |
-
Method Summary
All Methods / Instance Methods / Concrete Methods
| Modifier and Type | Method | Description |
|---|---|---|
| protected java.lang.String | getTimeStamp() | Returns a time stamp. |
| boolean | isAlignSeqRes() | Flag indicating whether the SEQRES amino acids should be aligned with the ATOM amino acids. |
| boolean | isParseCAOnly() | The flag indicating whether only the C-alpha atoms of the structure should be parsed. |
| boolean | isParseSecStruc() | Is the secondary structure assignment being parsed from the file? Default is false. |
| void | linkChains2Compound(Structure s) | |
| Structure | parsePDBFile(java.io.BufferedReader buf) | Parse a PDB file and return a data structure implementing the PDBStructure interface. |
| Structure | parsePDBFile(java.io.InputStream inStream) | Parse a PDB file and return a data structure implementing the PDBStructure interface. |
| void | setAlignSeqRes(boolean alignSeqRes) | Define whether the SEQRES in the structure should be aligned with the ATOM records. If yes, the AminoAcids in structure.getSeqRes will have the coordinates set. |
| void | setParseCAOnly(boolean parseCAOnly) | The flag indicating whether only the C-alpha atoms of the structure should be parsed. |
| void | setParseSecStruc(boolean parseSecStruc) | A flag to tell the parser to parse the author's secondary structure assignment from the file. Default is false, i.e. do NOT parse. |
-
-
-
Field Detail
-
PDB_AUTHOR_ASSIGNMENT
public static final java.lang.String PDB_AUTHOR_ASSIGNMENT
Secondary structure assigned by the PDB author.
- See Also:
- Constant Field Values
-
HELIX
public static final java.lang.String HELIX
Helix secondary structure assignment.
- See Also:
- Constant Field Values
-
STRAND
public static final java.lang.String STRAND
Strand secondary structure assignment.
- See Also:
- Constant Field Values
-
TURN
public static final java.lang.String TURN
Turn secondary structure assignment.
- See Also:
- Constant Field Values
-
ATOM_CA_THRESHOLD
public static final int ATOM_CA_THRESHOLD
The maximum number of atoms that will be parsed before the parser switches to a CA-only representation of the PDB file. If this limit is exceeded, the SEQRES groups will also be ignored.
- See Also:
- Constant Field Values
-
MAX_ATOMS
public static final int MAX_ATOMS
The maximum number of atoms we will add to a structure. This protects against memory overflows in the few really big protein structures.
- See Also:
- Constant Field Values
-
parseCAOnly
public boolean parseCAOnly
Set the flag to only read in C-alpha (CA) atoms. This is useful for parsing large structures like 1htq.
-
-
Method Detail
-
isParseCAOnly
public boolean isParseCAOnly()
The flag indicating whether only the C-alpha atoms of the structure should be parsed.
- Returns:
- the flag
-
setParseCAOnly
public void setParseCAOnly(boolean parseCAOnly)
The flag indicating whether only the C-alpha atoms of the structure should be parsed.
- Parameters:
parseCAOnly
- boolean flag to enable or disable C-alpha only parsing
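To make the effect of C-alpha-only parsing concrete, here is a minimal sketch of filtering ATOM records by atom name. It is not the parser's actual code: the class `CaOnlyFilterSketch` and its methods are hypothetical, and the sketch assumes the fixed-column PDB layout in which the record name occupies columns 1-6 and the atom name columns 13-16.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; this is not the BioJava implementation.
class CaOnlyFilterSketch {
    /** True if the line is an ATOM record whose atom name (columns 13-16) is CA. */
    static boolean isCAlphaRecord(String line) {
        if (line.length() < 16 || !line.startsWith("ATOM  ")) {
            return false; // HETATM and other records are skipped, e.g. calcium ions
        }
        String atomName = line.substring(12, 16).trim();
        return atomName.equals("CA");
    }

    /** Keeps only the C-alpha ATOM records from the given PDB lines. */
    static List<String> filterCAlpha(List<String> pdbLines) {
        List<String> kept = new ArrayList<>();
        for (String line : pdbLines) {
            if (isCAlphaRecord(line)) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "ATOM      1  N   MET A   1",
                "ATOM      2  CA  MET A   1",
                "HETATM  500 CA    CA A 101");
        System.out.println(filterCAlpha(lines).size());
    }
}
```

Restricting the check to ATOM records keeps a calcium HETATM entry, which also carries the name CA, out of the result.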
-
isAlignSeqRes
public boolean isAlignSeqRes()
Flag indicating whether the SEQRES amino acids should be aligned with the ATOM amino acids.
- Returns:
- flag if SEQRES - ATOM amino acids alignment is enabled
-
setAlignSeqRes
public void setAlignSeqRes(boolean alignSeqRes)
Define whether the SEQRES in the structure should be aligned with the ATOM records. If yes, the AminoAcids in structure.getSeqRes will have the coordinates set.
- Parameters:
alignSeqRes
-
-
isParseSecStruc
public boolean isParseSecStruc()
Is the secondary structure assignment being parsed from the file? Default is false.
- Returns:
- true if the HELIX, STRAND and TURN fields are being parsed
-
setParseSecStruc
public void setParseSecStruc(boolean parseSecStruc)
A flag to tell the parser to parse the author's secondary structure assignment from the file. Default is false, i.e. do NOT parse.
- Parameters:
parseSecStruc
- true if the HELIX, STRAND and TURN fields should be parsed
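As a rough illustration of what reading an author-assigned helix involves, the sketch below extracts the residue range from a single HELIX record. It is not taken from PDBFileParser: the class `HelixRecordSketch` and the method `helixRange` are hypothetical, and the column positions (initial residue number in columns 22-25, end residue number in columns 34-37) are assumed from the fixed-column PDB file format.

```java
import java.util.Arrays;

// Illustrative sketch only; this is not the BioJava implementation.
class HelixRecordSketch {
    /**
     * Returns {initSeqNum, endSeqNum} for a HELIX record,
     * or null if the line is not a sufficiently long HELIX record.
     */
    static int[] helixRange(String line) {
        if (!line.startsWith("HELIX ") || line.length() < 37) {
            return null;
        }
        // Columns 22-25 and 34-37 (1-based) hold the first and last residue numbers.
        int start = Integer.parseInt(line.substring(21, 25).trim());
        int end = Integer.parseInt(line.substring(33, 37).trim());
        return new int[] {start, end};
    }

    public static void main(String[] args) {
        String record = "HELIX    1  HA GLY A   86  GLY A   94";
        System.out.println(Arrays.toString(helixRange(record)));
    }
}
```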
-
getTimeStamp
protected java.lang.String getTimeStamp()
Returns a time stamp.
- Returns:
- a String representing the time stamp value
-
parsePDBFile
public Structure parsePDBFile(java.io.InputStream inStream) throws java.io.IOException
Parse a PDB file and return a data structure implementing the PDBStructure interface.
- Parameters:
inStream
- an InputStream object
- Returns:
- a Structure object
- Throws:
java.io.IOException
-
parsePDBFile
public Structure parsePDBFile(java.io.BufferedReader buf) throws java.io.IOException
Parse a PDB file and return a data structure implementing the PDBStructure interface.
- Parameters:
buf
- a BufferedReader object
- Returns:
- the Structure object
- Throws:
java.io.IOException
- ...
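The two parsePDBFile overloads accept either an InputStream or a BufferedReader. A common way to structure such a pair is for the stream variant to wrap its input in a reader and delegate; whether PDBFileParser does exactly this is an assumption. The sketch below uses only the standard library, with a hypothetical `countAtomRecords` method standing in for the real parsing step.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; countAtomRecords is a stand-in for real parsing.
class OverloadSketch {
    /** Reader variant: does the line-by-line work. */
    static int countAtomRecords(BufferedReader buf) throws IOException {
        int count = 0;
        String line;
        while ((line = buf.readLine()) != null) {
            if (line.startsWith("ATOM  ")) {
                count++;
            }
        }
        return count;
    }

    /** Stream variant: wraps the stream in a reader and delegates. */
    static int countAtomRecords(InputStream inStream) throws IOException {
        BufferedReader buf = new BufferedReader(
                new InputStreamReader(inStream, StandardCharsets.US_ASCII));
        return countAtomRecords(buf);
    }

    public static void main(String[] args) throws IOException {
        String pdb = "HEADER    DEMO\n"
                + "ATOM      1  N   MET A   1\n"
                + "ATOM      2  CA  MET A   1\n"
                + "END\n";
        InputStream in = new ByteArrayInputStream(pdb.getBytes(StandardCharsets.US_ASCII));
        System.out.println(countAtomRecords(in));
    }
}
```

Keeping the logic in the reader variant means callers with a file, a socket, or an in-memory string all go through the same code path.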
-
linkChains2Compound
public void linkChains2Compound(Structure s)
- Parameters:
s
- the structure
-
-