libStatGen Software  1
CigarRoller Class Reference

This class provides accessors for setting, updating, and modifying the CIGAR object. It is a child class of Cigar.

#include <CigarRoller.h>


Public Member Functions

 CigarRoller ()
 Default constructor initializes as a CIGAR with no operations.
 
 CigarRoller (const char *cigarString)
 Constructor that initializes the object with the specified cigarString.
 
CigarRoller & operator+= (CigarRoller &rhs)
 Add the contents of the specified CigarRoller to this object.
 
CigarRoller & operator+= (const CigarOperator &rhs)
 Append the specified operator to this object.
 
CigarRoller & operator= (CigarRoller &rhs)
 Set this object to be equal to the specified CigarRoller.
 
void Add (Operation operation, int count)
 Append the specified operation with the specified count to this object.
 
void Add (char operation, int count)
 Append the specified operation with the specified count to this object.
 
void Add (const char *cigarString)
 Append the specified cigarString to this object.
 
void Add (CigarRoller &rhs)
 Append the specified Cigar object to this object.
 
bool Remove (int index)
 Remove the operation at the specified index.
 
bool IncrementCount (int index, int increment)
 Increments the count for the operation at the specified index by the specified value; specify a negative value to decrement.
 
bool Update (int index, Operation op, int count)
 Updates the operation at the specified index to be the specified operation and have the specified count.
 
void Set (const char *cigarString)
 Sets this object to the specified cigarString.
 
void Set (const uint32_t *cigarBuffer, uint16_t bufferLen)
 Sets this object to the BAM formatted cigar found at the beginning of the specified buffer which is bufferLen long.
 
int getMatchPositionOffset ()
 DEPRECATED - do not use, there are better ways to accomplish that by using read lengths, reference lengths, span of the read, etc.
 
const char * getString ()
 Get the string representation of the Cigar operations in this object; the caller must delete the returned value.
 
void clear ()
 Clear this object so that it has no Cigar Operations.
 
- Public Member Functions inherited from Cigar
 Cigar ()
 Default constructor initializes as a CIGAR with no operations.
 
void getCigarString (String &cigarString) const
 Set the passed in String to the string representation of the Cigar operations in this object.
 
void getCigarString (std::string &cigarString) const
 Set the passed in std::string to the string representation of the Cigar operations in this object.
 
void getExpandedString (std::string &s) const
 Sets the specified string to a valid CIGAR string of characters that represent the cigar with no digits (a CIGAR of "3M" would return "MMM").
 
const CigarOperator & operator[] (int i) const
 Return the Cigar Operation at the specified index (starting at 0).
 
const CigarOperator & getOperator (int i) const
 Return the Cigar Operation at the specified index (starting at 0).
 
bool operator== (Cigar &rhs) const
 Return true if the 2 Cigars are the same (the same operations of the same sizes).
 
int size () const
 Return the number of cigar operations.
 
void Dump () const
 Write this object as a string to cout.
 
int getExpectedQueryBaseCount () const
 Return the length of the read that corresponds to the current CIGAR string.
 
int getExpectedReferenceBaseCount () const
 Return the number of bases in the reference that this CIGAR "spans".
 
int getNumBeginClips () const
 Return the number of clips that are at the beginning of the cigar.
 
int getNumEndClips () const
 Return the number of clips that are at the end of the cigar.
 
int32_t getRefOffset (int32_t queryIndex)
 Return the reference offset associated with the specified query index or INDEX_NA based on this cigar.
 
int32_t getQueryIndex (int32_t refOffset)
 Return the query index associated with the specified reference offset or INDEX_NA based on this cigar.
 
int32_t getRefPosition (int32_t queryIndex, int32_t queryStartPos)
 Return the reference position associated with the specified query index or INDEX_NA based on this cigar and the specified queryStartPos, which is the leftmost mapping position of the first matching base in the query.
 
int32_t getQueryIndex (int32_t refPosition, int32_t queryStartPos)
 Return the query index or INDEX_NA associated with the specified reference offset when the query starts at the specified reference position.
 
int32_t getExpandedCigarIndexFromQueryIndex (int32_t queryIndex)
 Returns the index into the expanded cigar for the cigar associated with the specified queryIndex.
 
int32_t getExpandedCigarIndexFromRefOffset (int32_t refOffset)
 Returns the index into the expanded cigar for the cigar associated with the specified reference offset.
 
int32_t getExpandedCigarIndexFromRefPos (int32_t refPosition, int32_t queryStartPos)
 Returns the index into the expanded cigar for the cigar associated with the specified reference position and queryStartPos.
 
char getCigarCharOp (int32_t expandedCigarIndex)
 Return the character code of the cigar operator associated with the specified expanded CIGAR index.
 
char getCigarCharOpFromQueryIndex (int32_t queryIndex)
 Return the character code of the cigar operator associated with the specified queryIndex.
 
char getCigarCharOpFromRefOffset (int32_t refOffset)
 Return the character code of the cigar operator associated with the specified reference offset.
 
char getCigarCharOpFromRefPos (int32_t refPosition, int32_t queryStartPos)
 Return the character code of the cigar operator associated with the specified reference position.
 
uint32_t getNumOverlaps (int32_t start, int32_t end, int32_t queryStartPos)
 Return the number of bases that overlap the reference and the read associated with this cigar that fall within the specified region.
 
bool hasIndel ()
 Return whether or not the cigar has indels (insertions or deletions).
 

Friends

std::ostream & operator<< (std::ostream &stream, const CigarRoller &roller)
 Writes all of the cigar operations contained in this roller to the passed in stream.
 

Additional Inherited Members

- Public Types inherited from Cigar
enum  Operation {
  none =0, match, mismatch, insert,
  del, skip, softClip, hardClip,
  pad
}
 Enum for the cigar operations.
 
- Static Public Member Functions inherited from Cigar
static bool foundInReference (Operation op)
 Return true if the specified operation is found in the reference sequence, false if not.
 
static bool foundInReference (char op)
 Return true if the specified operation is found in the reference sequence, false if not.
 
static bool foundInReference (const CigarOperator &op)
 Return true if the specified operation is found in the reference sequence, false if not.
 
static bool foundInQuery (Operation op)
 Return true if the specified operation is found in the query sequence, false if not.
 
static bool foundInQuery (char op)
 Return true if the specified operation is found in the query sequence, false if not.
 
static bool foundInQuery (const CigarOperator &op)
 Return true if the specified operation is found in the query sequence, false if not.
 
static bool isClip (Operation op)
 Return true if the specified operation is a clipping operation, false if not.
 
static bool isClip (char op)
 Return true if the specified operation is a clipping operation, false if not.
 
static bool isClip (const CigarOperator &op)
 Return true if the specified operation is a clipping operation, false if not.
 
static bool isMatchOrMismatch (Operation op)
 Return true if the specified operation is a match/mismatch operation, false if not.
 
static bool isMatchOrMismatch (const CigarOperator &op)
 Return true if the specified operation is a match/mismatch operation, false if not.
 
- Static Public Attributes inherited from Cigar
static const int MAX_OP_VALUE = pad
 
static const int32_t INDEX_NA = -1
 Value associated with an index that is not applicable/does not exist, used for converting between query and reference indexes/offsets when an associated index/offset does not exist.
 
- Protected Member Functions inherited from Cigar
void clearQueryAndReferenceIndexes ()
 
void setQueryAndReferenceIndexes ()
 
- Protected Attributes inherited from Cigar
std::vector< CigarOperator > cigarOperations
 

Detailed Description

This class provides accessors for setting, updating, and modifying the CIGAR object. It is a child class of Cigar.

Docs from Sam1.pdf:

Clipped alignment. In Smith-Waterman alignment, a sequence may not be aligned from the first residue to the last one. Subsequences at the ends may be clipped off. We introduce operation 'S' to describe (softly) clipped alignment. Here is an example. Suppose the clipped alignment is:

REF:  AGCTAGCATCGTGTCGCCCGTCTAGCATACGCATGATCGACTGTCAGCTAGTCAGACTAGTCGATCGATGTG
READ:        gggGTGTAACC-GACTAGgggg

where on the read sequence, bases in uppercase are matches and bases in lowercase are clipped off. The CIGAR for this alignment is: 3S8M1D6M4S.
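
To make the example concrete, the following standalone C++ sketch parses a text CIGAR and totals the bases it implies for the read and for the reference. It does not use libStatGen: `CigarOp`, `parseCigar`, `queryBaseCount`, and `referenceBaseCount` are hypothetical names chosen for illustration. The rule for which operations count toward each total (M, I, S for the read; M, D, N for the reference, plus `=` and `X` on both sides) is the usual SAM convention and is an assumption here, not something stated on this page.

```cpp
#include <cctype>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper type for this sketch (not libStatGen's CigarOperator).
struct CigarOp {
    char op;         // operation letter, e.g. 'M' or 'S'
    uint32_t count;  // operation length
};

// Parse a text CIGAR such as "3S8M1D6M4S" into (operation, count) pairs.
std::vector<CigarOp> parseCigar(const std::string& cigar) {
    std::vector<CigarOp> ops;
    uint32_t count = 0;
    bool haveDigits = false;
    for (char c : cigar) {
        if (std::isdigit(static_cast<unsigned char>(c))) {
            count = count * 10 + static_cast<uint32_t>(c - '0');
            haveDigits = true;
        } else {
            if (!haveDigits) {
                throw std::invalid_argument("operation without a length");
            }
            ops.push_back({c, count});
            count = 0;
            haveDigits = false;
        }
    }
    if (haveDigits) {
        throw std::invalid_argument("trailing length without an operation");
    }
    return ops;
}

// Number of bases the read itself contains (assumed: M, I, S, =, X).
uint32_t queryBaseCount(const std::vector<CigarOp>& ops) {
    uint32_t total = 0;
    for (const CigarOp& o : ops) {
        switch (o.op) {
            case 'M': case 'I': case 'S': case '=': case 'X':
                total += o.count;
                break;
            default:
                break;
        }
    }
    return total;
}

// Number of reference bases the alignment spans (assumed: M, D, N, =, X).
uint32_t referenceBaseCount(const std::vector<CigarOp>& ops) {
    uint32_t total = 0;
    for (const CigarOp& o : ops) {
        switch (o.op) {
            case 'M': case 'D': case 'N': case '=': case 'X':
                total += o.count;
                break;
            default:
                break;
        }
    }
    return total;
}
```

For 3S8M1D6M4S this gives 21 read bases (3 + 8 + 6 + 4) and a reference span of 15 (8 + 1 + 6), which agrees with the READ line in the example.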

If the mapping position of the query is not available, RNAME and CIGAR are set as “*”.

A CIGAR string is comprised of a series of operation lengths plus the operations. The conventional CIGAR format allows for three types of operations: M for match or mismatch, I for insertion and D for deletion. The extended CIGAR format further allows four more operations, as is shown in the following table, to describe clipping, padding and splicing:

op  Description
M   Match or mismatch
I   Insertion to the reference
D   Deletion from the reference
N   Skipped region from the reference
S   Soft clip on the read (clipped sequence present in <seq>)
H   Hard clip on the read (clipped sequence NOT present in <seq>)
P   Padding (silent deletion from the padded reference sequence)
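
The operation letters in the table are also what the expanded form is built from: each operation is repeated once per unit of its length, so "3M" becomes "MMM", as described for getExpandedString(). The sketch below shows one way to do that expansion on a plain string. It is independent of libStatGen, `expandCigar` is a hypothetical name, and it accepts only the seven letters listed in the table, which is an assumption made for this illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Expand a text CIGAR into one character per base/position,
// e.g. "2S3M1D" -> "SSMMMD". Only the table's letters are accepted.
std::string expandCigar(const std::string& cigar) {
    static const std::string validOps = "MIDNSHP";
    std::string expanded;
    std::size_t count = 0;
    bool haveDigits = false;
    for (char c : cigar) {
        if (std::isdigit(static_cast<unsigned char>(c))) {
            count = count * 10 + static_cast<std::size_t>(c - '0');
            haveDigits = true;
        } else {
            if (!haveDigits || validOps.find(c) == std::string::npos) {
                throw std::invalid_argument("malformed CIGAR string");
            }
            expanded.append(count, c);  // repeat the letter 'count' times
            count = 0;
            haveDigits = false;
        }
    }
    if (haveDigits) {
        throw std::invalid_argument("trailing length without an operation");
    }
    return expanded;
}
```

An index into the expanded string identifies a single position of the alignment, which is the idea behind the "expanded CIGAR index" used by several of the accessors listed above.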

CigarRoller is an aid to correctly generating the CIGAR strings necessary to represent how a read maps to the reference.

It is called once a particular match candidate is being written out, so it is far less performance sensitive than the Smith Waterman code below.

Definition at line 66 of file CigarRoller.h.

Member Function Documentation

◆ getMatchPositionOffset()

int CigarRoller::getMatchPositionOffset ( )

DEPRECATED - do not use, there are better ways to accomplish that by using read lengths, reference lengths, span of the read, etc.

Definition at line 244 of file CigarRoller.cpp.

References Cigar::del, and Cigar::insert.

Referenced by Add().

{
    int offset = 0;
    std::vector<CigarOperator>::iterator i;

    for (i = cigarOperations.begin(); i != cigarOperations.end(); i++)
    {
        switch (i->operation)
        {
            case insert:
                offset += i->count;
                break;
            case del:
                offset -= i->count;
                break;
                // TODO anything for case skip:????
            default:
                break;
        }
    }
    return offset;
}

◆ getString()

const char * CigarRoller::getString ( )

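Get the string representation of the Cigar operations in this object. The original comment says the caller must delete the returned value. Note, however, that the listing below returns a pointer to a function-local static buffer that is reused (and possibly reallocated) on later calls, so the pointer is only valid until the next call and should not be freed by the caller.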

Definition at line 272 of file CigarRoller.cpp.

Referenced by Add().

{
    // NB: the exact size of the string is not important, it just needs to be guaranteed
    // larger than the largest number of characters we could put into it.

    // we do not explicitly manage memory usage, and we expect when program exits, the memory used here will be freed
    static char *ret = NULL;
    static unsigned int retSize = 0;

    if (ret == NULL)
    {
        retSize = cigarOperations.size() * 12 + 1;  // 12 == a magic number -> > 1 + log base 10 of MAXINT
        ret = (char*) malloc(sizeof(char) * retSize);
        assert(ret != NULL);

    }
    else
    {
        // currently, ret pointer has enough memory to use
        if (retSize > cigarOperations.size() * 12 + 1)
        {
        }
        else
        {
            retSize = cigarOperations.size() * 12 + 1;
            free(ret);
            ret = (char*) malloc(sizeof(char) * retSize);
        }
        assert(ret != NULL);
    }

    char *ptr = ret;
    char buf[12];   // > 1 + log base 10 of MAXINT

    std::vector<CigarOperator>::iterator i;

    // Progressively append the character representations of the operations to
    // the cigar string we allocated above.

    *ptr = '\0';    // clear result string
    for (i = cigarOperations.begin(); i != cigarOperations.end(); i++)
    {
        sprintf(buf, "%d%c", (*i).count, (*i).getChar());
        strcat(ptr, buf);
        while (*ptr)
        {
            ptr++;      // limit the cost of strcat above
        }
    }
    return ret;
}
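
Because the listing above hands out a shared static buffer, code that needs to keep the text may prefer a value it owns. Within this API, Cigar::getCigarString(std::string &) already fills a caller-owned string. As a standalone illustration of the same formatting loop, here is a minimal sketch that builds the text into a `std::string`. `CigarOp` and `cigarToString` are hypothetical names for this example and are not part of libStatGen.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an (operation, count) pair.
struct CigarOp {
    char op;    // operation letter, e.g. 'M'
    int count;  // operation length
};

// Format the operations as "<count><op>..." and return an owned string,
// so there is no shared buffer and no question of who frees the memory.
std::string cigarToString(const std::vector<CigarOp>& ops) {
    std::string out;
    // Same per-operation upper bound as the original listing (12 chars).
    out.reserve(ops.size() * 12);
    for (const CigarOp& o : ops) {
        out += std::to_string(o.count);
        out += o.op;
    }
    return out;
}
```

Returning by value costs one allocation per call, which should be acceptable given that this class is described as far less performance sensitive than the alignment code.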

◆ IncrementCount()

bool CigarRoller::IncrementCount ( int index, int increment )

Increments the count for the operation at the specified index by the specified value; specify a negative value to decrement.

Returns
true if it is successfully incremented, false if not.

Definition at line 171 of file CigarRoller.cpp.

Referenced by Add(), and SamRecord::shiftIndelsLeft().

{
    if((index < 0) || ((unsigned int)index >= cigarOperations.size()))
    {
        // can't update, out of range, return false.
        return(false);
    }
    cigarOperations[index].count += increment;

    // Modifying the cigar, so the query & reference indexes are out of date,
    // so clear them.
    clearQueryAndReferenceIndexes();
    return(true);
}

◆ Remove()

bool CigarRoller::Remove ( int  index)

Remove the operation at the specified index.

Returns
true if successfully removed, false if not.

Definition at line 156 of file CigarRoller.cpp.

Referenced by Add(), SamRecord::shiftIndelsLeft(), and CigarHelper::softClipEndByRefPos().

{
    if((index < 0) || ((unsigned int)index >= cigarOperations.size()))
    {
        // can't remove, out of range, return false.
        return(false);
    }
    cigarOperations.erase(cigarOperations.begin() + index);
    // Modifying the cigar, so the query & reference indexes are out of date,
    // so clear them.
    clearQueryAndReferenceIndexes();
    return(true);
}

◆ Set()

void CigarRoller::Set ( const uint32_t * cigarBuffer, uint16_t bufferLen )

Sets this object to the BAM formatted cigar found at the beginning of the specified buffer which is bufferLen long.

Definition at line 211 of file CigarRoller.cpp.

References Add(), and clear().

{
    clear();

    // Parse the buffer.
    for (int i = 0; i < bufferLen; i++)
    {
        int opLen = cigarBuffer[i] >> 4;

        Add(cigarBuffer[i] & 0xF, opLen);
    }
}
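
The loop above treats bufferLen as a count of 32-bit entries and splits each entry into an operation code (the low 4 bits) and a length (the remaining upper bits). The standalone sketch below decodes such a buffer straight to text. It does not call libStatGen, `bamCigarToString` is a hypothetical name, and the code-to-letter table "MIDNSHP=X" (codes 0 through 8) is taken from the BAM encoding in the SAM specification rather than from this page.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Decode a BAM-encoded CIGAR (one uint32_t per operation) into text.
// Each entry is laid out as: (length << 4) | opCode.
std::string bamCigarToString(const uint32_t* buffer, std::size_t numOps) {
    // BAM operation codes 0..8 map to these letters (per the SAM spec).
    static const char opChars[] = "MIDNSHP=X";
    std::string out;
    for (std::size_t i = 0; i < numOps; ++i) {
        const uint32_t opCode = buffer[i] & 0xF;  // low 4 bits
        const uint32_t opLen = buffer[i] >> 4;    // upper 28 bits
        out += std::to_string(opLen);
        out += (opCode < 9) ? opChars[opCode] : '?';  // '?' marks unknown codes
    }
    return out;
}
```

Encoding the clipped-alignment example as {(3 << 4) | 4, (8 << 4) | 0, (1 << 4) | 2, (6 << 4) | 0, (4 << 4) | 4} decodes back to 3S8M1D6M4S.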

◆ Update()

bool CigarRoller::Update ( int index, Operation op, int count )

Updates the operation at the specified index to be the specified operation and have the specified count.

Returns
true if it is successfully updated, false if not.

Definition at line 187 of file CigarRoller.cpp.

Referenced by Add(), and SamRecord::shiftIndelsLeft().

{
    if((index < 0) || ((unsigned int)index >= cigarOperations.size()))
    {
        // can't update, out of range, return false.
        return(false);
    }
    cigarOperations[index].operation = op;
    cigarOperations[index].count = count;

    // Modifying the cigar, so the query & reference indexes are out of date,
    // so clear them.
    clearQueryAndReferenceIndexes();
    return(true);
}

Friends And Related Function Documentation

◆ operator<<

std::ostream& operator<< ( std::ostream & stream, const CigarRoller & roller )
friend

Writes all of the cigar operations contained in this roller to the passed in stream.

Definition at line 167 of file CigarRoller.h.

{
    stream << roller.cigarOperations;
    return stream;
}

The documentation for this class was generated from the following files: