gatb.core-API-0.0.0
Structure holding genomic information.
#include <Sequence.hpp>
Public Member Functions
| Sequence (tools::misc::Data::Encoding_e encoding=tools::misc::Data::ASCII)
| Sequence (char *seq)
virtual | ~Sequence ()
virtual const std::string & | getComment () const
virtual const std::string | getCommentShort () const
virtual const std::string & | getQuality () const
virtual tools::misc::Data & | getData ()
virtual char * | getDataBuffer () const
virtual size_t | getDataSize () const
virtual tools::misc::Data::Encoding_e | getDataEncoding () const
virtual size_t | getIndex () const
void | setDataRef (tools::misc::Data *ref, int offset, int length)
void | setIndex (size_t index)
std::string | toString () const
void | setComment (const std::string &cmt)
void | setQuality (const std::string &qual)
std::string | getRevcomp () const
Public Attributes
std::string | _comment
std::string | _quality
Structure holding genomic information.
A sequence holds several pieces of data, including a comment, a quality string, and the genomic data.
The genomic data is held in a tools::misc::Data attribute and is expected to contain nucleotides.
The inner format may be one of several kinds (ASCII, INTEGER, BINARY), depending on the type of bank that provides the Sequence objects.
The buffer holding the nucleotides lives in the tools::misc::Data attribute; see that class for details on where the buffer can be allocated. Note only that the buffer may be stored in the Data object itself, or may be a reference to a buffer allocated elsewhere.
The class Sequence is closely related to the IBank interface.
Note that this class should not be instantiated directly by end users; they will more likely receive such objects by iterating over a bank.
Example of use:
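A minimal sketch of the iteration pattern, written against hypothetical stand-ins rather than the gatb classes themselves: `MiniSequence` mimics a few Sequence getters, and a `std::vector` plays the role of the bank. None of these names exist in the library; they only illustrate that client code consumes sequences handed out by a container instead of building them itself.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for Sequence (NOT the gatb class): it only
// mirrors the shape of a few getters listed on this page.
struct MiniSequence {
    std::string comment;  // e.g. a FASTA header line
    std::string data;     // nucleotides, assumed ASCII here
    std::size_t index;    // position of the sequence in its bank

    const std::string& getComment() const { return comment; }
    std::size_t getDataSize() const { return data.size(); }
    std::size_t getIndex() const { return index; }
    std::string toString() const { return data; }
};

// Client code receives sequences by iterating over a bank (here a
// plain vector) and only reads them through their getters.
std::size_t totalNucleotides(const std::vector<MiniSequence>& bank) {
    std::size_t total = 0;
    for (const MiniSequence& seq : bank) {
        total += seq.getDataSize();
    }
    return total;
}
```

Calling `totalNucleotides` on a bank of two sequences of 12 and 4 nucleotides yields 16.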
Sequence (tools::misc::Data::Encoding_e encoding=tools::misc::Data::ASCII) | inline
Constructor.
[in] | encoding | encoding scheme of the genomic data of the sequence
Sequence (char *seq) | inline
Constructor, mainly for testing: sets the genomic data from an ASCII representation. For instance, one can pass "ACTTACGCAGAT" as the argument of this constructor.
[in] | seq | the genomic data as an ASCII string
virtual ~Sequence () | inline virtual
Destructor.
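virtual const std::string & getComment () const | inline virtual
virtual const std::string getCommentShort () const | inline virtual
virtual tools::misc::Data & getData () | inline virtual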
virtual char * getDataBuffer () const | inline virtual
Return the raw buffer holding the genomic data. IMPORTANT: reading genomic data this way requires the caller to know the underlying encoding scheme (ASCII, INTEGER or BINARY) in order to decode it.
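To illustrate why the encoding matters, here is a sketch of decoding a raw buffer under each scheme. The layouts are assumptions made for the example and are not stated on this page: INTEGER is taken to be one 2-bit code per byte, BINARY four 2-bit codes per byte (most significant bits first), with codes A=0, C=1, T=2, G=3. `Encoding` and `decode` are hypothetical names; refer to tools::misc::Data for the real layouts.

```cpp
#include <cstddef>
#include <string>

// Hypothetical mirror of the three encodings named on this page.
enum class Encoding { ASCII, INTEGER, BINARY };

// Decode nbNucleotides nucleotides from a raw buffer into ASCII.
// ASSUMED layouts (not confirmed by this page):
//   ASCII   : one character per nucleotide
//   INTEGER : one code (0..3) per byte
//   BINARY  : four 2-bit codes per byte, most significant bits first
// ASSUMED code table: A=0, C=1, T=2, G=3.
std::string decode(const char* buffer, std::size_t nbNucleotides, Encoding enc) {
    static const char alphabet[] = {'A', 'C', 'T', 'G'};
    std::string out;
    out.reserve(nbNucleotides);
    for (std::size_t i = 0; i < nbNucleotides; ++i) {
        switch (enc) {
            case Encoding::ASCII:
                out.push_back(buffer[i]);
                break;
            case Encoding::INTEGER:
                out.push_back(alphabet[buffer[i] & 3]);
                break;
            case Encoding::BINARY: {
                unsigned char byte = static_cast<unsigned char>(buffer[i / 4]);
                unsigned shift = static_cast<unsigned>(6 - 2 * (i % 4));
                out.push_back(alphabet[(byte >> shift) & 3]);
                break;
            }
        }
    }
    return out;
}
```

The same four nucleotides occupy four bytes in the ASCII and INTEGER cases but a single byte in the BINARY case, which is why a caller cannot interpret the buffer without knowing the encoding.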
virtual tools::misc::Data::Encoding_e getDataEncoding () const | inline virtual
virtual size_t getDataSize () const | inline virtual
virtual size_t getIndex () const | inline virtual
Return the index of the sequence. This may be the sequence's index in the database that holds it.
virtual const std::string & getQuality () const | inline virtual
std::string getRevcomp () const | inline
Return the reverse complement of the sequence as a string. The Sequence object must be in ASCII format.
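As an illustration of what a reverse complement is, the following sketch computes one for an ASCII nucleotide string. It is not the library's implementation: `complement` and `reverseComplement` are hypothetical helpers, and mapping any character other than A, C, G, T to 'N' is a choice made for this example.

```cpp
#include <algorithm>
#include <string>

// Complement of one upper-case ASCII nucleotide.
// Assumption for this sketch: anything other than A, C, G, T becomes 'N'.
inline char complement(char c) {
    switch (c) {
        case 'A': return 'T';
        case 'C': return 'G';
        case 'G': return 'C';
        case 'T': return 'A';
        default:  return 'N';
    }
}

// Reverse the string, then complement every nucleotide.
inline std::string reverseComplement(const std::string& seq) {
    std::string out(seq.rbegin(), seq.rend());
    std::transform(out.begin(), out.end(), out.begin(), complement);
    return out;
}
```

For example, `reverseComplement("ACTTACGCAGAT")` returns `"ATCTGCGTAAGT"`.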
void setComment (const std::string &cmt) | inline
Set the comment of the sequence (likely to be called by an IBank iterator).
[in] | cmt | comment of the sequence
void setDataRef (tools::misc::Data *ref, int offset, int length) | inline
Set the genomic data as a reference to a range within a Data object. Use this method when the sequence's genomic data should point to an existing buffer of nucleotides: the sequence then allocates no memory for the genomic data and relies on data stored elsewhere. This is mainly a shortcut to the gatb::core::tools::misc::Data::setRef method.
[in] | ref | the referred Data instance holding the genomic data
[in] | offset | starting index in the referred data
[in] | length | length of the genomic data of the current sequence
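The idea of referring to a range of an existing buffer, rather than copying it, can be sketched as follows. `DataView` is a hypothetical stand-in, not the gatb Data class; it only shows the offset/length bookkeeping and the fact that no nucleotide is copied.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical non-owning view on a range of a shared nucleotide buffer.
// It stores a pointer and a length only; the referred buffer must outlive it.
struct DataView {
    const char* buffer = nullptr;
    std::size_t size = 0;

    // Point this view at ref[offset, offset + length).
    void setRef(const std::string& ref, std::size_t offset, std::size_t length) {
        assert(offset + length <= ref.size());
        buffer = ref.data() + offset;
        size = length;
    }

    // Copy out the viewed range (ASCII assumed) for display purposes.
    std::string toString() const { return std::string(buffer, size); }
};
```

Several views can share one large buffer, for instance one view per sequence over a block read from disk in a single operation. The trade-off is lifetime management: a view becomes invalid as soon as the referred buffer is destroyed or reallocated.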
void setIndex (size_t index) | inline
Set the index of the sequence. Typically, this is called by an IBank iterator that knows the index of the sequence currently being iterated.
[in] | index | index of the sequence
void setQuality (const std::string &qual) | inline
Set the quality string of the sequence (likely to be called by a FASTQ iterator).
[in] | qual | quality string of the sequence
std::string toString () const | inline
Get an ASCII representation of the sequence. IMPORTANT: this implementation assumes that the format of the Data attribute is ASCII; no conversion is done for other formats.
std::string _comment
Comment attribute (note: should be private, with a getter and a setter).
std::string _quality
Quality attribute (note: should be private, with a getter and a setter).