gatb.core-API-0.0.0
ICountProcessor< span > Class Template Reference (abstract)

Interface that uses kmer counting information. More...

#include <ICountProcessor.hpp>

[Figure: inheritance diagram for ICountProcessor< span >]

Public Types

typedef kmer::impl::Kmer< span >::Type Type
 

Public Member Functions

virtual void begin (const kmer::impl::Configuration &config)=0
 
virtual void end ()=0
 
virtual void beginPass (size_t passId)=0
 
virtual void endPass (size_t passId)=0
 
virtual ICountProcessor * clone ()=0
 
virtual void finishClones (std::vector< ICountProcessor< span > * > &clones)=0
 
virtual void beginPart (size_t passId, size_t partId, size_t cacheSize, const char *name)=0
 
virtual void endPart (size_t passId, size_t partId)=0
 
virtual bool process (size_t partId, const Type &kmer, const CountVector &count, CountNumber sum=0)=0
 
virtual std::string getName () const =0
 
virtual void setName (const std::string &name)=0
 
virtual tools::misc::impl::Properties getProperties () const =0
 
virtual std::vector< ICountProcessor * > getInstances () const =0
 
template<typename T >
T * get () const
 
Public Member Functions inherited from SmartPointer
void use ()
 
void forget ()
 
Public Member Functions inherited from ISmartPointer
virtual ~ISmartPointer ()
 

Additional Inherited Members

Protected Member Functions inherited from SmartPointer
 SmartPointer ()
 
virtual ~SmartPointer ()
 

Detailed Description

template<size_t span>
class gatb::core::kmer::ICountProcessor< span >

Interface that uses kmer counting information.

This interface is mainly an Observer that listens to data produced by the sorting count algorithm. Each piece of data consists of a kmer and the number of occurrences of this kmer in each bank provided to the algorithm.

Through this interface, it becomes easy to plug in specific listeners that can do different things with the [kmer,counts] information. There is a default implementation of the ICountProcessor interface that does the historical job of DSK:
1) building a histogram
2) filtering out kmers with too low coverage
3) saving to disk the kmers having high enough coverage

Such an instance can be associated with the SortingCountAlgorithm instance through the SortingCountAlgorithm::setProcessor method; this instance is called the 'prototype instance'.

From an execution point of view, one instance of ICountProcessor is created (with method 'clone') for counting the kmers of one specific partition. If N cores are used, N instances of ICountProcessor are cloned from the so-called 'prototype' instance (i.e. the instance associated with the SortingCountAlgorithm instance). Each clone processes its partition in its own thread.

While processing a partition, a cloned ICountProcessor instance is called via its 'process' method: this is where the [kmer,counts] information is provided to the ICountProcessor clone. Depending on the actual implementation class of the ICountProcessor interface, different processing can be done.

When all the clones have finished their job (each in its own thread), the prototype instance is called (in the main thread) via the 'finishClones' method, where the prototype instance has access to the N clones before they are deleted. This makes it possible, for instance, to gather in the prototype instance the information collected by the clones during their processing.
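The prototype/clone life cycle described above can be sketched in a self-contained way. The code below does not use the GATB headers: MiniProcessor, MiniHistogram, Partition, runPartitions and demoCloneLifecycle are hypothetical stand-ins invented for this illustration, kmers are plain integers, and each kmer carries a single count instead of a CountVector. It only shows the pattern (clone per partition, one thread per clone, merge in finishClones), not the actual SortingCountAlgorithm implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <thread>
#include <utility>
#include <vector>

// Simplified stand-in for ICountProcessor (assumption: NOT the GATB interface).
class MiniProcessor {
public:
    virtual ~MiniProcessor() = default;
    virtual MiniProcessor* clone() = 0;
    virtual void finishClones(std::vector<MiniProcessor*>& clones) = 0;
    virtual bool process(std::size_t partId, std::uint64_t kmer, int count) = 0;
};

// Abundance histogram: histogram[c] = number of kmers seen with count c.
class MiniHistogram : public MiniProcessor {
public:
    MiniProcessor* clone() override { return new MiniHistogram(); }

    // Called on the prototype, in the main thread, once all clones are done.
    void finishClones(std::vector<MiniProcessor*>& clones) override {
        for (MiniProcessor* c : clones) {
            if (auto* h = dynamic_cast<MiniHistogram*>(c)) {
                for (const auto& entry : h->histogram_) {
                    histogram_[entry.first] += entry.second;
                }
            }
        }
    }

    // Called on a clone, in that clone's thread. No lock is needed because
    // each clone owns its own map.
    bool process(std::size_t, std::uint64_t, int count) override {
        ++histogram_[count];
        return true;
    }

    const std::map<int, std::size_t>& histogram() const { return histogram_; }

private:
    std::map<int, std::size_t> histogram_;
};

// One partition = list of (kmer, count) pairs (hypothetical layout).
using Partition = std::vector<std::pair<std::uint64_t, int>>;

// Hypothetical driver playing the role of the counting algorithm.
void runPartitions(MiniProcessor& prototype, const std::vector<Partition>& partitions) {
    std::vector<std::unique_ptr<MiniProcessor>> owned;
    std::vector<std::thread> threads;

    for (std::size_t partId = 0; partId < partitions.size(); ++partId) {
        owned.emplace_back(prototype.clone());          // one clone per partition
        MiniProcessor* clone = owned.back().get();
        const Partition* part = &partitions[partId];
        threads.emplace_back([clone, part, partId] {    // one thread per clone
            for (const auto& kc : *part) {
                clone->process(partId, kc.first, kc.second);
            }
        });
    }
    for (std::thread& t : threads) { t.join(); }

    // Back in the main thread: let the prototype gather the clones' results.
    std::vector<MiniProcessor*> clones;
    for (const auto& o : owned) { clones.push_back(o.get()); }
    prototype.finishClones(clones);
}   // the clones are deleted here, after finishClones has returned

std::map<int, std::size_t> demoCloneLifecycle() {
    MiniHistogram prototype;
    std::vector<Partition> partitions = {
        {{1, 3}, {2, 1}},          // partition 0
        {{7, 3}, {8, 3}, {9, 1}}   // partition 1
    };
    runPartitions(prototype, partitions);
    return prototype.histogram();
}
```

In this sketch the prototype never receives a 'process' call itself; it only produces clones and merges their private histograms, which is why the clones need no synchronization while their threads run.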

From a global point of view, the interface is made of three parts:
1) methods called on the prototype instance in the context of the main thread
2) methods called on a cloned instance in the context of specific threads
3) all other methods

The following figure shows how ICountProcessor interacts with other classes, and in particular with SortingCountAlgorithm. One can also see the multithreading context, with the main thread creating clones and with clones processing their job in specific threads.

[Figure: ICountProcessor.png. Usage and life cycle of ICountProcessor in the context of SortingCountAlgorithm]

Examples of ICountProcessor implementors:
1) CountProcessorHistogram : collect kmers distribution information
2) CountProcessorSolidity... : check whether a kmer is solid or not
3) CountProcessorDump : dump kmer count information in file system
4) CountProcessorChain : list of linked ICountProcessor instances

The CountProcessorChain implementation links several instances of ICountProcessor. When such an instance is called via 'process', the first item of the list is called via 'process'; if it returns true, the next item in the list is called, and so on; if an item returns false, the chain is stopped. This class is used to define the "DSK" count processor (histogram -> solidity -> dump).
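The short-circuit behaviour of such a chain can be illustrated with a minimal, self-contained sketch. It does not use the GATB classes: Stage, MiniChain, ChainResult, demoChain and the threshold minAbundance are hypothetical names chosen for this example, kmers are plain integers, and each stage is a simple callable rather than an ICountProcessor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// One link of the chain (assumption: a plain callable, NOT a GATB processor).
using Stage = std::function<bool(std::uint64_t kmer, int count)>;

class MiniChain {
public:
    explicit MiniChain(std::vector<Stage> stages) : stages_(std::move(stages)) {}

    // Calls each stage in order; stops at the first stage returning false.
    bool process(std::uint64_t kmer, int count) const {
        for (const Stage& stage : stages_) {
            if (!stage(kmer, count)) { return false; }
        }
        return true;
    }

private:
    std::vector<Stage> stages_;
};

struct ChainResult {
    std::size_t seen;                   // kmers that reached the first stage
    std::vector<std::uint64_t> dumped;  // kmers that reached the last stage
};

// Mimics the "histogram -> solidity -> dump" layout with toy stages.
ChainResult demoChain() {
    std::size_t seen = 0;
    std::vector<std::uint64_t> dumped;
    const int minAbundance = 2;  // hypothetical solidity threshold

    MiniChain chain({
        [&](std::uint64_t, int) { ++seen; return true; },                    // "histogram"
        [&](std::uint64_t, int count) { return count >= minAbundance; },     // "solidity"
        [&](std::uint64_t kmer, int) { dumped.push_back(kmer); return true; } // "dump"
    });

    chain.process(10, 5);  // passes every stage
    chain.process(11, 1);  // stopped by the solidity stage, never dumped
    chain.process(12, 2);  // passes every stage
    return {seen, dumped};
}
```

The first stage sees every kmer because it sits before the filter, while the last stage only sees kmers the filter let through; ordering the links is therefore part of the processor's definition.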

Member Typedef Documentation

typedef kmer::impl::Kmer<span>::Type Type

Shortcuts.

Member Function Documentation

virtual void beginPart ( size_t  passId,
size_t  partId,
size_t  cacheSize,
const char *  name 
)
pure virtual

Called at the beginning of the processing of a new kmers partition.

Parameters
[in]passId: index of the current pass in the SortingCountAlgorithm.
[in]partId: index of the current kmers partition in the SortingCountAlgorithm.
[in]cacheSize: memory size used for the current kmers partition
[in]name: class name of the child PartitionsCommand class.

Implemented in CountProcessorDump< span >, CountProcessorDumpKff< span >, CountProcessorChain< span >, and CountProcessorAbstract< span >.

virtual void beginPass ( size_t  passId)
pure virtual

Called just before starting a pass.

Parameters
[in]passId: index of the pass to begin

Implemented in CountProcessorAbstract< span >.

virtual void end ( )
pure virtual

Called just after the main loop of SortingCountAlgorithm.

Implemented in CountProcessorHistogram< span >, CountProcessorChain< span >, and CountProcessorAbstract< span >.

virtual void endPart ( size_t  passId,
size_t  partId 
)
pure virtual

Called at the end of the processing of a kmers partition.

Parameters
[in]passId: index of the current pass in the SortingCountAlgorithm.
[in]partId: index of the current kmers partition in the SortingCountAlgorithm.

Implemented in CountProcessorDumpKff< span >, CountProcessorDump< span >, CountProcessorChain< span >, and CountProcessorAbstract< span >.

virtual void endPass ( size_t  passId)
pure virtual

Called just after the end of a pass.

Implemented in CountProcessorCustomProxy< span >, CountProcessorCutoff< span >, and CountProcessorAbstract< span >.

T* get ( ) const
inline

Try to get an instance of a specific type within the current object.

Returns
a T pointer to the instance if found, 0 otherwise.
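One plausible way to provide such a lookup is to scan the result of getInstances() and return the first element that dynamic_cast accepts. The sketch below illustrates that idea only; whether the library does exactly this is an assumption. Node, LeafNode, HistogramNode, DumpNode, ChainNode and demoGet are hypothetical stand-ins, not GATB classes.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Simplified stand-in for ICountProcessor (assumption: NOT the GATB interface).
class Node {
public:
    virtual ~Node() = default;

    // A leaf returns itself; a composite returns its children.
    virtual std::vector<Node*> getInstances() const = 0;

    // Returns the first instance of dynamic type T, or nullptr if none is found.
    template <typename T>
    T* get() const {
        for (Node* candidate : getInstances()) {
            if (T* found = dynamic_cast<T*>(candidate)) { return found; }
        }
        return nullptr;
    }
};

class LeafNode : public Node {
public:
    std::vector<Node*> getInstances() const override {
        // const_cast keeps the sketch short; the vector holds non-const pointers.
        return { const_cast<LeafNode*>(this) };
    }
};

class HistogramNode : public LeafNode {};
class DumpNode : public LeafNode {};

// Composite that exposes its children through getInstances().
class ChainNode : public Node {
public:
    explicit ChainNode(std::vector<Node*> children) : children_(std::move(children)) {}
    std::vector<Node*> getInstances() const override { return children_; }

private:
    std::vector<Node*> children_;
};

// Returns true when every lookup behaves as described in the comments.
bool demoGet() {
    HistogramNode histo;
    DumpNode dump;
    ChainNode chain({&histo, &dump});

    return chain.get<DumpNode>() == &dump       // found inside the composite
        && histo.get<HistogramNode>() == &histo // a leaf finds itself
        && histo.get<DumpNode>() == nullptr;    // no match yields a null pointer
}
```

With this layout a caller holding only the composite can retrieve a specific child, for example the histogram stage, without knowing how the chain was assembled.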

virtual std::vector<ICountProcessor*> getInstances ( ) const
pure virtual

Get a vector of instances in case the current object is a composite.

Returns
a vector of ICountProcessor instances.

Implemented in CountProcessorChain< span >, and CountProcessorAbstract< span >.

virtual std::string getName ( ) const
pure virtual

Get a name for the count processor.

Returns
the count processor name.

Implemented in CountProcessorAbstract< span >.

virtual bool process ( size_t  partId,
const Type &  kmer,
const CountVector &  count,
CountNumber  sum = 0 
)
pure virtual

Notification that a [kmer,counts] is available and can be handled by the count processor.

Parameters
[in]partId: index of the current partition
[in]kmer: kmer for which we are receiving counts
[in]count: vector of counts of the kmer, one count per bank
[in]sum: sum of the occurrences for all banks.

Implemented in CountProcessorDumpKff< span >, CountProcessorHistogram< span >, CountProcessorDump< span >, CountProcessorChain< span >, CountProcessorCutoff< span >, and CountProcessorAbstract< span >.

virtual void setName ( const std::string &  name)
pure virtual

Set a name for the count processor.

Parameters
[in]name: the count processor name.

Implemented in CountProcessorAbstract< span >.


The documentation for this class was generated from the following file: