gatb.core-API-0.0.0
|
Interface that uses kmer counting information.
#include <ICountProcessor.hpp>
Public Types | |
typedef kmer::impl::Kmer< span >::Type | Type |
Public Member Functions | |
virtual void | begin (const kmer::impl::Configuration &config)=0 |
virtual void | end ()=0 |
virtual void | beginPass (size_t passId)=0 |
virtual void | endPass (size_t passId)=0 |
virtual ICountProcessor * | clone ()=0 |
virtual void | finishClones (std::vector< ICountProcessor< span > * > &clones)=0 |
virtual void | beginPart (size_t passId, size_t partId, size_t cacheSize, const char *name)=0 |
virtual void | endPart (size_t passId, size_t partId)=0 |
virtual bool | process (size_t partId, const Type &kmer, const CountVector &count, CountNumber sum=0)=0 |
virtual std::string | getName () const =0 |
virtual void | setName (const std::string &name)=0 |
virtual tools::misc::impl::Properties | getProperties () const =0 |
virtual std::vector< ICountProcessor * > | getInstances () const =0 |
template<typename T > | |
T * | get () const |
Public Member Functions inherited from SmartPointer | |
void | use () |
void | forget () |
Public Member Functions inherited from ISmartPointer | |
virtual | ~ISmartPointer () |
Additional Inherited Members | |
Protected Member Functions inherited from SmartPointer | |
SmartPointer () | |
virtual | ~SmartPointer () |
Interface that uses kmer counting information.
This interface is mainly an Observer that listens to data produced by the sorting count algorithm. This information consists of a kmer and the number of occurrences of that kmer in each bank provided to the algorithm.
Through this interface, it is easy to plug in specific listeners that do different things with the [kmer,counts] information. There is a default implementation of the ICountProcessor interface that does the historical job of DSK: 1) building a histogram 2) filtering out kmers with too low coverage 3) saving to disk the kmers with high enough coverage.
Such an instance can be associated with the SortingCountAlgorithm instance through the SortingCountAlgorithm::setProcessor method; this instance is called the 'prototype instance'.
From an execution point of view, one instance of ICountProcessor is created (with the 'clone' method) to count the kmers of one specific partition. If N cores are used, N instances of ICountProcessor are cloned from the 'prototype' instance (i.e. the instance associated with the SortingCountAlgorithm instance). Each clone processes its partition in its own thread.
While processing a partition, a cloned ICountProcessor instance is called via its 'process' method: this is where the [kmer,counts] information is provided to the clone, and the processing performed depends on the actual implementation class of the ICountProcessor interface.
When all the clones have finished their job (each in its own thread), the prototype instance is called (in the main thread) via the 'finishClones' method, where it has access to the N clones before they are deleted. This allows the prototype instance, for example, to gather the information collected by the clones during their processing.
From a global point of view, the interface is made of three parts: 1) methods called on the prototype instance in the context of the main thread 2) methods called on a cloned instance in the context of specific threads 3) all other methods
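The prototype/clone life cycle described above can be sketched with a toy class. This is a minimal illustration only: ToyProcessor and countWithClones are hypothetical names, kmers are plain integers, and the driver stands in for what SortingCountAlgorithm does; none of this is the GATB implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Simplified stand-in for a count processor (hypothetical, not the GATB interface).
struct ToyProcessor {
    std::uint64_t nbKmers;   // number of kmers seen by this instance

    ToyProcessor() : nbKmers(0) {}

    // Each worker thread gets its own clone, so process() needs no locking.
    ToyProcessor* clone() const { return new ToyProcessor(); }

    // Called by a worker thread for each [kmer,count] pair of its partition.
    bool process(std::size_t /*partId*/, std::uint64_t /*kmer*/, int /*count*/) {
        ++nbKmers;
        return true;
    }

    // Called on the prototype, in the main thread, once all workers are done.
    void finishClones(const std::vector<ToyProcessor*>& clones) {
        for (const ToyProcessor* c : clones) nbKmers += c->nbKmers;
    }
};

// Hypothetical driver: one clone and one thread per partition.
std::uint64_t countWithClones(ToyProcessor& prototype,
                              const std::vector<std::vector<std::uint64_t> >& partitions) {
    std::vector<ToyProcessor*> clones;
    for (std::size_t i = 0; i < partitions.size(); ++i) clones.push_back(prototype.clone());

    std::vector<std::thread> workers;
    for (std::size_t partId = 0; partId < partitions.size(); ++partId) {
        workers.emplace_back([&clones, &partitions, partId] {
            for (std::uint64_t kmer : partitions[partId]) {
                clones[partId]->process(partId, kmer, 1);
            }
        });
    }
    for (std::thread& t : workers) t.join();

    prototype.finishClones(clones);            // last access to the clones
    for (ToyProcessor* c : clones) delete c;   // clones are deleted afterwards
    return prototype.nbKmers;
}
```

Because every clone owns its counter, the only point where results meet is finishClones, which runs after all threads have been joined.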
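To make the first two groups of methods concrete, the following toy recorder logs the order in which the callbacks might be invoked for one pass containing one partition with one kmer. TraceProcessor and traceOnePartition are hypothetical, the driver is single-threaded to keep the log deterministic, and the exact call order inside SortingCountAlgorithm is an assumption inferred from the method descriptions on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy recorder (not the GATB interface): every callback appends its name to a shared log.
struct TraceProcessor {
    std::vector<std::string>* log;
    explicit TraceProcessor(std::vector<std::string>* sharedLog) : log(sharedLog) {}

    // 1) Called on the prototype instance, in the main thread.
    void begin()                { log->push_back("begin"); }
    void beginPass(std::size_t) { log->push_back("beginPass"); }
    TraceProcessor* clone()     { log->push_back("clone"); return new TraceProcessor(log); }
    void finishClones(std::vector<TraceProcessor*>&) { log->push_back("finishClones"); }
    void endPass(std::size_t)   { log->push_back("endPass"); }
    void end()                  { log->push_back("end"); }

    // 2) Called on a cloned instance (in a worker thread in the real library).
    void beginPart(std::size_t, std::size_t)      { log->push_back("beginPart"); }
    bool process(std::size_t, unsigned long, int) { log->push_back("process"); return true; }
    void endPart(std::size_t, std::size_t)        { log->push_back("endPart"); }
};

// Hypothetical single-threaded driver: one pass, one partition, one kmer.
std::vector<std::string> traceOnePartition() {
    std::vector<std::string> log;
    TraceProcessor prototype(&log);

    prototype.begin();
    prototype.beginPass(0);

    std::vector<TraceProcessor*> clones;
    clones.push_back(prototype.clone());
    clones[0]->beginPart(0, 0);
    clones[0]->process(0, 42UL, 3);
    clones[0]->endPart(0, 0);

    prototype.finishClones(clones);
    delete clones[0];

    prototype.endPass(0);
    prototype.end();
    return log;
}
```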
The following figure shows how ICountProcessor interacts with other classes, and in particular with SortingCountAlgorithm. One can also see the multithreading context, with the main thread creating clones and with clones processing their job in specific threads.
Examples of ICountProcessor implementors: 1) CountProcessorHistogram: collects kmer distribution information 2) CountProcessorSolidity...: checks whether a kmer is solid or not 3) CountProcessorDump: dumps kmer count information to the file system 4) CountProcessorChain: list of linked ICountProcessor instances
The CountProcessorChain implementation allows several instances of ICountProcessor to be linked. When such an instance is called via 'process', the first item of the list is called via 'process'; if it returns true, the next item in the list is called, and so on; if an item returns false, the chain stops. This class is used to define the "DSK" count processor (histogram -> solidity -> dump).
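The short-circuit behaviour of a chain can be sketched as follows. ToyChain, Stage, DskLikeResult and runDskLikeChain are hypothetical names; the three stages only mimic the histogram -> solidity -> dump sequence and say nothing about how the real classes are written. The solidity rule used here (sum of counts at least minAbundance) is an assumption chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// One stage of the toy chain: returns true to let the next stage run.
typedef std::function<bool(unsigned long kmer, int sum)> Stage;

struct ToyChain {
    std::vector<Stage> stages;

    // Call each stage in order; stop at the first one that returns false.
    bool process(unsigned long kmer, int sum) const {
        for (const Stage& stage : stages) {
            if (!stage(kmer, sum)) return false;
        }
        return true;
    }
};

struct DskLikeResult {
    std::map<int, std::size_t> histogram;   // abundance -> number of kmers
    std::vector<unsigned long> dumped;      // kmers that reached the last stage
};

// Hypothetical helper: feeds [kmer,sum] pairs through histogram -> solidity -> dump.
DskLikeResult runDskLikeChain(const std::vector<std::pair<unsigned long, int> >& counts,
                              int minAbundance) {
    DskLikeResult result;
    ToyChain chain;
    // Stage 1: the histogram sees every kmer and never stops the chain.
    chain.stages.push_back([&result](unsigned long, int sum) -> bool {
        ++result.histogram[sum];
        return true;
    });
    // Stage 2: the solidity check stops the chain for low-abundance kmers.
    chain.stages.push_back([minAbundance](unsigned long, int sum) -> bool {
        return sum >= minAbundance;
    });
    // Stage 3: the dump only receives kmers that passed stage 2.
    chain.stages.push_back([&result](unsigned long kmer, int) -> bool {
        result.dumped.push_back(kmer);
        return true;
    });
    for (const std::pair<unsigned long, int>& kc : counts) chain.process(kc.first, kc.second);
    return result;
}
```

Putting the histogram first means it records all kmers, including those later rejected by the solidity stage, which matches the order given for the "DSK" count processor.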
typedef kmer::impl::Kmer<span>::Type Type |
Shortcuts.
virtual void begin (const kmer::impl::Configuration &config) | pure virtual
Called just before the mainloop of SortingCountAlgorithm.
[in] | config | : configuration of the SortingCountAlgorithm. |
Implemented in CountProcessorSolidityAbstract< span, Derived >, CountProcessorSolidityAbstract< span, CountProcessorSoliditySum< span > >, CountProcessorSolidityAbstract< span, CountProcessorSolidityOne< span > >, CountProcessorSolidityAbstract< span, CountProcessorSolidityCustom< span > >, CountProcessorSolidityAbstract< span, CountProcessorSolidityAll< span > >, CountProcessorSolidityAbstract< span, CountProcessorSolidityMax< span > >, CountProcessorSolidityAbstract< span, CountProcessorSolidityMin< span > >, CountProcessorDump< span >, CountProcessorDumpKff< span >, CountProcessorChain< span >, and CountProcessorAbstract< span >.
virtual void beginPart (size_t passId, size_t partId, size_t cacheSize, const char *name) | pure virtual
Called at the beginning of the processing of a new kmers partition.
[in] | passId | : index of the current pass in the SortingCountAlgorithm. |
[in] | partId | : index of the current kmers partition in the SortingCountAlgorithm. |
[in] | cacheSize | : memory size used for the current kmers partition |
[in] | name | : class name of the child PartitionsCommand class. |
Implemented in CountProcessorDump< span >, CountProcessorDumpKff< span >, CountProcessorChain< span >, and CountProcessorAbstract< span >.
virtual void beginPass (size_t passId) | pure virtual
Called just before starting a pass.
[in] | passId | : index of the pass to begin |
Implemented in CountProcessorAbstract< span >.
virtual ICountProcessor * clone () | pure virtual
Clone the instance. An instance can be cloned N times so that each clone can be used in its own thread.
Implemented in CountProcessorHistogram< span >, CountProcessorSolidityAbstract< span, Derived >, CountProcessorSolidityAbstract< span, CountProcessorSoliditySum< span > >, CountProcessorSolidityAbstract< span, CountProcessorSolidityOne< span > >, CountProcessorSolidityAbstract< span, CountProcessorSolidityCustom< span > >, CountProcessorSolidityAbstract< span, CountProcessorSolidityAll< span > >, CountProcessorSolidityAbstract< span, CountProcessorSolidityMax< span > >, CountProcessorSolidityAbstract< span, CountProcessorSolidityMin< span > >, CountProcessorDump< span >, CountProcessorDumpKff< span >, CountProcessorChain< span >, and CountProcessorCutoff< span >.
virtual void end () | pure virtual
Called just after the mainloop of SortingCountAlgorithm.
Implemented in CountProcessorHistogram< span >, CountProcessorChain< span >, and CountProcessorAbstract< span >.
virtual void endPart (size_t passId, size_t partId) | pure virtual
Called at the end of the processing of a kmers partition.
[in] | passId | : index of the current pass in the SortingCountAlgorithm. |
[in] | partId | : index of the current kmers partition in the SortingCountAlgorithm. |
Implemented in CountProcessorDumpKff< span >, CountProcessorDump< span >, CountProcessorChain< span >, and CountProcessorAbstract< span >.
virtual void endPass (size_t passId) | pure virtual
Called just after the end of a pass.
Implemented in CountProcessorCustomProxy< span >, CountProcessorCutoff< span >, and CountProcessorAbstract< span >.
virtual void finishClones (std::vector< ICountProcessor< span > * > &clones) | pure virtual
Called when N partitions have been processed through N clones. This should be the last time these clones are available before being deleted. It is an opportunity for the prototype instance to gather information from the clones.
[in] | clones | : the N cloned instances |
Implemented in CountProcessorSolidityAbstract< span, Derived >, CountProcessorSolidityAbstract< span, CountProcessorSoliditySum< span > >, CountProcessorSolidityAbstract< span, CountProcessorSolidityOne< span > >, CountProcessorSolidityAbstract< span, CountProcessorSolidityCustom< span > >, CountProcessorSolidityAbstract< span, CountProcessorSolidityAll< span > >, CountProcessorSolidityAbstract< span, CountProcessorSolidityMax< span > >, CountProcessorSolidityAbstract< span, CountProcessorSolidityMin< span > >, CountProcessorDump< span >, CountProcessorDumpKff< span >, CountProcessorChain< span >, and CountProcessorAbstract< span >.
template<typename T > T * get () const | inline
Try to get an instance of a specific type within the current object.
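One plausible way to implement such a lookup is to walk the vector returned by getInstances and return the first element that a dynamic_cast accepts. The sketch below uses hypothetical toy classes (ToyProcessor, ToyHistogram, ToyDump, ToyChain); that a leaf returns itself from getInstances and that get relies on dynamic_cast are assumptions, not statements about the library's code.

```cpp
#include <cassert>
#include <vector>

// Toy base class (hypothetical, not the GATB interface).
struct ToyProcessor {
    virtual ~ToyProcessor() {}

    // Assumption: a leaf returns itself; a composite returns its children.
    virtual std::vector<ToyProcessor*> getInstances() const {
        return std::vector<ToyProcessor*>(1, const_cast<ToyProcessor*>(this));
    }

    // Return the first instance of dynamic type T, or a null pointer if none matches.
    template <typename T>
    T* get() const {
        std::vector<ToyProcessor*> instances = getInstances();
        for (ToyProcessor* p : instances) {
            if (T* found = dynamic_cast<T*>(p)) return found;
        }
        return nullptr;
    }
};

struct ToyHistogram : ToyProcessor {};
struct ToyDump : ToyProcessor {};

// Toy composite: exposes its items through getInstances.
struct ToyChain : ToyProcessor {
    std::vector<ToyProcessor*> items;
    std::vector<ToyProcessor*> getInstances() const override { return items; }
};
```

With this shape, a caller holding only the composite can retrieve a specific stage, for example `chain.get<ToyHistogram>()`, and must check the result for null.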
virtual std::vector< ICountProcessor * > getInstances () const | pure virtual
Get a vector of instances in case the current object is a composite.
Implemented in CountProcessorChain< span >, and CountProcessorAbstract< span >.
virtual std::string getName () const | pure virtual
Get a name for the count processor.
Implemented in CountProcessorAbstract< span >.
virtual tools::misc::impl::Properties getProperties () const | pure virtual
Get some properties about the count processor.
Implemented in CountProcessorDumpKff< span >, CountProcessorHistogram< span >, CountProcessorDump< span >, CountProcessorSolidityAbstract< span, Derived >, CountProcessorSolidityAbstract< span, CountProcessorSoliditySum< span > >, CountProcessorSolidityAbstract< span, CountProcessorSolidityOne< span > >, CountProcessorSolidityAbstract< span, CountProcessorSolidityCustom< span > >, CountProcessorSolidityAbstract< span, CountProcessorSolidityAll< span > >, CountProcessorSolidityAbstract< span, CountProcessorSolidityMax< span > >, CountProcessorSolidityAbstract< span, CountProcessorSolidityMin< span > >, CountProcessorChain< span >, CountProcessorCutoff< span >, and CountProcessorAbstract< span >.
virtual bool process (size_t partId, const Type &kmer, const CountVector &count, CountNumber sum=0) | pure virtual
Notification that a [kmer,counts] is available and can be handled by the count processor.
[in] | partId | : index of the current partition |
[in] | kmer | : kmer for which we are receiving counts |
[in] | count | : vector of counts of the kmer, one count per bank |
[in] | sum | : sum of the occurrences for all banks. |
Implemented in CountProcessorDumpKff< span >, CountProcessorHistogram< span >, CountProcessorDump< span >, CountProcessorChain< span >, CountProcessorCutoff< span >, and CountProcessorAbstract< span >.
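As an illustration of how the count and sum arguments relate, here is a toy filter that keeps a kmer when its total abundance across banks reaches a threshold. ToyCountNumber, ToyCountVector, sumOfCounts and ToySumFilter are hypothetical stand-ins for the real types, and treating sum == 0 as "not precomputed by the caller" is an assumption made for this sketch only.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

typedef int ToyCountNumber;                         // stand-in for CountNumber
typedef std::vector<ToyCountNumber> ToyCountVector; // stand-in for CountVector, one entry per bank

// Sum of the occurrences over all banks.
ToyCountNumber sumOfCounts(const ToyCountVector& count) {
    return std::accumulate(count.begin(), count.end(), ToyCountNumber(0));
}

// Toy processor: accepts a kmer when its total abundance is at least minAbundance.
struct ToySumFilter {
    ToyCountNumber minAbundance;
    std::size_t kept;   // number of kmers accepted so far

    explicit ToySumFilter(ToyCountNumber threshold) : minAbundance(threshold), kept(0) {}

    bool process(std::size_t /*partId*/, unsigned long /*kmer*/,
                 const ToyCountVector& count, ToyCountNumber sum = 0) {
        // Assumption for this sketch: 0 means the caller did not precompute the sum.
        if (sum == 0) sum = sumOfCounts(count);
        if (sum < minAbundance) return false;   // rejected
        ++kept;
        return true;                            // accepted
    }
};
```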
virtual void setName (const std::string &name) | pure virtual
Set a name for the count processor.
[in] | name | : the count processor name. |
Implemented in CountProcessorAbstract< span >.