gatb.core-API-0.0.0
IHistogram Class Reference (abstract)

Interface for kmers distribution management.

#include <IHistogram.hpp>


Public Member Functions

virtual ~IHistogram ()
 
virtual size_t getLength ()=0
 
virtual size_t getLength2 ()=0
 
virtual void inc (u_int16_t index)=0
 
virtual void inc2D (u_int16_t index1, u_int16_t index2)=0
 
virtual void save (tools::storage::impl::Group &group)=0
 
virtual void compute_threshold (int min_auto_threshold)=0
 
virtual u_int16_t get_solid_cutoff ()=0
 
virtual u_int64_t get_nbsolids_auto ()=0
 
virtual float get_ratio_weak ()=0
 
virtual u_int16_t get_first_peak ()=0
 
virtual u_int64_t & get (u_int16_t idx)=0
 
virtual u_int64_t & get2D (u_int16_t idx1, u_int16_t idx2)=0
 
Public Member Functions inherited from ISmartPointer
virtual ~ISmartPointer ()
 
virtual void use ()=0
 
virtual void forget ()=0
 

Detailed Description

Interface for kmers distribution management.

This interface describes the function y(x), where x is the number of occurrences of a kmer and y is the number of kmers occurring x times.

It is often interesting to have a graphical display of this kind of distribution; for instance, it may give an estimation of the coverage of NGS data.

We can also find x0 at the first minimum of y(x): kmers with x<x0 are likely to come from sequencing errors. The first maximum at x1 (x1>x0) is also interesting because it provides an estimate of the read coverage.

This interface is mainly used by the SortingCountAlgorithm.
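As an illustration of the distribution y(x) described above, here is a minimal sketch of a histogram built from per-kmer abundances. `ToyHistogram` and `buildHistogram` are hypothetical names and are not part of GATB. Clamping large abundances into the last bin is an assumption of this sketch, not documented behaviour.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an IHistogram implementation (not GATB code).
class ToyHistogram {
public:
    // 'length' is the largest abundance x that gets its own bin.
    explicit ToyHistogram(std::size_t length) : bins_(length + 1, 0) {}

    std::size_t getLength() const { return bins_.size() - 1; }

    // Record one more kmer seen x times. Abundances above getLength()
    // are clamped into the last bin (an assumption of this sketch).
    void inc(std::uint16_t x) {
        const std::size_t i = x < bins_.size() ? x : bins_.size() - 1;
        ++bins_[i];
    }

    // y(x): number of kmers seen exactly x times.
    std::uint64_t& get(std::uint16_t x) { return bins_.at(x); }

private:
    std::vector<std::uint64_t> bins_;
};

// Build y(x) from a list of per-kmer abundances.
inline ToyHistogram buildHistogram(const std::vector<std::uint16_t>& abundances,
                                   std::size_t length) {
    ToyHistogram h(length);
    for (std::uint16_t a : abundances) h.inc(a);
    return h;
}
```

For example, the abundances {1, 1, 2, 5, 5, 5} give y(1)=2, y(2)=1 and y(5)=3.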

Here is a command line for showing the histogram with gnuplot from the hdf5 file 'graph.h5':

  • h5dump -y -d dsk/histogram graph.h5 | grep '[0-9]' | grep -v '[A-Z].*' | paste - - | gnuplot -p -e 'plot [][0:100] "-" with lines'

For the cumulative sum of the distribution, you can use:

  • h5dump -y -d dsk/histogram graph.h5 | grep '[0-9]' | grep -v '[A-Z].*' | paste - - | gawk 'BEGIN{s=0; i=0} { s=s+$2; i=i+1; print i," ", s}' | gnuplot -p -e 'plot [0:10][0:] "-" with lines'
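The gawk stage in the command above accumulates a running total of the second column. The same computation on values already loaded in memory can be sketched as follows. `cumulativeSum` is a hypothetical helper, and the input vector is assumed to hold y(x) indexed by x.

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Running total of y(x): result[i] = y[0] + ... + y[i].
inline std::vector<std::uint64_t> cumulativeSum(const std::vector<std::uint64_t>& y) {
    std::vector<std::uint64_t> out(y.size());
    std::partial_sum(y.begin(), y.end(), out.begin());
    return out;
}
```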

Constructor & Destructor Documentation

virtual ~IHistogram ( )
inline virtual

Destructor.

Member Function Documentation

virtual void compute_threshold ( int  min_auto_threshold)
pure virtual

Compute the first minimum at x0 and the first maximum at x1 (x1>x0).

Implemented in HistogramCache, HistogramNull, and Histogram.
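One simple way to locate x0 and x1 is a descent followed by a climb. The sketch below is not the GATB implementation. `Thresholds` and `computeThresholds` are hypothetical names, bin 0 is ignored, and the role given to `minAutoThreshold` (a lower bound on x0) is an assumption, since this page does not describe the parameter.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Thresholds {
    std::size_t x0;  // first minimum of y(x)
    std::size_t x1;  // first maximum after x0
};

// y[x] holds the number of kmers occurring x times.
inline Thresholds computeThresholds(const std::vector<std::uint64_t>& y,
                                    int minAutoThreshold) {
    const std::size_t n = y.size();

    // Descend from x = 1 until the curve stops decreasing.
    std::size_t x0 = 1;
    while (x0 + 1 < n && y[x0 + 1] < y[x0]) ++x0;

    // Assumed role of the parameter: a lower bound on the cutoff.
    if (minAutoThreshold > 0 && x0 < static_cast<std::size_t>(minAutoThreshold))
        x0 = static_cast<std::size_t>(minAutoThreshold);

    // Climb from x0 until the curve stops increasing.
    std::size_t x1 = x0;
    while (x1 + 1 < n && y[x1 + 1] >= y[x1]) ++x1;

    return {x0, x1};
}
```

Real data is noisy, so a robust implementation would likely smooth the curve or require a minimum drop before accepting a minimum. On degenerate inputs this sketch can return x1 equal to x0.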

virtual u_int64_t& get ( u_int16_t  idx)
pure virtual

Retrieve the value for x.

Parameters
[in]idx: x value.
Returns
y(x).

Implemented in HistogramCache, HistogramNull, and Histogram.

virtual u_int64_t& get2D ( u_int16_t  idx1,
u_int16_t  idx2 
)
pure virtual

Retrieve the value at (x, y) in the 2D histogram.

Parameters
[in]idx1: x value.
[in]idx2: y value.
Returns
cpt(x,y).

Implemented in HistogramCache, HistogramNull, and Histogram.

virtual u_int16_t get_first_peak ( )
pure virtual

Get the x1 value at the first maximum after x0.

Implemented in HistogramCache, HistogramNull, and Histogram.

virtual u_int64_t get_nbsolids_auto ( )
pure virtual

Get the number of kmers with x>x0, i.e. the solid kmers for the threshold x0.

Returns
number of kmers.

Implemented in HistogramCache, HistogramNull, and Histogram.
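Read literally, the description above amounts to summing y(x) over all x greater than x0. A minimal sketch, with `countSolid` as a hypothetical helper. An actual implementation may treat the boundary differently, for example by including x0 itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of kmers whose abundance x is strictly greater than x0.
// y[x] holds the number of kmers occurring x times.
inline std::uint64_t countSolid(const std::vector<std::uint64_t>& y, std::size_t x0) {
    std::uint64_t total = 0;
    for (std::size_t x = x0 + 1; x < y.size(); ++x) total += y[x];
    return total;
}
```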

virtual float get_ratio_weak ( )
pure virtual

Get the ratio of weak kmers in the total volume.

Returns
ratio

Implemented in HistogramCache, HistogramNull, and Histogram.
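This page does not define "volume". The sketch below assumes that a bin's volume is x * y(x), i.e. the total number of kmer occurrences in that bin, and that weak kmers are those with x no greater than x0. Both are assumptions, and `ratioWeak` is a hypothetical helper.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Share of the total kmer volume contributed by weak kmers (x <= x0).
// y[x] holds the number of kmers occurring x times.
inline float ratioWeak(const std::vector<std::uint64_t>& y, std::size_t x0) {
    std::uint64_t weak = 0;
    std::uint64_t total = 0;
    for (std::size_t x = 0; x < y.size(); ++x) {
        const std::uint64_t volume = x * y[x];  // occurrences in bin x
        total += volume;
        if (x <= x0) weak += volume;
    }
    // An empty histogram has no volume; report 0 rather than divide by zero.
    return total == 0 ? 0.0f
                      : static_cast<float>(weak) / static_cast<float>(total);
}
```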

virtual u_int16_t get_solid_cutoff ( )
pure virtual

Get the solid cutoff, i.e. the x0 at the first minimum.

Returns
x0

Implemented in HistogramCache, HistogramNull, and Histogram.

virtual size_t getLength ( )
pure virtual

Return the maximum allowed for X.

Returns
the max X value.

Implemented in HistogramCache, HistogramNull, and Histogram.

virtual size_t getLength2 ( )
pure virtual

Return the maximum allowed for Y in case of 2D histogram.

Returns
the max Y value.

Implemented in HistogramCache, HistogramNull, and Histogram.

virtual void inc ( u_int16_t  index)
pure virtual

Increment the number of kmers occurring X times.

Parameters
[in]index: the X value.

Implemented in HistogramCache, HistogramNull, and Histogram.

virtual void inc2D ( u_int16_t  index1,
u_int16_t  index2 
)
pure virtual

Increment the number of kmers occurring X times in the genome and Y times in the reads.

Parameters
[in]index1: the X value.
[in]index2: the Y value.

Implemented in HistogramCache, HistogramNull, and Histogram.
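A 2D histogram of this kind can be stored as a flat array indexed by (X, Y). The sketch below shows one possible layout. `ToyHistogram2D` is a hypothetical class, not GATB code, and clamping out-of-range indices to the last bin is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 2D counterpart of an IHistogram implementation.
class ToyHistogram2D {
public:
    ToyHistogram2D(std::size_t length, std::size_t length2)
        : length_(length), length2_(length2),
          bins_((length + 1) * (length2 + 1), 0) {}

    std::size_t getLength() const { return length_; }
    std::size_t getLength2() const { return length2_; }

    // One more kmer seen x times in the genome and y times in the reads.
    void inc2D(std::uint16_t x, std::uint16_t y) { ++get2D(x, y); }

    // cpt(x, y): number of kmers recorded for the pair (x, y).
    std::uint64_t& get2D(std::uint16_t x, std::uint16_t y) {
        return bins_.at(index(x, y));
    }

private:
    // Row-major index; out-of-range values are clamped to the last bin.
    std::size_t index(std::size_t x, std::size_t y) const {
        if (x > length_) x = length_;
        if (y > length2_) y = length2_;
        return x * (length2_ + 1) + y;
    }

    std::size_t length_;
    std::size_t length2_;
    std::vector<std::uint64_t> bins_;
};
```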

virtual void save ( tools::storage::impl::Group &  group)
pure virtual

Save the distribution. It is saved into the bag provided at construction.

Implemented in HistogramCache, HistogramNull, and Histogram.


The documentation for this class was generated from the following file: