#include <statistics.h>

Public Member Functions | |
| StatisticsVector (dof_id_type i=0) | |
| StatisticsVector (dof_id_type i, T val) | |
| virtual | ~StatisticsVector () |
| virtual Real | l2_norm () const |
| virtual T | minimum () const |
| virtual T | maximum () const |
| virtual Real | mean () const |
| virtual Real | median () |
| virtual Real | median () const |
| virtual Real | variance () const |
| virtual Real | variance (const Real known_mean) const |
| virtual Real | stddev () const |
| virtual Real | stddev (const Real known_mean) const |
| void | normalize () |
| virtual void | histogram (std::vector< dof_id_type > &bin_members, unsigned int n_bins=10) |
| void | plot_histogram (const processor_id_type my_procid, const std::string &filename, unsigned int n_bins) |
| virtual void | histogram (std::vector< dof_id_type > &bin_members, unsigned int n_bins=10) const |
| virtual std::vector< dof_id_type > | cut_below (Real cut) const |
| virtual std::vector< dof_id_type > | cut_above (Real cut) const |
The StatisticsVector class is derived from std::vector<> and therefore has all of its useful features. It was designed to have no internal state, i.e. no public or private data members. It is also intended only for classes and types for which the operators +, *, / have meaning, specifically floats, doubles, ints, etc. The main reason for this design decision was to allow a std::vector<> to be successfully cast to a StatisticsVector, thereby enabling its additional functionality. We do not anticipate any problems with deriving from an STL container which lacks a virtual destructor in this case.
Where manipulation of the data set was necessary (for example, sorting), two versions of the member function have been implemented. The non-const versions sort the data set in place, which reorders the entries, so existing pointers and indices no longer refer to the same values. const versions of the same functions are generally available and are invoked automatically on const StatisticsVector objects. A drawback of the const versions is that they make a copy of the original object and therefore double the memory requirement for the data set.
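The in-place and copying variants can be illustrated without libMesh. The following is a minimal sketch on a plain std::vector<double>; the function names median_in_place and median_of_copy are hypothetical and chosen only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of libMesh: sorts `data` in place and
// returns the middle value (or the average of the two middle values).
double median_in_place(std::vector<double> & data)
{
  if (data.empty())
    return 0.;

  std::sort(data.begin(), data.end());

  const std::size_t lhs = (data.size() - 1) / 2;
  const std::size_t rhs = data.size() / 2;
  return (data[lhs] + data[rhs]) / 2.0;
}

// Hypothetical const-friendly counterpart: copies the data (doubling the
// memory footprint for the duration of the call) and sorts the copy.
double median_of_copy(const std::vector<double> & data)
{
  std::vector<double> copy(data);
  return median_in_place(copy);
}
```

Calling median_of_copy leaves the caller's ordering intact, while median_in_place leaves the vector sorted as a side effect.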
Most of the actual code was copied or adapted from the GNU Scientific Library (GSL). In particular, the mean is computed with a recurrence relation in order to avoid possible overflow when accumulating large sums.
Definition at line 77 of file statistics.h.
| libMesh::StatisticsVector< T >::StatisticsVector | ( | dof_id_type | i = 0 | ) | [inline, explicit] |
Call the std::vector constructor.
Definition at line 85 of file statistics.h.
: std::vector<T> (i) {}
| libMesh::StatisticsVector< T >::StatisticsVector | ( | dof_id_type | i, |
| T | val | ||
| ) | [inline] |
Call the std::vector constructor and fill each entry with val.
Definition at line 90 of file statistics.h.
: std::vector<T> (i,val) {}
| virtual libMesh::StatisticsVector< T >::~StatisticsVector | ( | ) | [inline, virtual] |
Destructor. Virtual so that classes can derive from StatisticsVector.
Definition at line 95 of file statistics.h.
{}
| std::vector< dof_id_type > libMesh::StatisticsVector< T >::cut_above | ( | Real | cut | ) | const [virtual] |
Returns a vector of dof_id_types which correspond to the indices of every member of the data set above the cutoff value cut. cut_above() and cut_below() were deliberately kept as two separate functions, since the interface is cleaner with one passed parameter instead of two.
Reimplemented in libMesh::ErrorVector.
Definition at line 366 of file statistics.C.
References libMesh::START_LOG().
{
START_LOG ("cut_above()", "StatisticsVector");
const dof_id_type n = cast_int<dof_id_type>(this->size());
std::vector<dof_id_type> cut_indices;
cut_indices.reserve(n/2); // Arbitrary
for (dof_id_type i=0; i<n; i++)
{
if ((*this)[i] > cut)
{
cut_indices.push_back(i);
}
}
STOP_LOG ("cut_above()", "StatisticsVector");
return cut_indices;
}
| std::vector< dof_id_type > libMesh::StatisticsVector< T >::cut_below | ( | Real | cut | ) | const [virtual] |
Returns a vector of dof_id_types which correspond to the indices of every member of the data set below the cutoff value "cut".
Reimplemented in libMesh::ErrorVector.
Definition at line 340 of file statistics.C.
References libMesh::START_LOG().
{
START_LOG ("cut_below()", "StatisticsVector");
const dof_id_type n = cast_int<dof_id_type>(this->size());
std::vector<dof_id_type> cut_indices;
cut_indices.reserve(n/2); // Arbitrary
for (dof_id_type i=0; i<n; i++)
{
if ((*this)[i] < cut)
{
cut_indices.push_back(i);
}
}
STOP_LOG ("cut_below()", "StatisticsVector");
return cut_indices;
}
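To show what cut_above() and cut_below() return for a small data set, here is a minimal standalone sketch on a std::vector<double>. The names indices_above and indices_below are hypothetical stand-ins, and std::size_t is used in place of dof_id_type so that the example compiles without libMesh.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cut_above(): indices of entries strictly above `cut`.
std::vector<std::size_t> indices_above(const std::vector<double> & data, double cut)
{
  std::vector<std::size_t> cut_indices;
  cut_indices.reserve(data.size() / 2); // arbitrary initial guess
  for (std::size_t i = 0; i < data.size(); ++i)
    if (data[i] > cut)
      cut_indices.push_back(i);
  return cut_indices;
}

// Hypothetical stand-in for cut_below(): indices of entries strictly below `cut`.
std::vector<std::size_t> indices_below(const std::vector<double> & data, double cut)
{
  std::vector<std::size_t> cut_indices;
  cut_indices.reserve(data.size() / 2); // arbitrary initial guess
  for (std::size_t i = 0; i < data.size(); ++i)
    if (data[i] < cut)
      cut_indices.push_back(i);
  return cut_indices;
}
```

Because both comparisons are strict, an entry exactly equal to the cutoff appears in neither result.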
| void libMesh::StatisticsVector< T >::histogram | ( | std::vector< dof_id_type > & | bin_members, |
| unsigned int | n_bins = 10 |
||
| ) | [virtual] |
Computes and returns a histogram with n_bins bins for the data set. For simplicity, the bins are assumed to be of uniform size. Upon return, the bin_members vector will contain unsigned integers which give the number of members in each bin. WARNING: This non-const function sorts the vector, changing its order. Source: GNU Scientific Library
Definition at line 190 of file statistics.C.
References end, libMesh::libmesh_assert(), std::max(), std::min(), libMesh::out, libMesh::Real, and libMesh::START_LOG().
Referenced by libMesh::StatisticsVector< T >::histogram().
{
// Must have at least 1 bin
libmesh_assert (n_bins>0);
const dof_id_type n = cast_int<dof_id_type>(this->size());
std::sort(this->begin(), this->end());
// The StatisticsVector can hold both integer and float types.
// We will define all the bins, etc. using Reals.
Real min = static_cast<Real>(this->minimum());
Real max = static_cast<Real>(this->maximum());
Real bin_size = (max - min) / static_cast<Real>(n_bins);
START_LOG ("histogram()", "StatisticsVector");
std::vector<Real> bin_bounds(n_bins+1);
for (unsigned int i=0; i<bin_bounds.size(); i++)
bin_bounds[i] = min + i * bin_size;
// Give the last bin boundary a little wiggle room: we don't want
// it to be just barely less than the max, otherwise our bin test below
// may fail.
bin_bounds.back() += 1.e-6 * bin_size;
// This vector will store the number of members each bin has.
bin_members.resize(n_bins);
dof_id_type data_index = 0;
for (unsigned int j=0; j<bin_members.size(); j++) // bin vector indexing
{
// libMesh::out << "(debug) Filling bin " << j << std::endl;
for (dof_id_type i=data_index; i<n; i++) // data vector indexing
{
//libMesh::out << "(debug) Processing index=" << i << std::endl;
Real current_val = static_cast<Real>( (*this)[i] );
// There may be entries in the vector smaller than the value
// reported by this->minimum(). (e.g. inactive elements in an
// ErrorVector.) We just skip entries like that.
if ( current_val < min )
{
// libMesh::out << "(debug) Skipping entry v[" << i << "]="
// << (*this)[i]
// << " which is less than the min value: min="
// << min << std::endl;
continue;
}
if ( current_val > bin_bounds[j+1] ) // if outside the current bin (bin[j] is bounded
// by bin_bounds[j] and bin_bounds[j+1])
{
// libMesh::out.precision(16);
// libMesh::out.setf(std::ios_base::fixed);
// libMesh::out << "(debug) (*this)[i]= " << (*this)[i]
// << " is greater than bin_bounds[j+1]="
// << bin_bounds[j+1] << std::endl;
data_index = i; // start searching here for next bin
break; // go to next bin
}
// Otherwise, increment current bin's count
bin_members[j]++;
// libMesh::out << "(debug) Binned index=" << i << std::endl;
}
}
#ifdef DEBUG
// Check the number of binned entries
const dof_id_type n_binned = std::accumulate(bin_members.begin(),
bin_members.end(),
static_cast<dof_id_type>(0),
std::plus<dof_id_type>());
if (n != n_binned)
{
libMesh::out << "Warning: The number of binned entries, n_binned="
<< n_binned
<< ", did not match the total number of entries, n="
<< n << "." << std::endl;
}
#endif
STOP_LOG ("histogram()", "StatisticsVector");
}
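The uniform binning can also be sketched without sorting. The following is a simplified illustration, not the sort-and-sweep algorithm listed above: it computes each entry's bin index directly from its offset from the minimum, leaves the data order unchanged, and bins every entry rather than skipping any. The function name uniform_histogram is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not libMesh code: counts how many entries of `data`
// fall into each of `n_bins` uniform bins spanning [min, max].
std::vector<std::size_t> uniform_histogram(const std::vector<double> & data,
                                           unsigned int n_bins)
{
  assert(n_bins > 0); // must have at least 1 bin

  std::vector<std::size_t> bin_members(n_bins, 0);
  if (data.empty())
    return bin_members;

  const auto min_max = std::minmax_element(data.begin(), data.end());
  const double min = *min_max.first;
  const double max = *min_max.second;
  const double bin_size = (max - min) / static_cast<double>(n_bins);

  for (const double value : data)
    {
      // When bin_size is zero all entries are equal: put them in bin 0.
      std::size_t j = 0;
      if (bin_size > 0.)
        j = static_cast<std::size_t>((value - min) / bin_size);

      // The maximum lands on the upper edge; keep it in the last bin.
      j = std::min<std::size_t>(j, n_bins - 1);
      ++bin_members[j];
    }

  return bin_members;
}
```

Clamping the index to the last bin plays the same role as the small "wiggle room" added to the final bin boundary in the listing above.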
| void libMesh::StatisticsVector< T >::histogram | ( | std::vector< dof_id_type > & | bin_members, |
| unsigned int | n_bins = 10 |
||
| ) | const [virtual] |
A const version of the histogram function.
Definition at line 328 of file statistics.C.
References libMesh::StatisticsVector< T >::histogram().
{
StatisticsVector<T> sv = (*this);
return sv.histogram(bin_members, n_bins);
}
| Real libMesh::StatisticsVector< T >::l2_norm | ( | ) | const [virtual] |
Returns the l2 norm of the data set.
Definition at line 36 of file statistics.C.
References libMesh::Real.
{
Real normsq = 0.;
const dof_id_type n = cast_int<dof_id_type>(this->size());
for (dof_id_type i = 0; i != n; ++i)
normsq += ((*this)[i] * (*this)[i]);
return std::sqrt(normsq);
}
| T libMesh::StatisticsVector< T >::maximum | ( | ) | const [virtual] |
Returns the maximum value in the data set.
Definition at line 63 of file statistics.C.
References end, std::max(), and libMesh::START_LOG().
| Real libMesh::StatisticsVector< T >::mean | ( | ) | const [virtual] |
Returns the mean value of the data set using a recurrence relation. Source: GNU Scientific Library
Reimplemented in libMesh::ErrorVector.
Definition at line 78 of file statistics.C.
References libMesh::Real, and libMesh::START_LOG().
Referenced by libMesh::StatisticsVector< ErrorVectorReal >::variance().
{
START_LOG ("mean()", "StatisticsVector");
const dof_id_type n = cast_int<dof_id_type>(this->size());
Real the_mean = 0;
for (dof_id_type i=0; i<n; i++)
{
the_mean += ( static_cast<Real>((*this)[i]) - the_mean ) /
static_cast<Real>(i + 1);
}
STOP_LOG ("mean()", "StatisticsVector");
return the_mean;
}
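To see why the recurrence helps with large sums, the following standalone sketch compares it with a textbook sum-then-divide mean. Both function names are hypothetical, and the example assumes IEEE 754 double arithmetic.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: textbook mean, which forms the full sum first.
double naive_mean(const std::vector<double> & data)
{
  double sum = 0.;
  for (const double value : data)
    sum += value;
  return data.empty() ? 0. : sum / static_cast<double>(data.size());
}

// Hypothetical helper using the same recurrence as the listing above:
// in exact arithmetic the running value is always the average of the
// entries seen so far, so no large intermediate sum is formed.
double running_mean(const std::vector<double> & data)
{
  double the_mean = 0.;
  for (std::size_t i = 0; i < data.size(); ++i)
    the_mean += (data[i] - the_mean) / static_cast<double>(i + 1);
  return the_mean;
}
```

For three entries of 1e308, the intermediate sum in naive_mean exceeds the largest finite double and becomes infinite, whereas running_mean stays at 1e308.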
| Real libMesh::StatisticsVector< T >::median | ( | ) | [virtual] |
Returns the median (i.e. the middle) value of the data set. This function modifies the original data by sorting, so it can't be called on const objects. Source: GNU Scientific Library
Reimplemented in libMesh::ErrorVector.
Definition at line 101 of file statistics.C.
References end, libMesh::Real, and libMesh::START_LOG().
Referenced by libMesh::ErrorVector::median(), and libMesh::StatisticsVector< T >::median().
{
const dof_id_type n = cast_int<dof_id_type>(this->size());
if (n == 0)
return 0.;
START_LOG ("median()", "StatisticsVector");
std::sort(this->begin(), this->end());
const dof_id_type lhs = (n-1) / 2;
const dof_id_type rhs = n / 2;
Real the_median = 0;
if (lhs == rhs)
{
the_median = static_cast<Real>((*this)[lhs]);
}
else
{
the_median = ( static_cast<Real>((*this)[lhs]) +
static_cast<Real>((*this)[rhs]) ) / 2.0;
}
STOP_LOG ("median()", "StatisticsVector");
return the_median;
}
| Real libMesh::StatisticsVector< T >::median | ( | ) | const [virtual] |
A const version of the median function. Requires twice the memory of the original data set but does not change the original.
Reimplemented in libMesh::ErrorVector.
Definition at line 138 of file statistics.C.
References libMesh::StatisticsVector< T >::median().
{
StatisticsVector<T> sv = (*this);
return sv.median();
}
| T libMesh::StatisticsVector< T >::minimum | ( | ) | const [virtual] |
Returns the minimum value in the data set.
Reimplemented in libMesh::ErrorVector.
Definition at line 48 of file statistics.C.
References end, std::min(), and libMesh::START_LOG().
| void libMesh::StatisticsVector< T >::normalize | ( | ) |
Divides all entries by the largest entry and stores the result.
Definition at line 174 of file statistics.C.
References std::max(), and libMesh::Real.
{
const dof_id_type n = cast_int<dof_id_type>(this->size());
const Real max = this->maximum();
for (dof_id_type i=0; i<n; i++)
{
(*this)[i] = static_cast<T>((*this)[i] / max);
}
}
| void libMesh::StatisticsVector< T >::plot_histogram | ( | const processor_id_type | my_procid, |
| const std::string & | filename, | ||
| unsigned int | n_bins | ||
| ) |
Generates a Matlab/Octave style file which can be used to make a plot of the histogram having the desired number of bins. Uses the histogram(...) function in this class. WARNING: The histogram(...) function is non-const, and changes the order of the vector.
Definition at line 285 of file statistics.C.
References std::max(), and std::min().
{
// First generate the histogram with the desired number of bins
std::vector<dof_id_type> bin_members;
this->histogram(bin_members, n_bins);
// The max, min and bin size are used to generate x-axis values.
T min = this->minimum();
T max = this->maximum();
T bin_size = (max - min) / static_cast<T>(n_bins);
// On processor 0: Write histogram to file
if (my_procid==0)
{
std::ofstream out_stream (filename.c_str());
out_stream << "clear all\n";
out_stream << "clf\n";
//out_stream << "x=linspace(" << min << "," << max << "," << n_bins+1 << ");\n";
// abscissa values are located at the center of each bin.
out_stream << "x=[";
for (unsigned int i=0; i<bin_members.size(); ++i)
{
out_stream << min + (i+0.5)*bin_size << " ";
}
out_stream << "];\n";
out_stream << "y=[";
for (unsigned int i=0; i<bin_members.size(); ++i)
{
out_stream << bin_members[i] << " ";
}
out_stream << "];\n";
out_stream << "bar(x,y);\n";
}
}
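The layout of the generated script can be reproduced in isolation. The sketch below writes the same lines to an arbitrary output stream; write_histogram_script and histogram_script are hypothetical helpers, and the bin counts are passed in rather than computed.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not libMesh code: writes the same script layout as
// the listing above to any output stream.
void write_histogram_script(std::ostream & out_stream,
                            double min,
                            double bin_size,
                            const std::vector<std::size_t> & bin_members)
{
  out_stream << "clear all\n";
  out_stream << "clf\n";

  // Abscissa values are located at the center of each bin.
  out_stream << "x=[";
  for (std::size_t i = 0; i < bin_members.size(); ++i)
    out_stream << min + (static_cast<double>(i) + 0.5) * bin_size << " ";
  out_stream << "];\n";

  // Ordinate values are the bin counts.
  out_stream << "y=[";
  for (std::size_t i = 0; i < bin_members.size(); ++i)
    out_stream << bin_members[i] << " ";
  out_stream << "];\n";

  out_stream << "bar(x,y);\n";
}

// Hypothetical convenience wrapper returning the script as a string.
std::string histogram_script(double min,
                             double bin_size,
                             const std::vector<std::size_t> & bin_members)
{
  std::ostringstream out;
  write_histogram_script(out, min, bin_size, bin_members);
  return out.str();
}
```

With a minimum of 0, a bin size of 1 and counts of 2 and 3, the script contains the lines `x=[0.5 1.5 ];` and `y=[2 3 ];` followed by `bar(x,y);`.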
| virtual Real libMesh::StatisticsVector< T >::stddev | ( | ) | const [inline, virtual] |
Computes the standard deviation of the data set, which is simply the square-root of the variance.
Definition at line 165 of file statistics.h.
{ return std::sqrt(this->variance()); }
| virtual Real libMesh::StatisticsVector< T >::stddev | ( | const Real | known_mean | ) | const [inline, virtual] |
Computes the standard deviation of the data set, which is simply the square-root of the variance. This method can be used for efficiency when the mean has already been computed.
Definition at line 174 of file statistics.h.
{ return std::sqrt(this->variance(known_mean)); }
| virtual Real libMesh::StatisticsVector< T >::variance | ( | ) | const [inline, virtual] |
Computes the variance of the data set. Uses a recurrence relation to prevent data overflow for large sums. Note: The variance is equal to the standard deviation squared. Source: GNU Scientific Library
Reimplemented in libMesh::ErrorVector.
Definition at line 145 of file statistics.h.
Referenced by libMesh::StatisticsVector< ErrorVectorReal >::stddev(), and libMesh::StatisticsVector< ErrorVectorReal >::variance().
| Real libMesh::StatisticsVector< T >::variance | ( | const Real | known_mean | ) | const [virtual] |
Computes the variance of the data set where the mean is provided. This is useful for efficiency when you have already calculated the mean. Uses a recurrence relation to prevent data overflow for large sums. Note: The variance is equal to the standard deviation squared. Source: GNU Scientific Library
Reimplemented in libMesh::ErrorVector.
Definition at line 149 of file statistics.C.
References libMesh::Real, and libMesh::START_LOG().
{
const dof_id_type n = cast_int<dof_id_type>(this->size());
START_LOG ("variance()", "StatisticsVector");
Real the_variance = 0;
for (dof_id_type i=0; i<n; i++)
{
// mean_in holds the supplied mean (the parameter shown as known_mean above).
const Real delta = ( static_cast<Real>((*this)[i]) - mean_in );
the_variance += (delta * delta - the_variance) /
static_cast<Real>(i + 1);
}
if (n > 1)
the_variance *= static_cast<Real>(n) / static_cast<Real>(n - 1);
STOP_LOG ("variance()", "StatisticsVector");
return the_variance;
}
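The loop above can be written as a recurrence. In the notation below, which is introduced here only for illustration, $x_i$ are the entries, $\mu$ is the supplied mean, $v_i$ is the running value of the_variance after $i$ entries, and $s^2$ is the returned result.

```latex
\begin{aligned}
\delta_i &= x_i - \mu, \\
v_0 &= 0, \qquad v_{i+1} = v_i + \frac{\delta_i^2 - v_i}{i+1}, \quad i = 0, \dots, n-1, \\
v_n &= \frac{1}{n} \sum_{i=0}^{n-1} \delta_i^2, \\
s^2 &= \frac{n}{n-1}\, v_n = \frac{1}{n-1} \sum_{i=0}^{n-1} (x_i - \mu)^2 \qquad (n > 1).
\end{aligned}
```

The final rescaling by $n/(n-1)$ corresponds to the `if (n > 1)` branch, so the function returns the sample variance whenever there is more than one entry.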