Thursday, September 15, 2011

Using Boost accumulators to calculate the variance

Boost looks like a promising set of C++ libraries, but I didn't find it easy to figure out how to use them. After several trials, I finally got the variance calculated.

Here is the testing code I used (in VS2005):

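#include <iostream>  // std::cout, std::endl

#include <boost/accumulators/accumulators.hpp>
#include <boost/accumulators/statistics/stats.hpp>
#include <boost/accumulators/statistics/variance.hpp>

int main(){
    using namespace boost::accumulators;

    // Accumulator over double samples that tracks the variance.
    accumulator_set< double, stats<tag::variance> > acc_variance;

    // Feed the samples 0..9 into the accumulator, echoing each one.
    for (int i = 0; i < 10; i++){
        std::cout << i << ", ";
        acc_variance(i);
    }

    // Extract the variance of everything accumulated so far.
    std::cout << std::endl << "Variance = "
        << variance(acc_variance) << std::endl;

    return 0;
}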

The output is:
0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
Variance = 8.25

When I tried to verify the result in MS Excel, I got an unexpected result: the variances given by Boost and Excel were different!

Did I make a mistake? No. I hadn't realised that the variance calculated by the Boost function is the population variance (dividing by n), while Excel's VAR function gives the sample variance (dividing by n - 1). For these ten values, VAR gives about 9.17 rather than 8.25. To get the population variance in Excel, you need VARP, not VAR.


---
Ref: Variance calcs in the stats library and in Excel
