Thursday, September 15, 2011

Using Boost accumulators to calculate the variance

Boost seems to be a promising set of C++ libraries, but I found it hard to figure out how to use it. After several attempts, I finally got the variance calculated.

Here is the testing code I used (in VS2005):

#include <iostream>
#include <boost/accumulators/accumulators.hpp>
#include <boost/accumulators/statistics/stats.hpp>
#include <boost/accumulators/statistics/variance.hpp>

int main(){
    using namespace boost::accumulators;
    accumulator_set< double, stats<tag::variance> > acc_variance;

    for (int i = 0; i < 10; i++){
        std::cout << i << ", ";
        acc_variance(i);
    }

    std::cout << std::endl << "Variance = " 
        << variance(acc_variance) << std::endl;

    return 0;
}

The output is:
0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
Variance = 8.25

When I tried to verify the result in MS Excel, I got an unexpected answer: the variances given by Boost and Excel were different!

Did I make a mistake? It turns out I hadn't realised that the variance calculated by the Boost function is the population variance (the sum of squared deviations divided by n), whereas Excel's VAR function gives the sample variance (divided by n - 1). To get the population variance in Excel, you need VARP, not VAR.


---
Ref: Variance calcs in the stats library and in Excel

2 comments:

  1. Anonymous, 10:45 AM

    The variance computed by Boost Accumulators is the sample variance, not the population variance as you state here.

    1. Really? I have no idea about that.

      But according to the test results, the output from Boost Accumulators was identical to that of Excel's VARP function, which returns the population variance, doesn't it?
      (Or maybe I have misunderstood the definitions of population and sample variance...)
