Subject: Re: [boost] interest in structure of arrays container?
From: Michael Marcin (mike.marcin_at_[hidden])
Date: 2016-10-15 04:14:00


On 10/14/2016 11:32 PM, degski wrote:
> On 15 October 2016 at 05:43, Michael Marcin <mike.marcin_at_[hidden]> wrote:
>
>> Does this already exist somewhere?
>>
>
> Take a look at Boost.DoubleEnded 1 <http://erenon.hu/double_ended/> and see
> whether boost::double_ended::batch_deque
> <http://erenon.hu/double_ended/double_ended/design_rationale.html#double_ended.design_rationale.batch_deque>
> fills all of your requirements.
>

Sorry I think I wasn't clear.

Let me back up a bit and give some examples.

Arrays of Structures is the normal programming model.

Let's take a toy example, say I'm looking at some census data and have a
structure that looks like:

struct citizen_t {
     string first_name;
     string last_name;
     int salary;
     int age;
};

Then the Array of Structures container would be:
vector<citizen_t> aos_citizens;

The Structure of Arrays version of the same data would look like:

struct soa_citizens_t {
     vector<string> first_names;
     vector<string> last_names;
     vector<int> salary;
     vector<int> age;
};

soa_citizens_t soa_citizens;

Why does this matter?

Let's say I want to calculate the average salary of 300 million
citizens. The code is a simple iterative average and very simple.

// Array of Structures
int avg = 0;
int t = 1;
for ( auto & c : aos_citizens ) {
     avg += (c.salary - avg) / t;
     ++t;
}

// Structures of Arrays
int avg = 0;
int t = 1;
for ( int salary : soa_citizens.salary ) {
     avg += (salary - avg) / t;
     ++t;
}

Run this under a simple benchmark:
https://ideone.com/QNqKpD

AoS 3.03523 seconds
SoA 1.94902 seconds

Both loops are doing the exact same work but the memory layout allows
the second loop to run much faster.