Subject: Re: [boost] Proposal: MapReduce library (single machine)
From: Craig Henderson (cdm.henderson_at_[hidden])
Date: 2009-06-16 02:27:54
> > I'm running some tests and will update the site with performance
> > comparisons shortly
> >
> Great
I've posted metrics from three runs of WordCount on a ~10 GB dataset at
http://www.craighenderson.co.uk/mapreduce/
As you would expect, scalability is not linear, because there is contention
in reading the files from 8 or 16 threads simultaneously. This is where
multi-machine MapReduce clearly comes into its own - assuming the data is
distributed across a decent replicated filesystem.
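
To put a number on "not linear", one common way to summarise runs like these is speedup and parallel efficiency relative to a single-threaded baseline. The following is only a minimal sketch, not code from the library; the function names speedup and efficiency are hypothetical, and it assumes wall-clock times in seconds for the same dataset.

```cpp
// Hypothetical helpers for illustration; not part of the proposed library.

// Speedup of a parallel run relative to a single-threaded baseline.
// Both arguments are wall-clock times in seconds for the same workload.
double speedup(double baseline_seconds, double parallel_seconds) {
    return baseline_seconds / parallel_seconds;
}

// Parallel efficiency: speedup divided by the number of threads used.
// 1.0 would mean perfectly linear scaling; I/O contention pushes it lower.
double efficiency(double baseline_seconds, double parallel_seconds,
                  unsigned threads) {
    return speedup(baseline_seconds, parallel_seconds) / threads;
}
```

As a made-up example (these are not the posted metrics): a job taking 100 s on one thread and 25 s on 8 threads has a speedup of 4 and an efficiency of 0.5, i.e. half of what linear scaling would give.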
-- Craig