Subject: [Boost-commit] svn:boost r55134 - in sandbox/libs/mapreduce: . doc test test/wordcount
From: cdm.henderson_at_[hidden]
Date: 2009-07-23 15:04:46
Author: chenderson
Date: 2009-07-23 15:04:45 EDT (Thu, 23 Jul 2009)
New Revision: 55134
URL: http://svn.boost.org/trac/boost/changeset/55134
Log:
Initial upload; based on v0.2 from the Boost Vault.
Added:
   sandbox/libs/mapreduce/
   sandbox/libs/mapreduce/doc/
   sandbox/libs/mapreduce/doc/future.html   (contents, props changed)
   sandbox/libs/mapreduce/doc/index.html   (contents, props changed)
   sandbox/libs/mapreduce/doc/platform.html   (contents, props changed)
   sandbox/libs/mapreduce/doc/schedule_policies.html   (contents, props changed)
   sandbox/libs/mapreduce/doc/tutorial.html   (contents, props changed)
   sandbox/libs/mapreduce/doc/wordcount.html   (contents, props changed)
   sandbox/libs/mapreduce/mapreduce.sln   (contents, props changed)
   sandbox/libs/mapreduce/mapreduce.vcproj   (contents, props changed)
   sandbox/libs/mapreduce/test/
   sandbox/libs/mapreduce/test/wordcount/
   sandbox/libs/mapreduce/test/wordcount/wordcount.cpp   (contents, props changed)
   sandbox/libs/mapreduce/test/wordcount/wordcount.vcproj   (contents, props changed)
Added: sandbox/libs/mapreduce/doc/future.html
==============================================================================
--- (empty file)
+++ sandbox/libs/mapreduce/doc/future.html	2009-07-23 15:04:45 EDT (Thu, 23 Jul 2009)
@@ -0,0 +1,137 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
+    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> 
+ 
+<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"> 
+<head> 
+  <title>Boost.MapReduce Future Work</title> 
+  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> 
+  <link href="http://www.boost.org/favicon.ico" rel="icon" type="http://www.boost.org/image/ico" /> 
+  <link rel="stylesheet" type="text/css" href="http://www.boost.org/style/basic.css" /> 
+</head> 
+ 
+<body> 
+  <div id="heading"> 
+      <div id="heading-placard"></div> 
+ 
+  <h1 id="heading-title"><a href="/"><img src="http://www.boost.org/gfx/space.png" alt=
+  "Boost C++ Libraries" id="heading-logo" /><span id="boost">Boost</span> 
+  <span id="cpplibraries">C++ Libraries</span></a></h1> 
+ 
+  <p id="heading-quote"><span class="quote">“...one of the most highly
+  regarded and expertly designed C++ library projects in the
+  world.”</span> <span class="attribution">— <a href=
+  "http://www.gotw.ca/" class="external">Herb Sutter</a> and <a href=
+  "http://en.wikipedia.org/wiki/Andrei_Alexandrescu" class="external">Andrei
+  Alexandrescu</a>, <a href=
+  "http://safari.awprofessional.com/?XmlId=0321113586" class="external">C++
+  Coding Standards</a></span></p> 
+  </div> 
+ 
+  <div id="body"> 
+    <div id="body-inner"> 
+      <div id="content"> 
+        <div class="section"> 
+          <div class="section-0"> 
+            <div class="section-title"> 
+              <h1>Boost.MapReduce Future Work</h1> 
+              <em>Note: This library is not yet part of the Boost Library and is still under development and review.</em> 
+            </div> 
+ 
+            <div class="section-body">
+            <p>
+              This is the first release of the MapReduce library, and there are a few features
+              that I'd still like to do.
+            </p>
+            <ul>
+              <li>
+                <p>Improve support for other platforms. This will require help from the Boost development community.</p>
+              </li>
+              <li>
+                <p>Add a <code>PartioningFunction</code> parameter to the <code>local_disk</code> intermediate
+                handler to enable customisation of the partitioning of data into the final result files.</p>
+              </li>
+              <li>
+                <p>Add a template to the <code>SortFn</code> sort function to prevent expansion of duplicates
+                if required. (For example, this expansion counteracts the <code>combiner</code> in wordcount,
+                and eliminating both would improve performance considerably.)</p>
+              </li>
+              <li>
+                <p>The only intermediate handler currently provided by the library is the <code>intermediates::local_disk<></code>
+                policy class. An early implementation of the library used in-memory storage for intermediates, and it
+                may be useful to redevelop this as a fully-fledged intermediate policy class.</p>
+              </li>
+              <li>
+                <p>An extension to the <code>intermediates::local_disk<></code> policy class could be to compress
+                the intermediate files, using the Boost.Iostreams zip/bzip2 compression libraries. This is a
+                long-term item that will be very useful when the library is extended to support cross-machine
+                MapReduce. Until then, the value is very limited.</p>
+              </li>
+            </ul>
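+            <p>As a rough illustration of the partitioning item in the list above, a partitioning function
+            could be a small functor that maps a key to one of a fixed number of result files. The sketch below
+            is an assumption, not library code: the names <code>hash_partitioner</code> and
+            <code>first_char_partitioner</code>, their call signature, and the use of C++11 <code>std::hash</code>
+            are all hypothetical.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical partitioning functor: maps a key to one of `partitions`
// result files. The name and call signature are illustrative only and are
// not taken from the library.
struct hash_partitioner
{
    template<typename Key>
    std::size_t operator()(Key const &key, std::size_t partitions) const
    {
        assert(partitions > 0);   // at least one result file is required
        return std::hash<Key>()(key) % partitions;
    }
};

// Alternative hypothetical policy: partition words by their first character,
// so that all words starting with the same letter land in the same file.
struct first_char_partitioner
{
    std::size_t operator()(std::string const &key, std::size_t partitions) const
    {
        assert(partitions > 0);
        if (key.empty())
            return 0;             // empty keys all go to the first file
        return static_cast<unsigned char>(key[0]) % partitions;
    }
};
```

+            <p>Either functor could be passed as a template argument to an intermediate handler, which would
+            call it once per key to choose the result file. Both return the same partition every time for
+            the same key, which is what allows a single Reduce task to see all values for that key.</p>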
+            <h2>Multiple Machine Support</h2>
+            <p>
+              MapReduce was originally designed as a mechanism for working on large datasets across many (1000s of)
+              commodity servers. The current Boost library works across a plurality of CPU cores on a single machine.
+              There is a big jump to multi-machine support, so this is a long-term goal, but a goal nonetheless.
+            </p>
+            <h2>Distributed File System</h2>
+            <p>
+              To support MapReduce across multiple machines, some form of distributed file system is required. I
+              have <a href='http://craighenderson.co.uk/blog/index.php/tag/distributed-file-system/'>begun development
+              of one using Boost libraries</a> (primarily Boost.Filesystem and Boost.Asio). The
+              question is going to be whether this really sits within Boost as a C++ library, or whether it is really
+              a runtime environment for MapReduce to sit atop. My feeling is that there is some value in having a scalable
+              and resilient DFS which is peerless and heterogeneous across all platforms as a library that can be built into
+              an application, but whether that is really the case remains to be seen.
+            </p>
+            </div>
+          </div> 
+        </div> 
+      </div> 
+      <div id="sidebar"> 
+        <a accesskey="p" href="./platform.html"><img src="http://www.boost.org/doc/html/images/prev.png" alt="Prev" /></a>
+        <a accesskey="u" href="http://www.boost.org/doc/libs"><img src="http://www.boost.org/doc/html/images/up.png" alt="Up" /></a>
+        <a accesskey="h" href="http://www.boost.org/"><img src="http://www.boost.org/doc/html/images/home.png" alt="Home" /></a>
+
+        <hr />
+        <p><a href='./index.html'>Boost.MapReduce</a></p>
+        <p><a href='./tutorial.html'>Tutorial</a></p>
+        <p><a href='./wordcount.html'>Example</a></p>
+        <hr />
+        <p><a href='./schedule_policies.html'>Schedule Policies</a></p>
+        <p><a href='./platform.html'>Platform Notes</a></p>
+        <p><a href='./future.html'>Future Work</a></p>
+      </div>
+      <div class="clear"></div> 
+    </div> 
+  </div> 
+ 
+  <div id="footer"> 
+    <div id="footer-left"> 
+ 
+      <div id="copyright"> 
+        <p>Copyright (C) 2009 Craig Henderson.</p> 
+       </div>  <div id="license"> 
+    <p>Distributed under the <a href="/LICENSE_1_0.txt" class=
+    "internal">Boost Software License, Version 1.0</a>.</p> 
+  </div> 
+    </div> 
+ 
+    <div id="footer-right"> 
+        <div id="banners"> 
+    <p id="banner-xhtml"><a href="http://validator.w3.org/check?uri=referer"
+    class="external">XHTML 1.0</a></p> 
+ 
+    <p id="banner-css"><a href=
+    "http://jigsaw.w3.org/css-validator/check/referer" class=
+    "external">CSS</a></p> 
+ 
+    <p id="banner-osi"><a href=
+    "http://www.opensource.org/docs/definition.php" class="external">OSI
+    Certified</a></p> 
+  </div> 
+    </div> 
+ 
+    <div class="clear"></div> 
+  </div> 
+</body> 
+</html> 
\ No newline at end of file
Added: sandbox/libs/mapreduce/doc/index.html
==============================================================================
--- (empty file)
+++ sandbox/libs/mapreduce/doc/index.html	2009-07-23 15:04:45 EDT (Thu, 23 Jul 2009)
@@ -0,0 +1,194 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
+    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> 
+ 
+<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"> 
+<head> 
+  <title>Boost.MapReduce Documentation</title> 
+  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> 
+  <link href="http://www.boost.org/favicon.ico" rel="icon" type="http://www.boost.org/image/ico" /> 
+  <link rel="stylesheet" type="text/css" href="http://www.boost.org/style/basic.css" /> 
+</head> 
+ 
+<body> 
+  <div id="heading"> 
+      <div id="heading-placard"></div> 
+ 
+  <h1 id="heading-title"><a href="/"><img src="http://www.boost.org/gfx/space.png" alt=
+  "Boost C++ Libraries" id="heading-logo" /><span id="boost">Boost</span> 
+  <span id="cpplibraries">C++ Libraries</span></a></h1> 
+ 
+  <p id="heading-quote"><span class="quote">“...one of the most highly
+  regarded and expertly designed C++ library projects in the
+  world.”</span> <span class="attribution">— <a href=
+  "http://www.gotw.ca/" class="external">Herb Sutter</a> and <a href=
+  "http://en.wikipedia.org/wiki/Andrei_Alexandrescu" class="external">Andrei
+  Alexandrescu</a>, <a href=
+  "http://safari.awprofessional.com/?XmlId=0321113586" class="external">C++
+  Coding Standards</a></span></p> 
+ 
+  </div> 
+ 
+  <div id="body"> 
+    <div id="body-inner"> 
+      <div id="content"> 
+        <div class="section"> 
+          <div class="section-0"> 
+            <div class="section-title"> 
+              <h1>Boost.MapReduce</h1>
+              <em>Note: This library is not yet part of the Boost Library and is still under development and review.</em> 
+            </div> 
+ 
+            <div class="section-body"> 
+            <p><em>Copyright © 2009 Craig Henderson</em></p>
+            <p>Distributed under the Boost Software License, Version 1.0.<br />(See accompanying file LICENSE_1_0.txt
+            or copy at <a href='http://www.boost.org/LICENSE_1_0.txt' target='_blank'>http://www.boost.org/LICENSE_1_0.txt</a>)</p>
+
+              <h2>Motivation</h2>
+              <p>
+                MapReduce is a programming model and distributed processing platform implementation for generating and
+                processing large data sets using clusters of computers. Pioneered by Google and first presented in 2004,
+                the MapReduce programming model has gained significant momentum in commercial, research and open-source
+                projects since, and Google updated and republished their seminal paper in 2008.
+              </p>
+              <p>
+                The scalability achieved using MapReduce to implement data processing across a large number of CPUs, whether
+                on a single server or multiple machines, is an attractive proposition. The Boost.MapReduce library is a
+                MapReduce implementation across a plurality of CPU cores rather than machines. The library is implemented
+                as a set of C++ class templates, and is a header-only library. It does, however, depend upon many other
+                Boost libraries, such as Boost.System, Boost.Filesystem and Boost.Thread.
+              </p>
+              <h2>Other Implementations</h2>
+              <p>
+                The Google MapReduce framework is written in C++ and is not made available publicly. Hadoop is an Apache
+                project implementation of MapReduce, originally developed as an infrastructure for the Nutch Java Search
+                Engine project. Hadoop is written in Java, with interfaces to a number of programming languages including
+                C++ and Python. This system includes a distributed file system HDFS (Hadoop Distributed File System), which
+                is highly fault-tolerant and designed to be deployed on low-cost hardware. HDFS provides high throughput
+                access to application data and is suitable for applications that have large data sets.
+              </p>
+              <p>
+                Phoenix is a shared-memory implementation of MapReduce. Phoenix can be used to program multi-core chips as
+                well as shared-memory multiprocessors (SMPs and ccNUMAs) and is available from the original authors for the
+                Sun Solaris operating system. A port to the Linux operating system is also available. The Phoenix source code
+                is distributed under a BSD license and the copyright is held by Stanford University.
+              </p>
+              <p>
+                Phoenix runs on a single computer and implements MapReduce across a plurality of CPU cores rather than machines
+                as in the Google and Hadoop implementations. This single-machine restriction simplifies the architecture
+                significantly. In place of the distributed file system, Phoenix uses a shared memory model for storing data to be
+                processed, and the results. Each Map or Reduce task runs on a CPU core and the Phoenix runtime is responsible
+                for consolidating results and load balancing (allocating data to Map and Reduce tasks). The complexities of
+                network communication and fault tolerance are not required for the Phoenix framework on a single server.
+              </p>
+              <h1>Change History</h1>
+              <dl class="fields"> 
+                <dt>21st July 2009</dt>
+                <dd>
+                  <a href='http://www.boostpro.com/vault/index.php?action=downloadfile&filename=mapreduce_0_2.zip&directory=&'>
+                    DOWNLOAD v0.2
+                  </a><br />
+                  <ul>
+                  <li>Moved the library into the <code>boost</code> namespace.</li>
+                  <li>Created <code>PartitionFn</code> template parameter on <code>intermediates::local_disk</code> to
+                  enable customisation of the partitioning of data into result files.</li>
+                  <li>Use of <code>BOOST_THROW_EXCEPTION</code> in place of <code>throw</code>.</li>
+                  <li>Rationalised and completed include guards</li>
+                  <li>Support for gcc 4.3.3 on Ubuntu Linux</li>
+                  </ul>
+                </dd>
+              </dl>
+              <dl class="fields"> 
+                <dt>19th July 2009</dt>
+                <dd>
+                  <a href='http://www.boostpro.com/vault/index.php?action=downloadfile&filename=mapreduce_0_1.zip&directory=&'>
+                    DOWNLOAD v0.1
+                  </a><br />
+                  Initial public release on Boost Vault<br />
+                </dd>
+              </dl>
+              <h1>References</h1>
+                <dl class="fields"> 
+                    <dt>Title</dt> 
+                    <dd>MapReduce: Simplified Data Processing on Large Clusters</dd> 
+                    <dt>Author(s)</dt> 
+                    <dd>Jeffrey Dean and Sanjay Ghemawat</dd> 
+                    <dt>Appeared in</dt> 
+                    <dd>OSDI'04: Sixth Symposium on Operating System Design and Implementation, San Francisco, CA, December, 2004.</dd>
+                    <dt>URL</dt>
+                    <dd><a target="_blank" href='http://labs.google.com/papers/mapreduce.html'>http://labs.google.com/papers/mapreduce.html</a></dd>
+                </dl> 
+
+                <dl class="fields"> 
+                    <dt>Title</dt> 
+                    <dd>MapReduce: Simplified Data Processing on Large Clusters</dd> 
+                    <dt>Author(s)</dt> 
+                    <dd>Jeffrey Dean and Sanjay Ghemawat</dd> 
+                    <dt>Appeared in</dt> 
+                    <dd>Communications of the ACM 51(1) January 2008</dd>
+                    <dt>URL</dt>
+                    <dd><a target="_blank" href='http://portal.acm.org/citation.cfm?id=1327492'>http://portal.acm.org/citation.cfm?id=1327492</a></dd>
+                </dl> 
+
+
+                <dl class="fields"> 
+                    <dt>Title</dt> 
+                    <dd>Evaluating MapReduce for Multi-core and Multiprocessor Systems</dd> 
+                    <dt>Author(s)</dt> 
+                    <dd>Ranger, C., Raghuraman, R., Penmetsa, A., Bradski, G., & Kozyrakis, C.</dd> 
+                    <dt>Appeared in</dt> 
+                    <dd>Proceedings of the 13th Intl. Symposium on High-Performance Computer Architecture (HPCA). Phoenix, AZ.</dd>
+                    <dt>URL</dt>
+                    <dd><a target="_blank" href='http://mapreduce.stanford.edu/'>http://mapreduce.stanford.edu/</a></dd>
+                </dl> 
+            </div> 
+          </div> 
+        </div> 
+      </div> 
+      <div id="sidebar"> 
+        <a accesskey="u" href="http://www.boost.org/doc/libs"><img src="http://www.boost.org/doc/html/images/up.png" alt="Up" /></a>
+        <a accesskey="h" href="http://www.boost.org/"><img src="http://www.boost.org/doc/html/images/home.png" alt="Home" /></a>
+        <a accesskey="n" href="./tutorial.html"><img src="http://www.boost.org/doc/html/images/next.png" alt="Next" /></a> 
+
+        <hr />
+        <p><a href='./index.html'>Boost.MapReduce</a></p>
+        <p><a href='./tutorial.html'>Tutorial</a></p>
+        <p><a href='./wordcount.html'>Example</a></p>
+        <hr />
+        <p><a href='./schedule_policies.html'>Schedule Policies</a></p>
+        <p><a href='./platform.html'>Platform Notes</a></p>
+        <p><a href='./future.html'>Future Work</a></p>
+      </div>
+      <div class="clear"></div> 
+    </div> 
+  </div> 
+ 
+  <div id="footer"> 
+    <div id="footer-left"> 
+ 
+      <div id="copyright"> 
+        <p>Copyright (C) 2009 Craig Henderson.</p> 
+       </div>  <div id="license"> 
+    <p>Distributed under the <a href="/LICENSE_1_0.txt" class=
+    "internal">Boost Software License, Version 1.0</a>.</p> 
+  </div> 
+    </div> 
+ 
+    <div id="footer-right"> 
+        <div id="banners"> 
+    <p id="banner-xhtml"><a href="http://validator.w3.org/check?uri=referer"
+    class="external">XHTML 1.0</a></p> 
+ 
+    <p id="banner-css"><a href=
+    "http://jigsaw.w3.org/css-validator/check/referer" class=
+    "external">CSS</a></p> 
+ 
+    <p id="banner-osi"><a href=
+    "http://www.opensource.org/docs/definition.php" class="external">OSI
+    Certified</a></p> 
+  </div> 
+    </div> 
+ 
+    <div class="clear"></div> 
+  </div> 
+</body> 
+</html>
Added: sandbox/libs/mapreduce/doc/platform.html
==============================================================================
--- (empty file)
+++ sandbox/libs/mapreduce/doc/platform.html	2009-07-23 15:04:45 EDT (Thu, 23 Jul 2009)
@@ -0,0 +1,200 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
+    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> 
+ 
+<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"> 
+<head> 
+  <title>Boost.MapReduce platform notes</title> 
+  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> 
+  <link href="http://www.boost.org/favicon.ico" rel="icon" type="http://www.boost.org/image/ico" /> 
+  <link rel="stylesheet" type="text/css" href="http://www.boost.org/style/basic.css" /> 
+</head> 
+ 
+<body> 
+  <div id="heading"> 
+      <div id="heading-placard"></div> 
+ 
+  <h1 id="heading-title"><a href="/"><img src="http://www.boost.org/gfx/space.png" alt=
+  "Boost C++ Libraries" id="heading-logo" /><span id="boost">Boost</span> 
+  <span id="cpplibraries">C++ Libraries</span></a></h1> 
+ 
+  <p id="heading-quote"><span class="quote">“...one of the most highly
+  regarded and expertly designed C++ library projects in the
+  world.”</span> <span class="attribution">— <a href=
+  "http://www.gotw.ca/" class="external">Herb Sutter</a> and <a href=
+  "http://en.wikipedia.org/wiki/Andrei_Alexandrescu" class="external">Andrei
+  Alexandrescu</a>, <a href=
+  "http://safari.awprofessional.com/?XmlId=0321113586" class="external">C++
+  Coding Standards</a></span></p> 
+ 
+  </div> 
+ 
+  <div id="body"> 
+    <div id="body-inner"> 
+      <div id="content"> 
+        <div class="section"> 
+          <div class="section-0"> 
+            <div class="section-title"> 
+              <h1>Boost.MapReduce platform notes</h1> 
+              <em>Note: This library is not yet part of the Boost Library and is still under development and review.</em> 
+            </div> 
+ 
+            <div class="section-body"> 
+            <h2>Microsoft Windows and MSVC 8 (2005)</h2>
+            <p>
+              This library has been developed and tested using Microsoft Visual C++ v8, aka Visual Studio 2005.
+              The code compiles cleanly and runs as both 32-bit and 64-bit processes on Windows XP 32-bit and Windows
+              2003 Server 64-bit Edition.</p>
+            <h2>STL</h2>
+            <p>
+              The STL implementation supplied with Microsoft Visual C++ v8 suffers from significant performance
+              problems because it includes indiscriminate fine-grained synchronisation locking. The MapReduce
+              library is designed to be a high performance library and partitions data such that multiple threads
+              can process data independently of other threads. The unnecessary overhead of locking in MSVC8's STL
+              library negates some of the high-performance benefits of the library.
+            </p>
+            <p>
+              I therefore recommend using an alternative STL implementation to achieve maximum performance. I have
+              tested the library with STLport 5.2.1, compiled without thread support
+              <pre>STLport-5.2.1>configure msvc8 -p winxp -x --without-thread --with-dynamic-rtl</pre> and have seen
+              significant time differences. Using the <a href='./wordcount.html'>Word Count example</a> on a sample
+              dataset of six plain text files totalling 90.8 MB (95,284,354 bytes), the STLport version ran in
+              about 27% of the time taken using the MSVC STL (116 seconds against 434 seconds in the two runs shown below).
+            </p>
+<pre>
+MapReduce Wordcount Application
+2 CPU cores
+class mapreduce::job<class wordcount::map_task,class wordcount::reduce_task,clas
+s wordcount::combiner,class mapreduce::datasource::directory_iterator<class word
+count::map_task>,class mapreduce::intermediates::local_disk<class wordcount::map
+_task,struct mapreduce::detail::file_sorter,struct mapreduce::detail::file_merge
+r> >
+
+Running CPU Parallel MapReduce...
+CPU Parallel MapReduce Finished.
+
+MapReduce statistics:
+  MapReduce job runtime                     : 434 seconds, of which...
+    Map phase runtime                       : 418 seconds
+    Reduce phase runtime                    : 16 seconds
+
+  Map:
+    Total Map keys                          : 6
+    Map keys processed                      : 6
+    Map key processing errors               : 0
+    Number of Map Tasks run (in parallel)   : 2
+    Fastest Map key processed in            : 8 seconds
+    Slowest Map key processed in            : 389 seconds
+    Average time to process Map keys        : 81 seconds
+
+  Reduce:
+    Number of Reduce Tasks run (in parallel): 2
+    Number of Result Files                  : 10
+    Fastest Reduce key processed in         : 2 seconds
+    Slowest Reduce key processed in         : 4 seconds
+    Average time to process Reduce keys     : 5 seconds</pre>
+<pre>
+MapReduce Wordcount Application
+2 CPU cores
+class mapreduce::job<class wordcount::map_task,class wordcount::reduce_task,clas
+s wordcount::combiner,class mapreduce::datasource::directory_iterator<class word
+count::map_task>,class mapreduce::intermediates::local_disk<class wordcount::map
+_task,struct mapreduce::detail::file_sorter,struct mapreduce::detail::file_merge
+r> >
+
+Running CPU Parallel MapReduce...
+CPU Parallel MapReduce Finished.
+
+MapReduce statistics:
+  MapReduce job runtime                     : 116 seconds, of which...
+    Map phase runtime                       : 114 seconds
+    Reduce phase runtime                    : 2 seconds
+
+  Map:
+    Total Map keys                          : 6
+    Map keys processed                      : 6
+    Map key processing errors               : 0
+    Number of Map Tasks run (in parallel)   : 2
+    Fastest Map key processed in            : 1 seconds
+    Slowest Map key processed in            : 112 seconds
+    Average time to process Map keys        : 19 seconds
+
+  Reduce:
+    Number of Reduce Tasks run (in parallel): 2
+    Number of Result Files                  : 10
+    Fastest Reduce key processed in         : 0 seconds
+    Slowest Reduce key processed in         : 1 seconds
+    Average time to process Reduce keys     : 0 seconds
+</pre>
+            <h2>gcc 3.4.4 under cygwin</h2>
+            <p>
+              I have successfully compiled using GCC 3.4.4 under Cygwin, but do not have a full
+              development environment with Boost et al. to run any tests.</p>
+            <pre>$ g++ -Wall -c -DLINUX -I../../../.. -I/cygdrive/c/root/Development/Library/Boost/boost_1_39_0 wordcount.cpp</pre>
+            <p>
+              There are also some missing functions in the <code>linux_os</code> namespace which
+              I have not implemented. Any help implementing these for non-Windows platforms is appreciated.</p>  
+<pre>
+namespace linux_os {
+    unsigned const  number_of_cpus(void);                            // !!! not implemented
+    std::string    &get_temporary_filename(std::string &pathname);   // !!! not implemented
+}   // namespace linux_os
+</pre>
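+            <p>One possible way to fill in the two missing functions on Linux is sketched below. This is an
+            assumption rather than the library's implementation: it relies on <code>sysconf()</code> with
+            <code>_SC_NPROCESSORS_ONLN</code> (available on Linux and most Unix-like systems) and on POSIX
+            <code>mkstemp()</code>. The <code>/tmp/mapreduce_</code> file name prefix and the behaviour on
+            failure are hypothetical choices.</p>

```cpp
#include <cassert>
#include <stdlib.h>    // mkstemp (POSIX)
#include <string>
#include <unistd.h>    // sysconf, close
#include <vector>

namespace linux_os {

// Number of processors currently online, as reported by sysconf().
// Falls back to 1 if the value cannot be determined.
// (Returns plain `unsigned`; a top-level const on a return value has no effect.)
inline unsigned number_of_cpus()
{
    long const count = ::sysconf(_SC_NPROCESSORS_ONLN);
    return (count > 0) ? static_cast<unsigned>(count) : 1u;
}

// Creates a uniquely named empty file under /tmp and stores its path in
// `pathname`. On failure `pathname` is cleared. Returns `pathname`.
inline std::string &get_temporary_filename(std::string &pathname)
{
    char const pattern[] = "/tmp/mapreduce_XXXXXX";   // hypothetical prefix
    // mkstemp() rewrites the XXXXXX suffix in place, so it needs a
    // writable, NUL-terminated copy of the pattern.
    std::vector<char> buffer(pattern, pattern + sizeof(pattern));
    int const fd = ::mkstemp(&buffer[0]);
    if (fd == -1)
    {
        pathname.clear();
        return pathname;
    }
    ::close(fd);          // keep the file on disk, release the descriptor
    pathname.assign(&buffer[0]);
    return pathname;
}

}   // namespace linux_os
```

+            <p>Because <code>mkstemp()</code> creates the file as well as naming it, the caller is responsible
+            for deleting the file when it is no longer needed.</p>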
+            <h2>gcc 4.3.3 on Ubuntu Linux 9.04</h2>
+            <p>
+              I have successfully compiled using GCC 4.3.3 on Ubuntu Linux 9.04 (32bit), but do not yet have a full
+              development environment with Boost et al. to run any tests.</p>
+            <pre>$ g++ -Wall -c -DLINUX -I../../../.. -I/cygdrive/c/root/Development/Library/Boost/boost_1_39_0 wordcount.cpp</pre>
+            
+            </div>
+          </div> 
+        </div> 
+      </div> 
+      <div id="sidebar"> 
+        <a accesskey="p" href="./schedule_policies.html"><img src="http://www.boost.org/doc/html/images/prev.png" alt="Prev" /></a>
+        <a accesskey="u" href="http://www.boost.org/doc/libs"><img src="http://www.boost.org/doc/html/images/up.png" alt="Up" /></a>
+        <a accesskey="h" href="http://www.boost.org/"><img src="http://www.boost.org/doc/html/images/home.png" alt="Home" /></a>
+        <a accesskey="n" href="./future.html"><img src="http://www.boost.org/doc/html/images/next.png" alt="Next" /></a>
+
+        <hr />
+        <p><a href='./index.html'>Boost.MapReduce</a></p>
+        <p><a href='./tutorial.html'>Tutorial</a></p>
+        <p><a href='./wordcount.html'>Example</a></p>
+        <hr />
+        <p><a href='./schedule_policies.html'>Schedule Policies</a></p>
+        <p><a href='./platform.html'>Platform Notes</a></p>
+        <p><a href='./future.html'>Future Work</a></p>
+      </div>
+      <div class="clear"></div> 
+    </div> 
+  </div> 
+ 
+  <div id="footer"> 
+    <div id="footer-left"> 
+ 
+      <div id="copyright"> 
+        <p>Copyright (C) 2009 Craig Henderson.</p> 
+       </div>  <div id="license"> 
+    <p>Distributed under the <a href="/LICENSE_1_0.txt" class=
+    "internal">Boost Software License, Version 1.0</a>.</p> 
+  </div> 
+    </div> 
+ 
+    <div id="footer-right"> 
+        <div id="banners"> 
+    <p id="banner-xhtml"><a href="http://validator.w3.org/check?uri=referer"
+    class="external">XHTML 1.0</a></p> 
+ 
+    <p id="banner-css"><a href=
+    "http://jigsaw.w3.org/css-validator/check/referer" class=
+    "external">CSS</a></p> 
+ 
+    <p id="banner-osi"><a href=
+    "http://www.opensource.org/docs/definition.php" class="external">OSI
+    Certified</a></p> 
+  </div> 
+    </div> 
+ 
+    <div class="clear"></div> 
+  </div> 
+</body> 
+</html> 
\ No newline at end of file
Added: sandbox/libs/mapreduce/doc/schedule_policies.html
==============================================================================
--- (empty file)
+++ sandbox/libs/mapreduce/doc/schedule_policies.html	2009-07-23 15:04:45 EDT (Thu, 23 Jul 2009)
@@ -0,0 +1,132 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
+    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> 
+ 
+<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"> 
+<head> 
+  <title>Boost.MapReduce Schedule Policies</title> 
+  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> 
+  <link href="http://www.boost.org/favicon.ico" rel="icon" type="http://www.boost.org/image/ico" /> 
+  <link rel="stylesheet" type="text/css" href="http://www.boost.org/style/basic.css" /> 
+</head> 
+ 
+<body> 
+  <div id="heading"> 
+      <div id="heading-placard"></div> 
+ 
+  <h1 id="heading-title"><a href="/"><img src="http://www.boost.org/gfx/space.png" alt=
+  "Boost C++ Libraries" id="heading-logo" /><span id="boost">Boost</span> 
+  <span id="cpplibraries">C++ Libraries</span></a></h1> 
+ 
+  <p id="heading-quote"><span class="quote">“...one of the most highly
+  regarded and expertly designed C++ library projects in the
+  world.”</span> <span class="attribution">— <a href=
+  "http://www.gotw.ca/" class="external">Herb Sutter</a> and <a href=
+  "http://en.wikipedia.org/wiki/Andrei_Alexandrescu" class="external">Andrei
+  Alexandrescu</a>, <a href=
+  "http://safari.awprofessional.com/?XmlId=0321113586" class="external">C++
+  Coding Standards</a></span></p> 
+ 
+  </div> 
+ 
+  <div id="body"> 
+    <div id="body-inner"> 
+      <div id="content"> 
+        <div class="section"> 
+          <div class="section-0"> 
+            <div class="section-title"> 
+              <h1>Boost.MapReduce Schedule Policies</h1> 
+              <em>Note: This library is not yet part of the Boost Library and is still under development and review.</em> 
+            </div> 
+ 
+            <div class="section-body"> 
+            <p>
+              <em>Schedule Policies</em> are used by the MapReduce runtime system to schedule
+              execution of Map and Reduce tasks. The policy is specified in the call to
+              <code>mapreduce::job::run()</code>, which has two variants for coding convenience.
+            </p>
+<pre>
+template<typename SchedulePolicy>
+void run(specification const &spec, results &result);
+
+template<typename SchedulePolicy>
+void run(SchedulePolicy &schedule, specification const &spec, results &result);
+</pre>
+            <p>
+              Both overloads of <code>run()</code> are function templates whose template parameter
+              is a <code>SchedulePolicy</code>. The first variant default-constructs a schedule policy
+              object, and the second variant uses the policy object supplied by the caller. This enables the library
+              user to develop their own schedule policies that may need configuration before being used.
+            </p>
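+            <p>The relationship between the two variants can be sketched as follows. This is a minimal
+            illustration under stated assumptions, not the library's code: <code>job_sketch</code>,
+            <code>sequential_sketch</code>, <code>labelled_sketch</code>, the stand-in
+            <code>specification</code> and <code>results</code> types, and the convention of invoking a policy
+            through <code>operator()</code> are all hypothetical.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the library's specification and results types.
struct specification { unsigned map_tasks; };
struct results { std::vector<std::string> log; };

// A policy that needs no configuration, so it can be default constructed.
struct sequential_sketch
{
    void operator()(specification const &spec, results &result)
    {
        for (unsigned i = 0; i < spec.map_tasks; ++i)
            result.log.push_back("sequential");
    }
};

// A policy that must be configured before use, so the caller constructs it.
class labelled_sketch
{
  public:
    explicit labelled_sketch(std::string const &label) : label_(label) {}

    void operator()(specification const &spec, results &result)
    {
        for (unsigned i = 0; i < spec.map_tasks; ++i)
            result.log.push_back(label_);
    }

  private:
    std::string label_;
};

struct job_sketch
{
    // Variant 1: default construct the policy, then delegate to variant 2.
    template<typename SchedulePolicy>
    void run(specification const &spec, results &result)
    {
        SchedulePolicy schedule;
        this->run(schedule, spec, result);
    }

    // Variant 2: use the policy object supplied by the caller.
    template<typename SchedulePolicy>
    void run(SchedulePolicy &schedule, specification const &spec, results &result)
    {
        schedule(spec, result);
    }
};
```

+            <p>With these definitions, <code>job.run&lt;sequential_sketch&gt;(spec, result)</code> selects the
+            first variant because the policy type is named explicitly, while
+            <code>job.run(policy, spec, result)</code> selects the second because the policy type is deduced
+            from the first argument.</p>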
+            <p>
+              Boost.MapReduce provides two Schedule Policy implementations in the <code>mapreduce::schedule_policy</code>
+              namespace: <code>sequential</code> and <code>cpu_parallel</code>.
+            </p>
+            <h2>sequential</h2>
+            <p>
+              The <code>sequential</code> schedule policy runs the MapReduce job on the main execution thread,
+              first running a single Map Task followed by a number of Reduce Tasks in sequence. This schedule
+              policy provides a simple MapReduce execution system without any multi-threaded activity. While
+              unlikely to be useful in a production system, it is a very useful policy to aid debugging of a
+              MapReduce-implemented algorithm.
+            </p>
+            <h2>cpu_parallel</h2>
+            <p>
+              The <code>cpu_parallel</code> schedule policy is the main scheduling algorithm for Boost.MapReduce.
+              The class implements a multi-threaded execution of multiple simultaneous Map tasks followed by multiple
+              simultaneous Reduce tasks. Statistics from the individual Map and Reduce tasks are then collated into
+              statistics for the Job as a whole.
+            </p>
+            <p>The <em>Boost.Thread</em> library is used for multi-threading to maximise portability.</p>
+            </div> 
+          </div> 
+        </div> 
+      </div> 
+      <div id="sidebar"> 
+        <a accesskey="p" href="./wordcount.html"><img src="http://www.boost.org/doc/html/images/prev.png" alt="Prev" /></a>
+        <a accesskey="u" href="http://www.boost.org/doc/libs"><img src="http://www.boost.org/doc/html/images/up.png" alt="Up" /></a>
+        <a accesskey="h" href="http://www.boost.org/"><img src="http://www.boost.org/doc/html/images/home.png" alt="Home" /></a>
+        <a accesskey="n" href="./platform.html"><img src="http://www.boost.org/doc/html/images/next.png" alt="Next" /></a> 
+
+        <hr />
+        <p><a href='./index.html'>Boost.MapReduce</a></p>
+        <p><a href='./tutorial.html'>Tutorial</a></p>
+        <p><a href='./wordcount.html'>Example</a></p>
+        <hr />
+        <p><a href='./schedule_policies.html'>Schedule Policies</a></p>
+        <p><a href='./platform.html'>Platform Notes</a></p>
+        <p><a href='./future.html'>Future Work</a></p>
+      </div>
+      <div class="clear"></div> 
+    </div> 
+  </div> 
+ 
+  <div id="footer"> 
+    <div id="footer-left"> 
+ 
+      <div id="copyright"> 
+        <p>Copyright (C) 2009 Craig Henderson.</p> 
+       </div>  <div id="license"> 
+    <p>Distributed under the <a href="/LICENSE_1_0.txt" class=
+    "internal">Boost Software License, Version 1.0</a>.</p> 
+  </div> 
+    </div> 
+ 
+    <div id="footer-right"> 
+        <div id="banners"> 
+    <p id="banner-xhtml"><a href="http://validator.w3.org/check?uri=referer"
+    class="external">XHTML 1.0</a></p> 
+ 
+    <p id="banner-css"><a href=
+    "http://jigsaw.w3.org/css-validator/check/referer" class=
+    "external">CSS</a></p> 
+ 
+    <p id="banner-osi"><a href=
+    "http://www.opensource.org/docs/definition.php" class="external">OSI
+    Certified</a></p> 
+  </div> 
+    </div> 
+ 
+    <div class="clear"></div> 
+  </div> 
+</body> 
+</html> 
\ No newline at end of file
Added: sandbox/libs/mapreduce/doc/tutorial.html
==============================================================================
--- (empty file)
+++ sandbox/libs/mapreduce/doc/tutorial.html	2009-07-23 15:04:45 EDT (Thu, 23 Jul 2009)
@@ -0,0 +1,182 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
+    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> 
+ 
+<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"> 
+<head> 
+  <title>Boost.MapReduce Tutorial</title> 
+  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> 
+  <link href="http://www.boost.org/favicon.ico" rel="icon" type="http://www.boost.org/image/ico" /> 
+  <link rel="stylesheet" type="text/css" href="http://www.boost.org/style/basic.css" /> 
+</head> 
+ 
+<body> 
+  <div id="heading"> 
+      <div id="heading-placard"></div> 
+ 
+  <h1 id="heading-title"><a href="/"><img src="http://www.boost.org/gfx/space.png" alt=
+  "Boost C++ Libraries" id="heading-logo" /><span id="boost">Boost</span> 
+  <span id="cpplibraries">C++ Libraries</span></a></h1> 
+ 
+  <p id="heading-quote"><span class="quote">“...one of the most highly
+  regarded and expertly designed C++ library projects in the
+  world.”</span> <span class="attribution">— <a href=
+  "http://www.gotw.ca/" class="external">Herb Sutter</a> and <a href=
+  "http://en.wikipedia.org/wiki/Andrei_Alexandrescu" class="external">Andrei
+  Alexandrescu</a>, <a href=
+  "http://safari.awprofessional.com/?XmlId=0321113586" class="external">C++
+  Coding Standards</a></span></p> 
+ 
+  </div> 
+ 
+  <div id="body"> 
+    <div id="body-inner"> 
+      <div id="content"> 
+        <div class="section"> 
+          <div class="section-0"> 
+            <div class="section-title"> 
+              <h1>Boost.MapReduce Tutorial</h1> 
+              <em>Note: This library is not yet part of the Boost Library and is still under development and review.</em> 
+            </div> 
+ 
+            <div class="section-body">
+              <p>This tutorial introduces the concepts and framework for MapReduce programming using the Boost library.
+              Note that it is NOT a tutorial on the MapReduce programming idiom itself. Maybe that will follow one day...</p>
+              <h2>Principles</h2>
+              <p>
+                As a library user, you specify a <em>map</em> function object that processes a key/value pair to generate
+                a set of intermediate key/value pairs, and a <em>reduce</em> function object that merges all intermediate
+                values associated with the same intermediate key. These function objects are called MapTask and ReduceTask
+                respectively.
+              </p>
+<pre>
+map (k1,v1) --> list(k2,v2)
+reduce (k2,list(v2)) --> list(v2)</pre>
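+              <p>
+                The two signatures above can be sketched in plain C++ without the library. The function names
+                (<code>map_words</code>, <code>reduce_counts</code>, <code>word_count</code>) and the in-memory
+                grouping step are assumptions made for illustration; the library's own runtime handles the grouping
+                and is not shown here.
+              </p>

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <numeric>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, std::string> input_pair;        // (k1, v1)
typedef std::pair<std::string, unsigned>    intermediate_pair; // (k2, v2)

// map (k1,v1) --> list(k2,v2): emit (word, 1) for each whitespace-separated word.
std::vector<intermediate_pair> map_words(std::string const &/*name*/, std::string const &text)
{
    std::vector<intermediate_pair> emitted;
    std::istringstream in(text);
    std::string word;
    while (in >> word)
        emitted.push_back(intermediate_pair(word, 1));
    return emitted;
}

// reduce (k2,list(v2)) --> list(v2): here the list collapses to a single total.
unsigned reduce_counts(std::string const &/*word*/, std::vector<unsigned> const &counts)
{
    assert(!counts.empty());    // a key only exists if something was emitted for it
    return std::accumulate(counts.begin(), counts.end(), 0u);
}

// Glue: run map over every input, group values by intermediate key, then reduce.
std::map<std::string, unsigned> word_count(std::vector<input_pair> const &inputs)
{
    std::map<std::string, std::vector<unsigned> > grouped;
    for (std::size_t i = 0; i < inputs.size(); ++i)
    {
        std::vector<intermediate_pair> const emitted =
            map_words(inputs[i].first, inputs[i].second);
        for (std::size_t j = 0; j < emitted.size(); ++j)
            grouped[emitted[j].first].push_back(emitted[j].second);
    }

    std::map<std::string, unsigned> totals;
    std::map<std::string, std::vector<unsigned> >::const_iterator it;
    for (it = grouped.begin(); it != grouped.end(); ++it)
        totals[it->first] = reduce_counts(it->first, it->second);
    return totals;
}
```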
+              <h2>MapReduce Job</h2>
+              <p>
+                A single instance of execution in MapReduce is called a Job, and is implemented by <code>boost::mapreduce::job</code>.
+                The simplest definition of a MapReduce Job type just specifies the user-defined MapTask and ReduceTask:</p>
+<pre>typedef
+mapreduce::job<
+  wordcount::map_task,
+  wordcount::reduce_task>
+job;
+</pre>
+              <p>
+                The library's <code>job</code> class provides for more configuration than this, though.
+                <!-- !!! See <a href='./job.html'>Job</a> for more information. -->
+              </p>
+<pre>template<typename MapTask,
+         typename ReduceTask,
+         typename Combiner=null_combiner,
+         typename Datasource=datasource::directory_iterator<MapTask>,
+         typename IntermediateStore=intermediates::local_disk<MapTask> >
+class job;
+</pre>
+              <h2>MapTask</h2>
+              <p>Requirements of a MapTask function object are</p>
+              <ul>
+                <li>Provide type definitions for Map Key (<code>k1</code>) and Map Value (<code>v1</code>);
+                  <code>key_type</code> and <code> value_type</code></li>
+                <li>Provide type definitions for Intermediate Key (<code>k2</code>) and Intermediate Value (<code>v2</code>);
+                  <code>intermediate_key_type</code> and <code> intermediate_value_type</code></li>
+                <li>Define a constructor taking a <code>job::map_task_runner</code> object by reference</li>
+                <li>Store a reference to the <code>job::map_task_runner</code> object passed to the constructor,
+                  to be used to emit intermediate results</li>
+                <li>Define a function-call operator <code>void operator()(key_type const &key, value_type
+                    const &value);</code> Note that the <code>const</code> qualifiers on these parameters
+                    are optional, but recommended where possible.</li>
+              </ul>
+<pre>
+class map_task
+{
+  public:
+    typedef std::string   key_type;
+    typedef std::ifstream value_type;
+    typedef std::string   intermediate_key_type;
+    typedef unsigned      intermediate_value_type;
+
+    map_task(job::map_task_runner &runner);
+    void operator()(key_type const &key, value_type const &value);
+
+  private:
+    job::map_task_runner &runner_;
+};
+</pre>
+              <h2>ReduceTask</h2>
+              <p>Requirements of a ReduceTask function object are</p>
+              <ul>
+                <li>Provide type definitions for Reduce Value (<code>v2</code>);
+                  <code> value_type</code></li>
+                <li>Define a constructor taking a <code>job::reduce_task_runner</code> object by reference</li>
+                <li>Store a reference to the <code>job::reduce_task_runner</code> object passed to the constructor,
+                  to be used to emit results</li>
+                <li>Define a function-call operator <code>void operator()(typename map_task::intermediate_key_type
+                    const &key, It it, It ite);</code> where It is an iterator type.</li>
+              </ul>
+<pre>
+class reduce_task
+{
+  public:
+    typedef unsigned value_type;
+
+    reduce_task(job::reduce_task_runner &runner);
+
+    template<typename It>
+    void operator()(typename map_task::intermediate_key_type const &key, It it, It ite);
+
+  private:
+    job::reduce_task_runner &runner_;
+};
+</pre>
+<p>See the <a href='./wordcount.html'>Word Count example</a> for a detailed breakdown of a simple implementation.</p>
+            </div> 
+          </div> 
+        </div> 
+      </div> 
+      <div id="sidebar"> 
+        <a accesskey="p" href="./index.html"><img src="http://www.boost.org/doc/html/images/prev.png" alt="Prev" /></a>
+        <a accesskey="u" href="./index.html"><img src="http://www.boost.org/doc/html/images/up.png" alt="Up" /></a>
+        <a accesskey="h" href="http://www.boost.org/"><img src="http://www.boost.org/doc/html/images/home.png" alt="Home" /></a>
+        <a accesskey="n" href="./wordcount.html"><img src="http://www.boost.org/doc/html/images/next.png" alt="Next" /></a> 
+
+        <hr />
+        <p><a href='./index.html'>Boost.MapReduce</a></p>
+        <p><a href='./tutorial.html'>Tutorial</a></p>
+        <p><a href='./wordcount.html'>Example</a></p>
+        <hr />
+        <p><a href='./schedule_policies.html'>Schedule Policies</a></p>
+        <p><a href='./platform.html'>Platform Notes</a></p>
+        <p><a href='./future.html'>Future Work</a></p>
+      </div>
+      <div class="clear"></div> 
+    </div> 
+  </div> 
+ 
+  <div id="footer"> 
+    <div id="footer-left"> 
+ 
+      <div id="copyright"> 
+        <p>Copyright (C) 2009 Craig Henderson.</p> 
+       </div>  <div id="license"> 
+    <p>Distributed under the <a href="/LICENSE_1_0.txt" class=
+    "internal">Boost Software License, Version 1.0</a>.</p> 
+  </div> 
+    </div> 
+ 
+    <div id="footer-right"> 
+      <div id="banners"> 
+        <p id="banner-xhtml">XHTML 1.0</p> 
+ 
+        <p id="banner-css"><a href=
+        "http://jigsaw.w3.org/css-validator/check/referer" class=
+        "external">CSS</a></p> 
+ 
+        <p id="banner-osi"><a href=
+        "http://www.opensource.org/docs/definition.php" class="external">OSI
+        Certified</a></p> 
+      </div> 
+    </div> 
+    <div class="clear"></div> 
+  </div> 
+</body> 
+</html> 
\ No newline at end of file
Added: sandbox/libs/mapreduce/doc/wordcount.html
==============================================================================
--- (empty file)
+++ sandbox/libs/mapreduce/doc/wordcount.html	2009-07-23 15:04:45 EDT (Thu, 23 Jul 2009)
@@ -0,0 +1,425 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
+    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> 
+ 
+<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"> 
+<head> 
+  <title>Boost.MapReduce Word Count example</title> 
+  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> 
+  <link href="http://www.boost.org/favicon.ico" rel="icon" type="http://www.boost.org/image/ico" /> 
+  <link rel="stylesheet" type="text/css" href="http://www.boost.org/style/basic.css" /> 
+</head> 
+ 
+<body> 
+  <div id="heading"> 
+      <div id="heading-placard"></div> 
+ 
+  <h1 id="heading-title"><a href="/"><img src="http://www.boost.org/gfx/space.png" alt=
+  "Boost C++ Libraries" id="heading-logo" /><span id="boost">Boost</span> 
+  <span id="cpplibraries">C++ Libraries</span></a></h1> 
+ 
+  <p id="heading-quote"><span class="quote">“...one of the most highly
+  regarded and expertly designed C++ library projects in the
+  world.”</span> <span class="attribution">— <a href=
+  "http://www.gotw.ca/" class="external">Herb Sutter</a> and <a href=
+  "http://en.wikipedia.org/wiki/Andrei_Alexandrescu" class="external">Andrei
+  Alexandrescu</a>, <a href=
+  "http://safari.awprofessional.com/?XmlId=0321113586" class="external">C++
+  Coding Standards</a></span></p> 
+ 
+  </div> 
+ 
+  <div id="body"> 
+    <div id="body-inner"> 
+      <div id="content"> 
+        <div class="section"> 
+          <div class="section-0"> 
+            <div class="section-title"> 
+              <h1>Boost.MapReduce Word Count example</h1> 
+              <em>Note: This library is not yet part of the Boost Library and is still under development and review.</em> 
+            </div> 
+ 
+<div class="section-body">
+<p>
+By way of an example of using the MapReduce library, we implement a Word Count application.
+We'll use a <code>datasource</code> class supplied by the library to iterate through a directory
+of files containing words to be counted. The Map phase will create a list of words and a count of 1,
+and the Reduce phase will accept a list of words and corresponding counts, total the counts
+for each word, and produce a final list of words with their totals.
+</p>
+<pre>
+map (filename; string, file stream; ifstream) --> list(word; string, count; unsigned int)
+reduce (word; string, list(count; unsigned int)) --> list(count; unsigned int)</pre>
+
+
+<h2>Type Definitions</h2>
+<p>
+  For convenience, brevity and maintainability, define a <code>job</code> type for the MapReduce job.
+  This local <code>job</code> type will be defined in terms of the library's <code>mapreduce::job</code>
+  class with template parameters specific to the Word Count application.
+</p>
+
+<pre>
+typedef
+mapreduce::job<
+  wordcount::map_task,
+  wordcount::reduce_task>
+job;
+</pre>
+<p>The class <code>mapreduce::job</code> actually has 5 template parameters. The first two must be supplied, the last
+three have default values. The definition above is therefore equivalent to</p>
+<pre>
+typedef
+mapreduce::job<
+    class wordcount::map_task,
+    class wordcount::reduce_task,
+    struct mapreduce::null_combiner,
+    class mapreduce::datasource::directory_iterator<class wordcount::map_task>,
+    class mapreduce::intermediates::local_disk<
+        class wordcount::map_task,
+        struct mapreduce::detail::file_sorter,
+        struct mapreduce::detail::file_merger>
+    >
+job;
+</pre>
+
+<h2>MapTask</h2>
+<p>
+  The MapTask will be implemented by a function-object <code>wordcount::map_task</code>. There are four required
+  data types to be defined in the functor for the <code>key</code>/<code>value</code> types of the input and
+  output of the map task.
+</p>
+<pre>
+typedef std::string   key_type;
+typedef std::ifstream value_type;
+typedef std::string   intermediate_key_type;
+typedef unsigned      intermediate_value_type;
+</pre>
+<p>
+  Next comes the function-call operator, which takes two parameters: the <code>key</code> and <code>value</code> for the
+  map task to process. Normally these parameters would be expected to be passed as a reference-to-const, but in
+  the Word Count example, the <code>value</code> parameter is defined as an <code>std::ifstream</code> object. If
+  this was passed as reference-to-const, then the function would not be able to read from the file as the read
+  operation modifies the state of the object. As a result, the <code>value</code> parameter is passed as a plain
+  reference.
+</p>
+<p>
+  The function simply loops until the end-of-file is reached on the supplied <code>std::ifstream</code> object.
+  In each iteration a <em>word</em> is read into a <code>string</code> object, converted to lowercase text and
+  non-alphanumeric characters are stripped from the beginning and end. The <em>word</em> is then stored as an
+  intermediate <code>key</code> with a <code>value</code> of <code>1</code>, by calling the
+  <code>emit_intermediate()</code> function of the <code>job::map_task_runner</code> object which was passed to
+  the constructor of the <code>map_task</code> object.
+</p>
+<pre>
+// not a reference to const to enable streams to be passed
+void operator()(key_type const &/*key*/, value_type &value) 
+{
+    while (!value.eof())
+    {
+        std::string word;
+        value >> word;
+        std::transform(word.begin(), word.end(), word.begin(),
+                       std::bind1st(
+                           std::mem_fun(&std::ctype<char>::tolower),
+                           &std::use_facet<std::ctype<char> >(std::locale::classic())));
+
+        size_t length = word.length();
+        size_t const original_length = length;
+        std::string::const_iterator it;
+        for (it=word.begin();
+             it!=word.end()  &&  !std::isalnum(*it, std::locale::classic());
+             ++it)
+        {
+            --length;
+        }
+
+        for (std::string::const_reverse_iterator rit=word.rbegin();
+             length>0  &&  !std::isalnum(*rit, std::locale::classic());
+             ++rit)
+        {
+            --length;
+        }
+
+        if (length > 0)
+        {
+            if (length == original_length)
+                runner_.emit_intermediate(word, 1);
+            else
+                runner_.emit_intermediate(std::string(&*it,length), 1);
+        }
+    }
+}
+</pre>
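+<p>
+  The lowercase-and-trim step described above can also be written as a stand-alone helper, which makes it
+  easy to test in isolation. The name <code>normalise_word</code> is hypothetical and is not part of the
+  library or of the Word Count source; this is a sketch of the same technique, not the example's actual code.
+</p>

```cpp
#include <cassert>
#include <locale>
#include <string>

// Lowercase a word and strip non-alphanumeric characters from both ends.
// Characters in the middle of the word (such as an apostrophe) are kept.
std::string normalise_word(std::string word)
{
    std::locale const &loc = std::locale::classic();

    for (std::string::iterator it = word.begin(); it != word.end(); ++it)
        *it = std::tolower(*it, loc);

    // Index of the first alphanumeric character, or word.size() if there is none.
    std::string::size_type first = 0;
    while (first < word.size() && !std::isalnum(word[first], loc))
        ++first;

    // One past the last alphanumeric character; never moves below first.
    std::string::size_type last = word.size();
    while (last > first && !std::isalnum(word[last - 1], loc))
        --last;

    assert(first <= last);
    return word.substr(first, last - first);
}
```

+<p>
+  An empty return value means the token contained no alphanumeric characters, which corresponds to the
+  <code>length &gt; 0</code> check before <code>emit_intermediate()</code> is called.
+</p>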
+
+<h2>ReduceTask</h2>
+<p>
+  The ReduceTask will be implemented by a function-object <code>wordcount::reduce_task</code>. There
+  is one required data type to be defined in the functor for the <code>value</code> type output of
+  the reduce task.
+</p>
+<pre>
+typedef unsigned value_type;
+</pre>
+<p>
+  The function-call operator takes three parameters: the <code>key</code> of the reduce task and a pair
+  of iterators dictating the range of <code>value</code> objects for the reduce task. In this Word Count
+  example, the <code>key</code> is a text string containing the <em>word</em>, and the iterators contain
+  a list of frequencies for the word. The ReduceTask simply sums the frequencies by calling
+  <code>std::accumulate</code> and stores the final result by calling the <code>emit()</code> function of
+  the <code>job::reduce_task_runner</code> object which was passed to the constructor of the
+  <code>reduce_task</code> object.
+</p>
+<pre>
+template<typename It>
+void operator()(typename map_task::intermediate_key_type const &key, It it, It const ite)
+{
+    runner_.emit(key, std::accumulate(it, ite, reduce_task::value_type()));
+}
+</pre>
+
+<h2>Program</h2>
+<p>
+  To run the MapReduce Word Count algorithm, we need a program to set up an
+  environment, run the algorithm and report the results.
+</p>
+<p>
+  The code below shows an example. Note that error handling has been removed for brevity.
+  A <code>datasource</code> object is created to iterate through a directory of files and
+  pass each file into a map task. A <code>mapreduce::specification</code> object is then
+  created. This is used to specify system parameters such as the number of map tasks to run.
+  <em>Note that this is a hint to the MapReduce runtime, and may differ from the actual
+  number of maps that are used.</em> The final supporting object that is created is an
+  instance of <code>mapreduce::results</code>. This structure will be populated by the
+  runtime to provide metrics and timings of the MapReduce job execution.
+</p>
+<p>
+  To run the MapReduce job, call the <code>run</code> function of the <code>job</code> class.
+  There are two variants of <code>run</code>, for coding convenience.
+</p>
+<pre>
+    template<typename SchedulePolicy>
+    void run(specification const &spec, results &result);
+
+    template<typename SchedulePolicy>
+    void run(SchedulePolicy &schedule, specification const &spec, results &result);
+</pre>
+<p>
+  Both overloads of <code>run()</code> are template functions where the template parameter
+  is a <code>SchedulePolicy</code>. The first variant will default construct a schedule policy
+  class, and the second variant will use the supplied policy class. This enables the library
+  user to develop their own scheduler policies that may need configuration before being used.
+  See <a href='./schedule_policies.html'>Schedule Policies</a> for more information.
+</p>
+
+<pre>
+int main(int argc, char **argv)
+{
+    wordcount::job::datasource_type datasource;
+    datasource.set_directory(argv[1]);
+
+    mapreduce::specification spec;
+    spec.map_tasks = atoi(argv[2]);
+
+    mapreduce::results result;
+    wordcount::job     mr2(datasource);
+
+    mr2.run<mapreduce::schedule_policy::cpu_parallel<wordcount::job> >(spec, result);
+
+...
+</pre>
+<p>
+  At the end of the MapReduce job execution, the results can be written to the screen.
+</p>
+<pre>
+std::cout << std::endl << "\n" << "MapReduce statistics:";
+std::cout << "\n  " << "MapReduce job runtime                     : " << result.job_runtime << " seconds, of which...";
+std::cout << "\n  " << "  Map phase runtime                       : " << result.map_runtime << " seconds";
+std::cout << "\n  " << "  Reduce phase runtime                    : " << result.reduce_runtime << " seconds";
+std::cout << "\n\n  " << "Map:";
+std::cout << "\n    " << "Total Map keys                          : " << result.counters.map_tasks;
+std::cout << "\n    " << "Map keys processed                      : " << result.counters.map_tasks_completed;
+std::cout << "\n    " << "Map key processing errors               : " << result.counters.map_tasks_error;
+std::cout << "\n    " << "Number of Map Tasks run (in parallel)   : " << result.counters.actual_map_tasks;
+std::cout << "\n    " << "Fastest Map key processed in            : " << *std::min_element(result.map_times.begin(), result.map_times.end()) << " seconds";
+std::cout << "\n    " << "Slowest Map key processed in            : " << *std::max_element(result.map_times.begin(), result.map_times.end()) << " seconds";
+std::cout << "\n    " << "Average time to process Map keys        : " << std::accumulate(result.map_times.begin(), result.map_times.end(), boost::int64_t()) / result.map_times.size() << " seconds";
+
+std::cout << "\n\n  " << "Reduce:";
+std::cout << "\n    " << "Number of Reduce Tasks run (in parallel): " << result.counters.actual_reduce_tasks;
+std::cout << "\n    " << "Number of Result Files                  : " << result.counters.num_result_files;
+std::cout << "\n    " << "Fastest Reduce key processed in         : " << *std::min_element(result.reduce_times.begin(), result.reduce_times.end()) << " seconds";
+std::cout << "\n    " << "Slowest Reduce key processed in         : " << *std::max_element(result.reduce_times.begin(), result.reduce_times.end()) << " seconds";
+std::cout << "\n    " << "Average time to process Reduce keys     : " << std::accumulate(result.reduce_times.begin(), result.reduce_times.end(), boost::int64_t()) / result.reduce_times.size() << " seconds";
+</pre>
+
+<h2>Output</h2>
+<p>
+  The wordcount program was run on a sample dataset consisting of six plain text files
+  totalling 90.8 MB (95,284,354 bytes). The smallest file is 163 KB (167,529 bytes) and the largest
+  is 88.1 MB (92,392,601 bytes).
+</p>
+<pre>
+MapReduce Wordcount Application
+2 CPU cores
+class mapreduce::job<class wordcount::map_task,class wordcount::reduce_task,stru
+ct mapreduce::null_combiner,class mapreduce::datasource::directory_iterator<clas
+s wordcount::map_task>,class mapreduce::intermediates::local_disk<class wordcoun
+t::map_task,struct mapreduce::detail::file_sorter,struct mapreduce::detail::file
+_merger> >
+
+Running CPU Parallel MapReduce...
+CPU Parallel MapReduce Finished.
+
+MapReduce statistics:
+  MapReduce job runtime                     : 141 seconds, of which...
+    Map phase runtime                       : 44 seconds
+    Reduce phase runtime                    : 97 seconds
+
+  Map:
+    Total Map keys                          : 6
+    Map keys processed                      : 6
+    Map key processing errors               : 0
+    Number of Map Tasks run (in parallel)   : 2
+    Fastest Map key processed in            : 0 seconds
+    Slowest Map key processed in            : 43 seconds
+    Average time to process Map keys        : 7 seconds
+
+  Reduce:
+    Number of Reduce Tasks run (in parallel): 2
+    Number of Result Files                  : 10
+    Fastest Reduce key processed in         : 12 seconds
+    Slowest Reduce key processed in         : 36 seconds
+    Average time to process Reduce keys     : 30 seconds
+</pre>
+
+<h2>Adding a Combiner</h2>
+<p>
+  In some circumstances, an optimisation can be made by consolidating the results of
+  the Map phase before they are passed to the Reduce phase. This consolidation is
+  done by a <code>combiner</code> functor.
+</p>
+<p>
+  In the case of the Word Count example, the Map phase will naturally produce a list of
+  words, each with a count of 1. The <code>combiner</code> can be used to total the
+  number of each word in the list and produce a shorter list with unique word occurrences.
+</p>
+<pre>
+class combiner
+{
+  public:
+    void start(map_task::intermediate_key_type const &)
+    {
+        total_ = 0;
+    }
+
+    template<typename IntermediateStore>
+    void finish(map_task::intermediate_key_type const &key, IntermediateStore &intermediate_store)
+    {
+        if (total_ > 0)
+            intermediate_store.insert(key, total_);
+    }
+
+    void operator()(map_task::intermediate_value_type const &value)
+    {
+        total_ += value;
+    }
+
+  private:
+    size_t total_;
+};
+</pre>
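+<p>
+  To make the <code>start()</code>, <code>operator()</code>, <code>finish()</code> sequence concrete, the sketch
+  below drives a combiner of the same shape over a small in-memory list. The <code>toy_store</code> type and the
+  <code>combine()</code> driver are hypothetical; how the library's runtime actually groups keys and invokes the
+  combiner is an assumption here.
+</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, unsigned> intermediate_pair;

// Hypothetical in-memory stand-in for an intermediate store.
struct toy_store
{
    void insert(std::string const &key, std::size_t value)
    {
        assert(value > 0);  // the combiner below never inserts an empty total
        entries.push_back(std::make_pair(key, value));
    }

    std::vector<std::pair<std::string, std::size_t> > entries;
};

// Same shape as the combiner above: reset, accumulate, then write one total per key.
class combiner
{
  public:
    combiner() : total_(0) {}

    void start(std::string const &) { total_ = 0; }

    template<typename IntermediateStore>
    void finish(std::string const &key, IntermediateStore &intermediate_store)
    {
        if (total_ > 0)
            intermediate_store.insert(key, total_);
    }

    void operator()(unsigned const &value) { total_ += value; }

  private:
    std::size_t total_;
};

// Hypothetical driver: sort by key, then call start / operator() / finish per key.
template<typename Combiner, typename Store>
void combine(std::vector<intermediate_pair> pairs, Combiner &c, Store &store)
{
    std::sort(pairs.begin(), pairs.end());
    std::size_t i = 0;
    while (i < pairs.size())
    {
        std::string const key = pairs[i].first;
        c.start(key);
        for (; i < pairs.size() && pairs[i].first == key; ++i)
            c(pairs[i].second);
        c.finish(key, store);
    }
}
```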
+
+<p>
+The <code>combiner</code> runs as a part of the Map Task, hence the time
+taken for the Map phase is significantly increased with the introduction
+of a combiner, but the Reduce phase is reduced to almost no time at all.
+</p>
+
+<pre>
+MapReduce Wordcount Application
+2 CPU cores
+class mapreduce::job<class wordcount::map_task,class wordcount::reduce_task,clas
+s wordcount::combiner,class mapreduce::datasource::directory_iterator<class word
+count::map_task>,class mapreduce::intermediates::local_disk<class wordcount::map
+_task,struct mapreduce::detail::file_sorter,struct mapreduce::detail::file_merge
+r> >
+
+Running CPU Parallel MapReduce...
+CPU Parallel MapReduce Finished.
+
+MapReduce statistics:
+  MapReduce job runtime                     : 116 seconds, of which...
+    Map phase runtime                       : 114 seconds
+    Reduce phase runtime                    : 2 seconds
+
+  Map:
+    Total Map keys                          : 6
+    Map keys processed                      : 6
+    Map key processing errors               : 0
+    Number of Map Tasks run (in parallel)   : 2
+    Fastest Map key processed in            : 1 seconds
+    Slowest Map key processed in            : 112 seconds
+    Average time to process Map keys        : 19 seconds
+
+  Reduce:
+    Number of Reduce Tasks run (in parallel): 2
+    Number of Result Files                  : 10
+    Fastest Reduce key processed in         : 0 seconds
+    Slowest Reduce key processed in         : 1 seconds
+    Average time to process Reduce keys     : 0 seconds
+</pre>
+
+<h2>Source Code</h2>
+<p>The full source code for the Word Count example can be found in <code>libs/mapreduce/test/wordcount/wordcount.cpp</code>.</p>
+
+            </div> 
+          </div> 
+        </div> 
+      </div> 
+      <div id="sidebar"> 
+        <a accesskey="p" href="./tutorial.html"><img src="http://www.boost.org/doc/html/images/prev.png" alt="Prev" /></a>
+        <a accesskey="u" href="./index.html"><img src="http://www.boost.org/doc/html/images/up.png" alt="Up" /></a>
+        <a accesskey="h" href="http://www.boost.org/"><img src="http://www.boost.org/doc/html/images/home.png" alt="Home" /></a>
+        <a accesskey="n" href="./schedule_policies.html"><img src="http://www.boost.org/doc/html/images/next.png" alt="Next" /></a>
+
+        <hr />
+        <p><a href='./index.html'>Boost.MapReduce</a></p>
+        <p><a href='./tutorial.html'>Tutorial</a></p>
+        <p><a href='./wordcount.html'>Example</a></p>
+        <hr />
+        <p><a href='./schedule_policies.html'>Schedule Policies</a></p>
+        <p><a href='./platform.html'>Platform Notes</a></p>
+        <p><a href='./future.html'>Future Work</a></p>
+      </div>
+      <div class="clear"></div> 
+    </div> 
+  </div> 
+ 
+  <div id="footer"> 
+    <div id="footer-left"> 
+ 
+      <div id="copyright"> 
+        <p>Copyright (C) 2009 Craig Henderson.</p> 
+       </div>  <div id="license"> 
+    <p>Distributed under the <a href="/LICENSE_1_0.txt" class=
+    "internal">Boost Software License, Version 1.0</a>.</p> 
+  </div> 
+    </div> 
+ 
+    <div id="footer-right"> 
+      <div id="banners"> 
+        <p id="banner-xhtml">XHTML 1.0</p> 
+ 
+        <p id="banner-css"><a href=
+        "http://jigsaw.w3.org/css-validator/check/referer" class=
+        "external">CSS</a></p> 
+ 
+        <p id="banner-osi"><a href=
+        "http://www.opensource.org/docs/definition.php" class="external">OSI
+        Certified</a></p> 
+      </div> 
+    </div> 
+    <div class="clear"></div> 
+  </div> 
+</body> 
+</html> 
\ No newline at end of file
Added: sandbox/libs/mapreduce/mapreduce.sln
==============================================================================
--- (empty file)
+++ sandbox/libs/mapreduce/mapreduce.sln	2009-07-23 15:04:45 EDT (Thu, 23 Jul 2009)
@@ -0,0 +1,26 @@
+
+Microsoft Visual Studio Solution File, Format Version 9.00
+# Visual Studio 2005
+Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "mapreduce", "mapreduce.vcproj", "{F1A9A9FC-ACE9-4F93-8162-B888697FD81B}"
+EndProject
+Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "wordcount", "test\wordcount\wordcount.vcproj", "{AB0444E8-E927-470A-BF0B-A60E67F91B06}"
+EndProject
+Global
+	GlobalSection(SolutionConfigurationPlatforms) = preSolution
+		Debug|Win32 = Debug|Win32
+		Release|Win32 = Release|Win32
+	EndGlobalSection
+	GlobalSection(ProjectConfigurationPlatforms) = postSolution
+		{F1A9A9FC-ACE9-4F93-8162-B888697FD81B}.Debug|Win32.ActiveCfg = Debug|Win32
+		{F1A9A9FC-ACE9-4F93-8162-B888697FD81B}.Debug|Win32.Build.0 = Debug|Win32
+		{F1A9A9FC-ACE9-4F93-8162-B888697FD81B}.Release|Win32.ActiveCfg = Release|Win32
+		{F1A9A9FC-ACE9-4F93-8162-B888697FD81B}.Release|Win32.Build.0 = Release|Win32
+		{AB0444E8-E927-470A-BF0B-A60E67F91B06}.Debug|Win32.ActiveCfg = Debug|Win32
+		{AB0444E8-E927-470A-BF0B-A60E67F91B06}.Debug|Win32.Build.0 = Debug|Win32
+		{AB0444E8-E927-470A-BF0B-A60E67F91B06}.Release|Win32.ActiveCfg = Release|Win32
+		{AB0444E8-E927-470A-BF0B-A60E67F91B06}.Release|Win32.Build.0 = Release|Win32
+	EndGlobalSection
+	GlobalSection(SolutionProperties) = preSolution
+		HideSolutionNode = FALSE
+	EndGlobalSection
+EndGlobal
Added: sandbox/libs/mapreduce/mapreduce.vcproj
==============================================================================
--- (empty file)
+++ sandbox/libs/mapreduce/mapreduce.vcproj	2009-07-23 15:04:45 EDT (Thu, 23 Jul 2009)
@@ -0,0 +1,229 @@
+<?xml version="1.0" encoding="Windows-1252"?>
+<VisualStudioProject
+	ProjectType="Visual C++"
+	Version="8.00"
+	Name="mapreduce"
+	ProjectGUID="{F1A9A9FC-ACE9-4F93-8162-B888697FD81B}"
+	RootNamespace="mapreduce"
+	Keyword="Win32Proj"
+	>
+	<Platforms>
+		<Platform
+			Name="Win32"
+		/>
+	</Platforms>
+	<ToolFiles>
+	</ToolFiles>
+	<Configurations>
+		<Configuration
+			Name="Debug|Win32"
+			OutputDirectory="$(SolutionDir)$(ConfigurationName)"
+			IntermediateDirectory="$(SolutionDir)$(ConfigurationName)\compiler\$(ProjectName)"
+			ConfigurationType="4"
+			CharacterSet="1"
+			>
+			<Tool
+				Name="VCPreBuildEventTool"
+			/>
+			<Tool
+				Name="VCCustomBuildTool"
+			/>
+			<Tool
+				Name="VCXMLDataGeneratorTool"
+			/>
+			<Tool
+				Name="VCWebServiceProxyGeneratorTool"
+			/>
+			<Tool
+				Name="VCMIDLTool"
+			/>
+			<Tool
+				Name="VCCLCompilerTool"
+				Optimization="0"
+				AdditionalIncludeDirectories="../.."
+				PreprocessorDefinitions="WIN32_LEAN_AND_MEAN"
+				MinimalRebuild="true"
+				BasicRuntimeChecks="3"
+				RuntimeLibrary="3"
+				UsePrecompiledHeader="0"
+				WarningLevel="4"
+				Detect64BitPortabilityProblems="true"
+				DebugInformationFormat="3"
+			/>
+			<Tool
+				Name="VCManagedResourceCompilerTool"
+			/>
+			<Tool
+				Name="VCResourceCompilerTool"
+			/>
+			<Tool
+				Name="VCPreLinkEventTool"
+			/>
+			<Tool
+				Name="VCLibrarianTool"
+			/>
+			<Tool
+				Name="VCALinkTool"
+			/>
+			<Tool
+				Name="VCXDCMakeTool"
+			/>
+			<Tool
+				Name="VCBscMakeTool"
+			/>
+			<Tool
+				Name="VCFxCopTool"
+			/>
+			<Tool
+				Name="VCPostBuildEventTool"
+			/>
+		</Configuration>
+		<Configuration
+			Name="Release|Win32"
+			OutputDirectory="$(SolutionDir)$(ConfigurationName)"
+			IntermediateDirectory="$(SolutionDir)$(ConfigurationName)\compiler\$(ProjectName)"
+			ConfigurationType="4"
+			CharacterSet="1"
+			WholeProgramOptimization="1"
+			>
+			<Tool
+				Name="VCPreBuildEventTool"
+			/>
+			<Tool
+				Name="VCCustomBuildTool"
+			/>
+			<Tool
+				Name="VCXMLDataGeneratorTool"
+			/>
+			<Tool
+				Name="VCWebServiceProxyGeneratorTool"
+			/>
+			<Tool
+				Name="VCMIDLTool"
+			/>
+			<Tool
+				Name="VCCLCompilerTool"
+				AdditionalIncludeDirectories="../.."
+				PreprocessorDefinitions="WIN32_LEAN_AND_MEAN"
+				RuntimeLibrary="2"
+				UsePrecompiledHeader="0"
+				WarningLevel="3"
+				Detect64BitPortabilityProblems="true"
+				DebugInformationFormat="3"
+			/>
+			<Tool
+				Name="VCManagedResourceCompilerTool"
+			/>
+			<Tool
+				Name="VCResourceCompilerTool"
+			/>
+			<Tool
+				Name="VCPreLinkEventTool"
+			/>
+			<Tool
+				Name="VCLibrarianTool"
+			/>
+			<Tool
+				Name="VCALinkTool"
+			/>
+			<Tool
+				Name="VCXDCMakeTool"
+			/>
+			<Tool
+				Name="VCBscMakeTool"
+			/>
+			<Tool
+				Name="VCFxCopTool"
+			/>
+			<Tool
+				Name="VCPostBuildEventTool"
+			/>
+		</Configuration>
+	</Configurations>
+	<References>
+	</References>
+	<Files>
+		<Filter
+			Name="Source Files"
+			Filter="cpp;c;cc;cxx;def;odl;idl;hpj;bat;asm;asmx"
+			UniqueIdentifier="{4FC737F1-C7A5-4376-A066-2A32D752A2FF}"
+			>
+		</Filter>
+		<Filter
+			Name="Header Files"
+			>
+			<File
+				RelativePath="..\..\boost\mapreduce.hpp"
+				>
+			</File>
+			<Filter
+				Name="mapreduce"
+				>
+				<File
+					RelativePath="..\..\boost\mapreduce\datasource.hpp"
+					>
+				</File>
+				<File
+					RelativePath="..\..\boost\mapreduce\hash_partitioner.hpp"
+					>
+				</File>
+				<File
+					RelativePath="..\..\boost\mapreduce\intermediates.hpp"
+					>
+				</File>
+				<File
+					RelativePath="..\..\boost\mapreduce\job.hpp"
+					>
+				</File>
+				<File
+					RelativePath="..\..\boost\mapreduce\mergesort.hpp"
+					>
+				</File>
+				<File
+					RelativePath="..\..\boost\mapreduce\null_combiner.hpp"
+					>
+				</File>
+				<File
+					RelativePath="..\..\boost\mapreduce\platform.hpp"
+					>
+				</File>
+				<File
+					RelativePath="..\..\boost\mapreduce\schedule_policy.hpp"
+					>
+				</File>
+				<Filter
+					Name="schedule_policy"
+					>
+					<File
+						RelativePath="..\..\boost\mapreduce\schedule_policy\cpu_parallel.hpp"
+						>
+					</File>
+					<File
+						RelativePath="..\..\boost\mapreduce\schedule_policy\sequential.hpp"
+						>
+					</File>
+				</Filter>
+				<Filter
+					Name="intermediates"
+					>
+					<File
+						RelativePath="..\..\boost\mapreduce\intermediates\in_memory.hpp"
+						>
+					</File>
+					<File
+						RelativePath="..\..\boost\mapreduce\intermediates\local_disk.hpp"
+						>
+					</File>
+				</Filter>
+			</Filter>
+		</Filter>
+		<Filter
+			Name="Resource Files"
+			Filter="rc;ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe;resx;tiff;tif;png;wav"
+			UniqueIdentifier="{67DA6AB6-F800-4c08-8B7A-83BB121AAD01}"
+			>
+		</Filter>
+	</Files>
+	<Globals>
+	</Globals>
+</VisualStudioProject>
Added: sandbox/libs/mapreduce/test/wordcount/wordcount.cpp
==============================================================================
--- (empty file)
+++ sandbox/libs/mapreduce/test/wordcount/wordcount.cpp	2009-07-23 15:04:45 EDT (Thu, 23 Jul 2009)
@@ -0,0 +1,260 @@
+// Boost.MapReduce library
+//
+//  Copyright (C) 2009 Craig Henderson.
+//  cdm.henderson_at_[hidden]
+//
+//  Use, modification and distribution is subject to the
+//  Boost Software License, Version 1.0. (See accompanying
+//  file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
+//
+// For more information, see http://www.boost.org/libs/mapreduce/
+//
+ 
+/*
+Use a variant of Boost built with STLport IOStreams
+
+STLport-5.2.1>set include=C:\root\Development\Library\STLport\STLport-5.2.1\stlport;%include%
+STLport-5.2.1>configure msvc8 -p winxp -x --without-thread --with-dynamic-rtl
+STLport-5.2.1\build\lib>nmake clean install
+
+Edit boost_1_39_0\tools\build\v2\user-config.jam to contain:
+# -------------------
+# MSVC configuration.
+# -------------------
+# Configure specific msvc version (searched for in standard locations and PATH).
+using msvc : 8.0 ;
+# ----------------------
+# STLPort configuration.
+# ----------------------
+using stlport : : .../STLport/STLport-5.2.1/stlport : .../STLport/STLport-5.2.1/lib/vc8 ;
+
+boost_1_39_0> set include=C:\root\Development\Library\STLport\STLport-5.2.1\stlport;%include%
+boost_1_39_0> set INCLUDE=%INCLUDE%;C:\root\Development\Library\zlib\include
+boost_1_39_0> set ZLIB_INCLUDE=C:\root\Development\Library\zlib\include
+boost_1_39_0> set LIBPATH=%LIBPATH%;C:\root\Development\Library\zlib\lib
+boost_1_39_0> set ZLIB_LIBPATH=C:\root\Development\Library\zlib\lib
+boost_1_39_0> set ZLIB_BINARY=zdll
+boost_1_39_0> ..\bjam --toolset=msvc stdlib=stlport "stdlib:stlport-iostream=on" --without-python --with-filesystem --with-thread --with-date_time
+*/
+
+#ifdef _WIN32
+#ifndef WINVER				// Allow use of features specific to Windows XP or later.
+#define WINVER 0x0501		// Change this to the appropriate value to target other versions of Windows.
+#endif
+
+#ifndef _WIN32_WINNT		// Allow use of features specific to Windows XP or later.                   
+#define _WIN32_WINNT 0x0501	// Change this to the appropriate value to target other versions of Windows.
+#endif						
+
+#ifndef _WIN32_WINDOWS		// Allow use of features specific to Windows 98 or later.
+#define _WIN32_WINDOWS 0x0410 // Change this to the appropriate value to target Windows Me or later.
+#endif
+
+#ifndef _WIN32_IE			// Allow use of features specific to IE 6.0 or later.
+#define _WIN32_IE 0x0600	// Change this to the appropriate value to target other versions of IE.
+#endif
+
+#endif
+
+#include <boost/config.hpp>
+#if defined(BOOST_MSVC)
+#pragma warning(disable:4996 4512)   // C4996: unchecked iterators in std::transform; C4512: assignment operator could not be generated
+#endif
+
+#include <boost/mapreduce.hpp>
+#include <iostream>
+#include <numeric>          // accumulate
+
+#if defined(BOOST_MSVC)  && defined(_DEBUG)
+#include <crtdbg.h>
+#endif
+
+namespace wordcount {
+
+class map_task;
+class reduce_task;
+class combiner;
+
+typedef
+boost::mapreduce::job
+  < wordcount::map_task
+  , wordcount::reduce_task
+  , wordcount::combiner
+#if 0  &&  defined(_DEBUG)
+  , boost::mapreduce::datasource::directory_iterator<wordcount::map_task>
+  , boost::mapreduce::intermediates::in_memory<wordcount::map_task>
+#endif
+  >
+job;
+
+class map_task : boost::noncopyable
+{
+  public:
+    typedef std::string   key_type;
+    typedef std::ifstream value_type;
+    typedef std::string   intermediate_key_type;
+    typedef unsigned      intermediate_value_type;
+
+    map_task(job::map_task_runner &runner)
+      : runner_(runner)
+    {
+    }
+
+    // value is a non-const reference so that streams, which are modified by reading, can be passed
+    void operator()(key_type const &/*key*/, value_type &value) 
+    {
+        // extract words until a read fails (end of file or stream error)
+        std::string word;
+        while (value >> word)
+        {
+            std::transform(word.begin(), word.end(), word.begin(),
+                           std::bind1st(
+                               std::mem_fun(&std::ctype<char>::tolower),
+                               &std::use_facet<std::ctype<char> >(std::locale::classic())));
+
+            size_t length = word.length();
+            size_t const original_length = length;
+            std::string::const_iterator it;
+            for (it=word.begin();
+                 it!=word.end()  &&  !std::isalnum(*it, std::locale::classic());
+                 ++it)
+            {
+                --length;
+            }
+
+            for (std::string::const_reverse_iterator rit=word.rbegin();
+                 length>0  &&  !std::isalnum(*rit, std::locale::classic());
+                 ++rit)
+            {
+                --length;
+            }
+
+            if (length > 0)
+            {
+                if (length == original_length)
+                    runner_.emit_intermediate(word, 1);
+                else
+                    runner_.emit_intermediate(std::string(&*it,length), 1);
+            }
+        }
+    }
+
+  private:
+    job::map_task_runner &runner_;
+};
+
+class reduce_task : boost::noncopyable
+{
+  public:
+    typedef unsigned value_type;
+
+    reduce_task(job::reduce_task_runner &runner)
+      : runner_(runner)
+    {
+    }
+
+    template<typename It>
+    void operator()(typename map_task::intermediate_key_type const &key, It it, It const ite)
+    {
+        runner_.emit(key, std::accumulate(it, ite, reduce_task::value_type()));
+    }
+
+  private:
+    job::reduce_task_runner &runner_;
+};
+
+class combiner
+{
+  public:
+    void start(map_task::intermediate_key_type const &)
+    {
+        total_ = 0;
+    }
+
+    template<typename IntermediateStore>
+    void finish(map_task::intermediate_key_type const &key, IntermediateStore &intermediate_store)
+    {
+        if (total_ > 0)
+            intermediate_store.insert(key, total_);
+    }
+
+    void operator()(map_task::intermediate_value_type const &value)
+    {
+        total_ += value;
+    }
+
+  private:
+    unsigned total_;
+};
+
+}   // namespace wordcount
+
+
+int main(int argc, char **argv)
+{
+#if defined(BOOST_MSVC)  &&  defined(_DEBUG)
+//    _CrtSetBreakAlloc(380);
+    _CrtSetDbgFlag(_CrtSetDbgFlag(_CRTDBG_REPORT_FLAG) | _CRTDBG_LEAK_CHECK_DF);
+#endif
+
+    std::cout << "MapReduce Wordcount Application";
+    if (argc < 2)
+    {
+        std::cerr << "Usage: wordcount directory [num_map_tasks]\n";
+        return 1;
+    }
+
+    wordcount::job::datasource_type datasource;
+    datasource.set_directory(argv[1]);
+
+    std::cout << "\n" << std::max(1,(int)boost::thread::hardware_concurrency()) << " CPU cores";
+    std::cout << "\n" << typeid(wordcount::job).name() << "\n";
+    
+#if 0  ||  defined(_DEBUG)
+    std::cout << "\nRunning Sequential MapReduce...";
+
+    boost::mapreduce::specification spec;
+    spec.map_tasks    = 1;
+
+    boost::mapreduce::results                                     result;
+    boost::mapreduce::schedule_policy::sequential<wordcount::job> scheduler;
+    wordcount::job                                         mr1(datasource);
+    mr1.run(scheduler, spec, result);
+
+    std::cout << "\nFinished.";
+#else
+    std::cout << "\nRunning CPU Parallel MapReduce...";
+
+    boost::mapreduce::specification spec;
+    boost::mapreduce::results       result;
+    wordcount::job                  mr2(datasource);
+
+    if (argc > 2)
+        spec.map_tasks = atoi(argv[2]);
+
+    mr2.run<boost::mapreduce::schedule_policy::cpu_parallel<wordcount::job> >(spec, result);
+
+    std::cout << "\nCPU Parallel MapReduce Finished.";
+#endif
+    std::cout << std::endl << "\n" << "MapReduce statistics:";
+    std::cout << "\n  " << "MapReduce job runtime                     : " << result.job_runtime << " seconds, of which...";
+    std::cout << "\n  " << "  Map phase runtime                       : " << result.map_runtime << " seconds";
+    std::cout << "\n  " << "  Reduce phase runtime                    : " << result.reduce_runtime << " seconds";
+    std::cout << "\n\n  " << "Map:";
+    std::cout << "\n    " << "Total Map keys                          : " << result.counters.map_tasks;
+    std::cout << "\n    " << "Map keys processed                      : " << result.counters.map_tasks_completed;
+    std::cout << "\n    " << "Map key processing errors               : " << result.counters.map_tasks_error;
+    std::cout << "\n    " << "Number of Map Tasks run (in parallel)   : " << result.counters.actual_map_tasks;
+    std::cout << "\n    " << "Fastest Map key processed in            : " << *std::min_element(result.map_times.begin(), result.map_times.end()) << " seconds";
+    std::cout << "\n    " << "Slowest Map key processed in            : " << *std::max_element(result.map_times.begin(), result.map_times.end()) << " seconds";
+    std::cout << "\n    " << "Average time to process Map keys        : " << std::accumulate(result.map_times.begin(), result.map_times.end(), boost::int64_t()) / result.map_times.size() << " seconds";
+
+    std::cout << "\n\n  " << "Reduce:";
+    std::cout << "\n    " << "Number of Reduce Tasks run (in parallel): " << result.counters.actual_reduce_tasks;
+    std::cout << "\n    " << "Number of Result Files                  : " << result.counters.num_result_files;
+    std::cout << "\n    " << "Fastest Reduce key processed in         : " << *std::min_element(result.reduce_times.begin(), result.reduce_times.end()) << " seconds";
+    std::cout << "\n    " << "Slowest Reduce key processed in         : " << *std::max_element(result.reduce_times.begin(), result.reduce_times.end()) << " seconds";
+    std::cout << "\n    " << "Average time to process Reduce keys     : " << std::accumulate(result.reduce_times.begin(), result.reduce_times.end(), boost::int64_t()) / result.reduce_times.size() << " seconds";
+
+    return 0;
+}
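The word handling in map_task::operator() above (lowercase each token, strip leading and trailing non-alphanumeric characters, emit a count of 1 for whatever is left) can be tried out without Boost. The following is only a minimal sequential sketch of that idea, not the library's implementation: normalize_word and count_words are hypothetical names that do not appear in the commit, a std::map stands in for the intermediate store and the reduce step, and the <cctype> functions are used in place of the std::locale::classic() facets, which is assumed to be equivalent as long as the program stays in the default "C" locale.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Boost.MapReduce): returns `word` in
// lowercase with leading and trailing non-alphanumeric characters removed.
std::string normalize_word(std::string const &word)
{
    std::size_t first = 0;
    std::size_t last  = word.size();

    // skip leading punctuation
    while (first < last && !std::isalnum(static_cast<unsigned char>(word[first])))
        ++first;

    // skip trailing punctuation
    while (last > first && !std::isalnum(static_cast<unsigned char>(word[last - 1])))
        --last;

    assert(first <= last);

    std::string result(word, first, last - first);
    for (std::size_t i = 0; i < result.size(); ++i)
        result[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(result[i])));
    return result;
}

// Hypothetical sequential stand-in for the map and reduce phases: splits
// `text` on whitespace, normalizes each token and sums the occurrences.
std::map<std::string, unsigned> count_words(std::string const &text)
{
    std::map<std::string, unsigned> counts;
    std::istringstream in(text);

    std::string token;
    while (in >> token)             // stops on end of input or stream error
    {
        std::string const word = normalize_word(token);
        if (!word.empty())          // tokens made only of punctuation are dropped
            ++counts[word];
    }
    return counts;
}
```

Calling count_words on the contents of a file gives the per-word totals that the reduce_task would emit for that file; punctuation inside a word (as in "don't") is kept, which matches the trimming loops in the committed code.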
Added: sandbox/libs/mapreduce/test/wordcount/wordcount.vcproj
==============================================================================
--- (empty file)
+++ sandbox/libs/mapreduce/test/wordcount/wordcount.vcproj	2009-07-23 15:04:45 EDT (Thu, 23 Jul 2009)
@@ -0,0 +1,206 @@
+<?xml version="1.0" encoding="Windows-1252"?>
+<VisualStudioProject
+	ProjectType="Visual C++"
+	Version="8.00"
+	Name="wordcount"
+	ProjectGUID="{AB0444E8-E927-470A-BF0B-A60E67F91B06}"
+	RootNamespace="wordcount"
+	Keyword="Win32Proj"
+	>
+	<Platforms>
+		<Platform
+			Name="Win32"
+		/>
+	</Platforms>
+	<ToolFiles>
+	</ToolFiles>
+	<Configurations>
+		<Configuration
+			Name="Debug|Win32"
+			OutputDirectory="$(ConfigurationName)"
+			IntermediateDirectory="$(ConfigurationName)\compiler"
+			ConfigurationType="1"
+			CharacterSet="1"
+			>
+			<Tool
+				Name="VCPreBuildEventTool"
+			/>
+			<Tool
+				Name="VCCustomBuildTool"
+			/>
+			<Tool
+				Name="VCXMLDataGeneratorTool"
+			/>
+			<Tool
+				Name="VCWebServiceProxyGeneratorTool"
+			/>
+			<Tool
+				Name="VCMIDLTool"
+			/>
+			<Tool
+				Name="VCCLCompilerTool"
+				Optimization="0"
+				AdditionalIncludeDirectories="../../../.."
+				PreprocessorDefinitions="WIN32_LEAN_AND_MEAN"
+				MinimalRebuild="true"
+				BasicRuntimeChecks="3"
+				RuntimeLibrary="3"
+				UsePrecompiledHeader="0"
+				WarningLevel="4"
+				WarnAsError="true"
+				Detect64BitPortabilityProblems="true"
+				DebugInformationFormat="3"
+			/>
+			<Tool
+				Name="VCManagedResourceCompilerTool"
+			/>
+			<Tool
+				Name="VCResourceCompilerTool"
+			/>
+			<Tool
+				Name="VCPreLinkEventTool"
+			/>
+			<Tool
+				Name="VCLinkerTool"
+				LinkIncremental="2"
+				AdditionalLibraryDirectories=""
+				GenerateDebugInformation="true"
+				SubSystem="1"
+				OptimizeForWindows98="1"
+				TargetMachine="1"
+			/>
+			<Tool
+				Name="VCALinkTool"
+			/>
+			<Tool
+				Name="VCManifestTool"
+			/>
+			<Tool
+				Name="VCXDCMakeTool"
+			/>
+			<Tool
+				Name="VCBscMakeTool"
+			/>
+			<Tool
+				Name="VCFxCopTool"
+			/>
+			<Tool
+				Name="VCAppVerifierTool"
+			/>
+			<Tool
+				Name="VCWebDeploymentTool"
+			/>
+			<Tool
+				Name="VCPostBuildEventTool"
+			/>
+		</Configuration>
+		<Configuration
+			Name="Release|Win32"
+			OutputDirectory="$(ConfigurationName)"
+			IntermediateDirectory="$(ConfigurationName)\compiler"
+			ConfigurationType="1"
+			CharacterSet="1"
+			WholeProgramOptimization="1"
+			>
+			<Tool
+				Name="VCPreBuildEventTool"
+			/>
+			<Tool
+				Name="VCCustomBuildTool"
+			/>
+			<Tool
+				Name="VCXMLDataGeneratorTool"
+			/>
+			<Tool
+				Name="VCWebServiceProxyGeneratorTool"
+			/>
+			<Tool
+				Name="VCMIDLTool"
+			/>
+			<Tool
+				Name="VCCLCompilerTool"
+				InlineFunctionExpansion="2"
+				AdditionalIncludeDirectories="../../../.."
+				PreprocessorDefinitions="WIN32_LEAN_AND_MEAN;BOOST_LIB_DIAGNOSTIC"
+				RuntimeLibrary="2"
+				UsePrecompiledHeader="0"
+				WarningLevel="4"
+				WarnAsError="true"
+				Detect64BitPortabilityProblems="true"
+				DebugInformationFormat="3"
+			/>
+			<Tool
+				Name="VCManagedResourceCompilerTool"
+			/>
+			<Tool
+				Name="VCResourceCompilerTool"
+			/>
+			<Tool
+				Name="VCPreLinkEventTool"
+			/>
+			<Tool
+				Name="VCLinkerTool"
+				LinkIncremental="1"
+				AdditionalLibraryDirectories=""
+				GenerateDebugInformation="true"
+				SubSystem="1"
+				OptimizeReferences="2"
+				EnableCOMDATFolding="2"
+				OptimizeForWindows98="1"
+				TargetMachine="1"
+			/>
+			<Tool
+				Name="VCALinkTool"
+			/>
+			<Tool
+				Name="VCManifestTool"
+			/>
+			<Tool
+				Name="VCXDCMakeTool"
+			/>
+			<Tool
+				Name="VCBscMakeTool"
+			/>
+			<Tool
+				Name="VCFxCopTool"
+			/>
+			<Tool
+				Name="VCAppVerifierTool"
+			/>
+			<Tool
+				Name="VCWebDeploymentTool"
+			/>
+			<Tool
+				Name="VCPostBuildEventTool"
+			/>
+		</Configuration>
+	</Configurations>
+	<References>
+	</References>
+	<Files>
+		<Filter
+			Name="Source Files"
+			Filter="cpp;c;cc;cxx;def;odl;idl;hpj;bat;asm;asmx"
+			UniqueIdentifier="{4FC737F1-C7A5-4376-A066-2A32D752A2FF}"
+			>
+			<File
+				RelativePath=".\wordcount.cpp"
+				>
+			</File>
+		</Filter>
+		<Filter
+			Name="Header Files"
+			Filter="h;hpp;hxx;hm;inl;inc;xsd"
+			UniqueIdentifier="{93995380-89BD-4b04-88EB-625FBE52EBFB}"
+			>
+		</Filter>
+		<Filter
+			Name="Resource Files"
+			Filter="rc;ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe;resx;tiff;tif;png;wav"
+			UniqueIdentifier="{67DA6AB6-F800-4c08-8B7A-83BB121AAD01}"
+			>
+		</Filter>
+	</Files>
+	<Globals>
+	</Globals>
+</VisualStudioProject>