Subject: [boost] [Endian] Performance
From: Phil Endecott (spam_from_boost_dev_at_[hidden])
Date: 2011-09-05 18:38:15
Dear All,
I've just done some quick benchmarks of Beman's proposed byte-swapping
code using the following test program:
#include <cstdint>
#include <iostream>
#include <arpa/inet.h> // for htonl()

using std::uint32_t;

static inline uint32_t byteswap(uint32_t src)
{
#if defined(USE_GCC_BUILTIN)
    return __builtin_bswap32(src);
#elif defined(USE_HTONL)
    return htonl(src);
#elif defined(USE_REV_INSTR)
    uint32_t target;
    __asm__ ( "rev %0, %1\n"
              : "=r" (target)
              : "r" (src)
            );
    return target;
#elif defined(USE_BEMAN)
    const char* s (reinterpret_cast<const char*>(&src));
    uint32_t target;
    char * t (reinterpret_cast<char*>(&target) + sizeof(target) - 1);
    *t = *s;
    *--t = *++s;
    *--t = *++s;
    *--t = *++s;
    return target;
#else
#error "Define USE_*"
#endif
}

int main()
{
    uint32_t s = 0;
    for (uint32_t i = 0; i < 1000000000; ++i) {
        s += byteswap(i);
    }
    std::cout << s << "\n"; // Expect 506498560
}
I've tested this on two systems:
(A): Marvell ARMv5TE ("Feroceon") @ 1.2 GHz, g++ 4.4
(B): Freescale ARMv7 (i.MX53, Cortex A8) @ 1.0 GHz, g++ 4.6.1
Compiled with: --std=c++0x -O4
Execution times in seconds are:
                     A       B
USE_GCC_BUILTIN    29.4     3.0
USE_HTONL          12.6     3.0
USE_REV_INSTR       n/a     3.0
USE_BEMAN          17.6     8.0
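However the figures above were collected, anyone wanting to repeat the comparison without an external timer could time the loop in-process. The following is only a sketch under assumptions: the name time_swaps is hypothetical, and it needs a standard library that provides std::chrono::steady_clock, which the compilers listed above may not.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical harness (not part of the test program above): applies `swap`
// to the values 0 .. count-1 and returns the elapsed wall-clock time in
// seconds. The running sum is handed back through `sum_out` so that the
// loop has an observable result and is not optimised away.
template <typename Swap>
double time_swaps(Swap swap, std::uint32_t count, std::uint32_t& sum_out)
{
    const auto start = std::chrono::steady_clock::now();
    std::uint32_t sum = 0;
    for (std::uint32_t i = 0; i < count; ++i) {
        sum += swap(i);
    }
    const auto stop = std::chrono::steady_clock::now();
    sum_out = sum;
    return std::chrono::duration<double>(stop - start).count();
}
```

Passing each byteswap variant as a lambda or function object, rather than a function pointer, gives the compiler the best chance of inlining it as it would in the original loop.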
ARM architecture version 6 (IIRC) introduced a "rev" byte-swapping
instruction. On system B, which supports this, it is used by the gcc
builtin and glibc's htonl(), giving better performance than Beman's
code. On system A, which does not support this instruction, the gcc
builtin makes a library call; I've not looked at what htonl() does.
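For platforms with neither a builtin nor a suitable instruction, a further candidate for comparison is a plain shift-and-mask swap. This is a sketch only: the name byteswap_shift is hypothetical, it is not taken from the proposed library, and it was not among the variants timed above. Whether a given compiler reduces it to a single "rev" is something to check in the generated assembly.

```cpp
#include <cstdint>

// Hypothetical variant (not measured above): reverses the four bytes of a
// 32-bit value using only shifts and masks, with no pointer casts and no
// dependence on the host byte order.
inline std::uint32_t byteswap_shift(std::uint32_t x)
{
    return ((x & 0x000000FFu) << 24) |  // lowest byte moves to the top
           ((x & 0x0000FF00u) <<  8) |
           ((x & 0x00FF0000u) >>  8) |
           ((x & 0xFF000000u) >> 24);   // highest byte moves to the bottom
}
```

Unlike htonl(), which is a no-op on big-endian hosts, this always swaps, so it matches the behaviour of the builtin and Beman's code on any platform.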
What do people see on other platforms?
How important do we consider performance to be for this library?
Regards, Phil.