Subject: Re: [boost] [chrono/date] Performance goals and design summary
From: Vicente J. Botet Escriba (vicente.botet_at_[hidden])
Date: 2013-05-05 04:15:08


Le 05/05/13 09:31, Vicente J. Botet Escriba a écrit :
> Le 05/05/13 04:31, Howard Hinnant a écrit :
>> Fwiw, I went to
>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3344.pdf to
>> experiment with the "Modified Following" benchmark which has some
>> code shown for it.
>>
>> I wanted to see what it would look like with two date types: field
>> and serial, instead of just one (date) which has been implemented as
>> both.
>>
>> Here is the original code from N3344:
>>
>> bool
>> isNonBusinessDay(const Date& targetDate,
>> const Date& startDate,
>> const int* calendar)
>> {
>> int offset = targetDate - startDate;
>> int wordIndex = offset / 8 / sizeof(int);
>> int bitIndex = offset - wordIndex * sizeof(int);
>> return (1 == (calendar[wordIndex] & (1 << bitIndex)));
>> }
>>
>> Date
>> modifiedFollowing(const Date& targetDate,
>> const Date& startDate,
>> const int* calendar)
>> {
>> Date date(targetDate);
>> while (isNonBusinessDay(date, startDate, calendar))
>> ++date;
>> if (targetDate.month() == date.month())
>> return date;
>> date = targetDate;
>> do
>> {
>> --date;
>> } while (isNonBusinessDay(date, startDate, calendar));
>> return date;
>> }
>>
>> For purposes of discussion, assume we now have two date types:
>>
>> 1. Serial type named day_point, which is a chrono::time_point.
>> 2. Field type named ymd (not a great name, just trying to
>> differentiate it).
>>
>> All isNonBusinessDay does is subtract two Dates. This is clearly the
>> domain of a serial type:
>>
>> bool
>> isNonBusinessDay(const day_point& targetDate,
>> const day_point& startDate,
>> const int* calendar)
>> {
>> days offset = targetDate - startDate;
>> days wordIndex = offset / (8 * sizeof(int));
>> days bitIndex = offset - wordIndex * sizeof(int);
>> return (1 == (calendar[wordIndex.count()] & (1 <<
>> bitIndex.count())));
>> }
>>
>> The modifications are trivial and expense is not compromised. The
>> code is perhaps a little more type safe, utilizing the days units,
>> but nothing spectacular. I would expect the exact same assembly to
>> be generated.
>>
>> modifiedFollowing is more interesting. If it takes two day_points
>> (two serial dates), it might look like this:
>>
>> day_point
>> modifiedFollowing(const day_point& targetDate,
>> const day_point& startDate,
>> const int* calendar)
>> {
>> day_point date(targetDate);
>> while (isNonBusinessDay(date, startDate, calendar))
>> ++date;
>> if (ymd(targetDate).month() == ymd(date).month()) //
>> serial->field twice
>> return date;
>> date = targetDate;
>> do
>> {
>> --date;
>> } while (isNonBusinessDay(date, startDate, calendar));
>> return date;
>> }
>>
>> The only difference is the need to convert the serial dates to a ymd
>> type so that the month can be extracted. This is done exactly twice
>> on the line commented. Otherwise the code is remarkably similar, and
>> I would argue, the exact same efficiency.
>>
>> <disclaimer>
>> std::chrono::time_point is currently missing operator-- and
>> operator++. I view this a defect that should be corrected.
>> </disclaimer>
>>
>> One could explore with passing in targetDate as a ymd type instead:
>>
>> day_point
>> modifiedFollowing(const ymd& targetDate,
>> const day_point& startDate,
>> const int* calendar)
>> {
>> day_point date(targetDate); // field->serial
>> day_point sdate = date;
>> while (isNonBusinessDay(date, startDate, calendar))
>> ++date;
>> if (targetDate.month() == ymd(date).month()) // serial->field
>> return date;
>> date = sdate;
>> do
>> {
>> --date;
>> } while (isNonBusinessDay(date, startDate, calendar));
>> return date;
>> }
>>
>> This rewrite trades one serial->field conversion for one
>> field->serial conversion. It might be a win if the client actually
>> has a ymd already for input, as my measurements are showing that
>> serial->field conversions are more expensive than field->serial. My
>> measurements are raw and new, so I could be off on that. And no
>> doubt such a measurement is going to depend upon things like hardware
>> and algorithms (caches).
>>
>> But the main point is that having two date types is not disruptive,
>> and mainly serves to give the client more options in optimizing his
>> date algorithms.
>>
>>
> I agree completely that we heed several dates and the standard (the
> library) must describe the performances provided by each one.
> <snip>
>
> For the purpose of showing the Date concept I will use a template
>
>
> template <typename Date1, typename Date2>
> days_date
> modifiedFollowing(const Date1& targetDate,
> const Date2& startDate,
> const int* calendar)
> {
> ymd_date ymdTargetDate(targetDate); // serial->field or nothing
> (so that the month() is efficient - we could store the month also).
> days_date date(targetDate); // field->serial or nothing, but at
> least there is a conversion (as we are making day arithmetic)
> days_date sdate = date;
> while (isNonBusinessDay(date, startDate, calendar))
> ++date;
> if (ymdTargetDate.month() == date.month()) // serial->field ++
> No need to convert explicitly
> return date;
> date = sdate;
> do
> {
> --date;
> } while (isNonBusinessDay(date, startDate, calendar));
> return date;
> }
>
>
> As you can see the two dates with the same interface have not only
> drawbacks ;-) Providing the same interface and been convertible to
> each other helps the user.
>
One additional advantage of using date.month() and not using a specific
conversion:
* class days_date could have a better algorithm that converting the
days_date to ymd_date, e.g. can convert it to an ordinal_date and use a
table to get the month from the is_leap() and day_of_year().

How better than the class days_date could know how to make these
operations more efficient? itself or the user?

Best,
Vicente