From: Lubomir Bourdev (lbourdev_at_[hidden])
Date: 2006-10-31 20:47:01


 
Stefan Heinzmann wrote:
>
> Maybe it's just me but I find extending GIL to support
> something like the v210 Quicktime format quite challenging (I
> don't want to imply that this is GIL's fault). This is a
> 10-bit YUV 4:2:2 format which stores 6 pixels in 16 bytes. It
> appears to me as if trying to support it would touch on a lot
> of concepts and corners of GIL, as it would require a new
> pixel storage format, color space, component subsampling, and
> maybe more.
>
> I believe it would help understanding if you could try to
> give at least a road map of what needs doing to support this
> properly (a fully coded example would probably require quite
> some effort).
>

Stefan,

This is an excellent example of a very complicated image format.
Here is a link that I found that describes it:

http://developer.apple.com/quicktime/icefloe/dispatch019.html#v210

Basically, each 16-byte block contains 6 packed Y'CbCr pixels, each
channel of which is 10 bits long. Because the format is 4:2:2, each Cb
and Cr channel is shared between two neighboring pixels.
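To make the packing concrete, here is a minimal stand-alone sketch (plain C++, no GIL) that unpacks one 16-byte block into six pixels. It assumes the commonly documented v210 layout: four little-endian 32-bit words, each holding three 10-bit components at bit offsets 0, 10 and 20 counted from the least significant bit, in the order Cb0 Y0 Cr0 | Y1 Cb1 Y2 | Cr1 Y3 Cb2 | Y4 Cr2 Y5. The names YCbCr10, component and unpack_v210_block are made up for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// One decoded pixel; each component holds a 10-bit value (0..1023).
struct YCbCr10 { std::uint16_t y, cb, cr; };

// Extracts the 10-bit component starting at bit `shift` (counted from the LSB).
inline std::uint16_t component(std::uint32_t word, int shift) {
    assert(shift == 0 || shift == 10 || shift == 20);
    return static_cast<std::uint16_t>((word >> shift) & 0x3FFu);
}

// Unpacks one block that has already been loaded as four 32-bit words.
// Assumed order: Cb0 Y0 Cr0 | Y1 Cb1 Y2 | Cr1 Y3 Cb2 | Y4 Cr2 Y5.
inline std::array<YCbCr10, 6> unpack_v210_block(const std::array<std::uint32_t, 4>& w) {
    const std::uint16_t cb0 = component(w[0], 0);
    const std::uint16_t y0  = component(w[0], 10);
    const std::uint16_t cr0 = component(w[0], 20);
    const std::uint16_t y1  = component(w[1], 0);
    const std::uint16_t cb1 = component(w[1], 10);
    const std::uint16_t y2  = component(w[1], 20);
    const std::uint16_t cr1 = component(w[2], 0);
    const std::uint16_t y3  = component(w[2], 10);
    const std::uint16_t cb2 = component(w[2], 20);
    const std::uint16_t y4  = component(w[3], 0);
    const std::uint16_t cr2 = component(w[3], 10);
    const std::uint16_t y5  = component(w[3], 20);
    // Each Cb/Cr pair is shared by two neighboring pixels (4:2:2).
    return {{ {y0, cb0, cr0}, {y1, cb0, cr0},
              {y2, cb1, cr1}, {y3, cb1, cr1},
              {y4, cb2, cr2}, {y5, cb2, cr2} }};
}
```

Twelve components cover six pixels: six Y' values plus three Cb/Cr pairs.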

Here is a rough plan of how I would approach modeling this in GIL:

1. Provide a Y'CbCr color space
2. Provide a model of sub-byte channel reference whose offset can be
specified at run time
3. Create a custom pixel iterator to handle the v210 format

__________________________
Detail:

1. Provide a Y'CbCr color space (see the design guide for details):

struct ycrcb_t {
    typedef ycrcb_t base;
    BOOST_STATIC_CONSTANT(int, num_channels=3);
};

The following macros define the common typedefs for pixels, iterators,
locators, images, etc.:

GIL_DEFINE_ALL_TYPEDEFS(8, ycrcb)
GIL_DEFINE_ALL_TYPEDEFS(8s, ycrcb)
GIL_DEFINE_ALL_TYPEDEFS(16, ycrcb)
GIL_DEFINE_ALL_TYPEDEFS(16s,ycrcb)
GIL_DEFINE_ALL_TYPEDEFS(32f,ycrcb)
GIL_DEFINE_ALL_TYPEDEFS(32s,ycrcb)

2. Create a model of a sub-byte channel reference, whose offset is a
dynamic parameter.
This is almost identical to class packed_channel_reference from the
packed_pixel example:

template <typename DataValue, typename ChannelValue,
          int FirstBit, int NumBits, bool Mutable>
class packed_channel_reference;

The difference is that FirstBit is passed at run time and stored in the
object:

template <typename DataValue, typename ChannelValue,
          int NumBits, bool Mutable>
class packed_runtime_channel_reference {
    ...
    const int _first_bit;
};
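As a rough illustration of the idea (a stand-alone sketch, not GIL's actual class), a proxy reference with a run-time bit offset might look like the following. The name runtime_bitfield_ref is hypothetical, the Mutable parameter is left out for brevity, and first_bit is assumed to count from the least significant bit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-alone proxy for a NumBits-wide field inside a DataValue
// word. The field position is chosen at run time instead of compile time.
template <typename DataValue, typename ChannelValue, int NumBits>
class runtime_bitfield_ref {
public:
    runtime_bitfield_ref(DataValue& data, int first_bit)
        : data_(&data), first_bit_(first_bit) {
        assert(first_bit >= 0 &&
               first_bit + NumBits <= static_cast<int>(sizeof(DataValue) * 8));
    }

    // Reading: shift the field down to bit 0 and mask off the rest.
    operator ChannelValue() const {
        return static_cast<ChannelValue>((*data_ >> first_bit_) & mask());
    }

    // Writing: replace only the bits of this field, leave neighbors intact.
    // The operator is const because the proxy itself does not change.
    const runtime_bitfield_ref& operator=(ChannelValue v) const {
        const DataValue field = mask() << first_bit_;
        *data_ = (*data_ & ~field) |
                 ((static_cast<DataValue>(v) << first_bit_) & field);
        return *this;
    }

private:
    static DataValue mask() { return (DataValue(1) << NumBits) - 1; }

    DataValue* data_;   // word that contains the field
    int first_bit_;     // run-time offset of the field's lowest bit
};
```

Storing the offset makes the reference one int larger than the compile-time version, but a single type can then address any of the three fields in a word.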

We now have a model of the 10-bit channel:

typedef packed_runtime_channel_reference<uint32_t, uint16_t, 10, true>
    v120_channel_ref_t;

We can use it to define a model of a pixel reference. We can reuse
planar_ref, which is a class that models PixelConcept whose channels are
at disjoint places in memory:

typedef planar_ref<v120_channel_ref_t, ycrcb_t> v120_pixel_ref_t;

3. Create a custom pixel iterator, containing a pointer to the first
byte of the current 16-byte block and the index of the current pixel
within that block:

// Models PixelIteratorConcept
struct v120_pixel_ptr : public boost::iterator_facade<...> {
    uint32_t* p; // pointer to the first byte of a 16-byte chunk
    int index; // which pixel is it currently on? (0..5)

    typedef v120_pixel_ref_t reference;
    typedef ycrcb16_pixel_t value_type;

    void increment();
    reference dereference() const;
};

Its increment will bump up the index of the pixel, and if it reaches 6,
will move the pointer to the next 16 bytes:

void v120_pixel_ptr::increment() {
   if (++index==6) {
      index=0;
      p+=4;
   }
}
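The stepping logic can be tried out in isolation. The sketch below is a stand-alone cursor (the name block_cursor is hypothetical) with the same increment, plus a matching decrement and an advance by n pixels, which a random-access iterator would also need; how these map onto the iterator_facade hooks is left open here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Position inside a stream of 16-byte blocks: p points at the first of the
// four 32-bit words of the current block, index selects one of its 6 pixels.
struct block_cursor {
    const std::uint32_t* p;
    int index;

    void increment() {
        assert(index >= 0 && index < 6);
        if (++index == 6) { index = 0; p += 4; }   // next 16-byte block
    }

    void decrement() {
        assert(index >= 0 && index < 6);
        if (index == 0) { index = 5; p -= 4; }     // previous 16-byte block
        else            { --index; }
    }

    // Moves by n pixels; n may be negative.
    void advance(std::ptrdiff_t n) {
        std::ptrdiff_t pos    = index + n;
        std::ptrdiff_t blocks = pos / 6;
        std::ptrdiff_t rem    = pos % 6;
        if (rem < 0) { rem += 6; --blocks; }       // keep the index in 0..5
        p += blocks * 4;
        index = static_cast<int>(rem);
    }
};
```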

Its dereference will return a reference to the appropriate channels.
For example, the pixel at index 4 (the fifth in the block) uses, counting
words from zero as the code below does:
 For Y': bits [22..31] of word 3
 For Cb: bits [2 ..11] of word 2
 For Cr: bits [12..21] of word 3

v120_pixel_ptr::reference v120_pixel_ptr::dereference() const {
   switch (index) {
      ...
      case 4: return reference(
              v120_channel_ref_t(*(p+3),22),
              v120_channel_ref_t(*(p+2),2),
              v120_channel_ref_t(*(p+3),12));
      ...
   }
}
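An alternative to a six-way switch is a small lookup table that records, for each pixel index, the word and bit offset of its three channels. The sketch below is stand-alone C++ with hypothetical names. It assumes the commonly documented v210 layout with offsets 0, 10 and 20 counted from the least significant bit, so its numbers will not match bit ranges that are numbered from the most significant bit.

```cpp
#include <cassert>
#include <cstdint>

struct field_pos    { int word; int shift; };      // word 0..3, shift from LSB
struct pixel_layout { field_pos y, cb, cr; };
struct ycbcr10      { std::uint16_t y, cb, cr; };

// Assumed layout: Cb0 Y0 Cr0 | Y1 Cb1 Y2 | Cr1 Y3 Cb2 | Y4 Cr2 Y5.
static const pixel_layout kLayout[6] = {
    { {0, 10}, {0,  0}, {0, 20} },   // pixel 0
    { {1,  0}, {0,  0}, {0, 20} },   // pixel 1 shares Cb0/Cr0
    { {1, 20}, {1, 10}, {2,  0} },   // pixel 2
    { {2, 10}, {1, 10}, {2,  0} },   // pixel 3 shares Cb1/Cr1
    { {3,  0}, {2, 20}, {3, 10} },   // pixel 4
    { {3, 20}, {2, 20}, {3, 10} }    // pixel 5 shares Cb2/Cr2
};

inline std::uint16_t read_field(const std::uint32_t* block, field_pos f) {
    return static_cast<std::uint16_t>((block[f.word] >> f.shift) & 0x3FFu);
}

// Overwrites one 10-bit field and leaves the other bits of the word alone.
inline void write_field(std::uint32_t* block, field_pos f, std::uint16_t v) {
    const std::uint32_t mask = 0x3FFu << f.shift;
    block[f.word] = (block[f.word] & ~mask) |
                    ((static_cast<std::uint32_t>(v) << f.shift) & mask);
}

inline ycbcr10 read_pixel(const std::uint32_t* block, int index) {
    assert(index >= 0 && index < 6);
    const pixel_layout& l = kLayout[index];
    return { read_field(block, l.y), read_field(block, l.cb),
             read_field(block, l.cr) };
}
```

Because the chroma channels are shared, writing Cb or Cr through one pixel also changes its neighbor. Generic algorithms that assume independent pixels need to be checked against this.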

You can now construct a view from the iterator:

typedef type_from_x_iterator<v120_pixel_ptr>::view_t v120_view_t;

And you should be able to construct it with common GIL functions:

v120_view_t v120_view=interleaved_view(width, height, ptr, row_bytes);

You should be able to use this view in algorithms:

copy_pixels(v120_view1, v120_view2);

Note that it is only compatible with other v210 views. So you cannot
copy to/from a regular view, even one of a Y'CbCr type. To do that you
will have to write channel conversion and color conversion. Use the
packed_pixel.hpp example to see how to do that.
Once you do that you should be able to do:

copy_and_convert_pixels(v120_view, rgb8_view);
copy_and_convert_pixels(rgb8_view, v120_view);

or:

jpeg_write_view("out.jpg",
    color_converted_view<rgb8_pixel_t>(v120_view, v120_color_converter));
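For the color conversion step mentioned above, the per-pixel arithmetic could look like the sketch below. It is stand-alone C++, not tied to GIL's converter interface, and rests on assumptions the format itself does not dictate: video-range levels (Y' in 64..940, Cb/Cr in 64..960) and the BT.601 matrix. HD material would more likely need BT.709 coefficients. The names rgb8, to_8bit and ycbcr10_to_rgb8 are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

struct rgb8 { std::uint8_t r, g, b; };

// Clamps a normalized value to [0,1] and rounds it to an 8-bit channel.
inline std::uint8_t to_8bit(double v) {
    v = std::min(1.0, std::max(0.0, v));
    return static_cast<std::uint8_t>(v * 255.0 + 0.5);
}

// Converts one 10-bit video-range Y'CbCr pixel to 8-bit RGB (BT.601 assumed).
inline rgb8 ycbcr10_to_rgb8(int y10, int cb10, int cr10) {
    assert(y10  >= 0 && y10  <= 1023);
    assert(cb10 >= 0 && cb10 <= 1023);
    assert(cr10 >= 0 && cr10 <= 1023);
    const double y  = (y10  - 64)  / 876.0;   // 64..940 -> 0..1
    const double cb = (cb10 - 512) / 896.0;   // 64..960 -> -0.5..0.5
    const double cr = (cr10 - 512) / 896.0;
    return { to_8bit(y + 1.402 * cr),
             to_8bit(y - 0.344136 * cb - 0.714136 * cr),
             to_8bit(y + 1.772 * cb) };
}
```

The channel conversion (10 bits to 8 bits) and the color conversion are folded into one function here. In GIL they would normally be separate pieces.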

You should be able to run carefully designed generic algorithms directly
on native v210 data.

Lubomir