From: stefan (stefan_at_[hidden])
Date: 2019-10-10 19:35:40
Hi Olzhas,
This is a very interesting topic! As it happens, I have worked on
HPEC (high-performance embedded computing) software with similar
features in the past, with support for heterogeneous memory
management, multiple compute engines, etc. See http://openvsip.org/
In fact, the idea for the memory management came from a tool I found 
years ago, called StarPU: http://starpu.gforge.inria.fr/
The idea is that for a given user-facing object (a tensor, an image,
etc.), multiple mappings into different memory spaces exist, which are 
updated on-demand, depending on where a computation is performed. In 
OpenVSIP we used a "dispatcher" to decide what backend to use for a 
given operation, and that dispatch logic was used on entire assignment 
expressions (using C++ expression templates), so multiple unary and 
binary operations could be fused together for maximum performance.
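
Just to make that concrete, here is a minimal C++ sketch of the
on-demand mapping idea; the names (multi_space_block, sync_to,
mark_written) and the std::vector standing in for device memory are my
own illustration, not OpenVSIP or StarPU API:

  #include <cstddef>
  #include <vector>

  enum class memory_space { host, device };

  // One user-facing block with one mapping per memory space. Each
  // mapping has a "valid" flag; data is copied on demand right before
  // a backend touches it.
  class multi_space_block {
  public:
      explicit multi_space_block(std::size_t n)
          : host_data_(n), device_data_(n),
            host_valid_(true), device_valid_(false) {}

      // Make the requested space current, copying only if it is stale.
      float* sync_to(memory_space s) {
          if (s == memory_space::device && !device_valid_) {
              device_data_ = host_data_;  // stand-in for a host->device copy
              device_valid_ = true;
          }
          if (s == memory_space::host && !host_valid_) {
              host_data_ = device_data_;  // stand-in for a device->host copy
              host_valid_ = true;
          }
          return s == memory_space::host ? host_data_.data()
                                         : device_data_.data();
      }

      // A write through one mapping invalidates the other one.
      void mark_written(memory_space s) {
          host_valid_   = (s == memory_space::host);
          device_valid_ = (s == memory_space::device);
      }

  private:
      std::vector<float> host_data_, device_data_;  // device side simulated
      bool host_valid_, device_valid_;
  };

A dispatcher would then pick a backend for a whole assignment
expression and call sync_to() on the operands for whatever memory
space that backend uses.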
Of course, that's a whole lot of things to take care of, and may be a
bit too much for your needs. I mentored a project for Boost.uBLAS in
the summer of 2018 that added GPU support using OpenCL rather than
CUDA (but the basic idea would be the same).
The result is two similar low-level APIs to handle operations on the
host or the device (in fact, given that uBLAS vectors and matrices are
already parametrized on their storage type, I simply added a "gpu"
storage type), so you can (a rough sketch follows the list):
* explicitly run a computation on the host
* explicitly run a computation on the device
* explicitly copy data from one side to the other
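
For illustration, the shape of that interface boiled down to a toy
example; the storage tags, basic_vector, to_device/to_host, and the
plain loops standing in for OpenCL kernels are all made up here, not
the actual Boost.uBLAS code:

  #include <cstddef>
  #include <vector>

  // Hypothetical storage tags; in the real work the "gpu" storage
  // type wraps an OpenCL buffer instead of a std::vector.
  struct host_storage   { using buffer = std::vector<float>; };
  struct device_storage { using buffer = std::vector<float>; };

  // A vector whose storage parameter decides where the data lives.
  template <typename Storage>
  struct basic_vector { typename Storage::buffer data; };

  using host_vector   = basic_vector<host_storage>;
  using device_vector = basic_vector<device_storage>;

  // Explicit copies between the two sides (stubs; a real backend
  // would enqueue OpenCL transfers here).
  device_vector to_device(host_vector const& v) { return device_vector{v.data}; }
  host_vector   to_host(device_vector const& v) { return host_vector{v.data}; }

  // Explicit host computation...
  host_vector add(host_vector const& a, host_vector const& b) {
      host_vector r{std::vector<float>(a.data.size())};
      for (std::size_t i = 0; i < a.data.size(); ++i)
          r.data[i] = a.data[i] + b.data[i];
      return r;
  }

  // ...and the device overload, selected by the storage type (a real
  // backend would launch a kernel instead of looping on the host).
  device_vector add(device_vector const& a, device_vector const& b) {
      device_vector r{std::vector<float>(a.data.size())};
      for (std::size_t i = 0; i < a.data.size(); ++i)
          r.data[i] = a.data[i] + b.data[i];
      return r;
  }

User code then spells out where things run: add(a, b) stays on the
host, while add(to_device(a), to_device(b)) runs on the device.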
Something similar may work well for Boost.GIL, too, I suspect; a small
GIL-flavored sketch is below. Let me know if you would like to chat in
more detail about any of this...
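
To give a feel for the GIL side (device_image, upload, and download
are invented names, and the second image member merely stands in for a
real device allocation):

  #include <boost/gil.hpp>

  namespace gil = boost::gil;

  // Hypothetical wrapper owning the device-side copy of an image.
  struct device_image {
      gil::rgb8_image_t staged;  // placeholder for a real device buffer
  };

  // Explicit copy host -> device.
  device_image upload(gil::rgb8_image_t const& img) {
      device_image d{gil::rgb8_image_t(img.dimensions())};
      gil::copy_pixels(gil::const_view(img), gil::view(d.staged));
      return d;
  }

  // Explicit copy device -> host.
  gil::rgb8_image_t download(device_image const& d) {
      gil::rgb8_image_t img(d.staged.dimensions());
      gil::copy_pixels(gil::const_view(d.staged), gil::view(img));
      return img;
  }

  // Explicit host computation, exactly as GIL does today.
  void invert_on_host(gil::rgb8_image_t& img) {
      gil::for_each_pixel(gil::view(img), [](gil::rgb8_pixel_t& p) {
          gil::static_for_each(p, [](unsigned char& c) { c = 255 - c; });
      });
  }

A device-side counterpart of for_each_pixel / transform_pixels would
be the natural next step, with the explicit upload/download calls
marking the transfers.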
Stefan
--
       ...I still have a suitcase in Berlin...