Subject: Re: [boost] [BGL] Trying to get a correclty working Parallel BFS code
From: Sandeep Gupta (gupta.sandeep_at_[hidden])
Date: 2009-11-04 18:22:12


On Wed, Nov 4, 2009 at 3:11 PM, Sandeep Gupta <gupta.sandeep_at_[hidden]>wrote:

>
>
> On Wed, Nov 4, 2009 at 12:18 PM, Nick Edmonds <ngedmond_at_[hidden]>wrote:
>
>>
>> On Nov 3, 2009, at 10:44 PM, Sandeep Gupta wrote:
>>
>> On Mon, Nov 2, 2009 at 12:43 PM, Nick Edmonds <ngedmond_at_[hidden]
>>> >wrote:
>>>
>>>
>>>> On Oct 31, 2009, at 1:00 AM, Sandeep Gupta wrote:
>>>>
>>>> On Fri, Oct 30, 2009 at 6:41 AM, Jeremiah Willcock <jewillco_at_[hidden]
>>>>
>>>>> wrote:
>>>>>>
>>>>>
>>>>> On Thu, 29 Oct 2009, Sandeep Gupta wrote:
>>>>>
>>>>>>
>>>>>> On Thu, Oct 29, 2009 at 1:12 PM, Jeremiah Willcock <
>>>>>> jewillco_at_[hidden]
>>>>>>
>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> On Thu, 29 Oct 2009, Sandeep Gupta wrote:
>>>>>>>
>>>>>>>
>>>>>>>> On Wed, Oct 28, 2009 at 5:49 PM, Sandeep Gupta <
>>>>>>>> gupta.sandeep_at_[hidden]
>>>>>>>>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Oct 27, 2009 at 9:31 PM, Jeremiah Willcock <
>>>>>>>>>
>>>>>>>>>> jewillco_at_[hidden]
>>>>>>>>>>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, 27 Oct 2009, Sandeep Gupta wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The error snippet regarding predecessor map:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> error: no match for call to
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> '(boost::property_reduce<boost::vertex_predecessor_t>::apply<boost::detail::error_property_not_found>)
>>>>>>>>>>>>> (const boost::detail::parallel::global_descriptor<unsigned
>>>>>>>>>>>>> int>&)'
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> ../../../boost/property_map/parallel/impl/distributed_property_map.ipp:141:
>>>>>>>>>>>>> error: no match for call to
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> '(boost::property_reduce<boost::vertex_predecessor_t>::apply<boost::detail::error_property_not_found>)
>>>>>>>>>>>>> (const boost::detail::parallel::global_descriptor<unsigned
>>>>>>>>>>>>> int>&,
>>>>>>>>>>>>> boost::detail::error_property_not_found&, const
>>>>>>>>>>>>> boost::detail::error_property_not_found&)'
>>>>>>>>>>>>> ../../../boost/graph/parallel/properties.hpp:95: note:
>>>>>>>>>>>>> candidates
>>>>>>>>>>>>> are:
>>>>>>>>>>>>> T
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> boost::property_reduce<boost::vertex_predecessor_t>::apply<T>::operator()(T)
>>>>>>>>>>>>> const [with T = boost::detail::error_property_not_found]
>>>>>>>>>>>>> ../../../boost/graph/parallel/properties.hpp:96: note:
>>>>>>>>>>>>> T
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> boost::property_reduce<boost::vertex_predecessor_t>::apply<T>::operator()(T,
>>>>>>>>>>>>> T, T) const [with T = boost::detail::error_property_not_found]
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jeremiah, Do you think that I should file a bug report for
>>>>>>>>>>>>> this.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Although I was hoping (and actually needed quite urgently) that
>>>>>>>>>>>>> it
>>>>>>>>>>>>>
>>>>>>>>>>>>> would
>>>>>>>>>>>>>
>>>>>>>>>>>> be
>>>>>>>>>>>> minor issue and get fixed quickly.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> You need to pass a predecessor map to the algorithm as one of
>>>>>>>>>>>> the
>>>>>>>>>>>>
>>>>>>>>>>>> parameters
>>>>>>>>>>> (or put it as a part of your graph but making an external
>>>>>>>>>>> property
>>>>>>>>>>> map
>>>>>>>>>>> to
>>>>>>>>>>> give
>>>>>>>>>>> to the algorithm is easier).
>>>>>>>>>>>
>>>>>>>>>>> Hi Jeremy,
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> It took me a while but I finally figure out how to pass the
>>>>>>>>>>>
>>>>>>>>>> predecessor
>>>>>>>>>>
>>>>>>>>>> map. Unfortunately it doesn't have any effect. Also, I might be
>>>>>>>>>>
>>>>>>>>>>> wrong
>>>>>>>>>>> but I
>>>>>>>>>>> don't see any logical reason why predecessor map should have any
>>>>>>>>>>> bearing
>>>>>>>>>>> on
>>>>>>>>>>> the correctness of the depth. I have attached the new code. I am
>>>>>>>>>>> not
>>>>>>>>>>> able
>>>>>>>>>>> to print out the predecessor because I am not able to figure out
>>>>>>>>>>> its
>>>>>>>>>>> type.
>>>>>>>>>>> Maybe you could help me resolve this.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> It would be nice to have this code running. I need to profile
>>>>>>>>>>> graph
>>>>>>>>>>>
>>>>>>>>>>> performance on a new machine by tomorrow. Again, thanks for you
>>>>>>>>>> patience
>>>>>>>>>> and
>>>>>>>>>> time. I really appreciate you looking into this.
>>>>>>>>>>
>>>>>>>>>> Hi Jeremiah,
>>>>>>>>>>
>>>>>>>>>> Was just wondering if you had time to look into this or any
>>>>>>>>>> suggestion
>>>>>>>>>>
>>>>>>>>>> on
>>>>>>>>>>
>>>>>>>>> how further proceed. I can get you the output of predecessor map if
>>>>>>>>> that
>>>>>>>>> would help.
>>>>>>>>> Just that I haven't be been able to figure out what the is type of
>>>>>>>>> the
>>>>>>>>> property map returned by make_distributed_property_map.
>>>>>>>>> Please let me know your thoughts.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The person who really knows this library is not around as far as I
>>>>>>>>>
>>>>>>>> know;
>>>>>>>> he
>>>>>>>> might be on vacation. The type returned from
>>>>>>>> make_distributed_property_map
>>>>>>>> is documented on <URL:
>>>>>>>> http://www.osl.iu.edu/research/pbgl/documentation/graph/
>>>>>>>> distributed_property_map.html>. One thing that you could do to
>>>>>>>> debug
>>>>>>>> things is to put in your own color map and look at what values it
>>>>>>>> ends
>>>>>>>> up
>>>>>>>> with at the end of the algorithm, and possibly even to put in a
>>>>>>>> local
>>>>>>>> color
>>>>>>>> map that prints out when elements are changed. That might be a lot
>>>>>>>> of
>>>>>>>> work
>>>>>>>> for what could end up being an obvious problem (that I just don't
>>>>>>>> know
>>>>>>>> how
>>>>>>>> to diagnose), though. BTW, does the example in
>>>>>>>> libs/graph_parallel/example/breadth_first_search.cpp work for you?
>>>>>>>> It
>>>>>>>> appears to be doing the same thing you want to do. If that works,
>>>>>>>> it
>>>>>>>> could
>>>>>>>> be a problem in your visitor.
>>>>>>>>
>>>>>>>>
>>>>>>>> I derived this code from
>>>>>>>> graph_parallel/example/breadth_first_search.cpp.
>>>>>>>>
>>>>>>>> The problem is that I don't understand the input and the output of
>>>>>>>> the
>>>>>>>>
>>>>>>> example code. It would be great if I could get an explanation for
>>>>>>> the
>>>>>>> format of the input file. Then I can transform my graph into that
>>>>>>> format
>>>>>>> and pass to the algorithm. That would be enough for my current
>>>>>>> purpose.
>>>>>>>
>>>>>>> As time permits I would try out your suggestions and let know of
>>>>>>> update.
>>>>>>> In
>>>>>>> the meantime I am hoping that I would get input from other relevant
>>>>>>> authors.
>>>>>>>
>>>>>>>
>>>>>>> This is what I could tell from the code.
>>>>>>
>>>>>> Input:
>>>>>> http://people.sc.fsu.edu/~burkardt/data/metis_graph/>
>>>>>> <
http://people.sc.fsu.edu/%7Eburkardt/data/metis_graph/>
>>>>>>
>>>>>> metis_graph.html<
>>>>>> http://people.sc.fsu.edu/%7Eburkardt/data/metis_graph/metis_graph.html
>>>>>> >
>>>>>>
>>>>>>
>>>>>> Output:
>>>>>> http://www.graphviz.org/doc/info/lang.html
>>>>>>
>>>>>>
>>>>>> Thanks Jeremiah. The link was helpful. I understood the input graph
>>>>>> format.
>>>>>> Its for undirected graph only. However the BFS output is still coming
>>>>>> out
>>>>>> to
>>>>>> be incorrect. I tried with a sample line graph of 4 nodes and it
>>>>>> exhibits the same problem as before.
>>>>>>
>>>>>> The distances of nodes on process 2 is not getting updated. On the
>>>>>> other
>>>>>>
>>>>> hand dijkstra shortest paths example is working correctly.
>>>>>
>>>>> Thanks
>>>>> Sandeep
>>>>>
>>>>>
>>>> I rewrote most of your code before I realized that you're outputting all
>>>> your distances on process 0, even for vertices process 0 doesn't own.
>>>> The
>>>> large value you see is std::numeric_limits<distance_type>::max. When
>>>> you
>>>> call get(distance, vertex(i, g)) for some i not owned by the current
>>>> process
>>>> a ghost cell is created for the requested value and a request sent to
>>>> the
>>>> owner of that vertex for the correct value. get() then returns whatever
>>>> the
>>>> default value is for the distance property map, which in this case is
>>>> infinity (i.e. std::numeric_limits<distance_type>::max.
>>>>
>>>> In some cases you may get the correct distance value because process 0
>>>> has
>>>> previously requested that value (and it may or may not be up to date).
>>>> If
>>>> process 0 has never requested that distance value and it doesn't own the
>>>> vertex, then you'll get infinity.
>>>>
>>>> You need to gather all your data before you output it from a single node
>>>> (i.e. issue a get() or request() which is a get with no return value,
>>>> for
>>>> each data element). If you want your code to be scalable you should
>>>> output
>>>> your data in a distributed fashion from all nodes at once. Remember
>>>> that
>>>> updated values won't be guaranteed to arrive until the next
>>>> synchronization
>>>> step.
>>>>
>>>> If you want the modified version of your code let me know and I'll
>>>> finish
>>>> it and ship it your way.
>>>>
>>>> -Nick
>>>>
>>>>
>>>> Hi Nick,
>>>>
>>>> Thanks so much for looking into this. Sorry, I couldn't get to it
>>> earlier
>>> due to other concerns. I corrected the mistake you mentioned and I don't
>>> get
>>> numeric_limit<>::max() for any nodes anymore.
>>>
>>> I believe the graph/distributed/graphviz.hpp was outputting the distance
>>> in
>>> correct fashion.
>>>
>>> Unfortunately, the output is still coming out to be incorrect.
>>> It was also the case with output from graphviz.hpp.
>>>
>>> For the test graph that we mentioned before, essentially
>>> 0 --> 1 --> 2 --> 3,
>>> 0 --> 4 --> 3
>>>
>>> the distance output is as follows:
>>> global node-id : distance
>>> 0 : 0
>>> 1 : 1
>>> 2 : 2
>>> 3 : 1 //(wrong: should be 2)
>>> 4 : 0 // should be 1
>>>
>>> If possible can you send me your version of the code. Let me know if you
>>> need mine (the updated version).
>>>
>>
>>
>> Given the description of your problem above, and assuming you're using a
>> block distribution with 8 vertices, it sounds like you're initializing the
>> distance to the first vertex on *each* process to 0, as opposed to only the
>> distance to the source. The vertex with global index 4 is actually local
>> index 0 on process one using the above distribution. If you initialize the
>> distances to the vertices with global indices 0 and 4 to 0, then your
>> results make sense. Let me know if that's not the case, but I surmise that
>> it is.
>>
>> Very much so :-). Thanks for getting me started with by catching these
> minor but fatal mis-understandings with PBFS code. I understood that call
> vertex(i,g) creates a descriptor for ith global vertex.
>
> Also there's a problem in your depth labeling visitor. In the case of
>> cross-processor edges the process that owns the target of the edge doesn't
>> necessarily have the distance to the source of the edge available locally.
>> This will lead to incorrect distance labels even though tree and non-tree
>> edges will still be correctly recognized since the BFS is
>> level-synchronized. The fastest way to handle this is to modify the BFS
>> queue to also pass distance data (which is how the shortest paths algorithms
>> do it). You could also send the data separately and probe for it on the
>> receiving side. Check out the distributed betweenness centrality code for
>> how to create a distributed queue which contains tuples of data one element
>> of which is a vertex (in order to know where to route the data).
>> Integrating this type of vertex queue (i.e. one that contains a data
>> payload) with BFS basically just requires handling the data payload on the
>> receiving side.
>>
>> I would try this out. So basically the default call
>
> boost::breadth_first_search
> (g, start,
> boost::visitor(boost::make_bfs_visitor(boost::record_distances(distance,
> boost::on_tree_edge()))));
>
> wouldn't work. I assumed that distributed BFS implementation did exactly
> what you mentioned.
> Although I would write my own, parallel visitor as well per your
> suggestions.
>
> thanks
> Sandeep
>
>
>
> Let me also add that I did modified code that initializes only the
distance to 0th vertex of processor 0. In this case i am facing the dilemma
of what should I pass as start vertex for processes > 0. I can't pass the
global start vertex (probably the type won't be acceptable to
breadth_first_search.

I will dig further into the betweeness centrality and dijkstra code to
figure this out. Appreciate your input.

Thanks
Sandeep