$include_dir="/home/hyper-archives/boost/include"; include("$include_dir/msg-header.inc") ?>
From: chintan rao (chintanraoh_at_[hidden])
Date: 2008-03-22 22:31:08
Hi,
On Sun, Mar 23, 2008 at 5:37 AM, Phil Endecott <
spam_from_boost_dev_at_[hidden]> wrote:
> chintan rao wrote:
> > I was thinking of implementing trie and various variation of tries,
> namely
> > Ternary trees, Suffix trees and DAWG(Directed Acyclic Word Graphs) , for
> > Google Soc.
>
> That's interesting.  Do you have a motivating application?
Till now i was gathering ideas here at the mailing list and other places.
Will have to prepare a final application.
> I last thought about tries when Boost.Flyweight was being discussed.
> Could your tries be used there?
 I have not looked into Boost.Flyweight but I suppose they can be used :).
At one time I was considering writing an Apache log file analysis
> program.  This would have used some sort of trie or similar structure
> to store the URLs (the aim is to be able to analyse very large logs in
> RAM).  At the time I was considering PATRICIA tries.
Some people had suggested PATRACIA tries. I looked into the algorithm and
added it to my todo list.
Actually even the DAWG idea was suggested by someone else too :). I also
looked in Digital Search Trees :).
> I also have some code that stores large numbers of very similar filenames:
>
> /photos/christmas_2007/img_3277.jpg
> /photos/christmas_2007/img_3278.jpg
> ....
>
> These are currently stored in a binary file which is mmap()ed.  The
> time to read the file is non-trivial and is bad for program start-up
> time.  So perhaps I need a trie that can be stored in the mmap()ed
> file, i.e. not using pointers, like the Boost.Interprocess containers do.
>
> > I was thinking something like this
> >
> > This is the abstract of the interface for trie
> > Interface:
> >     trie<type_2> something;
> >     trie[string key]=type2 value;  // O(key.size());
> >     value trie::delete(string key);       // O(key.size());
> >     trie::iterator trie::find(string key);         // O(key.size())'
> >     int trie::size()         // O(1) time. Gives the total no of
> elements in
> > the trie
> >     trie::iterator trie::begin()     //gives the beginning ie the
> > lexicographically least key :)
> >     trie::iterator trie::end()     //gives a iterator which is same as
> > x=x+trie.size();
> >     string trie::get_key(iterator x) gets the key pointed to by iterator
> x;
> >
> >     trie::iterator x;            //worst case O(height of tree)
> >     x++; // traverses the tree left to right and goes to the "next" node
> >     x--; // traverses the tree right to left and goes to "prev" node
> >
> >     *x // referrences the value of the key x is pointing to.
> >     trie.delete(x) // deletes the key pointed to by x;
> >
> >     trie.count_prefix(string key) // counts the no of keys with prefix
> key
> >
> > Examples:
> >     trie::interator x;
> >     trie<vector<int> > t;
> >     t["hello"].push_back(9);
> >     t["eat"].push_back(10);
> >
> >     x=t.find("hello");
> >     x--; //or --x; x now points to "eat" since eat comes before "hello"
> > lexicographically;
> >     cout<<(*x)[0]<<endl;
> >
> >     x++;//or ++x, x now points to "hello";
> >
> >     x=t.end(); // or x++ will point to t.end().
> >
> >     do
> >     {
> >         x--;
> >         int size=(*x).size();
> >         cout<<t.get_key(x)<<endl;
> >         for(int j=0;j<size;j++)
> >             cout<<(*x)[j]<<" ";
> >         cout<<endl;
> >     }
> >     while(x!=t.begin());
> >
> > Please suggest changes to the interface and other things.
>
>
> I think you should be able to make it (almost) identical to std::map's
> interface.  You should certainly use std::map as your starting point,
> and only deviate from it when you have a good reason to do so.
>
Should be easily possible :).
>
>
> Regards,  Phil.
>
Regards,
Chintan
>
>
>
> _______________________________________________
> Unsubscribe & other changes:
> http://listarchives.boost.org/mailman/listinfo.cgi/boost
>