marisa-trie
MARISA: Matching Algorithm with Recursively Implemented StorAge
0.2.6
Matching Algorithm with Recursively Implemented StorAge (MARISA) is a static and space-efficient trie data structure. libmarisa is a C++ library that provides an implementation of MARISA. The libmarisa package also contains a set of command line tools for building and operating a MARISA-based dictionary.
A MARISA-based dictionary supports not only lookup but also reverse lookup, common prefix search and predictive search.
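To make the four operations concrete, here is a minimal sketch of their semantics. It is not libmarisa and not its API: `ToyDictionary` and its method names are hypothetical, and a sorted `std::vector` stands in for the trie. The IDs here are simply positions in sorted order, which is an assumption of this sketch and not necessarily how libmarisa assigns key IDs.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a dictionary: a sorted, de-duplicated key list.
// It only illustrates what each operation returns, not how MARISA stores keys.
class ToyDictionary {
 public:
  explicit ToyDictionary(std::vector<std::string> keys) : keys_(std::move(keys)) {
    std::sort(keys_.begin(), keys_.end());
    keys_.erase(std::unique(keys_.begin(), keys_.end()), keys_.end());
  }

  // Lookup: key -> ID, or -1 if the key is not registered.
  long lookup(const std::string& key) const {
    auto it = std::lower_bound(keys_.begin(), keys_.end(), key);
    if (it == keys_.end() || *it != key) return -1;
    return static_cast<long>(it - keys_.begin());
  }

  // Reverse lookup: ID -> key (throws std::out_of_range for an unknown ID).
  const std::string& reverse_lookup(std::size_t id) const { return keys_.at(id); }

  // Common prefix search: registered keys that are prefixes of the query.
  std::vector<std::string> common_prefix_search(const std::string& query) const {
    std::vector<std::string> out;
    for (std::size_t len = 1; len <= query.size(); ++len) {
      std::string prefix = query.substr(0, len);
      if (lookup(prefix) >= 0) out.push_back(prefix);
    }
    return out;
  }

  // Predictive search: registered keys that start with the query.
  std::vector<std::string> predictive_search(const std::string& query) const {
    std::vector<std::string> out;
    auto it = std::lower_bound(keys_.begin(), keys_.end(), query);
    for (; it != keys_.end() && it->compare(0, query.size(), query) == 0; ++it) {
      out.push_back(*it);
    }
    return out;
  }

 private:
  std::vector<std::string> keys_;
};
```

With the keys `app`, `apple`, `apply` and `banana`, a common prefix search for `apples` yields `app` and `apple`, while a predictive search for `app` yields `app`, `apple` and `apply`.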
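The biggest advantage of libmarisa is that its dictionaries are considerably more compact than those of other implementations. In the comparison below, the MARISA dictionary is roughly 2.5 times smaller than the tx-trie one and roughly 7.4 times smaller than the darts-clone one.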
Implementation | Size (bytes) | Remarks |
---|---|---|
darts-clone | 376,613,888 | Compacted double-array trie |
tx-trie | 127,727,058 | LOUDS-based trie |
marisa-trie | 50,753,560 | MARISA trie |
You can get the latest version via `git clone`. Then, you can generate a `configure` script via `autoreconf -i`. After that, you can build and install libmarisa and its command line tools via `configure` and `make`. For details, see also the documentation in `docs`.

    $ git clone https://github.com/s-yata/marisa-trie.git
    $ cd marisa-trie
    $ autoreconf -i
    $ ./configure --enable-native-code
    $ make
    $ make install