b***@gmail.com
2018-08-16 17:32:10 UTC
Hi,
I'm investigating using Cap'n Proto as the basis for a format containing a
large collection of r-tree indexed data. The typical access pattern would
be to query the index, resulting in a set of nodes in the tree. The
collection of data would be physically clustered on node indexes so that
one can efficiently seek to and read the data items for the searched node
indexes.
The recommendation for random access has been to simply use mmap, which I
assume would work well in this case, but AFAIK it is only used for files
readily available on attached block storage. However, in this case the full
dataset might very well be too large to keep locally, and the preferred
access method would be streaming access over the network with the same
pattern of random access using index searches.
I'm a C++ novice and I fail to understand if something remotely like this
can be done already with the reference C++ implementation. Indeed, I have
not even been able to understand if it supports sequential streaming access
of a part of a message - it seems assumed that a message is fully read into
RAM, except when using mmap which would then be the only way to partially
read a message (sequential or random). But I do not want to give up yet;
perhaps there is something I'm missing?
Regards,
Björn
--
You received this message because you are subscribed to the Google Groups "Cap'n Proto" group.
To unsubscribe from this group and stop receiving emails from it, send an email to capnproto+***@googlegroups.com.
Visit this group at https://groups.google.com/group/capnproto.