Discussion:
Need help making pycapnp/capnproto work across python and extension boundaries
vitaly numenta
2017-02-15 00:46:59 UTC
I am experiencing binary compatibility issues trying to get pycapnp
serialization/deserialization working with C extensions. There appear to be
ABI compatibility issues when passing C++ structs compiled in pycapnp into
our C extensions that are compiled in a different environment.

When serializing an instance of a class that's implemented in NuPIC, we
create a message builder via pycapnp and pass it to the corresponding
instance's write method, which in turn invokes write methods of its own
contained members. This works fine for members whose classes are
implemented in python, but doesn't always work for those implemented in the
nupic.bindings extension due to ABI issues.

For example, when serializing the TemporalMemory class, we might employ the
following sequence:

from nupic.proto import TemporalMemoryProto_capnp

builder = TemporalMemoryProto_capnp.TemporalMemoryProto.new_message()

temporal_memory.write(builder)

Inside TemporalMemory.write(builder), we have something along these lines:

class TemporalMemory(object):
    def write(self, builder):
        builder.columnDimensions = list(self.columnDimensions)
        self.connections.write(builder.connections)  # pure python
        self._random.write(builder.random)  # C++ Random class from extension


The Random class that's implemented inside the nupic.bindings extension
needs to rely on our own build of capnproto that's linked into the
extension, but this doesn't seem to be compatible with the object
constructed in pycapnp.

We learned the hard way, after much trial and error, that we can't simply
pass the underlying message builders that were instantiated by pycapnp's
capnp.so module to our own build of capnproto contained in the
nupic.bindings extension. This was particularly evident when working on the
manylinux wheel for nupic.bindings, which needs to be compiled using the
toolchain and C/C++ runtimes from CentOS-6. This resulted in ABI
incompatibilities when the capnproto code compiled into the extension
attempts to operate on a message builder that was constructed by pycapnp's
build of capnp.so. The message builder instance created by pycapnp's
capnp.so appears corrupted when operated upon by the capnproto code linked
into the extension.


Is there any recommendation for handling this dual Python/C-extension
scenario that avoids the ABI compatibility problem with C++ objects?
--
You received this message because you are subscribed to the Google Groups "Cap'n Proto" group.
To unsubscribe from this group and stop receiving emails from it, send an email to capnproto+***@googlegroups.com.
Visit this group at https://groups.google.com/group/capnproto.
Kenton Varda
2017-02-15 22:59:32 UTC
Hi Vitaly,

For ABI compatibility, you'd need pycapnp built against exactly the same
version of Cap'n Proto which you're using elsewhere in the process. Ideally
both would link against the same libcapnp.so, although I *think* loading
two copies of the library should not create problems as long as they are
the same version. (This differs from libprotobuf, which definitely can't
handle being loaded multiple times in the same process.)

You may also need to make sure both copies are built with the same
compiler. We're aware of at least one ABI incompatibility issue between
Clang and GCC that affects Cap'n Proto.

Of course, if you can't make anything work, you can always fall back to
transferring byte buffers, at the expense of possibly needing to make a
copy to merge the sub-messages into one overall message.

-Kenton

vitaly numenta
2017-02-16 02:18:13 UTC
Unfortunately, we don't have any control over the toolchain, its version,
or the compilation/linking flags that pycapnp uses, since it's a
3rd-party project for us. We also don't have any control over which version
of capnproto a given version of pycapnp would use or which patches it might
apply to it, if needed. pycapnp evolves completely independently from our
software. Furthermore, pycapnp is built automatically upon installation
from PyPI, using whatever compiler type and version the user happens to
have. E.g., pip install pycapnp==0.5.8 automatically downloads and builds
pycapnp, including a build of the capnproto version that's specific to
that version of pycapnp.

Our own software is the nupic.bindings package distributed as binary wheels
for Windows, OS X, and Linux. The Linux wheel in particular is a "manylinux"
wheel per PEP 513, which is built by definition on an old CentOS system
using an old toolchain.

As you can see, there is absolutely no way to guarantee ABI compatibility
between the compiled version of capnproto in pycapnp and the one used by
the python extensions in nupic.bindings. It's well known that C++ is not a
robust interface, since - unlike C - it does not have a stable ABI. There
is variability introduced by the ABI version that a given toolchain supports
as well as by the type of toolchain (e.g., clang vs. g++). Compatibility is
also affected by build flags. Finally, the version of capnproto sources in
pycapnp is outside our control and could easily differ from the version of
capnproto compiled/linked into the nupic.bindings extension.
Scott Purdy
2017-02-16 20:51:55 UTC
Kenton, thanks for helping bring some clarity to this. It sounds like our
two options are:

1. Require pycapnp and our extensions to be compiled in the same
environment. We could potentially do this. We could make the install
process easy for end users by forking pycapnp and putting wheels up on PyPI
but we'd like to avoid that if possible.
2. Pass the byte buffers, incurring a memory copy for anything that we pass
across the boundary.

I'd like to explore #2 a bit more. Would this involve extracting the
segments from the pycapnp builder/reader, passing that to our extension,
and constructing a new builder/reader around the byte buffer? Or would we
have to construct a new message in the extension, pass the segments from
that back and find a way to copy that buffer into the pycapnp message
builder/reader?

We are also happy to put together a little demo project once we figure this
out so others that want to do something similar have a starting place.
Kenton Varda
2017-02-16 22:29:00 UTC
Post by Scott Purdy
Kenton, thanks for helping bring some clarity to this. It sounds like our
two options are:
1. Require pycapnp and our extensions to be compiled in the same
environment. We could potentially do this. We could make the install
process easy for end users by forking pycapnp and putting wheels up on PyPI
but we'd like to avoid that if possible.
I would argue that pycapnp should somehow export its version of libcapnp so
that other Python extensions that also use libcapnp are able to reuse the
same one. It makes sense for any Python extension that uses libcapnp.so to
declare a dependency on pycapnp, I would think.

But I have no idea what this looks like logistically.

Post by Scott Purdy
2. Pass the byte buffers, incurring a memory copy for anything that we pass
across the boundary.
I'd like to explore #2 a bit more. Would this involve extracting the
segments from the pycapnp builder/reader, passing that to our extension,
and constructing a new builder/reader around the byte buffer? Or would we
have to construct a new message in the extension, pass the segments from
that back and find a way to copy that buffer into the pycapnp message
builder/reader?
There's no good way to share builders, since there would be no way for them
to synchronize memory allocation. So, once a buffer has been passed, it
needs to be read-only.

If you are trying to build a message in Python code but have one branch of
the message be built in C++ code, I think what you'll need to do is create
a brand new MessageBuilder in C++, build just the C++ branch of the message
there, and then pass this message to Python. In Python, you could read the
message with a MessageReader and then copy the contents into the branch of
the final message. This is where the copy is incurred -- when moving data
from one message into another message. Presumably you can transmit
individual messages between languages without any copies.
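
The ownership pattern described above can be sketched in plain Python. This
is only an illustration under stated assumptions: `extension_write_random`
and `ParentMessage` are hypothetical stand-ins for the C++ extension and for
the Python-side builder, not pycapnp or capnproto APIs, and
`memoryview.toreadonly()` needs Python 3.8 or later.

```python
def extension_write_random(seed):
    """Hypothetical stand-in for the C++ extension: it builds its own
    sub-message and hands back only the encoded bytes."""
    return seed.to_bytes(8, "little")


class ParentMessage(object):
    """Hypothetical stand-in for the Python-side builder of the final message."""

    def __init__(self):
        self.branches = {}

    def set_branch(self, name, encoded):
        # Treat the received buffer as read-only: only the side that built it
        # may ever have written to it.
        view = memoryview(encoded).toreadonly()
        # The one copy in this scheme: moving data from the sub-message
        # into the message that owns the final result.
        self.branches[name] = bytes(view)


parent = ParentMessage()
parent.set_branch("random", extension_write_random(42))
```

Each side allocates only inside its own message, so the two builds of
capnproto never touch each other's builder objects; only bytes cross the
boundary.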

-Kenton
vitaly numenta
2017-02-17 00:27:15 UTC
Hi Kenton - thank you for the insights. Regarding

In Python, you could read the message with a MessageReader and then copy
the contents into the branch of the final message.
So, the Python side would have a MessageBuilder. To read the branch message
with a MessageReader in python, we would use the from_segments method,
which would make use of the SegmentReader under the covers. However, I am a
bit stumped on how to do the last part, namely how to copy the contents of
the MessageReader into the branch of the final message. I couldn't find a
way to do that in either pycapnp or capnproto. How would you copy the
contents of the MessageReader into a branch of a MessageBuilder in capnproto
in C++?

Many thanks!

Vitaly
Kenton Varda
2017-02-27 23:23:40 UTC
Does a regular assignment work?

parent_struct_builder.some_field = some_struct_reader

In C++, we generate a "setFoo()" method for each struct-typed field that
takes a Reader as the parameter and makes a copy.

-Kenton
vitaly numenta
2017-03-01 21:14:29 UTC
Hi Kenton, thank you for the answer concerning Builder field assignment
from a Reader. I will check whether this works in pycapnp out of the box,
but knowing that it's already supported in C++ is very encouraging.
v***@numenta.com
2017-04-25 18:17:40 UTC
Hi Kenton, I thought I was almost there, but got stuck here:

One of the use cases involves a "Network" class in C++ python extension
code that needs to serialize several subordinate "region" instances, some
of which are implemented in C++ and some in Python. I am having a problem
with the latter. To demonstrate that specific problem, I defined the
following schemas:

struct NetworkProto {
  region @2 : RegionProto;
}

struct RegionProto {
  # This stores the data for the RegionImpl. This will be a PyRegionProto
  # instance if it is a PyRegion.
  regionImpl @0 :AnyPointer;
}

struct PyRegionProto {
  regionImpl @0 :AnyPointer;
}

As you recommended, we're passing byte buffers between the python and C++
layers. In this case, I have the C++ method _writePyRegion in the extension
that makes the call into the python layer and converts the bytes returned
by the python layer into `PyRegionProto::Reader`: `PyRegionProto::Reader
Network::_writePyRegion()`.

Then, the following higher-level method attempts to stuff the result of
`Network::_writePyRegion` into "NetworkProto::RegionProto::regionImpl",
but the compilation fails with "error: no member named 'setRegionImpl'
in 'RegionProto::Builder'":

void Network::write(NetworkProto::Builder& proto) const
{
  // Serialize the python region
  auto regionProto = proto.initRegion();
  regionProto.setRegionImpl(_writePyRegion()); // copy
}
'Kenton Varda' via Cap'n Proto
2017-04-25 18:28:30 UTC
Hi,

Since regionImpl is an AnyPointer, it doesn't have a direct setter.
Instead, do:

regionProto.getRegionImpl().setAs<PyRegionProto>(_writePyRegion());

-Kenton
v***@numenta.com
2017-04-25 18:46:35 UTC
And we have compilation with
`regionProto.getRegionImpl().setAs<PyRegionProto>(_writePyRegion());` !

Many thanks Kenton!
v***@numenta.com
2017-04-26 21:35:01 UTC
Hi Kenton, I have good news - my basic prototype of serializing across
C++/Python boundaries (in both directions) via capnp byte buffer passing is
working. I am shifting to optimizing memory utilization. In NuPIC, our
core machine learning algorithm objects may get huge (GBs in size), and
we run many of them on the same machine, serializing periodically. So,
memory utilization is critical for minimizing compute resource cost.

Presently, I am focusing on the C++ extension => python deserialization
control flow. In this scenario, the C++ python extension layer has a
message reader that contains a python object encoding. So, we need to
extract the byte buffer representing the python-native object in the C++
code in order to pass it to Python layer. This is what the relevant code in
C++ looks like:

PyObject* Network::_readPyRegion(const std::string& moduleName,
                                 const std::string& className,
                                 const RegionProto::Reader& proto)
{
  // Extract data bytes from reader to pass to python layer
  capnp::MallocMessageBuilder builder;

  builder.setRoot(pyRegionImplProto); // copy

  auto array = capnp::messageToFlatArray(builder); // copy

  // Copy from array to PyObject so that we can pass it to the Python layer
  py::String pyRegionImplBytes((const char *)array.begin(),
                               sizeof(capnp::word)*array.size()); // copy

}

As you can see, this involves a lot of copies of potentially huge amounts
of data. The python layer will then reconstruct a reader from those bytes
using pycapnp (yet another copy).

Ideally, I would like to extract the data segment(s) directly from
RegionProto::Reader, but that doesn't appear to be supported. I think that
we need to find/create some way to handle this efficiently in order to
support serialization/deserialization across C++/Python boundaries.

Thank you,
Vitaly
v***@numenta.com
2017-04-26 21:40:08 UTC
Here is a more complete C++ code snippet for my prior post:

PyObject* Network::_readPyRegion(const std::string& moduleName,
                                 const std::string& className,
                                 const RegionProto::Reader& proto)
{
  capnp::AnyPointer::Reader implProto = proto.getRegionImpl();

  PyRegionProto::Reader pyRegionImplProto =
      implProto.getAs<PyRegionProto>(); // no copy here, right?

  // Extract data bytes from reader to pass to python layer

  capnp::MallocMessageBuilder builder;
  builder.setRoot(pyRegionImplProto); // copy
  auto array = capnp::messageToFlatArray(builder); // copy
  // Copy from array to PyObject so that we can pass it to the Python layer
  py::String pyRegionImplBytes((const char *)array.begin(),
                               sizeof(capnp::word)*array.size()); // copy

}
'Kenton Varda' via Cap'n Proto
2017-04-26 22:44:16 UTC
Hi,

I think what you want here is for pycapnp to be extended with some API that
other Python extensions can use to interact with it in order to wrap and
unwrap builders. pycapnp builders are actually wrapping a
capnp::DynamicStruct::Builder under the hood, which is easy to cast back
and forth to your native builder type. You just need pycapnp to give you
access somehow.

I unfortunately do not know very much about how pycapnp and cython work, so
I'm not sure I can help. This may be a question for Jason Paryani.

By the way, if you guys are in the Bay Area, you should come to our Cap'n
Proto 0.6 release party on May 18 at Cloudflare:
https://www.meetup.com/Sandstorm-SF-Bay-Area/events/239341254/

-Kenton
vitaly numenta
2017-05-08 23:38:16 UTC
Post by Kenton Varda
pycapnp builders are actually wrapping a capnp::DynamicStruct::Builder
under the hood, which is easy to cast back and forth to your native builder
type. You just need pycapnp to give you access somehow.
Dear Kenton, regarding the above: we're working with your earlier
suggestion to pass byte buffers between Python and our C++ extension.
We believe this results in a more robust and portable implementation,
because we have no control over the user's pycapnp installation: which
version of pycapnp they use, which version of capnproto it bundles, and
which compiler toolchain and build flags produced its capnproto .so on the
user's machine. Any of these may differ from the build of capnproto in our
own binary wheel containing our Python extension.

To this end, we often need to convert between capnproto readers/builders
and flat array encodings (from messageToFlatArray) encapsulated as a Python
byte string. Since our machine learning models may be huge (GBs), the
multiple levels of copying are prohibitively expensive in memory (and
possibly in time), so it's important to eliminate as many levels of copying
as possible. Presently, pycapnp only exposes `to_bytes`, a method that
extracts data bytes from a builder via `capnp::messageToFlatArray` and then
copies them to a Python byte string. Unfortunately, capnproto doesn't
provide `capnp::messageToFlatArray` for readers, so when a reader is
involved, yet another copy is needed to convert the reader to a builder
before applying `capnp::messageToFlatArray`.

I believe that the problem is not unique to our extension, and anyone
attempting to implement this type of binding would run into this issue,
especially if they are cognizant of the memory and performance
implications.

Ideally, I think it would be great to be able to use something like
`capnp::messageToFlatArray` on readers as well as builders and also have it
copy the output efficiently to a user-provided byte-aligned buffer instead
of returning `kj::Array<capnp::word>`. This way, several levels of copying
would be eliminated, and instantaneous memory utilization would be cut
several-fold.
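
To make the request above concrete, here is a minimal Python sketch of flattening message segments into a caller-provided buffer with a single copy. It is not pycapnp or libcapnp API: `flat_size` and `flatten_into` are hypothetical names, and the sketch assumes the segments are already available as bytes-like objects. The header layout follows Cap'n Proto's standard stream framing (segment count minus one, then each segment's size in words, padded to a word boundary).

```python
import struct

WORD = 8  # a Cap'n Proto word is 8 bytes


def flat_size(segments):
    """Bytes needed for the framing header plus all segments."""
    header = 4 + 4 * len(segments)
    header += -header % WORD  # pad the header to a word boundary
    return header + sum(len(seg) for seg in segments)


def flatten_into(dest, segments):
    """Write header and segments into dest; return the bytes written.

    dest is any writable buffer (hypothetical helper, one copy per segment).
    """
    out = memoryview(dest)
    if len(out) < flat_size(segments):
        raise ValueError("destination buffer too small")
    # Header: (segment count - 1), then each segment size in words.
    struct.pack_into("<I", out, 0, len(segments) - 1)
    offset = 4
    for seg in segments:
        if len(seg) % WORD:
            raise ValueError("segment is not a whole number of words")
        struct.pack_into("<I", out, offset, len(seg) // WORD)
        offset += 4
    pad = -offset % WORD
    if pad:
        out[offset:offset + pad] = bytes(pad)
        offset += pad
    # Body: each segment is copied exactly once, straight into dest.
    for seg in segments:
        out[offset:offset + len(seg)] = seg
        offset += len(seg)
    return offset
```

The caller sizes the destination once with `flat_size` (for example `bytearray(flat_size(segments))`), so no intermediate array is allocated between the segments and the final buffer.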
Kenton Varda
2017-06-04 22:27:53 UTC
Permalink
Hi Vitaly,

You can direct Cap'n Proto to write bytes to an arbitrary target by
creating a custom subclass of kj::OutputStream which does whatever you
need, then pass that to capnp::writeMessage().

You can also use MessageBuilder::getSegmentsForOutput() to get direct
pointers to the message content without any copies. You can construct a
SegmentArrayMessageReader from these segments elsewhere to read them.

It sounds like the limitations here are on the Python side, which I don't
know very much about.

-Kenton
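
As an illustration of the receiving side of the suggestion above, the following is a minimal Python sketch that splits a framed message into per-segment views without copying the payload. It assumes the bytes use Cap'n Proto's standard stream framing; `split_segments` is a hypothetical helper, not part of pycapnp or libcapnp.

```python
import struct

WORD = 8  # a Cap'n Proto word is 8 bytes


def split_segments(buf):
    """Return one zero-copy memoryview per segment of a framed message."""
    view = memoryview(buf)
    # First 32-bit value is (segment count - 1), little-endian.
    (count_minus_one,) = struct.unpack_from("<I", view, 0)
    count = count_minus_one + 1
    # Then one 32-bit size per segment, measured in words.
    sizes = struct.unpack_from("<%dI" % count, view, 4)
    offset = 4 + 4 * count
    offset += -offset % WORD  # header is padded to a word boundary
    segments = []
    for words in sizes:
        end = offset + words * WORD
        if end > len(view):
            raise ValueError("buffer is shorter than its segment table claims")
        segments.append(view[offset:end])  # slicing a memoryview does not copy
        offset = end
    return segments
```

Each returned view keeps the original buffer alive and shares its memory, so the views could be handed to whatever reader constructor the receiving side exposes without an intermediate copy.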

Scott Purdy
2017-02-17 00:30:50 UTC
Permalink
Post by Scott Purdy
Kenton, thanks for helping bring some clarity to this. It sounds like our
1. Require pycapnp and our extensions to be compiled in the same
environment. We could potentially do this. We could make the install
process easy for end users by forking pycapnp and putting wheels up on PyPI
but we'd like to avoid that if possible.
Post by Kenton Varda
I would argue that pycapnp should somehow export its version of libcapnp
so that other Python extensions that also use libcapnp are able to reuse
the same one. It makes sense for any Python extension that uses libcapnp.so
to declare a dependency on pycapnp, I would think.
But I have no idea what this looks like logistically.

I think it would be great to have pycapnp export a pure-C interface. Numpy,
for instance, has a Python function call that returns the paths needed to
build extensions against numpy. But I'm pretty sure this has to be pure C
to avoid ABI issues if you don't want to enforce that everything is built
with the exact same toolchain. Do you think that is possible with capnp? My
understanding was that a C interface wasn't on the roadmap, but perhaps a
more limited interface for this specific use case wouldn't be quite as much
work.
Post by Scott Purdy
2. Pass the byte buffers, incurring a memory copy for anything that we
pass across the boundary.
I'd like to explore #2 a bit more. Would this involve extracting the
segments from the pycapnp builder/reader, passing that to our extension,
and constructing a new builder/reader around the byte buffer? Or would we
have to construct a new message in the extension, pass the segments from
that back and find a way to copy that buffer into the pycapnp message
builder/reader?
Post by Kenton Varda
There's no good way to share builders, since there would be no way for
them to synchronize memory allocation. So, once a buffer has been passed,
it needs to be read-only.
If you are trying to build a message in Python code but have one branch of
the message be built in C++ code, I think what you'll need to do is create
a brand new MessageBuilder in C++, build just the C++ branch of the message
there, and then pass this message to Python. In Python, you could read the
message with a MessageReader and then copy the contents into the branch of
the final message. This is where the copy is incurred -- when moving data
from one message into another message. Presumably you can transmit
individual messages between languages without any copies.
-Kenton
Post by Scott Purdy
We are also happy to put together a little demo project once we figure
this out so others that want to do something similar have a starting place.
Post by Kenton Varda
Hi Vitaly,
For ABI compatibility, you'd need pycapnp built against exactly the same
version of Cap'n Proto which you're using elsewhere in the process. Ideally
both would link against the same libcapnp.so, although I *think* loading
two copies of the library should not create problems as long as they are
the same version. (This differs from libprotobuf, which definitely can't
handle being loaded multiple times in the same process.)
You may also need to make sure both copies are built with the same
compiler. We're aware of at least one ABI incompatibility issue between
Clang and GCC that affects Cap'n Proto.
Of course, if you can't make anything work, you can always fall back to
transferring byte buffers, at the expense of possibly needing to make a
copy to merge the sub-messages into one overall message.
-Kenton
Hedge Hog
2017-05-04 00:00:15 UTC
Permalink
Hi,
I'm contemplating working on the Ruby binding. It seems reasonable to
anticipate that I or others will strike this same issue. Some further
questions below...
Post by Kenton Varda
Post by Scott Purdy
Kenton, thanks for helping bring some clarity to this. It sounds like our
1. Require pycapnp and our extensions to be compiled in the same
environment. We could potentially do this. We could make the install
process easy for end users by forking pycapnp and putting wheels up on PyPI
but we'd like to avoid that if possible.
I would argue that pycapnp should somehow export its version of libcapnp
so that other Python extensions that also use libcapnp are able to reuse
the same one. It makes sense for any Python extension that uses libcapnp.so
to declare a dependency on pycapnp, I would think.
I'm pretty sure I don't understand this correctly ;)

Is it correct that the issue only applies to CP's struct types (the case
cited in the OP)?
So when using all the other CP types we're good to go across different
environments?
I recall from the distant past some sensitivity issues around ABI
compatibility and `enum` types.
Now I'm not sure if the enum in CP's language maps that closely to the
compiler's `enum`, and if they too will expose the issue raised here.

I know it is a lot to ask, but could the doc here [1] be updated to warn
users of these issues for each of CP's types?

Is guidance to users as simple as 'use only the built in types in your
messages to minimise ABI compatibility risks/issues'?
i.e. are `List`, `Data` and `Text` subject to this same issue?

[1]: https://capnproto.org/language.html#interfaces

Best wishes
Kenton Varda
2017-05-04 05:21:02 UTC
Permalink
Hi,

I'm not sure I understand your message.

The Cap'n Proto encoding is binary-compatible across all implementations
(it wouldn't be a very good serialization format otherwise).

The ABI issue we're discussing here is that of the libcapnp library -- that
is, the C++ interfaces. pycapnp is implemented as a wrapper around
libcapnp. Vitaly was discussing a case where there is a second Python
extension loaded into the same program which *also* uses libcapnp and
wishes to interact with pycapnp as well. Hence they would be passing C++
objects (not just serialized messages) back and forth, which requires C++
ABI compatibility (not just binary message encoding compatibility).

-Kenton
Hedge Hog
2017-05-04 08:12:04 UTC
Permalink
Thanks, you're right; I had misunderstood where the issue was.
Best wishes.
Hedge
Post by Kenton Varda
Hi,
I'm not sure I understand your message.
The Cap'n Proto encoding is binary-compatible across all implementations (it
wouldn't be a very good serialization format otherwise).
The ABI issue we're discussing here is that of the libcapnp library -- that
is, the C++ interfaces. pycapnp is implemented as a wrapper around libcapnp.
Vitaly was discussing a case where there is a second Python extension loaded
into the same program which *also* uses libcapnp and wishes to interact with
pycapnp as well. Hence they would be passing C++ objects (not just
serialized messages) back and forth, which requires C++ ABI compatibility
(not just binary message encoding compatibility).
-Kenton
Post by Hedge Hog
Hi,
I'm contemplating working on the Ruby binding. It seems reasonable to
anticipate that I or others will strike this same issue. Some further
questions below...
Post by Kenton Varda
Post by Scott Purdy
Kenton, thanks for helping bring some clarity to this. It sounds like
1. Require pycapnp and our extensions to be compiled in the same
environment. We could potentially do this. We could make the install process
easy for end users by forking pycapnp and putting wheels up on PyPI but we'd
like to avoid that if possible.
I would argue that pycapnp should somehow export its version of libcapnp
so that other Python extensions that also use libcapnp are able to reuse the
same one. It makes sense for any Python extension that uses libcapnp.so to
declare a dependency on pycapnp, I would think.
I'm pretty sure I don't understand this correctly ;)
Is it correct that issue only applies to CP's struct types (the case cited
in the OP)?
So when using all the other CP types we're good to go across different
environments?
I recall from the distant past some sensitivity issues around ABI
compatibility and `enum` types.
Now I'm not sure if the enum in CP's language maps that closely to the
compiler's `enum`, and if they too will expose the issue raised here.
I know it is a lot to ask, but could the doc here [1] be updated to warn
users of these issues for each of CP's types?
Is guidance to users as simple as 'use only the built in types in your
messages to minimise ABI compatibility risks/issues'?
i.e. are `List`, `Data` and `Text` subject to this same issue?
[1]: https://capnproto.org/language.html#interfaces
Best wishes
Post by Kenton Varda
But I have no idea what this looks like logistically.
Post by Scott Purdy
2. Pass the byte buffers, incurring a memory copy for anything that we
pass across the boundary.
I'd like to explore #2 a bit more. Would this involve extracting the
segments from the pycapnp builder/reader, passing that to our extension, and
constructing a new builder/reader around the byte buffer? Or would we have
to construct a new message in the extension, pass the segments from that
back and find a way to copy that buffer into the pycapnp message
builder/reader?
There's no good way to share builders, since there would be no way for
them to synchronize memory allocation. So, once a buffer has been passed, it
needs to be read-only.
If you are trying to build a message in Python code but have one branch
of the message be built in C++ code, I think what you'll need to do is
create a brand new MessageBuilder in C++, build just the C++ branch of the
message there, and then pass this message to Python. In Python, you could
read the message with a MessageReader and then copy the contents into the
branch of the final message. This is where the copy is incurred -- when
moving data from one message into another message. Presumably you can
transmit individual messages between languages without any copies.
-Kenton
Post by Scott Purdy
We are also happy to put together a little demo project once we figure
this out so others that want to do something similar have a starting place.
Post by Kenton Varda
Hi Vitaly,
For ABI compatibility, you'd need pycapnp built against exactly the
same version of Cap'n Proto which you're using elsewhere in the process.
Ideally both would link against the same libcapnp.so, although I *think*
loading two copies of the library should not create problems as long as they
are the same version. (This differs from libprotobuf, which definitely can't
handle being loaded multiple times in the same process.)
You may also need to make sure both copies are built with the same
compiler. We're aware of at least one ABI incompatibility issue between
Clang and GCC that affects Cap'n Proto.
Of course, if you can't make anything work, you can always fall back to
transferring byte buffers, at the expense of possibly needing to make a copy
to merge the sub-messages into one overall message.
-Kenton
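One way to check whether more than one copy of the library has been loaded into the process is to look at which shared objects are mapped. The sketch below is Linux-only and assumes the copies are separate files whose names contain `capnp`. A copy linked statically into an extension under another name will not show up this way. The helper names are hypothetical.

```python
import os


def capnp_objects_from_maps(maps_text, needle="capnp"):
    """Return the distinct mapped file paths whose basename contains `needle`.

    `maps_text` is the content of /proc/<pid>/maps; each line ends with the
    path of the mapped file, if there is one.
    """
    found = []
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping with no backing file
        path = fields[5].strip()
        if needle in os.path.basename(path) and path not in found:
            found.append(path)
    return found


def loaded_capnp_objects():
    """Inspect the current process; returns [] where /proc is unavailable."""
    try:
        with open("/proc/self/maps") as f:
            return capnp_objects_from_maps(f.read())
    except OSError:
        return []
```

Calling `loaded_capnp_objects()` after importing both pycapnp and the extension lists the candidate files. If more than one distinct library appears, that is a hint to compare their versions and the compilers they were built with.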
On Tue, Feb 14, 2017 at 4:46 PM, vitaly numenta
Post by vitaly numenta
I am experiencing binary compatibility issues trying to get pycapnp
serialization/deserialization working with C extensions. There appear to be
ABI compatibility issues when passing C++ structs compiled in pycapnp into
our C extensions that are compiled in a different environment.
When serializing an instance of a class that's implemented in NuPIC,
we create a message builder via pycapnp and pass it to the corresponding
instance's write method, which in turn invokes write methods of its own
contained members. This works fine for members whose classes are implemented
in Python, but doesn't always work for those implemented in the
nupic.bindings extension due to ABI issues.
For example, when serializing the TemporalMemory class, we might employ the
following sequence:

from nupic.proto import TemporalMemoryProto_capnp

builder = TemporalMemoryProto_capnp.TemporalMemoryProto.new_message()

temporal_memory.write(builder)

Inside TemporalMemory.write(builder), we have something along these lines:

class TemporalMemory(object):
    def write(self, builder):
        builder.columnDimensions = list(self.columnDimensions)
        self.connections.write(builder.connections)  # pure python
        self._random.write(builder.random)  # C++ Random class from extension

The Random class that's implemented inside the nupic.bindings
extension needs to rely on our own build of capnproto that's linked into the
extension, but this doesn't seem to be compatible with the object
constructed in pycapnp.
We learned the hard way, after much trial and error, that we can't
simply pass the underlying message builders that were instantiated by
pycapnp's capnp.so module to our own build of capnproto contained in the
nupic.bindings extension. This was particularly evident when working on the
manylinux wheel for nupic.bindings, which needs to be compiled using the
toolchain and C/C++ runtimes from CentOS-6. This resulted in ABI
incompatibilities when the capnproto code compiled into the extension
attempts to operate on a message builder that was constructed by pycapnp's
build of capnp.so. The message builder instance created by pycapnp's
capnp.so appears corrupted when operated upon by the capnproto code linked
into the extension.
Is there any recommendation for handling this dual Python/C-extension
scenario that avoids the ABI compatibility problem with C++ objects?
--
You received this message because you are subscribed to the Google
Groups "Cap'n Proto" group.
To unsubscribe from this group and stop receiving emails from it, send
Visit this group at https://groups.google.com/group/capnproto.
--
πόλλ' οἶδ ἀλώπηξ, ἀλλ' ἐχῖνος ἓν μέγα
[The fox knows many things, but the hedgehog knows one big thing.]
Archilochus, Greek poet (c. 680 BC – c. 645 BC)
http://hedgehogshiatus.com