-
Notifications
You must be signed in to change notification settings - Fork 213
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid pre-compressing points and remove the need for operator< #29
Conversation
Python bindings tests are passing on my machine |
} | ||
} | ||
// Pks are sorted by message then pk | ||
std::sort(begin(sortedKeys), end(sortedKeys), BLSUtil::BytesCompare80()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we using BytesCompare80 anymore? If not we can remove it from BLSUtil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still used in SortIntoVectors
} | ||
|
||
const uint8_t* BLSPublicKey::begin() const { | ||
std::vector<uint8_t> BLSPublicKey::Serialize() const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need a second serialize method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- for convenience (otherwise you'd have to do
std::vector<uint8_t> b; b.resize(size); pk.Serialize(b.data())
- as a replacement when begin/end/size was needed (which I removed).
src/bls.cpp
Outdated
bn_new(order); | ||
g2_get_ord(order); | ||
|
||
std::vector<uint8_t> pkBuffer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use a vector here instead of the array?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
array seems cleaner and more consistent with the rest of the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And other places where arrays are replaced by vectors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With arrays you mean dynamically allocated arrays as used before or std::array?
I prefer the use of std::vector<uint8_t> over new[] whenever possible. I actually prefer to not use new/new[] at all, simply because it is one of the main sources for memory related bugs. The memory leak that I fixed in fdf15f9 was such an example. With std::vector you've generally on the secure side of things and also get a lot of flexibility (e.g. everything from std can be used, or the < which makes sorting simpler). You also have better control about ownership of buffers/arrays, as you can always pass around ownership with std::move/std::swap.
About consistency: I noticed this as well and actually started a refactoring of all places where new[] was used...but stopped in the middle as it got too much. Would you consider using std::vector in the future (for all new things) and gradually move new[] to std::vector step by step?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I tend to use arrays as I'm more used to c, I prefer simpler language features, but this rationale makes sense. @fchirica what do you think about this approach?
Is there any performance overhead for using vectors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With modern C++ (starting with C++11), the overhead is neglectable. Only when iterating byte-by-byte, you'd be able to measure a performance impact. But usually, you'd use stuff from std:: to perform operations, which all have specialized implementations to work fast on primitive data types.
It's however very important to pass vectors by reference, as otherwise you end up copying them over and over. Returning vectors by-value depends on the place it is used. If the vector is from the stack, the C++ compiler will will use return value optimization (avoiding a copy). If you return a member variable, it will be copied.
If you put vectors in maps or other vectors, always use m.emplace(k, vec) and v2.emplace_back(vec). If the vector is not required after adding it to another container, use m.emplace(std::move(vec)) and v2.emplace_back(std::move(vec)), which will basically swap (very fast) the contents of the inserted vector into the new position in the target container. Generally, always prefer emplace over push/insert (it avoids unnecessary copies).
If using vectors in vectors, make sure to v.reserve(expectedSize) in advance, otherwise it might end up copying stuff unnecessarily.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@codablock do you have some backup for the negligible overhead of std::vector with C++11?
Another alternative is to use std::unique_ptr. This avoids the memory bugs and ensures that the size is set and fixed for all time. Thus making the code easier to understand than using std::vector (across the board) where the size could be set at any place or time. With std::vector it can be harder to know whether additional elements have been added or not later.
src/bls.cpp
Outdated
uint8_t *pkBuffer = new uint8_t[BLSPublicKey::PUBLIC_KEY_SIZE | ||
* (pubKeys.size())]; | ||
std::vector<std::vector<uint8_t>> const &serPubKeys, | ||
std::vector<size_t> const& sorted) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rename sorted to sortedIndeces
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
// Serializes ONLY the 96 byte public key. It does not serialize | ||
// the aggregation info. | ||
void Serialize(uint8_t* buffer) const; | ||
std::vector<uint8_t> Serialize() const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to keep both serialize, we should also do it for other classes like ExtendedPubKey, ChainCode, etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Other c++ crypto libraries I've seen, use pointers, for example https://github.com/weidai11/cryptopp. I'm open to switching to vectors, but I think this requires more thought and should be done in a different PR, which keeps everything consistent. @codablock could you use arrays for this PR instead? This bug fix and refactoring is still definitely useful. |
sortKey was never freed, using std::vector now. Also avoiding unnecessary copies.
Avoid unnecessary copies and serialization.
Avoid unnecessary copies and serialization
Rebased on master on added commits to revert back to uint8_t* |
* feat: added back in private keys from BIP32 Seed * feat: added in unit test
operator< was the only reason why BLSSignature and BLSPublicKey had too keep internal serialized data up-to-data. This required calling of CompressPoint even if the data was never actually needed.
This PR is a set of refactorings which in the end remove the need for operator< and then finally removes the operator and the data field.
It also includes a few other refactorings to reduce the amount of copying and serialization needed when sorting public keys and signatures. There is also a fix for a memory leak included.
I also removed the stl-like api (begin/end/size) as the classes were probably never intended to be used in a stl-compatible manner (and thus should not imply compatibility if there is none really). I had to change the python tests for this, but could not test these changes as tests currently fail on master as well.
The result of this PR is that copying and moving around signatures and public keys gets a lot faster. We rely on this to be fast internally as we use a lot of paralellism where ownership of objects is important.
42f4660 is a fix for relic, which I also create a PR for in relic.