Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid pre-compressing points and remove the need for operator< #29

Merged
merged 12 commits into from
Sep 24, 2018
Merged

Avoid pre-compressing points and remove the need for operator< #29

merged 12 commits into from
Sep 24, 2018

Conversation

codablock
Copy link
Contributor

operator< was the only reason why BLSSignature and BLSPublicKey had too keep internal serialized data up-to-data. This required calling of CompressPoint even if the data was never actually needed.

This PR is a set of refactorings which in the end remove the need for operator< and then finally removes the operator and the data field.

It also includes a few other refactorings to reduce the amount of copying and serialization needed when sorting public keys and signatures. There is also a fix for a memory leak included.

I also removed the stl-like api (begin/end/size) as the classes were probably never intended to be used in a stl-compatible manner (and thus should not imply compatibility if there is none really). I had to change the python tests for this, but could not test these changes as tests currently fail on master as well.

The result of this PR is that copying and moving around signatures and public keys gets a lot faster. We rely on this to be fast internally as we use a lot of paralellism where ownership of objects is important.

42f4660 is a fix for relic, which I also create a PR for in relic.

@mariano54
Copy link
Contributor

Python bindings tests are passing on my machine

}
}
// Pks are sorted by message then pk
std::sort(begin(sortedKeys), end(sortedKeys), BLSUtil::BytesCompare80());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we using BytesCompare80 anymore? If not we can remove it from BLSUtil

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still used in SortIntoVectors

}

const uint8_t* BLSPublicKey::begin() const {
std::vector<uint8_t> BLSPublicKey::Serialize() const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need a second serialize method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. for convenience (otherwise you'd have to do std::vector<uint8_t> b; b.resize(size); pk.Serialize(b.data())
  2. as a replacement when begin/end/size was needed (which I removed).

src/bls.cpp Outdated
bn_new(order);
g2_get_ord(order);

std::vector<uint8_t> pkBuffer;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use a vector here instead of the array?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

array seems cleaner and more consistent with the rest of the code

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And other places where arrays are replaced by vectors

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With arrays you mean dynamically allocated arrays as used before or std::array?

I prefer the use of std::vector<uint8_t> over new[] whenever possible. I actually prefer to not use new/new[] at all, simply because it is one of the main sources for memory related bugs. The memory leak that I fixed in fdf15f9 was such an example. With std::vector you've generally on the secure side of things and also get a lot of flexibility (e.g. everything from std can be used, or the < which makes sorting simpler). You also have better control about ownership of buffers/arrays, as you can always pass around ownership with std::move/std::swap.

About consistency: I noticed this as well and actually started a refactoring of all places where new[] was used...but stopped in the middle as it got too much. Would you consider using std::vector in the future (for all new things) and gradually move new[] to std::vector step by step?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I tend to use arrays as I'm more used to c, I prefer simpler language features, but this rationale makes sense. @fchirica what do you think about this approach?

Is there any performance overhead for using vectors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With modern C++ (starting with C++11), the overhead is neglectable. Only when iterating byte-by-byte, you'd be able to measure a performance impact. But usually, you'd use stuff from std:: to perform operations, which all have specialized implementations to work fast on primitive data types.

It's however very important to pass vectors by reference, as otherwise you end up copying them over and over. Returning vectors by-value depends on the place it is used. If the vector is from the stack, the C++ compiler will will use return value optimization (avoiding a copy). If you return a member variable, it will be copied.

If you put vectors in maps or other vectors, always use m.emplace(k, vec) and v2.emplace_back(vec). If the vector is not required after adding it to another container, use m.emplace(std::move(vec)) and v2.emplace_back(std::move(vec)), which will basically swap (very fast) the contents of the inserted vector into the new position in the target container. Generally, always prefer emplace over push/insert (it avoids unnecessary copies).

If using vectors in vectors, make sure to v.reserve(expectedSize) in advance, otherwise it might end up copying stuff unnecessarily.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@codablock do you have some backup for the negligible overhead of std::vector with C++11?
Another alternative is to use std::unique_ptr. This avoids the memory bugs and ensures that the size is set and fixed for all time. Thus making the code easier to understand than using std::vector (across the board) where the size could be set at any place or time. With std::vector it can be harder to know whether additional elements have been added or not later.

src/bls.cpp Outdated
uint8_t *pkBuffer = new uint8_t[BLSPublicKey::PUBLIC_KEY_SIZE
* (pubKeys.size())];
std::vector<std::vector<uint8_t>> const &serPubKeys,
std::vector<size_t> const& sorted) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rename sorted to sortedIndeces

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// Serializes ONLY the 96 byte public key. It does not serialize
// the aggregation info.
void Serialize(uint8_t* buffer) const;
std::vector<uint8_t> Serialize() const;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to keep both serialize, we should also do it for other classes like ExtendedPubKey, ChainCode, etc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@mariano54
Copy link
Contributor

Other c++ crypto libraries I've seen, use pointers, for example https://github.com/weidai11/cryptopp.

I'm open to switching to vectors, but I think this requires more thought and should be done in a different PR, which keeps everything consistent. @codablock could you use arrays for this PR instead? This bug fix and refactoring is still definitely useful.

@codablock
Copy link
Contributor Author

Rebased on master on added commits to revert back to uint8_t*

@mariano54 mariano54 merged commit 658b619 into Chia-Network:master Sep 24, 2018
@codablock codablock deleted the pr_noprecompress branch October 19, 2018 13:35
nmarley pushed a commit to nmarley/bls-signatures that referenced this pull request Jun 4, 2022
* feat: added back in private keys from BIP32 Seed

* feat: added in unit test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants