Skip to content

Initial multivalue support#2675

Merged
tlively merged 17 commits intoWebAssembly:masterfrom
tlively:tuples
Mar 5, 2020
Merged

Initial multivalue support#2675
tlively merged 17 commits intoWebAssembly:masterfrom
tlively:tuples

Conversation

@tlively
Copy link
Member

@tlively tlively commented Feb 29, 2020

Adds binary and s-expression parsing and printing for first-class tuple types for representing multivalue constructs. Tuples are created with the new tuple.make pseudoinstruction and their elements can be extracted with the new tuple.extract pseudoinstruction. They can also be stored in locals, which are declared with a new nonstandard type syntax.

Simple multivalue functions can be round-tripped between binary and text formats, although the number of locals increases on each trip. Future work includes Stack IR optimizations to reduce the number of locals emitted, more robust tests, and C/JS API inclusion.

TODO:
 - Add feature flag
 - Extend interpreter to handle tuple values
 - Validation
 - Check whether implemented for Precompute
 - C and JS API support/testing
 - Test parsing and printing

Follow ups:
 - Figure out how to lower in stack IR
 - Multivalue decoding support
 - Fuzzing
@tlively tlively requested review from aheejin and kripken February 29, 2020 00:09
Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

TypeCounter(Counts& counts) : counts(counts) {}

void visitCallIndirect(CallIndirect* curr) { counts[curr->sig]++; }
void visitBlock(Block* curr) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not loops and ifs? and is this necessary for tuples?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possibly also needed for loops and ifs. I'll have to do more extensive testing of those.

Is your second question referring to tuple-typed locals? Those don't need a signature in the type section for either binaries or the text format.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd guess loops and ifs are needed - would be odd if there's an asymmetry there, but i might be missing something...

The second question was just that I wasn't sure if this PR was just tuples or not, but I guess not?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the reason if and loop were omitted that because loops and ifs contain an internal block whose signature will be the same as that of loop or if's?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, although now that I think about it, there could be loops or ifs that contain single expressions that are not blocks.

This PR previously made a half-hearted attempt at supporting multivalue control flow, but now I'm going all in on it.

return printName(name, o);
}

struct LocalType {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not obvious to me how printing a local's type differs from a type in general - perhaps add a comment?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No commas in s-expressions. I'll add a comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason local types should be printed without commas, unlike other types? Not against it, but curious.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The s-expression parser doesn't interpret commas, so after parsing we would end up with individual type strings with commas in them, which would be annoying. It's also better not to have commas for consistency with the rest of the text format.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, then what is the 'default format' that's with commas and where is it used?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default format looks like (i32, i64) and it's used for printing types in debugging messages and similar things. We could change the default format to get rid of the comma, but I like having the comma.

// try to evaluate this into a const
Flow flow = precomputeExpression(curr);
if (flow.value.type.isVector()) {
if (flow.getValue().type.isVector()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would a multivalue instruction trap in the call to precomputeExpression? If yes, maybe we can add a TODO to optimize those in this pass. If no, these getValue()s might fail?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, a multivalue instruction wouldn't necessarily trap. I'll add a check and a TODO for handling multivalues here in the future.


namespace wasm {

void BinaryInstWriter::emitBlockType(Type type) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the name suggests it's just for Blocks, how about emitControlFlowType? or maybe emitGeneralType? (but not sure that's a good name for "might be non-single")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about emitResultType?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm, better than the others for sure.

Copy link
Member

@aheejin aheejin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Haven't read until the end yet, but some comments:

void visitCallIndirect(CallIndirect* curr) { counts[curr->sig]++; }
void visitBlock(Block* curr) {
if (curr->type.isMulti()) {
counts[Signature(Type::none, curr->type)]++;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So basically this assumes currently blocks cannot take values yet and only return multivalues, right? It would be clearer to say that in the comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a TODO for this.

return printName(name, o);
}

struct LocalType {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason local types should be printed without commas, unlike other types? Not against it, but curious.

Name breakTo; // if non-null, a break is going on

Literal& getValue() {
assert(values.size() == 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which case do we need multiple values in values? Why is it restricted to size 1? A TODO comment or something can be helpful.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need multiple values in the flow to handle the case where a tuple.make or a multivalue function call is interpreted. For most cases, the interpreter can assume due to validation that there is only a single value in the flow, so those cases can use this helper function instead of having to examine values directly. I'm open to other opinions on how this could be made cleaner.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. How about getSingleValue? I've seen this naming several times in LLVM, e.g. getSinglePredecessor, that returns a meaningful value only when there is a single value. And also a comment will be helpful, saying that this function is a helper function that only works when there is a single value.

TypeCounter(Counts& counts) : counts(counts) {}

void visitCallIndirect(CallIndirect* curr) { counts[curr->sig]++; }
void visitBlock(Block* curr) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the reason if and loop were omitted that because loops and ifs contain an internal block whose signature will be the same as that of loop or if's?

return printName(name, o);
}

struct LocalType {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, then what is the 'default format' that's with commas and where is it used?

Name breakTo; // if non-null, a break is going on

Literal& getValue() {
assert(values.size() == 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. How about getSingleValue? I've seen this naming several times in LLVM, e.g. getSinglePredecessor, that returns a meaningful value only when there is a single value. And also a comment will be helpful, saying that this function is a helper function that only works when there is a single value.


Type WasmBinaryBuilder::getType() {
int type = getS32LEB();
if (type >= 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe then a comment saying all single type immediates are negative will be good enough.

results.push_back(parseResults(curr));
for (auto result : parseResults(curr)) {
results.push_back(result);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why we need an extra inner for loop here? I think this is the same situation as params, which also can have multiple types, and parseParamOfLocal call above does not need one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I can use insert here as well 👍


var module = new binaryen.Module();
module.setFeatures(binaryen.Features.ExceptionHandling);
module.setFeatures(binaryen.Features.ExceptionHandling | binaryen.Features.Multivalue);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? If it's because the EH proposal depends on multivalue, that's the same for the reference types I think. This file only tests events (not EH instructions) so this uses neither though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made it so that multivalue events specifically require multivalue to be enabled in addition to exceptions. This file does have a multivalue event type, even though it is not used.

if (curr->ifFalse->type != Type::unreachable) {
shouldBeTrue(
curr->ifFalse->type.isSingle(), curr, "select value may not be a tuple");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the complexities for multivalue select that's not the case for ifs? Then do we have a plan to support multivalue select later? Their encoding is different, i.e., a vector of types, but technically it support multivalue right? If we have a plan to support that how about adding a TODO comment?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not plan to support tuples with the MVP select. (Although I think the new type-parameterized select will be able to handle them). The lowering would require scratch registers for the condition and for almost all elements of each tuple, so I think it's just not worth it.

shouldBeTrue(!curr->sig.results.isMulti(),
curr->body,
"Multivalue function results (multivalue is not enabled)");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be written without double negatives:

if (curr->sig.results.isMulti()) {
  shouldBeTrue(getModule()->features.hasMultiValue(), ...);
}

which seems to be the way other functions are written.

void BinaryInstWriter::visitLocalSet(LocalSet* curr) {
o << int8_t(curr->isTee() ? BinaryConsts::LocalTee : BinaryConsts::LocalSet)
<< U32LEB(mappedLocals[std::make_pair(curr->index, 0)]);
size_t numLocals = func->getLocalType(curr->index).size();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: How about numValues? This is one local and you used numValues in visitLocalGet above.


class FeatureValidationTest(utils.BinaryenTestCase):
def check_feature(self, module, error, flag):
def check_feature(self, module, error, flag, const_flags=[]):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is const_flags?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flags that are present whether or not the feature under test is turned on. Another name might be invariable_flags, but that's kind of long.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good idea.

@aheejin
Copy link
Member

aheejin commented Mar 4, 2020

Oh btw does this support emission of stack IR? I don't see that part of the code, so.. (not optimization but just emission)

@tlively
Copy link
Member Author

tlively commented Mar 5, 2020

Oh btw does this support emission of stack IR? I don't see that part of the code, so.. (not optimization but just emission)

It does not. I wanted to get basic binary emitting working first. I suspect that I will have to move the lowering logic for drops, local.get, etc. in order to get it working for Stack IR as well.

results.push_back(parseResults(*s[i++]));
for (auto result : parseResults(*s[i++])) {
results.push_back(result);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we don't need a inner loop here either because parseResults returns a vector? (The loop above for newParams also can use insert I think)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove this loop, but the loop above projects p.type before pushing, so it can't easily be removed.


class FeatureValidationTest(utils.BinaryenTestCase):
def check_feature(self, module, error, flag):
def check_feature(self, module, error, flag, const_flags=[]):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good idea.

@tlively
Copy link
Member Author

tlively commented Mar 5, 2020

Thanks, @aheejin! @kripken, do you want to take another look before I merge this?

BYN_TRACE("== popping unreachable from polymorphic stack" << std::endl);
return allocator.alloc<Unreachable>();
}
assert(false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

old debugging code that needs to be removed?

@tlively tlively merged commit 9e3dbb1 into WebAssembly:master Mar 5, 2020
@tlively tlively deleted the tuples branch March 5, 2020 23:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants