Initial multivalue support#2675
Conversation
TODO: - Add feature flag - Extend interpreter to handle tuple values - Validation - Check whether implemented for Precompute - C and JS API support/testing - Test parsing and printing Follow ups: - Figure out how to lower in stack IR - Multivalue decoding support - Fuzzing
| TypeCounter(Counts& counts) : counts(counts) {} | ||
|
|
||
| void visitCallIndirect(CallIndirect* curr) { counts[curr->sig]++; } | ||
| void visitBlock(Block* curr) { |
There was a problem hiding this comment.
why not loops and ifs? and is this necessary for tuples?
There was a problem hiding this comment.
It is possibly also needed for loops and ifs. I'll have to do more extensive testing of those.
Is your second question referring to tuple-typed locals? Those don't need a signature in the type section for either binaries or the text format.
There was a problem hiding this comment.
I'd guess loops and ifs are needed - would be odd if there's an asymmetry there, but i might be missing something...
The second question was just that I wasn't sure if this PR was just tuples or not, but I guess not?
There was a problem hiding this comment.
Is the reason if and loop were omitted that because loops and ifs contain an internal block whose signature will be the same as that of loop or if's?
There was a problem hiding this comment.
Yes, although now that I think about it, there could be loops or ifs that contain single expressions that are not blocks.
This PR previously made a half-hearted attempt at supporting multivalue control flow, but now I'm going all in on it.
| return printName(name, o); | ||
| } | ||
|
|
||
| struct LocalType { |
There was a problem hiding this comment.
it's not obvious to me how printing a local's type differs from a type in general - perhaps add a comment?
There was a problem hiding this comment.
No commas in s-expressions. I'll add a comment.
There was a problem hiding this comment.
Is there a reason local types should be printed without commas, unlike other types? Not against it, but curious.
There was a problem hiding this comment.
The s-expression parser doesn't interpret commas, so after parsing we would end up with individual type strings with commas in them, which would be annoying. It's also better not to have commas for consistency with the rest of the text format.
There was a problem hiding this comment.
Sorry, then what is the 'default format' that's with commas and where is it used?
There was a problem hiding this comment.
The default format looks like (i32, i64) and it's used for printing types in debugging messages and similar things. We could change the default format to get rid of the comma, but I like having the comma.
src/passes/Precompute.cpp
Outdated
| // try to evaluate this into a const | ||
| Flow flow = precomputeExpression(curr); | ||
| if (flow.value.type.isVector()) { | ||
| if (flow.getValue().type.isVector()) { |
There was a problem hiding this comment.
Would a multivalue instruction trap in the call to precomputeExpression? If yes, maybe we can add a TODO to optimize those in this pass. If no, these getValue()s might fail?
There was a problem hiding this comment.
No, a multivalue instruction wouldn't necessarily trap. I'll add a check and a TODO for handling multivalues here in the future.
src/wasm/wasm-stack.cpp
Outdated
|
|
||
| namespace wasm { | ||
|
|
||
| void BinaryInstWriter::emitBlockType(Type type) { |
There was a problem hiding this comment.
the name suggests it's just for Blocks, how about emitControlFlowType? or maybe emitGeneralType? (but not sure that's a good name for "might be non-single")
There was a problem hiding this comment.
How about emitResultType?
There was a problem hiding this comment.
sgtm, better than the others for sure.
aheejin
left a comment
There was a problem hiding this comment.
Nice! Haven't read until the end yet, but some comments:
| void visitCallIndirect(CallIndirect* curr) { counts[curr->sig]++; } | ||
| void visitBlock(Block* curr) { | ||
| if (curr->type.isMulti()) { | ||
| counts[Signature(Type::none, curr->type)]++; |
There was a problem hiding this comment.
So basically this assumes currently blocks cannot take values yet and only return multivalues, right? It would be clearer to say that in the comment.
There was a problem hiding this comment.
I'll add a TODO for this.
| return printName(name, o); | ||
| } | ||
|
|
||
| struct LocalType { |
There was a problem hiding this comment.
Is there a reason local types should be printed without commas, unlike other types? Not against it, but curious.
| Name breakTo; // if non-null, a break is going on | ||
|
|
||
| Literal& getValue() { | ||
| assert(values.size() == 1); |
There was a problem hiding this comment.
In which case do we need multiple values in values? Why is it restricted to size 1? A TODO comment or something can be helpful.
There was a problem hiding this comment.
We need multiple values in the flow to handle the case where a tuple.make or a multivalue function call is interpreted. For most cases, the interpreter can assume due to validation that there is only a single value in the flow, so those cases can use this helper function instead of having to examine values directly. I'm open to other opinions on how this could be made cleaner.
There was a problem hiding this comment.
I see. How about getSingleValue? I've seen this naming several times in LLVM, e.g. getSinglePredecessor, that returns a meaningful value only when there is a single value. And also a comment will be helpful, saying that this function is a helper function that only works when there is a single value.
| TypeCounter(Counts& counts) : counts(counts) {} | ||
|
|
||
| void visitCallIndirect(CallIndirect* curr) { counts[curr->sig]++; } | ||
| void visitBlock(Block* curr) { |
There was a problem hiding this comment.
Is the reason if and loop were omitted that because loops and ifs contain an internal block whose signature will be the same as that of loop or if's?
| return printName(name, o); | ||
| } | ||
|
|
||
| struct LocalType { |
There was a problem hiding this comment.
Sorry, then what is the 'default format' that's with commas and where is it used?
| Name breakTo; // if non-null, a break is going on | ||
|
|
||
| Literal& getValue() { | ||
| assert(values.size() == 1); |
There was a problem hiding this comment.
I see. How about getSingleValue? I've seen this naming several times in LLVM, e.g. getSinglePredecessor, that returns a meaningful value only when there is a single value. And also a comment will be helpful, saying that this function is a helper function that only works when there is a single value.
|
|
||
| Type WasmBinaryBuilder::getType() { | ||
| int type = getS32LEB(); | ||
| if (type >= 0) { |
There was a problem hiding this comment.
Maybe then a comment saying all single type immediates are negative will be good enough.
src/wasm/wasm-s-parser.cpp
Outdated
| results.push_back(parseResults(curr)); | ||
| for (auto result : parseResults(curr)) { | ||
| results.push_back(result); | ||
| } |
There was a problem hiding this comment.
I wonder why we need an extra inner for loop here? I think this is the same situation as params, which also can have multiple types, and parseParamOfLocal call above does not need one.
There was a problem hiding this comment.
You're right, I can use insert here as well 👍
|
|
||
| var module = new binaryen.Module(); | ||
| module.setFeatures(binaryen.Features.ExceptionHandling); | ||
| module.setFeatures(binaryen.Features.ExceptionHandling | binaryen.Features.Multivalue); |
There was a problem hiding this comment.
Why? If it's because the EH proposal depends on multivalue, that's the same for the reference types I think. This file only tests events (not EH instructions) so this uses neither though.
There was a problem hiding this comment.
I made it so that multivalue events specifically require multivalue to be enabled in addition to exceptions. This file does have a multivalue event type, even though it is not used.
| if (curr->ifFalse->type != Type::unreachable) { | ||
| shouldBeTrue( | ||
| curr->ifFalse->type.isSingle(), curr, "select value may not be a tuple"); | ||
| } |
There was a problem hiding this comment.
What are the complexities for multivalue select that's not the case for ifs? Then do we have a plan to support multivalue select later? Their encoding is different, i.e., a vector of types, but technically it support multivalue right? If we have a plan to support that how about adding a TODO comment?
There was a problem hiding this comment.
I do not plan to support tuples with the MVP select. (Although I think the new type-parameterized select will be able to handle them). The lowering would require scratch registers for the condition and for almost all elements of each tuple, so I think it's just not worth it.
| shouldBeTrue(!curr->sig.results.isMulti(), | ||
| curr->body, | ||
| "Multivalue function results (multivalue is not enabled)"); | ||
| } |
There was a problem hiding this comment.
This can be written without double negatives:
if (curr->sig.results.isMulti()) {
shouldBeTrue(getModule()->features.hasMultiValue(), ...);
}which seems to be the way other functions are written.
src/wasm/wasm-stack.cpp
Outdated
| void BinaryInstWriter::visitLocalSet(LocalSet* curr) { | ||
| o << int8_t(curr->isTee() ? BinaryConsts::LocalTee : BinaryConsts::LocalSet) | ||
| << U32LEB(mappedLocals[std::make_pair(curr->index, 0)]); | ||
| size_t numLocals = func->getLocalType(curr->index).size(); |
There was a problem hiding this comment.
Nit: How about numValues? This is one local and you used numValues in visitLocalGet above.
|
|
||
| class FeatureValidationTest(utils.BinaryenTestCase): | ||
| def check_feature(self, module, error, flag): | ||
| def check_feature(self, module, error, flag, const_flags=[]): |
There was a problem hiding this comment.
flags that are present whether or not the feature under test is turned on. Another name might be invariable_flags, but that's kind of long.
|
Oh btw does this support emission of stack IR? I don't see that part of the code, so.. (not optimization but just emission) |
It does not. I wanted to get basic binary emitting working first. I suspect that I will have to move the lowering logic for drops, local.get, etc. in order to get it working for Stack IR as well. |
src/wasm/wasm-s-parser.cpp
Outdated
| results.push_back(parseResults(*s[i++])); | ||
| for (auto result : parseResults(*s[i++])) { | ||
| results.push_back(result); | ||
| } |
There was a problem hiding this comment.
I guess we don't need a inner loop here either because parseResults returns a vector? (The loop above for newParams also can use insert I think)
There was a problem hiding this comment.
I'll remove this loop, but the loop above projects p.type before pushing, so it can't easily be removed.
|
|
||
| class FeatureValidationTest(utils.BinaryenTestCase): | ||
| def check_feature(self, module, error, flag): | ||
| def check_feature(self, module, error, flag, const_flags=[]): |
src/wasm/wasm-binary.cpp
Outdated
| BYN_TRACE("== popping unreachable from polymorphic stack" << std::endl); | ||
| return allocator.alloc<Unreachable>(); | ||
| } | ||
| assert(false); |
There was a problem hiding this comment.
old debugging code that needs to be removed?
Adds binary and s-expression parsing and printing for first-class tuple types for representing multivalue constructs. Tuples are created with the new
tuple.makepseudoinstruction and their elements can be extracted with the newtuple.extractpseudoinstruction. They can also be stored in locals, which are declared with a new nonstandard type syntax.Simple multivalue functions can be round-tripped between binary and text formats, although the number of locals increases on each trip. Future work includes Stack IR optimizations to reduce the number of locals emitted, more robust tests, and C/JS API inclusion.