Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decompilation of lifted operators #205

Merged
merged 57 commits into from
Sep 18, 2011
Merged

Conversation

pentp
Copy link
Contributor

@pentp pentp commented May 27, 2011

This decompiles the C# generated code for lifted operators back to the original short form.
This can result in a significant improvement to the readability of the resulting code in some more complex functions because C# generates 2 temporary locals for lifted operators that can now be inlined.

As a byproduct the following improvements are also included: more thorough boolean logic decompilation; various project/solution file improvements; Ldloca support for variable splitting; simplified shift operators (already in icsharpcode branch); significant performance improvement for stack analysis of variables to offset the added complexity from ldloca support; decompilation of compound assignment with overloaded operators; some type analysis refactoring.

pentp added 26 commits May 3, 2011 01:52
Conflicts:
	ICSharpCode.Decompiler/ICSharpCode.Decompiler.csproj
@pentp
Copy link
Contributor Author

pentp commented Jun 7, 2011

I added Ldloca opcode support for the variable splitting logic in ILAstBuilder so that the test cases pass again. C# debug builds apparently re-use generated variables as much as possible and variable inlining needs those variables split to independent temporary variables. Also started working on math operators, but that's still a work in progress.

@pentp
Copy link
Contributor Author

pentp commented Aug 14, 2011

I replaced the ParenthesisExpression with an annotation, as suggested and improved compound assignment decompilation and fixed a type analysis regression.

I've thoroughly compared the decompilation result of the baseline and this version on the same projects (2 LOB web apps + AvalonEdit, NRefactory, Decompiler and ILSpy) and didn't find any regressions so this pull request should be ready for general use.


I tried to follow a code formatting style consistent with the rest of the project but there are at least two different styles used in the ICSharpCode.Decompiler project so I used the most common one. Is there a general guideline somewhere for contributions to ILSpy?
Similarly to code formatting, the line ending convention used by code files also seems to be inconsistent - some files even have both unix and windows style line endings in the same file that makes editing them quite a pain in the ass (AstMethodBodyBuilder.cs for example).

// can also mean Equality, when used with object references
TypeReference arg1Type = byteCode.Arguments[0].InferredType;
if (arg1Type != null && !arg1Type.IsValueType) goto case ILCode.Ceq;
goto case ILCode.Cle;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only cgt.un can be used for inequality, cle.un is not affected.
See footnote 2 on table 4 "Binary Comparison or Branch Operations" in the ECMA-335:

cgt.un is allowed and verifiable on ObjectRefs (O). This is commonly used when comparing an
ObjectRef with null (there is no “compare-not-equal” instruction, which would otherwise be a
more obvious solution)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this because SimplifyLogicNot can change a logicnot(cgt.un) to cle.un, which in turn is needed to keep the number of different patterns recognized by SimplifyLiftedOperators somewhat reasonable. I could probably change SimplifyLogicNot to not touch cgt.un on reference types or transform it into an cne opcode instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that would be a possible solution.
The not-simplification logic needs to look at type information anyways, as currently we perform the transformation from !(a < b) to (a >= b) for all primitive types, but that is incorrect for floats/doubles if one or both arguments are NaN.

@dgrunwald
Copy link
Member

I just tried to decompile some BCL assemblies using your patch, and got an exception in System.Globalization.DateTimeFormatInfoScanner.ScanDateWord (mscorlib):

System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection.
at static void System.ThrowHelper.ThrowArgumentOutOfRangeException()
at VariableInfo System.Collections.Generic.List1[ICSharpCode.Decompiler.ILAst.ILAstBuilder+VariableInfo].get_Item(Int32 index) at void ICSharpCode.Decompiler.ILAst.ILAstBuilder.ConvertLocalVariables(List1 body) in c:\work\ILSpy\ICSharpCode.Decompiler\ILAst\ILAstBuilder.cs:line 588
at List1 ICSharpCode.Decompiler.ILAst.ILAstBuilder.StackAnalysis(MethodDefinition methodDef) in c:\work\ILSpy\ICSharpCode.Decompiler\ILAst\ILAstBuilder.cs:line 488 at List1 ICSharpCode.Decompiler.ILAst.ILAstBuilder.Build(MethodDefinition methodDef, Boolean optimize, DecompilerContext context) in c:\work\ILSpy\ICSharpCode.Decompiler\ILAst\ILAstBuilder.cs:line 237
at BlockStatement ICSharpCode.Decompiler.Ast.AstMethodBodyBuilder.CreateMethodBody(IEnumerable1 parameters, ConcurrentDictionary2 localVariables) in c:\work\ILSpy\ICSharpCode.Decompiler\Ast\AstMethodBodyBuilder.cs:line 98
...

Debug builds of ILSpy have command "File > DEBUG -- decompile all", which will decompile all currently loaded assemblies. Very useful for testing (and then looking at decompiled diffs). I'm running that command on all .NET 4.0 System.* assemblies.

@dgrunwald
Copy link
Member

This is our coding style guideline: http://sharpdevelop.net/TechNotes/SharpDevelopCodingStyle03.pdf
But we aren't religious about the coding style, so when some contributors sent us patches with the default VS coding style ('{' on separate line), we accepted them as-is.

On line endings: we use LF in the git repository, but CRLF on check-out using git's autocrlf feature. We just fixed a bug in SharpDevelop's git integration a few days ago that caused mixed line endings when reverting parts of a file using the editor's change margin. It's likely that some files in ILSpy were messed up by that bug.

dgrunwald added a commit that referenced this pull request Sep 18, 2011
This decompiles the C# generated code for lifted operators back to the original short form.
This can result in a significant improvement to the readability of the resulting code in some more complex functions because C# generates 2 temporary locals for lifted operators that can now be inlined.

As a byproduct the following improvements are also included: more thorough boolean logic decompilation; various project/solution file improvements; Ldloca support for variable splitting; simplified shift operators (already in icsharpcode branch); significant performance improvement for stack analysis of variables to offset the added complexity from ldloca support; decompilation of compound assignment with overloaded operators; some type analysis refactoring.
@dgrunwald dgrunwald merged commit 41e71ea into icsharpcode:master Sep 18, 2011
@dgrunwald
Copy link
Member

Finally merged into ILSpy master. :)

Thanks for your work!

@dsrbecky
Copy link
Member

I have a question about the Ldloca variable splitting. Is that a proper solution (always produces correct code) or just a hack that kind of works with most of the time?

@dgrunwald
Copy link
Member

I don't know; but I looked at the diffs decompiling ILSpy with the old version, and with this branch merged; and did not see a single change where the decompiled code was wrong.
I did see some regressions where the inlining now got a bit too aggressive (but still producing correct code), but those were quite rare, certainly less frequent than the cases where inlining wasn't aggressive enough in the old version.

I also looked at portions of the diffs of a couple BCLs assemblies and didn't see any newly introduced bugs there, either.
But of course just because it's correct for code produced by the C# compiler doesn't mean it's correct for any IL code.

@pentp
Copy link
Contributor Author

pentp commented Sep 19, 2011

The first solution I came up with was done in somewhat wrong places and looked like a hack. However the version that went in to the master branch should be quite robust. It basically treats every ldloca as a load+store with two exceptions - ldloca;initobj and ldloca on an uninitialized variable are considered just a store. Initobj only stores a value and the second case can be thought of as a load default + store and the default doesn't depend on the value of the variable (just like regular ldloc on an uninitialized variable). It's a conservative approach so it should produce correct code all the time.

This whole thing has been a great learning experience for me and most of the changes not directly related to lifted variables were done on an as-needed basis so I'm not sure if they were done in the most appropriate way.

@dsrbecky
Copy link
Member

What about something like this (pseudocode):

val = 1;
ptr = &val;

val = 2;
*ptr = 3;
print(val); // Should print 3

Wouldn't your algorithm separate the val into two variables and thus print 2?

@pentp
Copy link
Contributor Author

pentp commented Sep 19, 2011

That too aggressive inlining regression is probably caused by while (--pos >= 0 && inlining.InlineIfPossible(body, ref pos)) ; after lifted operator simplification. I think I tried it with a limitation of two passes but that didn't always work and since the current variant doesn't produce incorrect code I left it at that.

Another thing that wasn't really resolved is the case of SimplifyLogicNot/PushNegation on floats/doubles and reference types (cgt.un). I started looking into it but it turned out to be somewhat more complex than expected. Cgt.un on reference types should probably be converted to cne virtual opcode sometime after the first type analysis and cgt/cge/clt/cle should only be transformed if the type is known (both in SimplifyLogicNot and PushNegation).

@pentp
Copy link
Contributor Author

pentp commented Sep 19, 2011

Yeah, that doesn't quite work. Although the resulting code looks correct the variable splitting is too aggressive in that case. It only works at the moment because those variables aren't inlined :|

@dsrbecky
Copy link
Member

I am pretty sure that this problem can not be solved in general, but you are along the right path. Some specific cases can be treated as load or store depending on the following instruction. However, if there are any unknown ldloca instructions, then the variable simply can not be split. The first priority of ILSpy is to generate correct code.

Can you try to limit the algorithm only to the cases where it is guaranteed to work?

@ghost ghost assigned dsrbecky Oct 5, 2011
dgrunwald added a commit that referenced this pull request Dec 1, 2011
36c9cae Add ICompilation.Import() extension methods.
1467ce3 Fixed accessibility check for protected members in outer classes.
f410a2b Enum members are implicitly cast to the underlying type when used in an enum member initializer.
a9c743c Fixed type inference for "condition ? someEnum : 0"
42ce4ca Fixed type inference in foreach when the collection type does not implement IEnumerable.
45bcad4 Added getsubtype definitions helper method.
78ac5bb Fixed some code completion unit tests.
ff2a11b Worked on unit tests.
51986c4 Worked on code completion unit tests.
c453405 Updated code completion to the latest type system changes.
a564ebb Added easy to access enumerables for various member types.
a71670f Fixed some issues with code round-tripping; and added unit tests for some parser failures.
4db74d9 Fixed roundtripping of preprocessor directives.
bf88746 Added some failing parser tests for bugs discovered trying to roundtrip NRefactory itself.
a6433d4 Do not try to infer a type from the null literal.
b0b9942 Add ISolutionSnapshot for creating compilations for multiple projects from a single consistent snapshot.
7e95cb7 Add CSharpAstResolver.GetResolverStateBefore
4ccc313 Normalize newlines
9ffbdb8 Merge type system refactoring into NRefactory master.
4d4f1f4 Fixed various resolver bugs.
c4ce934 Fixed bug when resolving base constructor calls.
0b263b0 TypeSystemConvertVisitor: implemented ConvertInterfaceImplementation for methods and properties. Implemented 'goto case' support in control flow analysis.
f11eed9 Introduce a named unknown type (this allows TypeSystemAstBuilder to work better when there are resolve errors).
e2cb546 Adjust CodeDomConvertVisitor and DefiniteAssignmentAnalysis to new type system.
3d9b3ec Improved enum parsing.
e69e9f1 Fixed enum context.
bf3a1d7 Added gtk text editor lib.
9cc151d Started support for unclosed expression statements.
bcde6f1 Worked on enum context.
37798b0 Improved get/set keyword handling.
2ec341e Merged with mcs.
7e53805 Updated mcs/fixed some code completion cases.
73438b7 Make CSharpResolveVisitor internal and expose CSharpAstResolver instead.
9d7c018 Adjusted C# resolver to refactored type system.
c02e801 Introduce ResolvedUsingScope (serves as cache per using-scope; avoids resolving imported namespaces repeatedly).
5069b98 Fixed code completion bug.
3b6fda2 Added gtk demo & fixed code completion bug.
54851a7 C# Type System implementation
057c0be Fixed "partial" bug.
ba88599 Added a method to determine the current parameter index.
a701436 Changed parameter completion API slightly.
4a41972 Fixed code completion bug "Bug 1932 - [new resolver] fields don't show up unless prefixed with 'this.'".
2ea298c Fixed some null reference exceptions.
f479cb0 fixed bug in parameter parsing.
98d1826 Added error expression as ast node.
fa6ea12 Added error expression for the for construct.
817a2da Added error expression for invalid initializers.
4376927 Renamed file.
a4259ef * cs-tokenizer.cs: Fixed location bug.
ec82082 Handled pre processor directives as separate AST node.
11a9ce2 Added context action unit tests.
f9916d8 WIP: Type system refactoring.
2d70017 Fixed some parameter resolve result issues.
1bc609f Added pre processor "if"/"elif" contexts.
03ee478 Improved current member recognition.
c29b4ed Added missing follow up char.
0ab566c Fixed "partial" context.
a097af0 Fixed multiple line comment content.
8124eaa Fixed invalid expression statement.
a348bbd Merge branch 'master' of github.com:icsharpcode/NRefactory
41607d7 Added invalid expression statements to the ast.
d461987 Improved comment & string context recognition.
f631199 Type system refactoring: split unresolved/resolved type systems.
1ebf835 Fixed parameter completion issue.
af6ba7d Fixed unit test.
bf3fdb3 Added variable statement declaration tests.
4824080 Implemented indexer parameter data provider.
8c82864 Fixed field declaration context & "new" expression context.
86b3da2 Added object initializer tests & fixed them.
fb374aa Fixed unit test.
f1612d5 Added some more keyword tests & fixed some cases.
f5730d5 * CSharpCompletionEngine.cs: Fixed some bugs in global/type context.
85e1173 Fixed parser bug.
59ce505 Fixed last failing code completion unit test.
bc4ca21 * NRefactory.sln:
cfcaca8 Merged mcs.
b74cfbb Added switch context handling.
fa5e7e5 Fixed potential resolve issue. Note: That doesn't really fix the problem - only 90% of the cases. The partial class representation could need a change.
c498bc8 Fixed completion inside try ... catch bodies.
03407fd Fixed failing unit test.
2a795ae Fixed some unit tests.
8942cd5 Fixed failing unit test.
2cfd9b2 Added attribute context.
7a8e9d0 Merged with mcs master.
18e088e SpecializedMethod: Perform type substitution in the type parameter constraints.
69360a2 Implemented constraint inheritance.
57d55c6 Fixed resolving simple names within a generic class that refer to a non-generic inner class.
5d614c0 Fixed unit test.
98ad3ba Fixed unit test.
ffa2fc9 Fixed failing unit test.
42c0a2d Fixed unit test.
6d4a825 Fixed some unit tests.
a81096d Included interfaces in type parameter base types.
2bc6105 Fixed some completion unit tests. I'm not 100% sure if that's the best solution fixing this, but makes things easier for now.
fd91bdc Fixed resolving non-generic classes that are nested within generic classes.
f043e30 Fix ResolveVisitor.GetResolverStateBefore(): ensure that the resolver always registers the state before it caches a result.
ed86963 Failed assertions now fail unit tests.
81cac5e Fixed little code completion ast bug.
f3ea71a Fixed some completion tests.
7f697b3 Fixed resolver usage.
e96dbbd Added 'UnknownTypeResolveResult'. Makes it easier to implement the 'add missing namespace import' function.
0320a66 Added monodevelop code completion tests.
5ee9b73 Added missing unit test.
1c0ce1d Added default parameters for type parameter count.
f7b3094 Merged with md master.
7063203 FieldDeclaration/EventDeclaration/VariableDeclarationStatement now resolve to 'void'. Only the individual VariableInitializers will resolve to the field/event/variable.
800b951 Fixed bugs in ResolveVisitor:  - forgot scanning into ForEachStatement.InExpression when the variable type was not 'var'  - ProcessConversion() was called for Expression.Null  - made Resolve() internal because it hard to use correctly
b3d07d8 Ported over the monodevelop completion engine to nrefactory.
f46609b Worked on basic nrefactory completion infrastructure.
8389d7a Use OperatorResolveResult for assignments.
9ddf9bc Combine C#-specific UnaryOperatorResolveResult/BinaryOperatorResolveResult and ConditionalOperatorResolveResult classes into a single language-independent OperatorResolveResult class.
4bbcf2d Add "public ResolveResult Body { get; }" to LambdaResolveResult.
751b601 Shorten inner type names.
434ec17 Correctly set kind of compound types.
177fb85 Fix infinite recursion when resolving the base type of "class Test : Test.Base { public class Base {} }"
475f838 Make DefaultTypeDefinition.FullName cache thread-safe.
8c3899f Cached TypeDefiniton FullName. This is required for the navigate to feature which checks the full name as well as the name. Otherwise it would cause thousands of slow string concats. Since the type system should be treaded as read only namespaces & declaring type definitions can't change.
7b542f4 Enabled async tests.
e4c70d4 Added await/async support & updated mcs.
d91eb2c Print error information.
b7fcc55 Re-enable resolver unit tests that failed due to the parser returning incorrect positions.
25176ef Fixed IMember.IsOverridable to return true for abstract members.
e4850e9 Added async modifier.
0c3d5e0 Fix bug in ParameterListComparer: the method signatures "Method<T>(T a)" and "Method<S>(S b)" were considered unequal.
4d73e48 Add GetEffectiveBaseClass() and GetEffectiveInterfaceSet() to ITypeParameter, and fixed a bug in DefaultTypeParameter.IsReferenceType().
0c03236 Update AssemblyInfo for ICSharpCode.NRefactory.CSharp
813c8f5 Merge NRefactory 'b059dbcf413786069599d1686ac608150bd3f357' into SharpDevelop repository.
b059dbc Fixed NullReferenceException when resolving group join clause.
ad1e90b Merge pull request #205 - Decompilation of lifted operators
f0bfded C# AST: when setting a string property to null or an empty string, remove the corresponding identifier token.
9f988f6 Adjust ILSpy to NRefactory changes.
6994e19 Merge NRefactory '88ebd0b9596edab0e916ff28ae53f6febbdff032' to ILSpy.
5a2b4d2 fix out parameter output in CSharpToVBConverterVisitor
f085e9c fix AlwaysUseShortTypeNames omitting generic type arguments
02f1929 implement find references on local variables
8a598a7 fix CSharpBinding reference to NRefactory.CSharp; fix long type names bug in CSharpAmbience
02b1e15 Fix unit tests after moving CSharpAmbienceTests to different namespace.
e6d84ff moved Ambience to NRefactory
49c5c79 implement all different members in CSharpAmbience output
5beb702 implement CSharpAmbience for ITypeDefinition
9526969 Adjust SharpDevelop to NRefactory changes (move ICSharpCode.Editor -> ICSharpCode.NRefactory.Editor; put NR.C# in separate assembly)
2f4f82c Merge changes from NRefactory into SharpDevelop newNR branch.
10a6608 Merge branch 'master' of git://github.com/icsharpcode/ILSpy
472daa9 remove old files
588557e fix case sensitivity in IsKeyword check
3e37fb2 fix output of object initializers
7a9bb56 InsertParenthesesVisitor: insert parenthesis in "(new int[1])[0]"
ddc5d8a Merge branch 'master' of git://github.com/icsharpcode/ILSpy
1398d8a add missing space after Case keyword
a854421 corrected spelling of Overrides-modifier; convert virtual to Overridable and override to Overrides
5c10903 Merge branch 'master' of git://github.com/icsharpcode/ILSpy
ee384a2 PDB files for release mode. File alignments and base addresses improved.

git-subtree-dir: NRefactory
git-subtree-split: 36c9caec101ad5b2c47d11ac12bcd043d8d6e115
ashmind added a commit to ashmind/ILSpy that referenced this pull request Oct 14, 2017
…decompiling exception handlers that end at the very last instruction.
ashmind added a commit to ashmind/ILSpy that referenced this pull request Oct 14, 2017
…used by decompiling exception handlers that end at the very last instruction.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants