Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up Resolve.scala and related code to improve rigor and error reporting #2453

Merged
merged 60 commits into from
Apr 26, 2023

Conversation

lihaoyi
Copy link
Member

@lihaoyi lihaoyi commented Apr 23, 2023

This PR overhauls Resolve.scala and Resolve{Tasks,Metadata,Segments}.scala, which is some of the oldest and most poorly organized code in the repo. This gives us several things:

  1. Improving maintainability and understandability: I can explain how the resolve logic works now, where I couldn't before. The way it called back and forth between RunScripts and Resolve* sub-classes and the Resolve superclass was crazy

  2. Make the resolve behavior more predictable: the previous implementation had a lot of copy-pasty code, both within Resolve.scala and between Resolve{Tasks,Metadata,Segments}.scala, with subtle changes between them especially in error reporting. e.g. the code to handle _ and __ was copy-pasted 4-5 times. Now the implementations share much more code, without weird copy-pasty divergences, and should behave much more consistently

  3. Fix a lot of old bugs: e.g. when resolution failed, the error message sometimes (but not always) had the segments in the backwards order. Passing in a query to Mill with a parse error would sometimes (but not always) just truncate it at the parse error. Brace expansion would fail to expand empty sections e.g. {,foo,bar}qux => qux fooqux barqux. These are now fixed, along with many other issues mostly around edge cases or error reporting

  4. Greatly improve error reporting in the case of module initialization failure. These previously returned massive stacktraces from un-caught exceptions, and now return truncated short(ish) stacktraces with mostly the stack frames the user cares about it. The fact that the exceptions are caught also means we can properly unit test and integration test their contents

  5. Improve the laziness of the module initialization process. Previously, we would initialize many more modules than we needed to while performing target resolution. Now we resolve a much tighter set:

    1. When some portions of the module tree fail to initialize, you can still run queries and builds on the other parts
    2. If you have a very large build.sc that takes a non-trivial amount of time to initialize all the modules, now you only initialize what you need to initialize depending on what command you are running

Major Changes

  1. RunScript.scala, Resolve.scala, and Resolve{Tasks,Metadata,Segments}.scala have been mostly rewritten. Now there are three main components:

    1. ResolveCore.scala, that performs resolution of everything-that-can-be-resolved, wrapped in ResolveCore.Resolved objects
    2. ResolveNonEmpty.scala, which wraps ResolveCore.scala and standardizes the error reporting logic when resolution fails
    3. Resolve.scala, which contains ResolveTasks, ResolveMetadata, and ResolveSegments that take the ResolveCore.Resolved objects and perform the minimal processing necessary for each use case
  2. Improved the laziness of ResolveCore.scala.

    1. A lot of previous calls to millModuleDirectChildren that previously would force all children to be instantiated have been replaced by more targeted APIs, e.g. .millModuleDirectChildren.collect { case b: RootModule => b } is now .reflectNestedObjects[RootModule](), .millModuleDirectChildren.find(_.millOuterCtx.segment == Segment.Label(singleLabel)) is now .reflectNestedObjects0[Module](namePred: String => Boolean), etc.
    2. A bunch of the helpers in Module.scala were refactored to allow reflecting on members without invoking them, or specifying the name of the thing you want to reflect on to select it directly rather than having to list out and instantiate everything
    3. Cross modules also had some tweaks in order to return MapViews instead of Maps, to allow instantiation of the individual Cross.Module instances to happen separately and on-demand
  3. Greatly improved the strictness of error reporting of ResolveCore.scala. In particular, the places which can throw exceptions should be mostly wrapped in catchReflectException or catchNormalException and turned into Either[String, T]s

Minor Changes

  1. I broke out ExpandBraces.scala and ExpandBracesTests.scala, so they're no longer mixed in with ParseArgs.scala. Mill's brace expansion happens textually in a totally separate phase before argument parsing runs, following how the Bash shell performs brace expansion, and so neither query parsing nor brace expansion need to know about each other.

  2. Segments API was tweaked a bit to better suite the common usage patterns

Testing

  1. MainTests was renamed ResolveTests, and now exercises the ResolveMetadata code path in addition to ResolveTasks

  2. ResolveTests has a new section moduleInitError, which goes through a number of scenarios - three standalone modules, two dependent modules, partially and fully-broken cross module - and checks that module initialization failures in various places are properly caught and handled with truncated stack traces shown to users

  3. Added a simple integration test integration/failure/module-init-error to exercise the error handling during module initialization end-to-end and make sure that truncated stack traces are shown to users

  4. Added some more tests to ResolveTests.scala, covering the end-to-end handling of brace expansion, along with error reporting for double-nested modules (to reproduce a previous bug)

Notes

  1. The API to task resolution used by most code internal and external should be mostly the same, the only minor change is from RunScript.resolveTasks(mill.main.ResolveFoo, ...) to ResolveFoo.resolve(...) due to the removal of one layer of indirection

  2. I might have missed some spots where we need to catch exceptions to truncate/convert-to-either; the nature of exceptions means there's no way to statically guarantee you caught them all. But the tests should cover most use cases, and if we later notice some code paths still throwing uncaught exceptions it'll be easy to wrap them then

  3. The laziness of module initialization can probably be improved further, e.g. we don't need to initialize modules at all for mill resolve unless there are cross modules whose keys we need to compute. Leaving that for future work

@lihaoyi lihaoyi changed the title [WIP] Try to clean up Resolve.scala and related code [WIP] Clean up Resolve.scala and related code to improve rigor and error reporting Apr 25, 2023
@lihaoyi lihaoyi changed the title [WIP] Clean up Resolve.scala and related code to improve rigor and error reporting Clean up Resolve.scala and related code to improve rigor and error reporting Apr 26, 2023
@lihaoyi lihaoyi requested review from lefou and lolgab April 26, 2023 00:47
@lihaoyi lihaoyi marked this pull request as ready for review April 26, 2023 00:47
Copy link
Member

@lefou lefou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great change! 👏 I myself got lost multiple times in the resolver code. Looks good to me.

@lihaoyi lihaoyi merged commit 135d0fd into com-lihaoyi:main Apr 26, 2023
@lefou lefou added this to the 0.11.0-M9 milestone Apr 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants