A couple of weeks ago I started prototyping a new code completion backend. The one available in Pharo 8.0 is good-enough to complete variables and messages to literal objects. However, the accuracy of the completion was pretty low because auto completion candidates were sorted alphabetically, and this was even producing completion pauses for large images. This code completion prototype engine, that I originally baptised as Complishon, was based on two ideas: generators and heuristics.
Generators are objects that provide a stream-like interface to consume values generated sequentially. Better with an example.
stream := Generator on: [ :generator |
generator yield: 1.
generator yield: 'hello'.
].
stream next.
>>> 1
stream next.
>>> 'hello'
One cool thing about generators is that the generator code will block until we call next
. This is particularly interesting when iterating several big collections: we only really iterate them if we need them. And they come by default in Pharo :).
Heuristics, on the other hand, are not funky Pharo objects that come in a library. Heuristics are objects that represent rules. In our case rules for auto completion. Here you have some example of heuristics:
- chances are I want to autocomplete first the methods defined in my package
- chances are that a variable called
anOrderedCollection
will have effectively anOrderedCollection
instance
And so on.
If we encode enough of these little rules on objects, we could have a smart enough auto-completion engine that does not require complex algorithms or machine learning models. Not saying that these may not be good, just saying that I wanted a quick solution to make our life easier.
This post talks about the architecture inside the completion engine. What parts are inside it and how they relate. Turns out this prototype grew up and is making it to the big leagues in Pharo 9.0, where it will be called HeuristicsCompletion.
An architecture for lazy code completion
The heuristics completion engine is done out of three main components:
- lazy fetchers implemented using generators,
- a completion object that works as a result-set
- and several completion heuristics that are chosen from the current AST node.
Fetchers
Fetchers are subclasses of CoFetcher
. They need to define the method #entriesDo: aBlock
iterating a collection. The fetcher framework will take care of creating a streaming API for it using generators. For example, a simple fetcher can be created and used with:
CoFetcher subclass: #MyFetcher instanceVariableNames: '' classVariableNames: '' package: 'Complishon-Core' MyFetcher >> entriesDo: aBlock [ 1 to: 10000000000 do: aBlock ] MyFetcher new next >>> 1.
Basic Fetchers
A simple generic fetcher works on a collection.
CoCollectionFetcher onCollection: #( a b b a c )
And an empty fetcher is useful to provide no results
CoEmptyFetcher new
Fetcher combinators
Fetchers can be combined by using simple combinators. A sequence of fetchers, created with the message #,
, fetches first the elements in the first fetcher, then in the second one. This also means that the second fetcher will never be executed until the first one is completely consumed, which is an interesting property if it is expensive to calculate.
CoInstanceVariableFetcher new, CoGlobalVariableFetcher new
A filter fetcher, created with the message #select:
, decorates another fetcher and filters it with a block.
CoInstanceVariableFetcher new select: [ :e | "a condition..." ]
A no-repetition fetcher, created with the message #withoutRepetition
, decorates another fetcher and filters those elements that were already returned previously by himself.
aFetcher withoutRepetition
The Juice: Fetchers for code completion
In addition, the engine provides fetchers to iterate the following:
CoInstanceVariableFetcher
: instance variables of a classCoClassVariableFetcher
: class variables of a classCoClassImplementedMessagesFetcher
: messages implemented by a classCoGlobalVariableFetcher
: global variables in an environmentCoMethodVariableFetcher
: variables accessible to an AST node (in order)CoPackageImplementedMessagesFetcher
: messages sent in a package
For code completion, another combinator shows useful to iterate a fetcher up a class hierarchy: CoHierarchyFetcher
. This hierarchy fetcher decorates a ClassBasedComplishonFetcher
(i.e., a CoClassImplementedMessagesFetcher
, a CoInstanceVariableFetcher
or a CoClassImplementedMessagesFetcher
), and can be created with the message #forHierarchy
.
Completion ResultSet
The CoResultSet
object stores a fetcher and lazily fetches and internally store objects on demand:
c := CoResultSet fetcher: (CoInstanceVariableFetcher new
completionClass: aClass;
yourself).
"The first time it will fetch 20 elements from the fetcher and store them"
c first: 20.
"The second time it will just return the already stored elements"
c first: 20.
"Now it will fetch some more objects to have its 25"
c first: 25.
The CoResultSet
object allows one to:
- explicit fetch using
fetch:
andfetchAll
- retrieve the results using
first
,first:
andresults
- filter those results using
filterByString:
- query it with
hasMoreElements
andnotEmpty
Heuristics
When the autocompletion is invoked, it calculates how to autocomplete given the AST node where the caret is in the text editor. The following piece of code shows examples of heuristics implemented in the prototype.
Heuristics for messages
If the AST node is a message whose receiver is self
, autocomplete all messages implemented in the hierarchy.
(CoClassImplementedMessagesFetcher new
completionClass: aClass;
forHierarchy) withoutRepetition
If the AST node is a message whose receiver is super
, autocomplete all messages implemented in the hierarchy starting from the superclass.
(CoClassImplementedMessagesFetcher new
completionClass: aClass superclass;
forHierarchy) withoutRepetition
If the AST node is a message whose receiver is a class name like OrderedCollection
, autocomplete all messages in the class-side hierarchy of that class.
(ClassImplementedMessagesComplishonFetcher new
completionClass: (completionContext environmentAt: aRBMessageNode receiver name) classSide;
forHierarchy) withoutRepetition
If the AST node is a message whose receiver is a variable that has type information in its name name, like anOrderedCollection
, autocomplete all messages in the instance-side hierarchy of guessed class. Then continue with normal completion. There are two cases: variables starting with a
such as aPoint
and variables starting with an
such as anASTCache
.
completionContext
environmentAt: aRBMessageNode receiver name allButFirst asSymbol
ifPresent: [ :class |
^ (ClassImplementedMessagesComplishonFetcher new
completionClass: class;
forHierarchy), PackageImplementedMessagesComplishonFetcher new ].
completionContext
environmentAt: (aRBMessageNode receiver name allButFirst: 2) asSymbol
ifPresent: [ :class |
^ (ClassImplementedMessagesComplishonFetcher new
completionClass: class;
forHierarchy), PackageImplementedMessagesComplishonFetcher new ]
If all the above fail, autocomplete all messages used in the current package. Chances are we are going to send them again.
^ CoPackageImplementedMessagesFetcher new
complishonPackage: aPackage;
yourself
Heuristics for Method signature
When autocompleting the name of a new method, chances are we want to override a method in the hierarchy, or to reimplement a method polymorphic with the ones existing in the current package.
self newSuperMessageInHierarchyFetcher,
(CoPackageImplementedMessagesFetcher new
complishonPackage: aPackage;
yourself)
withoutRepetition
Heuristics for variables
Variables accessed by an instance are first the ones in the method, then the ones in the hierarchy.
instanceAccessible := (CoMethodVariableFetcher new
complishonASTNode: anAST;
yourself),
(CoInstanceVariableFetcher new
completionClass: aClass)
forHierarchy ].
Then, variables accessed by an instance are also the ones in the class variables and globals.
globallyAccessible := (CoClassVariableFetcher new
completionClass: complishonContext complishonClass)
forHierarchy,
(CoGlobalVariableFetcher new
complishonEnvironment: anEnvironment;
yourself).
What’s next?
Try it in the lastest Pharo 9.0 dev build!
We would like to have new ideas for heuristics.
For example, I’ve started integrating it with a fast type inferencer that looks at initialize
methods, or with a fast type inferencer like roel typer which latest version for Pharo can be found on github.
Why not machine learning?
…
Statistical models?
…