PyDev adventures: debugger

This is a walkthrough of the steps I'm taking to add support for launching and debugging a Python script in PyDev for VSCode (note that I'm writing as I'm learning).

The debug protocol is what VSCode uses to talk to debuggers and to handle launching in general. The naming may be a bit weird, as the same protocol is used for regular launches and for debugging: apparently the team did the debugging first and launching came as an afterthought, with a separate flag passed when launching the program to specify that no debugging should be done (and not the other way around, as I think would be more common).

There is an overview of the protocol at https://code.visualstudio.com/docs/extensionAPI/api-debugging and https://code.visualstudio.com/docs/extensionAPI/extension-points provides more information on what an extension must use to provide a debugger.

There's also a json schema which specifies the format of the messages sent back and forth in the debugger at https://raw.githubusercontent.com/Microsoft/vscode-debugadapter-node/master/debugProtocol.json.

But, after reading all that, many things were still cloudy in my head regarding how to actually go about it and what should be done concretely to implement a debugger in VSCode.

So, my approach is getting the debugProtocol.json, converting it to a structure with Python classes (so that each message that can be sent has a Python representation) and playing a bit doing a debugger stub, just to exercise a dummy debugger talking to VSCode (but without actually doing anything).
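To give an idea of what "a Python representation for each message" could look like, here's a small hand-written sketch. It is not the generated code (that is produced from the schema); the class names mirror the schema, but the `from_dict`/`to_dict` helpers are just assumptions for illustration, and it assumes Python 3.7+ for dataclasses.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Request:
    """Generic request as sent by the client (VSCode)."""
    seq: int
    command: str
    arguments: dict = field(default_factory=dict)
    type: str = "request"

    @classmethod
    def from_dict(cls, data):
        # 'arguments' is optional in the protocol, so default to an empty dict.
        return cls(
            seq=data["seq"],
            command=data["command"],
            arguments=data.get("arguments", {}),
        )


@dataclass
class InitializedEvent:
    """Event the adapter sends once it is ready to be configured."""
    seq: int
    event: str = "initialized"
    type: str = "event"

    def to_dict(self):
        return asdict(self)


# json -> Python instance
request = Request.from_dict(
    json.loads('{"command": "initialize", "seq": 1, "type": "request"}'))

# Python instance -> json
event_json = json.dumps(InitializedEvent(seq=2).to_dict(), sort_keys=True)
```

Generating these classes from debugProtocol.json instead of writing them by hand keeps them in sync with the schema, which is the reason for the code generator.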

It's interesting to note that the first thing to do is actually making the debugger available in the extension, which is done through package.json. As a note, my package.json is generated from Python code, so the structure below is a Python dict which is later converted to json, not the actual json. If you're doing a VSCode extension, I highly recommend generating your package.json and the related parts of the code instead of doing it all by hand: this way it's possible to see it in small pieces and auto-generate command ids and the related code. Initially I hadn't done so in PyDev, but as the declarative files grow, it becomes harder to follow and make changes while keeping the code and declaration in sync. The structure I've used is:

  
{
    'type': 'PyDev',
    'label': 'PyDev (Python)',
    'languages': ['python'],
    'adapterExecutableCommand': 'pydev.start.debugger', 
    
    # Note: adapterExecutableCommand will be replaced by a different API (right now still in proposal mode). 
    # See: https://code.visualstudio.com/updates/v1_20#_debug-api
    # See: https://github.com/Microsoft/vscode/blob/7636a7d6f7d2749833f783e94fd3d48d6a1791cb/src/vs/vscode.proposed.d.ts#L388-L395
    
    'enableBreakpointsFor': {
        'languageIds': ['python', 'html'],
    },
    'configurationAttributes': {
        'launch': {
            'required': [
                'mainModule'
            ],
            'properties': {

                'mainModule': {
                    'type': 'string',
                    'description': 'The .py file that should be debugged.',
                },

                'args': {
                    'type': 'string',
                    'description': 'The command line arguments passed to the program.'
                },

                "cwd": {
                    "type": "string",
                    "description": "The working directory of the program.",
                    "default": "${workspaceFolder}"
                },

                "console": {
                    "type": "string",
                    "enum": [
                        "integratedTerminal",
                        "externalTerminal"
                    ],
                    "enumDescriptions": [
                        "VS Code integrated terminal.",
                        "External terminal that can be configured in user settings."
                    ],
                    "description": "The specified console to launch the program.",
                    "default": "integratedTerminal"
                },
            }
        }
    },

    "configurationSnippets": [
        {
            "label": "PyDev: Launch Python Program",
            "description": "Add a new configuration for launching a python program with the PyDev debugger.",
            "body": {
                "type": "PyDev",
                "name": "PyDev Debug (Launch)",
                "request": "launch",
                "cwd": "^\"\\${workspaceFolder}\"",
                "console": "integratedTerminal",
                "mainModule": "",
                "args": ""
            }
        },
    ]
}
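
As an illustration of the "Python dict later converted to json" idea, a minimal sketch could be something like the code below. The `build_package_json` helper and the extension name are hypothetical (my real generator does more than this); the only thing taken from VSCode is that debuggers are declared under `contributes.debuggers`.

```python
import json


def build_package_json(debugger_contribution):
    """Wrap a debugger contribution dict in a minimal package.json structure."""
    package = {
        "name": "example-extension",  # hypothetical name, just for the sketch
        "contributes": {
            "debuggers": [debugger_contribution],
        },
    }
    # Python literals (single quotes, trailing commas, comments) become valid json here.
    return json.dumps(package, indent=4, sort_keys=True)


debugger = {
    'type': 'PyDev',
    'label': 'PyDev (Python)',
    'languages': ['python'],
    'adapterExecutableCommand': 'pydev.start.debugger',
}

package_json_text = build_package_json(debugger)
```

The resulting text is what would be written to package.json, so the comments and trailing commas in the Python structure never reach the actual file.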

So, although there are many things there, initially we just need to make adapterExecutableCommand return the command to be executed (you could also create a standalone executable or something to run along with a supported vm -- such as mono, but there's nothing for python there, so, the adapterExecutableCommand is probably the best approach for a python debugger).

In my case it's something like:

  
commands.registerCommand('pydev.start.debugger', () => {
    return {
        command: "C:/bin/python27/python.exe",  // paths initially hardcoded for simplicity
        args: ["X:/vscode-pydev/vscode-pydev/src/debug_adapter/debugger_protocol.py"]
    }
});

The configurationSnippets section provides the snippets which allow VSCode to autogenerate the configuration for the user and the configurationAttributes are actually custom for each implementation (so, those will probably need more tweaking going forward).

Another interesting point is that when VSCode launches the debug adapter it'll use stdin and stdout to communicate with the adapter (this makes some things a bit quirky to develop the debugger because you have to (initially) resort to printing debug information to a file to be able to check what's happening, although on the bright side, you won't have to worry about having a firewall at that point).

Also, don't forget to flush after writing messages to stdout.
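
Since stdout is reserved for the protocol, a file-based logger is one way to see what's going on. The sketch below is only an assumption of how that could be set up with the standard logging module (the logger name and log file location are made up for the example):

```python
import logging
import os
import tempfile


def create_file_logger(log_path, name="debug_adapter_example"):
    """Return a logger that appends to log_path and never writes to stdout/stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    # Keep records away from any root handler that could write to the console.
    logger.propagate = False
    if not logger.handlers:
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


log_path = os.path.join(tempfile.gettempdir(), "debug_adapter_example.log")
log = create_file_logger(log_path)
log.debug("adapter started, pid=%s", os.getpid())
```

Anything printed to stdout by accident would be interpreted by VSCode as a (broken) protocol message, so it's worth routing all diagnostics through something like this from the start.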

Now, on to the protocol itself... I created something which would read from stdin and then redirect that to a file to see what's coming (after digging up things a bit more I found an issue in the VSCode tracker referencing: https://github.com/buggerjs/bugger-v8-client/blob/master/PROTOCOL.md which details that a bit more -- although not all that's there is actually applicable to the VSCode debugger).

The first message that arrives from stdin is:

  
Content-Length: 312\r\n
\r\n
{
    "arguments": {
        "adapterID": "PyDev", 
        "clientID": "vscode", 
        "clientName": "Visual Studio Code", 
        "columnsStartAt1": true, 
        "linesStartAt1": true, 
        "locale": "en-us", 
        "pathFormat": "path", 
        "supportsRunInTerminalRequest": true, 
        "supportsVariablePaging": true, 
        "supportsVariableType": true
    }, 
    "command": "initialize", 
    "seq": 1, 
    "type": "request"
}

-- this is the InitializeRequest in the json schema.
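
Reading a message framed like the one above could be sketched as follows. This assumes Python 3 and a binary stream (in the adapter that would be something like `sys.stdin.buffer`); the `read_message` name is just for the example.

```python
import io
import json


def read_message(stream):
    """Read one framed message from a binary stream.

    Returns the decoded json as a dict, or None if the stream reached EOF.
    """
    content_length = None
    while True:
        line = stream.readline()
        if not line:
            return None  # EOF: the client closed the connection
        line = line.strip()
        if not line:
            break  # a blank line marks the end of the header section
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            content_length = int(value.strip())

    if content_length is None:
        raise ValueError("Missing Content-Length header")

    body = stream.read(content_length)
    return json.loads(body.decode("utf-8"))


# Simulate what arrives on stdin with an in-memory stream.
payload = json.dumps(
    {"command": "initialize", "seq": 1, "type": "request"}).encode("utf-8")
stream = io.BytesIO(b"Content-Length: %d\r\n\r\n" % len(payload) + payload)
message = read_message(stream)
```

Note that Content-Length counts bytes (not characters), which is why the sketch works on a binary stream and only decodes after reading the body.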

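So, the framing looks like HTTP (a Content-Length header, a blank line and then the json as the actual content), although it's not really HTTP: there's no request line or status line, just the header and the payload. In response to that, the debug adapter should do its initialization and return the capabilities it has, something like: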

  
{
    "seq": 1,
    "request_seq": 1, 
    "command": "initialize", 
    "body": {"supportsConfigurationDoneRequest": true, 
             "supportsConditionalBreakpoints": true}, 
    "type": "response", 
    "success": true
}

-- this is the InitializeResponse in the json schema.

and then send an event saying that it has initialized properly:

  
{"type": "event", "event": "initialized", "seq": 2}

-- this is the InitializedEvent in the json schema.

Note that every message, in either direction, uses the same framing, so the Content-Length: $size\r\n\r\n header needs to precede each message that is sent. Each message sent or received also has a seq, which is a number that should be incremented whenever a new message is sent (the seq is incremented independently on the server and on the client, and responses should reference the seq from the request in request_seq).
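
Putting the header, the seq and the flush together, the writing side could be sketched like this. The `MessageWriter` class is a hypothetical helper (assuming Python 3 and a binary stream such as `sys.stdout.buffer`), not the code I actually ended up with.

```python
import io
import itertools
import json


class MessageWriter:
    """Frames messages with Content-Length and assigns the adapter-side seq."""

    def __init__(self, stream):
        self._stream = stream
        self._next_seq = itertools.count(1)  # adapter's own counter, starts at 1

    def write(self, message):
        # Copy so the caller's dict isn't modified; the seq is set at write time.
        message = dict(message, seq=next(self._next_seq))
        body = json.dumps(message).encode("utf-8")
        header = ("Content-Length: %d\r\n\r\n" % len(body)).encode("ascii")
        self._stream.write(header + body)
        self._stream.flush()  # without this VSCode may wait forever
        return message["seq"]


out = io.BytesIO()  # stands in for sys.stdout.buffer
writer = MessageWriter(out)
first_seq = writer.write({
    "type": "response",
    "request_seq": 1,
    "command": "initialize",
    "success": True,
    "body": {"supportsConfigurationDoneRequest": True},
})
second_seq = writer.write({"type": "event", "event": "initialized"})
```

Assigning the seq in a single place makes it hard to forget it or to reuse a number by accident.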

Afterwards, the client (VSCode) sends the actual launch request (which should be based on the configurationAttributes previously configured). In this case:

  
{
    "arguments": {
        "__sessionId": "474aa497-0a90-4b30-8cc6-edf3bebbe703", 
        "args": "", 
        "console": "integratedTerminal", 
        "cwd": "X:\\vscode_example", 
        "name": "PyDev Debug (Launch)", 
        "program": "X:/vscode_example/robots.py", 
        "request": "launch", 
        "type": "PyDev"
    }, 
    "command": "launch", 
    "seq": 2, 
    "type": "request"
}

-- this is the launch request in the json schema (it comes with additional attributes the user specified in the launch... each extension needs to tweak the actual parameters to its use case).

At this point, it becomes clear that this is really just an adapter: we're expected to actually launch the process and provide the communication layer to the actual debugger (so, the debugger doesn't really have to be changed -- although in some cases that may be beneficial if possible... for instance, the debugger could already give output on the variable frames as json so that the message doesn't need to be decoded and recoded in a new format).

Also, the stdin and stdout may be in use (because VSCode uses it to communicate to the debug adapter), so, it may be hard to reuse this process to be the actual debugger process (for instance, launch could then make main proceed to launch the program in this process if the debugger could directly handle the debug protocol, but then if clients managed to write to the 'real' stdin/stdout handles, the debugger would stop working).

The launch request just requires a notification that the program was launched, so, the response would be a launch response with an empty body (or, if there was some error, say, the file to be launched no longer exists, a "message" could be set and "success" could be false):

  
{
    "seq": 3,
    "request_seq": 2, 
    "command": "launch", 
    "body": {}, 
    "type": "response", 
    "success": true
}

-- this is the LaunchResponse in the json schema.
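
The success and error cases for the launch could be sketched as below. Both function names are hypothetical, the sketch reads the file to launch from the "program" argument as it appears in the request above, and the seq is left out on purpose (the assumption being that it's filled in by whatever code writes the message).

```python
import os


def make_response(request, body=None, success=True, message=None):
    """Build a response dict for the given request dict (seq not included)."""
    response = {
        "type": "response",
        "request_seq": request["seq"],
        "command": request["command"],
        "success": success,
        "body": body if body is not None else {},
    }
    if message is not None:
        response["message"] = message
    return response


def handle_launch(request):
    """Return a launch response, failing if the file to launch doesn't exist."""
    program = request.get("arguments", {}).get("program", "")
    if not os.path.isfile(program):
        return make_response(
            request, success=False, message="File not found: %s" % program)
    # A real adapter would start the process here before acknowledging.
    return make_response(request)


launch_request = {
    "seq": 2,
    "type": "request",
    "command": "launch",
    "arguments": {"program": "/this/path/should/not/exist/robots.py"},
}
launch_response = handle_launch(launch_request)
```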

Ok, now, at this point I already have a structure which parses the json and creates python instances for each protocol message (and vice-versa), so, instead of specifying each message in its full format, I'll just reference it from the identifier on the schema instead of the actual json.

After the launch request, I get a ConfigurationDoneRequest and return the proper ConfigurationDoneResponse and for the ThreadsRequest a ThreadsResponse.

At this point, the debugger will sit idle, waiting for actions from the user or events from our debug adapter (if more than one thread was returned in the ThreadsResponse, the threads will appear in the CallStack).

Now, the only thing different at this point is that the debug controls will appear, so, a pause or stop can be activated from the UI.

Pressing stop will send us a DisconnectRequest (for which a DisconnectResponse should be sent as an acknowledgement) and the pause will send a PauseRequest (which requires us to send back a PauseResponse -- and after a thread is actually paused, a StoppedEvent should be sent).
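
For a stub, the requests mentioned so far could be handled with a dispatcher along these lines. It's only a sketch: `handle_request` is a hypothetical name, the single "MainThread" is made up, the seq is assumed to be added when writing, and a real adapter would only send the StoppedEvent after the thread is actually paused (the stub sends it right away).

```python
def handle_request(request):
    """Return the list of messages (without seq) to send back for one request."""
    command = request["command"]

    def response(body=None, success=True):
        return {
            "type": "response",
            "request_seq": request["seq"],
            "command": command,
            "success": success,
            "body": body or {},
        }

    if command == "threads":
        # Dummy thread so that something shows up in the call stack view.
        return [response({"threads": [{"id": 1, "name": "MainThread"}]})]

    if command == "pause":
        thread_id = request["arguments"]["threadId"]
        stopped_event = {
            "type": "event",
            "event": "stopped",
            "body": {"reason": "pause", "threadId": thread_id},
        }
        return [response(), stopped_event]

    if command in ("configurationDone", "disconnect"):
        return [response()]  # plain acknowledgement

    unsupported = response(success=False)
    unsupported["message"] = "Unsupported command: %s" % command
    return [unsupported]


pause_messages = handle_request(
    {"seq": 5, "type": "request", "command": "pause", "arguments": {"threadId": 1}})
```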

Ok, this is the end of part 1 (we have something which can be started and later stopped without actually doing anything, so, pretty much a mock debugger)... This actually took me 2 full days to implement (most of the work went into wrapping my head around how things worked and generating python code from the json schema -- I tried some libraries and none of them worked as I needed, so, I rolled my own here).

My main gripe was the lack of better documentation on how to approach implementing a debug adapter from scratch and how it should work. For instance, it took me quite a while to find a reference to launching from the adapterExecutableCommand, where I could construct a command line (the initial references I found pointed only to using an executable or a supported runtime such as mono). Some things I still don't know how to handle, such as how to actually provide output based on the console type the user expects (i.e.: integratedTerminal, externalTerminal). Anyways, I hope to get to that in the upcoming parts...

The final code I have at this point (which also contains the code generator I did) may be seen at:

https://github.com/fabioz/python_debug_adapter_tutorial/tree/master/part1

Part 2 should get us to the point of actually launching a process...

Pydev 1.4.8 has been released (a few days ago).

The usability improvements on this release had already been highlighted before, so, I think that the major highlight to talk about is the possibility of 'live coding' in the debugger (although it's still a bit on the experimental side).

To enable that, two features were added: jump to line (ctrl+R with the focus on the line) and reload module contents (right now only available for python 2.x).

The idea is that you can get to some point in your code in the debugger, watch what's happening, change the code to fix it and then jump to the line that created that frame (so, you actually need to do a jump to an outer frame -- which actually requires finishing the current frame, as python has no support for dropping a frame -- although you can jump to the return line and then jump in the outer frame to the caller of that frame -- yes, it can be a bit confusing).

Note that there are still a number of shortcomings in the current version -- some because of python, which can only make a jump when the debugger receives a line event (so, if you were at the end of the frame, it might be that you cannot properly do the jump, because a return event will be available in that case, instead of a line event, and being inside a try..except has its peculiarities too), and others because the reload is still pretty naive (it uses the xreload module -- currently only available for 2.x -- but I already have some ideas that can make it better).

So, in the current version, it can be a bit difficult to use it properly, but still, it's pretty nice when it does :)

p.s.: I was going to provide a screencast for that, but decided on waiting a bit more until I'm able to work on some of the rough edges in the current implementation.

p.s2.: The jump to line in python can only jump inside the current topmost frame, but in pydev you can ask to make a jump to any place in the stack, but that means it'll execute everything normally until it gets to the frame where the jump was asked -- and only then will it actually execute the jump -- maybe if there's enough interest in that, someone at the python side could provide a way to drop frames :)

PyDev adventures

Wednesday, May 09, 2018

Howto launch and debug in VSCode using the debug adapter protocol (part 1)

Friday, August 21, 2009

Pydev 1.4.8 released & Live coding in Pydev

Thursday, May 31, 2007

Pydev release 1.3.4 and first impressions on Eclipse 3.3