Pondering AnyHeadlessWasmWidget — Headless Application Packaging for Jupyter, Marimo, VS Code Notebooks, Shiny Live etc.

The Python based anywidget specification / toolkit, and the anyhtmlwidget R variant that is based on it, provide a set of tools for “authoring reusable web-based widgets for interactive computing environments” by allowing developers and, in many cases, end-user developers / AI-assisted users, to wrap pre-existing Javascript applications with the machinery necessary to allow coding against those applications as if they were native widgets.

Note the assumption of “web-based”, and the implication of REPL (“interactive computing environments“).

(For convenience, I am going to imagine the use of these widgets in a notebook-style Python environment; but it could equally well be in a “live” interactive HTML book style environment such as Quarto Live, and with anyhtmlwidget, could be in an R environment.)

There are several ways the widget framework can be used:

  • as a simple wrapper for rendering the application in the notebook environment. For example, you have an application that provides interactive rendering of an image uploaded from the desktop or retrieved from a URL: renderMyAnyWidget()
  • pass data in from the Python side when creating the widget; for example, pass in image data for rendering: w = renderMyAnyWidget([image1, image2])
  • pass data in from the Python side to update the widget. For example, if we are running the widget as an application in its own panel in JupyterLab, we might pass in new images from the Python side to the application: w.addFile(image)
  • retrieve data from the widget; for example, suppose the application is an image editor and we want to retrieve a modified image: w.getFile(imageName)
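Those four usage styles can be sketched with a toy, pure-Python stand-in for such a widget (the class and method names — MyAnyWidget, addFile, getFile — are hypothetical, echoing the calls above; in a real anywidget the state would be synced to the JavaScript front end as traitlets):

```python
# Toy stand-in for a wrapped widget; in a real anywidget, `files`
# would be a synced traitlet mirrored in the browser front end.
class MyAnyWidget:
    def __init__(self, files=None):
        # usage style 2: pass data in when creating the widget
        self.files = {name: None for name in (files or [])}

    def addFile(self, name, data=None):
        # usage style 3: push new data to the running widget
        self.files[name] = data

    def getFile(self, name):
        # usage style 4: retrieve (possibly modified) data back out
        return self.files.get(name)

w = MyAnyWidget(["image1.png", "image2.png"])  # create with data
w.addFile("image3.png", b"\x89PNG...")         # update from Python
data = w.getFile("image3.png")                 # read back
```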

In the majority of examples in the anywidget community gallery, the widgets are intended to be visually rendered.

But I think an equally interesting possibility is to use the framework to wrap headless applications; which is to say, ones that have no GUI (graphical user interface).

To this end, I have been exploring using anywidget as a wrapper for WASM applications that are often installed “on the command line” and then accessed via a Python wrapper. For example, in the world of databases, packages such as sqlalchemy provide a way of connecting to, and working with, database services from within a Python environment. With Postgres now available in the browser as pglite, I have started to explore embedding it in a notebook environment using jupyter_anywidget_pglite. This widget:

  • loads the pglite application into the browser environment;
  • provides a headless API to it (with some limited support for DBAPI2 and SQLAlchemy style connections).
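For a sense of the DBAPI2-style surface such a wrapper aims at, here is the connect/cursor/execute/fetch shape sketched against the stdlib sqlite3 module as a stand-in (this is not the jupyter_anywidget_pglite API itself, just the pattern it gestures towards):

```python
import sqlite3

# sqlite3 stands in here; a DBAPI2-style pglite wrapper would expose
# the same connection/cursor shape against the in-browser Postgres.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE demo (id INTEGER, name TEXT)")
cur.execute("INSERT INTO demo VALUES (?, ?)", (1, "pglite"))
conn.commit()
cur.execute("SELECT name FROM demo WHERE id = ?", (1,))
row = cur.fetchone()
```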

As another example, pygraphviz provides a Python wrapper for the Graphviz C application. But Graphviz is also available as a WASM package, so we can wrap it as an anywidget (for example, jupyter_anywidget_graphviz), and then use it in web-based Python notebooks.

One of the main differences between the common “wrap a JS GUI anywidget” and “create a headless anywidget” cases is that in the headless, function-calling usage style, we often require blocking behaviour; which is to say, when we call the application, we may need to wait some time for it to respond: w.waitTillReady(), w.waitTillFinished(). (Either that, or we need to be able to set up a handler for an event that is raised when the application has completed its action.)
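A minimal sketch of that blocking pattern (the wait_until helper and the predicate are hypothetical; in a live kernel you would also need to pump UI events while waiting, which is where something like jupyter-ui-poll comes in):

```python
import time

def wait_until(predicate, timeout=30.0, interval=0.1):
    """Block until predicate() returns True, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while not predicate():
        if time.monotonic() > deadline:
            raise TimeoutError("widget did not signal completion in time")
        time.sleep(interval)  # in a notebook, also process UI events here
    return True

# e.g. a hypothetical w.waitTillReady() reduces to something like:
#   wait_until(lambda: w.ready)
```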

In a “full” IPython environment, we can use jupyter-ui-poll to support this sort of behaviour, but in JupyterLite environments, for example, this behaviour is not supported. (I think it should be supportable in Marimo notebooks, but the “patterns” I’ve been using for my headless anywidgets to date are not ideal in that environment, though the widgets do work to a limited extent. Starting from scratch and building a version of one of the widgets with Marimo as the intended first-class host might be a useful thing to do… It would also be interesting to look at building something for Quarto Live as the first-class user environment, and then to see whether there are robust design principles / code patterns that work across all those environments and use cases.)

So why is this interesting? For me, the main reason is that it provides a relatively natural path from packaging and installing applications on the command line, and then calling them via a Python wrapper, to “installing” them into the browser runtime environment. That is, using the browser environment even more like an operating system by “installing” applications to it as WASM applications and then calling them via a headless anywidget-style wrapper.

Also note that it’s not just WASM-powered applications or services we might want to call; we might also want to use packages from other languages in our web-based Python environment. For example, turfjs provides a handy geo toolkit in TypeScript, and gdal3.js implements a load of geo tools that are often hard to install on the desktop. Tesseract.js, pdf.js and pandoc-wasm offer document conversion, rendering and text extraction services (see for example my demos: jupyter_anywidget_tesseract_pdfjs and jupyter_anywidget_pandoc). whisper can run in the browser to offer speech-to-text services; webr could perhaps be co-opted by a WASM-calling variant of rpy2 to execute R code, and vice versa in an R anyhtmlwidget context (R calling Python) using a WASM-calling variant of reticulate. As for LLMs running in the browser, we could wrap those too (for example, jupyter_anywidget_webllm).

User environments such as JupyterLite and Marimo can already use browser storage to provide some sort of in-browser file system, and things like the jupyter-offlinenotebook extension provide a way of saving and retrieving notebooks from a Jupyter environment using browser storage. (I also note jupyterlab-browser-storage, but I’m not sure what it’s intended to do? While I’m on this topic, there are also a couple of Jupyter extensions offering filesystem access: jupyter-fs and jupyterlab-filesystem-access.)

In passing, I also note ongoing development of jupyterlite/terminal, though I’m not really sure what the purpose of it is. Just as applications like postgres and graphviz can be called and used from the command line, it would be nice if wasm-packaged versions of those applications could also be called from a browser-based terminal, if it exists… So in imagining a variant of anywidget that was particularly tuned to the wrapping of headless widgets, it would be nice if it could be installed into a jupyterlite/terminal style environment. Which sort of thinking might also influence how jupyterlite/terminal style environments might develop their own package installation processes…

One thing I’m pretty sure of is that I don’t (currently?) have the sort of discipline to code this or contribute code to it. But I can make observations, such as the above, about my own attempts to hack wasm applications into anywidget packages!

If Your Users Are Doing This a Lot, Build an API

If your users are repeatedly using AI automation to perform tasks on your site, ask yourself:

  • could the UI be improved?
  • should I be providing an API?

Scraping and website automation is faff enough without bots of any sort having a go at performing a task on a site that has (human) anti-patterns baked in, which then either starts to include AIgent anti-patterns too, or whose anti-patterns the AIgent starts succumbing to, annoying your users even more.

Actually, the latter might be the better option if it puts people off using AI tools that mask bad human-user interface design and just push supposedly human-user UIs further down the route of being responsive to machines. Which is what APIs are for, ffs…

Getting an Answer is Not the Same as Coming to an Understanding

Noting a pattern that is the next thing after RAG (retrieval augmented generation): the DeepSearch pattern, which aims to recognise when documents are lacking and the LLM cannot generate an answer from them, so instead it does an “agentific” search for related documents and summarises answers from those.

And via Charles Arthur’s Overspill blog [via], I also note that folk are increasingly not clicking through from “AI-d” websearch answer results pages, but instead are happy with the summary they’re given.

This is all very well. Being given an answer in response to a shonky natural language prompt is one thing. But it does not help you come to an understanding.

Skimming through a set of resources and realising they do not provide the answer tells you something.

Searching for documents that may help you get to an answer tells you something.

Back in the day, things like search were termed “knowledge tools”; workers were “knowledge workers”. There is f**k all knowledge (“justified true belief”, if you want an old, agentific definition) in the mind of a user issuing a crap prompt to get a machine generated response and seeing that as some sort of “answer”. There is no justification in the mind of the user. There may or may not be any truth. There is potentially no belief (the user forgetting whatever the answer was as quickly as it was generated for them).

Recalling Searle, folk who use genAI tools and answer engines are becoming indistinguishable from Chinese rooms, boxes that have no real understanding, although they may appear to from the outside.

When the AI “knowledge worker” checks out of their Chinese box in the evening, even if they appear to have worked knowledgeably from inside the box using their AI tools during the day, they leave the box as much a muppet as they were when they went into the box in the morning.

See also my anti-GPS rant in Getting Lost In LLM-Supported Coding…;-)

AI Chat Agents… sigh…

Back in the day (almost a decade ago…) I seem to remember I spent a period of time hacking together slack slash actions such as a Slack Slash Parliamentary Auto-Responder Using AWS Lambda Functions or Chatting With ONS Data Via a Simple Slack Bot that would parse simple conversational chat commands within a particular domain and generate a response that pulled on a third party data service.

The responses could also include images.

The downside was that the language parsing was simple, but the upside was that the code was simple to inspect, and it would have been easy to add a transparency switch so that the response could optionally include a statement of what was extracted from the input to generate the output.

To use the system, you had to learn how to ask for particular things in particular ways.
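The parsing involved was little more than pattern matching; a toy sketch (the patterns and handlers here are hypothetical, not the original bot code):

```python
import re

# Each command is a (pattern, handler) pair; first match wins.
PATTERNS = [
    (re.compile(r"population of (?P<place>[\w\s]+)", re.I),
     lambda m: f"Looking up the population of {m['place'].strip()}…"),
]

def respond(text):
    for pattern, handler in PATTERNS:
        m = pattern.search(text)
        if m:
            # a "transparency switch" could also report what was parsed out
            return handler(m)
    return "Sorry — try something like: population of <place>"

reply = respond("what is the population of Horsham")
```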

So what does genAI offer over that, other than complexity? You still have to talk to the genAI models in a particular way so that they parse out what you want them to parse out; they still make mistakes; and they can play loose and free with facty things, like names and numbers (which are the bits you want to be right).

For all the talk of genAI’n’agents’n’all that b****ks, where are the benefits now over what was pretty much equally possible then?

Getting Lost In LLM-Supported Coding

I hate satnav. I really hate it. I see the utility in the last mile getting to somewhere new, but for route planning, it sucks.

The US techbros may think all roads are six-lane dual carriageways, but in the UK we have more than our fair share of single-track, green-centreline roads, and a route suggestion made on the basis that it is predicted to be two minutes quicker over two hours, taking a trek over hill and dale on single-lane roads with a close correspondence to rally gravel stages rather than major trunk roads and motorways, is generally a bad idea.

I f***king hate it. I f***king hate the way the solid, bold blue route line occludes the colour of the road it suggests. Why not use blue dots, so you can see whether the route is being suggested on a single-lane white road or a motorway? Why not give me the option of a route plan that minimises the number of road numbers on the route? Why not give me an option to avoid minor class roads (just a quick diversion down a 1-in-4, across a ford, and back up a 1-in-3 to save a predicted one minute en route, versus staying on a major road and going down to the bridge) except at the start and the end of the journey?

I f***king hate it.

I hate the way that people who use it have no idea where they are or where they are heading next. In the UK, I suspect most routes can be given as a couple of fiddly bits at the start and the end, and then a list of towns and, if necessary, villages that are likely to be signposted. I do not want a shortcut that involves turning left into a housing estate to give me a two minute time saving and get me lost in an urban hellscape, when I could follow the main road signs to BigTown. I do not want a route through tiny forest roads with no signs to save me a minute getting to the motorway, which is clearly signed and pretty much guaranteed to use major roads.

By listening to the route and following the instructions, folk ignore the signs, have no memory of features to help them on the way out or on a return journey, have no sense of where they are or where they are going.

I f***king hate it.

And that, I think, is what coding using LLMs is going to be like for a lot of people.

As an experiment, I tried using various code tools yesterday in the way I think folk blindly listen to GPS, letting them “guide” me through a development task; and whilst at first it seemed productive, I was soon completely lost: I had no idea what was going on, where the solution might be, or what the code that kept being generated in response to the emergent error messages I kept receiving as I ran it was supposed to do.

After several hours of floundering, I realised I felt like I do when following satnav routes: I was completely lost, at times thought I could see signs that pointed me to a place that I knew was probably on the way, and had no real sense of why some suggestions were being made rather than others. “Here, have this piece of code”, that is the equivalent of veering off on a diagonal through a housing estate rat run, rather than following the main road, going straight down to the traffic lights, and then turning right.

I know, when forced to follow satnav routes, how angry I get at their stupidity. I know how lost I get, clueless as to where we are, or where we are going, because the last 17 instructions were of the form “in a hundred yards, take the second left down Lesser Pidlington Street, take the third exit onto New Old Street” when a simple “follow the signs to Manchester” would have worked equally well.

And I suspect that this blind, lost form of navigation will be how it goes for a lot of folk using genAI coding support tools. As it is for satnav-led driving.

PS FWIW, I went back to the code today, had a look at it, backtracked the error messages, checked out Stack Overflow as well as using some very, very specific prompts with particular queries and carefully chosen code fragments, then carefully read the recommendation, tweaked it as I felt I needed to, and got something that worked, I think. But I have no idea how most of the code really got to the state it is in now, or what things were being addressed. Just like following a satnav into a place and having no idea how to get out, or how to reproduce the journey, because attention was focussed on the generated instructions and not the environment.

PPS FWIW, I do generally try to lead and be critical in the way I use genAI tools for coding support, but yesterday I thought I’d just follow its satnav lead and rerouting as I copied and pasted the code it generated, and pasted back in the errors. And it was a horrible experience. And I have no idea of what is in the code or what it’s supposed to be doing where.

Visualising SQL db schema diagrams using eralchemy and graphviz wasm

Although I don’t post here anywhere near as regularly as I used to, the days of my hoped-for OUsefulness are now on a pretty strict countdown, so things will be as they are — possibly a last flurry of activity as I try to make things “safe” and “hand over” anything that may OUtlive my OUseful time, as I start to look for opportunities anew, and maybe a return to some open data tinkering and data storytelling projects…

A few days ago, I reminded myself of an open issue around the IPython magic we use for generating schema diagrams for SQL dbs in our Jupyter notebooks. This uses Graphviz installed as an O/S package to render SVG or PNG diagrams (I think?) from a schema generated from a connected db using a found package that I suspect hasn’t been updated for years.

I’ve had an open issue for a year or so making a note that eralchemy provides a rather more actively maintained package for generating ERDs from connected databases, although it doesn’t have any associated magic, is geared up for CLI rather than python API use, and it also makes use of Graphviz installed on the o/s to render dot scripts to schema diagrams.

(It also claims to generate mermaid diagram script, but that didn’t look valid to me?)

Anyway, I thought I’d have a quick tinker, and came up with the typoed repo https://github.com/innovationOUtside/erlachemy_schemadisplay_anywidget_magic which:

  • creates a simple hacky Python API around eralchemy that Claude.ai generated for me from the original CLI code;
  • includes a custom IPython magic class that lets you use IPython magic to generate schema diagram specifications, e.g. in dot language, from a db given the db connection string;
  • uses my innovationOUtside/jupyter_anywidget_graphviz package to render dot code to SVG in the browser (i.e. without requiring Graphviz installed on your computer, or a graphviz server/server connection to do the rendering).
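At its core, the ER-diagram step is just turning schema metadata into dot; a toy sketch of that transformation (table and column names hypothetical, and nothing like as complete as eralchemy’s output):

```python
def schema_to_dot(tables):
    """tables: {table_name: [column, ...]} -> minimal dot using record nodes."""
    lines = ["digraph schema {", "  node [shape=record];"]
    for name, cols in tables.items():
        fields = r"\l".join(cols) + r"\l"  # left-justified column list
        lines.append(f'  {name} [label="{{{name}|{fields}}}"];')
    lines.append("}")
    return "\n".join(lines)

dot = schema_to_dot({
    "artist": ["id", "name"],
    "album": ["id", "artist_id", "title"],
})
# `dot` can then be handed to a renderer such as the graphviz anywidget
```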

In my head, that’s another thing ticked off the list of things we can offer that don’t necessarily require a server or O/S installed package, and that, as such, we may be able to offer to students in secure environments, or students working offline who can’t install s/w onto their machines, etc etc.

PS I wonder… can we get eralchemy to work as it is with a connection to my innovationOUtside/jupyter_anywidget_pglite widget, or would we need to do some fettling around the connection? Or maybe the pglite instance doesn’t respond to or return all the things eralchemy needs? [UPDATE – my half-baked sqlalchemy connection is not quite rich enough. Maybe one to work on?]

Living Documents, Personal Software

Via Simon Willison, who regularly surfaces nuggets from Hacker News into his own feeds, a flexible spreadsheet UI that offers “ambiguous” or alternative values that make it easy to role play various scenarios: Ambsheets.

From there, I also learn of design agency Ink & Switch, who do the Bret Victor thing of reimagining what computation could and should be like, and then building working demos of it.

So for example, live documents with Potluck (demo):

Via Geoffrey Litt’s PhD defense, I find Riffle, reactive databases that also put me in mind of pglite live queries, something I really should try to have a play with.

Interestingly, they also had a demo of a Jupyter/IPython .ipynb notebook compatible live/in-browser editor a decade ago (livebook); well worth comparing that sort of thinking with things like JupyterLite, and newer, arguably more interesting, live notebook environments such as marimo.

There is also more recent work on authoring notebooks that combine programmatically accessed data and generated charts (Jacquard), which makes me think I need to see how support for transcluded content now works as a core feature in quarto (includes) and curvenote (embedded cell content).

And some folk to follow? Geoffrey Litt, Alex Warth, Josh Horowitz

Go Team Lords…

Whatever you think about the House of Lords, they provide a useful long view against the confused vagaries of a populist Commons wrapped up in making reactionary responses and chasing the headlines.

See for example Baroness Kidron in defence of copyright, and against ceding content to the AI techbros, in Hansard here.

Video: https://parliamentlive.tv/event/index/d7da6908-8663-4412-8840-e6de3e180636?in=16:47:17

Visualising Reactive Dependencies in ShinyLive Python Application

Over the last couple of weeks I have been tinkering with a ShinyLive application for visualising WRC data (try it here: takes a minute or two to load, but then runs completely in your browser, pulling data from the WRC live timing API).

The app is made up from a series of widgets reactively wired to functions. Here’s a fragment that includes an example of how I wire a dropdown list widget to a display function:

# Create driver rebase selector
with ui.tooltip(id="rebase_driver_tt"):
    ui.input_select(
        "rebase_driver",
        "Driver rebase:",
        {},
    )
    '"Rebase" times relative to a nominated driver. The "ULTIMATE" driver is derived from the quickest times within each split sector.'

@render.ui
@reactive.event(input.stage, input.rebase_driver)
def rebase_info():
    stage = input.stage()
    rebase_driver = input.rebase_driver()
    if (
        # ...

Here, a ui.input_select widget with ID rebase_driver is wired into a reactive event function rebase_info().

So… what connects to what? I asked claude.ai the following, and pasted in my `app.py` code:

I have a shinylive pyhton app with various components that react in response to other components. Could you generate a dot language scripted diagram for me that shows how each component (widget, function) is connected reactively to other components. Also Identify any components that are not connect

It generated me some Mermaid code (see it rendered here):

flowchart TD
    %% Input components
    subgraph Inputs
        season["input.season"]
        championship["input.championship"]
        event["input.event"]
        stage["input.stage"]
        rebase_driver["input.rebase_driver"]
        stage_rebase_driver["input.stage_rebase_driver"]
        splits_section_view["input.splits_section_view"]
        rebase_reverse_palette["input.rebase_reverse_palette"]
        splits_section_plot["input.splits_section_plot"]
    end

    %% Reactive calculations
    subgraph Reactive_Calcs
        season_data["season_data()"]
        rally_id_var["rally_id_var()"]
        stages_data["stages_data()"]
        itinerary_data["itinerary_data()"]
        startlist_data["startlist_data()"]
        stage_times_data["stage_times_data()"]
        split_times_data["split_times_data()"]
        carNum2name["carNum2name()"]
        getSplitDists["getSplitDists()"]
        split_dists_for_stage["split_dists_for_stage()"]
    end

    %% UI Updates
    subgraph UI_Updates
        update_events_select["update_events_select()"]
        update_stages_select["update_stages_select()"]
        update_stages_driver_rebase["update_stages_driver_rebase_select()"]
        update_driver_rebase["update_driver_rebase_select()"]
        update_splits_driver_rebase["update_splits_driver_rebase_select()"]
    end

    %% Rendered outputs
    subgraph Outputs
        season_frame["season_frame()"]
        stages_frame["stages_frame()"]
        itinerary_frame["itinerary_frame()"]
        startlist_frame["startlist_frame()"]
        overall_short["overall_short()"]
        split_times_original["split_times_original()"]
        split_report["split_report()"]
        seaborn_heatmap_splits["seaborn_heatmap_splits()"]
        seaborn_barplot_splits["seaborn_barplot_splits()"]
        seaborn_linechart_splits["seaborn_linechart_splits()"]
    end

    %% Connections
    season --> season_data
    season_data --> update_events_select
    championship --> update_events_select
    
    event --> rally_id_var
    rally_id_var --> stages_data
    rally_id_var --> itinerary_data
    rally_id_var --> startlist_data
    rally_id_var --> getSplitDists
    
    stages_data --> update_stages_select
    championship --> update_stages_select
    
    stage --> stage_times_data
    stage_times_data --> update_stages_driver_rebase
    stage_times_data --> update_driver_rebase
    stage_times_data --> update_splits_driver_rebase
    
    stage --> split_times_data
    championship --> split_times_data
    
    stage --> split_dists_for_stage
    
    %% Output connections
    season_data --> season_frame
    stages_data --> stages_frame
    itinerary_data --> itinerary_frame
    startlist_data --> startlist_frame
    stage_times_data --> overall_short
    split_times_data --> split_times_original
    split_times_data --> split_report
    splits_section_view --> split_report
    
    rebase_driver --> seaborn_heatmap_splits
    rebase_reverse_palette --> seaborn_heatmap_splits
    split_times_data --> seaborn_heatmap_splits
    
    rebase_driver --> seaborn_barplot_splits
    splits_section_plot --> seaborn_barplot_splits
    rebase_reverse_palette --> seaborn_barplot_splits
    split_times_data --> seaborn_barplot_splits
    
    rebase_driver --> seaborn_linechart_splits
    split_times_data --> seaborn_linechart_splits

    %% Style
    classDef input fill:#e1f5fe,stroke:#01579b
    classDef calc fill:#fff3e0,stroke:#e65100  
    classDef update fill:#f3e5f5,stroke:#4a148c
    classDef output fill:#e8f5e9,stroke:#1b5e20

    class season,championship,event,stage,rebase_driver,stage_rebase_driver,splits_section_view,rebase_reverse_palette,splits_section_plot input
    class season_data,rally_id_var,stages_data,itinerary_data,startlist_data,stage_times_data,split_times_data,carNum2name,getSplitDists,split_dists_for_stage calc
    class update_events_select,update_stages_select,update_stages_driver_rebase,update_driver_rebase,update_splits_driver_rebase update
    class season_frame,stages_frame,itinerary_frame,startlist_frame,overall_short,split_times_original,split_report,seaborn_heatmap_splits,seaborn_barplot_splits,seaborn_linechart_splits output

Which worked. But I wanted dot code, so I asked again:

Thanks. Could you convert that to dot language so I can view it with graphviz

digraph G {
    rankdir=TB;
    compound=true;
    node [style=filled];

    // Define subgraphs
    subgraph cluster_inputs {
        label="Inputs";
        style=filled;
        color=lightblue;
        node [fillcolor=lightskyblue];
        
        season [label="input.season"];
        championship [label="input.championship"];
        event [label="input.event"];
        stage [label="input.stage"];
        rebase_driver [label="input.rebase_driver"];
        stage_rebase_driver [label="input.stage_rebase_driver"];
        splits_section_view [label="input.splits_section_view"];
        rebase_reverse_palette [label="input.rebase_reverse_palette"];
        splits_section_plot [label="input.splits_section_plot"];
    }

    subgraph cluster_reactive_calcs {
        label="Reactive Calculations";
        style=filled;
        color=peachpuff;
        node [fillcolor=bisque];
        
        season_data [label="season_data()"];
        rally_id_var [label="rally_id_var()"];
        stages_data [label="stages_data()"];
        itinerary_data [label="itinerary_data()"];
        startlist_data [label="startlist_data()"];
        stage_times_data [label="stage_times_data()"];
        split_times_data [label="split_times_data()"];
        carNum2name [label="carNum2name()"];
        getSplitDists [label="getSplitDists()"];
        split_dists_for_stage [label="split_dists_for_stage()"];
    }

    subgraph cluster_ui_updates {
        label="UI Updates";
        style=filled;
        color=plum;
        node [fillcolor=thistle];
        
        update_events_select [label="update_events_select()"];
        update_stages_select [label="update_stages_select()"];
        update_stages_driver_rebase [label="update_stages_driver_rebase_select()"];
        update_driver_rebase [label="update_driver_rebase_select()"];
        update_splits_driver_rebase [label="update_splits_driver_rebase_select()"];
    }

    subgraph cluster_outputs {
        label="Outputs";
        style=filled;
        color=palegreen;
        node [fillcolor=lightgreen];
        
        season_frame [label="season_frame()"];
        stages_frame [label="stages_frame()"];
        itinerary_frame [label="itinerary_frame()"];
        startlist_frame [label="startlist_frame()"];
        overall_short [label="overall_short()"];
        split_times_original [label="split_times_original()"];
        split_report [label="split_report()"];
        seaborn_heatmap_splits [label="seaborn_heatmap_splits()"];
        seaborn_barplot_splits [label="seaborn_barplot_splits()"];
        seaborn_linechart_splits [label="seaborn_linechart_splits()"];
    }

    // Connections
    season -> season_data;
    season_data -> update_events_select;
    championship -> update_events_select;
    
    event -> rally_id_var;
    rally_id_var -> stages_data;
    rally_id_var -> itinerary_data;
    rally_id_var -> startlist_data;
    rally_id_var -> getSplitDists;
    
    stages_data -> update_stages_select;
    championship -> update_stages_select;
    
    stage -> stage_times_data;
    stage_times_data -> update_stages_driver_rebase;
    stage_times_data -> update_driver_rebase;
    stage_times_data -> update_splits_driver_rebase;
    
    stage -> split_times_data;
    championship -> split_times_data;
    
    stage -> split_dists_for_stage;
    
    // Output connections
    season_data -> season_frame;
    stages_data -> stages_frame;
    itinerary_data -> itinerary_frame;
    startlist_data -> startlist_frame;
    stage_times_data -> overall_short;
    split_times_data -> split_times_original;
    split_times_data -> split_report;
    splits_section_view -> split_report;
    
    rebase_driver -> seaborn_heatmap_splits;
    rebase_reverse_palette -> seaborn_heatmap_splits;
    split_times_data -> seaborn_heatmap_splits;
    
    rebase_driver -> seaborn_barplot_splits;
    splits_section_plot -> seaborn_barplot_splits;
    rebase_reverse_palette -> seaborn_barplot_splits;
    split_times_data -> seaborn_barplot_splits;
    
    rebase_driver -> seaborn_linechart_splits;
    split_times_data -> seaborn_linechart_splits;

    // Layout hints
    {rank=same; season championship event stage rebase_driver stage_rebase_driver splits_section_view rebase_reverse_palette splits_section_plot}
    {rank=same; season_frame stages_frame itinerary_frame startlist_frame overall_short split_times_original split_report seaborn_heatmap_splits seaborn_barplot_splits seaborn_linechart_splits}
}

The view follows the original wide layout (see it rendered interactively here).

A long vertical layout rather than wide horizontal layout would be easier to read:

That renders as a very wide chart with arrows going top down; is there a way of coding it so that it is long (left right arrows)

(see it rendered interactively here).

From the diagram, we see that changes to input.stage_rebase_driver() aren’t reacted to, so I need to check that.
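That kind of “unconnected input” check is easy enough to script directly rather than eyeballing the diagram; a quick sketch over the edge list from the dot script (only a representative subset of the edges shown here):

```python
# Declared inputs, from the cluster_inputs subgraph
inputs = {
    "season", "championship", "event", "stage",
    "rebase_driver", "stage_rebase_driver",
    "splits_section_view", "rebase_reverse_palette", "splits_section_plot",
}

# (source, target) pairs from the dot script; abbreviated here
edges = [
    ("season", "season_data"),
    ("championship", "update_events_select"),
    ("event", "rally_id_var"),
    ("stage", "stage_times_data"),
    ("rebase_driver", "seaborn_heatmap_splits"),
    ("splits_section_view", "split_report"),
    ("rebase_reverse_palette", "seaborn_heatmap_splits"),
    ("splits_section_plot", "seaborn_barplot_splits"),
]

used = {src for src, _ in edges}
unreacted = inputs - used  # inputs that nothing reacts to
```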

And here’s the generated dot script for the above…

digraph G {
    rankdir=LR;
    compound=true;
    node [style=filled];
    splines=ortho;  // Use orthogonal lines
    nodesep=0.5;    // Increase space between nodes
    ranksep=1;      // Increase space between ranks

    // Define subgraphs
    subgraph cluster_inputs {
        label="Inputs";
        style=filled;
        color=lightblue;
        node [fillcolor=lightskyblue];
        
        season [label="input.season"];
        championship [label="input.championship"];
        event [label="input.event"];
        stage [label="input.stage"];
        rebase_driver [label="input.rebase_driver"];
        stage_rebase_driver [label="input.stage_rebase_driver"];
        splits_section_view [label="input.splits_section_view"];
        rebase_reverse_palette [label="input.rebase_reverse_palette"];
        splits_section_plot [label="input.splits_section_plot"];
    }

    subgraph cluster_reactive_calcs {
        label="Reactive Calculations";
        style=filled;
        color=peachpuff;
        node [fillcolor=bisque];
        
        season_data [label="season_data()"];
        rally_id_var [label="rally_id_var()"];
        stages_data [label="stages_data()"];
        itinerary_data [label="itinerary_data()"];
        startlist_data [label="startlist_data()"];
        stage_times_data [label="stage_times_data()"];
        split_times_data [label="split_times_data()"];
        carNum2name [label="carNum2name()"];
        getSplitDists [label="getSplitDists()"];
        split_dists_for_stage [label="split_dists_for_stage()"];
    }

    subgraph cluster_ui_updates {
        label="UI Updates";
        style=filled;
        color=plum;
        node [fillcolor=thistle];
        
        update_events_select [label="update_events_select()"];
        update_stages_select [label="update_stages_select()"];
        update_stages_driver_rebase [label="update_stages_driver_rebase_select()"];
        update_driver_rebase [label="update_driver_rebase_select()"];
        update_splits_driver_rebase [label="update_splits_driver_rebase_select()"];
    }

    subgraph cluster_outputs {
        label="Outputs";
        style=filled;
        color=palegreen;
        node [fillcolor=lightgreen];
        
        season_frame [label="season_frame()"];
        stages_frame [label="stages_frame()"];
        itinerary_frame [label="itinerary_frame()"];
        startlist_frame [label="startlist_frame()"];
        overall_short [label="overall_short()"];
        split_times_original [label="split_times_original()"];
        split_report [label="split_report()"];
        seaborn_heatmap_splits [label="seaborn_heatmap_splits()"];
        seaborn_barplot_splits [label="seaborn_barplot_splits()"];
        seaborn_linechart_splits [label="seaborn_linechart_splits()"];
    }

    // Connections
    season -> season_data;
    season_data -> update_events_select;
    championship -> update_events_select;
    
    event -> rally_id_var;
    rally_id_var -> stages_data;
    rally_id_var -> itinerary_data;
    rally_id_var -> startlist_data;
    rally_id_var -> getSplitDists;
    
    stages_data -> update_stages_select;
    championship -> update_stages_select;
    
    stage -> stage_times_data;
    stage_times_data -> update_stages_driver_rebase;
    stage_times_data -> update_driver_rebase;
    stage_times_data -> update_splits_driver_rebase;
    
    stage -> split_times_data;
    championship -> split_times_data;
    
    stage -> split_dists_for_stage;
    
    // Output connections
    season_data -> season_frame;
    stages_data -> stages_frame;
    itinerary_data -> itinerary_frame;
    startlist_data -> startlist_frame;
    stage_times_data -> overall_short;
    split_times_data -> split_times_original;
    split_times_data -> split_report;
    splits_section_view -> split_report;
    
    rebase_driver -> seaborn_heatmap_splits;
    rebase_reverse_palette -> seaborn_heatmap_splits;
    split_times_data -> seaborn_heatmap_splits;
    
    rebase_driver -> seaborn_barplot_splits;
    splits_section_plot -> seaborn_barplot_splits;
    rebase_reverse_palette -> seaborn_barplot_splits;
    split_times_data -> seaborn_barplot_splits;
    
    rebase_driver -> seaborn_linechart_splits;
    split_times_data -> seaborn_linechart_splits;

    // Layout hints
    {rank=same; season championship event stage rebase_driver stage_rebase_driver splits_section_view rebase_reverse_palette splits_section_plot}
    {rank=same; season_data rally_id_var}
    {rank=same; stages_data itinerary_data startlist_data stage_times_data split_times_data}
    {rank=same; update_events_select update_stages_select update_stages_driver_rebase update_driver_rebase update_splits_driver_rebase}
    {rank=same; season_frame stages_frame itinerary_frame startlist_frame overall_short split_times_original split_report seaborn_heatmap_splits seaborn_barplot_splits seaborn_linechart_splits}
}
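The wide/long switch comes down to the `rankdir` graph attribute: `TB` (the default) lays ranks out top-to-bottom, `LR` lays them out left-to-right. As a minimal sketch, here's a hypothetical helper (my own naming, not from any library) for toggling the attribute on a DOT string:

```python
import re


def set_rankdir(dot: str, direction: str = "LR") -> str:
    """Return a copy of a DOT digraph string with rankdir set to `direction`."""
    if re.search(r"rankdir\s*=\s*\w+", dot):
        # Replace the existing rankdir value in place
        return re.sub(r"rankdir\s*=\s*\w+", f"rankdir={direction}", dot)
    # Otherwise, insert the attribute just inside the opening brace
    return dot.replace("{", f"{{\n    rankdir={direction};", 1)


wide = "digraph G {\n    a -> b;\n}"
print(set_rankdir(wide, "LR"))
```

Rewriting the string is crude but convenient when the DOT is itself generated text, as here.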

My next thought was to ask for some code to statically analyse the app.py code to generate the visualisation deterministically:

Could that sort of thing be generated deterministically from a static or AST analysis of the code by a py function?

But I’ve run out of free queries for a couple of hours…

Anyway, it strikes me that that sort of feature might be an interesting optional addition to the ShinyLive editor?

PS here’s some example code that does something, again via Claude, with a quick sanitisation hack added by me on the end:

import ast
from typing import Dict, Set, List
from dataclasses import dataclass


@dataclass
class ReactiveNode:
    name: str
    node_type: str  # 'input', 'calc', 'effect', 'output'
    dependencies: Set[str]
    triggers: Set[str]  # Event dependencies


class ReactiveGraphAnalyzer:
    def __init__(self, source_code: str):
        self.tree = ast.parse(source_code)
        self.inputs: Set[str] = set()
        self.calcs: Dict[str, ReactiveNode] = {}
        self.effects: Dict[str, ReactiveNode] = {}
        self.outputs: Dict[str, ReactiveNode] = {}

    def analyze(self):
        """Perform static analysis of the code to extract reactive dependencies"""
        for node in ast.walk(self.tree):
            # Find input declarations
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
                if node.func.attr.startswith("input_"):
                    input_name = self._extract_input_name(node)
                    if input_name:
                        self.inputs.add(input_name)

            # Find reactive calculations and effects
            elif isinstance(node, ast.FunctionDef):
                decorators = self._get_decorators(node)
                if "reactive.calc" in decorators:
                    self._analyze_reactive_calc(node)
                elif "reactive.effect" in decorators:
                    self._analyze_reactive_effect(node)

                # Check if it's an output function (render decorators)
                if any(d.startswith("render.") for d in decorators):
                    self._analyze_output(node)

    def _extract_input_name(self, node) -> str:
        """Extract input variable name from ui.input_* calls"""
        try:
            return node.args[0].value
        except (AttributeError, IndexError):
            return ""

    def _get_decorators(self, node) -> List[str]:
        """Extract decorator names from a function definition"""
        decorators = []
        for decorator in node.decorator_list:
            if isinstance(decorator, ast.Name):
                decorators.append(decorator.id)
            elif isinstance(decorator, ast.Attribute):
                # Guard: decorator.value may be a nested attribute, not a Name
                if isinstance(decorator.value, ast.Name):
                    decorators.append(f"{decorator.value.id}.{decorator.attr}")
            elif isinstance(decorator, ast.Call):
                if isinstance(decorator.func, ast.Attribute) and isinstance(
                    decorator.func.value, ast.Name
                ):
                    decorators.append(
                        f"{decorator.func.value.id}.{decorator.func.attr}"
                    )
        return decorators

    def _analyze_reactive_calc(self, node):
        """Analyze a reactive calculation function"""
        name = node.name
        dependencies = self._find_input_dependencies(node)
        triggers = self._find_event_triggers(node)
        self.calcs[name] = ReactiveNode(name, "calc", dependencies, triggers)

    def _analyze_reactive_effect(self, node):
        """Analyze a reactive effect function"""
        name = node.name
        dependencies = self._find_input_dependencies(node)
        triggers = self._find_event_triggers(node)
        self.effects[name] = ReactiveNode(name, "effect", dependencies, triggers)

    def _analyze_output(self, node):
        """Analyze an output render function"""
        name = node.name
        dependencies = self._find_input_dependencies(node)
        triggers = self._find_event_triggers(node)
        self.outputs[name] = ReactiveNode(name, "output", dependencies, triggers)

    def _find_input_dependencies(self, node) -> Set[str]:
        """Find all input.<name>() accesses within a function"""
        dependencies = set()
        for n in ast.walk(node):
            # Shiny reads inputs as input.<name>(), i.e. a call on an
            # attribute of the `input` object
            if isinstance(n, ast.Call) and isinstance(n.func, ast.Attribute):
                if isinstance(n.func.value, ast.Name) and n.func.value.id == "input":
                    dependencies.add(f"input.{n.func.attr}")
        return dependencies

    def _find_event_triggers(self, node) -> Set[str]:
        """Find reactive.event triggers in decorators"""
        triggers = set()
        for decorator in node.decorator_list:
            if (
                isinstance(decorator, ast.Call)
                and isinstance(decorator.func, ast.Attribute)
                and decorator.func.attr == "event"
            ):
                for arg in decorator.args:
                    # e.g. @reactive.event(input.stage) -> "input.stage"
                    if isinstance(arg, ast.Attribute) and isinstance(
                        arg.value, ast.Name
                    ):
                        triggers.add(f"{arg.value.id}.{arg.attr}")
        return triggers

    def generate_dot(self) -> str:
        """Generate DOT language representation"""
        dot = [
            "digraph G {",
            "    rankdir=LR;",
            "    compound=true;",
            "    node [style=filled];",
        ]

        # Define subgraphs
        dot.extend(self._generate_subgraph("inputs", self.inputs))
        dot.extend(self._generate_subgraph("calcs", self.calcs))
        dot.extend(self._generate_subgraph("effects", self.effects))
        dot.extend(self._generate_subgraph("outputs", self.outputs))

        # Generate edges
        dot.extend(self._generate_edges())

        dot.append("}")
        return "\n".join(dot)

    def _generate_subgraph(self, name: str, nodes) -> List[str]:
        """Generate subgraph DOT code; `nodes` is any iterable of node names
        (a set of input names, or a dict keyed by node name)"""
        lines = [
            f"    subgraph cluster_{name} {{",
            f'        label="{name.title()}";',
            "        style=filled;",
        ]

        for node_name in nodes:
            lines.append(f'        {node_name} [label="{node_name}"];')

        lines.append("    }")
        return lines

    def _generate_edges(self) -> List[str]:
        """Generate edges between nodes based on dependencies"""
        edges = []
        all_nodes = {**self.calcs, **self.effects, **self.outputs}

        for node in all_nodes.values():
            for dep in node.dependencies:
                edges.append(f"    {dep} -> {node.name};")
            for trigger in node.triggers:
                edges.append(f"    {trigger} -> {node.name} [style=dashed];")

        return edges


def analyze_reactive_code(source_code: str) -> str:
    """Main function to analyze reactive code and generate DOT visualization"""
    analyzer = ReactiveGraphAnalyzer(source_code)
    analyzer.analyze()
    # TH — sanitise hack: dots aren't valid in unquoted DOT node IDs
    out_ = analyzer.generate_dot()
    out_ = out_.replace("input.", "input_")
    return out_
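As a self-contained illustration of the core trick, independent of the class above: an `input.<name>()` access inside a render function shows up in the AST as a `Call` whose `func` is an `Attribute` on the bare `Name` "input". The snippet below runs that match over a made-up fragment in the style of the app code:

```python
import ast

# A made-up fragment in the style of the Shiny app code
snippet = """
@render.plot
def seaborn_heatmap_splits():
    df = split_times_data()
    driver = input.rebase_driver()
    palette = input.rebase_reverse_palette()
"""

deps = set()
for n in ast.walk(ast.parse(snippet)):
    # Match input.<name>() calls; plain calls like split_times_data()
    # have a Name (not an Attribute) as func, so they are skipped
    if (
        isinstance(n, ast.Call)
        and isinstance(n.func, ast.Attribute)
        and isinstance(n.func.value, ast.Name)
        and n.func.value.id == "input"
    ):
        deps.add(f"input.{n.func.attr}")

print(sorted(deps))  # ['input.rebase_driver', 'input.rebase_reverse_palette']
```

Everything else in the analyzer (decorator classification, edge generation) is bookkeeping around that one pattern match.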

Whose Learning Journey Is It Anyway?

Back in the day, it took effort to generate words and images on a particular topic. With the advent of generative LLMs, you can spew as much content on a topic as you want. And as the models get smaller, they will run locally on more and more devices (that is, the models will be installed on your computer and won’t need a web connection or third-party service to work). As an example, today’s hotness (Janus-1.3B) is a ~2GB download to your browser… or phone…

These models will spew content at you in response to a prompt, summarise and rewrite texts for you, and more (things like OCR and scraping, aka “structured document parsing”).

Which raises interesting questions for distance learning organisations that tend to rely on text as the major giveaway (and text as the main form of assessment).

If learners get into the habit of asking a machine to summarise a text for them before, or instead of, actually reading it, what is the point in spending 18 months to 2 years generating that text and running it through an editorial process?

If we want to guarantee students do read OUr actual words, then the only way we can do that is to send them a book. (Though if someone had the patience, they could just photograph each pair of pages and run it through an OCR/text extraction model.) I’m all for books, and prefer them to reading from the screen, not least because they have a physical memory, can be annotated with sticky notes, pencil, or (highlighter) pen, and (traditionally) a resale or physical share-on value. But they incur a considerable physical publication cost (physical printing and physical distribution, if you don’t print them at home).

So if OUr students don’t read the actual words we produce, can we be lazier in producing them? Can we generate (or collate) a mass of materials, dump them into a database, and then access them via a set of prompts? (I’ve pondered this sort of thing into the void before: GenAI in Edu and The Chinese Room.)

Rather than provide a set of carefully written and edited materials, could we just provide a box of content written howsoever by an expert in the field (so probably not wrong, but maybe not as readable as it could be), perhaps with some additional “approved” resources, and then provide a set of prompts to extract / generate text from that corpus. Like a set of headings with sections generated from the corpus. Or a syllabus. Like this.

If learners get into the habit of asking a machine to summarise a text for them before, or instead of, actually reading it, whose learning journey are they on in terms of following a set of particularly organised and worded texts? Not OUrs, because the text they actually read will have been generated by something else.

Presumably, if they are getting a qualification in a particular topic, then they should know something about the topic, such as the list of things provided in a course syllabus.

So is the syllabus the thing we provide? A list of topics and things to know?

If the syllabus is sequenced, is that enough of a structured pathway through a set of topics to provide a student with a sensible learning journey through it?

Is the learning journey actually the set of realisations that a learner has as they work their way through a topic? Or the sorts of questions, tasks and problems they realise they can now address with a particular degree of confidence as a result of having come to (been led to?) those realisations through “studying” a particular course?

And as for assessment, I think we need to encourage learners to have an intrinsic motivational stance towards it. By putting themselves in for assessment, even if they lack confidence in themselves, they can take confidence that someone else has assessed them as being competent or knowledgeable in a thing, to a particular standard or level. If they then feel the need to cheat, they won’t actually have that confidence.

But this is all moot anyway, maybe… I have a couple of days to decide whether to accept the mutually assured destruction/resignation thing…

Writing just a couple of days ago on Prototyping and Power, Libby Miller, whose path I crossed at various times over the last few years, and who also seems to have spent “15 years trying and failing to do something” quoted Cory Doctorow:

Any time you see a group of people who’ve tried over some long timescale to do something, without success, it’s *possible* that their failure is down to having bad tactics, but it’s *far* more likely that they’ve failed because they just aren’t powerful enough on their own to save the owls, or the planet, or their would-be ethnostate or whatever they’re agitating for.

At 25 years on (I got my long service recognition last year), I’m not sure I can continue to motivate myself with self-directed projects that I see as being OUseful (i.e. relevant, of interest to, and potentially useful within, the OU) but that go nowhere. I don’t know if I can turn whatever skills I have towards something that is really useful elsewhere, and, ideally, either generate income (I have been so bad at asking for money in the past that I’m not sure how that’d play out!) or reduce costs (e.g. by getting me free entry into things I would have otherwise paid for, free meals I would otherwise have paid for, free accommodation I would otherwise have paid for, free subscriptions, access, or resources I would otherwise have paid for, &c.). I have made a couple of weak attempts internally over the last couple of months to see if I could solicit interest in “rolling out” some of the innovations/inventions that I still have some interest in/motivation towards; but they need social capital to drive them; and a couple of emails into the void were never really going to elicit any response. I feel worthless within, and to, the organisation, although it continues to pay me; I have no idea any more about the terms under which it does continue to pay me, or what it expects from me. Should I just keep taking the money as long as I can get away with it, not worry about what I spend my time on, and just wait to be picked up on it? At times, I have put, and continue to put, stupid hours into projects I think of as being OUseful (I’ve been on a 0.8 FTE contract for years because I felt guilty about my OUseful side-projects not making it into courses, research publications, etc.), feeling that I needed to put in the unpaid additional hours to offset the apparent lack of productivity doing “what I was supposed to be doing” (whatever that is/was; research publications that would attract 1.4 readers and that the Library then buys back through research publication subscriptions, or whatever).
The less interest there was (is) in a thing I considered OUseful, the more work I felt (feel) I need(ed) to put into it, pushing it even further away from the status quo (“why can’t you see its use, a useful direction of travel? It means we can do this… You still can’t see it? But we can obviously then do that… Or this other thing… Or all these other things…”) And so the time ticks… I should have been doing something else. But instead I wrote this. And if I stay and get let go (asked to leave) rather than mutually accept a payoff, I may get nothing… Or is there something I can still contribute, that’s OUseful, somehow, and that fits in with how I’ve come to “work” over the last two and a half decades? [Fingers do and did the talking. Tippity tap. Just like LLMs, right??? WTF am I gonna do…?]