Consider encoding/storing arglists as strings #771

Open
@lread

Currently

API analysis is delivered to cljdoc from cljdoc-analyzer as edn.

This is fine, but...

:arglists are delivered and stored as symbols

Sometimes :arglists contain elements that can be written but not read.
A specific example that we have worked around with a custom reader is regular expressions in arglists.

Now that I am exploring #543, I see more examples of potentially writable, but unreadable edn.
For example, the orchestra lib has an arglist that contains ::some-kw.
Current dynamic analysis handles this fine, but static analysis will not expand it to :some-ns/some-kw, and if we try to edn/read-string ::some-kw it will fail.
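
To make the failure concrete, here is a minimal sketch of what the edn reader does with such arglists. The helper name `edn-readable?` is made up for illustration and is not part of cljdoc.

```clojure
(require '[clojure.edn :as edn])

(defn edn-readable?
  "Hypothetical helper: true when s can be read by clojure.edn/read-string
  without throwing."
  [s]
  (try
    (edn/read-string s)
    true
    (catch Exception _ false)))

;; A plain arglist reads fine.
(edn-readable? "[x y]")          ;; => true

;; An auto-resolved keyword is not valid edn.
(edn-readable? "[x ::some-kw]")  ;; => false

;; Neither is a regex literal (no custom reader installed here).
(edn-readable? "[x #\"a+\"]")    ;; => false
```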

So...

Why not just store arglists as a list/vector of strings?
This is what clj-kondo analysis does and it has the advantage of being readable with no muss and no fuss.

Even if we never implement #543, we'd have a better design choice and could drop our custom edn reader.

This means that, for example, the edn :arglists ([x y] [x y z])
would become: :arglists ("[x y]" "[x y z]")
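
A rough sketch of that conversion, assuming `pr-str` is an acceptable way to stringify each arglist (the analyzer may well end up doing this differently). The names `arglists->strs` and `example-fn` are illustrative only. The point is that once each arglist is a string, the surrounding edn round-trips with the stock reader even when an arglist contains a regex.

```clojure
(require '[clojure.edn :as edn])

(defn arglists->strs
  "Hypothetical conversion: print each arglist to its own string."
  [arglists]
  (mapv pr-str arglists))

;; Var metadata as the analyzer might emit it, with one arglist holding a regex.
(def var-meta
  {:name 'example-fn
   :arglists (arglists->strs '([x y] [x #"a+"]))})

;; Write to edn text and read back with the stock reader, no custom readers.
(def round-tripped
  (edn/read-string (pr-str var-meta)))

(= var-meta round-tripped) ;; => true
```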

The Impact

We sort arglists by arity when rendering the API.
We'll have to ensure this still happens.
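
One option, sketched below under the assumption that the arglists are still available as data at conversion time, is to sort by arity before stringifying so the stored order is already the render order. The function names are made up for illustration, and putting variadic arglists last is an assumption about the desired ordering rather than a statement of what cljdoc does today.

```clojure
(defn arity-key
  "Hypothetical sort key: fixed arities first, ordered by parameter count;
  variadic arglists (those containing &) sort after all fixed ones."
  [arglist]
  [(if (some #{'&} arglist) 1 0)
   (count arglist)])

(defn arglists->sorted-strs
  "Sort arglists by arity, then print each one to a string."
  [arglists]
  (->> arglists
       (sort-by arity-key)
       (mapv pr-str)))

(arglists->sorted-strs '([x y z] [x & more] [x y]))
;; => ["[x y]" "[x y z]" "[x & more]"]
```

If arglists arrive already stringified, as they would from static analysis, this approach does not apply as-is and the arity would need to come from elsewhere.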

We format arglists for rendering with zprint, which expects strings, so we should only need to tweak the conversion slightly.

The sqlite database var table holds the arglists in a nippy blob in the meta column.
We have maybe ~4 million rows in this table (I'll take a peek).
I'll explore how long a migration might take.

The searchset delivered to the client side already converts arglists to strings when converting to json, so we'd have to check, and possibly update, that conversion.

The cljdoc-analyzer will need to deliver the new format of :arglists.
I'll work out a migration strategy.

I don't think any other known clients should be affected.
The Dash offline reader works off rendered HTML, not json.

Please chime in if you, dear reader, can think of any problems with this approach.

Next Steps

I'll explore, and if things seem to be making sense, I'll follow up with a PR.
