
strong typing for LlamaIndexTS #739

Open
@himself65


I've noticed that most recent PRs are generated by LLMs, and the project has accumulated some issues.

  1. No type checking, so no information. The code takes extra work to maintain and to understand.

For example, this API doesn't say what objects, indexCls, and indexKwargs are. Problems like this don't get fixed on their own as time passes.

static async fromObjects(
  objects: any,
  objectMapping: BaseObjectNodeMapping,
  // TODO: fix any (bundling issue)
  indexCls: any,
  indexKwargs?: Record<string, any>,
): Promise<ObjectIndex> {
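
For illustration only, here is a rough sketch of how generics could replace the any parameters in a signature like this. Only fromObjects and its four parameter names come from the snippet above; DocNode, NodeMapping, BaseIndex, IndexCtor, TypedObjectIndex, ToolSpec, and ListIndex are hypothetical stand-ins, not the real LlamaIndexTS types, and the method body is an assumption about what such a factory might do.

```typescript
// Hypothetical stand-ins: DocNode, NodeMapping, BaseIndex, IndexCtor,
// TypedObjectIndex, ToolSpec, and ListIndex are NOT LlamaIndexTS types.

interface DocNode {
  id: string;
  text: string;
}

// Converts between user objects of type T and index nodes.
interface NodeMapping<T> {
  toNode(obj: T): DocNode;
  fromNode(node: DocNode): T;
}

interface BaseIndex {
  readonly nodes: readonly DocNode[];
}

// Any class constructible from nodes plus an options object of type O.
type IndexCtor<I extends BaseIndex, O> = new (nodes: DocNode[], options?: O) => I;

class TypedObjectIndex<T, I extends BaseIndex> {
  readonly index: I;
  readonly mapping: NodeMapping<T>;

  private constructor(index: I, mapping: NodeMapping<T>) {
    this.index = index;
    this.mapping = mapping;
  }

  // T, I, and O are inferred from the arguments, so callers get checked
  // objects, a checked index class, and checked options without any.
  static async fromObjects<T, I extends BaseIndex, O>(
    objects: readonly T[],
    objectMapping: NodeMapping<T>,
    indexCls: IndexCtor<I, O>,
    indexKwargs?: O,
  ): Promise<TypedObjectIndex<T, I>> {
    const nodes = objects.map((obj) => objectMapping.toNode(obj));
    return new TypedObjectIndex<T, I>(new indexCls(nodes, indexKwargs), objectMapping);
  }

  // Recover the original objects from the stored nodes.
  objects(): T[] {
    return this.index.nodes.map((node) => this.mapping.fromNode(node));
  }
}

// Hypothetical usage types.
interface ToolSpec {
  name: string;
  description: string;
}

const toolMapping: NodeMapping<ToolSpec> = {
  toNode: (tool) => ({ id: tool.name, text: tool.description }),
  fromNode: (node) => ({ name: node.id, description: node.text }),
};

class ListIndex implements BaseIndex {
  readonly nodes: DocNode[];
  readonly topK: number;

  constructor(nodes: DocNode[], options?: { topK: number }) {
    this.nodes = nodes;
    this.topK = options?.topK ?? 1;
  }
}
```

With a signature along these lines, passing { topk: 2 } (wrong key) or an index class with an incompatible constructor would be rejected at compile time instead of failing at runtime.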

  2. Not a good design; it's just a literal copy-paste of the LlamaIndex Python code. LlamaIndex is a widely validated and widely used package, but some of its architecture is unsuitable for JS/TS. I've listed the problems here:

    1. async/sync: there are no sync calls on the JS side. Every function call that relies on an LLM makes a network request, so it is forced to be async.
    2. iterator: JS has built-in async iteration and streaming support, so there's no need to wrap things in a manual iterator.
    3. class & interface: JS class syntax is syntactic sugar over prototypes rather than a separate class system. There are also some issues when combining TS interfaces with classes (abstract class/static class?).
    4. Dynamic function calls: there are lots of fromDefaults | fromObjects ... methods, which could all be renamed to from, with TypeScript function overloading handling the variants. That would simplify the APIs.
    5. generic type: Python doesn't have type checking as strict as TypeScript's, so most of the code becomes harder to maintain when pasted over from the Python side, especially since fewer developers maintain the project on the TS side.
    6. What is the goal of the TS project? LITS aims to support all core features of the Python side. However, JS/TS developers are more likely to use LLMs in web apps and end products than in data analysis. Even if the core design stays the same, the final APIs should move toward JS and Web standards and be compatible with as many JS runtimes (Node.js/Deno/Bun/edge/Cloudflare...) as possible. At this point we are not targeting the Web (which means you are not supposed to literally import LITS in the browser), but we can gradually port some packages to work across runtimes.
  3. No testing; each feature only added example code, which does not fully cover the feature, so regressions may appear in the future.
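
To make the iterator and function-overloading points above more concrete, here is a minimal sketch. SourceDoc and DocStore are hypothetical names invented for this example and are not part of LlamaIndexTS; the await Promise.resolve() line only stands in for a real network call.

```typescript
// Hypothetical names throughout; this is not the LlamaIndexTS API.

interface SourceDoc {
  text: string;
}

class DocStore {
  readonly docs: SourceDoc[];

  private constructor(docs: SourceDoc[]) {
    this.docs = docs;
  }

  // Overload signatures: one `from` entry point instead of several
  // fromText / fromDocs style factories.
  static from(text: string): DocStore;
  static from(docs: readonly SourceDoc[]): DocStore;
  static from(input: string | readonly SourceDoc[]): DocStore {
    const docs = typeof input === "string" ? [{ text: input }] : [...input];
    return new DocStore(docs);
  }

  // An async generator replaces a hand-written iterator class.
  async *stream(): AsyncGenerator<string> {
    for (const doc of this.docs) {
      await Promise.resolve(); // stands in for a network round trip
      yield doc.text;
    }
  }
}

// Collects a stream with the built-in `for await` loop.
async function collect(store: DocStore): Promise<string[]> {
  const chunks: string[] = [];
  for await (const chunk of store.stream()) {
    chunks.push(chunk);
  }
  return chunks;
}
```

Callers only see the two overload signatures, so DocStore.from(42) fails to compile, and any consumer can read the stream with a plain for await loop.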

Some PRs need to be reworked to fit strong typing:

Related issues:
