Skip to content

Conversation

@lovincyrus
Copy link
Contributor

@lovincyrus lovincyrus commented Dec 8, 2025

This PR implements a shared, config-driven multi-step form architecture for object storage connectors (GCS, S3,
Azure). It replaces a GCS-specific implementation with a generalized system that splits connector configuration
into two steps:

  1. Connector step: Authentication configuration
  2. Source step: Resource path and model creation
  • Azure explicit credentials
  • S3 explicit credentials
  • Remove Skip and add a contextual messaging in right panel that acts the same
  • Add public option that changes button to Continue
  • Pin Help text to bottom of right panel, update messaging on Source model page
  • Change Test and Import Data to, Just Import Data or Create Model
  • https://github.com/rilldata/rill/pull/8395/files
  • Connector validation for authentication method
  • ⚠️ Centralize auth method state in connectorStepStore to avoid initialization races
  • ⚠️ Extract isSubmitDisabled logic to testable functions
  • ⚠️ Add constants for magic strings ("public", etc.)

Checklist:

  • Covered by tests
  • Ran it and it works as intended
  • Reviewed the diff before requesting a review
  • Checked for unhandled edge cases
  • Linked the issues it closes
  • Checked if the docs need to be updated. If so, create a separate Linear DOCS issue
  • Intend to cherry-pick into the release branch
  • I'm proud of this work!

This was referenced Dec 8, 2025
@lovincyrus lovincyrus self-assigned this Dec 8, 2025
@lovincyrus lovincyrus changed the title Shared Multi Step Form Shared Multi Step Form for Object Storage Connectors Dec 9, 2025
@lovincyrus
Copy link
Contributor Author

still creating s3, s4 file names

Image

Have migrated the fix from #8354

@royendo royendo self-requested a review December 12, 2025 16:44
Copy link
Contributor

@royendo royendo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good from Product QA.

@ericpgreen2
Copy link
Contributor

ericpgreen2 commented Dec 12, 2025

Hey Cyrus,

A while back we wrote a design doc for connector forms that aligned on using JSON Schema as the form definition format: https://www.notion.so/rilldata/Snippets-API-design-doc-1f5ba33c8f5780798a0cfc819a406b42

The idea is that these schemas eventually live on the backend and can be used by both UI and AI/CLI.

Defining a config-driven form system that works across multiple connectors is definitely the right direction! But the custom TypeScript format (authFieldGroups, clearFieldsByMethod, etc.) diverges from that design. Instead, let's define these as JSON Schema in the frontend. This gets us closer to the design doc without requiring backend work now, and when we move the schemas to the backend later, it's just moving files.

Here's a rough sketch of what S3 might look like:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "auth_method": {
      "type": "string",
      "title": "Authentication method",
      "enum": ["access_keys", "public"],
      "default": "access_keys",
      "x-display": "radio"
    },
    "aws_access_key_id": {
      "type": "string",
      "title": "Access Key ID",
      "description": "AWS access key ID for the bucket",
      "x-secret": true,
      "x-step": "connector",
      "x-visible-if": { "auth_method": "access_keys" }
    },
    "aws_secret_access_key": {
      "type": "string",
      "title": "Secret Access Key",
      "description": "AWS secret access key for the bucket",
      "x-secret": true,
      "x-step": "connector",
      "x-visible-if": { "auth_method": "access_keys" }
    },
    "path": {
      "type": "string",
      "title": "S3 URI",
      "pattern": "^s3://",
      "x-step": "source"
    },
    "name": {
      "type": "string",
      "title": "Model name",
      "pattern": "^[a-zA-Z_][a-zA-Z0-9_]*$",
      "x-step": "source"
    }
  },
  "allOf": [
    {
      "if": { "properties": { "auth_method": { "const": "access_keys" } } },
      "then": { "required": ["aws_access_key_id", "aws_secret_access_key"] }
    }
  ]
}

This is illustrative—you'll need to think through the right JSON Schema properties. Try to stay as close to native JSON Schema as possible before reaching for custom x-* extensions. The ones I used above:

  • x-secret: mask the input field
  • x-step: which step of the multi-step flow this field belongs to
  • x-display: UI hint for how to render (radio, select, etc.)
  • x-visible-if: conditional visibility based on other field values

There may be better ways to model some of these. Let me know if you want to talk through it.


Developed in collaboration with Claude Code

Copy link
Contributor

@ericpgreen2 ericpgreen2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the comment above. A declarative config language to define these forms is definitely the right direction! It's so close to the JSON Schema approach we had discussed previously. This seems like a good time to take your code here one step closer to that design doc.

@lovincyrus
Copy link
Contributor Author

Hey Cyrus,

A while back we wrote a design doc for connector forms that aligned on using JSON Schema as the form definition format: https://www.notion.so/rilldata/Snippets-API-design-doc-1f5ba33c8f5780798a0cfc819a406b42

The idea is that these schemas eventually live on the backend and can be used by both UI and AI/CLI.

Defining a config-driven form system that works across multiple connectors is definitely the right direction! But the custom TypeScript format (authFieldGroups, clearFieldsByMethod, etc.) diverges from that design. Instead, let's define these as JSON Schema in the frontend. This gets us closer to the design doc without requiring backend work now, and when we move the schemas to the backend later, it's just moving files.

Here's a rough sketch of what S3 might look like:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "auth_method": {
      "type": "string",
      "title": "Authentication method",
      "enum": ["access_keys", "public"],
      "default": "access_keys",
      "x-display": "radio"
    },
    "aws_access_key_id": {
      "type": "string",
      "title": "Access Key ID",
      "description": "AWS access key ID for the bucket",
      "x-secret": true,
      "x-step": "connector",
      "x-visible-if": { "auth_method": "access_keys" }
    },
    "aws_secret_access_key": {
      "type": "string",
      "title": "Secret Access Key",
      "description": "AWS secret access key for the bucket",
      "x-secret": true,
      "x-step": "connector",
      "x-visible-if": { "auth_method": "access_keys" }
    },
    "path": {
      "type": "string",
      "title": "S3 URI",
      "pattern": "^s3://",
      "x-step": "source"
    },
    "name": {
      "type": "string",
      "title": "Model name",
      "pattern": "^[a-zA-Z_][a-zA-Z0-9_]*$",
      "x-step": "source"
    }
  },
  "allOf": [
    {
      "if": { "properties": { "auth_method": { "const": "access_keys" } } },
      "then": { "required": ["aws_access_key_id", "aws_secret_access_key"] }
    }
  ]
}

This is illustrative—you'll need to think through the right JSON Schema properties. Try to stay as close to native JSON Schema as possible before reaching for custom x-* extensions. The ones I used above:

  • x-secret: mask the input field
  • x-step: which step of the multi-step flow this field belongs to
  • x-display: UI hint for how to render (radio, select, etc.)
  • x-visible-if: conditional visibility based on other field values

There may be better ways to model some of these. Let me know if you want to talk through it.

Developed in collaboration with Claude Code

Added the json schema changes in 83bfcdf

@ericpgreen2
Copy link
Contributor

ericpgreen2 commented Dec 15, 2025

Nice work adopting JSON Schema—this is the right direction.

One structural concern: the current flow is JSON Schema → getMultiStepFormConfig() → custom config → MultiStepFormRenderer. The ~200-line getMultiStepFormConfig() function transforms the schema into a different format that the renderer consumes. This defeats the purpose of using JSON Schema—the value is that consumers can interpret the schema directly, without needing a translation layer that duplicates the structure in a different format.

A cleaner approach would be a JSONSchemaFormRenderer that consumes the schema directly—interpreting properties, x-visible-if, allOf/if/then, etc. without the intermediate transformation. This is how libraries like react-jsonschema-form work. The renderer would emit form values via a callback (e.g., onSubmit={(values) => ...}), which the parent can use to call the existing submitAddSourceForm/submitAddConnectorForm functions.

Keep the renderer generic—use slots for context-specific UI outside the form fields themselves: the "Already connected? Import your data" link, connector-specific error messages, etc. That way the same renderer can be reused for non-connector templates later.

For incremental migration, isolate the new pattern in its own top-level feature directory to align with the design doc:

web-common/src/features/templates/
  schemas/
    s3.ts
    gcs.ts
    azure.ts
  JSONSchemaFormRenderer.svelte
  types.ts
  utils.ts

This keeps the new approach cleanly separated and positions it as a general pattern for template-based code generation, not just source connectors.

Optionally, you could also extract the file generation logic from submitAddDataForm.ts into this directory with an interface that mirrors the future backend API:

// Today: implemented client-side
// Tomorrow: calls RuntimeService.GenerateFile
async function generateFile(args: {
  template: string;
  properties: Record<string, unknown>;
  preview?: boolean;
}): Promise<{
  addedPaths: string[];
  modifiedPaths: string[];
  previews: Record<string, string>; // path → YAML
}>

This would make the backend migration a simple swap—same interface, different implementation.

Let me know if you want to talk through any of this.


Developed in collaboration with Claude Code

Copy link
Contributor

@ericpgreen2 ericpgreen2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above feedback on the JSON Schema implementation

@ericpgreen2
Copy link
Contributor

Good progress on the JSONSchemaFormRenderer—it's much better now that the intermediate transformation layer is gone.

One more structural issue: the renderer still has connector-specific logic baked in:

  • Hardcoded step: "connector" | "source" types
  • Auth-specific handling (authMethod, authInfo, authMethodKey)
  • Special "Authentication method" label
  • Logic that assumes the auth-method-with-nested-fields pattern

This makes it a "connector form renderer" rather than a generic "JSON Schema form renderer."

The principle: The renderer should be a pure function of the schema. It shouldn't know about connectors, sources, or auth methods. It should just:

  1. Render fields based on schema properties
  2. Filter by x-step if a step prop is passed (but step is just a string, not specifically "connector"/"source")
  3. Show/hide fields based on x-visible-if evaluated against current form values
  4. Handle x-display: radio for any enum field—not just auth methods

The auth method pattern (radio with nested fields) is just one instance of a general pattern: "enum field with conditionally visible related fields." The renderer should handle that generically.

Connector-specific logic belongs in a parent component. Some naming options:

  • ObjectStorageConnectorFlow.svelte - describes what it's for
  • MultiStepConnectorForm.svelte - describes the pattern
  • ConnectorFormWizard.svelte - wizard pattern

This parent component handles:

  • The multi-step flow (connector → source)
  • Button labels that change based on auth method
  • "Already connected? Import your data" skip link
  • Connection testing before advancing steps

The renderer just renders fields; the parent orchestrates the workflow.

Directory structure: The sources/modal/ directory is getting unwieldy. I'd suggest splitting things up:

web-common/src/features/templates/
  JSONSchemaFormRenderer.svelte   # Generic, reusable
  types.ts
  utils.ts
  schemas/
    s3.ts
    gcs.ts
    azure.ts

web-common/src/features/sources/modal/
  ObjectStorageConnectorFlow.svelte   # Connector-specific workflow
  AddDataForm.svelte                   # Existing, uses the flow component
  ...

The generic rendering infrastructure lives in templates/. The connector-specific workflow stays in sources/modal/ and imports from templates/. The schemas live in templates/ since they'll eventually be served from the backend per the design doc.

Let me know if you want to talk through any of this.


Developed in collaboration with Claude Code

Copy link
Contributor

@ericpgreen2 ericpgreen2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting closer! Please see feedback above.

@lovincyrus
Copy link
Contributor Author

lovincyrus commented Dec 17, 2025

Directory structure: The sources/modal/ directory is getting unwieldy. I'd suggest splitting things up:

web-common/src/features/templates/
  JSONSchemaFormRenderer.svelte   # Generic, reusable
  types.ts
  utils.ts
  schemas/
    s3.ts
    gcs.ts
    azure.ts

web-common/src/features/sources/modal/
  ObjectStorageConnectorFlow.svelte   # Connector-specific workflow
  AddDataForm.svelte                   # Existing, uses the flow component
  ...

The generic rendering infrastructure lives in templates/. The connector-specific workflow stays in sources/modal/ and imports from templates/. The schemas live in templates/ since they'll eventually be served from the backend per the design doc.

I've added the changes for the directory structure.

@lovincyrus
Copy link
Contributor Author

One more structural issue: the renderer still has connector-specific logic baked in:

  • Hardcoded step: "connector" | "source" types
  • Auth-specific handling (authMethod, authInfo, authMethodKey)
  • Special "Authentication method" label
  • Logic that assumes the auth-method-with-nested-fields pattern

This makes it a "connector form renderer" rather than a generic "JSON Schema form renderer."

The principle: The renderer should be a pure function of the schema. It shouldn't know about connectors, sources, or auth methods. It should just:

  1. Render fields based on schema properties
  2. Filter by x-step if a step prop is passed (but step is just a string, not specifically "connector"/"source")
  3. Show/hide fields based on x-visible-if evaluated against current form values
  4. Handle x-display: radio for any enum field—not just auth methods

The auth method pattern (radio with nested fields) is just one instance of a general pattern: "enum field with conditionally visible related fields." The renderer should handle that generically.

I've made some attempts to address this but some regressions are expected along the way. This will require some time to fully migrate to schema-driven design.

@royendo
Copy link
Contributor

royendo commented Dec 17, 2025

nit: lets not remove any of the required keys from s3.go IE aws_role_arn since this might be in use.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants