Skip to content

Conversation

@rossirpaulo
Copy link
Collaborator

@rossirpaulo rossirpaulo commented Dec 18, 2025

Important

Based on #2828

Summary

This PR adds exhaustiveness checking and unreachable arm detection for match expressions in BAML's type system. The implementation uses a value-based model (ValueSet) inspired by the treatment of pattern matching in TAPL.

Key capabilities:

  • Detect non-exhaustive matches (missing cases) with helpful error messages listing uncovered patterns
  • Detect unreachable arms (arms that can never match because previous arms already cover them)
  • Handle type aliases by expanding them to their underlying types
  • Treat enums as finite types, expanding them to their individual variants
  • Support literal patterns (integers, strings, booleans) and union patterns (200 | 201)

Key Design Decisions

1. Value-Based Model (ValueSet)

The core insight is that pattern matching operates on values, not types. A pattern like Status.Active matches one specific value, while s: Status matches all values of type Status.

The ValueSet enum captures this:

pub enum ValueSet {
    All,                                    // Catch-all: `_` or `other`
    OfType(Name),                           // Type pattern: `s: Status`
    EnumVariant { enum_name, variant_name }, // Specific variant: `Status.Active`
    Literal(Literal),                       // Literal value: `42`, `"hello"`
    Union(Vec<ValueSet>),                   // Union pattern: `200 | 201`
    Empty,                                  // Guarded pattern (doesn't contribute)
}

This separation allows clean reasoning about what values each arm covers.

2. Guards Do NOT Contribute to Exhaustiveness

Per BEP-002, guarded patterns are treated as Empty value sets. A pattern like x: int if x > 0 might not match negative numbers, so it can't guarantee coverage.

match (n) {
  x: int if x > 0 => "positive"
  // Still need a catch-all because guard doesn't cover all ints
}

3. Type Alias Expansion

Type aliases are expanded during exhaustiveness checking. For type Result = Success | Failure, matching on r: Result requires covering both Success and Failure.

4. Finite vs. Infinite Types

  • Finite types (enums, booleans): Expanded into individual values for precise tracking
  • Infinite types (int, string, classes): Represented as abstract OfType sets, only exhaustive via catch-all

5. Singleton Types for Literal Unions

Added singleton types (Ty::Singleton(SingletonValue)) to support exhaustiveness checking of literal unions like 200 | 201 | 204. This follows the type-theoretic notion of singleton types—types inhabited by exactly one value.

Files Changed

File Purpose
baml_thir/src/exhaustiveness.rs New module: ValueSet abstraction, ExhaustivenessChecker, coverage algorithms
baml_thir/src/lib.rs Integration: Calls exhaustiveness checker during match type inference
baml_thir/src/types.rs Added Ty::Singleton, SingletonValue, and is_uninhabited() method
baml_hir/src/type_ref.rs Added TypeRef::StringLiteral, TypeRef::IntLiteral, TypeRef::BoolLiteral for literal type support
baml_hir/src/lib.rs Type text parsing for literal types in union syntax

Algorithm Overview

The exhaustiveness checker maintains a list of covered ValueSets as it processes arms:

  1. Expand scrutinee type into required value sets

    • Enums → individual variants
    • Optionals → inner type + null
    • Type aliases → expanded recursively
  2. For each arm, convert pattern to ValueSet:

    • Check if arm is unreachable (already fully covered by previous arms)
    • Add to coverage list (unless guarded)
  3. After all arms, find uncovered value sets:

    • Report as TypeError::NonExhaustiveMatch with helpful error message

Test Coverage

The implementation includes unit tests for:

  • ValueSet display formatting
  • Coverage logic (type patterns, enum variants, literals)
  • Type alias expansion (union members covered separately)
  • Enum exhaustiveness (type pattern expands to all variants)
  • Literal-to-type coverage (42 is covered by int pattern)
  • Catch-all coverage (covers everything)

Known Limitations / Future Work

Error Span Improvements

Currently, unreachable arm errors point to the entire match expression rather than the specific unreachable arm. This requires tracking arm spans in HIR.

// TODO(exhaustiveness): Improve error spans to point to the specific unreachable
// arm rather than the entire match expression. Requires tracking arm spans in HIR.

Uninhabited Types (Zero-Variant Enums)

Empty matches on uninhabited types should be allowed (there are no cases to handle). Currently only empty unions are detected; zero-variant enums require database access.

// TODO(exhaustiveness): Check for zero-variant enums. This requires database
// access to look up enum definitions. Currently only empty unions are detected.

Cyclic Type Alias Detection

Recursive type aliases like type A = A | B could cause infinite recursion during expansion. A cycle detection mechanism should be added.

// TODO(exhaustiveness): Add cycle detection for recursive type aliases like
// `type A = A | B`. Currently this would cause infinite recursion.

Example Usage

enum Status { Active, Inactive, Pending }

function ProcessStatus(s: Status) -> string {
  match (s) {
    Status.Active => "active"
    Status.Inactive => "inactive"
    // Error: Non-exhaustive match, missing: Status.Pending
  }
}

function WithCatchAll(s: Status) -> string {
  match (s) {
    Status.Active => "active"
    _ => "other"  // OK: catch-all covers remaining cases
  }
}

type Result = Success | Failure

function HandleResult(r: Result) -> string {
  match (r) {
    s: Success => "ok"
    f: Failure => "error"
    // OK: both union members covered
  }
}

…mitives, named types, optional types, list types, union types, and string literal types. Refactor `lower_type_ref` and add `lower_type_text` for improved match exhaustiveness.
…g support for boolean and integer literal types.
…and more

This commit introduces functions to build type alias and enum variant definitions from a project, enhancing the exhaustiveness checking capabilities. The `TypeContext` struct is updated to store these definitions, and the `infer_function` and `check_match_exhaustiveness` functions are modified to utilize this new information for improved match coverage analysis.
… `from_type_text` to handle boolean and integer literal types
…rove comments for clarity on string and integer literal types.
…enhancing exhaustiveness checking in type lowering.
…rate it into `Ty` for singleton type support.
@vercel
Copy link

vercel bot commented Dec 18, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Review Updated (UTC)
promptfiddle Ready Ready Preview, Comment Dec 23, 2025 1:09am

@codspeed-hq
Copy link

codspeed-hq bot commented Dec 18, 2025

CodSpeed Performance Report

Merging #2838 will not alter performance

Comparing paulo/match-2 (1e6d23c) with paulo/match (f2f902f)1

Summary

✅ 15 untouched
⏩ 14 skipped2

Footnotes

  1. No successful run was found on paulo/match (c2d87df) during the generation of this report, so f7da957 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

  2. 14 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@rossirpaulo rossirpaulo changed the title Paulo/match 2 Match Exhaustiveness Checking Dec 18, 2025
@rossirpaulo rossirpaulo marked this pull request as draft December 18, 2025 23:45
Copy link
Contributor

@antoniosarosi antoniosarosi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! All my review is in comments referencing source code. Overall great PR, needs some AI cleaning, additional tests, and other less important nits.

Comment on lines 770 to 772
else if let Ok(int_val) = text.parse::<i64>() {
TypeRef::IntLiteral(int_val)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to check for numbers that are greater than i64::MAX and issue diagnostics.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a comment and TODO here so we can address this at a later time.

For anyone else in the team reading, find more context here: https://magnaplay.slack.com/archives/C0A3P816B0W/p1766183421615089

@antoniosarosi antoniosarosi mentioned this pull request Dec 19, 2025
- Add is_union() and parts() methods to TypeExpr in ast.rs
- Simplify TypeRef::from_ast() to use new accessors
- Remove duplicate lower_type_ref() from lib.rs and signature.rs
- Fix: primitives now recognized inside type constructors (int? → Optional(Int))
- Add unit tests for from_type_text covering "literal"?, "literal"[],
  and nested combinations like "literal"[]?
- Add end-to-end snapshot tests with new type aliases in basic_types
… Introduces MatchArmSpans, expr_spans/pattern_spans/match_arm_spans maps, and captures spans during CST→HIR lowering.
…t to specific unreachable arms and match expressions instead of line 1.
…ertions for Ty::Class and Ty::Enum. Update handling for Ty::List to include element types, and note that Ty::Map is not fully implemented yet.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants