prompt-lint

A Java library that statically analyzes LLM prompt quality, no LLM calls required. Think of it as a linter for your AI agent prompts.

It scores prompts across 8 quality dimensions, flags issues by severity, and produces detailed reports. Runs in milliseconds, integrates into CI, and catches prompt regressions before they hit production.

Why use it?

Prompts are code. They silently degrade over time — someone adds vague language, removes the output schema, introduces contradictory rules. You don't notice until the LLM starts hallucinating.
No LLM calls. All analysis is rule-based and deterministic. No API keys, no latency, no flaky tests.
CI-ready. Set a quality threshold, run on every commit. Prompt regressions fail the build.
Weighted scoring by agent type. An extraction agent weights groundedness higher; a formatter weights output contract higher. Built-in profiles: EXTRACTION, CLASSIFICATION, FORMATTING, REVIEW, DEFAULT.

Quality Dimensions

Dimension	What it checks
CLARITY	Role definition, task statement, imperative verbs, vague language, output format section, task-before-rules ordering
SPECIFICITY	Numbered rules, concrete examples, quantified thresholds, boundary conditions, enum constraints
GROUNDEDNESS	Source-grounding phrases, anti-hallucination guards, evidence requirements, citation instructions
OUTPUT_CONTRACT	JSON examples present and valid, field-level documentation, schema completeness
CONSTRAINT_COVERAGE	Edge case handling, null/empty handling, error handling, ordering rules, boundary conditions
CONSISTENCY	Template variable alignment with declared inputs, contradictory instructions, terminology consistency
TOKEN_EFFICIENCY	Prompt length vs complexity ratio, redundancy detection, filler removal
INJECTION_RESISTANCE	System/user boundary enforcement, input sanitization, refusal instructions, role-lock phrases

Getting Started

Add dependency

<dependency>
    <groupId>com.eainde</groupId>
    <artifactId>prompt-lint</artifactId>
    <version>1.0-SNAPSHOT</version>
    <scope>test</scope>
</dependency>

Basic usage

import com.eainde.prompt.quality.PromptQualityAnalyzer;
import com.eainde.prompt.quality.model.AgentTypeProfile;
import com.eainde.prompt.quality.model.PromptUnderTest;
import com.eainde.prompt.quality.report.PromptQualityReport;
import com.eainde.prompt.quality.report.PromptQualityReportRenderer;

import java.util.Set;

// 1. Create the analyzer
var analyzer = PromptQualityAnalyzer.create();

// 2. Wrap your prompt
var prompt = new PromptUnderTest(
    "my-extraction-agent",
    systemPromptText,
    userPromptText,
    Set.of("sourceText", "fileNames"),   // declared input variables
    "rawNames",                           // declared output key
    AgentTypeProfile.EXTRACTION           // agent type profile
);

// 3. Analyze
PromptQualityReport report = analyzer.analyze(prompt);

// 4. Check results
System.out.println("Score: " + report.overallScore());
System.out.println("Critical issues: " + report.hasCriticalIssues());

Use in JUnit tests

import static com.eainde.prompt.quality.PromptQualityAssert.assertThat;

@Test
void promptMeetsQualityBar() {
    var analyzer = PromptQualityAnalyzer.create();
    var report = analyzer.analyze(myPrompt());

    assertThat(report)
        .passesThreshold(0.75)
        .hasNoCriticalIssues()
        .dimensionScoreAbove("GROUNDEDNESS", 0.80)
        .dimensionScoreAbove("OUTPUT_CONTRACT", 0.70)
        .hasFewerIssuesThan(10);
}

Validate multiple agents at once

@Test
void allAgentsPassQualityBar() {
    var analyzer = PromptQualityAnalyzer.create();
    var renderer = new PromptQualityReportRenderer();

    List<PromptUnderTest> prompts = List.of(
        extractorPrompt(),
        classifierPrompt(),
        formatterPrompt()
    );

    List<PromptQualityReport> reports = analyzer.analyzeAll(prompts);

    // Print CI summary
    System.out.println(renderer.renderSummary(reports, 0.75));

    // Assert all pass
    for (var report : reports) {
        assertThat(report).passesThreshold(0.75);
    }
}

Regression testing

@Test
void fixedIssueStaysFixed() {
    var report = analyzer.analyze(myPrompt());

    assertThat(report)
        .doesNotHaveIssue("CLR-005")   // output format section was added
        .doesNotHaveIssue("OUT-001");   // JSON example was added
}

Agent Type Profiles

Each profile adjusts dimension weights to match what matters for that agent type:

Profile	Top weighted dimensions
`EXTRACTION`	Groundedness (0.25), Specificity (0.15), Output Contract (0.15)
`CLASSIFICATION`	Specificity (0.20), Constraint Coverage (0.20), Groundedness (0.15)
`FORMATTING`	Output Contract (0.30), Consistency (0.15), Constraint Coverage (0.15)
`REVIEW`	Groundedness (0.20), Specificity (0.15), Output Contract (0.15)
`DEFAULT`	Equal weights (0.125 each)

You can also create custom profiles:

var customProfile = new AgentTypeProfile("MY_AGENT", Map.of(
    "CLARITY", 0.20,
    "SPECIFICITY", 0.20,
    "GROUNDEDNESS", 0.10,
    "OUTPUT_CONTRACT", 0.20,
    "CONSTRAINT_COVERAGE", 0.10,
    "CONSISTENCY", 0.10,
    "TOKEN_EFFICIENCY", 0.05,
    "INJECTION_RESISTANCE", 0.05
));

Report Output

The renderer produces formatted console output:

═══════════════════════════════════════════════════════════════
  PROMPT QUALITY REPORT: my-extraction-agent
  Profile: EXTRACTION
═══════════════════════════════════════════════════════════════

  Overall Score: 0.82 / 1.00  PASS (threshold: 0.75)

  ┌────────────────────────┬───────┬────────┬─────────┐
  │ Dimension              │ Score │ Weight │ Contrib │
  ├────────────────────────┼───────┼────────┼─────────┤
  │   CLARITY              │  0.83 │  0.10  │  0.083  │
  │   SPECIFICITY          │  0.75 │  0.15  │  0.113  │
  │   GROUNDEDNESS         │  0.90 │  0.25  │  0.225  │
  │   OUTPUT_CONTRACT      │  0.80 │  0.15  │  0.120  │
  │   ...                  │       │        │         │
  └────────────────────────┴───────┴────────┴─────────┘

  Issues Found: 3 total (0 critical, 2 warning, 1 info)
    WARNING  [CLR-004] Vague language detected: [try to]
    WARNING  [SPC-002] No quantified thresholds found
    INFO     [CLR-007] Ambiguous quantifiers found: [some of]

  Suggestions:
    1. Replace vague phrases with imperative: 'try to extract' -> 'Extract'
    2. Add quantified thresholds where possible
═══════════════════════════════════════════════════════════════

Issue Severity Levels

Severity	Meaning
`CRITICAL`	Prompt is missing something fundamental (e.g., no output format)
`WARNING`	Quality problem that should be fixed (e.g., vague language)
`INFO`	Improvement opportunity (e.g., add more imperative verbs)

Requirements

Java 17+
No runtime dependencies beyond Jackson (for JSON validation in OutputContractAnalyzer)

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
.idea		.idea
src		src
.gitignore		.gitignore
README.md		README.md
pom.xml		pom.xml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

prompt-lint

Why use it?

Quality Dimensions

Getting Started

Add dependency

Basic usage

Use in JUnit tests

Validate multiple agents at once

Regression testing

Agent Type Profiles

Report Output

Issue Severity Levels

Requirements

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

prompt-lint

Why use it?

Quality Dimensions

Getting Started

Add dependency

Basic usage

Use in JUnit tests

Validate multiple agents at once

Regression testing

Agent Type Profiles

Report Output

Issue Severity Levels

Requirements

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages