🐢 Open-Source Evaluation & Testing library for LLM Agents
Agentic testing for agentic codebases
Deliver safe & effective language models
AI-powered E2E testing for 10 platforms. 253 MCP tools. Zero config. Works with Claude, Cursor, Windsurf, Copilot. Test Flutter, React Native, iOS, Android, Web, Electron, Tauri, KMP, .NET MAUI — all from natural language.
MIT-licensed framework for testing LLMs, RAGs, and chatbots. Configurable via YAML and integrable into CI pipelines for automated testing.
QA Skills is a curated directory of testing-specific skills for AI coding agents (Claude Code, Cursor, Copilot, etc.).
GPT4Go: AI-Powered Test Case Generation for Golang 🧪
52-week journey from QA/SDET to GenAI Testing - learning in public with weekly mini-projects, code, and honest documentation of struggles and wins.
👁 Zero-code, zero-annotation automated testing tool for CV AI 🚀 Skips the heavy manual work of drawing bounding boxes and labeling; run fast, zero-code automated tests of computer-vision AI image-recognition algorithms: pedestrian detection, animal/plant classification, face recognition, OCR license-plate recognition, rotation correction, dance pose estimation, matting/segmentation, and more. Also supports one-click download of test reports and export of training and test datasets.
A Python library for verifying code properties using natural language assertions.
Convert plain English test specs into self-healing Playwright tests using AI. Browser exploration, auto-fix, load/security/API/LLM testing. Open source.
Eval framework. Define correct behavior, test against it, get results.
Ship evals before you ship features.
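The "define correct, test against it" loop that eval frameworks like this automate can be sketched in a few lines of plain Python (a hypothetical harness for illustration, not any particular library's API; `summarize` and `must_contain` are made-up names):

```python
# Minimal eval-harness sketch: declare cases with expected properties,
# run the system under test, and report an aggregate pass rate.

def summarize(text: str) -> str:
    # Stand-in for the system under test; a real harness would call an LLM here.
    return text.split(".")[0] + "."

CASES = [
    {"input": "Cats sleep a lot. They also purr.", "must_contain": "Cats"},
    {"input": "Rust is fast. It is also safe.", "must_contain": "Rust"},
]

def run_evals(cases) -> float:
    passed = [case["must_contain"] in summarize(case["input"]) for case in cases]
    return sum(passed) / len(passed)  # pass rate in [0, 1]

if __name__ == "__main__":
    print(f"pass rate: {run_evals(CASES):.0%}")
```

The point of running such a suite before shipping is that "correct" is pinned down as data (the cases), so any model or prompt change is scored against the same fixed target.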
🚀 First multimodal AI-powered visual testing plugin for Claude Code. AI that can SEE your UI! 10x faster frontend development with closed-loop testing, browser automation, and Claude Sonnet 4.5 vision.
A professional collection of AI prompts for QA (Quality Assurance) professionals, designed to help test engineers and QA teams work more efficiently throughout the software testing lifecycle.
Open-source framework for stress-testing LLMs and conversational AI. Identify hallucinations, policy violations, and edge cases with scalable, realistic simulations. Join the discord: https://discord.gg/ssd4S37WNW
Automated testing for Model Context Protocol servers. Ship MCP Servers with confidence.
Postman for AI - design, evaluate, and debug LLM interactions with full transparency.
Statistical evaluation framework for AI agents
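Why "statistical" matters here: agent runs are stochastic, so a single pass or fail says little. A minimal sketch of the underlying idea (hedged: a generic illustration using the standard Wilson score interval, not this framework's actual API) estimates a pass rate with a confidence interval over repeated runs:

```python
import math

def wilson_interval(successes: int, trials: int, z: float = 1.96):
    """95% Wilson score confidence interval for a pass rate
    estimated from repeated agent runs."""
    if trials == 0:
        raise ValueError("need at least one trial")
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = (z / denom) * math.sqrt(
        p * (1 - p) / trials + z**2 / (4 * trials**2)
    )
    return center - half, center + half

# Example: an agent passes 17 of 20 repeated runs of the same task.
low, high = wilson_interval(17, 20)
# The observed 85% pass rate comes with a wide interval at n=20,
# which is exactly why repeated trials (and more of them) are needed.
```

With only 20 trials the interval spans roughly 0.64 to 0.95, so two agents with observed rates of 80% and 90% may not be distinguishable without more runs.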
MCP server for comprehensive AI testing, evaluation, and quality assurance