How to select examples from a LangSmith dataset
langsmith>=0.1.101
, langchain-core>=0.2.34
. Please ensure you have the correct packages installed.LangSmith datasets have built-in support for similarity search, making them a great tool for building and querying few-shot examples.
In this guide we'll see how to use an indexed LangSmith dataset as a few-shot example selector.
Setup
Before getting started make sure you've created a LangSmith account and set your credentials:
import getpass
import os
if not os.environ.get("LANGSMITH_API_KEY"):
os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Set LangSmith API key:\n\n")
os.environ["LANGSMITH_TRACING"] = "true"
Set LangSmith API key:
········
We'll need to install the langsmith
SDK. In this example we'll also make use of langchain
, langchain-openai
, and langchain-benchmarks
:
%pip install -qU "langsmith>=0.1.101" "langchain-core>=0.2.34" langchain langchain-openai langchain-benchmarks
Now we'll clone a public dataset and turn on indexing for the dataset. We can also turn on indexing via the LangSmith UI.
We'll clone the Multiverse math few shot example dataset.
This enables searching over the dataset and will make sure that anytime we update/add examples they are also indexed.
from langsmith import Client as LangSmith
ls_client = LangSmith()
dataset_name = "multiverse-math-few-shot-examples-v2"
dataset_public_url = (
"https://smith.langchain.com/public/620596ee-570b-4d2b-8c8f-f828adbe5242/d"
)
ls_client.clone_public_dataset(dataset_public_url)
dataset_id = ls_client.read_dataset(dataset_name=dataset_name).id
ls_client.index_dataset(dataset_id=dataset_id)
Querying dataset
Indexing can take a few seconds. Once the dataset is indexed, we can search for similar examples. Note that the input to the similar_examples
method must have the same schema as the examples inputs. In this case our example inputs are a dictionary with a "question" key:
examples = ls_client.similar_examples(
{"question": "whats the negation of the negation of the negation of 3"},
limit=3,
dataset_id=dataset_id,
)
len(examples)
3
examples[0].inputs["question"]
'evaluate the negation of -100'
For this dataset, the outputs are the conversation that followed the question in OpenAI message format:
examples[0].outputs["conversation"]
[{'role': 'assistant',
'content': None,
'tool_calls': [{'id': 'toolu_01HTpq4cYNUac6F7omUc2Wz3',
'type': 'function',
'function': {'name': 'negate', 'arguments': '{"a": -100}'}}]},
{'role': 'tool',
'content': '-100.0',
'tool_call_id': 'toolu_01HTpq4cYNUac6F7omUc2Wz3'},
{'role': 'assistant', 'content': 'So the answer is 100.'},
{'role': 'user',
'content': '100 is incorrect. Please refer to the output of your tool call.'},
{'role': 'assistant',
'content': [{'text': "You're right, my previous answer was incorrect. Let me re-evaluate using the tool output:",
'type': 'text'}],
'tool_calls': [{'id': 'toolu_01XsJQboYghGDygQpPjJkeRq',
'type': 'function',
'function': {'name': 'negate', 'arguments': '{"a": -100}'}}]},
{'role': 'tool',
'content': '-100.0',
'tool_call_id': 'toolu_01XsJQboYghGDygQpPjJkeRq'},
{'role': 'assistant', 'content': 'The answer is -100.0'},
{'role': 'user',
'content': 'You have the correct numerical answer but are returning additional text. Please only respond with the numerical answer.'},
{'role': 'assistant', 'content': '-100.0'}]
Creating dynamic few-shot prompts
The search returns the examples whose inputs are most similar to the query input. We can use this for few-shot prompting a model like so:
from langchain.chat_models import init_chat_model
from langchain_benchmarks.tool_usage.tasks.multiverse_math import (
add,
cos,
divide,
log,
multiply,
negate,
pi,
power,
sin,
subtract,
)
from langchain_core.runnables import RunnableLambda
from langsmith import AsyncClient as AsyncLangSmith
async_ls_client = AsyncLangSmith()
def similar_examples(input_: dict) -> dict:
examples = ls_client.similar_examples(input_, limit=5, dataset_id=dataset_id)
return {**input_, "examples": examples}
async def asimilar_examples(input_: dict) -> dict:
examples = await async_ls_client.similar_examples(
input_, limit=5, dataset_id=dataset_id
)
return {**input_, "examples": examples}
def construct_prompt(input_: dict) -> list:
instructions = """You are great at using mathematical tools."""
examples = []
for ex in input_["examples"]:
examples.append({"role": "user", "content": ex.inputs["question"]})
for msg in ex.outputs["conversation"]:
if msg["role"] == "assistant":
msg["name"] = "example_assistant"
if msg["role"] == "user":
msg["name"] = "example_user"
examples.append(msg)
return [
{"role": "system", "content": instructions},
*examples,
{"role": "user", "content": input_["question"]},
]
tools = [add, cos, divide, log, multiply, negate, pi, power, sin, subtract]
llm = init_chat_model("gpt-4o-2024-08-06")
llm_with_tools = llm.bind_tools(tools)
example_selector = RunnableLambda(func=similar_examples, afunc=asimilar_examples)
chain = example_selector | construct_prompt | llm_with_tools
ai_msg = await chain.ainvoke({"question": "whats the negation of the negation of 3"})
ai_msg.tool_calls
[{'name': 'negate',
'args': {'a': 3},
'id': 'call_uMSdoTl6ehfHh5a6JQUb2NoZ',
'type': 'tool_call'}]
Looking at the LangSmith trace, we can see that relevant examples were pulled in in the similar_examples
step and passed as messages to ChatOpenAI: https://smith.langchain.com/public/9585e30f-765a-4ed9-b964-2211420cd2f8/r/fdea98d6-e90f-49d4-ac22-dfd012e9e0d9.