When you send a prompt to the AI in Visual Studio Code, the language model provider can reuse the parts of your request that match a previous request. This is called prompt caching. The Cache Explorer view helps you diagnose prompt cache misses by comparing consecutive model requests in a chat session, which helps you reduce token cost and latency.
The Cache Explorer is one of the views in the Agent Debug Logs panel.
Prompt caching lets a model provider reuse the prefix of a request that matches a previous one. When the beginning of a request is identical to the previous request, the provider serves that portion from cache instead of processing it again. A higher cache hit rate reduces both latency and token cost.
The cache only applies to the matching prefix of a request. As soon as the content of two consecutive requests diverges, everything after that point is a cache miss. Small changes early in the prompt, such as a reordered tool definition or a modified instruction, break the cache for the rest of the request. The Cache Explorer pinpoints exactly where the prompt prefix diverges, which helps you understand and address a low cache hit rate.
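The prefix rule above can be illustrated with a minimal sketch. It assumes a request is modeled as an ordered list of component strings; the names `first_divergence` and `cached_fraction` are hypothetical and not part of VS Code or any provider API. Real providers match on tokens and may apply their own block sizes or minimum lengths, so treat this as an approximation of the idea only.

```python
from typing import Sequence


def first_divergence(previous: Sequence[str], current: Sequence[str]) -> int:
    """Return the index of the first component that differs between two requests."""
    for index, (old, new) in enumerate(zip(previous, current)):
        if old != new:
            return index
    # One request is a prefix of the other, so they diverge where the shorter one ends.
    return min(len(previous), len(current))


def cached_fraction(previous: Sequence[str], current: Sequence[str]) -> float:
    """Share of the current request's components that match the previous prefix."""
    if not current:
        return 0.0
    return first_divergence(previous, current) / len(current)


# Hypothetical requests: the two tool definitions swap places in the second one.
previous_request = ["system prompt", "tool: read_file", "tool: run_tests", "user: fix the bug"]
current_request = [
    "system prompt",
    "tool: run_tests",
    "tool: read_file",
    "user: fix the bug",
    "user: add a test",
]

print(first_divergence(previous_request, current_request))  # 1
print(cached_fraction(previous_request, current_request))   # 0.2
```

In this sketch only the system prompt is reusable: the swapped tool definitions at index 1 turn every later component into a cache miss, even though most of the content is unchanged.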
Open the Agent Debug Logs panel by selecting the ellipsis (...) menu in the Chat view, and then selecting Show Agent Debug Logs.
Select the session description in the breadcrumb at the top to go to the Summary view.
Select Cache Explorer to open the Cache Explorer view for the selected session.

The Cache Explorer has two panels:
The main content area includes the following information:
To find the cause of a cache miss, locate the first divergence point in the prompt signature, then expand the corresponding component to see the exact text that changed between the two requests.
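If you copy the text of a diverging component from two consecutive requests, you can reproduce the comparison outside the editor. The sketch below uses Python's standard `difflib` module; the function name `changed_lines` and the sample instruction text are hypothetical, and this is not how the Cache Explorer itself computes its view.

```python
import difflib


def changed_lines(before: str, after: str) -> list[str]:
    """Return only the removed (-) and added (+) lines between two component texts."""
    diff = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile="previous request",
        tofile="current request",
        lineterm="",
    )
    return [
        line
        for line in diff
        # Keep content changes, skip the "---" and "+++" file header lines.
        if line.startswith(("+", "-")) and not line.startswith(("---", "+++"))
    ]


# Hypothetical instruction text that embeds a value that changes between turns.
before = "You are a coding assistant.\nToday is Monday.\nFollow the repository style guide."
after = "You are a coding assistant.\nToday is Tuesday.\nFollow the repository style guide."

for line in changed_lines(before, after):
    print(line)
# -Today is Monday.
# +Today is Tuesday.
```

A changing value like this one, placed early in the prompt, is the kind of small edit that invalidates the cache for everything after it.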
Prompt caching works best when the early parts of your requests stay stable across turns. Use the following practices to keep the prompt prefix consistent:
Use /compact to rebuild from a short summary instead of the full history.
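For code that assembles its own model requests, one way to avoid the reordered tool definition problem described earlier is to serialize tools in a deterministic order. This is a minimal sketch under that assumption; `serialize_tools` and the tool dictionaries are hypothetical and do not reflect how VS Code builds its prompts.

```python
import json


def serialize_tools(tools: list[dict]) -> str:
    """Serialize tool definitions so the same set always yields the same text."""
    # Sort by tool name so registration order does not affect the prompt prefix.
    ordered = sorted(tools, key=lambda tool: tool["name"])
    # sort_keys and fixed separators keep key order and whitespace stable too.
    return json.dumps(ordered, sort_keys=True, separators=(",", ":"))


# Hypothetical tool definitions, registered in a different order on each turn.
turn_one = [
    {"name": "read_file", "description": "Read a file from the workspace"},
    {"name": "run_tests", "description": "Run the test suite"},
]
turn_two = [
    {"description": "Run the test suite", "name": "run_tests"},
    {"name": "read_file", "description": "Read a file from the workspace"},
]

print(serialize_tools(turn_one) == serialize_tools(turn_two))  # True
```

Because both turns produce byte-identical text, the tool section of the prompt no longer introduces a divergence point by itself.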