[Inference API] Add Chat Completion to Amazon Bedrock for the Inference API #139411

jonathan-buttner · 2025-12-11T22:48:17Z

This PR implements chat completion for Amazon Bedrock. It is based on #133697.

Testing

Create the endpoint

PUT _inference/chat_completion/test-chat?timeout=30s
{
    "service": "amazonbedrock",
    "service_settings": {
        "provider": "anthropic",
        "model": "anthropic.claude-3-5-sonnet-20240620-v1:0",
        "region": "us-east-1",
        "access_key": "<access key>",
        "secret_key": "<secret>"
    }
}

Complex request

POST _inference/chat_completion/test-chat/_stream
{
    "messages": [
        {
            "role": "user",
            "content": "test"
        },
        {
            "role": "assistant",
            "content": "tool call",
            "tool_calls": [
                {
                    "function": {
                        "name": "context",
                        "arguments": "{}"
                    },
                    "id": "803434",
                    "type": "function"
                }
            ]
        },
        {
            "role": "tool",
            "content": "{\"screen_description\":\"The user is looking at http://localhost:5601/app/observability/overview?rangeFrom=now-15m&rangeTo=now. The current time range is 2024-12-13T01:18:25.752Z - 2024-12-13T01:33:25.752Z.\\n\\nThe user is viewing the Overview page which shows a summary of the following apps: {\\\"universal_profiling\\\":{\\\"hasData\\\":false,\\\"status\\\":\\\"success\\\"},\\\"alert\\\":{\\\"hasData\\\":false,\\\"status\\\":\\\"success\\\"},\\\"uptime\\\":{\\\"hasData\\\":false,\\\"indices\\\":\\\"heartbeat-*\\\",\\\"status\\\":\\\"success\\\"},\\\"infra_metrics\\\":{\\\"hasData\\\":false,\\\"indices\\\":\\\"metrics-*,metricbeat-*\\\",\\\"status\\\":\\\"success\\\"},\\\"ux\\\":{\\\"hasData\\\":false,\\\"indices\\\":\\\"traces-apm*,apm-*,traces-*.otel-*,logs-apm*,apm-*,logs-*.otel-*,metrics-apm*,apm-*,metrics-*.otel-*\\\",\\\"status\\\":\\\"success\\\"},\\\"infra_logs\\\":{\\\"hasData\\\":false,\\\"indices\\\":\\\"logs-*-*,logs-*,filebeat-*\\\",\\\"status\\\":\\\"success\\\"},\\\"apm\\\":{\\\"hasData\\\":false,\\\"indices\\\":{\\\"transaction\\\":\\\"traces-apm*,apm-*,traces-*.otel-*\\\",\\\"span\\\":\\\"traces-apm*,apm-*,traces-*.otel-*\\\",\\\"error\\\":\\\"logs-apm*,apm-*,logs-*.otel-*\\\",\\\"metric\\\":\\\"metrics-apm*,apm-*,metrics-*.otel-*\\\",\\\"onboarding\\\":\\\"apm-*\\\",\\\"sourcemap\\\":\\\"apm-*\\\"},\\\"status\\\":\\\"success\\\"}}\",\"learnings\":[]}",
            "tool_call_id": "803434"
        }
    ],
    "tool_choice": "auto",
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "query",
                "description": "This function generates, executes and/or visualizes a query\n      based on the user's request. It also explains how ES|QL works and how to\n      convert queries from one language to another. Make sure you call one of\n      the get_dataset functions first if you need index or field names. This\n      function takes no input.",
                "parameters": {
                    "type": "object",
                    "properties": {}
                }
            }
        },
        {
            "type": "function",
            "function": {
                "name": "get_alerts_dataset_info",
                "description": "Use this function to get information about alerts data.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "start": {
                            "type": "string",
                            "description": "The start of the current time range, in datemath, like now-24h or an ISO timestamp"
                        },
                        "end": {
                            "type": "string",
                            "description": "The end of the current time range, in datemath, like now-24h or an ISO timestamp"
                        }
                    }
                }
            }
        },
        {
            "type": "function",
            "function": {
                "name": "execute_connector",
                "description": "Use this function when user explicitly asks to call a kibana connector.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "id": {
                            "type": "string",
                            "description": "The id of the connector"
                        },
                        "params": {
                            "type": "object",
                            "description": "The connector parameters"
                        }
                    },
                    "required": [
                        "id",
                        "params"
                    ]
                }
            }
        }
    ]
}

elasticsearchmachine · 2025-12-11T22:48:42Z

Hi @jonathan-buttner, I've created a changelog YAML for you.

jonathan-buttner · 2025-12-11T22:52:08Z

...earch/xpack/inference/services/amazonbedrock/client/AmazonBedrockChatCompletionExecutor.java

-                inferenceResultsListener
-            );
-            chatCompletionRequest.executeChatCompletionRequest(awsBedrockClient, chatCompletionResponseListener);
+        // Chat completions only supports streaming


This is new

jonathan-buttner · 2025-12-11T22:53:23Z

.../inference/services/amazonbedrock/request/completion/AmazonBedrockChatCompletionRequest.java

-        } catch (IOException e) {
-            listener.onFailure(new RuntimeException(e));
-        }
+        throw new UnsupportedOperationException("Unsupported operation, use streaming execution instead");


This is new. This class is only used for chat completion. Chat completion does not support non-streaming requests for any provider (not just Bedrock), so this method is not needed.
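The streaming-only contract described in this comment can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `StreamingOnlyRequest`, its `String` chunk type, and the synchronous publisher are all assumptions made for the example. Only the exception message is taken from the diff above.

```java
import java.util.List;
import java.util.concurrent.Flow;

// Hypothetical stand-in for a streaming-only chat completion request.
// The name and the String chunk type are assumptions for illustration.
record StreamingOnlyRequest(List<String> chunks) {

    // Non-streaming execution is rejected, as in the diff above.
    String execute() {
        throw new UnsupportedOperationException("Unsupported operation, use streaming execution instead");
    }

    // Streaming execution returns a publisher. This toy version emits the
    // stored chunks synchronously as the subscriber requests them.
    Flow.Publisher<String> executeStream() {
        return subscriber -> subscriber.onSubscribe(new Flow.Subscription() {
            private int next = 0;
            private boolean done = false;

            @Override
            public void request(long n) {
                while (n-- > 0 && done == false && next < chunks.size()) {
                    subscriber.onNext(chunks.get(next++));
                }
                if (done == false && next == chunks.size()) {
                    done = true;
                    subscriber.onComplete();
                }
            }

            @Override
            public void cancel() {
                done = true;
            }
        });
    }
}
```

A caller that invokes `execute()` gets an immediate `UnsupportedOperationException`, while `executeStream()` delivers each chunk through `onNext` followed by `onComplete`.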

jonathan-buttner · 2025-12-11T22:53:33Z

.../inference/services/amazonbedrock/request/completion/AmazonBedrockChatCompletionRequest.java

-            .modelId(amazonBedrockModel.model())
-            .messages(getConverseMessageList(requestEntity.messages()))
-            .additionalModelResponseFieldPaths(requestEntity.additionalModelFields());
+    public Flow.Publisher<StreamingUnifiedChatCompletionResults.Results> executeStreamChatCompletionRequest(


This logic is all new

jonathan-buttner · 2025-12-11T22:53:55Z

.../inference/services/amazonbedrock/request/completion/AmazonBedrockChatCompletionRequest.java

 package org.elasticsearch.xpack.inference.services.amazonbedrock.request.completion;

-import software.amazon.awssdk.services.bedrockruntime.model.ConverseRequest;
+import software.amazon.awssdk.core.document.Document;


Please review the changes in this class carefully.

jonathan-buttner · 2025-12-11T22:54:55Z

...ch/xpack/inference/services/amazonbedrock/request/completion/AmazonBedrockConverseUtils.java


 package org.elasticsearch.xpack.inference.services.amazonbedrock.request.completion;

+import software.amazon.awssdk.core.document.Document;


Please review the changes in this class carefully; it is new content added after the other PR.

jonathan-buttner · 2025-12-11T22:55:41Z

...main/java/org/elasticsearch/xpack/inference/services/amazonbedrock/AmazonBedrockService.java

+    /**
+     * The task types that the {@link InferenceAction.Request} can accept.
+     */
+    private static final EnumSet<TaskType> SUPPORTED_INFERENCE_ACTION_TASK_TYPES = EnumSet.of(TaskType.TEXT_EMBEDDING, TaskType.COMPLETION);


This is new, so that we can return an error when a chat_completion endpoint is called without _stream.

jonathan-buttner · 2025-12-11T22:55:57Z

...main/java/org/elasticsearch/xpack/inference/services/amazonbedrock/AmazonBedrockService.java

-            action.execute(inputs, timeout, listener);
-        } else {
+        if (SUPPORTED_INFERENCE_ACTION_TASK_TYPES.contains(model.getTaskType()) == false) {
+            listener.onFailure(createUnsupportedTaskTypeStatusException(model, SUPPORTED_INFERENCE_ACTION_TASK_TYPES));


New change is here

jonathan-buttner · 2025-12-11T22:56:15Z

...rc/main/java/org/elasticsearch/xpack/inference/services/elastic/ElasticInferenceService.java

-                responseString = responseString + " " + useChatCompletionUrlMessage(model);
-            }
-            listener.onFailure(new ElasticsearchStatusException(responseString, RestStatus.BAD_REQUEST));
+            listener.onFailure(createUnsupportedTaskTypeStatusException(model, SUPPORTED_INFERENCE_ACTION_TASK_TYPES));


Using the same helper

jonathan-buttner · 2025-12-11T22:56:33Z

...inference/src/main/java/org/elasticsearch/xpack/inference/services/openai/OpenAiService.java

-                responseString = responseString + " " + useChatCompletionUrlMessage(model);
-            }
-            listener.onFailure(new ElasticsearchStatusException(responseString, RestStatus.BAD_REQUEST));
+            listener.onFailure(createUnsupportedTaskTypeStatusException(model, SUPPORTED_INFERENCE_ACTION_TASK_TYPES));


Using the same helper, and adding a return because we had a fall-through bug here.

Nice catch. Is it possible to add a test that would fail without the return for this (maybe you already have)?
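One way such a test could look is sketched below, using a listener that records every callback so that a second, unintended callback becomes visible. Everything here is a simplified assumption for illustration: `Listener`, `RecordingListener`, `GuardSketch`, and the `returnAfterFailure` flag are hypothetical and are not the real Elasticsearch types or the PR's test.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;

// Hypothetical, simplified stand-ins; not the real Elasticsearch types.
enum TaskType { TEXT_EMBEDDING, COMPLETION, CHAT_COMPLETION }

interface Listener {
    void onResponse(String result);
    void onFailure(Exception e);
}

// Records every callback so a test can assert that exactly one fired.
record RecordingListener(List<String> events) implements Listener {
    RecordingListener() {
        this(new ArrayList<>());
    }

    @Override
    public void onResponse(String result) {
        events.add("response:" + result);
    }

    @Override
    public void onFailure(Exception e) {
        events.add("failure:" + e.getMessage());
    }
}

final class GuardSketch {
    static final EnumSet<TaskType> SUPPORTED = EnumSet.of(TaskType.TEXT_EMBEDDING, TaskType.COMPLETION);

    private GuardSketch() {}

    // returnAfterFailure exists only so the sketch can show both behaviours.
    static void doInfer(TaskType taskType, Listener listener, boolean returnAfterFailure) {
        if (SUPPORTED.contains(taskType) == false) {
            listener.onFailure(new IllegalArgumentException("unsupported task type: " + taskType));
            if (returnAfterFailure) {
                return; // the fix: stop after reporting the failure
            }
            // Without the return, execution falls through to the call below.
        }
        listener.onResponse("executed");
    }
}
```

With the return in place the listener sees a single failure event. Without it the listener sees a failure followed by a response, which is what an assertion on the number of recorded events would catch.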

jonathan-buttner · 2025-12-11T22:56:43Z

.../plugin/inference/src/main/java/org/elasticsearch/xpack/inference/services/ServiceUtils.java

        );
    }

+    public static ElasticsearchStatusException createUnsupportedTaskTypeStatusException(Model model, EnumSet<TaskType> supportedTaskTypes) {


jonathan-buttner · 2025-12-12T19:24:38Z

.../elasticsearch/xpack/inference/services/amazonbedrock/client/AmazonBedrockExecutorTests.java

        assertThat(exceptionThrown.getCause().getMessage(), containsString("test exception"));
    }

+    public void testExecute_ChatCompletionRequest_NonStreaming_Fails() {


We're lacking coverage around returning streamed Converse responses and the stream processor logic that parses them. I'll work on those in a follow-up PR because I think we're going to need to refactor how we mock the internal client. Unfortunately, it's not as straightforward as testing streaming for other services that don't use an SDK, such as OpenAI.

Makes sense. I'm fine with this being in a follow-up PR.

elasticsearchmachine · 2025-12-12T19:26:53Z

Pinging @elastic/search-inference-team (Team:Search - Inference)

dan-rubinstein · 2025-12-12T20:17:44Z

.../elasticsearch/xpack/inference/services/amazonbedrock/client/AmazonBedrockExecutorTests.java

        assertThat(exceptionThrown.getCause().getMessage(), containsString("test exception"));
    }

+    public void testExecute_ChatCompletionRequest_NonStreaming_Fails() {


Makes sense. I'm fine with this being in a follow-up PR.

DonalEvans

I wasn't able to finish looking at all the test changes today, but I have a few comments/questions for the non-test code.

...earch/xpack/inference/services/amazonbedrock/client/AmazonBedrockChatCompletionExecutor.java

...ticsearch/xpack/inference/services/amazonbedrock/client/AmazonBedrockStreamingProcessor.java

DonalEvans · 2025-12-12T22:50:48Z

...arch/xpack/inference/services/amazonbedrock/completion/AmazonBedrockChatCompletionModel.java

+        var requestTaskSettings = AmazonBedrockCompletionRequestTaskSettings.fromMap(taskSettings);
+        var taskSettingsToUse = AmazonBedrockCompletionTaskSettings.of(completionModel.getTaskSettings(), requestTaskSettings);
        return new AmazonBedrockChatCompletionModel(completionModel, taskSettingsToUse);


If requestTaskSettings is equal to taskSettingsToUse, we can return the original model and avoid creating an identical object.

AmazonBedrockCompletionRequestTaskSettings and AmazonBedrockCompletionTaskSettings are separate classes. I'll check whether the result of AmazonBedrockCompletionTaskSettings.of is the same as the task settings from the model passed in; if so, we can return the same completionModel.
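The check proposed in this reply might look roughly like the sketch below. The records `RequestTaskSettings`, `TaskSettings`, and `ChatModel`, along with their fields, are hypothetical simplifications and do not reflect the actual fields or merge rules of the Amazon Bedrock settings classes.

```java
// Hypothetical, simplified stand-ins for the two settings classes and the model.
record RequestTaskSettings(Double temperature, Integer maxNewTokens) {}

record TaskSettings(Double temperature, Integer maxNewTokens) {
    // A request-level value wins when present; otherwise the original is kept.
    static TaskSettings of(TaskSettings original, RequestTaskSettings request) {
        return new TaskSettings(
            request.temperature() != null ? request.temperature() : original.temperature(),
            request.maxNewTokens() != null ? request.maxNewTokens() : original.maxNewTokens()
        );
    }
}

record ChatModel(String modelId, TaskSettings taskSettings) {
    static ChatModel of(ChatModel model, RequestTaskSettings request) {
        TaskSettings merged = TaskSettings.of(model.taskSettings(), request);
        // Compare the merged result against the model's own settings and
        // reuse the existing instance when nothing changed.
        if (merged.equals(model.taskSettings())) {
            return model;
        }
        return new ChatModel(model.modelId(), merged);
    }
}
```

Because the comparison is made on the merged settings, a request that repeats the model's existing values also returns the original instance, not only a request with no overrides at all.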

...arch/xpack/inference/services/amazonbedrock/completion/AmazonBedrockChatCompletionModel.java

...h/xpack/inference/services/amazonbedrock/completion/AmazonBedrockCompletionTaskSettings.java

...ch/xpack/inference/services/amazonbedrock/request/completion/AmazonBedrockConverseUtils.java

DonalEvans · 2025-12-13T00:50:00Z

...inference/src/main/java/org/elasticsearch/xpack/inference/services/openai/OpenAiService.java

-                responseString = responseString + " " + useChatCompletionUrlMessage(model);
-            }
-            listener.onFailure(new ElasticsearchStatusException(responseString, RestStatus.BAD_REQUEST));
+            listener.onFailure(createUnsupportedTaskTypeStatusException(model, SUPPORTED_INFERENCE_ACTION_TASK_TYPES));


Nice catch. Is it possible to add a test that would fail without the return for this (maybe you already have)?

...java/org/elasticsearch/xpack/inference/services/amazonbedrock/AmazonBedrockServiceTests.java

Evgenii-Kazannik and others added 30 commits October 17, 2025 13:40

Add Amazon Bedrock Unified Chat Completions support

515f23d

[CI] Auto commit changes from spotless

b89c8a8

Add Amazon Bedrock Unified Chat Completions support

0430315

[CI] Auto commit changes from spotless

ac49494

Add Amazon Bedrock Unified Chat Completions support

9cc8bf9

[CI] Auto commit changes from spotless

d37c898

Add Amazon Bedrock Unified Chat Completions support

d641f19

Add Amazon Bedrock Unified Chat Completions support

cdcc815

[CI] Auto commit changes from spotless

de92840

Add Amazon Bedrock Unified Chat Completions support

731cb7c

[CI] Auto commit changes from spotless

4c3d8db

Revert commit

fcfb587

Add Amazon Bedrock Unified Chat Completions support

db1457d

Comment tool calling

63a424a

[CI] Auto commit changes from spotless

48e63cc

Use a proper type

eaa59bc

Resolve response stuck issue

12843cb

Resolve response stuck issue

01a8f43

Resolve response stuck issue

36cb3e8

Resolve response stuck issue and update ToolAwareUnifiedPublisher

0299ef6

Merge branch 'main' into Add-Amazon-Bedrock-Unified-Chat-Completions-…

aad2ca0

…support

[CI] Auto commit changes from spotless

50c77b9

fix throttling exception

c7ef4b1

[CI] Auto commit changes from spotless

385a9cb

fix validation exception

70e7e7e

Merge remote-tracking branch 'origin/Add-Amazon-Bedrock-Unified-Chat-…

d68a28f

…Completions-support' into Add-Amazon-Bedrock-Unified-Chat-Completions-support

Add Amazon Bedrock Unified Chat Completions support

eafd5c6

Merge branch 'main' into Add-Amazon-Bedrock-Unified-Chat-Completions-…

5bb01b9

…support

apply spotless

293f1b7

Merge branch 'main' of github.com:elastic/elasticsearch into ia-bedro…

bbfc154

…ck-completions

jonathan-buttner added the >enhancement, :SearchOrg/Inference, Team:Search - Inference, and v9.3.0 labels Dec 11, 2025

Update docs/changelog/139411.yaml

d99fe83

jonathan-buttner commented Dec 11, 2025

jonathan-buttner added 3 commits December 12, 2025 14:19

Adding translation tests

72c3f9e

Merge branch 'ia-bedrock-completions' of github.com:jonathan-buttner/…

dd5f21a

…elasticsearch into ia-bedrock-completions

Merge branch 'main' of github.com:elastic/elasticsearch into ia-bedro…

f575962

…ck-completions

jonathan-buttner commented Dec 12, 2025

Fixing the change log

8f8e4f7

jonathan-buttner marked this pull request as ready for review December 12, 2025 19:26

jonathan-buttner requested review from DonalEvans and dan-rubinstein December 12, 2025 19:30

dan-rubinstein approved these changes Dec 12, 2025

jonathan-buttner and others added 2 commits December 12, 2025 17:31

Merge branch 'main' into ia-bedrock-completions

f6fff93

Beginning translation to openai for the response format

08c7931

DonalEvans reviewed Dec 13, 2025

jonathan-buttner added 2 commits December 15, 2025 16:40

Working usage and translation back to chat completion

84dd641

Merge branch 'main' of github.com:elastic/elasticsearch into ia-bedro…

8dc85c6

…ck-completions

jonathan-buttner mentioned this pull request Dec 16, 2025

Add Amazon Bedrock Unified Chat Completions support #133697

Closed

Merge branch 'ia-bedrock-completions' of github.com:jonathan-buttner/…

07d8050

…elasticsearch into ia-bedrock-completions

jonathan-buttner added the Feature:GenAI label Dec 16, 2025

elasticsearchmachine added v9.4.0 and removed v9.3.0 labels Dec 17, 2025

jonathan-buttner added 2 commits December 22, 2025 09:38

Merge branch 'main' of github.com:elastic/elasticsearch into ia-bedro…

91763be

…ck-completions

Addressing feedback

5953e9b


5 participants
