Run AI Models
NuxtHub AI lets you integrate machine learning models into your Nuxt application. Built on top of Workers AI, it provides a simple and intuitive API that supports models for text generation, image generation, embeddings, and more.
const response = await hubAI().run('@cf/meta/llama-3.1-8b-instruct', {
prompt: 'Who is the author of Nuxt?'
})
Getting Started
Enable AI in your NuxtHub project by adding the ai property to the hub object in your nuxt.config.ts file.
export default defineNuxtConfig({
hub: {
ai: true
},
})
Local Development
During development, hubAI() will call the Cloudflare API. Make sure to run npx nuxthub link to create or link a NuxtHub project (even if the project is empty). This project is where your AI models will run.
NuxtHub AI will always run AI models on your Cloudflare account, including during local development. See pricing and included free quotas on Cloudflare's documentation.
Models
Workers AI comes with a curated set of popular open-source models that enable you to do tasks such as image classification, text generation, object detection, and more.
See all Workers AI models
hubAI()
hubAI() is a server composable that returns a Workers AI client.
const ai = hubAI()
run()
Runs a model. Takes the model as the first parameter and an options object as the second parameter.
export default defineEventHandler(async () => {
const ai = hubAI() // access AI bindings
return await ai.run('@cf/meta/llama-3.1-8b-instruct', {
prompt: 'Who is the author of Nuxt?'
})
})
Options
- The model to run.
- The model options.
- Options for configuring AI Gateway: id, skipCache, and cacheTtl.
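As a sketch of these options together, the example below passes a messages array and a max_tokens limit (both values are illustrative) and reads the generated text from the response property that Workers AI text-generation models typically return:
export default defineEventHandler(async () => {
  const ai = hubAI()

  // Pass model options alongside the model name (values here are illustrative)
  const result = await ai.run('@cf/meta/llama-3.1-8b-instruct', {
    messages: [
      { role: 'system', content: 'You are a concise assistant' },
      { role: 'user', content: 'Summarize what NuxtHub is in one sentence.' }
    ],
    max_tokens: 256
  })

  // Text-generation models typically return the generated text on `response`
  return result.response
})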
Tools
Tools are actions that your LLM can execute to run functions or interact with external APIs. The result of these tools will be used by the LLM to generate additional responses.
This can help you supply the LLM with real-time information, save data to a KV store, or provide it with external data from your database.
With Workers AI, tools have 4 properties:
- name: The name of the tool
- description: A description of the tool that will be used by the LLM to understand what the tool does. This allows it to determine when to use the tool
- parameters: The parameters that the tool accepts.
- function: The function that will be executed when the tool is called.
const tools = [
{
name: 'get-weather',
description: 'Gets the weather for a given city',
parameters: {
type: 'object',
properties: {
city: {
type: 'string',
description: 'The city to retrieve weather information for'
},
},
required: ['city'],
},
function: ({ city }) => {
// use an API to get the weather information
return '72'
},
}
]
Tool Fields
- name: The name of the tool
- description: A description of the tool that will be used by the LLM to understand what the tool does. This allows it to determine when to use the tool
- parameters: The parameters and options for parameters that the model will use to run the tool.
- function: The function that the LLM can execute.
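Tools defined this way can also be passed to hubAI().run() directly, in which case you execute the requested tools yourself and send their results back for a final answer. The sketch below assumes the result exposes the requested calls on a tool_calls array and that tool results can be appended as role: 'tool' messages; check Cloudflare's function calling documentation for the exact shapes your model expects. The runWithTools() helper described next automates this loop.
export default defineEventHandler(async () => {
  const ai = hubAI()
  const messages: any[] = [
    { role: 'user', content: 'What is the weather in New York?' }
  ]

  // First pass: give the model the tool definitions (without the function property)
  const result: any = await ai.run('@cf/meta/llama-3.1-8b-instruct', {
    messages,
    tools: [{
      name: 'get-weather',
      description: 'Gets the weather for a given city',
      parameters: {
        type: 'object',
        properties: {
          city: { type: 'string', description: 'The city to retrieve weather information for' }
        },
        required: ['city']
      }
    }]
  })

  // Execute each requested tool and append its result
  // (the role: 'tool' message shape is an assumption, see Cloudflare's docs)
  for (const toolCall of result.tool_calls ?? []) {
    if (toolCall.name === 'get-weather') {
      messages.push({ role: 'tool', name: toolCall.name, content: '72' })
    }
  }

  // Second pass: ask the model to answer using the tool results
  return ai.run('@cf/meta/llama-3.1-8b-instruct', { messages })
})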
runWithTools()
The @cloudflare/ai-utils package provides a runWithTools function that will handle the recursive calls to the LLM with the result of the tools.
npx nypm i @cloudflare/ai-utils
runWithTools works with multi-tool calls, handles errors, and has the same return type as hubAI().run(), so any code relying on the response from a model can remain the same.
import { runWithTools } from '@cloudflare/ai-utils'
export default defineEventHandler(async (event) => {
return await runWithTools(hubAI(), '@cf/meta/llama-3.1-8b-instruct',
{
messages: [
{ role: 'user', content: 'What is the weather in New York?' },
],
tools: [
{
name: 'get-weather',
description: 'Gets the weather for a given city',
parameters: {
type: 'object',
properties: {
city: {
type: 'string',
description: 'The city to retrieve weather information for'
},
},
required: ['city'],
},
function: ({ city }) => {
// use an API to get the weather information
return '72'
},
},
]
},
{
// options
streamFinalResponse: true,
maxRecursiveToolRuns: 1,
}
)
})
Params
- Your AI binding (hubAI())
- The model to run
- The messages and tools to use for the model
- An object of optional properties that can be passed to the model, such as streamFinalResponse and maxRecursiveToolRuns.
Learn more in the runWithTools() documentation.
AI Gateway
Workers AI is compatible with AI Gateway, which enables caching responses, analytics, real-time logging, rate limiting, and fallback providers. Learn more about AI Gateway.
Options
Configure options for AI Gateway by passing an additional object to hubAI().run(). Learn more on Cloudflare's docs.
export default defineEventHandler(async () => {
const ai = hubAI()
return await ai.run('@cf/meta/llama-3-8b-instruct',
{
prompt: 'Who is the creator of Nuxt?'
},
{
gateway: {
id: '{gateway_slug}',
skipCache: false,
cacheTtl: 3360
}
})
})
- id: Name of your existing AI Gateway. Must be in the same Cloudflare account as your NuxtHub application.
- skipCache: Controls whether the request should skip the cache.
- cacheTtl: Controls the Cache TTL, the duration (in seconds) that a cached request will be valid for. The minimum TTL is 60 seconds and the maximum is one month.
Streaming
The recommended method to handle text generation responses is streaming.
LLMs work internally by generating responses sequentially through a process of repeated inference: the full output of an LLM is essentially a sequence of hundreds or thousands of individual prediction tasks. For this reason, while it only takes a few milliseconds to generate a single token, generating the full response takes longer.
If your UI waits for the entire response to be generated, a user may see a loading spinner for several seconds before the response is displayed.
Streaming lets you start displaying the response as soon as the first tokens are generated, and append each additional token until the response is complete. This yields a much better experience for the end user. Displaying text incrementally as it’s generated not only provides instant responsiveness, but also gives the end-user time to read and interpret the text.
To enable streaming, set the stream parameter to true.
You can check if the model you're using supports streaming on Cloudflare's models documentation.
export default defineEventHandler(async (event) => {
const messages = [
{ role: 'system', content: 'You are a friendly assistant' },
{ role: 'user', content: 'What is the origin of the phrase Hello, World?' }
]
const ai = hubAI()
const stream = await ai.run('@cf/meta/llama-3.1-8b-instruct', {
stream: true,
messages
})
return stream
})
Handling Streaming Responses
To manually handle streaming responses, you can use ReadableStream and Nuxt's $fetch function to create a new ReadableStream from the response.
Creating a reader allows you to process the stream in chunks as it's received.
const response = await $fetch<ReadableStream>('/api/chats/ask-ai', {
method: 'POST',
body: {
query: "Hello AI, how are you?",
},
responseType: 'stream',
})
// Create a new ReadableStream from the response with TextDecoderStream to get the data as text
const reader = response.pipeThrough(new TextDecoderStream()).getReader()
// Read the chunks of data as they're received
while (true) {
const { value, done } = await reader.read()
if (done)
break
console.log('Received:', value)
}
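The /api/chats/ask-ai endpoint called above isn't shown in this snippet; here is a minimal sketch of what it could look like, reusing the streaming pattern from the earlier example (the route path and query handling mirror the client snippet and are illustrative):
// server/api/chats/ask-ai.post.ts (illustrative path matching the client snippet)
export default defineEventHandler(async (event) => {
  const { query } = await readBody(event)

  // Stream the model output back to the client
  return hubAI().run('@cf/meta/llama-3.1-8b-instruct', {
    stream: true,
    messages: [
      { role: 'system', content: 'You are a friendly assistant' },
      { role: 'user', content: query }
    ]
  })
})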
Vercel AI SDK
Another way to handle streaming responses is to use Vercel's AI SDK with hubAI().
This uses the Workers AI Provider, which supports a subset of Vercel AI features. tools and streamObject are currently not supported.
To get started, install the Vercel AI SDK and the Cloudflare AI Provider in your project.
npx nypm i ai @ai-sdk/vue workers-ai-provider
nypm will detect your package manager and install the dependencies with it.
useChat()
useChat() is a Vue composable provided by the Vercel AI SDK that handles streaming responses, API calls, and state for your chat.
It requires a POST /api/chat endpoint that uses the hubAI() server composable and returns a compatible stream for the Vercel AI SDK.
import { streamText } from 'ai'
import { createWorkersAI } from 'workers-ai-provider'
export default defineEventHandler(async (event) => {
const { messages } = await readBody(event)
const workersAI = createWorkersAI({ binding: hubAI() })
return streamText({
model: workersAI('@cf/meta/llama-3.1-8b-instruct'),
messages
}).toDataStreamResponse()
})
Then, we can create a chat component that uses the useChat() composable.
<script setup lang="ts">
import { useChat } from '@ai-sdk/vue'
const { messages, input, handleSubmit, isLoading, stop, error, reload } = useChat()
</script>
<template>
<div v-for="m in messages" :key="m.id">
{{ m.role }}: {{ m.content }}
</div>
<div v-if="error">
<div>{{ error.message || 'An error occurred' }}</div>
<button @click="reload">retry</button>
</div>
<form @submit="handleSubmit">
<input v-model="input" placeholder="Type here..." />
<button v-if="isLoading" @click="stop">stop</button>
<button v-else type="submit">send</button>
</form>
</template>
Learn more about the useChat() composable.
Check out the pages/ai.vue full example with Nuxt UI & Nuxt MDC.
Templates
Explore open source templates made by the community:
Pricing
|  | Free | Workers Paid ($5/month) |
|---|---|---|
| Text Generation | 10,000 tokens / day | 10,000 tokens / day + starting at $0.10 / million tokens |
| Embeddings | 10,000 tokens / day | 10,000 tokens / day + starting at $0.10 / million tokens |
| Images | 250 steps (up to 1024x1024) / day | 250 steps / day + starting at $0.00125 per 25 steps |
| Speech to Text | 10 minutes of audio / day | 10 minutes of audio / day + $0.0039 per minute of audio input |