Sequential RAG: Naive

Ask questions about documents with LLMs via Ollama and PGVector.

Ollama

The application consumes models from an Ollama inference server. You can either run Ollama locally on your laptop or rely on the Testcontainers support in Spring Boot to spin up an Ollama service automatically. If you choose the first option, make sure Ollama is installed and running before you start the application. Either way, Spring AI takes care of pulling the required Ollama models at startup if they are not yet available on your machine.
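
The model pulling behavior is driven by Spring AI's Ollama configuration properties. A minimal sketch, assuming recent Spring AI property names; the model names below and the pull-strategy property are illustrative and may differ from what this project configures:

# Models used for chat and embeddings (illustrative names)
spring.ai.ollama.chat.options.model=mistral
spring.ai.ollama.embedding.options.model=nomic-embed-text
# Pull models that are missing locally when the application starts
# (property available in recent Spring AI versions)
spring.ai.ollama.init.pull-model-strategy=when_missing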

Running the application

The application relies on the native Testcontainers support in Spring Boot to spin up a PostgreSQL database with the pgvector extension for embeddings.
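
As a rough sketch of how that wiring typically looks (class name and image tag are illustrative, not necessarily what this project uses), a test configuration declares the pgvector container and Spring Boot derives the datasource settings from it via @ServiceConnection:

import org.springframework.boot.test.context.TestConfiguration;
import org.springframework.boot.testcontainers.service.connection.ServiceConnection;
import org.springframework.context.annotation.Bean;
import org.testcontainers.containers.PostgreSQLContainer;
import org.testcontainers.utility.DockerImageName;

@TestConfiguration(proxyBeanMethods = false)
class TestcontainersConfiguration {

    // Starts a PostgreSQL container built from the pgvector image;
    // @ServiceConnection lets Spring Boot configure the datasource from it.
    @Bean
    @ServiceConnection
    PostgreSQLContainer<?> pgvector() {
        return new PostgreSQLContainer<>(
                DockerImageName.parse("pgvector/pgvector:pg16")
                        .asCompatibleSubstituteFor("postgres"));
    }
}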

If you're using the native Ollama application, run the application as follows.

./gradlew bootTestRun

If you want to rely on the native Testcontainers support in Spring Boot to spin up an Ollama service at startup time, run the application as follows.

./gradlew bootTestRun -Dspring.profiles.active=ollama-image
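
Conceptually, the ollama-image profile guards a Testcontainers-managed Ollama service. A sketch, assuming Spring AI's Testcontainers service-connection support for Ollama (bean name and image tag are illustrative):

import org.springframework.boot.testcontainers.service.connection.ServiceConnection;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Profile;
import org.testcontainers.ollama.OllamaContainer;

// Declared in the same test configuration class as the pgvector container.
// Only created when the ollama-image profile is active; otherwise the
// application talks to a locally running Ollama server.
@Bean
@ServiceConnection
@Profile("ollama-image")
OllamaContainer ollama() {
    return new OllamaContainer("ollama/ollama");
}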

Calling the application

Note

These examples use the httpie CLI to send HTTP requests.

Call the application; it will use a chat model to answer your questions based on the ingested documents.

http --raw "What is Iorek's biggest dream?" :8080/rag/basic -b --pretty none
http --raw "Who is Lucio?" :8080/rag/basic -b --pretty none

By default, if you ask questions not related to the documents, the model will say it doesn't know the answer.

http --raw "What is the capital of Denmark?" :8080/rag/basic -b --pretty none

You can also allow the model to answer questions unrelated to the documents, that is, when no relevant document is retrieved from the vector store.

http --raw "What is the capital of Denmark?" :8080/rag/empty-context -b --pretty none