Description
Did you check docs and existing issues?
- I have read all the NeMo-Guardrails docs
- I have updated the package to the latest version before submitting this issue
- (optional) I have used the develop branch
- I have searched the existing issues of NeMo-Guardrails
Python version (python --version)
Python 3.12.0
Operating system/version
Linux
NeMo-Guardrails version (if you must use a specific version and not the latest)
0.9.1.1
Describe the bug
Our AI Health Chatbot currently takes around 15-16 seconds per response, which hurts user experience and engagement. To make interactions faster and more responsive, we propose implementing NeMo Guardrails to optimize the flow and improve the chatbot's performance.
Problem Statement:
- Current Response Time: 15-16 seconds per interaction, which significantly degrades the user experience.
- Objective: Reduce response time while maintaining the accuracy and quality of the bot’s responses.
- Impact: Slow response times may lead to user frustration, drop-offs, and lower engagement.


Code: how I implemented NeMo Guardrails
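# Imports are assumed; the original report did not show them.
import nest_asyncio
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_aws import ChatBedrock
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from nemoguardrails import RailsConfig
from nemoguardrails.integrations.langchain.runnable_rails import RunnableRails

# Claude 3 Haiku on Bedrock, non-streaming
llm = ChatBedrock(
    model_id="anthropic.claude-3-haiku-20240307-v1:0",
    streaming=False,
    region_name="us-east-1",
    model_kwargs={
        "max_tokens": 500,
        "temperature": 0.2,
        "top_k": 250,
        "top_p": 0.5,
        "stop_sequences": ["\n\nHuman"],
    },
)
nest_asyncio.apply()
# Load the rails configuration and wrap it as a LangChain runnable
config = RailsConfig.from_path("./config")
guardrails = RunnableRails(config=config, llm=llm, input_key="input", output_key="answer")
answer_prompt = ChatPromptTemplate.from_messages([
    ("system", changed_prompt),
    self.few_shot_prompt,
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
])
# Load the stored conversation history and serialize each message
Chat_history = self.LLMConnection.redis_api.get_from_redis(ref_id)
Chat_history = [serialize_message(msg) for msg in Chat_history]
# Build the RAG chain, then place the guardrails in front of it
document_chain = create_stuff_documents_chain(llm_connection.llm, answer_prompt)
conversational_retrieval_chain = create_retrieval_chain(history_retriever_chain, document_chain)
rag_chain_with_guardrails = guardrails | conversational_retrieval_chain
response = rag_chain_with_guardrails.invoke({"chat_history": Chat_history, "input": input_txt})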
Steps To Reproduce
Run the code shown above.
Expected Behavior
Rails execution time should be reduced.
Actual Behavior
Rails execution time is too high.
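To put a number on how much of the 15-16 seconds comes from the rails rather than from the underlying RAG chain, one option is to time the same request with and without the guardrails in front. The following is only a minimal sketch: `time_invoke`, `fake_chain`, and `fake_chain_with_rails` are hypothetical names introduced for illustration, and the sleeps are placeholder delays, not measurements from the real system.

```python
import time
from statistics import mean


def time_invoke(invoke, payload, repeats=3):
    """Return the mean wall-clock seconds of invoke(payload) over `repeats` runs."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        invoke(payload)
        durations.append(time.perf_counter() - start)
    return mean(durations)


# Hypothetical stand-ins so the sketch runs on its own.
def fake_chain(payload):
    time.sleep(0.02)  # placeholder for retrieval + LLM latency
    return {"answer": "ok"}


def fake_chain_with_rails(payload):
    time.sleep(0.03)  # placeholder for extra work done by the rails
    return fake_chain(payload)


payload = {"chat_history": [], "input": "hello"}
baseline = time_invoke(fake_chain, payload)
guarded = time_invoke(fake_chain_with_rails, payload)
overhead = guarded - baseline
print(f"baseline={baseline:.3f}s guarded={guarded:.3f}s rails overhead={overhead:.3f}s")
```

In the setup from this report, the stand-ins would presumably be replaced by `conversational_retrieval_chain.invoke` and `rag_chain_with_guardrails.invoke`, called with the same `{"chat_history": ..., "input": ...}` payload, so the difference between the two means approximates the time added by the rails.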