bug: Rails taking more time to execute

### Did you check docs and existing issues?

- [x] I have read all the NeMo-Guardrails docs
- [x] I have updated the package to the latest version before submitting this issue
- [ ] (optional) I have used the develop branch
- [x] I have searched the existing issues of NeMo-Guardrails

### Python version (python --version)

Python 3.12.0

### Operating system/version

Linux 

### NeMo-Guardrails version (if you must use a specific version and not the latest)

0.9.1.1

### Describe the bug

We've observed that our AI Health Chatbot currently takes around 15-16 seconds per response, which affects user experience and engagement. To improve efficiency and deliver a faster, more responsive interaction, we propose implementing NeMo Guardrails to optimize the flow and enhance the chatbot's performance.

Problem Statement:

- Current Response Time: 15-16 seconds per interaction, which is significantly impacting the user experience.
- Objective: Reduce response time while maintaining the accuracy and quality of the bot's responses.
- Impact: Slow response times may lead to user frustration, drop-offs, and lower engagement.

 ![Image](https://github.com/user-attachments/assets/90a3c540-9b9d-46d2-a6e4-dbe08f1b2bc1)
![Image](https://github.com/user-attachments/assets/6e05152c-9d22-4ba7-b7e9-cf88946add77)


Code: how I implemented NeMo Guardrails
```
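# Imports added for completeness. The module paths are assumed from the
# libraries' usual layout and may differ in your installed versions.
import nest_asyncio
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_aws import ChatBedrock
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from nemoguardrails import RailsConfig
from nemoguardrails.integrations.langchain.runnable_rails import RunnableRails

# Bedrock-hosted Claude 3 Haiku, non-streaming.
llm = ChatBedrock(
    model_id="anthropic.claude-3-haiku-20240307-v1:0",
    streaming=False,
    region_name="us-east-1",
    model_kwargs={
        "max_tokens": 500,
        "temperature": 0.2,
        "top_k": 250,
        "top_p": 0.5,
        "stop_sequences": ["\n\nHuman"],
    },
)

# Allow nested event loops, then load the rails configuration.
nest_asyncio.apply()
config = RailsConfig.from_path("./config")
guardrails = RunnableRails(config=config, llm=llm, input_key="input", output_key="answer")

# The remainder runs inside a class method. changed_prompt, ref_id, input_txt,
# history_retriever_chain, llm_connection and serialize_message are defined
# elsewhere in the application.
answer_prompt = ChatPromptTemplate.from_messages([
    ("system", changed_prompt),
    self.few_shot_prompt,
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
])

# Load the stored conversation from Redis and serialize each message.
chat_history = self.LLMConnection.redis_api.get_from_redis(ref_id)
chat_history = [serialize_message(msg) for msg in chat_history]

# Build the RAG chain and put the guardrails in front of it.
document_chain = create_stuff_documents_chain(llm_connection.llm, answer_prompt)
conversational_retrieval_chain = create_retrieval_chain(history_retriever_chain, document_chain)
rag_chain_with_guardrails = guardrails | conversational_retrieval_chain

response = rag_chain_with_guardrails.invoke({"chat_history": chat_history, "input": input_txt})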
```
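
To help narrow down how much of the 15-16 seconds comes from the rails rather than from the retrieval chain itself, here is a minimal timing sketch. It is not part of the application code: `timed`, `chain_without_rails`, and `chain_with_rails` are hypothetical stand-ins, and the `time.sleep` calls only simulate latency so the snippet runs on its own.

```python
import time
from contextlib import contextmanager

# Collected wall-clock durations in seconds, keyed by label.
timings = {}


@contextmanager
def timed(label, store=timings):
    """Record how long the enclosed block takes under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        store[label] = time.perf_counter() - start


# Hypothetical stand-ins for the two chains. The sleeps simulate latency only.
def chain_without_rails(payload):
    time.sleep(0.01)  # simulated retrieval + answer generation
    return {"answer": "stub"}


def chain_with_rails(payload):
    time.sleep(0.02)  # simulated extra work done by the rails
    return chain_without_rails(payload)


payload = {"chat_history": [], "input": "hello"}

with timed("without_rails"):
    chain_without_rails(payload)

with timed("with_rails"):
    chain_with_rails(payload)

# The difference approximates the time added by the rails.
rails_overhead = timings["with_rails"] - timings["without_rails"]
print(f"without rails: {timings['without_rails']:.3f}s")
print(f"with rails:    {timings['with_rails']:.3f}s")
print(f"rails overhead: {rails_overhead:.3f}s")
```

In the real application, the two stand-ins would be replaced by `conversational_retrieval_chain.invoke(...)` and `rag_chain_with_guardrails.invoke(...)` with the same input, so the two measurements can be compared directly.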

### Steps To Reproduce

Run the code above; the issue occurs when `rag_chain_with_guardrails` is invoked.

### Expected Behavior

Rails execution time should be reduced so that responses are returned faster.

### Actual Behavior

Rails execution time is too high: each response takes around 15-16 seconds.
