Skip to content

The LLM FineTuning and Evaluation project 🚀 enhances FLAN-T5 models for tasks like summarizing Spanish news articles 🇪🇸📰. It features detailed notebooks 📚 on fine-tuning and evaluating models to optimize performance for specific applications. 🔍✨

Notifications You must be signed in to change notification settings

sergio11/llm_finetuning_and_evaluation

Repository files navigation

LLM Fine-Tuning and Evaluation 🚀🔍

Welcome to the LLM Fine-Tuning and Evaluation repository! 🎉 Here, we delve deep into the exciting world of large language model (LLM) fine-tuning and evaluation, focusing on cutting-edge techniques to adapt models like FLAN-T5, TinyLLAMA, and Aguila7B for diverse natural language processing (NLP) tasks. 🧠💬

As LLMs become an integral part of modern AI applications, the ability to fine-tune and evaluate these models effectively has never been more crucial. This repository is designed to help you navigate the complexities of model customization, offering insights and practical tools to enhance the performance, accuracy, and ethical responsibility of your models.

Whether you're working on:

  • Summarization of lengthy texts in multiple languages 📝🌍,
  • Legal document drafting with precision and efficiency ⚖️📄,
  • Question answering systems for customer service 🤖💬,
  • Or fine-tuning models to avoid generating harmful or offensive content 🔒🛡️,

this repository provides the resources to elevate your projects to the next level.

🙏 I would like to extend my heartfelt gratitude to Santiago Hernández, an expert in Cybersecurity and Artificial Intelligence. His incredible course on Deep Learning and AI generative, available at Udemy, was instrumental in shaping the development of this project.

What's Inside? 📚✨

  • Notebooks for Fine-Tuning: Explore detailed notebooks on how to fine-tune models such as FLAN-T5 and TinyLLAMA to perform specific tasks, like summarizing Spanish newspaper articles or avoiding harmful language generation. 📝🇪🇸
  • Evaluation Techniques: Learn about different methods to evaluate the performance of your fine-tuned models, including metrics and best practices. 📊🔍
  • Instruction Fine-Tuning: Discover how to leverage instruction-based training to improve model capabilities for targeted applications. 🎯🛠️
  • Parameter Efficient Fine-Tuning with QLoRA: Understand how to apply QLoRA for efficient fine-tuning of language models, particularly in legal assistance contexts. ⚖️📝

Getting Started 🚀

To get started, check out the notebooks for step-by-step guides on model fine-tuning and evaluation:

  • Instruction_Fine_Tuning_LLM_T5.ipynb: Detailed instructions on fine-tuning FLAN-T5 for Spanish summarization. 📖
  • Evaluation_and_Analysis_T5_Familiy_LLMs.ipynb: Insights into evaluating and analyzing various T5 models. 📈
  • Parameter_Efficient_Fine_Tuning_QLoRA_Legal_Assistance.ipynb: Learn about fine-tuning with QLoRA for specialized tasks such as drafting legal documents. ⚖️
  • TinyLLAMA_PPO_RLHF_Avoiding_Offensive_Language.ipynb: Explore the fine-tuning process of TinyLLAMA using PPO and RLHF to avoid harmful or offensive language. 🔄🛡️
  • Evaluation_of_Fine_Tuned_Large_Language_Models_for_ILENIA.ipynb: Evaluation of fine-tuned models in the ILENIA framework, including Aguila7B and Latxa projects. 🌐

Fine-Tuning TinyLLAMA with PPO for RLHF: Avoidance of Harmful or Offensive Language 🛡️🔄

In this new study, the primary goal was to fine-tune TinyLLAMA using Proximal Policy Optimization (PPO) techniques combined with Reinforcement Learning from Human Feedback (RLHF). The objective is to refine the model's ability to avoid generating harmful, offensive, or toxic language while preserving meaningful content generation.

Highlights of the Study:

  • Fine-Tuning Objective: Implement RLHF to make TinyLLAMA more sensitive to language safety, focusing on removing toxic and offensive elements from generated outputs.
  • Model Configuration: Leveraging PPO, a reinforcement learning technique, to optimize TinyLLAMA’s reward function toward producing non-offensive language.
  • Evaluation: Summarization tasks were used to test the model's behavior in dialogue scenarios, and the fine-tuned TinyLLAMA demonstrated a significant reduction in offensive language while retaining coherence in its outputs.

For a comprehensive understanding of the methodology and results, refer to the notebook: TinyLLAMA_PPO_RLHF_Avoiding_Offensive_Language.ipynb.

Evaluation of Fine-Tuned Large Language Models for ILENIA 🌐📊

The ILENIA project is part of Spain's Strategic Project for Economic Recovery and Transformation (PERTE), which focuses on developing multilingual resources for the New Language Economy (NEL). This initiative supports the use of Spanish and other official languages to drive economic growth and international competitiveness in areas like AI, translation, and education.

As part of this effort, we evaluate LLMs from the Aguila7B and Latxa projects, which are designed for text and speech processing tasks. These evaluations focus on the models' performance, ensuring they align with societal and technological needs, particularly in multilingual and cross-lingual contexts.

Key Aspects:

  • Multilingualism and Cross-lingual Transfer: The models are evaluated for their ability to handle text and speech data across multiple languages.
  • Project Timeline: ILENIA spans 36 months and is led by the Barcelona Supercomputing Center (BSC-CNS).
  • Resource Optimization: The fine-tuning of LLMs ensures efficient model generation, optimizing costs and computational resources.

For an in-depth analysis, refer to the notebook: Evaluation_of_Fine_Tuned_Large_Language_Models_for_ILENIA.ipynb.

📊 Evaluation and Analysis of Pre-Trained T5 Family LLMs: Leveraging Prompt Engineering and Few-Shot Examples for Fine-Tuning

In the fast-evolving world of natural language processing (NLP), leveraging pre-trained language models has become crucial for improving performance across various tasks. 🌟 Among these, the T5 family of models stands out for its versatility and effectiveness in handling a range of language tasks. This study delves into the evaluation and analysis of pre-trained T5 models, focusing on how prompt engineering and few-shot examples can be used to fine-tune these models. 🔍

The T5 family, including models like T5-Base, T5-Large, and FLAN-T5, has shown impressive capabilities in text generation, question answering, and translation. Yet, there is always room for optimization. Fine-tuning these models using prompt engineering—designing and structuring input prompts—along with few-shot learning, offers a powerful method to enhance their performance without extensive retraining. ⚙️

In this work, we thoroughly evaluate different T5 models, exploring how various prompt engineering techniques and few-shot learning setups affect their performance. Our goal is to uncover best practices for fine-tuning pre-trained models to excel in real-world applications. By analyzing the strengths and limitations of each model under different prompt conditions, this study aims to provide valuable insights into optimizing T5-based LLMs for diverse NLP tasks. 📈

For a detailed walkthrough of the evaluation process and findings, please refer to the notebook: Evaluation_and_Analysis_T5_Family_LLMs.ipynb. 📝

Instruction Fine-Tuning for Spanish Newspaper Article Summarization Using FLAN-T5-Small 📚📝

Welcome to this project on enhancing the FLAN-T5-Small language model for summarizing Spanish newspaper articles! 🌟 In this guide, we focus on instruction fine-tuning the FLAN-T5-Small model to improve its ability to generate concise and accurate summaries of news content in Spanish.

The notebook Instruction_Fine_Tuning_LLM_T5.ipynb provides a detailed walkthrough of the entire process. 📖✨ It covers:

  • Data Preparation: How to curate and prepare a dataset of Spanish newspaper articles and their summaries.
  • Model Configuration: Setting up the FLAN-T5-Small model for instruction-based fine-tuning.
  • Fine-Tuning Process: Steps to fine-tune the model specifically for summarization tasks.
  • Evaluation: Assessing the performance of the fine-tuned model on summarization.

By following the instructions in the notebook, you’ll learn how to adapt this powerful pre-trained model to effectively handle Spanish text summarization, enabling it to deliver clear and coherent summaries of news articles. 🚀🗞️

For a comprehensive guide, refer to the notebook Instruction_Fine_Tuning_LLM_T5.ipynb. Enjoy exploring and fine-tuning! 🌟

Parameter Efficient Fine-Tuning with QLoRA for Legal Assistance ⚖️📝

This section introduces the concept of Parameter Efficient Fine-Tuning (PEFT) using QLoRA for enhancing language models in legal contexts. QLoRA (Quantized Low-Rank Adaptation) is designed to efficiently fine-tune large language models with fewer parameters, reducing both computational and memory requirements.

The notebook Parameter_Efficient_Fine_Tuning_QLoRA_Legal_Assistance.ipynb details the following:

  • LoRA Configuration: Understand how to set up Low-Rank Adaptation (LoRA) to adapt large models for specific legal tasks.
  • Fine-Tuning Process: Steps to apply QLoRA to fine-tune models such as Llama-2 for drafting legal documents and other legal assistance applications.
  • Evaluation: Techniques to evaluate the performance of the fine-tuned model in generating legal content.

This approach allows for efficient adaptation of language models to specialized tasks like legal document drafting, ensuring high performance while managing resource usage effectively.

For a comprehensive guide on QLoRA fine-tuning, refer to the notebook Parameter_Efficient_Fine_Tuning_QLoRA_Legal_Assistance.ipynb. Explore the potential of efficient fine-tuning techniques for legal applications! 🌟⚖️

Feel free to explore, experiment, and contribute to the exciting field of LLMs. Your feedback and contributions are always welcome! 🌟🤝

Happy fine-tuning and evaluating! 🚀✨

Acknowledgments:

I would like to extend my heartfelt gratitude to Santiago Hernández, an expert in Cybersecurity and Artificial Intelligence. His incredible course on Deep Learning and Generative AI, available at Udemy, was instrumental in shaping the development of this project.

Contribution

Contributions to this project are highly encouraged! If you're interested in adding new features, resolving bugs, or enhancing the project's functionality, please feel free to submit pull requests.

Get in Touch 📬

this project is developed and maintained by Sergio Sánchez Sánchez (Dream Software). Special thanks to the open-source community and the contributors who have made this project possible. If you have any questions, feedback, or suggestions, feel free to reach out at [email protected].

Visitors Count

Please Share & Star the repository to keep me motivated.

License ⚖️

This project is licensed under the MIT License, an open-source software license that allows developers to freely use, copy, modify, and distribute the software. 🛠️ This includes use in both personal and commercial projects, with the only requirement being that the original copyright notice is retained. 📄

Please note the following limitations:

  • The software is provided "as is", without any warranties, express or implied. 🚫🛡️
  • If you distribute the software, whether in original or modified form, you must include the original copyright notice and license. 📑
  • The license allows for commercial use, but you cannot claim ownership over the software itself. 🏷️

The goal of this license is to maximize freedom for developers while maintaining recognition for the original creators.

MIT License

Copyright (c) 2024 Dream software - Sergio Sánchez 

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

About

The LLM FineTuning and Evaluation project 🚀 enhances FLAN-T5 models for tasks like summarizing Spanish news articles 🇪🇸📰. It features detailed notebooks 📚 on fine-tuning and evaluating models to optimize performance for specific applications. 🔍✨

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published