FineTuneForge is a tool for generating JSON Lines (JSONL) datasets for fine-tuning AI language models such as Google's PaLM 2 and OpenAI's GPT-3.5. It lets developers transform text data into machine-readable JSONL, a format that stores one JSON object per line.
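As a rough illustration of the target format, the sketch below serializes two records as JSONL in two layouts. The `Example` type and the helper names are hypothetical and are not taken from FineTuneForge. The chat-style `messages` layout and the `input_text`/`output_text` pair are, to my understanding, what OpenAI and Vertex AI document for GPT-3.5 and PaLM 2 tuning respectively; whether FineTuneForge emits exactly these shapes is an assumption.

```typescript
// Hypothetical record shape; FineTuneForge's real input format may differ.
interface Example {
  prompt: string;
  completion: string;
}

// Chat-style line (assumed layout for GPT-3.5 fine-tuning).
function toChatLine(ex: Example, systemPrompt: string): string {
  return JSON.stringify({
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: ex.prompt },
      { role: "assistant", content: ex.completion },
    ],
  });
}

// Input/output pair line (assumed layout for PaLM 2 text tuning).
function toTextPairLine(ex: Example): string {
  return JSON.stringify({ input_text: ex.prompt, output_text: ex.completion });
}

// JSONL: one JSON object per line, each terminated by "\n".
function toJsonl(lines: string[]): string {
  return lines.join("\n") + "\n";
}

const examples: Example[] = [
  { prompt: "What is JSONL?", completion: "A text format with one JSON object per line." },
  { prompt: "Say hi on two lines", completion: "hi\nhi" },
];

const chatJsonl = toJsonl(examples.map((ex) => toChatLine(ex, "You are a helpful assistant.")));
const pairJsonl = toJsonl(examples.map((ex) => toTextPairLine(ex)));

console.log(chatJsonl);
console.log(pairJsonl);
```

Because `JSON.stringify` escapes newlines inside string values, a multi-line completion still occupies a single line of the file, which is what keeps the output valid JSONL.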
To get started with FineTuneForge, follow these steps:
git clone https://github.com/ryhkml/fine-tune-forge.git
cd fine-tune-forge
chmod +x ./install.sh
./install.sh
Run the JSONL generator with the following command:
npm run build
Then start the server:
npm run serve
FineTuneForge is organized into several directories, each serving a specific purpose in the workflow of the JSONL generator. Below is an overview of these directories and their intended use:
DATADOC_OCR: Temporary storage for OCR (Optical Character Recognition) images.
DATASET: The designated location for completed dataset files. Once the JSONL files have been generated and are ready for fine-tuning the language models, they are placed in this directory.
DATATMP: Temporary storage for instruction content.
tls: Reserved for storing SSL/TLS certificates.
To enable HTTPS in the application, you need to configure SSL/TLS certificates correctly.
Before you start, ensure the following files are in the tls directory:
fullchain.pem: Your certificate file. It contains the full chain of trust: your own certificate along with any intermediate certificates.
cert-key.pem: Your private key, which must be kept secure. It is used to establish the encrypted connection.
ca.crt (optional): A Certificate Authority (CA) file, used if you need to specify an external CA.
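Before starting the server with HTTPS, it can help to confirm that the required files are present. The sketch below is one way to do that check; `missingTlsFiles` is a hypothetical helper and not part of FineTuneForge. It takes a directory listing as plain strings so it stays independent of how the listing is obtained, and it treats ca.crt as optional.

```typescript
// File names from the list above; ca.crt is optional, so it is not required here.
const REQUIRED_TLS_FILES: string[] = ["fullchain.pem", "cert-key.pem"];

// Hypothetical helper: returns the required files absent from a directory listing.
function missingTlsFiles(present: string[]): string[] {
  return REQUIRED_TLS_FILES.filter((name) => present.indexOf(name) === -1);
}

// Example listing of the tls directory (assumed contents, for illustration only).
const listing: string[] = ["fullchain.pem", "ca.crt"];
const missing = missingTlsFiles(listing);

console.log(missing.length === 0 ? "TLS files present" : "Missing: " + missing.join(", "));
// prints "Missing: cert-key.pem"
```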
If you use Docker, uncomment the environment variable PROTOCOL_SERVER in docker-compose.yaml.
This project is licensed under the MIT License - see the LICENSE file for details.