This application combines the identification features of a multimodal input app with the broad-description handling of a Retrieval-Augmented Generation (RAG) system. The goal is a versatile, efficient Pokémon Identification App that accepts text, image, and combined text-and-image queries and returns detailed, accurate information about the matching Pokémon.
- Image Dataset: Contains images and names of Pokémon.
- General Info Dataset: Contains names, types, HP levels, attack levels, defense levels, and detailed descriptions.
- CLIP Model: For embedding text and image inputs.
- BERT/GPT Model: For embedding and handling broad text descriptions.
- Retrieval-Augmented Generation (RAG): To enhance broad description handling and provide relevant information.
- Relational Database: Stores Pokémon metadata, image URLs, and detailed descriptions.
- Vector Database: Stores embeddings for efficient similarity searches.
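As a minimal sketch of the relational side, the snippet below uses SQLite from the Python standard library to hold the fields listed above (name, type, HP, attack, defense, description, image URL). The table name `pokemon`, the column names, and the helper functions are assumptions made for illustration; the project may use a different database engine or schema.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a single metadata table (hypothetical schema)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS pokemon (
            id          INTEGER PRIMARY KEY,
            name        TEXT NOT NULL UNIQUE,
            type        TEXT,
            hp          INTEGER,
            attack      INTEGER,
            defense     INTEGER,
            description TEXT,
            image_url   TEXT
        )
        """
    )
    conn.commit()


def insert_pokemon(conn: sqlite3.Connection, record: dict) -> None:
    """Insert one record; keys must match the named placeholders."""
    conn.execute(
        """
        INSERT INTO pokemon
            (name, type, hp, attack, defense, description, image_url)
        VALUES
            (:name, :type, :hp, :attack, :defense, :description, :image_url)
        """,
        record,
    )
    conn.commit()


def get_pokemon_by_name(conn: sqlite3.Connection, name: str):
    """Return the matching row as a tuple, or None if the name is unknown."""
    return conn.execute(
        "SELECT name, type, hp, attack, defense, description, image_url "
        "FROM pokemon WHERE name = ?",
        (name,),
    ).fetchone()
```

Keeping metadata in a relational table and embeddings in a separate vector store means a similarity search only needs to return a name or ID, which is then looked up here.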
- Data Ingestion: Processes the datasets and stores the resulting records and embeddings.
- Search API: Handles text, image, and combined queries to return matching Pokémon data.
- RAG Integration: Enhances search capabilities by handling broad descriptions.
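One way the search step could handle text, image, and combined queries is sketched below. It assumes the query embeddings have already been produced elsewhere (for example by CLIP) and that text and image embeddings share one vector space. The function names, the in-memory `index` dictionary standing in for the vector database, and the choice to fuse a combined query by averaging the normalized vectors are all illustrative assumptions, not the project's confirmed method.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def combine_queries(text_vec=None, image_vec=None):
    """Fuse whichever embeddings are present into one unit-length query vector.

    Assumption: both embeddings live in the same space, so summing the
    normalized vectors (equivalent to averaging, up to scale) is meaningful.
    """
    parts = [normalize(v) for v in (text_vec, image_vec) if v is not None]
    if not parts:
        raise ValueError("provide a text embedding, an image embedding, or both")
    summed = [sum(values) for values in zip(*parts)]
    return normalize(summed)


def search(query_vec, index, top_k=3):
    """Rank entries of `index` (name -> embedding) by cosine similarity."""
    q = normalize(query_vec)
    scored = [
        (name, sum(a * b for a, b in zip(q, normalize(vec))))
        for name, vec in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

A text-only or image-only query passes a single vector to `combine_queries`; a combined query passes both. A real vector database would replace the linear scan in `search` with an approximate nearest-neighbor lookup.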
- Query Input: Allows users to input text descriptions or upload images.
- Results Display: Shows the Pokémon image and general information.
- raw/
- processed/
- combined/
- app/
- `__init__.py`
- main.py
- api/
- `__init__.py`
- endpoints.py
- models/
- `__init__.py`
- clip_model.py
- bert_model.py
- database/
- `__init__.py`
- setup.py
- queries.py
- requirements.txt
- README.md
- public/
- src/
- components/
- SearchBar.js
- ResultsDisplay.js
- App.js
- index.js
- components/
- package.json
- README.md
- download_images.py
- preprocess_images.py
- preprocess_csv.py
- combine_datasets.py
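The contents of the scripts above are not shown here, so the following is only a rough sketch of what a dataset-combining step might look like: it joins the image dataset to the general info dataset on the Pokémon name and writes the merged rows to a new CSV. The column names `name` and `image_path`, the function signature, and the case-insensitive matching are assumptions for illustration.

```python
import csv


def combine_datasets(image_csv_path, info_csv_path, output_csv_path):
    """Join image records to general info records by name.

    Assumed (hypothetical) inputs:
      - image CSV with columns: name, image_path
      - info CSV with a `name` column plus any other fields
    Returns the number of rows written.
    """
    # Build a lookup from normalized name to image path.
    with open(image_csv_path, newline="", encoding="utf-8") as f:
        images = {
            row["name"].strip().lower(): row["image_path"]
            for row in csv.DictReader(f)
        }

    combined = []
    with open(info_csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = list(reader.fieldnames or []) + ["image_path"]
        for row in reader:
            key = row["name"].strip().lower()
            # Keep only Pokémon present in both datasets (inner join).
            if key in images:
                row["image_path"] = images[key]
                combined.append(row)

    with open(output_csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(combined)

    return len(combined)
```

An inner join drops entries that lack either an image or a description; if partial records should be kept instead, the `if key in images` check would need to change.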