中文 | English
🔍 Search for your local images with natural language, running completely offline. For example, "a laptop on the desk", "sunset by the sea", "kitty in the grass", and so on.
Search images by picking a photo from your gallery
- Totally free, NO in-app purchases
- Supports both English and Chinese
- Indexing and searching run completely offline, so there are no privacy concerns
- Shows results in under 1 second when searching 8,000+ photos
- Indexing runs the first time you launch the app; after that, you can search immediately
- New: "Load More Results" button to fetch additional results beyond the initial set
- New: "Surprise Me" 🎲 roulette button to discover random indexed photos
- Google Play - Search for “PicQuery”
- Download APK from Release
- If you have trouble accessing the above resources, please see here
🍎 For iOS users, please refer to Queryable (Code), the inspiration behind this application, developed by @mazzzystar.
Thanks to @mazzzystar and @Young-Flash for their assistance during the development. The discussion can be viewed here.
PicQuery is powered by OpenAI's CLIP model and Apple's MobileCLIP.
First, the images to be searched are encoded into vectors using an image encoder and stored in a database. The text provided by the user during the search is also encoded into a vector. The encoded text vector is then compared with the indexed image vectors to calculate the similarity. The top K images with the highest similarity scores are selected as the query results.
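The ranking step described above can be sketched as follows. This is a minimal brute-force illustration, not the app's actual code: the app uses an HNSW index for the lookup, and the names `IndexedImage`, `cosineSimilarity`, and `topK` are hypothetical. Cosine similarity is assumed as the similarity measure.

```kotlin
import kotlin.math.sqrt

// Hypothetical in-memory record; the real index keeps vectors in a database.
data class IndexedImage(val photoId: Long, val embedding: FloatArray)

// Cosine similarity between two vectors of equal length (assumed metric).
fun cosineSimilarity(a: FloatArray, b: FloatArray): Float {
    require(a.size == b.size) { "Vectors must have the same dimension" }
    var dot = 0f
    var normA = 0f
    var normB = 0f
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    val denom = sqrt(normA) * sqrt(normB)
    return if (denom == 0f) 0f else dot / denom
}

// Brute-force top-K: score every image, sort by score descending, keep the first k.
fun topK(query: FloatArray, index: List<IndexedImage>, k: Int): List<Pair<Long, Float>> =
    index.map { it.photoId to cosineSimilarity(query, it.embedding) }
        .sortedByDescending { it.second }
        .take(k)
```

Calling `topK(textVector, index, 30)` would return up to 30 `(photoId, score)` pairs, best match first. A linear scan like this costs one comparison per indexed photo, which is the cost an approximate nearest-neighbor index such as HNSW is designed to avoid.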
When images are indexed, only vector embeddings are stored — no image content, category text, or OCR results are saved alongside the vector. The index contains:
| Field | Type | Description |
|---|---|---|
| photoId | Long | MediaStore ID of the photo |
| albumId | Long | Album/bucket ID the photo belongs to |
| data | FloatArray | 512-dim (CLIP) or 256-dim (MobileCLIP) HNSW vector |
There is no per-image text metadata (tags, captions, categories) stored in the index. All semantic understanding comes entirely from the vector embedding at search time. This means:
- Extending the index to store additional metadata (tags, faces, OCR text) would require schema changes and re-indexing.
- Filtering by content category is not natively supported — it would require running a classifier at query time or enriching the index with category labels.
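As a rough illustration of the "enrich the index with category labels" option, the sketch below post-filters search hits by a stored label. Everything here is an assumption: the current index has no `category` field, and `EnrichedEmbedding` and `filterByCategory` are hypothetical names that do not exist in the project.

```kotlin
// Hypothetical enriched record: the current index stores only photoId, albumId, and data.
data class EnrichedEmbedding(
    val photoId: Long,
    val albumId: Long,
    val data: FloatArray,
    val category: String? = null, // assumed label written by a classifier at indexing time
)

// Keep only the hits whose stored label matches; photos without a label are dropped.
fun filterByCategory(hits: List<EnrichedEmbedding>, category: String): List<EnrichedEmbedding> =
    hits.filter { it.category.equals(category, ignoreCase = true) }
```

Filtering after the vector search keeps the search path unchanged, but it can return fewer than K results when many of the top hits carry a different label.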
| Requirement | Version |
|---|---|
| Android Studio | Hedgehog 2023.1+ (or newer) |
| JDK | 17 |
| Min Android SDK | 29 (Android 10) |
| Target Android SDK | 35 (Android 15) |
| Kotlin | 2.3.0 |
| Gradle | 8.9.3 |
The ML model files are not committed to the repository due to their size. You must obtain them separately before building.
Model files required in app/src/main/assets/ for CLIP:
- clip-image-int8.ort
- clip-text-int8.ort
Obtain via Colab notebook (recommended): Open Jupyter Notebook
Or download pre-built from Google Drive: CLIP models
Model files required in app/src/main/assets/ for MobileCLIP:
- vision_model.ort
- text_model.ort
Download from Google Drive: MobileCLIP models
Then switch the DI module in app/src/main/java/me/grey/picquery/common/AppModules.kt:
```kotlin
// Change this line:
val AppModules = listOf(..., modulesCLIP, ...)
// To:
val AppModules = listOf(..., modulesMobileCLIP, ...) // or modulesMobileCLIPv2
```

```shell
# Clone and open in Android Studio, then:
./gradlew assembleDebug          # Build debug APK
./gradlew assembleRelease        # Build release APK (requires signing config)
./gradlew installDebug           # Build and install on connected device
```

```shell
./gradlew ktlintFormat           # Auto-format Kotlin code
./gradlew ktlintCheck            # Check Kotlin style (CI)
./gradlew detekt                 # Static analysis
```

```shell
./gradlew test                   # Unit tests (no device needed)
./gradlew connectedAndroidTest   # Instrumented tests (device/emulator needed)
```

- Text search: natural language queries are encoded to vectors and matched against the index
- Image search: pick a photo from the gallery to find visually similar ones
- Load More: tap "Load More Results" at the bottom of search results to retrieve additional matches (increases the search K incrementally, up to 300 results max)
- Tap the Surprise button on the home screen to discover 20 random indexed photos
- Results are shown in the same grid view as search results
- Requires at least one album to be indexed first
- Groups near-identical photos (duplicates, burst shots) using Union-Find clustering
- Configurable similarity threshold (default: 0.96)
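The grouping of near-identical photos can be sketched with a small Union-Find structure. This is an illustration under assumptions, not the project's implementation: it compares every pair of embeddings with cosine similarity, and the names `UnionFind`, `cosine`, and `groupSimilar` are hypothetical. Only the use of Union-Find and the 0.96 default threshold come from the description above.

```kotlin
import kotlin.math.sqrt

// Disjoint-set (Union-Find) with path compression and union by size.
class UnionFind(n: Int) {
    private val parent = IntArray(n) { it }
    private val size = IntArray(n) { 1 }

    fun find(x: Int): Int {
        var root = x
        while (parent[root] != root) root = parent[root]
        // Path compression: point every node on the path directly at the root.
        var cur = x
        while (parent[cur] != root) {
            val next = parent[cur]
            parent[cur] = root
            cur = next
        }
        return root
    }

    fun union(a: Int, b: Int) {
        var ra = find(a)
        var rb = find(b)
        if (ra == rb) return
        if (size[ra] < size[rb]) { val t = ra; ra = rb; rb = t }
        parent[rb] = ra
        size[ra] += size[rb]
    }
}

// Cosine similarity (assumed metric); returns 0 for zero-length vectors.
fun cosine(a: FloatArray, b: FloatArray): Float {
    var dot = 0f
    var na = 0f
    var nb = 0f
    for (i in a.indices) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i] }
    val denom = sqrt(na) * sqrt(nb)
    return if (denom == 0f) 0f else dot / denom
}

// Merge every pair at or above the threshold, then return groups of two or more indices.
fun groupSimilar(embeddings: List<FloatArray>, threshold: Float = 0.96f): List<List<Int>> {
    val uf = UnionFind(embeddings.size)
    for (i in embeddings.indices) {
        for (j in i + 1 until embeddings.size) {
            if (cosine(embeddings[i], embeddings[j]) >= threshold) uf.union(i, j)
        }
    }
    return embeddings.indices.groupBy { uf.find(it) }.values.filter { it.size > 1 }
}
```

Because unions are transitive, photos A and C can land in the same group through an intermediate B even if A and C alone fall below the threshold. Raising the threshold makes such chains less likely.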
This section is intended for developers (including AI coding agents) extending this project.
UI Layer (Compose/MVVM)
└── SearchViewModel / HomeViewModel / DisplayViewModel
Domain Layer
└── ImageSearcher → SearchOrchestrator → EmbeddingService
Data Layer
└── ObjectBoxEmbeddingRepository (vector store, HNSW index)
└── PhotoRepository (MediaStore access)
ML Layer
└── CLIP / MobileCLIP (ONNX Runtime)
| Task | Where to Start |
|---|---|
| Add new ML model | Add new module in feature/, implement ImageEncoder/TextEncoder, create a new Koin module |
| Add metadata to index | Extend ObjectBoxEmbedding with new fields, update EmbeddingService.encodePhotoList |
| New search screen/feature | Add a new ViewModel, Screen composable, and Routes entry |
| Increase load-more cap | Change MAX_EXTENDED_TOP_K in SearchViewModel |
| Change roulette count | Change ROULETTE_COUNT in HomeViewModel |
| Add category filtering | Add a category: String? field to ObjectBoxEmbedding, populate during indexing via a classifier |
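For the load-more cap in the table above, the incremental growth of K might look like the sketch below. Only the 300-result maximum is stated in this README; the initial K, the step size, and the helper names `nextTopK` and `canLoadMore` are placeholders chosen for illustration and may not match `SearchViewModel`.

```kotlin
// Assumed values: only the 300 cap is documented; the other two are placeholders.
val INITIAL_TOP_K = 30
val TOP_K_STEP = 30
val MAX_EXTENDED_TOP_K = 300

// K to use after one "Load More Results" tap, clamped to the cap.
fun nextTopK(currentK: Int): Int = minOf(currentK + TOP_K_STEP, MAX_EXTENDED_TOP_K)

// Offer the button only while below the cap and the last search filled its K.
fun canLoadMore(currentK: Int, resultCount: Int): Boolean =
    currentK < MAX_EXTENDED_TOP_K && resultCount >= currentK
```

Re-running the search with a larger K is simple to reason about, at the cost of recomputing the earlier results on each tap.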
- ML model files are required: the app will crash at launch with `FileNotFoundException` if they are not present. Download them before building (see Required Assets).
- Android 10+ required: `minSdk = 29`. Older devices are not supported.
- MediaStore permissions: the app requires `READ_MEDIA_IMAGES` (Android 13+) or `READ_EXTERNAL_STORAGE` (older). Grant these when prompted.
- First-time indexing is slow: indexing all photos takes time (~35 ms/image). Budget ~6 minutes for 10,000 photos on a mid-range device.
- ObjectBox version lock: ObjectBox 4.x with `@HnswIndex` is required. Downgrading will break vector search.
- Koin DI wiring: when adding new features, always update `AppModules.kt` to register new classes.
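The indexing budget quoted above follows directly from the per-image cost. The helper below is a hypothetical convenience for estimating first-run time for other library sizes; the 35 ms default is the figure from this README and will vary by device and model.

```kotlin
// Rough first-run indexing estimate in seconds, from a per-image cost in milliseconds.
fun estimateIndexingSeconds(photoCount: Int, msPerImage: Double = 35.0): Double =
    photoCount * msPerImage / 1000.0
```

For 10,000 photos this gives 350 seconds, a little under 6 minutes, which matches the budget stated in the list.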
To build this project, you need to obtain a quantized CLIP model.
Run the scripts in this Jupyter notebook step by step. When you reach the "You are done" section, you should have the following model files in the ./result directory:
- clip-image-int8.ort
- clip-text-int8.ort
If you don't want to run the scripts, you may directly download them from Google Drive.
To build with MobileCLIP instead, you need the following model files:
- vision_model.ort
- text_model.ort

Download them from Google Drive.
Put the model files into app/src/main/assets and you're ready to go.
Pick the module that matches your model files: use `modulesCLIP` with the CLIP pair and `modulesMobileCLIP` with the MobileCLIP pair. For example: `val AppModules = listOf(viewModelModules, dataModules, modulesCLIP, domainModules)`.
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
Don't forget to add the model files to the `app/src/main/assets` directory.
java.io.FileNotFoundException: clip-image-int8.ort
Make sure the model files are in the correct directory. If you are using MobileCLIP, make sure you are using the matching model files and have changed the module to modulesMobileCLIP.
This project is open-source under an MIT license. All rights reserved.

