[Feat]: Support on-premise reranker API #1212

@1saac-k

Description

Is your feature request related to a problem?

Currently, MemMachine supports local rerankers but does not support API-based on-premise rerankers.

When you specify ce_ranker_id in configurations.yml, it internally calls the local reranker using Hugging Face's sentence-transformers library.

Similar to on-premise LLMs that use OpenAI-compatible APIs, it would be great if rerankers could also support on-premise APIs.

vLLM provides rerank APIs compatible with the Jina and Cohere APIs, and we expect other inference engines (e.g., SGLang) to support similar features.

https://docs.vllm.ai/en/latest/serving/openai_compatible_server/
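As a rough sketch of what an API-based reranker call looks like, the snippet below posts to vLLM's `/v1/rerank` endpoint using only the standard library. The base URL, model name, and exact response shape are assumptions based on vLLM's documented Jina/Cohere-compatible rerank API, not MemMachine code:

```python
# Hedged sketch: calling an on-premise rerank server over HTTP.
# Endpoint path and payload shape follow vLLM's /v1/rerank API;
# base_url and model name below are placeholders.
import json
import urllib.request


def build_rerank_payload(model: str, query: str, documents: list[str]) -> dict:
    # Request body shared by the Jina- and Cohere-style rerank APIs.
    return {"model": model, "query": query, "documents": documents}


def rerank(base_url: str, model: str, query: str, documents: list[str]) -> list[dict]:
    # POST the payload to the on-premise server and return the scored results.
    payload = build_rerank_payload(model, query, documents)
    req = urllib.request.Request(
        f"{base_url}/v1/rerank",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # vLLM returns entries like {"index": ..., "relevance_score": ...}.
    return body["results"]
```

Usage would look like `rerank("http://localhost:8000", "BAAI/bge-reranker-base", query, docs)`, assuming a vLLM server is serving a reranker model at that address.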

Regarding terminology: strictly speaking, this is not an "OpenAI-compatible" API, since OpenAI does not offer a reranker service. It would more accurately be called a "Jina-compatible" or "Cohere-compatible" reranker API. However, vLLM does not appear to use those expressions either, so the term "OpenAI compatible" also seems acceptable.

Describe the solution you'd like

For embeddings, both the openai_embedder and openai_compatible_embedder options internally use the openai.AsyncOpenAI SDK. The only difference is that openai_compatible_embedder changes the base_url.

For rerankers, it would be good to follow the same approach as embeddings.
Since cohere_reranker_id is currently provided, adding a cohere_compatible_reranker_id option seems intuitive.
The current cohere_reranker_id uses the cohere.ClientV2 SDK, which also appears to provide a base_url option.
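To make the proposal concrete, a configuration fragment might look like the following. The key names and structure are hypothetical, extrapolated from the existing `cohere_reranker_id` / `openai_compatible_embedder` naming convention, and do not reflect MemMachine's actual schema:

```yaml
# Hypothetical configurations.yml fragment (key names are assumptions).
reranker:
  cohere_compatible_reranker_id: onprem-reranker

onprem-reranker:
  base_url: http://localhost:8000   # e.g., a vLLM server exposing a rerank API
  model: BAAI/bge-reranker-base
  api_key: EMPTY
```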

Describe alternatives you've considered

No response

Additional context

We’ve already begun work on supporting this feature and will upload a draft PR shortly.

Metadata

Labels

priority: high (Issue is urgent or highly impactful. Needs to be addressed as soon as possible.)
