Panic on nearest neighbour search with high limit

## Current Behavior

In Qdrant v1.12.4, nearest neighbour searches/queries with a high limit (e.g., `u64::MAX`) cause Qdrant to panic. The backtrace below ends in a hash table capacity overflow inside `LocalShard::do_search`, so it seems Qdrant now tries to allocate O(limit) memory for nearest neighbour searches.

This is a regression in v1.12.4: the bug is not present in v1.12.3.

## Steps to Reproduce


The following test fails in v1.12.4:

```rust
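use qdrant_client::config::QdrantConfig;
use qdrant_client::qdrant::{
    CreateCollectionBuilder, Distance, Query, QueryPointsBuilder, VectorParamsBuilder,
};
use qdrant_client::Qdrant;

#[tokio::test]
async fn nearest_neighbour_limit() {
    let collection_name = "test_collection";
    let query_embedding = vec![1.0; 128];

    // `qdrant_url()` is a test helper (not shown) that returns the server URL.
    let qdrant = Qdrant::new(QdrantConfig::from_url(qdrant_url().as_str())).unwrap();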

    qdrant
        .create_collection(
            CreateCollectionBuilder::new(collection_name)
                .vectors_config(VectorParamsBuilder::new(128, Distance::Cosine)),
        )
        .await
        .unwrap();

    qdrant
        .query(
            QueryPointsBuilder::new(collection_name)
                .query(Query::new_nearest(query_embedding))
                .limit(u64::MAX)
                .build(),
        )
        .await
        .unwrap();
}
```

Running this test causes the following Qdrant panic:

```
2024-11-19T15:46:34.643154Z  INFO storage::content_manager::toc::collection_meta_ops: Creating collection test_collection
2024-11-19T15:46:34.700389Z ERROR qdrant::startup: Panic backtrace:
   0: std::backtrace::Backtrace::create
   1: qdrant::startup::setup_panic_hook::{{closure}}
   2: std::panicking::rust_panic_with_hook
   3: std::panicking::begin_panic_handler::{{closure}}
   4: std::sys::backtrace::__rust_end_short_backtrace
   5: rust_begin_unwind
   6: core::panicking::panic_fmt
   7: hashbrown::raw::Fallibility::capacity_overflow
   8: hashbrown::raw::RawTable<T,A>::with_capacity_in
   9: collection::shards::local_shard::search::<impl collection::shards::local_shard::LocalShard>::do_search::{{closure}}
  10: collection::shards::local_shard::shard_ops::<impl collection::shards::shard_trait::ShardOperation for collection::shards::local_shard::LocalShard>::query_batch::{{closure}}
  11: collection::shards::replica_set::read_ops::<impl collection::shards::replica_set::ShardReplicaSet>::query_batch::{{closure}}::{{closure}}::{{closure}}
  12: collection::shards::replica_set::execute_read_operation::<impl collection::shards::replica_set::ShardReplicaSet>::execute_and_resolve_read_operation::{{closure}}
  13: <futures_util::future::try_future::into_future::IntoFuture<Fut> as core::future::future::Future>::poll
  14: collection::collection::query::<impl collection::collection::Collection>::batch_query_shards_concurrently::{{closure}}
  15: collection::collection::query::<impl collection::collection::Collection>::do_query_batch::{{closure}}
  16: collection::collection::query::<impl collection::collection::Collection>::query_batch::{{closure}}
  17: storage::content_manager::toc::point_ops::<impl storage::content_manager::toc::TableOfContent>::query_batch::{{closure}}
  18: <qdrant::tonic::api::points_api::PointsService as api::grpc::qdrant::points_server::Points>::query::{{closure}}
  19: <<api::grpc::qdrant::points_server::PointsServer<T> as tower_service::Service<http::request::Request<B>>>::call::QuerySvc<T> as tonic::server::service::UnaryService<api::grpc::qdrant::QueryPoints>>::call::{{closure}}
  20: <api::grpc::qdrant::points_server::PointsServer<T> as tower_service::Service<http::request::Request<B>>>::call::{{closure}}
  21: <tower::util::map_response::MapResponseFuture<F,N> as core::future::future::Future>::poll
  22: <tonic::transport::service::router::RoutesFuture as core::future::future::Future>::poll
  23: <qdrant::tonic::tonic_telemetry::TonicTelemetryService<S> as tower_service::Service<http::request::Request<hyper::body::body::Body>>>::call::{{closure}}
  24: <qdrant::tonic::logging::LoggingMiddleware<S> as tower_service::Service<http::request::Request<hyper::body::body::Body>>>::call::{{closure}}
  25: <tonic::transport::server::SvcFuture<F> as core::future::future::Future>::poll
  26: <hyper::proto::h2::server::H2Stream<F,B> as core::future::future::Future>::poll
  27: tokio::runtime::task::raw::poll
  28: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
  29: tokio::runtime::task::raw::poll
  30: std::sys::backtrace::__rust_begin_short_backtrace
  31: core::ops::function::FnOnce::call_once{{vtable.shim}}
  32: std::sys::pal::unix::thread::Thread::new::thread_start
  33: <unknown>
  34: <unknown>
```
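Frames 7 and 8 of the backtrace (`hashbrown::raw::Fallibility::capacity_overflow` from `RawTable::with_capacity_in`) can be reproduced outside Qdrant. The following is a minimal standard-library sketch, not Qdrant's actual code: it assumes the request limit is passed more or less directly as the capacity of a hash set, and `preallocation_panics` is a hypothetical helper name used only for illustration.

```rust
use std::collections::HashSet;
use std::panic;

/// Returns `true` if pre-allocating a hash set sized for `limit` entries panics.
/// Hypothetical helper: it only mimics the suspected allocation pattern.
fn preallocation_panics(limit: u64) -> bool {
    // Saturate instead of truncating on targets where usize is narrower than u64.
    let capacity = usize::try_from(limit).unwrap_or(usize::MAX);

    // `HashSet::with_capacity` is infallible, so an impossible capacity
    // surfaces as a panic ("Hash table capacity overflow") rather than an error.
    panic::catch_unwind(move || {
        let _set: HashSet<u64> = HashSet::with_capacity(capacity);
    })
    .is_err()
}
```

With this sketch, `preallocation_panics(u64::MAX)` reports a panic while a small limit such as `16` does not, which matches the symptom seen here.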




## Expected Behavior

I would expect a `u64::MAX` limit to return all points without panicking Qdrant.
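One way to get this behaviour, sketched here purely as an illustration, is to cap any pre-allocation at the number of points that could actually be returned. This is an assumption about a possible fix, not a description of how Qdrant handles it; `clamped_capacity` and `available_points` are hypothetical names.

```rust
/// Caps the capacity derived from a user-supplied `limit` at the number of
/// points that exist, so a huge limit never drives a huge allocation.
/// Hypothetical helper, not part of Qdrant.
fn clamped_capacity(limit: u64, available_points: usize) -> usize {
    usize::try_from(limit)
        // A limit that does not fit in usize is treated as "as many as possible".
        .unwrap_or(usize::MAX)
        // Never reserve room for more results than there are points.
        .min(available_points)
}
```

Under this scheme a limit of `u64::MAX` on a collection with 10 points reserves room for 10 results, and a limit of 3 still reserves room for only 3.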

## Context (Environment)





Why this came up: an older part of our code, written before Qdrant had the `query` API, used the `search` API to fetch all points whose payload matched a given filter. It issued a search request with a `u64::MAX` limit and a filter. After upgrading to Qdrant v1.12.4, that request triggered the panic above. We worked around it by switching to the `query` API with a limit of `u64::MAX` (but no actual query), which does not panic Qdrant.
