Description
Current Behavior
In Qdrant v1.12.4, nearest neighbour searches/queries with a high limit (e.g., u64::MAX) cause Qdrant to panic.
It seems Qdrant now tries to allocate O(limit) memory for nearest neighbour searches.
This is a regression in v1.12.4. This bug is not present in v1.12.3.
Steps to Reproduce
The following test fails in v1.12.4:
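use qdrant_client::config::QdrantConfig;
use qdrant_client::qdrant::{
    CreateCollectionBuilder, Distance, Query, QueryPointsBuilder, VectorParamsBuilder,
};
use qdrant_client::Qdrant;

#[tokio::test]
async fn nearest_neighbour_limit() {
    let collection_name = "test_collection";
    let query_embedding = vec![1.0; 128];
    // `qdrant_url()` is a helper from our test suite (not shown) that returns the server URL.
    let qdrant = Qdrant::new(QdrantConfig::from_url(qdrant_url().as_str())).unwrap();

    // Create an empty collection of 128-dimensional cosine vectors.
    qdrant
        .create_collection(
            CreateCollectionBuilder::new(collection_name)
                .vectors_config(VectorParamsBuilder::new(128, Distance::Cosine)),
        )
        .await
        .unwrap();

    // Nearest-neighbour query with the largest possible limit: this is what triggers the panic.
    qdrant
        .query(
            QueryPointsBuilder::new(collection_name)
                .query(Query::new_nearest(query_embedding))
                .limit(u64::MAX)
                .build(),
        )
        .await
        .unwrap();
}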
Running this test causes the following Qdrant panic:
2024-11-19T15:46:34.643154Z INFO storage::content_manager::toc::collection_meta_ops: Creating collection test_collection
2024-11-19T15:46:34.700389Z ERROR qdrant::startup: Panic backtrace:
0: std::backtrace::Backtrace::create
1: qdrant::startup::setup_panic_hook::{{closure}}
2: std::panicking::rust_panic_with_hook
3: std::panicking::begin_panic_handler::{{closure}}
4: std::sys::backtrace::__rust_end_short_backtrace
5: rust_begin_unwind
6: core::panicking::panic_fmt
7: hashbrown::raw::Fallibility::capacity_overflow
8: hashbrown::raw::RawTable<T,A>::with_capacity_in
9: collection::shards::local_shard::search::<impl collection::shards::local_shard::LocalShard>::do_search::{{closure}}
10: collection::shards::local_shard::shard_ops::<impl collection::shards::shard_trait::ShardOperation for collection::shards::local_shard::LocalShard>::query_batch::{{closure}}
11: collection::shards::replica_set::read_ops::<impl collection::shards::replica_set::ShardReplicaSet>::query_batch::{{closure}}::{{closure}}::{{closure}}
12: collection::shards::replica_set::execute_read_operation::<impl collection::shards::replica_set::ShardReplicaSet>::execute_and_resolve_read_operation::{{closure}}
13: <futures_util::future::try_future::into_future::IntoFuture<Fut> as core::future::future::Future>::poll
14: collection::collection::query::<impl collection::collection::Collection>::batch_query_shards_concurrently::{{closure}}
15: collection::collection::query::<impl collection::collection::Collection>::do_query_batch::{{closure}}
16: collection::collection::query::<impl collection::collection::Collection>::query_batch::{{closure}}
17: storage::content_manager::toc::point_ops::<impl storage::content_manager::toc::TableOfContent>::query_batch::{{closure}}
18: <qdrant::tonic::api::points_api::PointsService as api::grpc::qdrant::points_server::Points>::query::{{closure}}
19: <<api::grpc::qdrant::points_server::PointsServer<T> as tower_service::Service<http::request::Request<B>>>::call::QuerySvc<T> as tonic::server::service::UnaryService<api::grpc::qdrant::QueryPoints>>::call::{{closure}}
20: <api::grpc::qdrant::points_server::PointsServer<T> as tower_service::Service<http::request::Request<B>>>::call::{{closure}}
21: <tower::util::map_response::MapResponseFuture<F,N> as core::future::future::Future>::poll
22: <tonic::transport::service::router::RoutesFuture as core::future::future::Future>::poll
23: <qdrant::tonic::tonic_telemetry::TonicTelemetryService<S> as tower_service::Service<http::request::Request<hyper::body::body::Body>>>::call::{{closure}}
24: <qdrant::tonic::logging::LoggingMiddleware<S> as tower_service::Service<http::request::Request<hyper::body::body::Body>>>::call::{{closure}}
25: <tonic::transport::server::SvcFuture<F> as core::future::future::Future>::poll
26: <hyper::proto::h2::server::H2Stream<F,B> as core::future::future::Future>::poll
27: tokio::runtime::task::raw::poll
28: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
29: tokio::runtime::task::raw::poll
30: std::sys::backtrace::__rust_begin_short_backtrace
31: core::ops::function::FnOnce::call_once{{vtable.shim}}
32: std::sys::pal::unix::thread::Thread::new::thread_start
33: <unknown>
34: <unknown>
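Frames 7 and 8 of the backtrace point at a hash table being created with a capacity that overflows. The sketch below reproduces that failure mode in isolation, using only the Rust standard library (whose `HashSet` is backed by hashbrown), and shows one possible mitigation. It is not Qdrant's code: the function names and the `available_points` parameter are hypothetical, and where the capacity actually comes from inside `do_search` is an assumption based on the backtrace.

```rust
use std::collections::HashSet;
use std::panic;

/// Hypothetical stand-in for the failing code path: pre-allocates a set
/// sized directly by the client-supplied limit.
fn seen_ids_unbounded(limit: usize) -> HashSet<u64> {
    HashSet::with_capacity(limit)
}

/// Hypothetical mitigation: never pre-allocate more slots than there are
/// points that could be returned.
fn seen_ids_clamped(limit: usize, available_points: usize) -> HashSet<u64> {
    HashSet::with_capacity(limit.min(available_points))
}

/// Returns true if pre-allocating for `limit` panics.
fn unbounded_allocation_panics(limit: usize) -> bool {
    panic::catch_unwind(|| seen_ids_unbounded(limit)).is_err()
}
```

Calling `unbounded_allocation_panics(usize::MAX)` is expected to report a panic, because the requested capacity cannot be converted into a bucket count, whereas `seen_ids_clamped(usize::MAX, n)` only allocates room for `n` entries. Clamping to the number of available points is just one option; growing the set lazily would avoid the up-front allocation as well.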
Expected Behavior
I would expect a u64::MAX limit to return all points and not panic Qdrant.
Context (Environment)
Why this came up: an older part of our code (written before Qdrant had the query API) was using the search API to fetch all points whose payload matched a given filter. This created a search request with a u64::MAX limit and a filter. After we upgraded to Qdrant v1.12.4, this caused the above panic. We fixed this by switching to the query API with a limit of u64::MAX (but no actual query). This does not panic Qdrant.