
Conversation

@nandini12396
Contributor

This change addresses high GC pressure by allocating tiered storage fetch buffers in direct (off-heap) memory instead of the JVM heap. When direct memory is exhausted, the system gracefully falls back to heap allocation with a warning.

Problem:
During tiered storage reads, heap-allocated buffers bypass the young generation and go directly to the old generation (humongous allocations). Under high read load, these accumulate rapidly and trigger frequent, expensive G1 Old Generation collections, causing significant GC pauses.

Solution:

  • Introduced DirectBufferPool that pools direct buffers using WeakReferences, allowing GC to reclaim buffers under memory pressure
  • Modified RemoteLogInputStream to use pooled direct buffers instead of per-request heap allocation
  • Graceful fallback to heap allocation when direct memory is exhausted (see the sketch below)
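
A minimal sketch of this pool-plus-fallback idea; the class shape and method names here are illustrative and may differ from the actual code in the PR:

```java
import java.lang.ref.WeakReference;
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentLinkedQueue;

// Illustrative sketch only; the actual DirectBufferPool in this PR may differ.
public class DirectBufferPool {
    // Only weak references are held, so the GC can reclaim pooled buffers
    // (and their native memory, via the buffer's cleaner) under memory pressure.
    private final ConcurrentLinkedQueue<WeakReference<ByteBuffer>> pool = new ConcurrentLinkedQueue<>();

    public ByteBuffer acquire(int size) {
        WeakReference<ByteBuffer> ref;
        while ((ref = pool.poll()) != null) {
            ByteBuffer buffer = ref.get();
            if (buffer != null && buffer.capacity() >= size) {
                buffer.clear();
                buffer.limit(size);
                return buffer;
            }
        }
        try {
            return ByteBuffer.allocateDirect(size);
        } catch (OutOfMemoryError e) {
            // Direct memory exhausted: gracefully fall back to a heap buffer.
            return ByteBuffer.allocate(size);
        }
    }

    public void release(ByteBuffer buffer) {
        if (buffer.isDirect()) {
            pool.offer(new WeakReference<>(buffer));
        }
        // Heap fallback buffers are simply left to normal GC.
    }
}
```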

@github-actions github-actions bot added the triage (PRs from the community), storage (Pull requests that target the storage module), tiered-storage (Related to the Tiered Storage feature), and clients labels Dec 7, 2025
@apoorvmittal10
Contributor

Thanks for the PR. I can see that new metrics have been introduced in this PR, which fall under Monitoring and require a KIP (https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals).

Member

@showuon showuon left a comment


@nandini12396, thanks for the PR! Some high-level comments:

  1. It looks to me that introducing a buffer pool with indirect (heap) buffers could fix the GC pressure issue, right? Is there any other reason we want to use direct buffers?
  2. New configs/metrics need to go through a KIP.

@github-actions github-actions bot removed the triage (PRs from the community) label Dec 11, 2025
@nandini12396
Contributor Author

nandini12396 commented Dec 11, 2025

Hi @showuon, thanks for your review!
Pooling heap buffers would reduce allocation frequency but doesn't eliminate GC pressure. In G1GC, objects >32MB (half the region size) are "humongous" and skip the young generation entirely; they go straight to old gen. Even with pooling, these buffers:

  • Get scanned during every GC cycle (even if reused)
  • Contribute to heap occupancy that triggers GC
  • Can only be collected in expensive mixed/full GCs.

In tiered storage, maxBytes can reach 55MB+ based on replica.fetch.max.bytes and replica.fetch.response.max.bytes. With a 4GB heap at IHOP=35%, just ~25 concurrent fetches (25 × 55MB = 1.375GB) trigger old GC.

Direct buffers move the data off-heap entirely into native memory where GC doesn't see it. We also get zero-copy I/O since the data is already in native memory for socket writes.
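
For context on the zero-copy point: when a heap ByteBuffer is written to a socket channel, the JDK first copies it into an internal temporary direct buffer, whereas a direct buffer can be handed to the OS as-is. A rough illustration, using placeholder channels and sizes rather than anything from this PR:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

public final class DirectRelayExample {
    // Streams a remote segment chunk into a direct buffer and writes it out.
    // With a heap buffer, the JDK copies the bytes into an internal direct
    // buffer before each channel write; a direct buffer skips that extra copy.
    static void relay(ReadableByteChannel in, WritableByteChannel out, int chunkSize) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocateDirect(chunkSize); // off-heap, so it never adds heap occupancy
        while (in.read(buffer) != -1) {
            buffer.flip();
            while (buffer.hasRemaining()) {
                out.write(buffer);
            }
            buffer.clear();
        }
    }
}
```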

To share some test results (heap buffers vs. direct buffers):

  • GC frequency: every ~100ms vs. every 30-40s
  • Heap after GC: 1.1-1.3GB vs. 325MB
  • Humongous regions: 546-689 vs. ~270 (50% reduction)

I will remove the metrics changes from this PR so we can focus on the buffer pool implementation.

@nandini12396 nandini12396 force-pushed the KAFKA-19967-reduce-gc-tiered-storage-direct-memory branch from 9a49d38 to d2592f2 Compare December 15, 2025 13:20
@nandini12396
Copy link
Contributor Author

I've updated the PR without the metrics changes so it focuses on fixing the issue. Could you please take a look and review? Thank you!

Contributor

@kamalcph kamalcph left a comment


Default consumer configs are max.partition.fetch.bytes = 1 MB and fetch.max.bytes = 50 MB. If the messages in the topic are ~1 MB, then a 1 MB heap buffer might be fine for those cases.

We may have to provide another config to choose between direct and heap buffers when enabling the buffer pool, since using direct buffers might cause page faults and potentially impact produce latencies.

Thanks for the patch!
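
A minimal sketch of the kind of direct/heap toggle suggested above; all names here are hypothetical, and DirectBufferPool refers to the earlier sketch, not the PR's actual classes:

```java
import java.nio.ByteBuffer;

// Hypothetical call-site sketch: choose heap or direct allocation based on the
// proposed config. DirectBufferPool here refers to the sketch shown earlier.
final class FetchBufferAllocator {
    private final boolean useDirectBuffers; // would be driven by the new config
    private final DirectBufferPool pool;

    FetchBufferAllocator(boolean useDirectBuffers, DirectBufferPool pool) {
        this.useDirectBuffers = useDirectBuffers;
        this.pool = pool;
    }

    ByteBuffer allocate(int size) {
        // Small fetches (e.g. the ~1 MB consumer defaults) can stay on the heap;
        // large tiered-storage reads use the direct pool only when enabled.
        return useDirectBuffers ? pool.acquire(size) : ByteBuffer.allocate(size);
    }
}
```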

MEDIUM,
REMOTE_LIST_OFFSETS_REQUEST_TIMEOUT_MS_DOC);
REMOTE_LIST_OFFSETS_REQUEST_TIMEOUT_MS_DOC)
.define(REMOTE_LOG_DIRECT_BUFFER_POOL_ENABLED_PROP,
Contributor


could you mark the config as internal until the KIP gets approved?

.define -> .defineInternal
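
For reference, a sketch of what the .defineInternal form could look like, using one of ConfigDef's defineInternal overloads; the key string, default value, and importance below are placeholders, and only the constant name comes from the diff above:

```java
import org.apache.kafka.common.config.ConfigDef;

// Sketch only: the key string, default value, and importance are placeholders;
// the constant name comes from the PR diff above.
public class RemoteLogConfigSketch {
    static final String REMOTE_LOG_DIRECT_BUFFER_POOL_ENABLED_PROP =
            "remote.log.direct.buffer.pool.enable";   // hypothetical key string

    static final ConfigDef CONFIG_DEF = new ConfigDef()
            // defineInternal registers the config without exposing it in the
            // generated public documentation, so it can ship ahead of a KIP.
            .defineInternal(REMOTE_LOG_DIRECT_BUFFER_POOL_ENABLED_PROP,
                    ConfigDef.Type.BOOLEAN,
                    false,
                    ConfigDef.Importance.MEDIUM);
}
```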

Contributor Author


updated

@nandini12396 nandini12396 force-pushed the KAFKA-19967-reduce-gc-tiered-storage-direct-memory branch from d2592f2 to 6da88aa Compare December 16, 2025 22:06
@nandini12396 nandini12396 force-pushed the KAFKA-19967-reduce-gc-tiered-storage-direct-memory branch from 6da88aa to 343a450 Compare December 16, 2025 22:40