GH-704: Fix initialization of offset buffer when exporting VarChar vectors through C Data Interface #705

Merged
lidavidm merged 2 commits into apache:main from Kontinuation:fix-export-empty-varlen-array
Apr 8, 2025

Conversation

Member

@Kontinuation Kontinuation commented Apr 7, 2025

What's Changed

This patch fixes the initialization of offset buffers when exporting variable-width arrays through the Arrow C Data Interface. The original code incorrectly modified the member `this.offsetBuffer` when it should have initialized the newly allocated offset buffer. I think the diff itself is quite self-explanatory.

Closes #704 and probably #88.
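
For readers without the diff at hand, the class of bug described above can be sketched with plain `java.nio` buffers. This is a simplified model, not the actual Arrow code: the class `OffsetBufferSketch`, the helpers `allocateDirty` and `zero`, and both `allocateForExport*` methods are hypothetical names, and the "dirty" allocator only imitates an allocator that may return memory with leftover contents.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class OffsetBufferSketch {
    // Width of a single 32-bit offset, in bytes.
    static final int OFFSET_WIDTH = 4;

    // Stand-in for the vector's member buffer: empty, as for a vector
    // whose offset buffer was never allocated.
    private final ByteBuffer offsetBuffer = ByteBuffer.allocate(0);

    // Hypothetical allocator that returns memory with leftover contents.
    static ByteBuffer allocateDirty(int size) {
        byte[] bytes = new byte[size];
        Arrays.fill(bytes, (byte) 0x5A);
        return ByteBuffer.wrap(bytes);
    }

    // Overwrites every byte of the given buffer with zero.
    static void zero(ByteBuffer buf) {
        for (int i = 0; i < buf.capacity(); i++) {
            buf.put(i, (byte) 0);
        }
    }

    // Buggy variant: zeroes the member buffer, so the freshly allocated
    // buffer is returned with whatever bytes it happened to contain.
    ByteBuffer allocateForExportBuggy() {
        ByteBuffer fresh = allocateDirty(OFFSET_WIDTH);
        zero(this.offsetBuffer); // wrong target
        return fresh;
    }

    // Fixed variant: zeroes the buffer that is actually handed out.
    ByteBuffer allocateForExportFixed() {
        ByteBuffer fresh = allocateDirty(OFFSET_WIDTH);
        zero(fresh);
        return fresh;
    }

    public static void main(String[] args) {
        OffsetBufferSketch vector = new OffsetBufferSketch();
        System.out.println("buggy first offset: " + vector.allocateForExportBuggy().getInt(0));
        System.out.println("fixed first offset: " + vector.allocateForExportFixed().getInt(0));
    }
}
```

In this model the buggy variant reports a nonzero first offset for an empty array, which matches the "garbled data" symptom in the linked issue title, while the fixed variant reports 0.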

@Kontinuation Kontinuation changed the title GH-704, GH-88: Fix initializing offset buffer when exporting StringArrays through C Data Interface GH-704: Fix initializing offset buffer when exporting StringArrays through C Data Interface Apr 7, 2025

import org.apache.arrow.vector.ValueVector;
import org.apache.arrow.vector.VarBinaryVector;
import org.apache.arrow.vector.VarCharVector;
import org.apache.arrow.vector.VariableWidthVector;
Member

Unintentional import? This is failing the build

Member Author

Fixed. The other test failures seem to be unrelated to this PR.

@Kontinuation Kontinuation changed the title GH-704: Fix initializing offset buffer when exporting StringArrays through C Data Interface GH-704: Fix initialization of offset buffer when exporting Arrow vectors through C Data Interface Apr 7, 2025
@Kontinuation Kontinuation changed the title GH-704: Fix initialization of offset buffer when exporting Arrow vectors through C Data Interface GH-704: Fix initialization of offset buffer when exporting VarChar vectors through C Data Interface Apr 7, 2025
@Kontinuation Kontinuation requested a review from lidavidm April 8, 2025 04:15
@lidavidm lidavidm merged commit f92585c into apache:main Apr 8, 2025
20 of 28 checks passed
dongjoon-hyun pushed a commit to apache/spark that referenced this pull request May 15, 2025
### What changes were proposed in this pull request?
This PR aims to upgrade `arrow-java` from 18.2.0 to 18.3.0.

### Why are the changes needed?
The new version brings some bug fixes, such as:

- apache/arrow-java#627
- apache/arrow-java#654
- apache/arrow-java#656
- apache/arrow-java#693
- apache/arrow-java#705
- apache/arrow-java#707
- apache/arrow-java#722

In addition, the new version introduces a cascading upgrade for flatbuffers-java ([from 24.3.25 to 25.1.24](apache/arrow-java#600)).

The full release notes are as follows:
- https://github.com/apache/arrow-java/releases/tag/v18.3.0

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #50892 from LuciferYang/arrow-java-18.3.0.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
yhuang-db pushed a commit to yhuang-db/spark that referenced this pull request Jun 9, 2025
timhurskidremio pushed a commit to timhurskidremio/dremio-arrow-java that referenced this pull request Dec 5, 2025

Labels

bug-fix: PRs that fix a bug.

Development

Successfully merging this pull request may close these issues.

Offset buffer contains garbled data when exporting empty string arrays from Java
