Skip to content

Improve performance of buffered string write#5985

Closed
bbowyersmyth wants to merge 2 commits intonpgsql:mainfrom
bbowyersmyth:users/bruceb/writePerf
Closed

Improve performance of buffered string write#5985
bbowyersmyth wants to merge 2 commits intonpgsql:mainfrom
bbowyersmyth:users/bruceb/writePerf

Conversation

@bbowyersmyth
Copy link
Contributor

When the string to be written can potentially be fully buffered there is a second call to GetByteCount (first one is GetSize). If the converter is not nested inside another converter, the size has already been calculated.
Basing this off BitArrayBitStringConverter that its safe to do.

PgReader.cs change is just to remove the DisplayClass struct creation.

Text.cs test string of 10_000 doesn't fit in the test buffer.

Before

Method Value Mean Error StdDev Op/s Gen0 Allocated
Read x 25.41 ns 0.203 ns 0.190 ns 39,359,074.1 0.0023 24 B
Write x NA NA NA NA NA NA
Read xxxxxxxxxx 27.39 ns 0.370 ns 0.346 ns 36,512,150.3 0.0046 48 B
Write xxxxxxxxxx 20.93 ns 0.230 ns 0.204 ns 47,775,325.5 0.0000 1 B
Read xxxx(...)xxxx [100] 30.99 ns 0.196 ns 0.184 ns 32,267,999.8 0.0214 224 B
Write xxxx(...)xxxx [100] 24.14 ns 0.064 ns 0.057 ns 41,425,834.0 0.0007 8 B
Read xxxx(...)xxxx [1000] 107.35 ns 0.816 ns 0.637 ns 9,315,283.3 0.1935 2024 B
Write xxxx(...)xxxx [1000] 74.81 ns 0.312 ns 0.261 ns 13,367,517.5 0.0074 78 B

After

Method Value Mean Error StdDev Op/s Gen0 Allocated
Read x 23.42 ns 0.095 ns 0.089 ns 42,704,740.3 0.0023 24 B
Write x NA NA NA NA NA NA
Read xxxxxxxxxx 25.19 ns 0.075 ns 0.070 ns 39,701,387.8 0.0046 48 B
Write xxxxxxxxxx 15.57 ns 0.039 ns 0.037 ns 64,237,944.4 0.0000 -
Read xxxx(...)xxxx [100] 31.36 ns 0.174 ns 0.145 ns 31,890,144.5 0.0214 224 B
Write xxxx(...)xxxx [100] 20.31 ns 0.058 ns 0.052 ns 49,229,262.7 0.0005 5 B
Read xxxx(...)xxxx [1000] 107.48 ns 0.698 ns 0.653 ns 9,304,087.4 0.1935 2024 B
Write xxxx(...)xxxx [1000] 57.45 ns 0.100 ns 0.094 ns 17,405,246.9 0.0052 54 B

@roji
Copy link
Member

roji commented Jan 2, 2025

/cc @NinoFloris

Copy link
Member

@NinoFloris NinoFloris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, it's a good optimization but it needs some tweaking before we can merge it.

// Update PgSerializerOptions.IsWellKnownTextType(Type) after any changes to this list.
mappings.AddType<string>(DataTypeNames.Text,
static (options, mapping, _) => mapping.CreateInfo(options, new StringTextConverter(options.TextEncoding), preferredFormat: DataFormat.Text), isDefault: true);
static (options, mapping, _) => mapping.CreateInfo(options, new StringTextConverter(options.TextEncoding, nested: false), preferredFormat: DataFormat.Text), isDefault: true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nestedness is not a mapping invariant. For instance this converter will be composed over by the array converter (via AddArrayType) and it would do the wrong thing. You'd have to look at the writer (we could maybe have some state there) to know whether the converter is called in a top-level context.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I'll take this PR back to draft for a while. I'll have to get a better understanding of the mapping to know what the next steps are.

}

public void WriteChars(ReadOnlySpan<char> data, Encoding encoding)
public void WriteChars(ReadOnlySpan<char> data, Encoding encoding, int? encodedByteLength = null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should make these error-prone methods public. Instead we can apply the optimization to the internal converters. I'm fine having these signatures accessible internally.

@bbowyersmyth bbowyersmyth marked this pull request as draft February 8, 2025 22:13
@NinoFloris
Copy link
Member

I'm going to close this draft. The idea is sound, but the approach has some problems. If we want to avoid the extra work we should instead return a writeState from GetSize that contains the encoded length, this would work in all cases. The downside being that it adds an alloc per string write.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants