[Tests] Fully rework type mapping assertion helpers#6314
[Tests] Fully rework type mapping assertion helpers#6314NinoFloris wants to merge 24 commits intonpgsql:mainfrom
Conversation
27da7fe to
ec83a23
Compare
| NpgsqlDbType.Char => DbType.String, | ||
| NpgsqlDbType.Name => DbType.String, | ||
| NpgsqlDbType.Citext => DbType.String, | ||
| NpgsqlDbType.Refcursor => DbType.Object, |
There was a problem hiding this comment.
All of these are redundant, and we already don't explicitly map all the other cases of NpgsqlDbType either.
There was a problem hiding this comment.
I think the intent here was mostly just to be explicit in the source code. No issue with removing.
There was a problem hiding this comment.
No exactly, I'm fine either way too but then we should just fill it out.
ec83a23 to
9d974de
Compare
9d974de to
88c8c2c
Compare
There was a problem hiding this comment.
Pull request overview
This PR performs a comprehensive refactoring of the type mapping assertion helpers used throughout Npgsql's test suite. The main goal is to remove ambiguous "default" concepts and replace them with more explicit, precise testing parameters that better capture the nuances of type inference behavior.
Key Changes:
- Replaced
isDefault,isDefaultForReading, andisDefaultForWritingparameters with a clearerdataTypeInferenceparameter that explicitly states whether types are inferred exactly, through well-known mappings, or not at all - Introduced
ExpectedDbTypestruct to handle cases where different DbTypes are expected based on how the parameter is set (via DataTypeName vs value inference) - Renamed
pgTypeNametodataTypeNamethroughout for consistency - Added more comprehensive negative matching for write-side checks
- Removed several unused
usingdirectives forNpgsqlTypes
Reviewed changes
Copilot reviewed 35 out of 35 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| test/Npgsql.Tests/Support/TestBase.cs | Core refactoring of assertion helpers with new DataTypeInference and ExpectedDbType patterns |
| test/Npgsql.Tests/Types/*.cs | Updated all type test files to use new assertion helper signatures |
| test/Npgsql.PluginTests/*.cs | Updated plugin test files to use new assertion helper signatures |
| src/Npgsql/NpgsqlTypes/NpgsqlDbType.cs | Simplified ToDbType method by removing redundant Object mappings |
| test/Npgsql.Tests/CommandTests.cs | Minor update using UTF-8 string literals |
| test/Npgsql.Tests/LargeObjectTests.cs | Minor update using UTF-8 string literals |
Comments suppressed due to low confidence (1)
test/Npgsql.Tests/Types/FullTextSearchTests.cs:93
- Call to obsolete method Parse.
await AssertType(NpgsqlTsVector.Parse("'1'"), "'1'", "tsvector");
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 35 out of 35 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
NinoFloris
left a comment
There was a problem hiding this comment.
Highlighted some examples to explain the new DbType assertions.
| isDefault: false); | ||
| => AssertType(new TimeSpan(0, 10, 45, 34, 500), "10:45:34.5", | ||
| "time without time zone", dataTypeInference: DataTypeInferenceKind.WellKnown, | ||
| dbType: new(DbType.Time, DbType.Object), isValueTypeDefaultFieldType: false); |
There was a problem hiding this comment.
For example this comes back as DbType.Object for value based inference because Interval is the first mapping for TimeSpan, which has no DbType mapping.
| => AssertType(time, sqlLiteral, "time with time zone", isDefault: false); | ||
| => AssertType(time, sqlLiteral, | ||
| "time with time zone", dataTypeInference: DataTypeInferenceKind.WellKnown, | ||
| dbType: new(DbType.Object, DbType.DateTime)); |
There was a problem hiding this comment.
This on the other hand returns DbType.Object when we do know the data type, as time with time zone has no sensible DbType case.
| isDataTypeInferredFromValue: false); | ||
| => AssertType(instant, sqlLiteral, | ||
| "timestamp with time zone", dataTypeInference: false, | ||
| dbType: new(DbType.DateTime, DbType.Object)); |
There was a problem hiding this comment.
All of these nodatime mappings specify DbType.Object for the value based inference, as these nodatime types are unknown to npgsql before parameter binding.
test/Npgsql.Tests/Types/TextTests.cs
Outdated
| public Task Text_as_array_of_bytes() | ||
| => AssertType(Encoding.UTF8.GetBytes("foo"), "foo", "text", DbType.String, isDefault: false); | ||
| => AssertType("foo"u8.ToArray(), "foo", "text", dataTypeInference: DataTypeInferenceKind.WellKnown, | ||
| new(DbType.String, DbType.Binary), isValueTypeDefaultFieldType: false); |
There was a problem hiding this comment.
These byte[], ROM and stream tests all default to DbType.Binary through value inference due to them being bytea defaults, which maps to DbType.Binary.
| await AssertType(8D, "8", "numeric", DbType.Decimal, isDefault: false); | ||
| await AssertType(8M, "8", "numeric", DbType.Decimal, isDefault: false); | ||
| await AssertType(5.5m, "5.5", "numeric", dbType: DbType.Decimal); | ||
| await AssertTypeWrite(5.5m, "5.5", "numeric", dbType: new(DbType.VarNumeric, DbType.Decimal, DbType.Decimal)); |
There was a problem hiding this comment.
This is one of the few places we use the three parameter constructor. (around datetime there are another few cases)
We use it when we want to check whether the alias cases write correctly (e.g. varnumeric is an alias of decimal for us, but we want to know whether that mapping works).
In this case the first argument is the db type to set on the parameter, the second and third arguments are the same as the two value constructor (first comes the datatype based dbtype, then the value inferred dbtype).
In this assert we expect it to return DbType.Decimal for both. DataTypeName "numeric" being mapped to DbType.Decimal and CLR Type decimal being mapped to DbType.Decimal as well.
roji
left a comment
There was a problem hiding this comment.
Quick general question before I do a more complete review.
| NpgsqlDbType.Char => DbType.String, | ||
| NpgsqlDbType.Name => DbType.String, | ||
| NpgsqlDbType.Citext => DbType.String, | ||
| NpgsqlDbType.Refcursor => DbType.Object, |
There was a problem hiding this comment.
I think the intent here was mostly just to be explicit in the source code. No issue with removing.
| dataTypeName, | ||
| isDefault: false, | ||
| isDataTypeInferredFromValue: false); | ||
| dataTypeName, dataTypeInference: false, |
There was a problem hiding this comment.
Tiny nit: when we split arguments to different lines as in this case, I generally prefer to do all of them on different lines - I personally find it easier to read. Obviously no big deal.
There was a problem hiding this comment.
Yeah in most of these cases that I changed I have paired them with their related arguments. That's mostly why they come like that. I personally find it scans a bit more easily if I can scan a concept per line. Completely fine to change this.
| "1998-04-12 13:26:38.789", | ||
| "timestamp without time zone", | ||
| DbType.DateTime, | ||
| isDefaultForReading: false, |
There was a problem hiding this comment.
I have to say that I find the current form more immediately understandable and useful as a "spec/documentation" of Npgsql's behavior - this is clearly telling me that PG timestamp does not get read as LocalDateTime by default (i.e. one must do GetFieldValue<LocalDateTime>() to explicitly request it), whereas LocalDateTime does get written by default as PG timestamp (i.e. one doesn't need to set DataTypeName/NpgsqlDbType).
On the new side of the diff I now see dataTypeInference I'm not longer seeing that information. It could be just my way of thinking (I wrote the current system after all) but it seems like some information is getting lost here no?
Note that I definitely am for tightening assertions and checks, renaming things and other stuff you're doing in this PR - just not sure if this specific change to the actual AssertType API is necessary for that etc.
There was a problem hiding this comment.
I see, yes I think indeed that's a small difference in understanding. To me GetValue() is not necessarily a default (it also has to be called) or for that matter any less explicit than a generic method. That also fits with the reasoning around GetFieldValue<object> needing to do the same thing as GetValue(). But I agree that if you want something other than a boxed object instance of *some* type, you do have to know about that type and pass it. For those cases there is indeed more information requested of the caller, if that's what you mean.
whereas LocalDateTime does get written by default as PG timestamp (i.e. one doesn't need to set DataTypeName/NpgsqlDbType).
This part is mostly you inferring this from your existing experience, as we're not really passing the write side default argument at all. Though I agree that currently there is a nice (looking) 'default' parameter for writing and reading.
The most significant point of contention I have with “default” is on the write side though. There it really doesn't work well conceptually due to the multiple ways we try to attempt to find a mapping if we have no data type to go on (where we check whether we have an actual default mapping, but also attempt to fallback to the first clr type match if we don't have a default mapping). Which also means that if you just go by seeing whether a mapping does not have isDefault:true (in a type info resolver) that it does not mean it won't be seen as a 'default' for writing in the current tests. Unless you know exactly what all this entails you will just try some flags until the test passes, maybe having disabled half the checks (which is why I'm adding more negative matching too).
Regarding the isDefaultForReading name change (or just isDefault for AssertTypeRead) to isValueTypeDefaultFieldType. Given that we *always* align the type coming from GetValue() and FieldType (this was also true in the handler system) I elaborated the name to add the mention of its input valueType (typeof(T) basically), and the output it checks FieldType.
Now in the PR I mentioned that I wanted to get rid of the notion of default in the names of things, and on (re)reviewing this myself I can't say I have succeeded 😅, so the new name leaves a lot to be desired. I have pushed a change to rename this to valueTypeEqualsFieldType instead (also note that it's not used a lot either way, about 100 hits where actual tests pass this in).
I hope that valueTypeEqualsFieldType is something you (and @vonzshik) can get used to, but if you feel strongly about the name or the approach we should discuss it a bit.
The trend for this PR should be to have assertion argument names better describe which inputs are being checked against which outputs, and to drive down the number of (argument level) knobs. The fewer we can get away with the more we can confidently say we have a predictable and consistent mapping system, which also helps future code align better to those standards.
There was a problem hiding this comment.
So I think we're we might have a gap is around looking at the type mapping system from the inside or from the outside... I'm generally looking at these tests 100% from a user perspective - I generally like that tests encode the user contract as much as possible (where that makes sense). And if you look at our type mapping docs, you'll see the same "default" concept expressed in the two tables:
- On the read side (1st table), each PG type has a single .NET read type, e.g. PG
textgets read as .NETstringby the untyped APIs (e.g.GetValue()), but can also be read aschar[]when the typed API is used (GetFieldValue<T>()). - In my head (and currently in the docs - 2nd table) the same kinda logic works on the write side: each .NET type has a single PG type, which is what Npgsql writes unless you tell it to write something else via NpgsqlDbType/DbType/DataTypeName.
To me both of the above map pretty nicely to the concept of "default" - from a very user-oriented perspective - are you saying you think it's problematic? If so, would you explain it in another way to users in the docs, which would simplify things somehow? Or can you give a problematic scenario or two where describing things to users as "default" (either on the read or write side) does a disservice/is incorrect? Because if thinking about this as "default" works for explaining our behavior to users, it feels like it should work in the tests too (though I get that this might not always be 100% true).
Regarding the isDefaultForReading name change (or just isDefault for AssertTypeRead) to isValueTypeDefaultFieldType. Given that we always align the type coming from GetValue() and FieldType (this was also true in the handler system) I elaborated the name to add the mention of its input valueType (typeof(T) basically), and the output it checks FieldType.
Sure, I understand - in my head the alignment of GetValue() and FieldType just naturally flows from, well, having a default .NET type that's read for the given PG type (i.e. since PG text is by default read as string, GetValue() and FieldType always return string). So your proposed naming just seems to go more low-level, encompass two things and therefore be a bit less immediately understandable, where there's (for me) a simple high-level concept that already expressed this. But I can't say it bothers me that much, if it's important to you.
BTW I'm posting a long/serious answer here because it's an interesting discussion, but it's not like I have super-strong feelings about it either :) This is just test factoring after all.
| bool isDefault = true, | ||
| Func<T, T, bool>? comparer = null, | ||
| Type? fieldType = null) | ||
| public enum DataTypeInferenceKind |
There was a problem hiding this comment.
Maybe document these? I honestly am not sure what exactly they mean in the context of these tests and need to read the code to figure it out.
There was a problem hiding this comment.
Yes, will do. I'll also add a bit of rationale here about the what and why of dataTypeInference as the assertion argument.
There was a problem hiding this comment.
Took a shot at letting opus generate a rationale.
Please explain what dataTypeInference in AssertType does
Opus:
DataTypeInference controls how the test validates Npgsql's ability to infer the PostgreSQL data type from a CLR value.
What it tests:
-
Exact(ortrue): The CLR type uniquely maps to the expected PostgreSQL type. The test verifies that setting only theValueproperty (withoutNpgsqlDbTypeorDataTypeName) correctly infers the exact type. Both generic and non-generic parameters are tested. -
WellKnown: The CLR type maps to some PostgreSQL type (notUnknown), but not necessarily the expected one. Used when a type is recognized but ambiguous—the test accepts whatever type Npgsql infers rather than requiring an exact match. -
Unknown(orfalse): The CLR type cannot be used to infer any PostgreSQL type. The test expectsDataTypeNameto benullandNpgsqlDbTypeto beUnknown. Value-only parameters are excluded from the roundtrip test since they'd fail.
In practice:
The test always validates explicit type specification (via NpgsqlDbType, DataTypeName, DbType). The DataTypeInference parameter determines whether value-only scenarios are tested and what inference behavior is expected.
Could you explain why WellKnown exists over having just Unknown without negative matching?
Opus:
The key difference is what gets tested vs skipped:
With WellKnown (current approach):
- Value-only parameters are still included in the roundtrip test
- The test writes the value and verifies it round-trips correctly
- It just doesn't assert the specific inferred type (accepts whatever Npgsql infers)
With only Unknown (no negative matching):
- Value-only parameters would be skipped entirely from the test
- You lose coverage that the value can be written at all via inference
Example: Testing string → json
| Approach | What happens |
|---|---|
WellKnown |
Test writes string with value-only, verifies round-trip works (even though Npgsql infers text, not json) |
Unknown |
Test skips value-only entirely—no verification that string values work |
The WellKnown approach catches regressions where a CLR type that used to infer successfully suddenly fails to infer anything. With Unknown, you'd never notice because that scenario isn't tested.
There was a problem hiding this comment.
I'll add that this parameter just replaces isDataTypeInferredFromValue (it's a rename + the addition of the WellKnown to enable negative matching). However, as value only inference is really the only observable property for what we used to consider 'defaultness' it also meant isDefaultForWriting could be removed. Similarly to how valueTypeEqualsFieldType is all that is relevant for defaultness on the read side.
f8e0989 to
b172faf
Compare
b172faf to
2b49d7a
Compare
2b49d7a to
5d8d1e0
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 36 out of 36 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 35 out of 35 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 35 out of 35 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
463faa8 to
2416f41
Compare
Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 35 out of 35 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
a15c0eb to
86d6b42
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 35 out of 35 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
86d6b42 to
372b458
Compare
372b458 to
edcf6e1
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 35 out of 35 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
edcf6e1 to
6d766cc
Compare
6d766cc to
31ef76a
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 35 out of 35 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This PR contains a few things:
A notable example of this is an edge-case test for DbType.VarNumeric, which maps to NpgsqlDbType.Numeric, while DbType.Decimal is the canonical mapping to Numeric. In this test we want to be able to set VarNumeric on DbType and read this value back out of the property (currently CheckInference contains a roundtripping quirk for this, but that will go after Add IDbTypeResolver to allow plugins to control how DbTypes are mapped #6267 is merged, when DbType gets its own storage). The other inference checks however still want to observe the canonical DbType.Decimal when inferring (e.g. from a decimal value, or when the dataTypeName is "decimal" or "numeric").
A more common example is where dataTypeName maps to DbType.Object while the value inferred DbType would be something more strongly typed, like DbType.String or Binary (bitstring or datetime tests are a good source for these).
Note, some of the tests have additional asserts included. In those cases I usually ran into them while trying to lock down the right assertion changes, and took the liberty to leave the asserts in as long as they provided value.