Skip to content

Allow mapping types with search_path#4869

Closed
vonzshik wants to merge 1 commit intomainfrom
4557-map-type-with-searchpath
Closed

Allow mapping types with search_path#4869
vonzshik wants to merge 1 commit intomainfrom
4557-map-type-with-searchpath

Conversation

@vonzshik
Copy link
Contributor

Closes #4557

@vonzshik vonzshik requested a review from roji as a code owner January 12, 2023 16:06
// If the name was found but the value is null, that means that there are
// two db types with the same name (different schemas).
// Try to fall back to search_path + pgName.
if (searchPath?.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries) is { Length: > 0 } searchPaths)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not entirely correct as PG always searches first in pg_catalog unless it's specified, but it's probably fine?

@NicolasDorier
Copy link

I would really love this one! :)

@vonzshik vonzshik force-pushed the 4557-map-type-with-searchpath branch from ba85e67 to 1f5a60f Compare February 27, 2023 13:27
DatabaseInfo = databaseInfo;
connector.DatabaseInfo = databaseInfo;
typeMapper.Initialize(databaseInfo, _resolverFactories, _userTypeMappings);
typeMapper.Initialize(databaseInfo, _resolverFactories, _userTypeMappings, Settings.SearchPath);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This takes the search path from the connection string, but it may be configured in postgresql.conf... If we want to match the PG behavior, I think we'd need to query search_path when first connecting to a database, and use that as the default. However, even that wouldn't be enough to be 100% compatible since the user can still change search_path via a regular command, and PG doesn't send ParameterStatus for that particular setting (so we can't be aware of changes).

This kind of complication is why I originally hesitated going into all of this; it may be safer to have users provide their fully-qualified type names when more than one exists - because otherwise we can only provide an approximate (and complex) logic... What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This takes the search path from the connection string, but it may be configured in postgresql.conf... If we want to match the PG behavior, I think we'd need to query search_path when first connecting to a database, and use that as the default. However, even that wouldn't be enough to be 100% compatible since the user can still change search_path via a regular command, and PG doesn't send ParameterStatus for that particular setting (so we can't be aware of changes).

Do we really want to match the PG behavior? While on paper it does sound nice, I think that if you have multiple schemas (otherwise it doesn't really make sense) that means that there's probably a different application per schema. And if so, I doubt that they're going to use the exact same search_path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the original issue was about making it so we map on connection string's SearchPath. If someone actually requests automatic mapping based on pg's search_path (though that's going to be quite a breaking change), we can think about that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really want to match the PG behavior?

If we start respecting search_path in some sense, it seems problematic to not really respect it in the same way PG does, but only partially... I think it's fine not to respect it at all, but doing partial behavior feels even worse...

While on paper it does sound nice, I think that if you have multiple schemas (otherwise it doesn't really make sense) that means that there's probably a different application per schema.

I agree that's likely, but I'm not a fan of making assumptions on how exactly users use their database...

And the original issue was about making it so we map on connection string's SearchPath.

Sure, but we have to think about the larger picture etc. I do think being in a partially consistent behavior is worse than not respecting search_path at all and forcing users to be explicit with the type's schema when there are multiple ones...

@NinoFloris @Brar any opinions on the above?

Copy link
Member

@NinoFloris NinoFloris Aug 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Things change a bit with the handler rework. We don't resolve a pg type on registration anymore but on resolving. Effectively the error will move from registration to parameter processing.

That's still an improvement as we'll be able to recommend another workaround: specify the fully qualified datatypename on the parameter.

If we'd still want to do search_path resolution of unqualified names as well then I do agree with @roji it should be based on the connection info; not a static config option.

The two relevant cases are users passing an unqualified name to the parameter and users passing nothing (where we'd rely on just clr type inference).

If we just want to keep one data source default we can bake the search_path into the type catalog and let it influence the unqualified data type lookups accordingly.

That would work for both unqualified datatype names set by users on parameters and for any resolver registration that has to expand its unqualified name if it matched solely based on a CLR type.

However if we want search_path changes per connection to take effect things get very painful, especially around multiplexing.

  • If we have an unqualified name on the parameter we can't resolve it as we don't actually have a connection.
  • If we don't have a name and purely match based on CLR type we'd have the same problem, just inside the resolvers (which currently don't get any per connection state passed in either).

We'd have to delay processing and validating parameters, then as we have to always re-resolve we can't cache resolved info in parameters across executions anymore. Finally we can't even cache our resolver results as it stores which fully qualified pg type is default for which clr type (via the TypeInfo), which could now change per connection.

Needless to say, honoring per connection search paths would be a disaster :D

@NinoFloris
Copy link
Member

NinoFloris commented Jun 27, 2024

@vonzshik I think we should close this and favor something like #5535 instead. What do you think?

The situation today is already improved. As previously described:

Things change a bit with the handler rework. We don't resolve a pg type on registration anymore but on resolving. Effectively the error will move from registration to parameter processing.

That's still an improvement as we'll be able to recommend another workaround: specify the fully qualified datatypename on the parameter.

If we end up merging #5535, users that have duplicate unqualified type names can work around having to specify the fully qualified name on the parameter by making sure they filter the type loading set by specifying a sufficiently narrow search_path.

Once there are no duplicates both type only resolution and paired resolution (type, {unqualified}datatypename) will work without needing fully qualified names.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

More than one PostgreSQL type was found with the name dummy, please specify a full name including schema.

4 participants