Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unmarshal null wrapper types to nil instead of empty values #394

Merged
merged 4 commits into from
Jul 26, 2017

Conversation

cspahr-tulipretail
Copy link
Contributor

Unmarshaling now follows proto3 spec, turning JSON null into Go nil for wrapper types. Without this fix, (Type)Value wrappers are only unmarshaled as nil when the relevant key/value pairs are left out of the JSON entirely. Fields with an explicit null are unmarshaled into a struct with an empty value for the struct's Value field.

@googlebot
Copy link
Collaborator

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks.


  • If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
  • If your company signed a CLA, they designated a Point of Contact who decides which employees are authorized to participate. You may need to contact the Point of Contact for your company and ask to be added to the group of authorized contributors. If you don't know who your Point of Contact is, direct the project maintainer to go/cla#troubleshoot.
  • In order to pass this check, please resolve this problem and have the pull request author add another comment and the bot will run again.

@cspahr-tulipretail
Copy link
Contributor Author

I signed it!

@cybrcodr
Copy link
Contributor

This is a bit tricky. Sadly, the spec for JSON mapping is only documented for proto3 at https://developers.google.com/protocol-buffers/docs/proto3#json, which states ...

null is accepted and treated as the default value of the corresponding field type.

And "default value" here is ambiguous. I can understand your ask for having it nil for proto2 optional fields. Somehow the current implementation is consistent though for all WKTs by having a zero value instance. I'm clarifying this with the Protobuf team and hope to get an answer back on this.

Regarding your PR though, it only changes some WKTs but not other ones like Struct, Timestamp, Duration, etc. If the decision is to have nil value for these optional WKTs, then it may need to be consistent across all.

Although the following case can be tricky ...

message Message {
  optional google.protobuf.Value val = 1;
}

Should the value of val be NullValue or nil?

Anyways, I'll let you know once I hear back from the Protobuf team.

@cspahr-tulipretail
Copy link
Contributor Author

Ah, that's great, thanks! To clarify, this is only intended to cover the null wrapper types. I think my writing "(Type)Value" in my first comment was a poor abbreviation for "FloatValue, Int64Value, UInt64Value, Int32Value, UInt32Value, BoolValue, StringValue, and BytesValue", i.e., the wrapper types, but not NullValue or any of the others.

About wrapper types, the proto3 spec states:

Wrappers use the same representation in JSON as the wrapped primitive type, except that null is allowed and preserved during data conversion and transfer.

Obviously I take this to mean that they default (in Go) to nils, so that they have one-to-one marshalling/unmarshalling (they are currently marshalled as null when they're nil in the Go proto message types and EmitDefaults = true). The general consensus of the internet seems to be that this is the intended use of wrapper types in proto3, but it's true the spec doesn't seem to say anything about it :)

Thanks again!

@cybrcodr
Copy link
Contributor

Ok, here's what I got from the Protobuf team ... for deserializing JSON null to WKT value, it should be "treated as if the field is not set" with the exception for google.protobuf.Value type. By WKT, it includes Any, Timestamp, Duration, ListValue, Struct and all Wrapper types.

A JSON null value for google.protobuf.Value field should be set to NullValue, this is already done by current implementation.

So, based on above assumptions, a fix for this should pass the following conformance tests that are currently whitelisted ...
Required.JsonInput.AllFieldAcceptNull.JsonOutput
Required.JsonInput.AllFieldAcceptNull.ProtobufOutput
Required.JsonInput.WrapperTypesWithNullValue.JsonOutput
Required.JsonInput.WrapperTypesWithNullValue.ProtobufOutput

The fix should not be just for the Wrapper types, but also includes the rest. I have such fix already, but if you'd like to continue on this PR, let me know.

Thanks.

@cspahr-tulipretail
Copy link
Contributor Author

Hmm, I'll admit I hadn't looked super closely at the Value type up until now, but now as I understand it, they're used for representing arbitrary types bundled within a Struct, one of the possible values for which is null (i.e., NullValue). However, I don't see how this can fit in with the wrapper types (both their .proto definitions and, for this package, the Go code generated from them):

The wrapper messages in wrappers.proto consist of a message with a field that has a single built-in type (string, bool, etc.), so if a null StringValue field (Go type *wrappers.StringValue) can only end up as nil or *wrappers.StringValue{Value: ""}; there's no other way to represent a null vs. empty dichotomy here.

The Value type in struct.proto, on the other hand, uses oneof, so it cannot represent null vs. a single type, only null vs. any of several other types. Furthermore, for built-in types, it uses those types directly, so it's any of protobuf string, double, bool, or NullValue (plus structs and lists). The wrapper types StringValue etc. are not used here at all.

In other words, nullability for wrappers and nullability for structs using NullValue seem like separate issues, since they're incompatible as it stands.

Without defining new wrapper messages such as NullableString with oneof NullValue/string, and so on for each of the built-in types, the only obvious way I see to get "a particular type or null" semantics is StringValue vs. nil. Since wrappers.proto and struct.proto are shared across protobuf as a whole, not just the Go implementation, creating new oneof wrappers doesn't seem like the right approach in order to support nullability as part of the standard protobuf library.

Am I mistaken here? My apologies if I'm still missing something!

@googlebot
Copy link
Collaborator

CLAs look good, thanks!

@cybrcodr
Copy link
Contributor

The Value type is different from the wrapper types. Current implementation is correct with regards to deserializing "null" to the value of "NullValue" for Value type.

My point was that your fix only fixes wrapper types, but misses out on Duration and Timestamp. One can structure the code to ensure that the fix is applied correctly to all these WKTs. As mentioned, I do have corresponding fix that does that. So, let me know if you prefer to pursue this still and I'll comment on your changes.

@cspahr-tulipretail
Copy link
Contributor Author

Oh, I see---sorry, I misunderstood and thought you were saying they are supposed to be the same. Yes please, comments very welcome, and I'd still like to pursue it!

@cybrcodr
Copy link
Contributor

Here's the diff to my fix (line numbers may not match up) ...


@@ -639,6 +639,11 @@

        // Allocate memory for pointer fields.
        if targetType.Kind() == reflect.Ptr {
+               // If input JSON value is "null" and target is a pointer type, it should be treated as not
+               // set EXCEPT for pointer to structpb.Value.
+               if string(inputValue) == "null" && targetType != reflect.TypeOf(&stpb.Value{}) {
+                       return nil
+               }
                target.Set(reflect.New(targetType.Elem()))
                return u.unmarshalValue(target.Elem(), inputValue, prop)
        }
@@ -652,10 +657,6 @@
                switch w.XXX_WellKnownType() {
                case "DoubleValue", "FloatValue", "Int64Value", "UInt64Value",
                        "Int32Value", "UInt32Value", "BoolValue", "StringValue", "BytesValue":
-                       // "Wrappers use the same representation in JSON
-                       //  as the wrapped primitive type, except that null is allowed."
-                       // encoding/json will turn JSON `null` into Go `nil`,
-                       // so we don't have to do any extra work.
                        return u.unmarshalValue(target.Field(0), inputValue, prop)
                case "Any":
                        // Use json.RawMessage pointer type instead of value to support pre-1.8 version.
@@ -716,14 +717,7 @@

                        return nil
                case "Duration":
-                       ivStr := string(inputValue)
-                       if ivStr == "null" {
-                               target.Field(0).SetInt(0)
-                               target.Field(1).SetInt(0)
-                               return nil
-                       }
-
-                       unq, err := strconv.Unquote(ivStr)
+                       unq, err := strconv.Unquote(string(inputValue))
                        if err != nil {
                                return err
                        }
@@ -738,14 +732,7 @@
                        target.Field(1).SetInt(ns)
                        return nil
                case "Timestamp":
-                       ivStr := string(inputValue)
-                       if ivStr == "null" {
-                               target.Field(0).SetInt(0)
-                               target.Field(1).SetInt(0)
-                               return nil
-                       }
-
-                       unq, err := strconv.Unquote(ivStr)
+                       unq, err := strconv.Unquote(string(inputValue))
                        if err != nil {
                                return err
                        }
@@ -757,10 +744,6 @@
                        target.Field(1).SetInt(int64(t.Nanosecond()))
                        return nil
                case "Struct":
-                       if string(inputValue) == "null" {
-                               // Interpret a null struct as empty.
-                               return nil
-                       }
                        var m map[string]json.RawMessage
                        if err := json.Unmarshal(inputValue, &m); err != nil {
                                return fmt.Errorf("bad StructValue: %v", err)
@@ -775,10 +758,6 @@
                        }
                        return nil
                case "ListValue":
-                       if string(inputValue) == "null" {
-                               // Interpret a null ListValue as empty.
-                               return nil
-                       }
                        var s []json.RawMessage
                        if err := json.Unmarshal(inputValue, &s); err != nil {

Feel free to improve on it as I may have missed out certain cases. You'll need to update the tests to match the consequences of this but it will make sense based on the criteria I mentioned.

I've filed issue #397 to improve running conformance tests but we can treat that as a separate fix some other time.

I won't be able to respond immediately today for this but will try to get back on this PR late tonight or sometime tomorrow.

jsonpb/jsonpb.go Outdated
err := u.unmarshalValue(target.Elem(), inputValue, prop)

// Set to a zero (nil pointer) if wrapper type is nulled
if err == nullWrapperError {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using an error for undoing the construction and setting to nil, it would be better to do it ahead. See the diff I've attached.

@cspahr-tulipretail
Copy link
Contributor Author

Revised based on your approach, which I agree is much cleaner.

Copy link
Contributor

@cybrcodr cybrcodr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Just a few minor feedback.

jsonpb/jsonpb.go Outdated
@@ -640,22 +640,23 @@ func (u *Unmarshaler) unmarshalValue(target reflect.Value, inputValue json.RawMe
// Allocate memory for pointer fields.
if targetType.Kind() == reflect.Ptr {
target.Set(reflect.New(targetType.Elem()))
if string(inputValue) == "null" && targetType != reflect.TypeOf(&stpb.Value{}) {
target.Set(reflect.Zero(targetType))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the value is going to be nil, then there should be no point in constructing the value and then setting the pointer to nil.

This if block should be before target.Set(reflect.new(targetType.Elem())) and hence no need to set value to nil as that should be the default.

Also, please do add comment for this if block. While it's easy to know why a "null" should be nil for pointer to messages, the if condition excludes Value. So, I'd suggest adding comment ...

// If input JSON value is "null" and target is a pointer type, it should be treated as the field is NOT
// set EXCEPT for a pointer to structpb.Value, which should be set to NullValue.

jsonpb/jsonpb.go Outdated
return u.unmarshalValue(target.Elem(), inputValue, prop)
}

if jsu, ok := target.Addr().Interface().(JSONPBUnmarshaler); ok {
return jsu.UnmarshalJSONPB(u, []byte(inputValue))
}

// Handle well-known types.
// Handle well-known types that are not null => Go nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/not null => Go nil/not pointers./

jsonpb/jsonpb.go Outdated
target.Field(1).SetInt(0)
return nil
}

unq, err := strconv.Unquote(ivStr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor. May as well inline string(inputValue) here.

jsonpb/jsonpb.go Outdated
target.Field(1).SetInt(0)
return nil
}

unq, err := strconv.Unquote(ivStr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor. May as well inline string(inputValue) here.

@@ -553,12 +553,12 @@ var unmarshalingTests = []struct {
{"camelName input", Unmarshaler{}, `{"oBool":true}`, &pb.Simple{OBool: proto.Bool(true)}},

{"Duration", Unmarshaler{}, `{"dur":"3.000s"}`, &pb.KnownTypes{Dur: &durpb.Duration{Seconds: 3}}},
{"null Duration", Unmarshaler{}, `{"dur":null}`, &pb.KnownTypes{Dur: &durpb.Duration{Seconds: 0}}},
{"null Duration", Unmarshaler{}, `{"dur":null}`, &pb.KnownTypes{Dur: nil}},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

&pb.KnownTypes{} should be good enough. Same for ones below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless you have particular strong feelings about it, I prefer to leave it with the explicit field name/nil, since it makes the intent of the tests more readable. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I think the intent is clear, i.e. the field is not being set. Also, I fear that some linter tool may flag/change this to simplify the notation. I've seen some PRs to simplify code.

Anyways, I don't have a strong opinion on this and am ok if you think it should be there. Something that I wish to do if I get a chance is to clean up the structure of these tests, i.e. it'll be good to specify field names for all these test cases and hence also have more line breaks to make it more readable, e.g.
{
desc: "null Duration",
unmarshaler: ...,
...
},

// `null` is also a permissible value. Let's just test one.
{"null DoubleValue", Unmarshaler{}, `{"dbl":null}`, &pb.KnownTypes{Dbl: &wpb.DoubleValue{}}},

// ensure that `null` as a value ends up with a nil pointer instead of a [type]Value struct
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor. Use full statement for comments, i.e. capitalized first word if possible and end with period .

@cspahr-tulipretail
Copy link
Contributor Author

I've made all suggested changes except for the one to the test cases---please see response on review comment.

Copy link
Contributor

@cybrcodr cybrcodr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment fix.

jsonpb/jsonpb.go Outdated
@@ -639,23 +639,25 @@ func (u *Unmarshaler) unmarshalValue(target reflect.Value, inputValue json.RawMe

// Allocate memory for pointer fields.
if targetType.Kind() == reflect.Ptr {
// If input value is "null" and target is a pointer type, then the field should be treated as not set
// UNLESS it the target is structpb.Value, in which case it should be set to structpb.NullValue.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/UNLESS it/UNLESS if/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, embarrassing! Actually I think it's better with just "UNLESS the".

@@ -553,12 +553,12 @@ var unmarshalingTests = []struct {
{"camelName input", Unmarshaler{}, `{"oBool":true}`, &pb.Simple{OBool: proto.Bool(true)}},

{"Duration", Unmarshaler{}, `{"dur":"3.000s"}`, &pb.KnownTypes{Dur: &durpb.Duration{Seconds: 3}}},
{"null Duration", Unmarshaler{}, `{"dur":null}`, &pb.KnownTypes{Dur: &durpb.Duration{Seconds: 0}}},
{"null Duration", Unmarshaler{}, `{"dur":null}`, &pb.KnownTypes{Dur: nil}},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I think the intent is clear, i.e. the field is not being set. Also, I fear that some linter tool may flag/change this to simplify the notation. I've seen some PRs to simplify code.

Anyways, I don't have a strong opinion on this and am ok if you think it should be there. Something that I wish to do if I get a chance is to clean up the structure of these tests, i.e. it'll be good to specify field names for all these test cases and hence also have more line breaks to make it more readable, e.g.
{
desc: "null Duration",
unmarshaler: ...,
...
},

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants