[Go] Unable to JSON marshal float64 arrays which contain a NaN value #40563

xxgreg · 2024-03-14T20:08:32Z

array.Float64.MarshallJSON() returns the following error for the example code below:

json: unsupported value: NaN

package main

import (
	"fmt"
	"math"

	"github.com/apache/arrow/go/v16/arrow/array"
	"github.com/apache/arrow/go/v16/arrow/memory"
)

func main() {

	builder := array.NewFloat64Builder(memory.DefaultAllocator)
	builder.Append(math.NaN())

	array_ := builder.NewArray()

	bs, err := array_.MarshalJSON()
	if err != nil {
		fmt.Println("error:", err)
	}

	fmt.Println(string(bs))
}

Consider updating the marshalling code for Float64 and other floating point array types with something like:

func (a *array.Float64) MarshalJSON() ([]byte, error) {
	vals := make([]any, a.Len())
	for i := 0; i < a.Len(); i++ {
		if !a.IsValid(i) {
			vals[i] = nil
			continue
		}

		f := a.Value(i)
		switch {
		case math.IsNaN(f):
			vals[i] = "NaN"
		case math.IsInf(f, 1):
			vals[i] = "+Inf"
		case math.IsInf(f, -1):
			vals[i] = "-Inf"
		default:
			vals[i] = f
		}
	}

	return json.Marshal(vals)
}

Component(s)

Go

The text was updated successfully, but these errors were encountered:

zeroshade · 2024-03-19T18:23:27Z

Hmm, if we do this we'd probably need to also update the corresponding UnmarshalJSON code to properly handle NaN so that we're still symmetrical. I'll mark this as a "good first issue" for someone to pick up

verma-kartik · 2024-04-07T18:55:28Z

@zeroshade @xxgreg I would like to make the above changes. I can assign myself via take right?
Also, I tried the changes on local, and the change is working as expected. I observed that:

There are not any unit tests for MarshalJSON() and UnmarshalJSON, shall I also add them to for this issue?
/go/arrow/array has separate files for handling Float16 data type, but Float32, Float64, and others are together in numeric.gen.go. Is there an explicit reason to do so or just technical debt to break numeric.gen.go in multiple files?

zeroshade · 2024-04-07T19:57:47Z

Hey @verma-kartik, using take would indeed assign yourself to this issue. It would also assign you automatically if you file a PR containing GH-30563 in it's title.

As for your observations:

There are not any unit tests for MarshalJSON() and UnmarshalJSON, shall I also add them to for this issue?

While there aren't tests that explicitly call MarshalJSON and UnmarshalJSON they are both indirectly tested via https://github.com/apache/arrow/blob/main/go/arrow/array/util_test.go and https://github.com/apache/arrow/blob/main/go/arrow/array/json_reader_test.go which call FromJSON, use json.Marshal and use a json.Reader respectively. So you should be able to add the relevant tests for this into one of those files.

/go/arrow/array has separate files for handling Float16 data type, but Float32, Float64, and others are together in numeric.gen.go. Is there an explicit reason to do so or just technical debt to break numeric.gen.go in multiple files?

The reason why Float16 is separate from those is due to the fact that the code gen has to handle float16.Num as opposed to just using float16 like it can do with the other data types which are native. Ideally, the technical debt here would be to replace the numeric.gen.go file to utilize Go generics instead.

verma-kartik · 2024-04-08T04:09:27Z

take

verma-kartik · 2024-04-08T04:11:48Z

@zeroshade Thanks for pointing me in the right direction for tests.

verma-kartik · 2024-04-08T16:56:12Z

@zeroshade @xxgreg math.inf() is of type float64. I have two approaches on how to handle these for float32:
Either, typecast the value to float64 before checking for math.IsInf(),

...
case math.IsInf(float64(f), 1):
      vals[i] = "+Inf"
...

Or, use the math.Float32frombits() to interpret the uint32 value of IEEE-752 of +Inf as float32,

...
case math.Float32frombits(0x7F800000) == f:
      vals[i] = "+Inf"
...

Similarly for handling NaN and -Inf. Which one (or neither) is feasible?

For float16, in, there is Inf() but not NegInf(). It can be added,

...
func NegInf() Num { return Num{bits: 0xFC00} }
...

zeroshade · 2024-04-08T19:00:37Z

Personally, I would just use strconv.FormatFloat which will properly handle the +Inf, -Inf and NaN cases for 32 or 64 bit floats

…ain a NaN value

…NaN value (#41109) ### Rationale for this change From this [issue](#40563 (comment)). ### What changes are included in this PR? MarshalJSON() is changed for Float32, Float64 to handle NaN, +Inf and -Inf values. I was only able to handle NaN for Float16. UnmarshalJSON() was already handling NaN, +Inf and -Inf in the right way. ### Are these changes tested? These changes are tested. Tests are included in those files where I find it was logically appropriate to do so. ### Are there any user-facing changes? They will be able to handle corner values for Float32, Float64 now. * GitHub Issue: #40563 Lead-authored-by: verma-kartik <[email protected]> Co-authored-by: work-vermakartik <[email protected]> Co-authored-by: Matt Topol <[email protected]> Signed-off-by: Matt Topol <[email protected]>

zeroshade · 2024-04-19T15:59:13Z

Issue resolved by pull request 41109
#41109

xxgreg · 2024-04-20T01:43:22Z

🎉 Thanks team! @verma-kartik @zeroshade

…NaN value (#41109) ### Rationale for this change From this [issue](#40563 (comment)). ### What changes are included in this PR? MarshalJSON() is changed for Float32, Float64 to handle NaN, +Inf and -Inf values. I was only able to handle NaN for Float16. UnmarshalJSON() was already handling NaN, +Inf and -Inf in the right way. ### Are these changes tested? These changes are tested. Tests are included in those files where I find it was logically appropriate to do so. ### Are there any user-facing changes? They will be able to handle corner values for Float32, Float64 now. * GitHub Issue: #40563 Lead-authored-by: verma-kartik <[email protected]> Co-authored-by: work-vermakartik <[email protected]> Co-authored-by: Matt Topol <[email protected]> Signed-off-by: Matt Topol <[email protected]>

…ain a NaN value (apache#41109) ### Rationale for this change From this [issue](apache#40563 (comment)). ### What changes are included in this PR? MarshalJSON() is changed for Float32, Float64 to handle NaN, +Inf and -Inf values. I was only able to handle NaN for Float16. UnmarshalJSON() was already handling NaN, +Inf and -Inf in the right way. ### Are these changes tested? These changes are tested. Tests are included in those files where I find it was logically appropriate to do so. ### Are there any user-facing changes? They will be able to handle corner values for Float32, Float64 now. * GitHub Issue: apache#40563 Lead-authored-by: verma-kartik <[email protected]> Co-authored-by: work-vermakartik <[email protected]> Co-authored-by: Matt Topol <[email protected]> Signed-off-by: Matt Topol <[email protected]>

…NaN value (#41109) ### Rationale for this change From this [issue](apache/arrow#40563 (comment)). ### What changes are included in this PR? MarshalJSON() is changed for Float32, Float64 to handle NaN, +Inf and -Inf values. I was only able to handle NaN for Float16. UnmarshalJSON() was already handling NaN, +Inf and -Inf in the right way. ### Are these changes tested? These changes are tested. Tests are included in those files where I find it was logically appropriate to do so. ### Are there any user-facing changes? They will be able to handle corner values for Float32, Float64 now. * GitHub Issue: #40563 Lead-authored-by: verma-kartik <[email protected]> Co-authored-by: work-vermakartik <[email protected]> Co-authored-by: Matt Topol <[email protected]> Signed-off-by: Matt Topol <[email protected]>

xxgreg added the Type: bug label Mar 14, 2024

github-actions bot added the Component: Go label Mar 14, 2024

zeroshade added the good-first-issue label Mar 19, 2024

github-actions bot assigned verma-kartik Apr 8, 2024

verma-kartik pushed a commit to verma-kartik/arrow that referenced this issue Apr 9, 2024

apacheGH-40563: [Go] Unable to JSON marshal float64 arrays which cont…

8aee618

…ain a NaN value

verma-kartik mentioned this issue Apr 9, 2024

GH-40563: [Go] Unable to JSON marshal float64 arrays which contain a NaN value #41109

Merged

zeroshade added this to the 16.1.0 milestone Apr 19, 2024

zeroshade closed this as completed Apr 19, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Go] Unable to JSON marshal float64 arrays which contain a NaN value #40563

[Go] Unable to JSON marshal float64 arrays which contain a NaN value #40563

xxgreg commented Mar 14, 2024 •

edited

Loading

zeroshade commented Mar 19, 2024

verma-kartik commented Apr 7, 2024 •

edited

Loading

zeroshade commented Apr 7, 2024

verma-kartik commented Apr 8, 2024

verma-kartik commented Apr 8, 2024

verma-kartik commented Apr 8, 2024 •

edited

Loading

zeroshade commented Apr 8, 2024 •

edited

Loading

zeroshade commented Apr 19, 2024

xxgreg commented Apr 20, 2024

[Go] Unable to JSON marshal float64 arrays which contain a NaN value #40563

[Go] Unable to JSON marshal float64 arrays which contain a NaN value #40563

Comments

xxgreg commented Mar 14, 2024 • edited Loading

Component(s)

zeroshade commented Mar 19, 2024

verma-kartik commented Apr 7, 2024 • edited Loading

zeroshade commented Apr 7, 2024

verma-kartik commented Apr 8, 2024

verma-kartik commented Apr 8, 2024

verma-kartik commented Apr 8, 2024 • edited Loading

zeroshade commented Apr 8, 2024 • edited Loading

zeroshade commented Apr 19, 2024

xxgreg commented Apr 20, 2024

xxgreg commented Mar 14, 2024 •

edited

Loading

verma-kartik commented Apr 7, 2024 •

edited

Loading

verma-kartik commented Apr 8, 2024 •

edited

Loading

zeroshade commented Apr 8, 2024 •

edited

Loading