Skip to content

storage: timeout errors are not always wrapped when a func was retried #9720

@BrennaEpp

Description

@BrennaEpp

Errors are wrapped when calling the run function in invoke.go, when the error was retried and there was a context timeout/deadline exceeded.
For example, this is what an error looks like after the retry was exhausted with a timeout on obj.Attrs:
retry failed with context deadline exceeded; last error: googleapi: Error 400: ...

However, there are (at least) 2 cases when those errors are not wrapped:

  1. When MaxAttempts is reached; this is what an error looks like after the number of max attempts was exhausted:
    googleapi: Error 400: ...

  2. On object uploads with HTTP; This is what an error looks like after retry timeout exhausted on object write:
    context deadline exceeded

Here is repro code:

		var err error
		// This bucket name is too short, which will produce a 400 error.
		obj := client.Bucket("a").Object("content")

		// Always retry, so no calls will succeed on this handle.
		// Retries everything, including the 400 error we get because of the bucket name.
		obj = obj.Retryer(WithPolicy(RetryAlways),
			WithErrorFunc(func(err error) bool { return true }),
		)

		// Timeout on obj.Attrs - this error is wrapped correctly
		timeoutCtx, cancel := context.WithTimeout(ctx, time.Second*1)
		defer cancel()

		_, err = obj.Attrs(timeoutCtx)
		if err == nil {
			t.Fatalf("obj.Attrs should have failed but succeeded")
		}
		fmt.Printf("obj.Attrs times out: \n%v\n-\n", err)

		// Max attempts on obj.Attrs - this error is not wrapped
		objMaxAttempt := obj.Retryer(WithMaxAttempts(4))

		_, err = objMaxAttempt.Attrs(ctx)
		if err == nil {
			t.Fatalf("obj.Attrs should have failed but succeeded")
		}
		fmt.Printf("obj.Attrs max attempts run out: \n%v\n-\n", err)

		// Timeout on obj write - this error is not wrapped
		timeoutCtx, cancel = context.WithTimeout(ctx, time.Second*5)
		defer cancel()

		w := obj.NewWriter(timeoutCtx)
		_, err = io.Copy(w, strings.NewReader("It was the best of times, it was the worst of times."))
		if err != nil {
			t.Errorf("io.Copy: %v", err)
		}

		err = w.Close()
		if err == nil {
			t.Fatalf("w.Close should have failed but succeeded")
		}
		fmt.Printf("w.Close times out: \n%v\n", err)

I'm also concerned about timeouts that happen inside the call on L72 in invoke.go, in the unlucky case that the timeout is exhausted right before that call and the call checks the timeout and errors. I haven't looked at this code path to see if that is possible and if it would return an unwrapped error, but we should verify.

Metadata

Metadata

Assignees

Labels

api: storageIssues related to the Cloud Storage API.type: bugError or flaw in code with unintended results or allowing sub-optimal usage patterns.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions