[fluentd-gcp addon] Trim too long log entries due to Stackdriver limitations #52289

crassirostris · 2017-09-11T17:01:23Z

Stackdriver doesn't support log entries bigger than 100KB, so by default the fluentd plugin just drops such entries. To avoid that and make the problem more visible, it's suggested to trim long lines instead.

/cc @igorpeshansky

[fluentd-gcp addon] Fluentd will trim lines exceeding 100KB instead of dropping them.
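
As a rough illustration of the trim-instead-of-drop behavior described above, the idea can be sketched in plain Ruby, outside of fluentd. The method name `trim_entry`, the constant names, the marker text, and its placement at the front are all assumptions made for this sketch; it is not the addon's actual configuration.

```ruby
# Hypothetical helper illustrating trim-instead-of-drop; not the addon's code.
MAX_ENTRY_LENGTH = 100_000   # assumed threshold, counted in characters
TRIM_MARKER = '[trimmed] '   # assumed marker text

# Returns the entry unchanged when it fits. Otherwise returns the marker
# followed by the first `limit` characters of the entry.
def trim_entry(log, limit = MAX_ENTRY_LENGTH, marker = TRIM_MARKER)
  return log if log.length <= limit

  marker + log[0, limit]
end

short_entry = 'x' * 10
long_entry = 'x' * 150_000
puts trim_entry(short_entry).length  # short entries pass through untouched
puts trim_entry(long_entry).length   # limit plus the marker length
```

The point of the marker is visibility: a reader of the stored entry can tell that the tail is missing, which a silently dropped entry cannot convey.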

k8s-ci-robot · 2017-09-11T17:01:24Z

@crassirostris: GitHub didn't allow me to request PR reviews from the following users: igorpeshansky.

Note that only kubernetes members can review this PR, and authors cannot review their own PRs.


piosz

LGTM from me.
@igorpeshansky could you please take a look?

piosz · 2017-09-11T17:49:24Z

cluster/addons/fluentd-gcp/fluentd-gcp-configmap.yaml

@@ -332,6 +332,17 @@ data:
      </metric>
    </match>

+    # Trim the entries which exceed slightly less than 100KB, to avoid


Add todo to remove this hack.

igorpeshansky · 2017-09-11T23:08:24Z

cluster/addons/fluentd-gcp/fluentd-gcp-configmap.yaml

+      @type record_transformer
+      enable_ruby true
+      <record>
+        log ${record['log'].length > 100000 ? '[The entry was trimmed] ' + record['log'][0:100000 - 25] : record['log']}


There's no need to use 0:100000 - 25 because 100000 is already smaller than the 100k limit (102400).

Will this message break JSON encoded log lines?

Ack, agree that it's better to keep it clean

Will this message break JSON encoded log lines?

Yes, it will. It is a conscious decision, because the alternative is dropping this JSON entry
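
The headroom argument in this thread can be checked with a little arithmetic. The sketch below takes the 102400 figure quoted in the review and assumes, without confirmation from the thread, that the limit is measured in bytes. It also shows a caveat: Ruby's `String#length` counts characters while `String#bytesize` counts bytes, so a character-based threshold only guarantees the headroom for single-byte text.

```ruby
LIMIT_BYTES = 100 * 1024            # 102400, the limit quoted in the review
THRESHOLD = 100_000                 # trim threshold from the diff
MARKER = '[The entry was trimmed] ' # marker text from the diff

# Worst case for ASCII-only logs: threshold plus marker still fits.
ascii_entry = MARKER + ('a' * THRESHOLD)
puts ascii_entry.bytesize
puts ascii_entry.bytesize <= LIMIT_BYTES

# Multi-byte characters: the character count is under the threshold,
# but the UTF-8 byte count is above the assumed limit.
utf8_entry = "\u00e9" * 60_000
puts utf8_entry.length
puts utf8_entry.bytesize
```

If the limit is indeed byte-based, comparing `bytesize` rather than `length` would be the conservative check for logs that may contain non-ASCII text.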

igorpeshansky · 2017-09-11T23:11:45Z

test/e2e/instrumentation/logging/stackdrvier/basic.go

+					if len(log.TextPayload) == originalLength {
+						return false, fmt.Errorf("got non-trimmed entry of length %d", len(log.TextPayload))
+					}
+					return true, nil


Since you'll be documenting the trimmed entry prefix, want to also check that it's being prepended to trimmed entries (i.e., check that the entry starts with [The entry was trimmed])?

I thought about that; it creates a dependency on a string constant shared between the code and the configuration.

OTOH, it's better to explicitly change or remove the test than to debug a problem caused by the lack of this check.

crassirostris · 2017-09-12T08:34:35Z

@piosz @igorpeshansky PTAL

igorpeshansky · 2017-09-12T11:21:09Z

test/e2e/instrumentation/logging/stackdrvier/basic.go

+					if len(log.TextPayload) == originalLength {
+						return false, fmt.Errorf("got non-trimmed entry of length %d", len(log.TextPayload))
+					}
+					if !strings.HasSuffix(log.TextPayload, trimSuffix) {


Umm, prefix?

I shrank the added part and moved it to the end of the message, because there it clearly signals that this is not the end of the original message.

igorpeshansky

Expanded on my previous comment.

igorpeshansky · 2017-09-12T11:24:26Z

cluster/addons/fluentd-gcp/fluentd-gcp-configmap.yaml

@@ -332,6 +332,18 @@ data:
      </metric>
    </match>

+    # TODO(instrumetation): Reconsider this workaround later.


Typo: instrumetation.

Ack, thanks

igorpeshansky · 2017-09-12T11:28:41Z

cluster/addons/fluentd-gcp/fluentd-gcp-configmap.yaml

+      @type record_transformer
+      enable_ruby true
+      <record>
+        log ${record['log'].length > 100000 ? "#{record['log'][0..100000]}[trimmed]" : record['log']}


A suffix will cause the whole line to be parsed before we realize it's malformed JSON and revert to text. A prefix would short-circuit that, so it's more efficient. You could add both if it makes for better human readability (e.g., "[Trimmed long entry] {"a": "very long string...")...

Ack, agree, thanks
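
Two details in this exchange can be checked in plain Ruby. First, the range `0..100000` is inclusive, so the slice in the diff above keeps 100001 characters; `0...100000` or `[0, 100000]` keeps exactly 100000. Second, both a prefixed and a suffixed trimmed entry fail to parse as JSON. Ruby's standard `json` library is used here only as a stand-in, since the parser used downstream is not shown in this thread, and the sample strings and the helper name `parses_as_json?` are made up for illustration.

```ruby
require 'json'

log = 'x' * 150_000
puts log[0..100_000].length    # inclusive range keeps one extra character
puts log[0...100_000].length   # exclusive range
puts log[0, 100_000].length    # start index and length

# Returns true when the text parses as JSON, false otherwise.
def parses_as_json?(text)
  JSON.parse(text)
  true
rescue JSON::ParserError
  false
end

prefixed = '[Trimmed long entry] {"a": "very long str'
suffixed = '{"a": "very long str[trimmed]'
puts parses_as_json?(prefixed)
puts parses_as_json?(suffixed)
puts parses_as_json?('{"a": "short"}')
```

With a prefix, a parser can reject the text within the first few characters; with a suffix, it has to consume the whole entry before hitting the problem, which is the efficiency argument made in the review.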

crassirostris · 2017-09-12T11:36:12Z

@igorpeshansky Done, PTAL

igorpeshansky

LGTM

piosz · 2017-09-12T18:18:41Z

cluster/addons/fluentd-gcp/fluentd-gcp-configmap.yaml

@@ -383,7 +395,7 @@ data:
      num_threads 2
    </match>
 metadata:
-  name: fluentd-gcp-config-v1.1
+  name: fluentd-gcp-config-v1.1.1


Why not 1.2? The change is pretty significant, and I believe just major and minor are enough for config purposes.

Done, thanks. Kept the patch version to be consistent with semver.

piosz · 2017-09-12T18:26:26Z

/lgtm
/approve no-issue

crassirostris · 2017-09-12T19:28:10Z

/retest


piosz · 2017-09-13T08:55:21Z

/lgtm
/approve no-issue

k8s-github-robot · 2017-09-13T08:55:57Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: crassirostris, piosz

Associated issue requirement bypassed by: piosz


Needs approval from an approver in each of these OWNERS Files:

~~cluster/addons/fluentd-gcp/OWNERS~~ [crassirostris,piosz]
~~test/e2e/instrumentation/logging/OWNERS~~ [crassirostris,piosz]


k8s-ci-robot · 2017-09-13T09:20:48Z

@crassirostris: The following test failed, say /retest to rerun them all:

Test name	Commit	Details	Rerun command
pull-kubernetes-e2e-gce-bazel	`d8525f8`	link	`/test pull-kubernetes-e2e-gce-bazel`


k8s-github-robot · 2017-09-13T11:04:51Z

Automatic merge from submit-queue (batch tested with PRs 52316, 52289, 52375)

ericchiang · 2017-09-13T17:18:18Z

FYI the test modifications broke the e2e test: #52433


heshamMassoud · 2019-11-14T12:04:19Z

Does this mean the rest of the message, the extra length (exceeding the 100KB) is dropped and not logged?

heshamMassoud · 2019-11-14T12:20:41Z

It also seems that the stackdriver quota for a logentry size is 256KB not 100KB https://cloud.google.com/logging/quotas

crassirostris added the area/logging, kind/bug, and sig/instrumentation labels Sep 11, 2017

crassirostris added this to the v1.8 milestone Sep 11, 2017

crassirostris assigned piosz Sep 11, 2017

crassirostris requested a review from piosz September 11, 2017 17:01

k8s-ci-robot added the size/M and cncf-cla: yes labels Sep 11, 2017

k8s-github-robot added the release-note label Sep 11, 2017

piosz approved these changes Sep 11, 2017


igorpeshansky reviewed Sep 11, 2017


crassirostris force-pushed the sd-logging-trim-long-lines branch 2 times, most recently from 3aaa124 to a5e8915 Compare September 12, 2017 08:34

crassirostris force-pushed the sd-logging-trim-long-lines branch 2 times, most recently from b04ead3 to 7a76ca8 Compare September 12, 2017 09:28

igorpeshansky suggested changes Sep 12, 2017


crassirostris force-pushed the sd-logging-trim-long-lines branch from 7a76ca8 to cf93209 Compare September 12, 2017 11:35

igorpeshansky approved these changes Sep 12, 2017


crassirostris force-pushed the sd-logging-trim-long-lines branch from cf93209 to 8cec585 Compare September 12, 2017 13:39

piosz reviewed Sep 12, 2017


k8s-github-robot added the needs-rebase label Sep 12, 2017

crassirostris force-pushed the sd-logging-trim-long-lines branch from 8cec585 to 4761bb2 Compare September 12, 2017 18:25

k8s-github-robot removed the needs-rebase label Sep 12, 2017

k8s-ci-robot added the lgtm label Sep 12, 2017

k8s-github-robot added the approved label Sep 12, 2017

crassirostris force-pushed the sd-logging-trim-long-lines branch 2 times, most recently from c9ef369 to d2ef703 Compare September 12, 2017 20:55

k8s-github-robot removed the lgtm label Sep 12, 2017

[fluentd-gcp addon] Trim too long log entries due to Stackdriver limi…

d8525f8

…tation

crassirostris force-pushed the sd-logging-trim-long-lines branch from d2ef703 to d8525f8 Compare September 13, 2017 08:27

crassirostris mentioned this pull request Sep 13, 2017

Automated cherry pick of #52289 #52404

Merged

k8s-ci-robot added the lgtm label Sep 13, 2017

wojtek-t modified the milestones: v1.7, v1.8 Sep 13, 2017

k8s-github-robot merged commit c9759ae into kubernetes:master Sep 13, 2017

ericchiang mentioned this pull request Sep 13, 2017

[e2e failure] Cluster level logging implemented by Stackdriver should ingest logs #52433

Closed

wojtek-t added a commit that referenced this pull request Sep 13, 2017

Merge pull request #52404 from crassirostris/automated-cherry-pick-of…

1d025de

…-#52289-upstream-release-1.7 Automated cherry pick of #52289

crassirostris mentioned this pull request Sep 18, 2017

Trim long lines in fluentd #40212

Closed
