Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow node label to be a flexible expression #75

Merged
merged 1 commit into from
Sep 11, 2020

Conversation

cpoole
Copy link
Contributor

@cpoole cpoole commented Apr 16, 2020

This PR addresses a new feature:

I do not see a way to provide a list of labels that can be OR'd together to select a node. Right now if you provide multiple labels it must match ALL of them.

This PR makes node labels into a user specifiable expression:

(node_labels['region'] == 'us-west-2' && node_labels['app'] == 'nginx') || (node_labels['region'] == 'us-west-2' && node_labels['foo'] == 'bar') || (node_labels['type'] == 'toolbox')

This is achieved through the use of https://github.com/antonmedv/expr

This PR adds the ability to flip draino into OR mode.

While doing this I realized when multiple labels are supplied that have the same key but different values only the last provided is used in the app: eg. --node-label=foo=bar --node-label=foo=elephant results in the *nodeLabels map only including {foo: elephant}

In order to fix this I parse them as an array of independent map[string]string and then let the Filter functions determine how to handle the deduping (required in the AND case to maintain backwards compatible behavior)

@codecov
Copy link

codecov bot commented Apr 16, 2020

Codecov Report

Merging #75 into master will increase coverage by 0.25%.
The diff coverage is 80.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #75      +/-   ##
==========================================
+ Coverage   73.92%   74.18%   +0.25%     
==========================================
  Files           7        7              
  Lines         464      488      +24     
==========================================
+ Hits          343      362      +19     
- Misses        113      116       +3     
- Partials        8       10       +2     
Impacted Files Coverage Δ
internal/kubernetes/drainSchedule.go 68.88% <0.00%> (-0.78%) ⬇️
internal/kubernetes/nodefilters.go 93.93% <85.71%> (-6.07%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cbce4e3...689c180. Read the comment docs.

@dbenque
Copy link
Contributor

dbenque commented Apr 16, 2020

I have the same problem but your solution would block me in fact because I need a mix of AND and OR.
I was about to start something where the flag can be repeated (OR) and in each repetition we can have a set of KEY=PAIR comma separated (AND).
This is possible because label value can't have a comma. The validation regexp for value is (([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?

So for example:
--node-label=foo=bar,app=nginx --node-label=foo=bar,app=mysql --node-label=type=toolbox
would mean:
( foo=bar AND app=nginx) OR (foo=bar,app=mysql) OR (type=toolbox)

@cpoole
Copy link
Contributor Author

cpoole commented Apr 16, 2020

hah, I literally started with that!

But then i realized it would be a breaking change. People that are currently specifying node-label more than once are expecting the AND behavior.

I'm fine to tweak my PR to work the way you're proposing but maybe we should do a new flag? --node-label-v2

@jacobstr
Copy link
Contributor

jacobstr commented Apr 17, 2020

The --node-labels map roughly to the matchLabels behavior. Started going down the path of thinking about e.g. --node-match-expressions:

https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#resources-that-support-set-based-requirements

--node-match-expressions="{key: node_type, operator: In, values: [ingress,control,worker]}"

Thats a rough angle I thought to offer up. I think, realistically, that the yaml encoded string encompassed in the command-line arg is kind of ugly.

I think in general, if folks are starting to specify more complex selectors a configuration file may start to be appropriate. The structure for these selectors would then mirror the kubernetes resources: leveraging data structures users are already familiar with as well as library functions in client-go.

Thus far, the answer has been to just use multiple draino instances for things like this, but that's not very user-centric admittedly.

On the other hand, my personal feeling is that the command line arguments are best suited for some happy path scenario.


I appreciate the effort for backwards compatibility here as well. I think I like the comma separated syntax more, but don't like the break in compatibility.

Why not keep the AND'ing behavior between --node-labels and use commas to indicate the OR'd alternatives?

@jacobstr
Copy link
Contributor

jacobstr commented Apr 17, 2020

Sounds like I'm repeating myself with the configuration file suggestion: #62 (comment)

@dbenque
Copy link
Contributor

dbenque commented Apr 17, 2020

Having the config file is clearly the way to more complex scenarii. I will probably work on it at some point.

Waiting for that a simple alternative can be to have 2 flags that are exclusive:
--node-label <-- current flag, with current behavior
--node-labels <-- new flag with comma separated option

repeating --node-label would be an AND
repeating --node-labels would be an OR between list of labels
any tentative to use both syntax at a time would be an error.

WDYT?

@jacobstr
Copy link
Contributor

jacobstr commented Apr 17, 2020

Wanted to draw attention to this. Why not just allow:

--node-label worker_type=ingress,worker --node-label region=us-west1

Which would evaluate as:

The node must be (worker_type=ingress OR worker_type=worker) AND (region=us-west1)

The comma separation in the --node-label is in a sense † net-new and therefore non breaking.

@dbenque your suggestion was to use commas for AND, I'm suggest using commas to indicate OR. I don't think one is more intuitive than the other. As long as the behavior is:

  • Clearly documented.
  • Backward compatible.

We should be g2g.

† Note: Possibly someone in the wild is using a comma literally, but I think that's highly unlikely and might be disallowed anyway by the valid chars in labels values.

  • Check that indeed one cannot created labels with commas in the value.
Invalid value: "thing,woot": a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyValue',  or 'my_value',  or '12345', regex used for validation is '(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?')

@cpoole
Copy link
Contributor Author

cpoole commented Apr 20, 2020

@jacobstr I agree a configfile is a better end state here. But I dont have the time for a refactor of that magnitude.

What you're proposing would be fine with me if the syntax was --node-label key=value,key=value. I think many people will need the OR condition to work for different label keys.

Who here can actually approve a PR/design? I'm not trying to put more work into this if a design cannot be solidified. I've currently forked and am running this branch in production.

@negz you have the collaborator flair, but also @jacobstr you seem like you have merge access?

@jacobstr
Copy link
Contributor

@cpoole I work at Planet and hold the keys these days. I think we're iterating on a good design here through the discussion so this debate isn't without merit.

Agreed re: keyFoo=value,keyBar=value for broadening the utility.

@cpoole
Copy link
Contributor Author

cpoole commented Apr 21, 2020

@jacobstr ok sounds good to me

So I'll go ahead and implement it using the following:

  • each independent --node-labels collection functions as an AND
  • within a --node-labels arg the multiple labels are OR'd together.

logic example:
--node-labels foo=bar,animal=dog --node-labels region=us-west-2

the following nodes would match:
Node1:
foo=bar
region=us-west-2

Node2:
animal=dog
region=us-west-2

the following nodes would not match:
Node1:
foo=no
region=us-west-2

Node2:
foo=bar
region=us-west-1

@dbenque
Copy link
Contributor

dbenque commented Apr 21, 2020

@cpoole @jacobstr , maybe a stupid question but with that semantic how would I do:

(region=us-west-1 AND app=nginx) OR (region=us-west-2 AND foo=bar) OR (type=toolbox)

@jacobstr
Copy link
Contributor

I'm rusty on my boolean algebra buuuut... I the gist of what you'd be trying to do is converting something that it's in disjunctive normal form (or of ands) to conjunctive normal form (and of ors).

I believe this is generally possible, though a bit messy and perhaps re-expressing things this way will obfuscate the original meaning of the expression.

Some comments here: https://math.stackexchange.com/questions/1710210/converting-formula-from-cnf-to-dnf

@dbenque
Copy link
Contributor

dbenque commented May 4, 2020

Such transformation (dnf/cnf) would make it completely obscure while with a simple OR there is a natural way of expressing group of labels that define the group of nodes.
I am afraid the syntax proposed here #75 (comment) would make it impossible to use for me and my users.

@cpoole
Copy link
Contributor Author

cpoole commented Aug 20, 2020

ok @dbenque and @jacobstr

I'm finally circling back to this. I thought about what what @dbenque was saying and I agree... the original proposal in this PR was too restrictive. I've redone the PR using https://github.com/antonmedv/expr as an expression compiler.

This allows for incredibly flexible use cases and I think will satisfy 99% of users.

However, This is now a fully breaking change... What would y'all like to do?

@cpoole cpoole changed the title Allow multiple labels to be AND or OR together Allow node label to be a flexible expression Aug 20, 2020
@jacobstr
Copy link
Contributor

I hadn't looked at the push script before and was hoping that we didn't publish the latest tag - might have made me bolder here. I suppose the land shifting and breaking beneath your feet is to be expected for folks relying on latest. Unfortunately, I don't have a mailing list of draino users to even attempt to provide the courtesy of a heads up here.

I'll find some time to review things. I suspect we might have some way to put this new syntax behind a new command line flag and have the old-style flag "converted" to your expression compiler to reduce duplication. That'd let us avoid any migration mess.

Copy link
Contributor

@jacobstr jacobstr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for doing this @cpoole. I've left some feedback that I think will provide some CYA in case new requirements (plzcani filter on annotations) come up as well as a path to grandfathering this without breaking compatibility.

internal/kubernetes/drainSchedule.go Outdated Show resolved Hide resolved
run.sh Outdated Show resolved Hide resolved
README.md Show resolved Hide resolved
internal/kubernetes/nodefilters.go Outdated Show resolved Hide resolved
cmd/draino/draino.go Outdated Show resolved Hide resolved
@jacobstr
Copy link
Contributor

@cpoole I'm going to check in around 9:30-3:00 PST tomorrow to drive this to a conclusion. Happy to get this merged vs continuing to linger in the bike shed.

@cpoole
Copy link
Contributor Author

cpoole commented Aug 21, 2020

Sounds good @jacobstr I'll address your comments tonight

@cpoole
Copy link
Contributor Author

cpoole commented Aug 21, 2020

ok @jacobstr updated and tested.


expression, err := expr.Compile(*expressionStr, expr.Env(parameters))
if err != nil {
log.Fatalf("Could not compile the node label expression: %v", err)
Copy link
Contributor

@jacobstr jacobstr Aug 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log.Fatalf here is a bit unfortunate - it'll cause the program to exit but it's kind of buried in a function that shouldn't have this kind of side effect.

Can we error up front if the expression is invalid? It looks like we don't get the parameters until we've obtained a particular node but we can probably still compile the expression with an empty environment?

Curious what happens if the metadata.labels. does not exist for a specific expression as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Looks like we could compile the program with no env outside of the callback function body and fail early there.
  • Saw your test case with a field that does not exist.
  • This may be useful to ensure the Expr provided evaluates to a boolean: https://godoc.org/github.com/antonmedv/expr#AsBool

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

went ahead and compiled outside of the body... seems like I was planning to do that anyway since I changed the signature to return an err, but i must have gotten distracted 😆

@@ -21,15 +21,17 @@ import (
"testing"
"time"

"gotest.tools/assert"
core "k8s.io/api/core/v1"
meta "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func TestNodeLabelFilter(t *testing.T) {
Copy link
Contributor

@jacobstr jacobstr Aug 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would prefer the original test cases intact since we still support the old syntax. Can we add the new ones as separate table-driven tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added the old ones back as a separate test

@cpoole
Copy link
Contributor Author

cpoole commented Aug 28, 2020

@jacobstr addressed your comments. The build is failing because linter is running out of memory. I dont see a way to retry a travis build. Any suggestions?

@jacobstr
Copy link
Contributor

I'll merge this in a moment assuming tests pass and a subsequent rebase on your branch should hopefully resolve it.

@cpoole
Copy link
Contributor Author

cpoole commented Aug 31, 2020

@jacobstr squashed and rebased. Lets put this saga to an end

@cpoole
Copy link
Contributor Author

cpoole commented Sep 9, 2020

@jacobstr bump

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants