Adding support for Accelerators to GCE clusters. #45130

Merged: 1 commit, May 5, 2017

@vishh vishh commented Apr 28, 2017

Create clusters with GPUs in GCE by setting the NODE_ACCELERATORS env var to "type=<gpu-type>,count=<gpu-count>".
List of available GPUs - https://cloud.google.com/compute/docs/gpus/#introduction
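
For illustration, a minimal sketch of setting the variable before bringing up a cluster. The `validate_node_accelerators` helper is hypothetical and not part of this PR, and `nvidia-tesla-k80` is only an assumed example type; check the list linked above for what is actually available in your zone.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): checks that a value looks like
# "type=<gpu-type>,count=<gpu-count>" with a positive integer count.
validate_node_accelerators() {
  local value="$1"
  [[ "${value}" =~ ^type=[a-z0-9-]+,count=[1-9][0-9]*$ ]]
}

# Assumed example value; substitute a GPU type from the linked list.
NODE_ACCELERATORS="type=nvidia-tesla-k80,count=1"

if validate_node_accelerators "${NODE_ACCELERATORS}"; then
  export NODE_ACCELERATORS
  echo "NODE_ACCELERATORS=${NODE_ACCELERATORS}"
  # With a GCE project configured, cluster/kube-up.sh would be run next.
else
  echo "unexpected NODE_ACCELERATORS format: ${NODE_ACCELERATORS}" >&2
fi
```

The check is deliberately strict about shape only; whether a given type and count are valid is decided by GCE when the instance template is created.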

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes label Apr 28, 2017
@k8s-github-robot k8s-github-robot added the approved, size/M, and release-note labels Apr 28, 2017
@vishh vishh assigned dashpole and unassigned gmarek and jszczepkowski Apr 28, 2017
@vishh vishh requested a review from dchen1107 April 28, 2017 23:00

vishh commented Apr 28, 2017

@dashpole will you be able to review this PR?

   local attempt=1
   while true; do
     echo "Attempt ${attempt} to create ${1}" >&2
-    if ! ${gcloud} compute instance-templates create \
+    if ! ${gcloud} beta compute instance-templates create \
Contributor Author
@vishh vishh Apr 28, 2017

@mikedanese @zmerlynn any objections to switching OSS to the beta APIs?

Member

Should we make a change similar to this PR in GKE? cc @roberthbailey

Contributor Author

I have a change list out for GKE already.

Contributor Author

@mikedanese can I assume there are no objections to this change, then?

Member

Not from me. Thanks.

@@ -55,6 +58,11 @@ if [[ "${NODE_OS_DISTRIBUTION}" == "cos" ]]; then
     NODE_OS_DISTRIBUTION="gci"
 fi

+# GPUs supported in GCE do not have compatible drivers in Debian 7.
+if [[ "${NODE_OS_DISTRIBUTION}" == "debian" ]]; then
Contributor

We are blacklisting debian for GPU support. Should we whitelist gci instead? That may be better if we want to add more OS distributions (ubuntu?) in the future.

Contributor Author

Ubuntu should work as well, hence the blacklist.

Contributor

Ok, in that case this is a good approach.
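
To make the blacklist approach discussed above concrete, here is a rough sketch. The body of the debian branch is not shown in this thread, so clearing `NODE_ACCELERATORS` is an assumption, and `disable_accelerators_if_unsupported` is a hypothetical name used only for this example.

```shell
#!/usr/bin/env bash
# Sketch only: the function name and the "clear the variable" behavior are
# assumptions; the thread shows just the condition, not its body.
disable_accelerators_if_unsupported() {
  local distro="$1"
  # Blacklist: only the distribution known to lack compatible drivers is
  # excluded, so gci, ubuntu, and any future distribution pass through.
  if [[ "${distro}" == "debian" ]]; then
    NODE_ACCELERATORS=""
  fi
}

NODE_ACCELERATORS="type=nvidia-tesla-k80,count=1"  # assumed example value
disable_accelerators_if_unsupported "gci"
echo "gci:    '${NODE_ACCELERATORS}'"

disable_accelerators_if_unsupported "debian"
echo "debian: '${NODE_ACCELERATORS}'"
```

A whitelist would invert the test (clear the variable unless the distribution is a known-good one), which is stricter but needs an edit for every newly supported distribution.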

+  # VMs with Accelerators cannot be live migrated.
+  # More details here - https://cloud.google.com/compute/docs/gpus/add-gpus#create-new-gpu-instance
+  if [[ ! -z "${NODE_ACCELERATORS}" ]]; then
+    accelerator_args="--maintenance-policy TERMINATE --restart-on-failure --accelerator ${NODE_ACCELERATORS}"
Contributor

Does it matter that TERMINATE is all caps? Looks like the documentation has it lowercase...

Contributor

jk, found the full example here
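
As a rough illustration of how the flags in the diff above come together, the sketch below builds the accelerator arguments and prints the resulting command instead of running it. `build_accelerator_args` and the template name `example-template` are hypothetical, and `nvidia-tesla-k80` is an assumed example type; only the flag string itself is taken from the diff.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the flag string from the diff. Prints nothing
# when no accelerators are requested, so callers can append it unconditionally.
build_accelerator_args() {
  local accelerators="${1:-}"
  # [[ -n ... ]] is equivalent to the [[ ! -z ... ]] test used in the diff.
  if [[ -n "${accelerators}" ]]; then
    echo "--maintenance-policy TERMINATE --restart-on-failure --accelerator ${accelerators}"
  fi
}

# Dry run: echo the command rather than invoking gcloud.
# "example-template" is a placeholder name, not one used by the PR.
accelerator_args="$(build_accelerator_args "type=nvidia-tesla-k80,count=1")"
echo "gcloud beta compute instance-templates create example-template ${accelerator_args}"
```

The maintenance policy is set alongside the accelerator flag because, as the diff comment notes, VMs with accelerators cannot be live migrated.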

@dashpole
Contributor

Code changes LGTM. Feel free to add the label once you figure out the beta APIs question.

@mikedanese
Member

/approve


vishh commented May 5, 2017

adding lgtm tag based on #45130 (comment)

/lgtm

@vishh vishh added the lgtm label May 5, 2017
@k8s-github-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mikedanese, vishh

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot

Automatic merge from submit-queue (batch tested with PRs 44830, 45130)

@k8s-github-robot k8s-github-robot merged commit d4f9271 into kubernetes:master May 5, 2017