Skip to content

Commit

Permalink
feat(sagemaker): add Endpoint L2 construct (#22886)
Browse files Browse the repository at this point in the history
This is the third and final PR to complete the implementation of RFC 431:
aws/aws-cdk-rfcs#431

closes #2809

----

### All Submissions:

* [x] Have you followed the guidelines in our [Contributing guide?](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md)

### Adding new Unconventional Dependencies:

* [ ] This PR adds new unconventional dependencies following the process described [here](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md/#adding-new-unconventional-dependencies)

### New Features

* [x] Have you added the new feature to an [integration test](https://github.com/aws/aws-cdk/blob/main/INTEGRATION_TESTS.md)?
	* [x] Did you use `yarn integ` to deploy the infrastructure and generate the snapshot (i.e. `yarn integ` without `--dry-run`)?

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*

----

Co-authored-by: Matt McClean <[email protected]>
Co-authored-by: Long Yao <[email protected]>
Co-authored-by: Drew Jetter <[email protected]>
Co-authored-by: Murali Ganesh <[email protected]>
Co-authored-by: Abilash Rangoju <[email protected]>
  • Loading branch information
6 people authored Nov 25, 2022
1 parent c594918 commit bf7586b
Show file tree
Hide file tree
Showing 33 changed files with 5,848 additions and 0 deletions.
67 changes: 67 additions & 0 deletions packages/@aws-cdk/aws-sagemaker/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -195,3 +195,70 @@ const endpointConfig = new sagemaker.EndpointConfig(this, 'EndpointConfig', {
]
});
```

### Endpoint

When you create an endpoint from an `EndpointConfig`, Amazon SageMaker launches the ML compute
instances and deploys the model or models as specified in the configuration. To get inferences from
the model, client applications send requests to the Amazon SageMaker Runtime HTTPS endpoint. For
more information about the API, see the
[InvokeEndpoint](https://docs.aws.amazon.com/sagemaker/latest/dg/API_runtime_InvokeEndpoint.html)
API. Defining an endpoint requires at minimum the associated endpoint configuration:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';

declare const endpointConfig: sagemaker.EndpointConfig;

const endpoint = new sagemaker.Endpoint(this, 'Endpoint', { endpointConfig });
```

### AutoScaling

To enable autoscaling on the production variant, use the `autoScaleInstanceCount` method:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';

declare const model: sagemaker.Model;

const variantName = 'my-variant';
const endpointConfig = new sagemaker.EndpointConfig(this, 'EndpointConfig', {
instanceProductionVariants: [
{
model: model,
variantName: variantName,
},
]
});

const endpoint = new sagemaker.Endpoint(this, 'Endpoint', { endpointConfig });
const productionVariant = endpoint.findInstanceProductionVariant(variantName);
const instanceCount = productionVariant.autoScaleInstanceCount({
maxCapacity: 3
});
instanceCount.scaleOnInvocations('LimitRPS', {
maxRequestsPerSecond: 30,
});
```

For load testing guidance on determining the maximum requests per second per instance, please see
this [documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/endpoint-scaling-loadtest.html).

### Metrics

To monitor CloudWatch metrics for a production variant, use one or more of the metric convenience
methods:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker';

declare const endpointConfig: sagemaker.EndpointConfig;

const endpoint = new sagemaker.Endpoint(this, 'Endpoint', { endpointConfig });
const productionVariant = endpoint.findInstanceProductionVariant('my-variant');
productionVariant.metricModelLatency().createAlarm(this, 'ModelLatencyAlarm', {
threshold: 100000,
evaluationPeriods: 3,
});
```
Loading

0 comments on commit bf7586b

Please sign in to comment.