Selfie segmentation with ML Kit on iOS

ML Kit provides an optimized SDK for selfie segmentation. The Selfie Segmenter assets are statically linked to your app at build time, which increases your app size by up to 24MB. API latency varies from ~7ms to ~12ms depending on the input image size, as measured on iPhone X.

Try it out

Try the sample app to see an example of how this API is used.

Before you begin

Include the following ML Kit libraries in your Podfile:
```
pod 'GoogleMLKit/SegmentationSelfie', '7.0.0'
```
After you install or update your project’s Pods, open your Xcode project using its .xcworkspace. ML Kit is supported in Xcode version 13.2.1 or higher.

1. Create an instance of Segmenter

To perform segmentation on a selfie image, first create an instance of Segmenter with SelfieSegmenterOptions and optionally specify the segmentation settings.

Segmenter options

Segmenter Mode

The Segmenter operates in two modes. Be sure you choose the one that matches your use case.

STREAM_MODE (default)

This mode is designed for streaming frames from video or camera. In this mode, the segmenter will leverage results from previous frames to return smoother segmentation results.

SINGLE_IMAGE_MODE

This mode is designed for single images that are not related. In this mode, the segmenter will process each image independently, with no smoothing over frames.
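
To make "smoothing over frames" concrete, here is a minimal sketch of one common technique, an exponential moving average over per-pixel confidences. It is only an illustration and is not ML Kit's implementation, which this page does not describe. `MaskSmoother` and its `alpha` weight are hypothetical names chosen for this example.

```swift
/// Hypothetical illustration (not ML Kit's algorithm): blends each new mask
/// with the running result so that confidences change gradually across frames.
struct MaskSmoother {
  /// Weight of the newest frame, in 0...1. The value to use is an assumption.
  let alpha: Float
  private(set) var smoothed: [Float] = []

  init(alpha: Float) {
    precondition(alpha >= 0 && alpha <= 1, "alpha must be in 0...1")
    self.alpha = alpha
  }

  /// Feeds one frame of per-pixel confidences and returns the smoothed mask.
  mutating func add(_ frame: [Float]) -> [Float] {
    if smoothed.count != frame.count {
      // First frame (or a size change): nothing to blend with yet.
      smoothed = frame
    } else {
      for i in frame.indices {
        smoothed[i] = alpha * frame[i] + (1 - alpha) * smoothed[i]
      }
    }
    return smoothed
  }
}

// Two consecutive two-pixel "frames".
var smoother = MaskSmoother(alpha: 0.5)
let first = smoother.add([0.0, 1.0])
let second = smoother.add([1.0, 1.0])
```

Processing each image independently, as single image mode does, corresponds to skipping the blend entirely and using every frame's mask as-is.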

Enable raw size mask

Asks the segmenter to return the raw size mask which matches the model output size.

The raw mask size (e.g. 256x256) is usually smaller than the input image size.

Without specifying this option, the segmenter will rescale the raw mask to match the input image size. Consider using this option if you want to apply customized rescaling logic or rescaling is not needed for your use case.
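
As a sketch of what customized rescaling logic could look like, the example below upscales a raw mask with floor-based nearest-neighbor sampling. It assumes the confidences have already been copied into a tightly packed, row-major `[Float]` array. `rescaleMask` is a hypothetical helper, not an ML Kit API.

```swift
/// Hypothetical helper (not part of ML Kit): rescales a row-major confidence
/// mask to a new size by sampling the nearest source pixel (floor-based).
func rescaleMask(
  _ mask: [Float], width: Int, height: Int,
  toWidth newWidth: Int, toHeight newHeight: Int
) -> [Float] {
  precondition(width > 0 && height > 0, "source size must be positive")
  precondition(newWidth > 0 && newHeight > 0, "target size must be positive")
  precondition(mask.count == width * height, "mask must hold width * height values")

  var output = [Float](repeating: 0, count: newWidth * newHeight)
  for y in 0..<newHeight {
    // Map each output row back to a source row.
    let srcY = min(height - 1, y * height / newHeight)
    for x in 0..<newWidth {
      let srcX = min(width - 1, x * width / newWidth)
      output[y * newWidth + x] = mask[srcY * width + srcX]
    }
  }
  return output
}

// A 2x2 raw mask scaled up to 4x4.
let rawMask: [Float] = [0.0, 1.0,
                        0.5, 0.25]
let scaled = rescaleMask(rawMask, width: 2, height: 2, toWidth: 4, toHeight: 4)
```

Nearest-neighbor sampling is cheap but produces blocky edges; a bilinear filter would be a reasonable alternative when edge quality matters more than speed.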

Specify the segmenter options:

Swift

let options = SelfieSegmenterOptions()
options.segmenterMode = .singleImage
options.shouldEnableRawSizeMask = true

Objective-C

MLKSelfieSegmenterOptions *options = [[MLKSelfieSegmenterOptions alloc] init];
options.segmenterMode = MLKSegmenterModeSingleImage;
options.shouldEnableRawSizeMask = YES;

Finally, get an instance of Segmenter. Pass the options you specified:

Swift

let segmenter = Segmenter.segmenter(options: options)

Objective-C

MLKSegmenter *segmenter = [MLKSegmenter segmenterWithOptions:options];

2. Prepare the input image

To segment selfies, do the following for each image or frame of video. If you enabled stream mode, you must create VisionImage objects from CMSampleBuffers.

Create a VisionImage object using a UIImage or a CMSampleBuffer.

If you use a UIImage, follow these steps:

Create a VisionImage object with the UIImage. Make sure to specify the correct .orientation.

Swift

// image is the UIImage you want to segment.
let visionImage = VisionImage(image: image)
visionImage.orientation = image.imageOrientation

Objective-C

MLKVisionImage *visionImage = [[MLKVisionImage alloc] initWithImage:image];
visionImage.orientation = image.imageOrientation;

If you use a CMSampleBuffer, follow these steps:
- Specify the orientation of the image data contained in the CMSampleBuffer.

  To get the image orientation:

  Swift

  func imageOrientation(
    deviceOrientation: UIDeviceOrientation,
    cameraPosition: AVCaptureDevice.Position
  ) -> UIImage.Orientation {
    switch deviceOrientation {
    case .portrait:
      return cameraPosition == .front ? .leftMirrored : .right
    case .landscapeLeft:
      return cameraPosition == .front ? .downMirrored : .up
    case .portraitUpsideDown:
      return cameraPosition == .front ? .rightMirrored : .left
    case .landscapeRight:
      return cameraPosition == .front ? .upMirrored : .down
    case .faceDown, .faceUp, .unknown:
      return .up
    }
  }

  Objective-C

  - (UIImageOrientation)
    imageOrientationFromDeviceOrientation:(UIDeviceOrientation)deviceOrientation
                           cameraPosition:(AVCaptureDevicePosition)cameraPosition {
    switch (deviceOrientation) {
      case UIDeviceOrientationPortrait:
        return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationLeftMirrored
                                                              : UIImageOrientationRight;
      case UIDeviceOrientationLandscapeLeft:
        return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationDownMirrored
                                                              : UIImageOrientationUp;
      case UIDeviceOrientationPortraitUpsideDown:
        return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationRightMirrored
                                                              : UIImageOrientationLeft;
      case UIDeviceOrientationLandscapeRight:
        return cameraPosition == AVCaptureDevicePositionFront ? UIImageOrientationUpMirrored
                                                              : UIImageOrientationDown;
      case UIDeviceOrientationUnknown:
      case UIDeviceOrientationFaceUp:
      case UIDeviceOrientationFaceDown:
        return UIImageOrientationUp;
    }
  }

- Create a VisionImage object using the CMSampleBuffer object and orientation:

  Swift

  let image = VisionImage(buffer: sampleBuffer)
  image.orientation = imageOrientation(
    deviceOrientation: UIDevice.current.orientation,
    cameraPosition: cameraPosition)

  Objective-C

  MLKVisionImage *image = [[MLKVisionImage alloc] initWithBuffer:sampleBuffer];
  image.orientation =
      [self imageOrientationFromDeviceOrientation:UIDevice.currentDevice.orientation
                                   cameraPosition:cameraPosition];

3. Process the image

Pass the VisionImage object to one of the Segmenter's image processing methods. You can either use the asynchronous process(image:) method or the synchronous results(in:) method.

To perform segmentation on a selfie image synchronously:

Swift

let mask: SegmentationMask
do {
  mask = try segmenter.results(in: image)
} catch let error {
  print("Failed to perform segmentation with error: \(error.localizedDescription).")
  return
}
// Success. Get a segmentation mask here.

Objective-C

NSError *error;
MLKSegmentationMask *mask = [segmenter resultsInImage:image error:&error];
if (error != nil) {
  // Error.
  return;
}
// Success. Get a segmentation mask here.

To perform segmentation on a selfie image asynchronously:

Swift

segmenter.process(image) { mask, error in
  guard error == nil else {
    // Error.
    return
  }
  // Success. Get a segmentation mask here.
}

Objective-C

[segmenter processImage:image
             completion:^(MLKSegmentationMask * _Nullable mask, NSError * _Nullable error) {
  if (error != nil) {
    // Error.
    return;
  }
  // Success. Get a segmentation mask here.
}];

4. Get the segmentation mask

You can get the segmentation result as follows:

Swift

let maskWidth = CVPixelBufferGetWidth(mask.buffer)
let maskHeight = CVPixelBufferGetHeight(mask.buffer)

CVPixelBufferLockBaseAddress(mask.buffer, CVPixelBufferLockFlags.readOnly)
let maskBytesPerRow = CVPixelBufferGetBytesPerRow(mask.buffer)
let floatsPerRow = maskBytesPerRow / MemoryLayout<Float32>.size
// The capacity is counted in Float32 elements, not bytes.
var maskAddress = CVPixelBufferGetBaseAddress(mask.buffer)!.bindMemory(
  to: Float32.self, capacity: floatsPerRow * maskHeight)

for _ in 0..<maskHeight {
  for col in 0..<maskWidth {
    // Gets the confidence of the pixel in the mask being in the foreground.
    let foregroundConfidence: Float32 = maskAddress[col]
  }
  // Rows may be padded, so advance by the row stride rather than the width.
  maskAddress += floatsPerRow
}
// Release the lock once you have finished reading.
CVPixelBufferUnlockBaseAddress(mask.buffer, CVPixelBufferLockFlags.readOnly)

Objective-C

size_t width = CVPixelBufferGetWidth(mask.buffer);
size_t height = CVPixelBufferGetHeight(mask.buffer);

CVPixelBufferLockBaseAddress(mask.buffer, kCVPixelBufferLock_ReadOnly);
size_t maskBytesPerRow = CVPixelBufferGetBytesPerRow(mask.buffer);
float *maskAddress = (float *)CVPixelBufferGetBaseAddress(mask.buffer);

for (size_t row = 0; row < height; ++row) {
  for (size_t col = 0; col < width; ++col) {
    // Gets the confidence of the pixel in the mask being in the foreground.
    float foregroundConfidence = maskAddress[col];
  }
  // Rows may be padded, so advance by the row stride rather than the width.
  maskAddress += maskBytesPerRow / sizeof(float);
}
// Release the lock once you have finished reading.
CVPixelBufferUnlockBaseAddress(mask.buffer, kCVPixelBufferLock_ReadOnly);
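
One possible way to use the per-pixel confidences is to threshold them into a binary foreground mask. The sketch below assumes the confidences have been copied into a plain `[Float]` array whose rows may carry padding, mirroring the bytes-per-row stride in the loops above. `binaryMask`, `floatsPerRow`, and the 0.5 threshold are hypothetical choices for this example, not ML Kit APIs or recommended values.

```swift
/// Hypothetical helper (not part of ML Kit): converts padded rows of
/// foreground confidences into a tightly packed binary mask.
func binaryMask(
  from confidences: [Float], width: Int, height: Int,
  floatsPerRow: Int, threshold: Float = 0.5
) -> [Bool] {
  precondition(width >= 0 && height >= 0, "dimensions must not be negative")
  precondition(floatsPerRow >= width, "a row cannot be shorter than the mask width")
  precondition(confidences.count >= floatsPerRow * height, "buffer is too small")

  var result = [Bool]()
  result.reserveCapacity(width * height)
  for row in 0..<height {
    // Step by the row stride so that padding values are skipped.
    let rowStart = row * floatsPerRow
    for col in 0..<width {
      result.append(confidences[rowStart + col] >= threshold)
    }
  }
  return result
}

// A 2x2 mask stored with two padding values at the end of each row.
let padded: [Float] = [0.9, 0.2, 0.0, 0.0,
                       0.4, 0.7, 0.0, 0.0]
let foreground = binaryMask(from: padded, width: 2, height: 2, floatsPerRow: 4)
```

A hard threshold gives crisp edges; using the confidence directly as an alpha value is an alternative when a softer blend between foreground and background is wanted.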

For a full example of how to use the segmentation results, please see the ML Kit quickstart sample.

Tips to improve performance

The quality of your results depends on the quality of the input image:
- For ML Kit to get an accurate segmentation result, the image should be at least 256x256 pixels.
- If you perform selfie segmentation in a real-time application, you might also want to consider the overall dimensions of the input images. Smaller images can be processed faster, so to reduce latency, capture images at lower resolutions, but keep in mind the above resolution requirements and ensure that the subject occupies as much of the image as possible.
- Poor image focus can also impact accuracy. If you don't get acceptable results, ask the user to recapture the image.

If you want to use segmentation in a real-time application, follow these guidelines to achieve the best frame rates:
- Use the stream segmenter mode.
- Consider capturing images at a lower resolution. However, also keep in mind this API's image dimension requirements.
- For processing video frames, use the results(in:) synchronous API of the segmenter. Call this method from the AVCaptureVideoDataOutputSampleBufferDelegate's captureOutput(_:didOutput:from:) function to synchronously get results from the given video frame. Keep AVCaptureVideoDataOutput's alwaysDiscardsLateVideoFrames as true to throttle calls to the segmenter. If a new video frame becomes available while the segmenter is running, it will be dropped.
- If you use the output of the segmenter to overlay graphics on the input image, first get the result from ML Kit, then render the image and overlay in a single step. By doing so, you render to the display surface only once for each processed input frame. See the previewOverlayView and CameraViewController classes in the ML Kit quickstart sample for an example.
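
The trade-off between lower resolution and the 256x256 minimum can be sketched as a small size calculation. The example assumes you resize frames yourself before passing them to the segmenter. `downscaledSize` is a hypothetical helper, not an ML Kit or AVFoundation API, and it only computes dimensions.

```swift
/// Hypothetical helper: shrinks a frame size so that its shorter side equals
/// `minSide` (256 by default, per the guidance above) while preserving the
/// aspect ratio. Frames already at or below that size are returned unchanged.
func downscaledSize(width: Int, height: Int, minSide: Int = 256) -> (width: Int, height: Int) {
  precondition(width > 0 && height > 0 && minSide > 0, "dimensions must be positive")
  let shorter = min(width, height)
  guard shorter > minSide else { return (width, height) }

  // Integer ceiling division keeps both sides at or above the target.
  func scaled(_ side: Int) -> Int {
    return (side * minSide + shorter - 1) / shorter
  }
  return (scaled(width), scaled(height))
}

let hdFrame = downscaledSize(width: 1920, height: 1080)
let smallFrame = downscaledSize(width: 200, height: 300)
```

Going all the way down to the minimum favors latency over detail; a larger `minSide` is a reasonable choice if mask edges look too coarse.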

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2024-11-21 UTC.