
Adds model weights pruning #16256

Draft

Burhan-Q wants to merge 46 commits into main from prune

Conversation

@Burhan-Q
Contributor

@Burhan-Q Burhan-Q commented Sep 12, 2024

Summary

This is something I experimented with a while ago, but results were poor due to global pruning methods. Thanks to @lordofkillz for sharing his experiments, pruning now incurs only a minor accuracy drop with a small gain in inference speed.

Note

Saved model weights appear to have a slightly larger file size than the original weights. It is unclear why this occurs.
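One possible cause, assuming `prune_model` uses `torch.nn.utils.prune` under the hood: PyTorch's pruning reparametrization keeps both the original tensor (`weight_orig`) and a binary mask (`weight_mask`) on the module until `prune.remove()` is called, so a checkpoint saved before removal carries extra tensors. A minimal sketch of the `state_dict` difference:

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# Prune a single conv layer and inspect which tensors get saved.
conv = nn.Conv2d(3, 8, 3)
prune.l1_unstructured(conv, name="weight", amount=0.3)

# While the reparametrization is active, both the original weights and
# the binary mask are part of the state_dict, inflating the saved size.
print(sorted(conv.state_dict().keys()))  # ['bias', 'weight_mask', 'weight_orig']

# prune.remove() folds the mask into the weight and drops the extras.
prune.remove(conv, "weight")
print(sorted(conv.state_dict().keys()))  # ['bias', 'weight']
```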

Example

from ultralytics import YOLO
from ultralytics.utils.torch_utils import prune_model

# Load trained model
model = YOLO("yolov8m.pt")

prune_model(model, 0.3)  # model pruned in-place
# >>> Model sparsity achieved 29.96% from 0.00%

model.save("yolov8m-sparse-30.pt", False)

Performance

Testing on COCO128 using YOLOv8m shows a dip in performance when pruning at a target of 0.3 from the original weights. Only the summary (all) row and the first five class results are shown for brevity.

Normal model validation

yolo val model="yolov8m.pt" data="coco128.yaml" batch=1
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95):
                   all       5000      36335      0.716       0.61      0.667      0.501
                person       2693      10777      0.821      0.745      0.829      0.616
               bicycle        149        314      0.742      0.525      0.626      0.402
                   car        535       1918      0.765      0.637      0.713      0.498
            motorcycle        159        367      0.811      0.678      0.793      0.547
              airplane         97        143       0.84      0.884      0.925      0.776

Speed: 0.3ms preprocess, 8.7ms inference, 0.0ms loss, 1.1ms postprocess per image

Pruned model validation

yolo val model="yolov8m-sparse-30.pt" data="coco128.yaml" batch=1
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95):
                   all       5000      36335      0.706      0.595      0.653      0.489
                person       2693      10777       0.89      0.659      0.819      0.609
               bicycle        149        314      0.692      0.544      0.613      0.386
                   car        535       1918       0.79      0.595      0.704      0.487
            motorcycle        159        367      0.796      0.689      0.781       0.54
              airplane         97        143      0.821      0.902      0.916      0.767

Speed: 0.3ms preprocess, 8.7ms inference, 0.0ms loss, 1.1ms postprocess per image

Repeated inference evaluation results

Inference was run repeatedly on a directory of images and the overall inference runtime averaged.

model type      AVG inference time (ms)
Normal          17.570
Pruned @30%     17.492
Repeated evaluation code

import timeit
from pathlib import Path

import cv2 as cv

from ultralytics import YOLO
from ultralytics.utils import ASSETS

N = 17
R = 5
im = ASSETS / "bus.jpg"
img = cv.imread(str(im))
img_dir = Path(r"Q:\datasets\coco128\images\valid")
# Load all images
images = [cv.imread(str(f)) for f in img_dir.iterdir() if f.is_file()]

normal_model = YOLO("yolov8m.pt")
sparse_model = YOLO("yolov8m-sparse-30.pt")

def infer_sparse_model():
    sparse_model.predict(images, verbose=False)

def infer_normal_model():
    normal_model.predict(images, verbose=False)

if __name__ == '__main__':

    _ = normal_model.predict([img] * 15)  # warmup
    print(
        "Normal model averaged inference time:",
        round(sum(timeit.repeat(infer_normal_model, number=N, repeat=R)) / R, 3)
    )
    
    _ = sparse_model.predict([img] * 15)  # warmup
    print(
        "Sparse model averaged inference time:",
        round(sum(timeit.repeat(infer_sparse_model, number=N, repeat=R)) / R, 3)
    )
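For context on the printed values: `timeit.repeat` returns one wall-clock total per repeat, each covering `number=N` calls, so each printed value is the average time of N full passes over the image list, not a per-image figure. A standalone sketch of the same pattern on a dummy workload:

```python
import timeit


def task():
    """Stand-in for one full pass over the image list."""
    sum(range(10_000))


N, R = 17, 5
# timeit.repeat returns R totals; each total is the wall time of N calls to task().
totals = timeit.repeat(task, number=N, repeat=R)
avg_total = sum(totals) / R          # average wall time of one repeat (N calls)
per_call_ms = avg_total / N * 1e3    # average time of a single call, in ms
print(f"{per_call_ms:.3f} ms per call")
```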

Environment info

Ultralytics YOLOv8.2.92 🚀 Python-3.10.9 torch-2.3.1+cu121 CUDA:0 (NVIDIA GeForce RTX 3080, 12288MiB)
Setup complete ✅ (12 CPUs, 31.9 GB RAM, 778.9/1863.0 GB disk)

OS                  Windows-10-10.0.19045-SP0
Environment         Windows
Python              3.10.9
Install             git
RAM                 31.86 GB
CPU                 Intel Core(TM) i5-10600K 4.10GHz
CUDA                12.1

matplotlib          ✅ 3.8.1>=3.3.0
opencv-python       ✅ 4.8.1.78>=4.6.0
pillow              ✅ 9.3.0>=7.1.2
pyyaml              ✅ 6.0.1>=5.3.1
requests            ✅ 2.31.0>=2.23.0
scipy               ✅ 1.11.3>=1.4.1
torch               ✅ 2.3.1+cu121>=1.8.0
torchvision         ✅ 0.18.1+cu121>=0.9.0
tqdm                ✅ 4.66.1>=4.64.0
psutil              ✅ 5.9.6
py-cpuinfo          ✅ 9.0.0
thop                ✅ 0.1.1-2209072238>=0.1.1
pandas              ✅ 2.1.3>=1.1.4
seaborn             ✅ 0.13.0>=0.11.0

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

New functions for model pruning and zero-count testing were added to improve efficiency and analysis.

📊 Key Changes

  • Added zero_count function to count zero-valued parameters in PyTorch models.
  • Introduced prune_model function for L1 unstructured pruning of convolutional layers.
  • Implemented substr_in_set function to check substring presence in a set.
  • Added tests for these new functions to ensure proper functionality.
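
The approach named above (L1 unstructured pruning of convolutional layers) can be sketched with `torch.nn.utils.prune`. The helper names here (`prune_conv_layers`, `sparsity`) are hypothetical stand-ins for illustration, not the PR's actual `prune_model`/`zero_count` code:

```python
import torch.nn as nn
import torch.nn.utils.prune as prune


def prune_conv_layers(model: nn.Module, amount: float = 0.3) -> None:
    """Apply L1 unstructured pruning to every Conv2d layer, in-place.

    Hypothetical sketch of the described approach, not the PR's code.
    """
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            prune.l1_unstructured(module, name="weight", amount=amount)
            prune.remove(module, "weight")  # fold mask in, making zeros permanent


def sparsity(model: nn.Module) -> float:
    """Fraction of zero-valued weights across all Conv2d layers."""
    zeros = total = 0
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            zeros += int((module.weight == 0).sum())
            total += module.weight.numel()
    return zeros / total


model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 32, 3))
prune_conv_layers(model, 0.3)
print(f"Sparsity: {sparsity(model):.2%}")  # ~30%
```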

🎯 Purpose & Impact

  • Efficiency: Pruning reduces model size, potentially improving performance and reducing resource usage. 🏋️‍♀️
  • Analysis: zero_count helps in assessing model sparsity, which is useful for optimization. 📊
  • Flexibility: The ability to exclude certain layers from pruning allows for more controlled and informed pruning strategies. 🎯

@Burhan-Q Burhan-Q added the enhancement New feature or request label Sep 12, 2024
@Burhan-Q Burhan-Q self-assigned this Sep 12, 2024
@UltralyticsAssistant UltralyticsAssistant added the detect Object Detection issues, PR's label Sep 12, 2024
@UltralyticsAssistant
Member

👋 Hello @Burhan-Q, thank you for submitting a PR to the ultralytics/ultralytics repository! 🚀 This is an automated response. An Ultralytics engineer will review your PR shortly.

To ensure a smooth integration of your work, please review the following checklist:

  • Define a Purpose: Clearly explain the purpose of your changes. It seems you have already done a great job describing your pruning improvements and performance results. Make sure your commit messages adhere to project conventions.
  • Synchronize with Source: Ensure your PR is up-to-date with the main branch. If not, rebase or merge the latest changes.
  • Verify CI Checks: Confirm all Continuous Integration (CI) checks are passing. Fix any issues if they arise.
  • Update Documentation: Modify relevant documentation for any new or updated features you introduced.
  • Include Tests: If applicable, update or provide new tests that cover your changes, ensuring all tests pass successfully.
  • Sign the CLA: If this is your first contribution, please sign the Contributor License Agreement by commenting, "I have read the CLA Document and I sign the CLA" below.

For more details, please refer to our Contributing Guide. Feel free to comment if you have any questions. Thank you for enhancing Ultralytics with your contributions! 🎉

@codecov

codecov bot commented Sep 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 69.87%. Comparing base (0ae9ee8) to head (2d68019).
Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #16256      +/-   ##
==========================================
+ Coverage   69.84%   69.87%   +0.02%     
==========================================
  Files         129      129              
  Lines       17095    17112      +17     
==========================================
+ Hits        11940    11957      +17     
  Misses       5155     5155              
Flag          Coverage Δ
Benchmarks    34.45% <30.00%> (-0.02%) ⬇️
GPU           36.10% <30.00%> (-0.02%) ⬇️
Tests         66.32% <100.00%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Burhan-Q Burhan-Q requested a review from Laughing-q September 13, 2024 12:28
@Laughing-q Laughing-q self-assigned this Oct 9, 2024
@github-actions

👋 Hello there! We wanted to let you know that we've decided to close this pull request due to inactivity. We appreciate the effort you put into contributing to our project, but unfortunately, not all contributions are suitable or aligned with our product roadmap.

We hope you understand our decision, and please don't let it discourage you from contributing to open source projects in the future. We value all of our community members and their contributions, and we encourage you to keep exploring new projects and ways to get involved.

For additional resources and information, please see the links below:

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

@github-actions github-actions bot added the Stale Stale and schedule for closing soon label Mar 12, 2025