Weights are being normalized using number of samples as opposed to sum in GaussianMixture #24085

@kshitijgoel007

Describe the bug

The weights are normalized by `n_samples` at https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/mixture/_gaussian_mixture.py#L718. They should instead be normalized by `weights.sum()`, as is already done in `_m_step()` here: https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/mixture/_gaussian_mixture.py#L756.

Steps/Code to Reproduce

No separate script is provided; the inconsistency is visible in the source. Compare the normalization by `n_samples` at https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/mixture/_gaussian_mixture.py#L718 with the normalization by `weights.sum()` in `_m_step()` at https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/mixture/_gaussian_mixture.py#L756.
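To make the difference between the two normalizations concrete, here is a minimal NumPy sketch. It is not scikit-learn's code: the `resp` array and the variable names are hypothetical, and the rows of `resp` are deliberately chosen to sum to 0.5 instead of 1 so that the two results diverge visibly. When every row of the responsibilities sums to exactly 1, the two normalizations agree up to floating-point error.

```python
import numpy as np

# Hypothetical responsibilities: 4 samples, 2 components.
# Each row deliberately sums to 0.5 rather than 1 (an assumption made
# purely for illustration) so the two normalizations give different results.
resp = np.array([
    [0.50, 0.00],
    [0.00, 0.50],
    [0.25, 0.25],
    [0.50, 0.00],
])
n_samples = resp.shape[0]

# Unnormalized per-component weights (simplified stand-in for the
# library's parameter estimation, not its actual implementation).
nk = resp.sum(axis=0)

# Normalization by the number of samples.
weights_by_n_samples = nk / n_samples

# Normalization by the sum of the weights.
weights_by_sum = nk / nk.sum()

print(weights_by_n_samples, weights_by_n_samples.sum())
print(weights_by_sum, weights_by_sum.sum())
```

Dividing by the sum always yields weights that add up to 1, whereas dividing by `n_samples` only does so when the unnormalized weights themselves add up to `n_samples`.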

Expected Results

Weights normalized by their sum, so that they add up to 1.

Actual Results

Weights normalized by `n_samples` instead of by their sum.

Versions

System:
    python: 3.9.13 (main, May 24 2022, 21:28:31)  [Clang 13.1.6 (clang-1316.0.21.2)]
executable: /Users/kshitijgoel/Documents/main/code.nosync/self_organizing_gmm/.venv/bin/python
   machine: macOS-12.4-x86_64-i386-64bit

Python dependencies:
      sklearn: 1.1.1
          pip: 22.2.1
   setuptools: 62.3.2
        numpy: 1.23.1
        scipy: 1.8.1
       Cython: 0.29.30
       pandas: 1.4.3
   matplotlib: 3.5.2
       joblib: 1.1.0
threadpoolctl: 3.1.0

Built with OpenMP: False

threadpoolctl info:
       user_api: blas
   internal_api: openblas
         prefix: libopenblas
       filepath: /Users/kshitijgoel/Documents/main/code.nosync/self_organizing_gmm/.venv/lib/python3.9/site-packages/numpy/.dylibs/libopenblas64_.0.dylib
        version: 0.3.20
threading_layer: pthreads
   architecture: Haswell
    num_threads: 8

       user_api: blas
   internal_api: openblas
         prefix: libopenblas
       filepath: /Users/kshitijgoel/Documents/main/code.nosync/self_organizing_gmm/.venv/lib/python3.9/site-packages/scipy/.dylibs/libopenblas.0.dylib
        version: 0.3.17
threading_layer: pthreads
   architecture: Haswell
    num_threads: 8

Labels: Bug, Moderate, module:mixture
