[BUG]: compare_models in parallel mode fails when include parameter is set to empty list  #4028

Open
@kondziolka9ld

Issue Description

Hi,
I came across an issue when I set the include parameter of compare_models to [] in parallel mode (I think it will also fail whenever no model is trained). compare_models then fails with:

/venv/lib/python3.10/site-packages/pycaret/utils/generic.py:964: in wrapper
    return func(*args, **kwargs)
/venv/lib/python3.10/site-packages/pycaret/classification/functional.py:814: in compare_models
    return _CURRENT_EXPERIMENT.compare_models(
/venv/lib/python3.10/site-packages/pycaret/classification/oop.py:1180: in compare_models
    return_values = super().compare_models(
/venv/lib/python3.10/site-packages/pycaret/internal/pycaret_experiment/supervised_experiment.py:535: in compare_models
    return self._parallel_compare_models(parallel, caller_params, turbo=turbo)
/venv/lib/python3.10/site-packages/pycaret/internal/pycaret_experiment/supervised_experiment.py:292: in _parallel_compare_models
    return parallel.compare_models(self, params)
/venv/lib/python3.10/site-packages/pycaret/parallel/fugue_backend.py:166: in compare_models
    res = pd.concat(cloudpickle.loads(x[0]) for x in outputs)
/venv/lib/python3.10/site-packages/pandas/core/reshape/concat.py:382: in concat
    op = _Concatenator(
/venv/lib/python3.10/site-packages/pandas/core/reshape/concat.py:445: in __init__
    objs, keys = self._clean_keys_and_objs(objs, keys)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <pandas.core.reshape.concat._Concatenator object at 0x7551f8bedcc0>, objs = <generator object FugueBackend.compare_models.<locals>.<genexpr> at 0x7551f7f2e570>, keys = None

    def _clean_keys_and_objs(
        self,
        objs: Iterable[Series | DataFrame] | Mapping[HashableT, Series | DataFrame],
        keys,
    ) -> tuple[list[Series | DataFrame], Index | None]:
        if isinstance(objs, abc.Mapping):
            if keys is None:
                keys = list(objs.keys())
            objs_list = [objs[k] for k in keys]
        else:
            objs_list = list(objs)
    
        if len(objs_list) == 0:
>           raise ValueError("No objects to concatenate")
E           ValueError: No objects to concatenate

I found a comment: #3690 (comment)

@Haminh630 compare_models return empty list when model training fails. If you set errors = 'raise' in compare_models you will be able to see what is causing the failure.

Indeed, in non-parallel mode compare_models returns an empty list of models, so I expect parallel mode to return an empty list as well.
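The traceback points at `pd.concat` receiving an empty generator. The sketch below reproduces that error with pandas alone and shows one possible guard. `concat_or_empty` is a hypothetical helper written for illustration, not part of pycaret or Fugue, and returning an empty DataFrame is an assumption about what the caller would want.

```python
import pandas as pd


def concat_or_empty(frames):
    """Hypothetical helper: concatenate result frames, or return an
    empty DataFrame when there is nothing to concatenate."""
    frames = list(frames)  # materialize so emptiness can be checked
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames)


# Reproduce the failure: pd.concat raises on an empty iterable.
try:
    pd.concat(df for df in [])
except ValueError as exc:
    message = str(exc)

# The guarded version returns an empty frame instead of raising.
empty = concat_or_empty(df for df in [])

# With real results it behaves like a plain pd.concat.
combined = concat_or_empty(
    [pd.DataFrame({"Accuracy": [0.8]}), pd.DataFrame({"Accuracy": [0.9]})]
)
```

A guard of this kind in the parallel path would let the code after the concatenation decide how to turn "no results" into the empty list that non-parallel mode returns.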

Reproducible Example

from pycaret.parallel import FugueBackend
import pycaret.classification as pc
from pycaret.datasets import get_data

pc.setup(
  data_func=lambda: get_data("juice", verbose=False),
  target="Purchase",
  session_id=0,
  n_jobs=1,
  verbose=False,
  html=False
)

fconf = {
  "fugue.rpc.server": "fugue.rpc.flask.FlaskRPCServer",
  "fugue.rpc.flask_server.host": "localhost",
  "fugue.rpc.flask_server.port": "3333",
  "fugue.rpc.flask_server.timeout": "2 sec",
}

pc.compare_models(include=[], parallel=FugueBackend("dask", fconf, display_remote=True, batch_size=3, top_only=False))
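Until the parallel path handles this case, a caller-side workaround might look like the sketch below. `compare_models_or_empty` is a hypothetical wrapper, not a pycaret function; it takes the comparison function as an argument and simply skips the call when `include` is an explicit empty list.

```python
from typing import Any, Callable, List, Optional


def compare_models_or_empty(
    compare_fn: Callable[..., Any],
    include: Optional[List[Any]] = None,
    **kwargs: Any,
) -> Any:
    """Hypothetical wrapper: return [] without calling compare_fn when
    include is an explicit empty list; otherwise forward the call."""
    if include is not None and len(include) == 0:
        return []
    return compare_fn(include=include, **kwargs)


# Stand-in for pc.compare_models so the sketch runs without pycaret.
def fake_compare_models(include=None, **kwargs):
    return ["trained-model"]


skipped = compare_models_or_empty(fake_compare_models, include=[])
forwarded = compare_models_or_empty(fake_compare_models, include=["lr"])
```

In the example above, `pc.compare_models` and the `parallel=` argument would be passed in place of the stand-in. This only covers the empty `include` case; it does not help when models are requested but none is trained successfully.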

Expected Behavior

When no model is trained, an empty list of models is returned, as in non-parallel mode.

Actual Results

ValueError: No objects to concatenate

Installed Versions

3.2.1
