#2415 Fix precision issue in tree split descriptions#2721

Open
Shashank1202 wants to merge 1 commit intocatboost:masterfrom
Shashank1202:fix/tree-split-precision

@Shashank1202

  • Created a new method get_full_precision_tree_splits in core.py to retrieve tree split descriptions with 8 significant digits.

This change addresses issue #2415, improving the precision of split descriptions in tree visualizations.

@Shashank1202
Author

Hey, can anyone guide me on where to add test cases? I went through the documentation but couldn't figure it out.

Your guidance would be highly appreciated!

@RahulVadisetty91 RahulVadisetty91 left a comment

The create_dir_if_not_exist function could handle the case where the directory already exists more robustly. A manual existence check followed by directory creation is open to a race condition in multi-threaded environments. You could use os.makedirs() with exist_ok=True (Python 3.2+) and drop the manual check.

def create_dir_if_not_exist(path):
    os.makedirs(path, exist_ok=True)

return True

### Function to add full Precision split
def get_full_precision_splits(self, tree_idx, pool= None):

The new function get_full_precision_splits is defined as a standalone function in core.py, but it accesses self._object. Because it is not a method of a class, it is never bound to a model instance, so calling it on a model (for example, model.get_full_precision_splits(0)) raises an AttributeError. Functions that operate on the model object should be methods of the CatBoost class (or its base class) so that self refers to the model.

Possibly just an indentation issue.
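
A minimal sketch of the difference, in case it helps. The Model class and both function names below are hypothetical stand-ins, not CatBoost code; the only point is how indentation decides whether a function becomes a method.

```python
class Model:
    """Hypothetical stand-in for a model class that owns a low-level handle."""

    def __init__(self):
        self._object = "low-level model handle"

    def get_splits_as_method(self):
        # Indented inside the class body, so it is bound to each instance
        # and `self` refers to the model.
        return self._object


def get_splits_standalone(self):
    # Defined at module level: `self` is only an ordinary parameter name here,
    # and the function is not an attribute of Model.
    return self._object


model = Model()

# Works: the method is looked up on the class and bound to `model`.
method_result = model.get_splits_as_method()

# Works only when the instance is passed explicitly.
explicit_result = get_splits_standalone(model)

# Fails: Model has no attribute with this name.
try:
    model.get_splits_standalone()
    lookup_error = None
except AttributeError as exc:
    lookup_error = exc
```

Indenting the standalone function one level so that it sits inside the class body would make the last call succeed as well.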

@a-holm

a-holm commented Apr 4, 2025

Just a note. The PR description mentions creating a method named get_full_precision_tree_splits, but the implemented function is named get_full_precision_splits.

Member

@andrey-khropov andrey-khropov left a comment

It won't work like this because _get_tree_splits returns already prepared strings.

The function that generates these descriptions is implemented in C++ and called here.

I suggest adding a parameter to it (and to the Python package functions further up the call stack) that specifies the format for printing floating-point values. I suggest using the well-known printf notation.
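
To illustrate the idea only: the sketch below shows how a printf-style format parameter controls the number of digits in a split description. The function format_split, its float_format parameter, and the "feature > value" layout are assumptions made for this example; they are not the actual CatBoost API or its real description format, and the real change would need to be made in the C++ code.

```python
def format_split(feature_name, border, float_format="%.8g"):
    """Hypothetical helper: render a float split description.

    `float_format` is a printf-style format string applied to the border value.
    """
    return "%s > %s" % (feature_name, float_format % border)


border = 0.123456789

# Three significant digits lose information about the border.
short_description = format_split("f0", border, float_format="%.3g")

# Eight significant digits, as requested in the PR description.
precise_description = format_split("f0", border, float_format="%.8g")

print(short_description)
print(precise_description)
```

Passing the format string down from the caller keeps the default output unchanged while letting users who need more digits ask for them.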
