Description
The following code segment should be adjusted:
self.capacity = node_ndarray.shape[0]
if self._resize_c(self.capacity) != 0:
It should instead be just:
if self._resize_c(node_ndarray.shape[0]) != 0:
with the preceding assignment to self.capacity dropped. Allow me to explain why:
In the _resize_c function, the initial check is if capacity == self.capacity and self.nodes != NULL. If we are editing an existing tree (which is probably bad practice, but it has several use cases, so it might as well be supported for people who know what they're doing, right?), the nodes are not NULL. Because self.capacity has just been overwritten and is then passed as the argument, the check evaluates to true and the function returns 0 without resizing.
If the replacement I suggest is implemented, the resize will be executed and - as an effect of the resize function - the capacity will still be updated, but at the right moment.
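To make the early return concrete, here is a minimal pure-Python sketch of the behaviour described above. It is a simplified model, not scikit-learn's actual Cython code: the class ToyTree, the methods load_buggy and load_fixed, and the list standing in for the C node buffer are all hypothetical names introduced for illustration.

```python
class ToyTree:
    """Simplified stand-in for the Cython Tree (hypothetical, for illustration)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.nodes = [None] * capacity  # stands in for the C node buffer

    def _resize_c(self, capacity):
        # Early exit modelled on the check quoted in the issue.
        if capacity == self.capacity and self.nodes is not None:
            return 0
        old = self.nodes or []
        kept = old[:capacity]
        self.nodes = kept + [None] * (capacity - len(kept))
        self.capacity = capacity  # capacity is updated by the resize itself
        return 0

    def load_buggy(self, new_nodes):
        # Mirrors the current code: capacity is overwritten before the resize.
        self.capacity = len(new_nodes)
        if self._resize_c(self.capacity) != 0:
            raise MemoryError("resize failed")
        return len(self.nodes)

    def load_fixed(self, new_nodes):
        # Mirrors the proposed change: pass the new size directly.
        if self._resize_c(len(new_nodes)) != 0:
            raise MemoryError("resize failed")
        return len(self.nodes)


# Load a 10-node state into a tree that currently holds 3 nodes.
small = ToyTree(3)
buggy_len = small.load_buggy(range(10))   # buffer is not resized

small2 = ToyTree(3)
fixed_len = small2.load_fixed(range(10))  # buffer grows to the new size
```

In the buggy path the buffer keeps its old length while capacity claims the new one, which is the mismatch that, in the real C buffer, could lead to out-of-bounds access.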
Currently, trying to load the state of one (large) tree into another (small) tree results in a segfault on certain operations.
Reference: scikit-learn/sklearn/tree/_tree.pyx, line 699 in 8ea2997:
self.capacity = node_ndarray.shape[0]