Add functionality for early stopping rounds. (#193)
* add functionality for early stopping
* remove version word
* evaluation msg into a parsing function and add back evaluation to updateone
* Updated the call to updateone! to pass in the watchlist so it can be used by early stopping round logic.
* Added comments, additional examples, fixed issues with watchlist ordering as a Dict.
* Added functionality to extract the best iteration round with examples. Included additional test case coverage.
* Cleaned up some lingering test cases.
* Updated doc to include early stopping example.
* Added additional info on data types for watchlist
* Annotated OrderedDict to be more obvious.
* Included using statement for OrderedCollection
* Moved log message parsing to update! instead of updateone
* Updated documentation and tests.
* Altered the XGBoost method definition to reflect exception states for early stopping rounds and watchlist.
* Created exception if extract_metric_value could not find a match when parsing XGBoost logs.
---------
Co-authored-by: Wilan Wong <[email protected]>
Co-authored-by: wilan-wong-1 <[email protected]>
docs/src/index.md
41 lines changed: 41 additions & 0 deletions
@@ -127,6 +127,7 @@ Unlike feature data, label data can be extracted after construction of the `DMat
[`XGBoost.getlabel`](@ref).

+
## Booster
The [`Booster`](@ref) object holds model data. They are created with training data. Internally
this is always a `DMatrix` but arguments will be automatically converted.
@@ -182,3 +183,43 @@ is equivalent to
bst = xgboost((X, y), num_round=10)
update!(bst, (X, y), num_round=10)
```
+
+### Early Stopping
+To help prevent overfitting to the training set, it is helpful to evaluate against a validation set and check that further XGBoost iterations still improve generalisation rather than only reducing training loss. Early stopping provides a convenient way to automatically stop the
+boosting process if the model's performance on the validation set does not improve for `k` rounds.
+
+If there is more than one element in `watchlist`, the last element is used by default. In this case, you must use an ordered data structure (`OrderedDict`) rather than a standard unordered dictionary; otherwise an exception will be generated. There will be
+a warning if you enable early stopping (`early_stopping_rounds > 0`) but have provided a watchlist of type `Dict` with
+more than one element.
+
+Similarly, if there is more than one element in `eval_metric`, the last element is used by default.
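
The "no improvement for `k` rounds" rule described in the added text can be sketched in plain Julia. This is only an illustration under stated assumptions: `stopping_round` is a hypothetical helper, not part of XGBoost.jl, the metric is assumed to be one where lower is better (such as RMSE), and the exact comparison the package performs is not shown in this diff.

```julia
# Hypothetical helper (not part of XGBoost.jl): patience-based early stopping.
# `scores` holds one validation-metric value per boosting round, lower is better.
# `k` plays the role of `early_stopping_rounds`.
function stopping_round(scores::AbstractVector{<:Real}, k::Integer)
    best_score = Inf
    best_round = 0
    for (round, score) in enumerate(scores)
        if score < best_score
            # New best validation score: remember where it happened.
            best_score = score
            best_round = round
        elseif round - best_round >= k
            # k rounds have passed without beating the best score: stop here.
            return (stopped_at = round, best_round = best_round)
        end
    end
    # The rule never triggered, so every round was run.
    return (stopped_at = length(scores), best_round = best_round)
end

# Made-up validation RMSE values, one per round.
rmse = [0.90, 0.70, 0.60, 0.62, 0.61, 0.63, 0.50]
result = stopping_round(rmse, 3)
println(result)
```

With these made-up values the best score occurs at round 3 and the rule fires at round 6, so the lower value at round 7 is never reached. That is the trade-off of a small `k`: training stops sooner, but a late improvement can be missed.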