Open
Description
I propose we deprecate the use of text file input and remove the text parsers in XGBoost, including the libsvm parser and csv parser from dmlc core. Nowadays, there is a wealth of third-party libraries focused on feature engineering that can handle these formats efficiently. Loading the data inside XGBoost does not provide much value, as users are likely to need to perform tasks like cross-validation and hyper-parameter optimization.
At the moment, there are three use cases for the text input:
- CLI: I propose the removal of the CLI in "Deprecate the command line interface" (#9471).
- External memory: We have largely replaced the external memory with a custom data iterator. Even with text input, the underlying implementation uses a data iterator.
- Federated learning: I believe we will move toward in-memory input as we progress, for better integration with frameworks like nvflare.