Deprecate text file input. #9472

Open
@trivialfis

Description

I propose that we deprecate text file input and remove the text parsers from XGBoost, including the libsvm and CSV parsers from dmlc-core. Nowadays, there is a wealth of third-party libraries focused on feature engineering that handle these formats efficiently. Loading the data inside XGBoost does not provide much value, since users likely need to perform tasks like cross-validation and hyper-parameter optimization.

At the moment, there are three use cases for the text input:

  • CLI: I propose removing the CLI in "Deprecate the command line interface" (#9471).
  • External memory: We have largely replaced the text-based external memory input with a custom data iterator. Even with text input, the underlying implementation uses a data iterator.
  • Federated learning: I believe we will move toward in-memory input as we progress, for better integration with frameworks like nvflare.
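
To illustrate the alternative to the built-in text parsers, here is a minimal sketch of parsing a libsvm file with a third-party library and handing the in-memory result to XGBoost. It assumes scikit-learn is installed; the helper names `load_libsvm` and `to_dmatrix` and the demo file are made up for this example and are not part of XGBoost.

```python
import os
import tempfile

import numpy as np
from sklearn.datasets import dump_svmlight_file, load_svmlight_file


def load_libsvm(path):
    """Parse a libsvm text file outside XGBoost.

    Returns a SciPy CSR matrix of features and a NumPy array of labels.
    """
    X, y = load_svmlight_file(path, zero_based=True)
    return X, y


def to_dmatrix(X, y):
    """Pass the already-parsed data to XGBoost (imported lazily here)."""
    import xgboost as xgb

    return xgb.DMatrix(X, label=y)


# Write a tiny libsvm file so the sketch is self-contained (hypothetical data).
X_demo = np.array([[1.0, 0.0, 2.0], [0.0, 3.0, 0.0]])
y_demo = np.array([1.0, 0.0])
demo_path = os.path.join(tempfile.mkdtemp(), "demo.libsvm")
dump_svmlight_file(X_demo, y_demo, demo_path, zero_based=True)

# Parsing happens in scikit-learn; XGBoost only ever sees in-memory data.
X, y = load_libsvm(demo_path)
```

With the data in memory, `to_dmatrix(X, y)` builds the `DMatrix`, and the same `X` and `y` can be reused for cross-validation or hyper-parameter search without parsing the file again. For CSV files, `pandas.read_csv` would play the same role as `load_svmlight_file` here.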
