I want to store my training and test data in a DMatrix object so that I can use them with an XGBoost model.

How can I create a DMatrix object for my train and test data?

1 Answer


The xgboost module provides the DMatrix class. Pass it your feature array (and, for training data, the labels) to create a DMatrix object. Here is an example:

import xgboost as xgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# load breast cancer data
data = load_breast_cancer()
X = data.data
y = data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1)

# create DMatrix objects: training data carries labels, test data only needs features
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test)

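# xgboost params
params = {
    'objective': 'binary:logistic',  # binary target, so predict probabilities
    'max_depth': 6,
    'subsample': 0.80,
    'verbosity': 0  # replaces the deprecated 'silent' option
}
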
# train the model
model = xgb.train(params, dtrain)
pred = model.predict(dtest)
print(roc_auc_score(y_test, pred))