
With verbose set, sklearn's GridSearchCV() prints its progress to the console. I want to save the output of each iteration to a file instead. How can I write the following output to a file?

Fitting 5 folds for each of 1 candidates, totalling 5 fits
[CV] colsample_bytree=0.5, eta=0.05, gamma=0.01, learning_rate=0.1, max_depth=7, min_child_weight=0, subsample=0.8
[CV]  colsample_bytree=0.5, eta=0.05, gamma=0.01, learning_rate=0.1, max_depth=7, min_child_weight=0, subsample=0.8, score=0.961, total=   4.4s
[CV] colsample_bytree=0.5, eta=0.05, gamma=0.01, learning_rate=0.1, max_depth=7, min_child_weight=0, subsample=0.8
[CV]  colsample_bytree=0.5, eta=0.05, gamma=0.01, learning_rate=0.1, max_depth=7, min_child_weight=0, subsample=0.8, score=1.000, total=   2.9s
[CV] colsample_bytree=0.5, eta=0.05, gamma=0.01, learning_rate=0.1, max_depth=7, min_child_weight=0, subsample=0.8
[CV]  colsample_bytree=0.5, eta=0.05, gamma=0.01, learning_rate=0.1, max_depth=7, min_child_weight=0, subsample=0.8, score=0.994, total=   2.9s
[CV] colsample_bytree=0.5, eta=0.05, gamma=0.01, learning_rate=0.1, max_depth=7, min_child_weight=0, subsample=0.8
[CV]  colsample_bytree=0.5, eta=0.05, gamma=0.01, learning_rate=0.1, max_depth=7, min_child_weight=0, subsample=0.8, score=0.950, total=   2.8s
[CV] colsample_bytree=0.5, eta=0.05, gamma=0.01, learning_rate=0.1, max_depth=7, min_child_weight=0, subsample=0.8
[CV]  colsample_bytree=0.5, eta=0.05, gamma=0.01, learning_rate=0.1, max_depth=7, min_child_weight=0, subsample=0.8, score=0.984, total=   2.9s
 

1 Answer

Best answer

You can redirect sys.stdout to a file to capture the output of GridSearchCV(). In the following code, I have used XGBClassifier() as the estimator for GridSearchCV(). The output will be saved in 'tune.tsv' instead of being displayed on the console.

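import sys
import xgboost as xgb
from sklearn.model_selection import GridSearchCV

# testData, trainData and paramGrid are assumed to be defined earlier
clf = xgb.XGBClassifier(scale_pos_weight=testData.posWeight)
random_search = GridSearchCV(clf,
    param_grid=paramGrid,
    scoring='roc_auc', cv=5,
    verbose=3)

# save the outputs of gridsearch: send stdout to a file while fitting
original_stdout = sys.stdout
sys.stdout = open('tune.tsv', 'w')
try:
    random_search.fit(trainData.features, trainData.labels)
finally:
    sys.stdout.close()
    sys.stdout = original_stdout  # restore console output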
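
As a variation on the redirect above, here is a minimal sketch using contextlib.redirect_stdout from the standard library, which puts sys.stdout back automatically when the block exits, even if fit() raises. The helper name fit_with_log and the DummySearch class are hypothetical: DummySearch only imitates the verbose printing so the sketch runs without scikit-learn or xgboost installed.

```python
import contextlib


def fit_with_log(search, X, y, log_path):
    """Call search.fit(X, y) while writing everything it prints to log_path.

    Hypothetical helper for illustration; it is not part of scikit-learn.
    """
    with open(log_path, "w") as log_file:
        # redirect_stdout swaps sys.stdout for log_file inside the block
        # and restores the original stream on exit, even on an exception.
        with contextlib.redirect_stdout(log_file):
            search.fit(X, y)
    return search


class DummySearch:
    """Stand-in that prints like a verbose search (assumption, for demo only)."""

    def fit(self, X, y):
        print("Fitting 5 folds for each of 1 candidates, totalling 5 fits")
        for _ in range(5):
            print("[CV] max_depth=7, subsample=0.8")
        return self
```

With the objects from the answer, the call would presumably look like fit_with_log(random_search, trainData.features, trainData.labels, 'tune.tsv'). This only captures what is printed in the main process, so output from parallel workers (n_jobs other than 1) may not reach the file.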


...