Hyperparameter Tuning: Finding the Sweet Spot
In ML, there is a difference between Parameters (the internal numbers the model learns on its own, like the slope in linear regression) and Hyperparameters (the settings you choose before training, like K in KNN, or max_depth in a Decision Tree).
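To make the distinction concrete, here is a minimal sketch. The toy data (`X_toy`, `y_toy`) is made up for illustration and is not part of the lesson. The slope and intercept come out of `fit()`, while K is something you pass in yourself.

```python
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsClassifier

# Toy data that follows y = 2x + 1 exactly (made up for illustration)
X_toy = [[0], [1], [2], [3]]
y_toy = [1, 3, 5, 7]

# Parameters: the model learns these from the data during fit()
lin = LinearRegression()
lin.fit(X_toy, y_toy)
print(f"Learned slope: {lin.coef_[0]:.2f}")
print(f"Learned intercept: {lin.intercept_:.2f}")

# Hyperparameter: you choose this before training and pass it to the constructor
knn = KNeighborsClassifier(n_neighbors=3)
print(f"Chosen K: {knn.get_params()['n_neighbors']}")
```

Nothing in the code sets the slope directly. It only appears after training, which is what makes it a parameter rather than a hyperparameter.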
How do you find the best hyperparameters? You could guess and check, but a better way is Grid Search. You give the algorithm a list of possible values for each hyperparameter, and it tries every single combination, using Cross-Validation to see which combo works best!
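To see what "every single combination" means in practice, the sketch below enumerates a small grid by hand using only the standard library. The grid values match the exercise that follows. The variable names (`cv_folds`, `combinations`, `n_fits`) are illustrative, and this is not how scikit-learn implements the search internally.

```python
from itertools import product

# A small grid: two hyperparameters with three candidate values each
param_grid = {
    'n_estimators': [10, 50, 100],
    'max_depth': [None, 2, 5],
}
cv_folds = 3  # number of cross-validation folds (assumed for this sketch)

# Build every combination of one value per hyperparameter
names = list(param_grid)
combinations = [
    dict(zip(names, values))
    for values in product(*param_grid.values())
]

for combo in combinations:
    print(combo)

# Each combination is trained and scored once per fold
n_fits = len(combinations) * cv_folds
print(f"{len(combinations)} combinations x {cv_folds} folds = {n_fits} model fits")
```

The count grows multiplicatively with each hyperparameter you add, so a grid that looks small on paper can become expensive to search.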
Use GridSearchCV to find the best settings for a Random Forest.
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
X, y = load_iris(return_X_y=True)
# The grid of hyperparameters to try
param_grid = {
    'n_estimators': [10, 50, 100],
    'max_depth': [None, 2, 5]
}
# TODO: Initialize GridSearchCV with a RandomForestClassifier, the param_grid, and cv=3
# grid_search = ???
# TODO: Fit the grid search to the data
# grid_search.???
# print(f"Best parameters found: {grid_search.best_params_}")
# print(f"Best cross-validation score: {grid_search.best_score_:.2f}")
Remember the "No Free Lunch" theorem: No single model or set of hyperparameters is guaranteed to work best for every problem. You always have to experiment with your specific dataset!
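If you want to check your work, here is one possible way to fill in the TODOs in the exercise. Treat it as a sketch, not the only correct answer. The `random_state=0` argument is an assumption added here so that repeated runs give the same result; the exercise itself does not ask for it.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# The grid of hyperparameters to try
param_grid = {
    'n_estimators': [10, 50, 100],
    'max_depth': [None, 2, 5],
}

# random_state=0 is an added assumption for reproducibility
grid_search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,
)

# Trains and scores every combination in the grid with 3-fold cross-validation
grid_search.fit(X, y)

print(f"Best parameters found: {grid_search.best_params_}")
print(f"Best cross-validation score: {grid_search.best_score_:.2f}")
```

The winning combination depends on the dataset and on how the folds happen to be split, so your best parameters may differ from a classmate's. That is the "No Free Lunch" idea in action.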