Lesson 16: Classification — Predicting Categories
Distinguish classification from regression; build a KNN classifier.
Classification: Predicting Categories
While regression predicts numbers, Classification predicts categories. Is this email spam or not? Is this image a cat or a dog?
K-Nearest Neighbors (KNN)
One of the most intuitive classification algorithms is K-Nearest Neighbors. It works on a simple principle: "You are the average of your closest friends."
If we plot our data points, KNN looks at the "K" closest data points to our new, unknown example. It takes a majority vote among those neighbors to decide the category. If K=5, and 3 neighbors are cats and 2 are dogs, the model predicts "Cat"!
Python Challenge: Find Your Neighbors!
Train a KNN classifier to predict whether a point is red (0) or blue (1).
from sklearn.neighbors import KNeighborsClassifier
# Features: [X-coordinate, Y-coordinate]
X = [[1, 2], [1.5, 1.8], [5, 8], [8, 8], [1, 0.6], [9, 11]]
# Labels: 0 for Red, 1 for Blue
y = [0, 0, 1, 1, 0, 1]
# TODO: Initialize KNeighborsClassifier with k=3
# knn = ???
# TODO: Fit the model and predict for a new point [2, 2]
# knn.???
# prediction = knn.predict([[2, 2]])
# print(f"Predicted class: {prediction[0]}")By changing the value of K, you can change how smooth or jagged the decision boundary between the classes becomes.