Machine Learning — Classification vs Regression

Overview

Supervised machine learning problems where the goal is to predict a numeric value are called regression. Problems where the goal is to predict a category / label are called classification.

This document explains the core differences, formulas, symbols, real-world examples, and practical tips.

Key differences — quick table

Aspect            | Classification                            | Regression
Target type       | Discrete labels (e.g., {spam, not-spam})  | Continuous numeric values (e.g., price)
Loss examples     | Cross-entropy, hinge loss                  | Mean Squared Error (MSE), MAE
Evaluation        | Accuracy, Precision, Recall, F1, ROC-AUC   | RMSE, MAE, R² (coefficient of determination)
Output activation | Softmax / sigmoid                          | Linear
Typical models    | Logistic regression, SVM, decision trees, random forests, neural nets (softmax output) | Linear regression, Ridge/Lasso, decision trees, random forests, neural nets (linear output)
Mathematical definitions & formulas

Below are the core formulas, with every symbol explained.

Regression — Simple linear regression model
y = \beta_0 + \beta_1 x

y = predicted numeric value (target).
x = input feature (one-dimensional example).
\beta_0 = intercept (bias), \beta_1 = slope (weight for x).
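
A toy numeric check of the model, assuming made-up coefficients \beta_0 = 2 and \beta_1 = 0.5 (Python/NumPy):

import numpy as np

beta_0, beta_1 = 2.0, 0.5        # hypothetical intercept and slope
x = np.array([1.0, 2.0, 3.0])    # three input values
y_hat = beta_0 + beta_1 * x      # predictions: [2.5, 3.0, 3.5]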

Loss — Mean Squared Error (MSE)
MSE = \frac{1}{n} \sum_{i=1}^n (y_i - \hat{y}_i)^2

Symbols: n = number of samples, y_i = true value, \hat{y}_i = model prediction.
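
A minimal NumPy sketch of this loss on toy arrays (values made up for illustration):

import numpy as np

y_true = np.array([3.0, 5.0, 7.0])     # y_i: true values
y_pred = np.array([2.5, 5.0, 8.0])     # \hat{y}_i: predictions
mse = np.mean((y_true - y_pred) ** 2)  # (0.25 + 0 + 1) / 3 ≈ 0.417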

Classification — Logistic regression (binary)
P(y=1|x) = \sigma(z) = \frac{1}{1 + e^{-z}} , \quad z = w^T x + b

P(y=1|x) = probability the label is 1 given x.
\sigma(\cdot) = sigmoid activation.
z = linear score, with w (weights vector), b (bias).
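
A minimal sketch of this scoring step, with made-up weights and bias:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([0.8, -0.4])   # hypothetical weights
b = 0.1                     # hypothetical bias
x = np.array([1.0, 2.0])    # one input vector
z = w @ x + b               # linear score: 0.8 - 0.8 + 0.1 = 0.1
p = sigmoid(z)              # P(y=1|x) ≈ 0.525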

Loss — Binary cross-entropy (log loss)
L = -\frac{1}{n} \sum_{i=1}^n \left[ y_i \log(\hat{p}_i) + (1-y_i) \log(1-\hat{p}_i) \right]

Symbols: \hat{p}_i = predicted probability P(y=1|x_i).
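
A direct NumPy translation of the loss on toy labels and probabilities (the clip guards against log(0)):

import numpy as np

y = np.array([1, 0, 1])                   # true labels y_i
p_hat = np.array([0.9, 0.2, 0.6])         # predicted P(y=1|x_i)
p_hat = np.clip(p_hat, 1e-12, 1 - 1e-12)  # avoid log(0)
loss = -np.mean(y * np.log(p_hat) + (1 - y) * np.log(1 - p_hat))
# ≈ (0.105 + 0.223 + 0.511) / 3 ≈ 0.28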

Real-world examples
  • Classification: Email spam detection (spam vs not-spam), medical diagnosis (disease present / absent), image recognition (cat/dog).
  • Regression: House price prediction (USD), predicting tomorrow's temperature (°C), estimating age from continuous features.

Example: in hospital triage, classification decides 'urgent' vs 'non-urgent', while regression estimates the expected wait time in minutes.

Visual demos (interactive)

[Two interactive SVG diagrams appear here in the original page; a toggle switches between the classification and regression views.]

Evaluation metrics — formulas and meanings
Regression
RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^n (y_i - \hat{y}_i)^2}

Root Mean Squared Error — lower is better; same units as y.
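
A one-line NumPy version, reusing the toy arrays from the MSE sketch above:

import numpy as np

y_true = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.5, 5.0, 8.0])
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))  # ≈ 0.645, in the units of y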

Classification (binary) — Confusion matrix terms
Accuracy = \frac{TP + TN}{TP + TN + FP + FN}

TP = true positives, TN = true negatives, FP = false positives, FN = false negatives.

Precision / Recall / F1
Precision = \frac{TP}{TP + FP} \quad Recall = \frac{TP}{TP + FN} \quad F1 = 2 \cdot \frac{Precision \cdot Recall}{Precision + Recall}

Use precision/recall when class imbalance matters (e.g., disease detection where positive class is rare).
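
A minimal sketch of these metrics with scikit-learn's helpers, on toy binary labels (positive class = 1):

from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # toy ground truth
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # toy predictions
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
accuracy = (tp + tn) / (tp + tn + fp + fn)   # 6/8 = 0.75
precision = precision_score(y_true, y_pred)  # TP/(TP+FP) = 3/4
recall = recall_score(y_true, y_pred)        # TP/(TP+FN) = 3/4
f1 = f1_score(y_true, y_pred)                # 0.75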

Practical tips for choosing models
  • If the target is numeric & continuous → regression family. Start with linear regression, then try tree-based models or neural nets if the relationship is non-linear.
  • If the target is categorical → classification family. For many categories, use a softmax output (multiclass). For imbalance, consider resampling or class weights.
  • Feature scaling: for models like SVM or logistic regression, standardize features. For tree-based models, scaling is less critical.
  • Regularization: Ridge/Lasso for regression, L2/L1 penalties for classification, to prevent overfitting (see the sketch after this list).
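
A minimal sketch combining these tips, assuming hypothetical data X_train, y_train:

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression, Ridge

# Classification: standardize features, L2 penalty (default), reweight rare classes
clf = make_pipeline(StandardScaler(),
                    LogisticRegression(class_weight="balanced"))

# Regression: standardize features, L2-regularized linear model
reg = make_pipeline(StandardScaler(), Ridge(alpha=1.0))

# Both are fitted the same way, e.g. clf.fit(X_train, y_train)
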
Practical code examples (Python / scikit-learn)

These snippets show how the task setup differs in code.

Regression example
from sklearn.linear_model import LinearRegression

# X_train, y_train: feature matrix and a continuous target (e.g., prices)
model = LinearRegression()
model.fit(X_train, y_train)      # learns intercept_ and coef_
y_pred = model.predict(X_test)   # continuous predictions

Classification example (binary)
from sklearn.linear_model import LogisticRegression

# X_train, y_train: feature matrix and binary labels (0/1)
clf = LogisticRegression()
clf.fit(X_train, y_train)
probs = clf.predict_proba(X_test)[:, 1]   # P(y=1|x) for each sample
y_pred = clf.predict(X_test)              # labels via the default 0.5 threshold

Note: predict_proba returns probabilities, while predict applies a 0.5 threshold by default; the threshold can be tuned via the ROC curve.
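
One common heuristic picks the threshold that maximizes Youden's J statistic (TPR - FPR) on held-out data; a sketch reusing probs and y_test from the snippet above:

import numpy as np
from sklearn.metrics import roc_curve

fpr, tpr, thresholds = roc_curve(y_test, probs)  # ROC over all candidate thresholds
best = np.argmax(tpr - fpr)                      # index of maximal Youden's J
y_pred_tuned = (probs >= thresholds[best]).astype(int)  # replaces the 0.5 default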

Symbols & glossary
  • x — input feature (a single feature x or a feature vector \mathbf{x}).
  • y — true target; numeric for regression, discrete label for classification.
  • \hat{y} — model prediction (numeric or label).
  • \hat{p} — predicted probability of the positive class.
  • \beta, w — model parameters (weights).
  • b, \beta_0 — bias / intercept.
  • n — sample size.
Summary

Supervised machine learning predicts either numeric values (regression) or categories/labels (classification).



Author: Sir H.A. Mwala (biasharaboraofficials@gmail.com)
MWALA_LEARN, powered by MwalaJS: https://mwalajs.biasharabora.com | https://educenter.biasharabora.com
