Cheatsheet: What machine learning actually is

| | Traditional programming | Machine learning |
| --- | --- | --- |
| You provide | The rules (logic) | Examples (inputs plus answers) |
| Machine provides | Execution | The learned rules |
| Intelligence lives in | Your code | The data |
| Best when | The rule is simple and known | The rules are too numerous, fuzzy, or unknown, and you have data |
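
The contrast above can be sketched in a few lines of Python. This is a minimal illustration, not a recommended way to compute tax: `TAX_RATE`, `sales_tax`, `learn_rate`, and the example prices are all hypothetical names and values chosen for the sketch. The first function is a hand-written rule; the second recovers a rule from examples using a least-squares fit through the origin.

```python
# Traditional programming: you write the rule yourself.
TAX_RATE = 0.08  # hypothetical rate, chosen only for illustration


def sales_tax(price):
    """Hand-written rule: the logic lives in the code."""
    return price * TAX_RATE


# Machine learning: you supply examples (inputs plus answers)
# and the machine works out the rule.
def learn_rate(prices, taxes):
    """Least-squares fit of taxes = rate * prices (a line through the origin)."""
    numerator = sum(p * t for p, t in zip(prices, taxes))
    denominator = sum(p * p for p in prices)
    return numerator / denominator


example_prices = [10.0, 25.0, 40.0, 100.0]
example_taxes = [sales_tax(p) for p in example_prices]  # the "answers"

learned_rate = learn_rate(example_prices, example_taxes)
print(round(learned_rate, 4))
```

Here the learned rate only rediscovers a rule that was already known, which is why a simple, known rule is better written by hand.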

| Family | Has labels? | Goal | Sub-types |
| --- | --- | --- | --- |
| Supervised | Yes | Predict the answer for new inputs | Regression (number), Classification (category) |
| Unsupervised | No | Find structure in the data | Clustering, Dimensionality reduction |
| Reinforcement (out of scope) | No (uses reward) | Learn by trial and error | n/a here |
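
To make "find structure in the data" concrete, here is a small sketch of clustering with no labels at all. It is a simplified one-dimensional, two-group version of the k-means idea, assuming the data contains at least two distinct values. The function name `two_means` and the `spend` figures are hypothetical.

```python
def two_means(values, iterations=10):
    """Split 1-D values into two groups by repeatedly moving two centers."""
    low, high = min(values), max(values)  # starting centers
    if low == high:
        raise ValueError("need at least two distinct values")
    for _ in range(iterations):
        # Assign each value to its nearest center (ties go to the low group).
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        # Move each center to the mean of its group.
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return near_low, near_high


# Hypothetical shopper spend; note that no categories are supplied.
spend = [12, 15, 14, 90, 95, 13, 88]
small, large = two_means(spend)
print(sorted(small), sorted(large))
```

No answer column is passed in. The two groups come only from how the values sit relative to each other.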

| | Regression | Classification |
| --- | --- | --- |
| Answer type | A number | A category |
| Examples | House price, temperature, units sold | Spam / not-spam, which digit, fraud / legit |

| Step | Ask | If… |
| --- | --- | --- |
| 1 | Do I have labeled answers? | No -> unsupervised, or not machine learning |
| 2 | Is the answer a number or a category? | Number -> regression; Category -> classification |
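
The two questions can be written as a small function. This is only a sketch of the decision order; `choose_approach` and its parameters are hypothetical names.

```python
def choose_approach(has_labels, answer_type=None):
    """Walk the two questions in order.

    answer_type is "number" or "category" and is only consulted
    when has_labels is True.
    """
    # Step 1: do I have labeled answers?
    if not has_labels:
        return "unsupervised, or not machine learning"
    # Step 2: is the answer a number or a category?
    if answer_type == "number":
        return "supervised, regression"
    if answer_type == "category":
        return "supervised, classification"
    raise ValueError("answer_type must be 'number' or 'category'")


print(choose_approach(True, "number"))    # e.g. predicting a temperature
print(choose_approach(True, "category"))  # e.g. fraud or legit
print(choose_approach(False))             # e.g. grouping shoppers
```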

| Situation | Why a rule wins |
| --- | --- |
| A simple known rule works | Exact, fast, fully explainable (e.g. sales tax) |
| No data | Nothing to learn from |
| An unexplainable mistake is unacceptable | A statistical black box cannot be trusted |

| Claim | Verdict |
| --- | --- |
| “Perfect on the training data” | Proves nothing |
| “Holds up on new, unseen data” | The only test that counts |
| Model memorizes training data | Learned the noise, not the pattern |
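
A small sketch shows why a perfect training score proves nothing. The data below is made up (roughly y = 2x with a little noise), and both "models" are deliberately simple. One stores the training answers verbatim, falling back to an arbitrary 0.0 for inputs it has never seen. The other fits a straight line through the origin.

```python
# Hypothetical data: y is roughly 2 * x, plus a little noise.
train = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]
test = [(5, 10.1), (6, 11.9)]  # unseen inputs

memory = dict(train)


def memorizer(x):
    """Returns the stored answer; falls back to 0.0 for unseen inputs."""
    return memory.get(x, 0.0)


# Least-squares slope for a line through the origin.
slope = sum(x * y for x, y in train) / sum(x * x for x, _ in train)


def fitted_line(x):
    """Applies the learned pattern, so it works for any x."""
    return slope * x


def mean_abs_error(model, data):
    return sum(abs(model(x) - y) for x, y in data) / len(data)


for name, model in [("memorizer", memorizer), ("fitted line", fitted_line)]:
    print(name,
          "train:", round(mean_abs_error(model, train), 3),
          "test:", round(mean_abs_error(model, test), 3))
```

The memorizer scores zero error on the training data and fails badly on the unseen points. The fitted line makes small errors on both sets, and its error on the unseen points is the number that counts.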

| Problem | Answer |
| --- | --- |
| Predict tomorrow’s temperature | Supervised, regression |
| Is this transaction fraud? | Supervised, classification |
| Group shoppers, no categories given | Unsupervised, clustering |
| Compute sales tax | Neither (write the rule) |
| Compress 200 survey questions to a few themes | Unsupervised, dimensionality reduction |