Decision Trees

Type
Classification, Regression

Decision Trees are non-parametric supervised learning methods used for classification and regression. They use a tree-like graph of decisions to predict the value of a target variable.
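As a concrete starting point, here is a minimal sketch of fitting a tree classifier. The note does not name a library; scikit-learn and the Iris dataset are assumptions of this example.

```python
# A minimal sketch: fitting a decision tree classifier with scikit-learn
# (the library and dataset are assumptions; the note does not name either).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = DecisionTreeClassifier(random_state=0)  # defaults grow an unpruned tree
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```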

Decision Tree Algorithms

  • CART (Classification and Regression Trees)
  • C4.5 (Quinlan's successor to ID3)
  • C5.0 (an improved version of C4.5)

Key Components

  • Nodes: Represent a test on an attribute (feature).
  • Branches: Represent the outcome of the test.
  • Leaves: Represent the final class label or value (a minimal node structure is sketched after this list).
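To make the three components concrete, here is a hypothetical node structure; TreeNode and predict_one are illustrative names, not from any library.

```python
# A sketch of the components above: internal nodes test a feature,
# branches encode the test outcome, leaves hold the prediction.
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class TreeNode:
    feature: Optional[int] = None      # node: index of the attribute tested
    threshold: Optional[float] = None  # test: go left if x[feature] <= threshold
    left: Optional["TreeNode"] = None  # branch for the "<= threshold" outcome
    right: Optional["TreeNode"] = None # branch for the "> threshold" outcome
    value: Any = None                  # leaf: final class label or value

def predict_one(node: TreeNode, x) -> Any:
    """Follow branches from the root until a leaf is reached."""
    while node.value is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.value

# Example: a depth-1 tree splitting on feature 0 at 2.5.
root = TreeNode(feature=0, threshold=2.5,
                left=TreeNode(value="A"), right=TreeNode(value="B"))
print(predict_one(root, [1.0]))  # -> "A"
```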

Building the Tree (Recursive Partitioning)

The goal is to partition the data into subsets that are as "pure" as possible, i.e., dominated by a single class (classification) or similar target values (regression).

  • Information Gain: Measures the reduction in Entropy (disorder) after a split. Higher gain is better.
  • Gini Impurity: A measure of how often a randomly chosen element would be incorrectly labeled. Used frequently in the CART algorithm.
  • Splitting Criteria: At each node, the algorithm selects the feature and split point that maximize purity in the resulting child nodes (see the sketch after this list).
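These measures can be written directly from their definitions. The sketch below uses NumPy; the function names are illustrative.

```python
# A sketch of the impurity measures above; names are illustrative.
import numpy as np

def entropy(y):
    """Shannon entropy: -sum(p * log2(p)) over class proportions p."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def gini(y):
    """Gini impurity: 1 - sum(p^2), the chance a random label guess
    drawn from the node's distribution is wrong."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def information_gain(y, y_left, y_right):
    """Parent entropy minus the size-weighted entropy of the children."""
    n = len(y)
    child = (len(y_left) / n) * entropy(y_left) + (len(y_right) / n) * entropy(y_right)
    return entropy(y) - child

# Example: a perfectly pure split of [0, 0, 1, 1] gains the full 1 bit.
y = np.array([0, 0, 1, 1])
print(information_gain(y, y[:2], y[2:]))  # -> 1.0
```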

Overfitting and Pruning

Decision trees overfit easily by growing too deep and fitting noise in the training data. Two pruning strategies counter this:

  • Pre-pruning: Stop growing the tree early (e.g., limit depth or minimum samples per leaf).
  • Post-pruning: Grow the full tree, then remove nodes that add little predictive power (both strategies are sketched below).
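Both strategies can be sketched with scikit-learn (an assumed choice): pre-pruning via constructor constraints, post-pruning via minimal cost-complexity pruning (ccp_alpha).

```python
# A sketch of both pruning styles; scikit-learn and the dataset are assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Pre-pruning: constrain growth up front.
pre = DecisionTreeClassifier(max_depth=4, min_samples_leaf=10, random_state=0)
pre.fit(X_train, y_train)

# Post-pruning: compute the cost-complexity pruning path of the full tree,
# then refit with a chosen alpha (here mid-range for illustration;
# in practice pick it by cross-validation).
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_train, y_train)
alpha = path.ccp_alphas[len(path.ccp_alphas) // 2]
post = DecisionTreeClassifier(ccp_alpha=alpha, random_state=0).fit(X_train, y_train)

print("pre-pruned accuracy: ", pre.score(X_test, y_test))
print("post-pruned accuracy:", post.score(X_test, y_test))
```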
Decision Trees vs. Boosting

| Feature          | Decision Trees              | Boosting                           |
|------------------|-----------------------------|------------------------------------|
| Model type       | Single base learner         | Ensemble of learners               |
| Training         | Fast, recursive             | Sequential, slower                 |
| Interpretability | Easy to interpret           | High predictive power, "black box" |
| Variance         | High (prone to overfitting) | Reduced via iterative refinement   |
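As a rough illustration of the trade-off, the sketch below cross-validates a single tree against a boosted ensemble on the same data. GradientBoostingClassifier is one common boosting implementation, chosen here as an assumption; this is not a benchmark.

```python
# A sketch contrasting a single tree with a boosted ensemble of trees.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

tree = DecisionTreeClassifier(random_state=0)        # single base learner
boost = GradientBoostingClassifier(random_state=0)   # sequential ensemble

print("single tree CV accuracy:", cross_val_score(tree, X, y, cv=5).mean())
print("boosting    CV accuracy:", cross_val_score(boost, X, y, cv=5).mean())
```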