Decision trees partition the feature space into rectangles and fit a simple model (a constant) in each one.
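The partition-and-fit idea can be sketched for the simplest case: one feature and one split, giving two regions with a constant in each. This is a minimal illustration, not a full tree-growing algorithm. The names `fit_stump` and `predict_stump` and the toy data are assumptions made for this example, and squared error is assumed as the split criterion.

```python
from statistics import mean

def fit_stump(xs, ys):
    """Find the single split on a 1-D feature that minimizes squared error,
    fitting a constant (the mean of y) on each side of the split."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        c_left, c_right = mean(left), mean(right)
        sse = (sum((y - c_left) ** 2 for y in left)
               + sum((y - c_right) ** 2 for y in right))
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2  # midpoint between neighbours
        if best is None or sse < best[0]:
            best = (sse, threshold, c_left, c_right)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1:]  # (threshold, left constant, right constant)

def predict_stump(stump, x):
    """Return the constant of the region that x falls into."""
    threshold, c_left, c_right = stump
    return c_left if x <= threshold else c_right

# Toy data (hypothetical): two clusters with different mean responses.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
stump = fit_stump(xs, ys)
print(stump)
```

A deeper tree would apply the same search recursively inside each region, and with several features it would also search over which feature to split on.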
Advantages:
• Interpretability: a tree defines a set of rules that are easy to follow.
• Handles missing data.
• Can take both continuous and categorical variables as input.
Drawbacks:
• Deep trees have low bias but high variance.
• Small trees have low variance but high bias.
• Instability: a small change in the training data might completely change the shape of the tree.
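Bagging (bootstrap aggregation): draw B bootstrap samples of the data (each sampled with replacement from the training set) and build a model on each bootstrap sample.
• Idea: the empirical distribution mimics the distribution that X is drawn from.
• Average the predictions (regression) or take a majority vote (classification).
Advantages:
• Handles missing data.
• Can take both continuous and categorical variables as input.
• Averages out noise (lowers variance).
Drawbacks:
• Does nothing to the bias.
• Variance reduction is limited by the correlation between trees: with per-tree variance σ² and pairwise correlation ρ, the variance of the average is ρσ² + (1 − ρ)σ²/B, which cannot drop below ρσ² however large B is.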
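The resample, fit, and aggregate steps can be sketched as follows. This is a minimal illustration for regression, assuming a one-split tree (a stump) on a single feature as the base model. The names `fit_stump`, `bagged_stumps`, and `predict_bagged`, the noisy step-function data, and the choice B = 25 are all assumptions made for this example.

```python
import random
from statistics import mean

def fit_stump(xs, ys):
    """Base learner: best single split on a 1-D feature, constant on each side.
    Falls back to one global constant if the sample has no valid split."""
    pairs = sorted(zip(xs, ys))
    best = (float("inf"), float("inf"), mean(ys), mean(ys))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # equal x values cannot be separated
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        c_l, c_r = mean(left), mean(right)
        sse = sum((y - c_l) ** 2 for y in left) + sum((y - c_r) ** 2 for y in right)
        if sse < best[0]:
            best = (sse, (pairs[i - 1][0] + pairs[i][0]) / 2, c_l, c_r)
    return best[1:]  # (threshold, left constant, right constant)

def predict_stump(stump, x):
    threshold, c_l, c_r = stump
    return c_l if x <= threshold else c_r

def bagged_stumps(xs, ys, B, rng):
    """Fit one stump per bootstrap sample (n draws with replacement)."""
    n = len(xs)
    models = []
    for _ in range(B):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap indices
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def predict_bagged(models, x):
    """Regression: average the B predictions.
    For classification one would take a majority vote instead."""
    return mean(predict_stump(m, x) for m in models)

# Hypothetical data: a noisy step function with a jump at x = 2.
rng = random.Random(0)
xs = [i / 10 for i in range(40)]
ys = [(0.0 if x < 2 else 1.0) + rng.gauss(0, 0.1) for x in xs]

B = 25
models = bagged_stumps(xs, ys, B, rng)
print(predict_bagged(models, 0.5), predict_bagged(models, 3.5))
```

Because every stump is fitted to a resample of the same training set, the B models are correlated, which is why averaging lowers the variance only up to a point and leaves the bias of the base model unchanged.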


