Day 49 of 100 Days of AI

This morning I made a start on learning about “regularization” in machine learning.

When we train sophisticated machine learning models, they sometimes “overfit” the training data. In the example below from Andrew Ng, the third chart on the far right has a blue line (the model) that fits the training data perfectly. However, it has “overfit” the data, which is to say it follows the training points (noise included) so closely that it loses the ability to make good predictions about new data points it has not seen before.

Meanwhile, the first chart on the far left has a model that underfits our training data. It is too simple to capture the underlying trend, so it might not be so good at making predictions about new data points either.
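To make the three cases concrete for myself, here is a minimal sketch in Python with NumPy. Everything in it is my own assumption rather than the data behind the charts: I generate synthetic points with a quadratic trend plus a little noise, then fit polynomials of degree 1, 2, and 9 as stand-ins for the underfit, “just right”, and overfit models.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

def true_fn(x):
    # Assumed underlying trend (hypothetical, chosen for illustration)
    return x ** 2

# Ten noisy training points
x_train = np.linspace(-1.0, 1.0, 10)
y_train = true_fn(x_train) + rng.normal(0.0, 0.1, size=x_train.shape)

# A dense, noise-free grid standing in for "new data points"
x_test = np.linspace(-1.0, 1.0, 200)
y_test = true_fn(x_test)

def mse(y_true, y_pred):
    # Mean squared error between targets and predictions
    return float(np.mean((y_true - y_pred) ** 2))

results = {}
for degree in (1, 2, 9):
    # Least-squares polynomial fit of the given degree
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = mse(y_train, np.polyval(coeffs, x_train))
    test_err = mse(y_test, np.polyval(coeffs, x_test))
    results[degree] = (train_err, test_err)
    print(f"degree {degree}: train MSE = {train_err:.2e}, test MSE = {test_err:.2e}")
```

The training error can only go down as the degree goes up, and with ten points a degree-9 polynomial passes through every one of them. The number to watch is the error on the new points: a model that wins on training error is not necessarily the one that predicts best.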

Regularization is a collection of techniques that can help us get a model that’s “just right”, like the middle chart in the example above. It discourages overly complex models, for example by penalizing large parameter values during training, and so helps us deal with the overfitting problem. More on this in the next few days.
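In the meantime, here is a rough sketch of one common regularization technique, L2 regularization (also known as ridge regression), applied to a deliberately over-flexible degree-9 polynomial. The synthetic data, the helper name `ridge_fit`, and the penalty strengths are all assumptions I picked for illustration. The idea is to minimize the squared error plus `lam` times the sum of squared weights, which has the closed-form solution used below.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Synthetic data: quadratic trend plus noise (an assumption for illustration)
x = np.linspace(-1.0, 1.0, 10)
y = x ** 2 + rng.normal(0.0, 0.1, size=x.shape)

# Degree-9 polynomial features: columns are 1, x, x^2, ..., x^9
degree = 9
X = np.vander(x, degree + 1, increasing=True)

def ridge_fit(X, y, lam):
    # Closed-form ridge solution: solve (X^T X + lam * I) w = X^T y
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0  # leave the intercept term unpenalized
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

fits = {}
for lam in (0.0, 0.01, 1.0):
    w = ridge_fit(X, y, lam)
    train_mse = float(np.mean((X @ w - y) ** 2))
    weight_norm = float(np.linalg.norm(w[1:]))  # size of the penalized weights
    fits[lam] = (train_mse, weight_norm)
    print(f"lam = {lam}: train MSE = {train_mse:.2e}, weight norm = {weight_norm:.2f}")
```

As `lam` grows, the weights shrink and the fit to the training points gets a little worse. That is the trade we are making on purpose: giving up a perfect fit to the training data in exchange for a smoother, simpler curve. Choosing a good value for `lam` is a separate problem that this sketch does not address.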