Day 2 of 100 Days of AI

I spent bits of today watching YouTube videos about linear regression. One of the best ones I found was this StatQuest video on the topic. I enjoyed his style so much that I ordered his introductory book on machine learning. I’m planning a digital fast over a long weekend in the second quarter of the year, and I’ll take the book with me to continue learning offline.

Today I learnt about a new concept: Overfitting.

Key takeaways today:

  • Overfitting: This is when a machine learning model fits its training data too closely, to the point that it doesn’t generalize well to data outside the training set (i.e. new data).
  • Why does overfitting happen? A few drivers:
    • Small training data set: a limited number of training examples might throw up patterns that don’t reflect the true distribution of the data.
    • High model complexity: a model with too many parameters has enough flexibility to fit noise in the training data, or patterns that are not useful.
    • Biased data: this could throw up patterns that are not generalizable across a wider population of data.
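
The first two drivers above (a small training set and high model complexity) can be seen together in a toy experiment. This is only a minimal sketch under assumptions that are not from the video: it uses NumPy, made-up synthetic data where the true relationship is the line y = 2x + 1 plus random noise, and polynomial degrees 1 and 9 picked purely for illustration. It fits both models to just 10 training points, then compares the error on the training points with the error on 200 fresh points.

```python
import numpy as np

# Fixed seed so the run is repeatable (the seed value is arbitrary).
rng = np.random.default_rng(0)

def make_data(n):
    """Hypothetical data: a straight line y = 2x + 1 plus Gaussian noise."""
    x = rng.uniform(-1, 1, n)
    y = 2 * x + 1 + rng.normal(0, 0.3, n)
    return x, y

# Small training set (10 points) and a larger set of unseen "new" data.
x_train, y_train = make_data(10)
x_test, y_test = make_data(200)

def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on the points (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

results = {}
for degree in (1, 9):
    # Degree 1 is a simple line; degree 9 has 10 coefficients, enough
    # to pass through all 10 training points, noise included.
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (
        mse(coeffs, x_train, y_train),
        mse(coeffs, x_test, y_test),
    )
    print(f"degree {degree}: train MSE = {results[degree][0]:.4f}, "
          f"test MSE = {results[degree][1]:.4f}")
```

The thing to look for is the gap between the two numbers for each model. The degree 9 polynomial should show a training error close to zero, because it has memorized the noise, while its error on the unseen points is much larger. That gap is the overfitting. The straight line usually has similar errors on both sets.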