Day 80 of 100 Days of AI

I did two things today.

First, I got a YouTube summarizer to work. I followed a simple tutorial here. I will create an agent tool out of this, and also try to build a RAG process around it.
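Roughly, the flow looks like this. This is a minimal sketch assuming the youtube-transcript-api and openai Python packages (the tutorial's exact stack may differ, and the model name here is just an example):

```python
# Sketch of a YouTube summarizer: fetch the transcript, then ask an LLM to summarize it.
# Assumes: pip install youtube-transcript-api openai, and OPENAI_API_KEY in the environment.
from youtube_transcript_api import YouTubeTranscriptApi
from openai import OpenAI

def summarize_video(video_id: str) -> str:
    # Pull the video's transcript as a list of {"text", "start", "duration"} segments
    # (API shown is the classic get_transcript call; newer package versions differ).
    segments = YouTubeTranscriptApi.get_transcript(video_id)
    transcript = " ".join(seg["text"] for seg in segments)

    # Ask an LLM for a short summary of the full transcript.
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # example model choice
        messages=[
            {"role": "system", "content": "Summarize this transcript in a few bullet points."},
            {"role": "user", "content": transcript},
        ],
    )
    return response.choices[0].message.content

print(summarize_video("VIDEO_ID"))  # replace with a real YouTube video ID
```

Wrapping that function as an agent tool, and chunking the transcript into a vector store for RAG, is the next step.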

Second, I watched this lecture on “The Future of AI from the History of Transformer.” It’s by Hyung Won Chung, a research scientist at OpenAI who previously worked at Google Brain.

The key points of the talk stem from this chart in the presentation.

The dominant force driving progress in AI today is cheap computing power. The cost of computation is falling exponentially!

This force is so powerful that it reduces the need for overly complex AI algorithms. You can scale up models with cheaper compute and more data and get excellent results even with simpler modelling methods that don’t rely on hand-crafted assumptions or strong inductive biases.

The practical implication: since abundant cheap compute lets simpler AI architectures outperform their more complex counterparts, researchers should ride this trend rather than try to be too clever.

For example, decoder-only models like GPT-3 have outperformed Google’s encoder-decoder T5 models. This isn’t to say that architectures built on lots of inductive assumptions should be discarded. Rather, pruning those assumptions for simplicity, and in turn more generalisability, can be a powerful technique if you have the compute to train models on much more data.
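To make the “fewer assumptions” idea concrete, here’s a tiny NumPy sketch (my own illustration, not from the lecture) of the attention patterns each architecture commits to. An encoder-decoder model hard-codes three separate patterns, splitting the sequence into a distinct “input” and “output”, while a decoder-only model commits to just one causal mask over the whole sequence:

```python
import numpy as np

def causal_mask(n: int) -> np.ndarray:
    # Each position may attend only to itself and earlier positions.
    return np.tril(np.ones((n, n), dtype=bool))

# Decoder-only (GPT-style): one causal mask over the entire sequence.
seq_len = 6
decoder_only = causal_mask(seq_len)

# Encoder-decoder (T5-style): three distinct attention patterns, i.e. more
# built-in structure about which tokens are input and which are output.
src_len, tgt_len = 4, 2
encoder_self = np.ones((src_len, src_len), dtype=bool)     # bidirectional over the input
decoder_self = causal_mask(tgt_len)                        # causal over the output
cross_attention = np.ones((tgt_len, src_len), dtype=bool)  # output attends to all input

print(decoder_only.astype(int))
```

That single causal mask is essentially the only structural assumption the decoder-only model makes, which is part of why it scales so cleanly.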

So, as compute gets cheaper and more abundant, focusing on scalable models with fewer built-in biases becomes increasingly important. This approach not only takes advantage of the falling cost of computation today, but also positions our models to benefit from even cheaper compute in the future!