Day 78 of 100 Days of AI
Apple will be introducing large language models to their devices later this year, and they have a high-level write-up here on the technical details of how they built them.
The post includes detail on how they conducted pre-training, post-training, optimization, and dynamic model adaptation. Some key bits I took away from reading the post are:
- Size – The models are small enough to fit on a powerful smartphone. For example, one of Apple’s on-device models is a ~3 billion parameter language model; for comparison, Meta’s latest flagship model has 70 billion parameters. (A rough memory sketch follows this list.)
- Fine-tuned – The on-device models are fine-tuned and specialised for a set of common use cases (e.g. text summarisation, image generation, and in-app actions). Because these tasks are narrow and well defined, a supersized general-purpose model isn’t needed for them.
- Smart optimization – Apple has done a lot of smart work to make the on-device models exceptionally efficient. On an iPhone 15 Pro, they got time-to-first-token latency down to just 0.6 milliseconds per prompt token (for comparison, GPT-4 achieves 0.64 and GPT-3.5 Turbo achieves 0.27). A quick worked example of what that figure means for real prompts follows this list.
- Server-based models – For more difficult tasks, the phone can fall back to server-based models that run on “Private Cloud Compute” (a hypothetical routing sketch is below).
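To make the size point concrete, here is a back-of-envelope memory calculation for a ~3 billion parameter model. The bit-widths are illustrative assumptions for the arithmetic, not Apple’s published quantization scheme:

```python
# Rough memory footprint of a ~3B-parameter model at a few illustrative
# weight precisions. These bit-widths are assumptions for the arithmetic,
# not Apple's actual on-device configuration.

PARAMS = 3_000_000_000  # ~3B parameters, as described in Apple's post

for label, bits in [("fp16", 16), ("int8", 8), ("4-bit", 4)]:
    gib = PARAMS * bits / 8 / 2**30  # bytes of weights, converted to GiB
    print(f"{label:>5}: ~{gib:.1f} GiB of weights")

# fp16 : ~5.6 GiB
# int8 : ~2.8 GiB
# 4-bit: ~1.4 GiB
```

Even at 8-bit precision, a 3B model’s weights fit in a few gigabytes, which is why this size range is plausible for a flagship phone while a 70B model is not.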
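And to put the 0.6 ms-per-prompt-token figure in perspective, here is the prompt-processing latency it implies for a few hypothetical prompt lengths (the prompt sizes are arbitrary examples, not Apple’s test setup):

```python
# Prompt-processing time implied by a 0.6 ms-per-prompt-token
# time-to-first-token figure. Prompt lengths are arbitrary examples.

MS_PER_PROMPT_TOKEN = 0.6

for prompt_tokens in (100, 500, 2000):
    ttft_s = prompt_tokens * MS_PER_PROMPT_TOKEN / 1000  # total seconds to first token
    print(f"{prompt_tokens:>5}-token prompt -> ~{ttft_s:.2f} s to first token")

#   100-token prompt -> ~0.06 s to first token
#   500-token prompt -> ~0.30 s to first token
#  2000-token prompt -> ~1.20 s to first token
```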
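Finally, a hypothetical sketch of what the on-device/server split could look like from an app’s point of view. The task names, token limit, and routing rule are made up for illustration; Apple hasn’t published its actual routing logic:

```python
# Hypothetical routing between an on-device model and a server model.
# Task names and the token limit are invented for this example only.

ON_DEVICE_TASKS = {"summarise_text", "rewrite_message", "suggest_reply"}

def route(task: str, prompt_tokens: int, on_device_limit: int = 2048) -> str:
    """Pick an execution target: keep small, common tasks local,
    send harder or larger requests to the server."""
    if task in ON_DEVICE_TASKS and prompt_tokens <= on_device_limit:
        return "on-device ~3B model"
    return "server model on Private Cloud Compute"

print(route("summarise_text", 800))     # on-device ~3B model
print(route("summarise_text", 10_000))  # server model on Private Cloud Compute
print(route("complex_reasoning", 300))  # server model on Private Cloud Compute
```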
Here is a sample of the benchmarks Apple shared. It’s impressive that the on-device model beats other, larger models. But of course, these are Apple’s own benchmarks, and it’s possible there was some cherry-picking to produce the best numbers.
Overall, Apple has achieved promising results, and we can expect even better performance from their on-device models in the years ahead.