Day 26 of 100 Days of AI

I was finally able to train a classification model using Google’s Vertex AI. When I train locally on my machine with the same training data, it finishes in a few seconds. On Vertex AI it took 1 hour 46 minutes!
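For a sense of that local baseline, here is a minimal sketch of the kind of script I mean, using scikit-learn. The file name heart.csv and the label column output are assumptions based on the Kaggle heart attack dataset, not my exact code.

```python
# Minimal local baseline: a random forest on a small tabular CSV trains
# in seconds. The file name and label column are assumptions based on
# the Kaggle heart attack dataset.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("heart.csv")                     # assumed local copy of the data
X, y = df.drop(columns=["output"]), df["output"]  # "output" assumed as the label

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)                       # seconds on a few hundred rows

print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```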

Below you can see the training budget I set (in node hours) and the resulting training performance.
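For reference, launching this kind of AutoML job from code looks roughly like the sketch below, using the google-cloud-aiplatform SDK. The project, region, bucket path, and one-node-hour budget are placeholders rather than my actual settings.

```python
# Sketch of launching an AutoML tabular classification job with the
# google-cloud-aiplatform SDK. Project, region, bucket path, and the
# node-hour budget are placeholders.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

dataset = aiplatform.TabularDataset.create(
    display_name="heart-attack-data",
    gcs_source="gs://my-bucket/heart.csv",
)

job = aiplatform.AutoMLTabularTrainingJob(
    display_name="heart-attack-automl",
    optimization_prediction_type="classification",
)

model = job.run(
    dataset=dataset,
    target_column="output",        # assumed label column
    budget_milli_node_hours=1000,  # 1 node hour: the budget knob mentioned above
)
```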

I don’t fully understand why training took so long in the cloud, but I suspect it’s because I chose AutoML, a configuration aimed at people who aren’t machine learning experts. Google’s platform therefore ran a series of automated machine learning steps that eat up a big chunk of processing time: feature engineering, testing various model architectures, and hyperparameter tuning.
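To make that cost concrete, here is a hand-rolled miniature of that kind of search in scikit-learn. Even this tiny grid triggers dozens of training runs (each parameter combination times each cross-validation fold), and AutoML explores a far larger space than this.

```python
# A tiny hand-rolled version of what AutoML automates: trying several
# model families and hyperparameters. Every grid cell is retrained once
# per cross-validation fold, so even this small search means dozens of
# training runs; AutoML explores far more, hence the hours.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data so the sketch is self-contained.
X, y = make_classification(n_samples=300, n_features=13, random_state=42)

searches = {
    "random_forest": GridSearchCV(
        RandomForestClassifier(random_state=42),
        {"n_estimators": [50, 200], "max_depth": [3, None]},
        cv=5,
    ),
    "logistic_regression": GridSearchCV(
        LogisticRegression(max_iter=1000),
        {"C": [0.1, 1.0, 10.0]},
        cv=5,
    ),
}

for name, search in searches.items():
    search.fit(X, y)
    print(name, search.best_params_, round(search.best_score_, 3))
```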

It’s possible to train a model on Vertex AI without using AutoML, but that requires significantly more configuration. For small datasets I’m likely better off just writing and running the training code locally. Still, what Vertex AI offers is massively helpful for people building software for public or private business use.
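For context, the non-AutoML route means defining a custom training job where you supply your own script and environment. A rough sketch, where the script name, container image, and machine type are all placeholder choices:

```python
# Sketch of the non-AutoML route: a custom training job where you bring
# your own training script. Every name here is a placeholder, not a
# recommended configuration.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

job = aiplatform.CustomTrainingJob(
    display_name="heart-attack-custom",
    script_path="train.py",  # your own training code, e.g. the local script above
    container_uri="us-docker.pkg.dev/vertex-ai/training/sklearn-cpu.1-0:latest",
    requirements=["pandas"],
)

job.run(
    machine_type="n1-standard-4",  # you pick and pay for the hardware
    replica_count=1,
)
```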

Here is one of the charts Google automatically creates from the classification model. I used heart attack data from Kaggle, but since Vertex AI requires at least 1,000 data points, I created fictional data by duplicating several rows. This means my model isn’t of much use, but it was a good learning exercise. (Note: in this fictional dataset, caa, the number of major vessels that appear narrowed or blocked on a fluoroscopy, is the most important feature for predicting a heart attack.)
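The padding itself is trivial; here is a sketch of how it could be done in pandas, assuming the original CSV has a few hundred rows.

```python
# Sketch of padding a small dataset past Vertex AI's 1,000-row minimum
# by duplicating rows. Fine as a learning exercise, useless for a real
# model: duplicates add no new information and inflate apparent accuracy.
import pandas as pd

df = pd.read_csv("heart.csv")  # assumed ~300-row Kaggle heart attack data
copies = -(-1000 // len(df))   # ceiling division: enough copies to clear 1,000
padded = pd.concat([df] * copies, ignore_index=True)

padded.to_csv("heart_padded.csv", index=False)
print(len(df), "rows ->", len(padded), "rows")
```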

Once the model was trained, I deployed it to an endpoint (accessible via an API), and below I tested it with a few inputs to get predictions.
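The deploy-and-predict step with the SDK looks roughly like the sketch below. The machine type and the feature values in the test instance are placeholders (the column names are assumed from the Kaggle dataset), and model is the object returned by the training job sketch above.

```python
# Sketch of deploying the trained model and requesting a prediction.
# "model" is the object returned by job.run() in the training sketch;
# machine type, column names, and values are assumptions.
endpoint = model.deploy(machine_type="n1-standard-2")

prediction = endpoint.predict(
    instances=[{
        # One test patient; AutoML tabular endpoints expect string values.
        "age": "54", "sex": "1", "cp": "0", "trtbps": "130", "chol": "250",
        "fbs": "0", "restecg": "1", "thalachh": "150", "exng": "0",
        "oldpeak": "1.4", "slp": "1", "caa": "2", "thall": "2",
    }]
)
print(prediction.predictions)  # class labels with confidence scores
```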

Key Takeaways:

  • Building and deploying models in the cloud is now easier than ever, thanks to tools like Vertex AI. However, for small datasets and quick experiments, working locally is probably best.