Up until now I've been manually adjusting hyperparameters on my ML models until I'm happy with the result. But now that I've discovered how to automate it, I'm never going back.
In this post I'll show you how to run automated hyperparameter tuning on AI Platform via XGBoost model code packaged in a custom container. If you haven't hit buzzword capacity yet, read on.
You might be wondering: what is a hyperparameter? And why do you need to tune it? Good question! Before I explain how to automate the tuning, let me quickly explain what hyperparameters are.
Hyperparameter tuning: the basics
When you're building machine learning models, there are all sorts of configuration options you need to set before you start training. Things like the number of data points your model will process at once, the size of each layer, the learning rate that controls how much your model adjusts its weights, and the number of times it should iterate over all of your data. There are some rough guidelines on how to set these numbers (start with a few small layers and build from there) but sometimes the process of setting them can seem arbitrary. These values are called hyperparameters.
Confession time: my typical workflow thus far has involved choosing hyperparameters that worked for someone else's model on a similar task and then haphazardly adjusting those values up and down depending on the results until I was happy with the accuracy (and of course, the results of my interpretability analysis). This approach works, but it's fairly tedious and there may be a combination of hyperparameters I never discover.
The tools
Automated hyperparameter tuning to the rescue. Here are the tools I'll be using to show you how this works:
- Dataset: I'll train a model using a subset of the NOAA weather data in BigQuery public datasets.
- ML framework: To keep the model code short and sweet, I'll build it with XGBoost.
- Containers: I'll be containerizing the model code with Docker and hosting it on Google Container Registry.
- Training & tuning: Finally, for training and hyperparameter tuning I'm using Cloud AI Platform, specifically making use of custom container training and automated hyperparameter tuning.
Will it rain in the UK today?
I've been visiting the UK a lot recently (it's one of my favorite places) and I thought it would be fun to see if I could predict whether it would rain on a given day based on historical weather data. There's an awesome public weather dataset available in BigQuery with weather data from all over the world going back to 1932.
To grab all weather data from the UK (filtered by lat/lng) I ran the following SQL query:
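The exact query isn't embedded in this version of the post, but it looked roughly like this. Treat it as a sketch: the bounding box, the feature columns, and the `will_rain` label below are my own guesses at how to pull UK rows from the GSOD tables.

```sql
-- Sketch: daily GSOD readings from stations inside a rough UK bounding box,
-- with a 0/1 label for whether rain was observed that day.
SELECT
  gsod.temp,
  gsod.dewp,
  gsod.slp,
  gsod.visib,
  gsod.wdsp,
  gsod.prcp,
  IF(gsod.rain_drizzle = '1', 1, 0) AS will_rain
FROM
  `bigquery-public-data.noaa_gsod.gsod*` AS gsod
JOIN
  `bigquery-public-data.noaa_gsod.stations` AS stations
ON
  gsod.stn = stations.usaf AND gsod.wban = stations.wban
WHERE
  stations.lat BETWEEN 49.0 AND 59.0
  AND stations.lon BETWEEN -8.0 AND 2.0
```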
I did some additional filtering to remove null values and the result is 3,881 rows of data. Here's a preview:
With that we're ready to build the model. And because this post doesn't have any pictures, here's one of me in the UK on a day that couldn't decide if it was sunny or rainy:
Building the XGBoost model
Before we containerize things, let's set up the data and write the XGBoost model code. First we'll read the data in as a Pandas DataFrame and split it into train and test sets:
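The original snippet isn't shown here, so this is a minimal sketch of that step. I'm assuming the query results were exported to a `rain_uk.csv` file with a `will_rain` label column, matching the query sketch above.

```python
import os

import pandas as pd
from sklearn.model_selection import train_test_split

# Build the path from os.getcwd() so the same code works inside the container.
data = pd.read_csv(os.path.join(os.getcwd(), 'rain_uk.csv'))

# will_rain is our label: 1 if it rained that day, 0 if it didn't.
labels = data['will_rain']
features = data.drop(columns=['will_rain'])

x_train, x_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=42)
```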
I used `os.getcwd()` to get the filepath so that our script will be able to access it from within the container (which we'll create soon). Our model will be a binary classifier, predicting 0 for no rain and 1 for rain.
The code to create our `XGBClassifier` and train it is simple. There are a lot of optional parameters we could pass in, but for now we'll use the defaults (we'll use hyperparameter tuning magic later to find the best values):
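Something like the following. The `eval_set` and `eval_metric` aren't required; I've added them here (my own addition, not necessarily what the original code did) so we can read a per-round error value after training.

```python
from xgboost import XGBClassifier

# Stick with the default hyperparameters for now -- hypertune will search for
# better values later. eval_metric='error' tracks classification error on the
# test set after each boosting round.
model = XGBClassifier(eval_metric='error')
model.fit(x_train, y_train, eval_set=[(x_test, y_test)], verbose=False)
```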
Then we can print the model's accuracy as a percentage, along with the error value for the last training epoch:
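Here's one way to do that, using the `eval_set` from the previous block (the exact metric the original code computed isn't shown in this version of the post):

```python
from sklearn.metrics import accuracy_score

# Accuracy on the held-out test set, printed as a percentage.
y_pred = model.predict(x_test)
acc = accuracy_score(y_test, y_pred)
print('Accuracy: %.2f%%' % (acc * 100.0))

# Classification error from the final boosting round on the eval_set --
# this is the value we'll report to hypertune later.
error = model.evals_result()['validation_0']['error'][-1]
print('Error: %f' % error)
```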
On the first attempt at training this I got an accuracy of 86% (your exact % may vary) - not bad. The last thing we'll want to do is save our model to Google Cloud Storage. First, use `argparse` to pass our Cloud Storage bucket as a command line argument:
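Something along these lines (the `--bucket` flag name is my placeholder):

```python
import argparse


def get_args():
    """Parse command line arguments passed to the trainer."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '--bucket',
        type=str,
        required=True,
        help='Cloud Storage bucket to write the trained model to')
    return parser.parse_args()
```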
And then invoke `subprocess` to save the model file via `gsutil`:
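A sketch, assuming the `args.bucket` value from `get_args()` above and that `gsutil` is available wherever the script runs:

```python
import subprocess

# Save the trained booster locally, then copy it up to Cloud Storage.
model.save_model('model.bst')
subprocess.check_call(
    ['gsutil', 'cp', 'model.bst', 'gs://{}/model.bst'.format(args.bucket)])
```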
Using Cloud hyperparameter tuning
Time to add the hyperparameter tuning code! For that we'll be using the hypertune package made available by Google Cloud. To use it there are a few steps:
- Choose the hyperparameters you want to tune
- Create a `config.yaml` file with the guidelines for tuning those parameters
- Add some `hypertune` code to your model script to use those hyperparameters in your training
- Submit the training job to Cloud AI Platform
Which parameters will we tune?
You can see the full list of parameters for `XGBClassifier` in the docs. We'll experiment with two:

- `max_depth`: the maximum tree depth for our model, default is 3 (remember XGB is a boosted tree framework, more on that here)
- `learning_rate`: how much our model's weights will change every time they're updated, default is 0.1
Now that we've chosen the hyperparameters we want Cloud to find the optimal values for, we can create a config file with some guidelines for tuning. Remember above when we calculated the `error` of our trained XGB model? That's a number we want to minimize. We'll pass this to the hypertune service, so it has a metric for judging the success of each training run (called a trial).
Here's what our config file looks like:
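The file itself isn't embedded in this version of the post, but an AI Platform hyperparameter config for these two parameters looks roughly like this. The trial counts and search ranges are my own guesses, not necessarily the originals:

```yaml
trainingInput:
  hyperparameters:
    goal: MINIMIZE
    hyperparameterMetricTag: error
    maxTrials: 10
    maxParallelTrials: 3
    params:
      - parameterName: max_depth
        type: INTEGER
        minValue: 3
        maxValue: 10
        scaleType: UNIT_LINEAR_SCALE
      - parameterName: learning_rate
        type: DOUBLE
        minValue: 0.01
        maxValue: 0.5
        scaleType: UNIT_LOG_SCALE
```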
It's pretty easy to understand, but here's a bit more detail on the `maxParallelTrials` config: `hypertune` can run multiple trials in parallel. The benefit of this is that your job will finish faster, but the service can only optimize values based on data from completed trials, so this is a tradeoff you'll want to consider.
Adding hypertune code
To add our hyperparameters to our model code, we'll parse them as command line arguments by adding them to the `get_args()` function defined above:
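Extending the earlier `get_args()` sketch, that might look like:

```python
def get_args():
    """Parse command line arguments, including the tunable hyperparameters."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '--bucket', type=str, required=True,
        help='Cloud Storage bucket to write the trained model to')
    # These names must match the parameterName entries in config.yaml --
    # the tuning service passes a value for each one on every trial.
    parser.add_argument(
        '--max_depth', type=int, default=3,
        help='Maximum tree depth for the XGBoost model')
    parser.add_argument(
        '--learning_rate', type=float, default=0.1,
        help='How much the model weights change on each update')
    return parser.parse_args()
```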
Note that the names of the parameters here correspond with what we've defined in our config file. We'll call this function at the beginning of our main method:
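For example:

```python
def main():
    # Grab the bucket flag plus the hyperparameter values for this trial.
    args = get_args()
    # ...load data, train, evaluate, report to hypertune, save the model...


if __name__ == '__main__':
    main()
```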
Next we want to send the hypertune metrics to Cloud. We'll add some code to do that after training and calculating evaluation metrics:
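With the cloudml-hypertune package (`pip install cloudml-hypertune`) that looks like this:

```python
import hypertune

# Report this trial's final error so the tuning service can compare runs.
hpt = hypertune.HyperTune()
hpt.report_hyperparameter_tuning_metric(
    hyperparameter_metric_tag='error',  # must match hyperparameterMetricTag in config.yaml
    metric_value=error,
    global_step=1)
```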
Since you don't need to set the number of epochs for XGBoost, I've set the `global_step` to 1. The `hyperparameter_metric_tag` corresponds to our config file.
Finally, add the args created above to your model training code:
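Which just means swapping the defaults for the parsed values:

```python
# Use the values hypertune chose for this trial instead of the defaults.
model = XGBClassifier(
    max_depth=args.max_depth,
    learning_rate=args.learning_rate,
    eval_metric='error')
model.fit(x_train, y_train, eval_set=[(x_test, y_test)], verbose=False)
```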
When we kick off our hyperparameter training job, it'll change `max_depth` and `learning_rate` for each training trial.
Time to containerize
I decided to containerize the model code to try out the custom container training feature on Cloud AI Platform (also because I've never played with Docker and this seemed like a good excuse). Learning all these new tools was surprisingly frictionless.
You don't have to use a custom container to train an XGBoost model on AI Platform, but I've used it here to show you how it works. And the cool thing about custom containers is that you can write an ML pipeline using whatever framework you'd like and it'll train on AI Platform. The possibilities are endless.
So far we've created a single Python file with our model code and `config.yaml` with our hypertune configs. We'll now package up the code:
- Dockerfile (we'll write this next)
- config.yaml
- trainer/
  - model.py
  - rain_uk.csv (our data file)
Writing the Dockerfile
In our Dockerfile we'll include all the commands needed to run our image. It'll install all the libraries we're using and set up the entry point for our training code.
You can find the Dockerfile I used for this here.
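Since the linked file isn't reproduced here, this is a rough sketch of what such a Dockerfile can look like. The base image and library list are my assumptions; the linked version is the one actually used.

```dockerfile
FROM python:3.7

# Libraries the trainer needs. Note: if model.py shells out to gsutil to upload
# the model, the image also needs the Cloud SDK -- see the linked Dockerfile.
RUN pip install pandas scikit-learn xgboost cloudml-hypertune

# Copy the training code and data into the image.
COPY trainer/ /trainer/
WORKDIR /trainer

# AI Platform appends the per-trial hyperparameter flags to this entrypoint.
ENTRYPOINT ["python", "model.py"]
```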
Pushing to Google Container Registry
To use our container we need to first build it. If you haven't already, you'll want to run `gcloud auth configure-docker` to connect gcloud with your Docker credentials.
Then you'll build your container, passing it the URL of your image in Google Container Registry (details on the format below):
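The image URL follows the format gcr.io/YOUR_PROJECT_ID/YOUR_REPO_NAME:YOUR_TAG. For example (the repo name and tag below are placeholders):

```bash
export PROJECT_ID=$(gcloud config list project --format "value(core.project)")
export IMAGE_URI=gcr.io/$PROJECT_ID/xgboost_rain:v1

docker build -f Dockerfile -t $IMAGE_URI ./
```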
You can choose whatever you'd like for the image repo name and tag. Next, test the container by running it locally:
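Anything after the image name is passed through to the trainer's entrypoint, so this is also where the `--bucket` flag goes (the bucket name below is a placeholder):

```bash
docker run $IMAGE_URI --bucket=YOUR_BUCKET_NAME
```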
And push it to Container Registry:
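Which is just:

```bash
docker push $IMAGE_URI
```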
With that we've set up everything we need to start training (and automated tuning, obvs).
Let the automated hyperparameter party begin
You can kick off the training job with a single gcloud command. Update this with the name of your training job (can be anything you want), the region you'd like to run it in, your image URI, and the storage bucket to write model artifacts to:
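Roughly like this; the job name, region, and bucket are placeholders, and depending on your gcloud version custom container training may live under `gcloud beta ai-platform` instead:

```bash
export JOB_NAME=xgb_rain_hptune_$(date +%Y%m%d_%H%M%S)
export REGION=us-central1

gcloud ai-platform jobs submit training $JOB_NAME \
  --region=$REGION \
  --master-image-uri=$IMAGE_URI \
  --config=config.yaml \
  -- \
  --bucket=YOUR_BUCKET_NAME
```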
Next head over to the AI Platform > Jobs section of your console and select the job you just kicked off. In this job details view, you should see a table where your hypertune summary will be logged as trials complete:
You can refresh this page as training continues to see the results of the latest trial. In the screenshot above I'm sorting by lowest error (the thing I'm trying to minimize). You can also click View logs to monitor your job as it trains.
Looks like my best trial was #4, using a `max_depth` of 10 and a `learning_rate` of 0.362. I updated my model to use those parameters, then re-ran training and accuracy increased to 90% - quite an improvement! There are obviously many more hyperparameters I could tune, which might improve my accuracy even more.
It would have taken ages for me to find this combination of hyperparams on my own, which is the best part of automated hyperparameter tuning. There's also a lot of magic happening under the hood in how hypertune updates your hyperparameters after each trial. It does this using Bayesian optimization, which I won't pretend to be an expert in, so instead I'll just link you to this blog post.
Learn more
If you've made it this far in the post, tweet me "I'm hyped about hypertune!" In all seriousness, here are some useful resources if you want to start using these tools on your own models:
- Guide to custom containers on AI Platform
- Guide to hyperparameter tuning on AI Platform
- Weather dataset in BigQuery
I'm always open to feedback and suggestions for future posts (and cool datasets, because there are never enough). Let me know what you think on Twitter at @SRobTweets.