MLWhiz | AI Unwrapped

100x faster Hyperparameter Search Framework with Pyspark

Rahul Agarwal
Feb 22, 2020
∙ Paid


Recently I was working on tuning hyperparameters for a huge Machine Learning model.

Manual tuning was not an option, since I had too many parameters to tweak. Hyperopt was also not an option, as it works serially: only a single model is built at a time. Each model took a long time to train, and I was short on time.

I had to come up with a better, more efficient approach if I were to meet the deadline. So I thought of the one thing that helps us data scientists in many such scenarios: parallelization.

Can I parallelize my model hyperparameter search process?

As you would have guessed, the answer is Yes.

This post is about setting up a hyperparameter tuning framework for Data Science using scikit-learn/xgboost/lightgbm and PySpark.
