AutoML with AutoGluon#
Purpose#
Machine learning has become a powerful tool for solving real-world problems. However, building effective ML models traditionally requires deep expertise in data preprocessing, feature engineering, model selection, and hyperparameter tuning. This complexity can be a barrier for many developers, data scientists, and domain experts. Automated Machine Learning (AutoML) aims to simplify and accelerate the ML workflow by automating the most time-consuming and technically demanding tasks. With AutoML, you can:
Reduce the need for manual trial-and-error in model selection and tuning.
Achieve competitive performance with minimal effort.
Focus more on solving business problems rather than technical implementation.
Make ML accessible to non-experts.
AutoGluon is an open-source AutoML toolkit developed by Amazon Web Services (AWS). It automates many steps in the machine learning pipeline, including preprocessing, model selection, hyperparameter tuning, and ensembling. It’s designed to be easy to use and powerful, even with minimal code. For example, if you’re working with tabular data, you can train a model and make predictions on new data with just three lines of code.
from autogluon.tabular import TabularPredictor
predictor = TabularPredictor(label='target_column').fit(train_data)
predictions = predictor.predict(test_data)
Whether you’re a beginner looking to get started with ML or an experienced practitioner seeking to streamline your workflow, AutoGluon provides a robust and user-friendly solution.
To demonstrate how AutoML can be applied to semiconductor testing, this tutorial uses the prediction of Minimum Operating Voltage (vMin) as an example. The term vMin refers to the lowest voltage at which an integrated circuit operates reliably. A lower vMin reduces power consumption and extends device lifespan. Traditionally, a vMin search starts from an initial voltage and gradually decreases it until the minimum stable point is found, which makes the process time-consuming. Using AutoML to predict vMin reduces test time and ATE usage, enabling faster and more efficient semiconductor testing.
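For intuition, the sketch below contrasts the traditional step-down search with the ML-based approach. The function device_passes_at(), the start voltage, and the step size are hypothetical placeholders used only for illustration; they are not part of this tutorial's test flow.
# Hypothetical illustration of the traditional step-down vMin search.
# device_passes_at(), v_start, and v_step are made-up placeholders for illustration only.
def traditional_vmin_search(device_passes_at, v_start=0.9, v_step=0.005):
    v = v_start
    # Keep lowering the voltage while the device still passes at the next step down
    while v - v_step > 0 and device_passes_at(v - v_step):
        v -= v_step
    return v  # lowest voltage at which the device still passed

# An ML-based flow instead predicts vMin from existing test features in a single
# call (predictor.predict(features)), avoiding the repeated measurements above.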
About this tutorial#
In this tutorial, you will learn how to do AutoML with AutoGluon in three steps:
Installation
Model training, prediction and evaluation
Using model in ACS RTDI
Compatibility#
Ubuntu 20.04 / SmarTest 8 / RHEL79 / Nexus 3.1.0 / Edge 3.4.0-prod / Unified Server 2.3.0
Procedure#
Installation#
Ensure you have the following prepared:
An ACS RTDI virtual environment containing an Ubuntu Server VM.
Then run the following commands to install the required environment.
# 1.Download artifacts
cd ~
rm -rf jupyter_lab_autogluon
curl http://10.44.5.139/jupyter_lab_autogluon_1.0.0.tar.gz -O
mkdir -p ~/jupyter_lab_autogluon && tar -zxf ./jupyter_lab_autogluon_1.0.0.tar.gz -C ~/jupyter_lab_autogluon
mv -f jupyter_lab_autogluon_1.0.0.tar.gz jupyter_lab_autogluon
# 2.Install required system packages
sudo apt update -y
sudo apt install -y make build-essential libssl-dev zlib1g-dev \
libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm \
libncurses5-dev libncursesw5-dev xz-utils tk-dev libffi-dev \
liblzma-dev python3-openssl git
# 3.Install Python 3.10.12 and create virtual environment
PYTHON_VERSION="3.10.12"
PYTHON_DIR="$HOME/.local/python310"
VENV_DIR="$HOME/jupyter310_env"
if [ ! -d "$PYTHON_DIR" ]; then
wget "https://www.python.org/ftp/python/$PYTHON_VERSION/Python-$PYTHON_VERSION.tgz"
tar -xzf "Python-$PYTHON_VERSION.tgz"
cd "Python-$PYTHON_VERSION" || exit
./configure --prefix="$PYTHON_DIR" --enable-optimizations
make -j $(nproc)
make install
cd .. || exit
rm -rf "Python-$PYTHON_VERSION" "Python-$PYTHON_VERSION.tgz"
else
echo "Python 3.10.12 already installed, skipping..."
fi
if [ ! -d "$VENV_DIR" ]; then
"$PYTHON_DIR/bin/python3.10" -m venv "$VENV_DIR"
else
echo "Virtual environment already exists, skipping..."
fi
# 4.Install JupyterLab
source "$VENV_DIR/bin/activate"
pip install --upgrade pip -q
pip install jupyterlab
jupyter lab --generate-config -y
echo "c.ServerApp.port = 8890" >> "$HOME/.jupyter/jupyter_lab_config.py"
deactivate
Install AutoGluon and its dependencies in the virtual environment (the leading “!” is for running the commands from a Jupyter notebook cell; omit it when running from a shell):
!python -m pip install --upgrade pip
!python -m pip install autogluon
Model training, prediction and evaluation#
Start JupyterLab:
source "$HOME/jupyter310_env/bin/activate"
jupyter lab --ip=0.0.0.0 --port=8890 ~/jupyter_lab_autogluon/jupyter_lab_autogluon_1.0.0/autogluon-example.ipynb
Loading Data#
Load the dataset required for model training and prediction. The target variable for prediction is the “vMin” column.
import pandas as pd
# 1. Obtain the dataset
train_data = pd.read_csv('data/sample.train.csv', keep_default_na=True, na_values=["", " ", "NaN"])
test_data = pd.read_csv('data/sample.test.csv', keep_default_na=True, na_values=["", " ", "NaN"])
# 2. The target for prediction is "vMin" column
label = 'vMin'
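Before training, it can help to sanity-check the loaded data. The cell below is an optional step, not part of the original example; it only inspects the DataFrames loaded above.
# Optional sanity check of the loaded data (not required for training)
print(train_data.shape, test_data.shape)                              # rows and columns in each split
print(train_data[label].describe())                                   # distribution of the vMin target
print(train_data.isna().sum().sort_values(ascending=False).head(10))  # columns with the most missing values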
Training#
AutoGluon generates high-performance predictive models by automatically preprocessing data (including handling missing values, encoding categorical features, etc.), selecting appropriate base models (such as LightGBM, XGBoost, etc.), optimizing hyperparameters, and ensembling multiple models.
from autogluon.tabular import TabularPredictor

# Train the model
predictor = TabularPredictor(label=label).fit(train_data)
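The single fit() call above uses AutoGluon's defaults. If you want more control, TabularPredictor and fit() also accept options such as an explicit evaluation metric, a quality preset, and a time budget. The values below are illustrative, not tuned for this dataset, and the output path is a hypothetical example.
from autogluon.tabular import TabularPredictor

# Illustrative settings; adjust the metric, preset, and time budget for your own data
predictor = TabularPredictor(
    label=label,
    eval_metric='root_mean_squared_error',  # metric used to select and ensemble models
    path='AutogluonModels/vmin-demo',       # hypothetical output directory for the trained models
).fit(
    train_data,
    presets='best_quality',  # trades longer training time for higher accuracy
    time_limit=600,          # stop training after roughly 10 minutes
)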
Prediction#
The predictor obtained through training can now make predictions for the target column. In the comparison below, the “vMin” column holds the actual values and the “predicted” column holds the model's predictions.
y_pred = predictor.predict(test_data.drop(columns=[label]))
# Combine true and predicted values into a DataFrame
comparison_df = test_data[[label]].copy()
comparison_df['predicted'] = y_pred
# Display the first few rows
comparison_df.head(20)
| | vMin | predicted |
|---|---|---|
| 0 | 0.487 | 0.491009 |
| 1 | 0.450 | 0.463575 |
| 2 | 0.918 | 0.900433 |
| 3 | 0.640 | 0.622023 |
| 4 | 0.638 | 0.612958 |
Evaluation#
Evaluate the results of the trained model on the test set.
predictor.evaluate(test_data, silent=True)
{'root_mean_squared_error': np.float64(-0.04254107047903863),
'mean_squared_error': -0.001809742677502532,
'mean_absolute_error': -0.02221871429233551,
'r2': 0.9006343785237739,
'pearsonr': 0.9500226237145906,
'median_absolute_error': -0.009687968969345068}
AutoGluon standardizes all evaluation metrics to a “higher-is-better” format. For error-based metrics such as Root Mean Squared Error (RMSE), Mean Squared Error (MSE), Mean Absolute Error (MAE), and Median Absolute Error (MedAE), this means the reported values are negative, since lower error indicates better performance. Metrics that are already higher-is-better are reported unchanged: R2 (coefficient of determination) measures the goodness of fit, while the Pearson correlation assesses how well the predicted trend aligns with the actual trend.
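Because of this sign convention, the usual positive error values can be recovered by negating the reported numbers, as in the short example below (the dictionary keys match the evaluate() output above).
# Recover conventional positive error values from the "higher-is-better" output
metrics = predictor.evaluate(test_data, silent=True)
rmse = -metrics['root_mean_squared_error']
mae = -metrics['mean_absolute_error']
print(f"RMSE: {rmse:.4f}, MAE: {mae:.4f}, R2: {metrics['r2']:.4f}")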
To evaluate the importance of the features used by the trained model, call the feature_importance() function on your test data.
predictor.feature_importance(test_data)
| | importance | stddev | p_value | n | p99_high | p99_low |
|---|---|---|---|---|---|---|
| test_suite_name | 0.121988 | 0.001625 | 3.782142e-09 | 5 | 0.125335 | 0.118641 |
| x_spec | 0.004538 | 0.000932 | 2.019610e-04 | 5 | 0.006457 | 0.002619 |
| ecid_parametric_2 | 0.001077 | 0.000262 | 3.912536e-04 | 5 | 0.001618 | 0.000537 |
| VDD0 | 0.001019 | 0.000183 | 1.187764e-04 | 5 | 0.001395 | 0.000643 |
| ecid_end | 0.000952 | 0.000236 | 4.179740e-04 | 5 | 0.001438 | 0.000466 |
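One way to use this output is to keep only the most important features and retrain on the reduced feature set. The sketch below assumes a cut-off of 10 features, which is an arbitrary choice; pick a threshold based on your own importance scores.
# Retrain on the top 10 features by importance (the cut-off of 10 is arbitrary)
importance = predictor.feature_importance(test_data)
top_features = importance.index[:10].tolist()  # feature names are the index of the returned DataFrame
reduced_predictor = TabularPredictor(label=label).fit(train_data[top_features + [label]])
print(reduced_predictor.evaluate(test_data, silent=True))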
To evaluate the accuracy of individual models trained by AutoGluon, call the leaderboard() function on your test data.
predictor.leaderboard(test_data)
| | model | score_test | score_val | eval_metric | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | WeightedEnsemble_L2 | -0.042541 | -0.024852 | root_mean_squared_error | 0.55979 | 0.342045 | 32.233006 | 0.004487 | 0.000755 | 0.030322 | 2 | True | 10 |
| 1 | NeuralNetFastAI | -0.043398 | -0.027227 | root_mean_squared_error | 0.088596 | 0.045584 | 14.866164 | 0.088596 | 0.045584 | 14.866164 | 1 | True | 6 |
| 2 | RandomForestMSE | -0.047547 | -0.026954 | root_mean_squared_error | 0.204762 | 0.133644 | 11.562767 | 0.204762 | 0.133644 | 11.562767 | 1 | True | 3 |
| 3 | LightGBMLarge | -0.048979 | -0.041049 | root_mean_squared_error | 0.100231 | 0.036399 | 3.642617 | 0.100231 | 0.036399 | 3.642617 | 1 | True | 9 |
| 4 | LightGBM | -0.049166 | -0.037701 | root_mean_squared_error | 0.055474 | 0.028727 | 1.3965 | 0.055474 | 0.028727 | 1.3965 | 1 | True | 2 |
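By default, predict() uses the best model on the leaderboard (here WeightedEnsemble_L2). If prediction latency matters more than the last bit of accuracy, you can request a specific, faster model by name, as sketched below with one of the models from the leaderboard above.
# Predict with a single base model instead of the full ensemble;
# model names come from the leaderboard output
fast_pred = predictor.predict(test_data.drop(columns=[label]), model='LightGBM')
print(fast_pred.head())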
Using model in ACS RTDI#
Download the example Test Program and application on the Host Controller:
cd ~/apps/
curl http://10.44.5.139/apps/application-ag-v3.1.0-RHEL79.tar.gz -O
tar -zxf application-ag-v3.1.0-RHEL79.tar.gz
Make a container application from the developed model#
After training, the model is located in the AutogluonModels directory. You need to copy the model files to the container application’s directory: ~/apps/application-ag-v3.1.0/rd-autogluon-app/AutogluonModels.
Then implement the application logic: load the model, receive request messages from the Test Program, perform the prediction, and return the result to the Test Program:
# Imports used by the application code below
import copy
import json
import pandas as pd
from io import StringIO
from autogluon.tabular import TabularPredictor

# Load the model
g_ag_model = "AutogluonModels/ag-20250915_051243"  # The trained AutoGluon model
self.predictor = TabularPredictor.load(g_ag_model)
......
......
# Receive request messages from the Test Program
def _handle_request(self, request, logger):
    try:
        # Deep copy and parse the JSON request
        request_cp = copy.deepcopy(request)
        request_cp = json.loads(request_cp)
        # Extract row data and convert to DataFrame
        row_data = request_cp.get("data")
        print(f"---row data:{row_data}")
        df = pd.read_csv(StringIO(row_data))
        # Convert all values to numeric, coercing errors to NaN
        df = df.apply(pd.to_numeric, errors='coerce')
        # Make prediction
        predicted_value = self.predictor.predict(df).iloc[0]
        # Return the result in the "Predicted:<value>" format used by this example
        # (assumption: the surrounding framework, elided in this excerpt, sends the
        # returned string back to the Test Program)
        return f"Predicted:{predicted_value}"
    except Exception as e:
        logger.error(f"Failed to handle request: {e}")
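Before packaging the container, you can check that the copied model directory loads and produces the Predicted:<value> format outside the application. The snippet below is a minimal standalone sketch; it assumes it is run from the rd-autogluon-app directory and that the test CSV from the training step is available at the given path.
# Standalone check that the copied model directory loads and predicts correctly
# (the model path matches the example above; the CSV path is an assumption)
import pandas as pd
from autogluon.tabular import TabularPredictor

predictor = TabularPredictor.load("AutogluonModels/ag-20250915_051243")
sample = pd.read_csv("data/sample.test.csv").drop(columns=["vMin"]).head(1)
print(f"Predicted:{predictor.predict(sample).iloc[0]}")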
Request a prediction from a running application#
We want to predict vMin (column W) with feature data from column A to column V:
| A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| acs_lot | acs_wafer | acs_ecid_x | acs_ecid_y | ecid_start | ecid_end | ecid_time | bin | final_bin_txt | ecid_lot_id | acs_tester_names | acs_last_test_suite | ecid_parametric_1 | ecid_parametric_2 | test_suite_name | x_spec | FBVDDQ | VDD0 | VDD1 | VDD2 | VDD3 | VDDMS | vMin |
| acs_lot_004 | 1 | 1 | 1 | 12/17/2024 22:24 | 12/17/2024 22:33 | 555577 | 1 | final_bin_txt_001 | ecid_lot_id_002 | acs_tester_names_001 | acs_last_test_suite_002 | 1585.520839 | 2122.811813 | test_suite_name_200 | x_spec_005 | 1.35 | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 | 0.595 |
The following is the feature data used in the prediction request:
{ "request": "Predict Target", "data": "acs_lot,acs_wafer,acs_ecid_x,acs_ecid_y,ecid_start,ecid_end,ecid_time,bin,final_bin_txt,ecid_lot_id,acs_tester_names,acs_last_test_suite,ecid_parametric_1,ecid_parametric_2,test_suite_name,x_spec,FBVDDQ,VDD0,VDD1,VDD2,VDD3,VDDMS\nacs_lot_001,16,3,4,_RARE_,10/26/2024 21:15,66506.7,982,final_bin_txt_002,ecid_lot_id_001,acs_tester_names_001,acs_last_test_suite_018,1888.513983,2143.343698,test_suite_name_089,x_spec_002,1.35,0.8,0.8,0.8,0.8,0.8" }
// Prepare the feature data for the prediction request
String[] values = line.split(",");           // split the CSV row into individual feature values
String data = headerLine + "\n" + line;      // plain CSV form: header plus data row
StringBuilder csvData = new StringBuilder(); // CSV with an escaped newline, for embedding in the JSON payload
csvData.append(headerLine); // header
csvData.append("\\n");      // escaped newline ("\n" inside the JSON string)
csvData.append(line);       // data row
String jsonPayload = "{ \"request\": \"Predict Target\", \"data\": \"" + csvData.toString() + "\" }";
...
Send feature data to the container application for prediction via Nexus TPI:
// Connect to the container application
NexusTPI.target("ag-app").timeout(20);
...
// Use Nexus TPI to send a request to the container application
int res = NexusTPI.request(jsonPayload);
// Receive the response
String response = NexusTPI.getResponse();
System.out.println("Response: " + response);
The returned response is in the following format, and the predicted value can be extracted from it:
Response: Predicted:0.5410701036453247
You can follow the steps below to run this example:
In this example, we use Unified Server as the image repository.
Please refer to “Use Unified Server as a Container Registry” to configure Unified Server as a Docker image registry. You need to follow these steps from that documentation:
Configure the hosts for the Host Controller
Get the docker certificates from the Unified Server
Configure the Project and Account in Harbor
Configure Edge Server
Build the Docker image:
cd ~
curl http://10.44.5.139/docker/ubuntu_20.04.tar -O
sudo docker load -i ubuntu_20.04.tar
cd ~/apps/application-ag-v3.1.0/rd-autogluon-app/
sudo docker build ./ --tag=unifiedserver.local/example-project/example-repo:ag
Push the Docker image to the Unified Server:
sudo docker push unifiedserver.local/example-project/example-repo:ag
Configure Nexus for container application deployment to the Edge Server:
gedit /opt/acs/nexus/conf/images.json
{
    "selector": {
        "device_name": "demo RTDI"
    },
    "edge": {
        "address": "<Edge IP>",
        "registry": {
            "address": "unifiedserver.local",
            "user": "robot$example_account",
            "password": "<Password>"
        },
        "containers": [
            {
                "name": "ag-app",
                "image": "example-project/example-repo:ag",
                "environment": {
                    "ONEAPI_DEBUG": "3",
                    "ONEAPI_CONTROL_ZMQ_IP": "<Host Controller IP>"
                }
            }
        ]
    }
}
gedit /opt/acs/nexus/conf/acs_nexus.ini
[Auto_Deploy]
Enabled=false
...
[GUI]
Auto_Popup=true
To apply the modified configuration, restart Nexus:
sudo systemctl restart acs_nexus
Run the Test Program:
cd ~/apps/application-ag-v3.1.0/
sh start_smt8.sh
You can view the inference results in the console log:
{ "request": "Predict Target", "data": "acs_lot,acs_wafer,acs_ecid_x,acs_ecid_y,ecid_start,ecid_end,ecid_time,bin,final_bin_txt,ecid_lot_id,acs_tester_names,acs_last_test_suite,ecid_parametric_1,ecid_parametric_2,test_suite_name,x_spec,FBVDDQ,VDD0,VDD1,VDD2,VDD3,VDDMS\nacs_lot_001,16,3,4,_RARE_,10/26/2024 21:15,66506.7,982,final_bin_txt_002,ecid_lot_id_001,acs_tester_names_001,acs_last_test_suite_018,1888.513983,2143.343698,test_suite_name_089,x_spec_002,1.35,0.8,0.8,0.8,0.8,0.8" }
TPI res: 0
Response: Predicted:0.5410701036453247
{ "request": "Predict Target", "data": "acs_lot,acs_wafer,acs_ecid_x,acs_ecid_y,ecid_start,ecid_end,ecid_time,bin,final_bin_txt,ecid_lot_id,acs_tester_names,acs_last_test_suite,ecid_parametric_1,ecid_parametric_2,test_suite_name,x_spec,FBVDDQ,VDD0,VDD1,VDD2,VDD3,VDDMS\nacs_lot_004,1,1,1,12/17/2024 22:24,12/17/2024 22:33,555577,1,final_bin_txt_001,ecid_lot_id_002,acs_tester_names_001,acs_last_test_suite_002,1585.520839,2122.811813,test_suite_name_200,x_spec_005,1.35,0.8,0.8,0.8,0.8,0.8" }
TPI res: 0
Response: Predicted:0.5783701539039612
{ "request": "Predict Target", "data": "acs_lot,acs_wafer,acs_ecid_x,acs_ecid_y,ecid_start,ecid_end,ecid_time,bin,final_bin_txt,ecid_lot_id,acs_tester_names,acs_last_test_suite,ecid_parametric_1,ecid_parametric_2,test_suite_name,x_spec,FBVDDQ,VDD0,VDD1,VDD2,VDD3,VDDMS\nacs_lot_004,1,2,6,10/21/2024 12:25,10/21/2024 12:33,514916.2,177,final_bin_txt_005,ecid_lot_id_002,acs_tester_names_003,acs_last_test_suite_008,1960.714288,2155.847071,test_suite_name_042,x_spec_008,1.35,0.8,0.8,0.8,0.8,0.8" }
TPI res: 0
Response: Predicted:0.5643042922019958
For instructions on how to run the Test Program, please refer to “DPAT demo application on RTDI with SmarTest 8” -> “Run the SmarTest test program”.