MLA-C01 Online Practice Questions

Latest MLA-C01 Exam Practice Questions

The practice questions for the MLA-C01 exam were last updated on 2026-02-24.

Question#1

A company has historical data that shows whether customers needed long-term support from company staff. The company needs to develop an ML model to predict whether new customers will require long-term support.
Which modeling approach should the company use to meet this requirement?

A. Anomaly detection
B. Linear regression
C. Logistic regression
D. Semantic segmentation

Explanation:
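Logistic regression (option C) is the suitable modeling approach because it is designed for binary classification problems, such as predicting whether a customer will require long-term support ("yes" or "no"). It estimates the probability of the positive class from labeled historical data and is widely used where the outcome is categorical. Linear regression predicts a continuous value, anomaly detection flags unusual records rather than predicting a labeled outcome, and semantic segmentation classifies the pixels of an image.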
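The idea can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a production model: the two features (support tickets and onboarding calls), the toy data, the learning rate, and the epoch count are all assumptions made up for the example.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, x):
    """Probability that a customer with features x needs long-term support."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_logistic_regression(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(weights, bias, xi) - yi  # gradient of log loss w.r.t. the score
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Hypothetical historical data: [support tickets in first month, onboarding calls]
X = [[0, 1], [1, 0], [1, 1], [5, 3], [6, 4], [7, 2]]
y = [0, 0, 0, 1, 1, 1]  # 1 = needed long-term support

weights, bias = train_logistic_regression(X, y)

new_customer = [6, 3]
p = predict_proba(weights, bias, new_customer)
needs_support = p >= 0.5  # threshold the probability to get a yes/no answer
print(f"P(long-term support) = {p:.2f}, predicted: {needs_support}")
```

The model outputs a probability, and the 0.5 cutoff turns it into the yes/no prediction the question asks for. In practice the cutoff would be tuned to the cost of false positives versus false negatives.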

Question#2

A healthcare analytics company wants to segment patients into groups that have similar risk factors to develop personalized treatment plans. The company has a dataset that includes patient health records, medication history, and lifestyle changes. The company must identify the appropriate algorithm to determine the number of groups by using hyperparameters.
Which solution will meet these requirements?

A. Use the Amazon SageMaker AI XGBoost algorithm. Set max_depth to control tree complexity for risk groups.
B. Use the Amazon SageMaker k-means clustering algorithm. Set k to specify the number of clusters.
C. Use the Amazon SageMaker AI DeepAR algorithm. Set epochs to determine the number of training iterations for risk groups.
D. Use the Amazon SageMaker AI Random Cut Forest (RCF) algorithm. Set a contamination hyperparameter for risk anomaly detection.

Explanation:
The problem described is a patient segmentation use case, which is a classic example of unsupervised learning. The objective is to group patients with similar characteristics without predefined labels. The Amazon SageMaker AI k-means algorithm is an unsupervised algorithm built for this kind of clustering and segmentation task.
The SageMaker k-means algorithm groups data points into clusters based on feature similarity and requires the user to define the number of clusters using the k hyperparameter. This directly satisfies the requirement to “determine the number of groups by using hyperparameters.” Typical k-means applications include customer segmentation, risk grouping, and pattern discovery in healthcare data.
Option A (XGBoost) is a supervised learning algorithm used for classification and regression. The max_depth hyperparameter controls tree complexity, not the number of groups, making it unsuitable for this task.
Option C (DeepAR) is a time-series forecasting algorithm optimized for predicting future values, not clustering patients.
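Option D (Random Cut Forest) is an anomaly detection algorithm. While useful for identifying outliers or unusual patient behavior, it does not perform clustering or group segmentation. Its SageMaker hyperparameters (such as num_trees and num_samples_per_tree) also do not include a contamination setting.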
k-means is the appropriate choice when the goal is to partition data into a predefined number of clusters that is set through a tunable hyperparameter.
Therefore, Option B is the correct answer.
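To make the role of k concrete, here is a minimal sketch of Lloyd's algorithm in standard-library Python. It is not the SageMaker implementation: the two patient features, the toy data, and the choice to seed the centroids with the first k points are simplifying assumptions for illustration only.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; k is the number of clusters requested by the caller."""
    # Simplification: seed centroids with the first k points.
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            assignments[i] = min(range(k), key=lambda c: math.dist(p, centroids[c]))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids

# Hypothetical patients described by two scaled risk factors.
patients = [
    (1.0, 1.2), (5.0, 5.2), (9.0, 1.0),
    (0.8, 1.0), (5.3, 4.9), (9.2, 1.3),
    (1.1, 0.9), (4.8, 5.1), (8.8, 0.9),
]

assignments, centroids = kmeans(patients, k=3)
print(assignments)  # one cluster index per patient
```

Changing k changes how many risk groups come out, which is the behavior the question describes. With the built-in SageMaker algorithm the same choice is expressed through its k hyperparameter rather than a function argument.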

Question#3

A company has deployed an XGBoost prediction model in production to predict if a customer is likely to cancel a subscription. The company uses Amazon SageMaker Model Monitor to detect deviations in the F1 score.
During a baseline analysis of model quality, the company recorded a threshold for the F1 score. After several months of no change, the model's F1 score decreases significantly.
What could be the reason for the reduced F1 score?

A. Concept drift occurred in the underlying customer data that was used for predictions.
B. The model was not sufficiently complex to capture all the patterns in the original baseline data.
C. The original baseline data had a data quality issue of missing values.
D. Incorrect ground truth labels were provided to Model Monitor during the calculation of the baseline.

Explanation:
Problem Description:
The F1 score, which is the harmonic mean of precision and recall, has decreased significantly after months of stable performance. This indicates that the model's predictions no longer match real-world outcomes.
Why Concept Drift?
Concept drift occurs when the statistical relationship between the input features and the target variable changes over time, so patterns the model learned from older data no longer hold.
For example, customer behaviors or subscription cancellation patterns may have shifted, leading to reduced model accuracy.
Signs of Concept Drift:
Deviation in performance metrics (e.g., F1 score) over time.
Declining prediction accuracy for certain groups or scenarios.
Solution:
Monitor for drift using tools like SageMaker Model Monitor.
Regularly retrain the model with updated data to account for the drift.
Why Not Other Options?
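B: An insufficiently complex model would have underperformed from the start; it does not explain a drop after months of stable performance.
C: Missing values in the original baseline data would have affected the baseline itself, not caused a decline several months later.
D: Incorrect ground truth labels during the baseline calculation would have distorted the recorded threshold from the beginning, not produced a later decrease.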
Conclusion: The decrease in F1 score is most likely due to concept drift in the customer data, requiring retraining of the model with new data.
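The check that a model quality monitor performs can be sketched as follows. This is a simplified illustration in plain Python, not the Model Monitor API: the threshold value, the label lists, and the helper names are hypothetical.

```python
def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def violates_baseline(current_f1, baseline_threshold):
    """True when the current metric has fallen below the recorded baseline."""
    return current_f1 < baseline_threshold

BASELINE_F1_THRESHOLD = 0.80  # hypothetical value recorded during baselining

# The model sees similar customers and makes the same predictions in both months...
predictions = [1, 1, 1, 1, 0, 0, 0, 1]

# ...but which customers actually cancel has changed (concept drift).
actual_month_1 = [1, 1, 1, 1, 0, 0, 0, 0]
actual_month_6 = [1, 0, 0, 1, 1, 1, 0, 0]

f1_month_1 = f1_score(actual_month_1, predictions)
f1_month_6 = f1_score(actual_month_6, predictions)

print(violates_baseline(f1_month_1, BASELINE_F1_THRESHOLD))  # False
print(violates_baseline(f1_month_6, BASELINE_F1_THRESHOLD))  # True
```

The predictions are identical in both months; only the ground truth moved. That is why the metric degrades even though neither the model nor the baseline changed.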

Question#4

A company has a custom extract, transform, and load (ETL) process that runs on premises. The ETL process is written in the R language and runs for an average of 6 hours. The company wants to migrate the process to run on AWS.
Which solution will meet these requirements?

A. Use an AWS Lambda function created from a container image to run the ETL jobs.
B. Use Amazon SageMaker AI processing jobs with a custom Docker image stored in Amazon Elastic Container Registry (Amazon ECR).
C. Use Amazon SageMaker AI script mode to build a Docker image. Run the ETL jobs by using SageMaker Notebook Jobs.
D. Use AWS Glue to prepare and run the ETL jobs.

Explanation:
The ETL process has two critical characteristics: it is long-running (6 hours) and written in R. AWS Lambda is unsuitable because it has a maximum execution time of 15 minutes. AWS Glue jobs run Spark (Python or Scala), Python shell, or Ray scripts and do not natively run R code, so the process would have to be rewritten.
Amazon SageMaker Processing jobs are designed for long-running, custom data processing workloads. Processing jobs allow users to run arbitrary code in custom Docker containers, making them well suited to migrating on-premises ETL jobs written in R.
By building a custom Docker image that includes the R runtime and required libraries and storing it in Amazon ECR, the company can run the ETL job at scale on managed infrastructure without rewriting the code.
SageMaker script mode is intended for running custom training scripts in prebuilt framework containers, not for building Docker images or running ETL.
Therefore, SageMaker processing jobs with a custom container are the correct solution.
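As a rough sketch, the request for such a job could be assembled like this. The field names follow the SageMaker CreateProcessingJob request shape; the job name, image URI, role ARN, script path, instance type, and runtime headroom are hypothetical placeholders. The code only builds a dictionary and does not call AWS.

```python
LAMBDA_MAX_TIMEOUT_SECONDS = 15 * 60   # Lambda's execution limit
ETL_RUNTIME_SECONDS = 6 * 60 * 60      # average runtime from the question

def build_processing_job_request(job_name, image_uri, role_arn, max_runtime_seconds):
    """Assemble the parameters for a processing job that runs an R script."""
    return {
        "ProcessingJobName": job_name,
        "RoleArn": role_arn,
        "AppSpecification": {
            # Custom image stored in Amazon ECR that bundles R and its libraries.
            "ImageUri": image_uri,
            # Hypothetical path of the ETL script inside the image.
            "ContainerEntrypoint": ["Rscript", "/opt/program/etl.R"],
        },
        "ProcessingResources": {
            "ClusterConfig": {
                "InstanceCount": 1,
                "InstanceType": "ml.m5.xlarge",  # assumed instance size
                "VolumeSizeInGB": 50,
            }
        },
        # Leave headroom above the 6-hour average runtime.
        "StoppingCondition": {"MaxRuntimeInSeconds": max_runtime_seconds},
    }

request = build_processing_job_request(
    job_name="r-etl-example",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/r-etl:latest",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    max_runtime_seconds=8 * 60 * 60,
)

# The job would not fit in Lambda, but fits comfortably in the processing job limit set above.
print(ETL_RUNTIME_SECONDS > LAMBDA_MAX_TIMEOUT_SECONDS)
print(request["StoppingCondition"]["MaxRuntimeInSeconds"] >= ETL_RUNTIME_SECONDS)
```

With valid values, a dictionary of this shape could be passed to the boto3 SageMaker client as `create_processing_job(**request)`. Input and output locations in Amazon S3 are omitted here for brevity.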

Question#5

A company wants to predict the success of advertising campaigns by considering the color scheme of each advertisement. An ML engineer is preparing data for a neural network model. The dataset includes color information as categorical data.
Which technique for feature engineering should the ML engineer use for the model?

A. Apply label encoding to the color categories. Automatically assign each color a unique integer.
B. Implement padding to ensure that all color feature vectors have the same length.
C. Perform dimensionality reduction on the color categories.
D. One-hot encode the color categories to transform the color scheme feature into a binary matrix.

Explanation:
One-hot encoding (option D) is the appropriate technique for transforming categorical data, such as color information, into a format suitable for input to a neural network. This technique creates a binary vector representation where each unique category (color) is represented as a separate binary column, ensuring that the model does not infer ordinal relationships between categories. Label encoding, by contrast, assigns integers that imply an order and distance between colors that does not exist. One-hot encoding preserves the categorical nature of the data and avoids that unintended bias.
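The difference between the two encodings is easy to see on a small example. The sketch below uses only the standard library; the color values and the function name are illustrative assumptions.

```python
def one_hot_encode(values):
    """Return the sorted category list and one binary row per input value."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # exactly one column is set per row
        rows.append(row)
    return categories, rows

colors = ["red", "blue", "green", "blue"]

categories, matrix = one_hot_encode(colors)
print(categories)  # ['blue', 'green', 'red']
print(matrix)      # [[0, 0, 1], [1, 0, 0], [0, 1, 0], [1, 0, 0]]

# Label encoding of the same data, for contrast: the integers suggest
# blue < green < red, an ordering the colors do not actually have.
label_codes = [categories.index(color) for color in colors]
print(label_codes)  # [2, 0, 1, 0]
```

Each row of the matrix has a single 1, so every color is equally distant from every other color. That property is what keeps the network from reading meaning into the encoding itself.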

Disclaimer

This page is for educational and exam preparation reference only. It is not affiliated with Amazon, AWS Certified Associate, or the official exam provider. Candidates should refer to official documentation and training for authoritative information.

Exam Code: MLA-C01
Q & A: 207 Q&As
Updated: 2026-02-24
