This post explores the expected value framework as a way to reason about decision-making under uncertainty in ML contexts. More specifically, it demonstrates a business-oriented optimization of the classification threshold for a binary classification model used to target an email marketing campaign.
1. Expected Value Framework Fundamentals
In probability theory, the expected value (EV) of a random variable is the long-run average of outcomes weighted by their probabilities:
\[\mathbb{E}[X] = \sum_{i} P(y_i) \cdot V(y_i)\]
Where:
- $ y_i $: possible class label or outcome (e.g., 0 or 1 in binary classification)
- $ P(y_i) $: predicted probability of class $ y_i $
- $ V(y_i) $: business benefit or cost associated with class $ y_i $
It reflects the average business outcome you can expect from acting under uncertainty, weighted by the model’s estimated probabilities. This bridges predictions and real-world decision-making by accounting for both uncertainty and impact.
In the context of machine learning, especially for classification models, we can interpret model outputs (typically class probabilities) as estimated likelihoods of different outcomes. We can then associate business values or costs with those outcomes to compute the expected value of taking a specific action.
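As a minimal sketch of the computation itself, assuming a binary model that outputs $P(y=1)$ and two illustrative business values (the numbers here are hypothetical placeholders, not from a real model):

```python
# Expected value of one action for a single customer.
# Probabilities and values below are illustrative placeholders.
p_positive = 0.2                        # model's estimated P(y=1)
probs = {0: 1 - p_positive, 1: p_positive}
values = {0: 2.50, 1: -15.0}            # business value of each outcome

expected_value = sum(probs[y] * values[y] for y in probs)
print(f"EV = {expected_value:.2f}")     # 0.8 * 2.50 + 0.2 * -15.0 = -1.00
```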
2. The Email Marketing Campaign Use Case
In email marketing, there is always a risk that customers will unsubscribe or mark emails as spam if they receive too many promotional messages. The goal of the ML model is to predict whether a customer will unsubscribe from the mailing list if they receive a promotional email.
The ML model is defined as follows:
- Label 0: Customer will not unsubscribe → good candidate to send email
- Label 1: Customer will unsubscribe → should not send email
- Prediction = 1 → the model recommends not sending the promotional email
- Prediction = 0 → the model recommends sending the promotional email
2.1 Understanding the Confusion Matrix
Case | Prediction | Actual | Action Taken | Outcome Type | business_values Key | Description |
---|---|---|---|---|---|---|
✅ True Positive (TP) | 1 | 1 | Don’t send email | Correct action | "correct_no_send" | We avoided sending email to someone who would have unsubscribed |
❌ False Positive (FP) | 1 | 0 | Don’t send email | Incorrect action | "false_no_send" | We wrongly withheld email from an engaged customer |
❌ False Negative (FN) | 0 | 1 | Send email | Incorrect action | "false_send" | We sent email to someone who unsubscribed |
✅ True Negative (TN) | 0 | 0 | Send email | Correct action | "correct_send" | We correctly sent email to someone who engaged positively |
3. Mapping Model Predictions to Business Impact
To map model predictions to meaningful business impact, we define a value for each type of prediction outcome. These values are based on expected revenue per email and customer lifetime value impacts from A/B testing.
We are interested in this framework because not all prediction errors are equal. Some outcomes carry a much higher cost or benefit than others: wrongly sending emails to unsubscribe-prone customers damages list health, while correctly targeting engaged customers drives substantial revenue. Assigning a different business value to each outcome captures these asymmetries and supports more informed threshold decisions.
"correct_send"
(True Negative)
- Customer received email and engaged positively (clicked, purchased, etc.)
- Measured as average revenue per email for engaged customers
- Typical value: $2.50 per successful email based on conversion rates
"false_send"
(False Negative)
- Customer received email and unsubscribed
- Captures customer lifetime value loss and list health damage
- Reflects long-term revenue impact from losing the customer
"false_no_send"
(False Positive)
- Customer was not sent email but would have engaged positively
- Value set to 0 to reflect missed opportunity (no direct cost but lost revenue)
"correct_no_send"
(True Positive)
- Customer was not sent email and would have unsubscribed
- Assigned +0.25 to capture soft benefits such as improved deliverability and sender reputation
# Business values calculated from A/B testing email campaigns
avg_revenue_per_engaged_email = 2.50   # Revenue from customers who engage
avg_clv_loss_from_unsubscribe = -15.0  # Customer lifetime value loss

business_values = {
    # When you do send email:
    "correct_send": avg_revenue_per_engaged_email,
    "false_send": avg_clv_loss_from_unsubscribe,
    # When you don't send email:
    "false_no_send": 0,       # Missed opportunity cost
    "correct_no_send": 0.25,  # Soft benefits: deliverability, sender reputation
}
4. Threshold Optimization via Expected Value
Rather than optimizing for metrics like accuracy, recall, or F1, we select the classification threshold that maximizes expected business value.
Each model prediction leads to an action (send email or don’t send), with asymmetric outcomes. To make the best decision, we identify the threshold that yields the highest total expected value:
- Send email if $P(\text{unsubscribe}) < \text{threshold}$
- Don’t send email if $P(\text{unsubscribe}) \geq \text{threshold}$
We evaluate a range of thresholds and simulate outcomes using the assigned business value of each possible decision (TP, FP, FN, TN).
import pandas as pd


def evaluate_threshold_expected_value(
    df, prob_col, target_col, business_values, thresholds
):
    """
    Evaluate expected business value and standard classification metrics
    across a range of thresholds.
    """
    results = []
    for threshold in thresholds:
        # Prediction 1 ("will unsubscribe") means don't send; 0 means send
        predicted = (df[prob_col] >= threshold).astype(int)
        actual = df[target_col]
        tp = ((predicted == 1) & (actual == 1)).sum()
        fp = ((predicted == 1) & (actual == 0)).sum()
        fn = ((predicted == 0) & (actual == 1)).sum()
        tn = ((predicted == 0) & (actual == 0)).sum()

        # Business value of the resulting send/no-send decisions
        total_ev = (
            tp * business_values["correct_no_send"]
            + fp * business_values["false_no_send"]
            + fn * business_values["false_send"]
            + tn * business_values["correct_send"]
        )

        # Classification metrics
        precision = tp / (tp + fp) if (tp + fp) > 0 else 0
        recall = tp / (tp + fn) if (tp + fn) > 0 else 0
        f1 = (
            2 * precision * recall / (precision + recall)
            if (precision + recall) > 0
            else 0
        )

        results.append({
            "threshold": threshold,
            "total_expected_value": total_ev,
            "avg_ev_per_user": total_ev / len(df),
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "tp": tp,
            "fp": fp,
            "fn": fn,
            "tn": tn,
        })
    return pd.DataFrame(results)
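A minimal usage sketch follows. The scores_df below is synthetic, calibrated-by-construction data purely for illustration; in practice df would hold held-out predictions and true labels:

```python
import numpy as np

rng = np.random.default_rng(42)
probs = rng.uniform(0, 1, size=10_000)
scores_df = pd.DataFrame({
    "prob_unsubscribe": probs,
    # Synthetic labels drawn so that higher scores unsubscribe more often
    "unsubscribed": (rng.uniform(0, 1, size=10_000) < probs).astype(int),
})

ev_df = evaluate_threshold_expected_value(
    scores_df, "prob_unsubscribe", "unsubscribed",
    business_values, np.linspace(0.05, 0.95, 19),
)
best = ev_df.loc[ev_df["total_expected_value"].idxmax()]
print(f"Best threshold: {best['threshold']:.2f}, "
      f"total EV: {best['total_expected_value']:,.2f}")
```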
4.1 Results
The analysis shows that the model performs best with a 30% threshold when maximizing expected value:
threshold | total_expected_value | avg_ev_per_user | precision | recall | f1 |
---|---|---|---|---|---|
0.30 | $127,450 | $0.85 | 0.385 | 0.946 | 0.548 |
0.50 | $89,230 | $0.59 | 0.469 | 0.682 | 0.556 |
0.70 | $45,180 | $0.30 | 0.640 | 0.180 | 0.280 |
The model performs best when it sends emails to more customers (with a 30% threshold) — even if some will unsubscribe — because:
- Correct email sends drive substantial revenue gains
- False email sends have manageable customer lifetime value impact
- Reaching engaged customers is far more valuable than avoiding every possible unsubscribe
✅ It’s better to risk sending to a potentially unengaged customer than miss the opportunity to generate revenue from someone who would have converted.
5. Customer Personalization using the EV Framework
For each customer (or instance), we compare:
- EV of Acting: Take an action (i.e. send promotional email to customer)
- EV of Not Acting: Do nothing (i.e. do not send email to the customer)
Let:
- $ p $ = probability that the customer will unsubscribe (from model)
- Business outcomes:
- $ V_{\text{correct send}} $: reward when we send email and the customer engages
- $ V_{\text{false send}} $: loss when we send email but the customer unsubscribes
- $ V_{\text{false no send}} $: loss when we don’t send email and the customer would have engaged
- $ V_{\text{correct no send}} $: outcome when we don’t send email and the customer would have unsubscribed
Then:
\[\text{EV}_{\text{send}} = (1 - p) \cdot V_{\text{correct send}} + p \cdot V_{\text{false send}}\]
\[\text{EV}_{\text{no send}} = (1 - p) \cdot V_{\text{false no send}} + p \cdot V_{\text{correct no send}}\]
✅ Decision Rule
Choose to send email to the customer if:
\[\text{EV}_{\text{send}} > \text{EV}_{\text{no send}}\]
This ensures that each decision maximizes the expected return: rather than merely acting on probabilities, we’re optimizing for value.
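As a worked example using the illustrative business values from Section 3, the breakeven probability $p^*$ where the two expected values are equal can be solved directly:

\[(1 - p^*) \cdot 2.50 + p^* \cdot (-15.0) = (1 - p^*) \cdot 0 + p^* \cdot 0.25\]
\[2.50 - 17.50 \, p^* = 0.25 \, p^* \quad \Rightarrow \quad p^* = \frac{2.50}{17.75} \approx 0.141\]

Under these particular values, the per-customer rule amounts to sending email whenever the predicted unsubscribe probability is below roughly 14%.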
def compute_expected_value(df, prob_col, business_values):
    """
    Compute expected value for both action and inaction,
    vectorized over a DataFrame.
    """
    p = df[prob_col]  # probability of unsubscribe, P(class=1)

    # Action: send email
    ev_send = (
        (1 - p) * business_values["correct_send"]
        + p * business_values["false_send"]
    )
    # Action: don't send email
    ev_no_send = (
        (1 - p) * business_values["false_no_send"]
        + p * business_values["correct_no_send"]
    )

    # Send whenever acting has a higher expected value than not acting
    should_send = ev_send > ev_no_send

    # Return a copy with the expected value columns attached
    df = df.copy()
    df["ev_send"] = ev_send
    df["ev_no_send"] = ev_no_send
    df["should_send"] = should_send
    return df
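For example, applied to the hypothetical scores_df from Section 4 (synthetic data, so the exact numbers are illustrative):

```python
decisions_df = compute_expected_value(scores_df, "prob_unsubscribe", business_values)
print(f"Send rate under the EV rule: {decisions_df['should_send'].mean():.1%}")
# With the Section 3 values, this sends exactly when p < ~0.141,
# the breakeven probability worked out above.
```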
6. Comparing Customer Personalization vs. Global Thresholding
There are two ways to operationalize inference decisions using the model’s predicted unsubscribe probabilities: a personalized expected value rule or a global classification threshold (e.g. 30%).
6.1 EV-based Personalization (per-user decision)
Each customer is evaluated individually by comparing the expected value of sending vs. not sending them emails. This method takes into account the customer’s predicted probability of unsubscribing and the assigned business value of each possible outcome.
Pros:
- Tailored decision for each customer
- Maximizes total expected business value, since EV is optimized at the granularity of individual customers
- Ideal when model probabilities are well-calibrated
Cons:
- Harder to explain or control (e.g., no fixed email send rate)
- Slightly more complex to deploy in production systems compared to global thresholding
- Less transparent in policy testing or experimentation setups
6.2 Global Thresholding (fixed policy)
A single probability threshold is chosen (e.g., 30%), above which customers are not sent emails. This creates a simple rule that applies equally to all customers.
Pros:
- Easy to communicate and implement
- Directly controls email send rate
- Suitable for A/B testing and policy evaluation
Cons:
- Less precise — treats customers near the threshold identically
- Ignores differences in expected value between customers
- Can underperform in value maximization compared to EV-based personalization
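As a rough sketch of how the two policies relate in practice (reusing the hypothetical decisions_df from the previous section, with an assumed global threshold of 0.30), we can compare the resulting send decisions directly:

```python
# Global threshold policy: send when P(unsubscribe) < 0.30 (assumed threshold)
threshold_send = decisions_df["prob_unsubscribe"] < 0.30
# EV-based personalization: send when EV(send) > EV(no send)
ev_rule_send = decisions_df["should_send"]

print(f"Threshold send rate: {threshold_send.mean():.1%}")
print(f"EV-rule send rate:   {ev_rule_send.mean():.1%}")
print(f"Decision agreement:  {(threshold_send == ev_rule_send).mean():.1%}")
```

Note that when the same business_values apply to every customer, the EV rule is itself equivalent to a global threshold at the breakeven probability derived in Section 5; the two approaches genuinely diverge once the values (e.g., revenue or CLV impact) are estimated per customer.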
7. Policy Evaluation
After building a model to predict customer unsubscribe behavior, we want to evaluate how different decision strategies perform when applied in practice.
Each strategy (e.g., email everyone, email no one, or use a threshold) leads to different business outcomes and classification trade-offs. To choose the best approach, we simulate their performance using two complementary perspectives:
- Expected Business Value: How much value does the policy generate (or lose) based on predicted vs actual outcomes?
- Classification Quality: How accurately does the policy identify who should or shouldn’t receive emails?
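A minimal simulation sketch, under the conventions above (prediction 1 means "don't send", so a threshold of 0.0 emails no one and a threshold above every score emails everyone), reusing evaluate_threshold_expected_value and the hypothetical scores_df:

```python
# Map each policy to an equivalent classification threshold
policies = {
    "Best Threshold": 0.30,
    "Email Everyone": 1.01,  # no score reaches the threshold -> always send
    "Fixed Threshold (0.50)": 0.50,
    "Email No One": 0.0,     # every score reaches the threshold -> never send
}

policy_df = evaluate_threshold_expected_value(
    scores_df, "prob_unsubscribe", "unsubscribed",
    business_values, list(policies.values()),
)
policy_df.insert(0, "policy", list(policies.keys()))
print(policy_df[["policy", "threshold", "total_expected_value",
                 "avg_ev_per_user", "f1"]])
```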
7.1 Policy Comparison Results
Policy | Threshold | Total Expected Value | Avg EV per User | F1 |
---|---|---|---|---|
Best Threshold | 0.30 | $127,450 | $0.85 | 0.548 |
Email Everyone | 1.00 | $89,680 | $0.60 | 0.000 |
Fixed Threshold (0.50) | 0.50 | $89,230 | $0.59 | 0.556 |
Email No One | 0.00 | $12,580 | $0.08 | 0.506 |
The results show that the optimized threshold (30%) delivers the highest expected value, even outperforming the “email everyone” strategy while maintaining better precision and protecting sender reputation.
Conclusion
The Expected Value framework provides a powerful approach to optimize ML decision-making by:
- Aligning model outputs with business objectives rather than just statistical accuracy
- Accounting for asymmetric costs of different prediction errors
- Enabling personalized decision-making based on individual risk-reward profiles
- Providing interpretable business metrics for model evaluation
This framework is particularly valuable when prediction errors have different business impacts, making it essential for real-world ML applications where maximizing business value is more important than maximizing traditional classification metrics.
Key takeaway: It’s better to risk sending to a potentially unengaged customer than miss the opportunity to generate revenue from someone who would have converted from the email campaign.