Model Card for Customer Churn Prediction Pipeline

This model is a trained Scikit-learn pipeline designed to predict whether a telecom customer is likely to churn based on account, service, and billing attributes.

Model Details

Model Description

This model acts as a churn-risk scoring engine for retention workflows. It combines preprocessing (imputation, scaling, one-hot encoding) and classification in a single serialized pipeline artifact for consistent training and inference behavior.

  • Developed by: Aashir Hameed
  • Model type: Scikit-learn Tabular Classification Pipeline
  • Language(s): English (en) for feature labels/documentation
  • License: Apache 2.0
  • Trained from: Telco customer churn tabular dataset

Uses

Direct Use

This model is intended for churn risk scoring in:

  • CRM prioritization and retention campaigns
  • Proactive outreach workflows for high-risk customers
  • Batch scoring of customer cohorts

Binary output mapping:

  • 0: No Churn
  • 1: Churn
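
For batch scoring of a cohort, a minimal sketch is shown below. It assumes the loaded pipeline exposes `predict` and `predict_proba` and that class index 1 corresponds to Churn, as in the mapping above. The helper name `score_cohort`, the `top_n` parameter, and the added column names are illustrative and not part of the published artifact.

```python
import pandas as pd

# Label mapping taken from the model card: 0 = No Churn, 1 = Churn.
LABELS = {0: "No Churn", 1: "Churn"}


def score_cohort(model, customers: pd.DataFrame, top_n: int = 100) -> pd.DataFrame:
    """Rank customers by predicted churn probability (hypothetical helper).

    `model` is any fitted classifier or pipeline with predict/predict_proba.
    """
    # Column 1 of predict_proba is the Churn class, assuming classes_ == [0, 1].
    churn_probability = model.predict_proba(customers)[:, 1]
    predicted = model.predict(customers)

    scored = customers.assign(
        churn_probability=churn_probability,
        predicted_label=[LABELS[int(p)] for p in predicted],
    )
    # Highest-risk customers first, so retention teams can work from the top.
    return scored.sort_values("churn_probability", ascending=False).head(top_n)
```

Ranking by probability rather than by the hard 0/1 label lets a retention team choose how many customers to contact based on campaign capacity.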

Out-of-Scope Use

This model is not intended for:

  • Causal inference on churn drivers
  • Fairness-critical automated decisions without human review
  • Data distributions that significantly differ from the Telco training data

Bias, Risks, and Limitations

Like all supervised models, this pipeline may reflect historical biases and collection artifacts present in the source data. Prediction quality and probability calibration can degrade under distribution shift (for example, new plans, pricing structures, or service bundles not represented in the training data). The model should be monitored for drift and recalibrated or retrained on a regular schedule.
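
One possible way to monitor a numeric feature (such as MonthlyCharges) for drift is the Population Stability Index (PSI). The sketch below is an illustration and not part of the released pipeline; the function name, the bin count, and the smoothing constant are assumptions.

```python
import numpy as np


def population_stability_index(reference, current, bins: int = 10) -> float:
    """PSI between a reference sample (training) and a current sample (production)."""
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)

    # Bin edges come from quantiles of the reference sample.
    edges = np.unique(np.quantile(reference, np.linspace(0.0, 1.0, bins + 1)))
    if len(edges) < 2:
        # Constant reference feature: no bins can be formed, report no shift.
        return 0.0

    # Clip so values outside the reference range fall into the outer bins.
    current = np.clip(current, edges[0], edges[-1])

    ref_counts, _ = np.histogram(reference, bins=edges)
    cur_counts, _ = np.histogram(current, bins=edges)

    # Small floor avoids log(0) and division by zero for empty bins (assumed value).
    eps = 1e-6
    ref_frac = np.clip(ref_counts / ref_counts.sum(), eps, None)
    cur_frac = np.clip(cur_counts / cur_counts.sum(), eps, None)

    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))
```

A commonly cited rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, but suitable thresholds depend on the feature and should be validated against observed model performance.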

How to Get Started with the Model

Use the code below for inference with joblib:

from pathlib import Path
import joblib
import pandas as pd

# Load the serialized pipeline (preprocessing + classifier in one artifact)
model = joblib.load(Path("churn_model_v1.pkl"))

sample = pd.DataFrame(
    [
        {
            "gender": "Female",
            "SeniorCitizen": "0",  # passed as a string here; keep the dtype consistent with training
            "Partner": "Yes",
            "Dependents": "No",
            "tenure": 12,
            "PhoneService": "Yes",
            "MultipleLines": "No",
            "InternetService": "Fiber optic",
            "OnlineSecurity": "No",
            "OnlineBackup": "Yes",
            "DeviceProtection": "No",
            "TechSupport": "No",
            "StreamingTV": "Yes",
            "StreamingMovies": "Yes",
            "Contract": "Month-to-month",
            "PaperlessBilling": "Yes",
            "PaymentMethod": "Electronic check",
            "MonthlyCharges": 89.1,
            "TotalCharges": 1069.2,
        }
    ]
)

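# Predicted class for the single input row: 0 = No Churn, 1 = Churn
prediction = model.predict(sample)[0]
# Probability of the positive class (column 1 = Churn)
probability = model.predict_proba(sample)[0][1]
print(prediction, probability)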

Training Details

Training Data

The model was trained on WA_Fn-UseC_-Telco-Customer-Churn.csv with the standard churn target column (Churn).

Training Procedure

Preprocessing

  • Dropped non-predictive customerID
  • Coerced TotalCharges to numeric and removed rows with an invalid target or invalid critical numeric values
  • Numeric preprocessing: median imputation + standard scaling
  • Categorical preprocessing: most-frequent imputation + one-hot encoding (handle_unknown='ignore')
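
The preprocessing steps above can be sketched roughly as follows. This is a reconstruction from the bullet list, not the original training code: the helper names are hypothetical, and "invalid" is assumed to mean a missing Churn value or a TotalCharges entry that cannot be parsed as a number.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def clean_telco(df: pd.DataFrame) -> pd.DataFrame:
    """Drop the ID column and rows with unusable target or TotalCharges values."""
    df = df.drop(columns=["customerID"], errors="ignore").copy()
    # Non-numeric entries (for example blank strings) become NaN.
    df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce")
    # Assumption: rows missing the target or TotalCharges are the ones removed.
    return df.dropna(subset=["Churn", "TotalCharges"])


def build_preprocessor(numeric_cols, categorical_cols) -> ColumnTransformer:
    """Median impute + scale numerics; most-frequent impute + one-hot categoricals."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        # Categories unseen during training are encoded as all zeros.
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer([
        ("num", numeric, list(numeric_cols)),
        ("cat", categorical, list(categorical_cols)),
    ])
```

Keeping these transformers inside the pipeline means the imputation values and scaling statistics are learned on training folds only, which avoids leaking test information during cross-validation.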

Training Hyperparameters

  • Validation: Stratified K-Fold cross-validation (n_splits=5)
  • Model search: GridSearchCV with scoring='f1'
  • Candidates: Logistic Regression and Random Forest
  • Winning model: Random Forest
  • Best params (winner):
    • class_weight=balanced
    • max_depth=8
    • min_samples_leaf=4
    • min_samples_split=2
    • n_estimators=200
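
A search of this shape could be set up as in the sketch below. The actual grids are not listed in this card, so the values here are assumptions: the Random Forest grid is pinned to the reported best parameters, the Logistic Regression values for `C` are purely illustrative, and `build_search`, `random_state`, and the step names are hypothetical. Note that `scoring="f1"` expects the target encoded as 0/1 with 1 as the positive class.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline


def build_search(preprocessor, random_state: int = 42) -> GridSearchCV:
    """Grid search over two model families with stratified 5-fold CV."""
    # The "clf" step is a placeholder; each grid entry swaps in its own estimator.
    pipeline = Pipeline([
        ("preprocess", preprocessor),
        ("clf", LogisticRegression()),
    ])

    param_grid = [
        {
            # Illustrative Logistic Regression grid (values are assumptions).
            "clf": [LogisticRegression(max_iter=1000, class_weight="balanced")],
            "clf__C": [0.1, 1.0, 10.0],
        },
        {
            # Random Forest grid pinned to the best parameters reported above.
            "clf": [RandomForestClassifier(class_weight="balanced",
                                           random_state=random_state)],
            "clf__n_estimators": [200],
            "clf__max_depth": [8],
            "clf__min_samples_leaf": [4],
            "clf__min_samples_split": [2],
        },
    ]

    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=random_state)
    return GridSearchCV(pipeline, param_grid, scoring="f1", cv=cv)
```

After calling `fit`, `best_params_` and `best_score_` on the returned object give the winning configuration and its mean cross-validated F1.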

Evaluation

Testing Data, Factors & Metrics

Testing Data

A held-out test split of the Telco dataset, created with stratified train/test partitioning.
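
A stratified split of this kind might look like the sketch below. The split ratio and seed are not stated in this card, so `test_size=0.2` and `random_state=42` are assumptions, and `stratified_split` is a hypothetical helper name.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def stratified_split(df: pd.DataFrame, target: str = "Churn",
                     test_size: float = 0.2, random_state: int = 42):
    """Split features and target while preserving the churn rate in both parts."""
    X = df.drop(columns=[target])
    y = df[target]
    # stratify=y keeps the class proportions (nearly) identical in train and test.
    return train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=random_state
    )
```

The function returns `X_train, X_test, y_train, y_test` in that order, matching the return order of `train_test_split`.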

Metrics

  • Accuracy
  • F1-score
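
As a small worked illustration of the two metrics, the toy labels below (invented for this example, not taken from the test set) give 2 true positives, 1 false positive, 1 false negative, and 4 true negatives. Accuracy is therefore 6/8, while precision and recall are both 2/3, so F1 is also 2/3.

```python
from sklearn.metrics import accuracy_score, f1_score

# Toy labels for illustration only: 1 = Churn, 0 = No Churn.
y_true = [0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 0, 0, 1, 0, 1, 0]

accuracy = accuracy_score(y_true, y_pred)
# F1 is computed for the positive (Churn) class.
f1 = f1_score(y_true, y_pred, pos_label=1)

print(f"accuracy={accuracy:.2%} f1={f1:.2%}")
# prints: accuracy=75.00% f1=66.67%
```

F1 is reported alongside accuracy because, when churners are a minority class, a model can reach high accuracy while missing most of them.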

Results

  • Final Test Accuracy: 75.05%
  • Final Test F1-Score: 62.38%
  • Best CV F1-score: 63.96%

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator.

  • Hardware Type: Standard local CPU training environment
  • Training profile: Classical ML grid-search over two model families

Author & Contact

Aashir Hameed
