Model Deployment

Learn how to save and deploy your trained MLFCrafter models using the DeployCrafter.

DeployCrafter Overview

The DeployCrafter handles model deployment by saving trained models and their associated artifacts (fitted scaler and training metadata) to disk. It supports two serialization formats: joblib and pickle.

Basic Deployment

Save Model with Default Settings

from mlfcrafter import MLFChain, DataIngestCrafter, CleanerCrafter, ModelCrafter, DeployCrafter

# Train and save model
pipeline = MLFChain(
    DataIngestCrafter("data.csv"),
    CleanerCrafter(strategy="auto"),
    ModelCrafter(),
    DeployCrafter()  # Auto-generates filename
)
result = pipeline.run()

print(f"Model saved to: {result['deployment_path']}")
print(f"Artifacts saved: {result['artifacts_saved']}")

Deployment Configuration

Custom Model Path

# Save to specific location
deployer = DeployCrafter(
    model_path="models/customer_model.joblib",
    save_format="joblib"
)

Serialization Formats

# Joblib format (recommended for sklearn models)
deployer = DeployCrafter(
    model_path="model.joblib",
    save_format="joblib"
)

# Pickle format (for compatibility)
deployer = DeployCrafter(
    model_path="model.pkl", 
    save_format="pickle"
)

Artifact Control

# Include all artifacts (default)
deployer = DeployCrafter(
    model_path="full_model.joblib",
    include_scaler=True,      # Include fitted scaler
    include_metadata=True     # Include training metadata
)

# Model only
deployer = DeployCrafter(
    model_path="model_only.joblib",
    include_scaler=False,
    include_metadata=False
)

Complete Deployment Pipeline

from mlfcrafter import MLFChain, DataIngestCrafter, CleanerCrafter, ScalerCrafter, ModelCrafter, ScorerCrafter, DeployCrafter

# Full pipeline with deployment
pipeline = MLFChain(
    # Data processing
    DataIngestCrafter("sales_data.csv"),
    CleanerCrafter(
        strategy="auto",
        str_fill="Unknown"
    ),
    ScalerCrafter(scaler_type="standard"),

    # Model training
    ModelCrafter(
        model_name="xgboost",
        model_params={
            "n_estimators": 200,
            "learning_rate": 0.05
        },
        test_size=0.2
    ),

    # Model evaluation
    ScorerCrafter(),

    # Model deployment
    DeployCrafter(
        model_path="production/sales_model_v1.joblib",
        save_format="joblib",
        include_scaler=True,     # Essential for production
        include_metadata=True    # Track model details
    )
)

result = pipeline.run()

# Deployment results
if result['deployment_successful']:
    print("✅ Model deployed successfully!")
    print(f"📍 Location: {result['deployment_path']}")
    print(f"📊 Test F1: {result['scores']['f1']:.4f}")
    print(f"🔧 Artifacts: {', '.join(result['artifacts_saved'])}")
else:
    print("❌ Deployment failed")

Loading Saved Models

Load Model for Inference

# Load saved model and artifacts
artifacts = DeployCrafter.load_model("production/sales_model_v1.joblib")

# Access components
model = artifacts["model"]
scaler = artifacts.get("scaler")      # May be None if not saved
metadata = artifacts.get("metadata")  # May be None if not saved

print(f"Model type: {type(model).__name__}")
if scaler:
    print(f"Scaler type: {type(scaler).__name__}")
if metadata:
    print(f"Training date: {metadata['timestamp']}")
    print(f"Features: {metadata['features']}")

Make Predictions

import pandas as pd

# Load model artifacts
artifacts = DeployCrafter.load_model("customer_model.joblib")
model = artifacts["model"]
scaler = artifacts.get("scaler")

# Load new data
new_data = pd.read_csv("new_customers.csv")

# Apply the same preprocessing if a scaler was saved
if scaler:
    # Feature columns come from metadata (requires include_metadata=True at deploy time)
    feature_cols = artifacts["metadata"]["features"]
    new_data_scaled = scaler.transform(new_data[feature_cols])
    predictions = model.predict(new_data_scaled)
else:
    predictions = model.predict(new_data)

print(f"Predictions: {predictions}")

Deployment Strategies

Versioned Models

from datetime import datetime

# Create versioned model name
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
model_path = f"models/customer_churn_v{timestamp}.joblib"

deployer = DeployCrafter(
    model_path=model_path,
    include_metadata=True  # Track version info
)
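To load the newest version back at inference time, note that the timestamp format above sorts lexicographically, so the latest file is simply the last one in sorted order. A minimal sketch using the standard library (the models/ directory and naming pattern follow the example above):

from pathlib import Path

from mlfcrafter import DeployCrafter

# Timestamped filenames sort lexicographically, so the last entry is the newest
model_files = sorted(Path("models").glob("customer_churn_v*.joblib"))
if model_files:
    artifacts = DeployCrafter.load_model(str(model_files[-1]))
    print(f"Loaded latest model: {model_files[-1].name}")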

Environment-Specific Deployment

import os

# Development environment
if os.getenv('ENV') == 'dev':
    deployer = DeployCrafter(
        model_path="dev/models/test_model.joblib",
        include_metadata=True
    )
# Production environment
else:
    deployer = DeployCrafter(
        model_path="prod/models/production_model.joblib",
        save_format="joblib",
        include_scaler=True,
        include_metadata=True
    )

Model Artifacts

What Gets Saved

When include_metadata=True, the following information is saved:

  • Model algorithm name and parameters
  • Feature names used for training
  • Target column name
  • Training and test scores
  • Dataset shape information
  • Scaler type (if scaler included)
  • Training timestamp
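Put together, a loaded metadata dictionary might look roughly like the sketch below. The keys model_name, features, test_score, and timestamp match the access patterns used elsewhere in this guide; the remaining keys and all values are hypothetical illustrations of the items listed above, so inspect metadata.keys() on your own deployment to confirm the exact schema.

# Hypothetical metadata contents -- values are illustrative, not real output
metadata = {
    "model_name": "xgboost",                 # algorithm name
    "model_params": {"n_estimators": 200},   # training parameters (assumed key)
    "features": ["age", "income", "tenure"], # feature names used for training
    "target": "churned",                     # target column (assumed key)
    "train_score": 0.97,                     # training score (assumed key)
    "test_score": 0.93,                      # test score
    "data_shape": (10000, 4),                # dataset shape (assumed key)
    "scaler_type": "StandardScaler",         # scaler type, if included (assumed key)
    "timestamp": "2024-01-15 10:30:00",      # training timestamp
}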

Accessing Metadata

artifacts = DeployCrafter.load_model("model.joblib")
metadata = artifacts["metadata"]

print(f"Model trained on: {metadata['timestamp']}")
print(f"Algorithm: {metadata['model_name']}")
print(f"Test accuracy: {metadata['test_score']:.4f}")
print(f"Features used: {metadata['features']}")

Best Practices

Always Include Scaler

Set include_scaler=True for production models so that inference inputs receive exactly the same preprocessing as the training data.

Use Metadata for Tracking

Enable include_metadata=True to track model versions and performance.

Check Deployment Success

Always verify the deployment_successful flag:

result = pipeline.run()
if not result['deployment_successful']:
    print(f"Deployment failed: {result.get('deployment_error', 'Unknown error')}")

Choose Appropriate Format

  • Use "joblib" for scikit-learn models (faster, optimized)
  • Use "pickle" for custom objects or compatibility needs

Example: Production Deployment

# Complete production deployment example
pipeline = MLFChain(
    # Data pipeline
    DataIngestCrafter("production_data.csv"),
    CleanerCrafter(strategy="auto"),
    ScalerCrafter(scaler_type="robust"),

    # Model training
    ModelCrafter(
        model_name="random_forest",
        model_params={"n_estimators": 200},
        test_size=0.2,
        stratify=True
    ),

    # Evaluation
    ScorerCrafter(),

    # Production deployment
    DeployCrafter(
        model_path="production/fraud_detection_v2.1.joblib",
        save_format="joblib",
        include_scaler=True,
        include_metadata=True
    )
)

result = pipeline.run()

# Validate deployment
if result['deployment_successful']:
    print("🚀 Production model deployed!")

    # Test loading
    artifacts = DeployCrafter.load_model(result['deployment_path'])
    print(f"✅ Model loading test successful")
    print(f"📊 F1 Score: {result['scores']['f1']:.4f}")
else:
    print("❌ Deployment failed - check logs")