Pipeline Basics¶
Learn the fundamental concepts of MLFCrafter pipelines and how to build effective ML workflows.
What is an MLFChain?¶
An MLFChain is the core orchestrator in MLFCrafter that manages the execution of multiple processing steps (crafters) in sequence. It provides:
- Sequential Processing: Crafters execute in the order they are added
- Data Flow Management: Each crafter's output is passed automatically to the next step
- Error Handling: Failures are handled gracefully, with support for rollback
- State Management: Pipeline state and intermediate results are tracked throughout the run
- Logging: Comprehensive logging of each step
Basic Pipeline Structure¶
```python
from mlfcrafter import MLFChain

# Create a new pipeline
pipeline = MLFChain()

# Add processing steps (CrafterA/B/C are placeholders for real crafters)
pipeline.add_crafter(CrafterA())
pipeline.add_crafter(CrafterB())
pipeline.add_crafter(CrafterC())

# Run the pipeline
result = pipeline.run()
```
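Depending on the MLFCrafter version, it may also be possible to pass crafters straight to the MLFChain constructor instead of calling add_crafter() repeatedly. The shorthand below, using the same placeholder crafters, is an assumption for illustration rather than a documented guarantee:

```python
from mlfcrafter import MLFChain

# Hypothetical shorthand -- assumes MLFChain accepts crafters as constructor arguments
pipeline = MLFChain(CrafterA(), CrafterB(), CrafterC())
result = pipeline.run()
```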
Crafter Lifecycle¶
Each crafter goes through several phases during execution, as sketched in the example after this list:
- Initialization: Crafter receives data from previous step
- Validation: Input data is validated
- Processing: Main logic execution
- Output: Results are prepared for next step
- Cleanup: Resources are released
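To make these phases concrete, here is a minimal sketch of a custom crafter. MLFCrafter's actual base class and method signature are not shown on this page, so the run(data) interface below is an assumption used purely for illustration:

```python
import pandas as pd

class LoggingCrafter:
    """Hypothetical crafter that walks through the lifecycle phases."""

    def run(self, data: pd.DataFrame) -> pd.DataFrame:
        # Initialization: data arrives here from the previous crafter

        # Validation: reject input the crafter cannot work with
        if data.empty:
            raise ValueError("LoggingCrafter received an empty DataFrame")

        try:
            # Processing: main logic (here, just a row count)
            result = data.copy()
            print(f"LoggingCrafter saw {len(result)} rows")

            # Output: the returned DataFrame becomes the next crafter's input
            return result
        finally:
            # Cleanup: release any resources (nothing to release in this sketch)
            pass
```

A crafter shaped like this could then be appended with pipeline.add_crafter(LoggingCrafter()) like any built-in step, assuming it matches the interface MLFCrafter expects.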
Data Flow Between Crafters¶
Data flows automatically between crafters:
```python
pipeline = MLFChain()
pipeline.add_crafter(DataIngestCrafter(data))  # Outputs: loaded DataFrame
pipeline.add_crafter(CleanerCrafter())         # Inputs: DataFrame, Outputs: clean DataFrame
pipeline.add_crafter(ScalerCrafter())          # Inputs: DataFrame, Outputs: scaled DataFrame
pipeline.add_crafter(ModelCrafter())           # Inputs: DataFrame, Outputs: trained model
```
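Putting the same flow together on a small in-memory dataset might look like the sketch below. The top-level crafter imports and the argument-free constructors are assumptions; real usage may need extra configuration, such as a target column for ModelCrafter:

```python
import pandas as pd

# Assumption: the crafters are importable from the top-level package, like MLFChain
from mlfcrafter import MLFChain, DataIngestCrafter, CleanerCrafter, ScalerCrafter, ModelCrafter

# Tiny made-up dataset so the data flow is concrete
data = pd.DataFrame({
    "feature_a": [1.0, 2.0, None, 4.0],
    "feature_b": [10, 20, 30, 40],
    "target": [0, 1, 0, 1],
})

pipeline = MLFChain()
pipeline.add_crafter(DataIngestCrafter(data))  # DataFrame enters the chain
pipeline.add_crafter(CleanerCrafter())         # missing feature_a value handled here
pipeline.add_crafter(ScalerCrafter())          # numeric columns rescaled
pipeline.add_crafter(ModelCrafter())           # model trained on the prepared data

result = pipeline.run()                        # each crafter's output feeds the next
```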
Conditional Execution¶
Add conditional logic to pipelines:
```python
pipeline = MLFChain()
pipeline.add_crafter(DataIngestCrafter(data))  # data: a pandas DataFrame loaded earlier

# Conditional cleaning based on data quality
if data.isnull().sum().sum() > 0:
    pipeline.add_crafter(CleanerCrafter(strategy='impute'))

pipeline.add_crafter(ModelCrafter())
result = pipeline.run()
```
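Because the chain is assembled with ordinary Python before run() is called, any build-time logic can decide which crafters end up in it. A slightly more general sketch, where the needs_scaling flag is hypothetical and stands in for whatever you learned during exploration:

```python
# Continues from the snippets above: `data`, MLFChain, and the crafters are already in scope
pipeline = MLFChain()
pipeline.add_crafter(DataIngestCrafter(data))

# Decide optional steps at build time with plain Python
needs_scaling = True  # hypothetical flag, e.g. decided during data exploration
optional_steps = []
if data.isnull().sum().sum() > 0:  # missing values present?
    optional_steps.append(CleanerCrafter(strategy='impute'))
if needs_scaling:
    optional_steps.append(ScalerCrafter())

for step in optional_steps:
    pipeline.add_crafter(step)

pipeline.add_crafter(ModelCrafter())
result = pipeline.run()
```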
Next Steps¶
- Explore Model Training options
- Read about Deployment strategies