ModelCrafter
ModelCrafter handles model selection, training, and evaluation. It supports three classification algorithms (random forest, logistic regression, and XGBoost) and splits the data into training and test sets before fitting.
Overview
from mlfcrafter import ModelCrafter
crafter = ModelCrafter(
    model_name="random_forest",
    model_params={"n_estimators": 100},
    test_size=0.2,
    random_state=61,
    stratify=True
)
Parameters
model_name (str)
Default: "random_forest"
Available Models:
- "random_forest": RandomForestClassifier
- "logistic_regression": LogisticRegression
- "xgboost": XGBClassifier
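The name-to-class mapping above can be pictured as a small lookup table. The following is only a sketch under assumptions, not ModelCrafter's actual implementation: `MODEL_REGISTRY`, `resolve_model_class`, and `build_model` are hypothetical names, and the module paths are the standard scikit-learn and XGBoost locations of the three classes. Classes are imported lazily, so XGBoost is only needed when "xgboost" is requested.

```python
import importlib

# Hypothetical registry: model_name -> (module path, class name).
MODEL_REGISTRY = {
    "random_forest": ("sklearn.ensemble", "RandomForestClassifier"),
    "logistic_regression": ("sklearn.linear_model", "LogisticRegression"),
    "xgboost": ("xgboost", "XGBClassifier"),
}


def resolve_model_class(model_name):
    """Return the estimator class for a supported model_name."""
    try:
        module_name, class_name = MODEL_REGISTRY[model_name]
    except KeyError:
        supported = ", ".join(sorted(MODEL_REGISTRY))
        raise ValueError(
            f"Unsupported model_name {model_name!r}; expected one of: {supported}"
        ) from None
    # Import only the library that the chosen model actually needs.
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def build_model(model_name="random_forest", model_params=None):
    """Instantiate the estimator, passing model_params as keyword arguments."""
    model_class = resolve_model_class(model_name)
    return model_class(**(model_params or {}))
```

With scikit-learn installed, `build_model("random_forest", {"n_estimators": 100})` would return an unfitted `RandomForestClassifier`; an unknown name such as "svm" raises a `ValueError` listing the supported names.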
model_params (Optional[Dict])
Default: {} (use the selected model's library defaults)
Hyperparameters for the selected model:
# Random Forest examples
{"n_estimators": 100, "max_depth": 10}
# XGBoost examples
{"learning_rate": 0.1, "max_depth": 6}
# Logistic Regression examples
{"C": 1.0, "max_iter": 1000}
test_size (float)
Default: 0.2
Proportion of the data held out for testing, given as a fraction between 0.0 and 1.0.
random_state (int)
Default: 61
Seed for reproducible results.
stratify (bool)
Default: True
Whether to preserve the target's class proportions in both the training and test sets.
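To make the effect of stratification concrete, here is a minimal standard-library sketch of a stratified split. It is an illustration only: `stratified_split` is a hypothetical helper, and ModelCrafter presumably delegates the real split to its underlying ML library rather than doing anything like this by hand. Each class is shuffled and split separately, so the test set keeps roughly the same class mix as the full dataset.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_size=0.2, random_state=61):
    """Return (train_indices, test_indices), splitting each class separately."""
    rng = random.Random(random_state)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Hold out the same fraction of every class.
        n_test = round(len(indices) * test_size)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Imbalanced toy target: 80% class "a", 20% class "b".
labels = ["a"] * 80 + ["b"] * 20
train_idx, test_idx = stratified_split(labels, test_size=0.2)

train_counts = Counter(labels[i] for i in train_idx)
test_counts = Counter(labels[i] for i in test_idx)
print(train_counts, test_counts)
```

Both subsets keep the 80/20 class ratio. Without stratification, a purely random split of a small or imbalanced dataset can leave a minority class under-represented in the test set, or missing from it entirely.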
Context Input
- data: Dataset (required)
- target_column: Name of target variable column (required)
Context Output
- model: Trained model object
- X_train, X_test: Feature splits
- y_train, y_test: Target splits
- y_pred: Predictions on test set
- train_score: Training accuracy
- test_score: Test accuracy
- model_name: Algorithm name used
- features: List of feature column names
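The two score entries are accuracies, meaning the fraction of predictions that match the true labels. The sketch below shows that calculation on made-up values, assuming plain classification accuracy; the `accuracy` helper and the sample labels are illustrative and not taken from the library.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up test labels and predictions: 4 of 5 predictions are correct.
y_test = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]
test_score = accuracy(y_test, y_pred)
print(test_score)
```

A training accuracy that is much higher than the test accuracy is a common sign of overfitting, which is why both values are worth comparing.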
Example Usage
from mlfcrafter import MLFChain
from mlfcrafter.crafters import *
# Basic Random Forest
pipeline = MLFChain()
pipeline.add_crafter(DataIngestCrafter("data.csv"))
pipeline.add_crafter(ModelCrafter())
result = pipeline.run()
# Tuned XGBoost
pipeline = MLFChain()
pipeline.add_crafter(DataIngestCrafter("data.csv"))
pipeline.add_crafter(ModelCrafter(
    model_name="xgboost",
    model_params={
        "n_estimators": 200,
        "learning_rate": 0.1,
        "max_depth": 6
    },
    test_size=0.25
))
result = pipeline.run()
# Logistic Regression
pipeline = MLFChain()
pipeline.add_crafter(DataIngestCrafter("data.csv"))
pipeline.add_crafter(ModelCrafter(
    model_name="logistic_regression",
    model_params={"C": 0.5, "max_iter": 2000},
    stratify=False
))
result = pipeline.run()