Comparing Python ML Libraries: Scikit-learn vs TensorFlow vs PyTorch

A comparison of the three most popular machine learning libraries with practical examples, qualitative performance comparisons, and guidance on choosing the right tool for your projects.

Dery Febriantara Developer

Choosing the right machine learning library can significantly impact your productivity, model performance, and deployment options. This guide walks through the three most popular Python ML libraries: Scikit-learn, TensorFlow, and PyTorch. By the end, you should have a clear sense of which tool fits which kind of problem.

Overview: The Big Three

Before we dive into details, let’s understand what each library is designed for:

Library        Primary Use     Created By                      First Release
Scikit-learn   Classical ML    David Cournapeau, later INRIA   2010 (project started 2007)
TensorFlow     Deep Learning   Google                          2015
PyTorch        Deep Learning   Facebook/Meta                   2016

When to Use Each

  • Scikit-learn: Tabular data, quick prototyping, classical algorithms
  • TensorFlow: Production deployment, mobile/edge devices, large-scale systems
  • PyTorch: Research, experimentation, custom architectures, rapid iteration

Scikit-learn: The Swiss Army Knife

Scikit-learn is the go-to library for traditional machine learning. It provides a consistent, well-documented API that makes it easy to experiment with different algorithms.

Philosophy and Design

Scikit-learn follows a simple design philosophy:

  1. Consistent API: Every estimator has .fit(); predictors add .predict() and transformers add .transform()
  2. Composability: Build pipelines with preprocessing and models
  3. Sensible defaults: Works out of the box with good hyperparameters
  4. Extensive documentation: Every function is thoroughly documented
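
The shared interface described in the list above can be sketched on synthetic data. This is a minimal illustration rather than a recommended workflow: the dataset comes from `make_classification`, and the variable names (`X_demo`, `pipe`, and so on) are assumptions made for this example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption for illustration)
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=0)

# Transformers expose fit() and transform()
scaler = StandardScaler().fit(X_demo)
X_scaled = scaler.transform(X_demo)

# Predictors expose fit() and predict()
clf = LogisticRegression().fit(X_scaled, y_demo)
preds = clf.predict(X_scaled)

# A pipeline chains both steps behind the same fit()/predict() calls
pipe = make_pipeline(StandardScaler(), LogisticRegression())
pipe_preds = pipe.fit(X_demo, y_demo).predict(X_demo)

print(preds[:5], pipe_preds[:5])
```

Because the pipeline performs the same scaling and fitting steps, its predictions match the manual two-step version.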

Strengths

  • Simple, consistent API: Learn once, apply everywhere
  • Excellent documentation: Tutorials, examples, user guide
  • Great for classical ML: SVM, Random Forest, Gradient Boosting, etc.
  • Built-in preprocessing: Scaling, encoding, feature selection
  • Model selection tools: Cross-validation, grid search, metrics
  • Integration: Works seamlessly with NumPy, Pandas, and visualization libraries

Complete Example: Classification Pipeline

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split, cross_val_score, GridSearchCV
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score
import matplotlib.pyplot as plt

# Build a synthetic dataset for illustration
# The target is random, so scores near chance (ROC-AUC around 0.5) are expected
np.random.seed(42)
n_samples = 1000

data = pd.DataFrame({
    'age': np.random.randint(18, 70, n_samples),
    'income': np.random.normal(50000, 20000, n_samples),
    'education': np.random.choice(['high_school', 'bachelor', 'master', 'phd'], n_samples),
    'credit_score': np.random.randint(300, 850, n_samples).astype(float),  # float so NaN can be assigned
    'years_employed': np.random.randint(0, 40, n_samples),
    'approved': np.random.randint(0, 2, n_samples)
})

# Add some missing values (replace=False guarantees 50 and 30 distinct rows)
data.loc[np.random.choice(data.index, 50, replace=False), 'income'] = np.nan
data.loc[np.random.choice(data.index, 30, replace=False), 'credit_score'] = np.nan

# Separate features and target
X = data.drop('approved', axis=1)
y = data['approved']

# Identify column types
numeric_features = ['age', 'income', 'credit_score', 'years_employed']
categorical_features = ['education']

# Create preprocessing pipelines
numeric_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='median')),
    ('scaler', StandardScaler())
])

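# Ordinal encoding can suit tree models when categories have a natural order.
# (LabelEncoder is meant for target labels and does not work inside a feature Pipeline.)
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder
categorical_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='constant', fill_value='missing')),
    ('encoder', OrdinalEncoder(handle_unknown='use_encoded_value', unknown_value=-1))
])

# One-hot encoding is the safer default for unordered categories (used below)
categorical_transformer_onehot = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='constant', fill_value='missing')),
    ('encoder', OneHotEncoder(handle_unknown='ignore'))
])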

# Combine preprocessors
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, numeric_features),
        ('cat', categorical_transformer_onehot, categorical_features)
    ])

# Create full pipeline with classifier
pipeline = Pipeline(steps=[
    ('preprocessor', preprocessor),
    ('classifier', RandomForestClassifier(random_state=42))
])

# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Train and evaluate
pipeline.fit(X_train, y_train)
y_pred = pipeline.predict(X_test)
y_proba = pipeline.predict_proba(X_test)[:, 1]

print("Classification Report:")
print(classification_report(y_test, y_pred))
print(f"ROC-AUC Score: {roc_auc_score(y_test, y_proba):.4f}")

# Cross-validation
cv_scores = cross_val_score(pipeline, X, y, cv=5, scoring='roc_auc')
print(f"\nCross-validation ROC-AUC: {cv_scores.mean():.4f} (+/- {cv_scores.std()*2:.4f})")
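
Since the sample data uses a randomly generated target, it helps to know what "no signal" looks like before reading the scores. The sketch below compares a `DummyClassifier` baseline with a random forest on purely random data; the arrays `X_rand` and `y_rand` are assumptions made for this illustration and are not part of the pipeline above.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
X_rand = rng.normal(size=(500, 4))       # features that carry no signal
y_rand = rng.integers(0, 2, size=500)    # labels drawn at random

baseline = DummyClassifier(strategy='prior')
forest = RandomForestClassifier(n_estimators=50, random_state=42)

# Constant predicted probabilities give a ROC-AUC of exactly 0.5
baseline_auc = cross_val_score(baseline, X_rand, y_rand, cv=5, scoring='roc_auc').mean()
forest_auc = cross_val_score(forest, X_rand, y_rand, cv=5, scoring='roc_auc').mean()

print(f"Baseline ROC-AUC: {baseline_auc:.3f}")
print(f"Random forest ROC-AUC: {forest_auc:.3f}")
```

A model that cannot clearly beat this kind of baseline on real data has probably not learned anything useful.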

Hyperparameter Tuning

# Grid search with cross-validation
param_grid = {
    'classifier__n_estimators': [50, 100, 200],
    'classifier__max_depth': [5, 10, 20, None],
    'classifier__min_samples_split': [2, 5, 10],
    'classifier__min_samples_leaf': [1, 2, 4]
}

grid_search = GridSearchCV(
    pipeline,
    param_grid,
    cv=5,
    scoring='roc_auc',
    n_jobs=-1,
    verbose=1
)

grid_search.fit(X_train, y_train)

print(f"Best parameters: {grid_search.best_params_}")
print(f"Best CV score: {grid_search.best_score_:.4f}")

# Evaluate best model
best_model = grid_search.best_estimator_
y_pred_best = best_model.predict(X_test)
y_proba_best = best_model.predict_proba(X_test)[:, 1]
print(f"Test ROC-AUC: {roc_auc_score(y_test, y_proba_best):.4f}")
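
The grid above contains 3 × 4 × 3 × 3 = 108 candidates, which means 540 model fits with 5-fold cross-validation. When that is too slow, a randomized search over the same ranges is a common alternative. The following is a sketch under assumptions: it uses a synthetic dataset and a bare classifier instead of the full pipeline, and `n_iter=10` is an arbitrary budget.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import ParameterGrid, RandomizedSearchCV

# Count the candidates in a grid shaped like the one above
grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [5, 10, 20, None],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4],
}
n_grid_candidates = len(ParameterGrid(grid))
print(f"Grid candidates: {n_grid_candidates}")

# Synthetic stand-in data (an assumption for illustration)
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=0)

# Sample a fixed number of candidates from ranges instead of trying every combination
param_distributions = {
    'n_estimators': randint(50, 201),
    'max_depth': [5, 10, 20, None],
    'min_samples_split': randint(2, 11),
    'min_samples_leaf': randint(1, 5),
}
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions,
    n_iter=10,
    cv=3,
    scoring='roc_auc',
    random_state=42,
)
search.fit(X_demo, y_demo)

print(f"Sampled candidates: {len(search.cv_results_['params'])}")
print(f"Best parameters: {search.best_params_}")
```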

Comparing Multiple Algorithms

from sklearn.model_selection import cross_validate

# Define models to compare
models = {
    'Logistic Regression': LogisticRegression(max_iter=1000),
    'Random Forest': RandomForestClassifier(n_estimators=100, random_state=42),
    'Gradient Boosting': GradientBoostingClassifier(n_estimators=100, random_state=42),
    'SVM': SVC(probability=True, random_state=42)
}

# Compare models
results = {}
for name, model in models.items():
    # Create pipeline with each model
    clf_pipeline = Pipeline(steps=[
        ('preprocessor', preprocessor),
        ('classifier', model)
    ])

    # Cross-validate
    cv_results = cross_validate(
        clf_pipeline, X, y,
        cv=5,
        scoring=['accuracy', 'roc_auc', 'f1'],
        return_train_score=True
    )

    results[name] = {
        'accuracy': cv_results['test_accuracy'].mean(),
        'roc_auc': cv_results['test_roc_auc'].mean(),
        'f1': cv_results['test_f1'].mean()
    }

# Display results
results_df = pd.DataFrame(results).T
print(results_df.round(4))

When to Use Scikit-learn

Perfect for:

  • Tabular/structured data (CSV, databases)
  • Quick prototyping and experimentation
  • Classical ML algorithms (SVM, trees, linear models)
  • Feature engineering and preprocessing
  • Model selection and hyperparameter tuning
  • Small to medium datasets

Not ideal for:

  • Deep learning and neural networks
  • Image, video, or audio processing
  • Large-scale distributed training
  • GPU acceleration
  • State-of-the-art NLP models

TensorFlow: Production-Ready Deep Learning

TensorFlow is Google’s deep learning framework, designed for production deployment at scale.

Philosophy and Design

TensorFlow prioritizes:

  1. Production readiness: Easy deployment to servers, mobile, browsers
  2. Ecosystem: TensorBoard, TensorFlow Lite, TensorFlow.js, TFX
  3. Scalability: Distributed training across multiple GPUs/TPUs
  4. Keras integration: High-level API for rapid development

Strengths

  • Production deployment: TensorFlow Serving, TF Lite, TF.js
  • Comprehensive ecosystem: Tools for every part of the ML lifecycle
  • TensorBoard: Excellent visualization and debugging
  • Keras API: User-friendly high-level interface
  • TPU support: Native support for Google’s TPUs
  • Mobile and edge: Deploy to phones, IoT devices, browsers

Complete Example: Image Classification

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import numpy as np

# Check GPU availability
print(f"TensorFlow version: {tf.__version__}")
print(f"GPU available: {tf.config.list_physical_devices('GPU')}")

# Load CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()

# Preprocessing
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# Convert labels to categorical
num_classes = 10
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

print(f"Training data shape: {x_train.shape}")
print(f"Test data shape: {x_test.shape}")

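# Data augmentation
# Note: ImageDataGenerator is deprecated in recent TensorFlow releases;
# Keras preprocessing layers or tf.data pipelines are the suggested replacements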
datagen = ImageDataGenerator(
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,
    zoom_range=0.1
)
datagen.fit(x_train)

# Build CNN model
def create_cnn_model(input_shape, num_classes):
    model = keras.Sequential([
        # First convolutional block
        layers.Conv2D(32, (3, 3), padding='same', input_shape=input_shape),
        layers.BatchNormalization(),
        layers.Activation('relu'),
        layers.Conv2D(32, (3, 3), padding='same'),
        layers.BatchNormalization(),
        layers.Activation('relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),

        # Second convolutional block
        layers.Conv2D(64, (3, 3), padding='same'),
        layers.BatchNormalization(),
        layers.Activation('relu'),
        layers.Conv2D(64, (3, 3), padding='same'),
        layers.BatchNormalization(),
        layers.Activation('relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),

        # Third convolutional block
        layers.Conv2D(128, (3, 3), padding='same'),
        layers.BatchNormalization(),
        layers.Activation('relu'),
        layers.Conv2D(128, (3, 3), padding='same'),
        layers.BatchNormalization(),
        layers.Activation('relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),

        # Dense layers
        layers.Flatten(),
        layers.Dense(512),
        layers.BatchNormalization(),
        layers.Activation('relu'),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation='softmax')
    ])

    return model

model = create_cnn_model((32, 32, 3), num_classes)

# Compile model
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.001),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

model.summary()

# Callbacks
callbacks = [
    keras.callbacks.EarlyStopping(
        monitor='val_loss',
        patience=10,
        restore_best_weights=True
    ),
    keras.callbacks.ReduceLROnPlateau(
        monitor='val_loss',
        factor=0.5,
        patience=5,
        min_lr=1e-6
    ),
    keras.callbacks.TensorBoard(
        log_dir='./logs',
        histogram_freq=1
    ),
    keras.callbacks.ModelCheckpoint(
        'best_model.keras',
        save_best_only=True,
        monitor='val_accuracy'
    )
]

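# Train model
# Note: the test set doubles as validation data here for brevity. Because early
# stopping and checkpointing select on it, hold out a separate validation split
# when you need an unbiased test score.
history = model.fit(
    datagen.flow(x_train, y_train, batch_size=64),
    epochs=100,
    validation_data=(x_test, y_test),
    callbacks=callbacks,
    verbose=1
)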

# Evaluate
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print(f"\nTest accuracy: {test_acc:.4f}")
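
The augmentation step in the example above can also be expressed with Keras preprocessing layers. The sketch below is an approximate mapping of the same settings, not an exact equivalent of `ImageDataGenerator`; the name `augment` and the random placeholder images are assumptions made for this illustration.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Approximate counterparts of the ImageDataGenerator settings (assumed mapping)
augment = keras.Sequential([
    layers.RandomFlip('horizontal'),
    layers.RandomRotation(15 / 360),       # factor is a fraction of a full turn
    layers.RandomTranslation(0.1, 0.1),    # height and width shift fractions
    layers.RandomZoom(0.1),
])

# Placeholder batch shaped like CIFAR-10 images
images = np.random.rand(8, 32, 32, 3).astype('float32')

# The layers only apply random transforms when training=True
augmented = augment(images, training=True)
print(augmented.shape)
```

These layers can be placed at the start of a model, so augmentation runs on the same device as training and is switched off automatically at inference time.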

Transfer Learning with TensorFlow

from tensorflow.keras.applications import ResNet50, VGG16, MobileNetV2
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense, Dropout

def create_transfer_learning_model(base_model_name='resnet50', num_classes=10):
    # Choose base model
    base_models = {
        'resnet50': ResNet50,
        'vgg16': VGG16,
        'mobilenet': MobileNetV2
    }

    # Load pretrained model without top layers
    base_model = base_models[base_model_name](
        weights='imagenet',
        include_top=False,
        input_shape=(224, 224, 3)
    )

    # Freeze base model layers
    base_model.trainable = False

    # Build model
    inputs = keras.Input(shape=(224, 224, 3))
    x = base_model(inputs, training=False)
    x = GlobalAveragePooling2D()(x)
    x = Dense(256, activation='relu')(x)
    x = Dropout(0.5)(x)
    outputs = Dense(num_classes, activation='softmax')(x)

    model = keras.Model(inputs, outputs)

    return model, base_model

# Create and compile
model, base_model = create_transfer_learning_model('mobilenet', num_classes=10)
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.001),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# After initial training, fine-tune the base model
def fine_tune_model(model, base_model, fine_tune_at=100):
    # Unfreeze top layers of base model
    base_model.trainable = True

    # Freeze layers before fine_tune_at
    for layer in base_model.layers[:fine_tune_at]:
        layer.trainable = False

    # Recompile with lower learning rate
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-5),
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )

    return model

Custom Training Loop in TensorFlow

# For more control, use custom training loops
@tf.function
def train_step(model, optimizer, loss_fn, x_batch, y_batch):
    with tf.GradientTape() as tape:
        predictions = model(x_batch, training=True)
        loss = loss_fn(y_batch, predictions)

    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))

    return loss

@tf.function
def test_step(model, loss_fn, x_batch, y_batch):
    predictions = model(x_batch, training=False)
    loss = loss_fn(y_batch, predictions)
    accuracy = tf.reduce_mean(
        tf.cast(tf.argmax(predictions, axis=1) == tf.argmax(y_batch, axis=1), tf.float32)
    )
    return loss, accuracy

# Training loop
def custom_training(model, train_dataset, test_dataset, epochs=10):
    optimizer = keras.optimizers.Adam(learning_rate=0.001)
    loss_fn = keras.losses.CategoricalCrossentropy()

    for epoch in range(epochs):
        # Training
        train_losses = []
        for x_batch, y_batch in train_dataset:
            loss = train_step(model, optimizer, loss_fn, x_batch, y_batch)
            train_losses.append(loss.numpy())

        # Validation
        test_losses = []
        test_accuracies = []
        for x_batch, y_batch in test_dataset:
            loss, acc = test_step(model, loss_fn, x_batch, y_batch)
            test_losses.append(loss.numpy())
            test_accuracies.append(acc.numpy())

        print(f"Epoch {epoch+1}: "
              f"Train Loss = {np.mean(train_losses):.4f}, "
              f"Test Loss = {np.mean(test_losses):.4f}, "
              f"Test Acc = {np.mean(test_accuracies):.4f}")
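
The `custom_training` function above expects datasets that yield `(x_batch, y_batch)` pairs. One way to build them is with `tf.data`; the sketch below uses random placeholder arrays shaped like CIFAR-10 with one-hot labels, which is an assumption made for illustration.

```python
import numpy as np
import tensorflow as tf

# Placeholder arrays shaped like CIFAR-10 with one-hot labels (not real data)
x = np.random.rand(256, 32, 32, 3).astype('float32')
y = tf.keras.utils.to_categorical(np.random.randint(0, 10, size=256), 10)

# Shuffle and batch the training portion; prefetch overlaps loading with training
train_dataset = (
    tf.data.Dataset.from_tensor_slices((x[:200], y[:200]))
    .shuffle(buffer_size=200)
    .batch(64)
    .prefetch(tf.data.AUTOTUNE)
)
test_dataset = tf.data.Dataset.from_tensor_slices((x[200:], y[200:])).batch(64)

# Each element is one batch of images and labels
x_batch, y_batch = next(iter(train_dataset))
print(x_batch.shape, y_batch.shape)
```

With these in place, the loop would be started as `custom_training(model, train_dataset, test_dataset, epochs=10)`.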

Saving and Deploying TensorFlow Models

# Save full model in the native Keras format
model.save('my_model.keras')

# Export a SavedModel for serving
# (model.export() is available in recent TensorFlow/Keras versions; older
# TF 2.x releases wrote a SavedModel via model.save('saved_model/my_model'))
model.export('saved_model/my_model')

# Convert to TensorFlow Lite for mobile
converter = tf.lite.TFLiteConverter.from_saved_model('saved_model/my_model')
tflite_model = converter.convert()

with open('model.tflite', 'wb') as f:
    f.write(tflite_model)

# Quantize for smaller model size
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quantized_model = converter.convert()

with open('model_quantized.tflite', 'wb') as f:
    f.write(quantized_model)

# Export to TensorFlow.js for browser
# Run in terminal: tensorflowjs_converter --input_format=tf_saved_model saved_model/my_model tfjs_model/

When to Use TensorFlow

Perfect for:

  • Production deployment at scale
  • Mobile and edge deployment (TF Lite)
  • Browser-based ML (TensorFlow.js)
  • Large-scale distributed training
  • TPU training on Google Cloud
  • Enterprise environments

Not ideal for:

  • Quick research prototyping (use PyTorch)
  • When you need maximum flexibility
  • Small projects where Keras overhead isn’t worth it

PyTorch: Research-First Deep Learning

PyTorch is a deep learning framework originally developed at Facebook (now Meta) and today governed by the PyTorch Foundation. Researchers favor it for its flexibility and Pythonic design.

Philosophy and Design

PyTorch prioritizes:

  1. Pythonic: Feels like writing regular Python
  2. Dynamic graphs: Define-by-run for flexibility
  3. Debugging: Standard Python debugging tools work
  4. Research-friendly: Easy to experiment with new ideas

Strengths

  • Dynamic computation graphs: Flexibility for complex architectures
  • Intuitive: Feels like native Python
  • Excellent debugging: Use pdb, print statements, etc.
  • Strong community: Widely used in published research, so reference implementations are easy to find
  • Hugging Face integration: State-of-the-art NLP models
  • TorchScript: Production deployment option

Complete Example: Image Classification

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, Dataset
import torchvision
import torchvision.transforms as transforms
from tqdm import tqdm
import numpy as np

# Check GPU availability
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

# Data transformations
train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(32, padding=4),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))
])

test_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))
])

# Load CIFAR-10
train_dataset = torchvision.datasets.CIFAR10(
    root='./data', train=True, download=True, transform=train_transform
)
test_dataset = torchvision.datasets.CIFAR10(
    root='./data', train=False, download=True, transform=test_transform
)

train_loader = DataLoader(train_dataset, batch_size=128, shuffle=True, num_workers=4)
test_loader = DataLoader(test_dataset, batch_size=256, shuffle=False, num_workers=4)

# Define CNN architecture
class CNN(nn.Module):
    def __init__(self, num_classes=10):
        super(CNN, self).__init__()

        # Feature extractor
        self.features = nn.Sequential(
            # Block 1
            nn.Conv2d(3, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2),
            nn.Dropout(0.25),

            # Block 2
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, 128, kernel_size=3, padding=1),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2),
            nn.Dropout(0.25),

            # Block 3
            nn.Conv2d(128, 256, kernel_size=3, padding=1),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2),
            nn.Dropout(0.25),
        )

        # Classifier
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(256 * 4 * 4, 512),
            nn.BatchNorm1d(512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(512, num_classes)
        )

    def forward(self, x):
        x = self.features(x)
        x = self.classifier(x)
        return x

# Initialize model
model = CNN(num_classes=10).to(device)

# Count parameters
total_params = sum(p.numel() for p in model.parameters())
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Total parameters: {total_params:,}")
print(f"Trainable parameters: {trainable_params:,}")

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.AdamW(model.parameters(), lr=0.001, weight_decay=0.01)

# Learning rate scheduler
scheduler = optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=0.01,
    epochs=50,
    steps_per_epoch=len(train_loader)
)

# Training function
def train_epoch(model, loader, criterion, optimizer, scheduler, device):
    model.train()
    running_loss = 0.0
    correct = 0
    total = 0

    pbar = tqdm(loader, desc='Training')
    # Track the batch count explicitly; pbar.n is not refreshed on every iteration
    for batch_idx, (inputs, targets) in enumerate(pbar, start=1):
        inputs, targets = inputs.to(device), targets.to(device)

        # Forward pass
        outputs = model(inputs)
        loss = criterion(outputs, targets)

        # Backward pass
        optimizer.zero_grad()
        loss.backward()

        # Gradient clipping
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)

        optimizer.step()
        scheduler.step()

        # Statistics
        running_loss += loss.item()
        _, predicted = outputs.max(1)
        total += targets.size(0)
        correct += predicted.eq(targets).sum().item()

        pbar.set_postfix({
            'loss': running_loss / batch_idx,
            'acc': 100. * correct / total
        })

    return running_loss / len(loader), correct / total

# Evaluation function
@torch.no_grad()
def evaluate(model, loader, criterion, device):
    model.eval()
    running_loss = 0.0
    correct = 0
    total = 0

    for inputs, targets in loader:
        inputs, targets = inputs.to(device), targets.to(device)

        outputs = model(inputs)
        loss = criterion(outputs, targets)

        running_loss += loss.item()
        _, predicted = outputs.max(1)
        total += targets.size(0)
        correct += predicted.eq(targets).sum().item()

    return running_loss / len(loader), correct / total

# Training loop
best_acc = 0
epochs = 50

for epoch in range(epochs):
    print(f"\nEpoch {epoch+1}/{epochs}")

    train_loss, train_acc = train_epoch(
        model, train_loader, criterion, optimizer, scheduler, device
    )

    test_loss, test_acc = evaluate(model, test_loader, criterion, device)

    print(f"Train Loss: {train_loss:.4f}, Train Acc: {train_acc:.2%}")
    print(f"Test Loss: {test_loss:.4f}, Test Acc: {test_acc:.2%}")

    # Save best model
    if test_acc > best_acc:
        best_acc = test_acc
        torch.save({
            'epoch': epoch,
            'model_state_dict': model.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
            'best_acc': best_acc,
        }, 'best_model.pth')
        print(f"New best model saved with accuracy: {best_acc:.2%}")

print(f"\nBest Test Accuracy: {best_acc:.2%}")
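
A checkpoint saved in the format above can be restored to resume training or run inference. The sketch below performs a save and load round trip on a tiny stand-in model; `tiny_model`, the temporary path, and the stored values are placeholders chosen for this illustration, not real results.

```python
import os
import tempfile

import torch
import torch.nn as nn
import torch.optim as optim

tiny_model = nn.Linear(4, 2)   # stand-in for the CNN
tiny_optimizer = optim.AdamW(tiny_model.parameters(), lr=1e-3)

# Save a checkpoint with the same keys as the training loop above
path = os.path.join(tempfile.mkdtemp(), 'best_model.pth')
torch.save({
    'epoch': 3,
    'model_state_dict': tiny_model.state_dict(),
    'optimizer_state_dict': tiny_optimizer.state_dict(),
    'best_acc': 0.5,   # placeholder value
}, path)

# Restore into freshly constructed objects of the same architecture
restored = nn.Linear(4, 2)
restored_optimizer = optim.AdamW(restored.parameters(), lr=1e-3)

checkpoint = torch.load(path, map_location='cpu')
restored.load_state_dict(checkpoint['model_state_dict'])
restored_optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
start_epoch = checkpoint['epoch'] + 1   # continue after the saved epoch

restored.eval()   # switch to inference mode before evaluating
print(f"Resuming from epoch {start_epoch}")
```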

Transfer Learning with PyTorch

import torchvision.models as models

def create_transfer_model(num_classes, model_name='resnet18', pretrained=True):
    # Load pretrained model
    if model_name == 'resnet18':
        model = models.resnet18(weights='IMAGENET1K_V1' if pretrained else None)
        num_features = model.fc.in_features
        model.fc = nn.Linear(num_features, num_classes)

    elif model_name == 'resnet50':
        model = models.resnet50(weights='IMAGENET1K_V1' if pretrained else None)
        num_features = model.fc.in_features
        model.fc = nn.Linear(num_features, num_classes)

    elif model_name == 'efficientnet':
        model = models.efficientnet_b0(weights='IMAGENET1K_V1' if pretrained else None)
        num_features = model.classifier[1].in_features
        model.classifier = nn.Sequential(
            nn.Dropout(0.2),
            nn.Linear(num_features, num_classes)
        )

    return model

# Freeze and unfreeze layers for fine-tuning
def freeze_layers(model, freeze_until='layer3'):
    for name, param in model.named_parameters():
        if freeze_until in name:
            break
        param.requires_grad = False

def unfreeze_all(model):
    for param in model.parameters():
        param.requires_grad = True

# Usage
model = create_transfer_model(num_classes=10, model_name='resnet18')
freeze_layers(model)  # Freezes everything before layer3; layer3, layer4, and fc stay trainable

# After a few epochs, unfreeze for fine-tuning
unfreeze_all(model)

Custom Loss Functions in PyTorch

class FocalLoss(nn.Module):
    """Focal Loss for addressing class imbalance."""

    def __init__(self, alpha=1, gamma=2, reduction='mean'):
        super(FocalLoss, self).__init__()
        self.alpha = alpha
        self.gamma = gamma
        self.reduction = reduction

    def forward(self, inputs, targets):
        ce_loss = nn.functional.cross_entropy(inputs, targets, reduction='none')
        pt = torch.exp(-ce_loss)
        focal_loss = self.alpha * (1 - pt) ** self.gamma * ce_loss

        if self.reduction == 'mean':
            return focal_loss.mean()
        elif self.reduction == 'sum':
            return focal_loss.sum()
        return focal_loss

class LabelSmoothingLoss(nn.Module):
    """Label smoothing for better generalization."""

    def __init__(self, num_classes, smoothing=0.1):
        super(LabelSmoothingLoss, self).__init__()
        self.confidence = 1.0 - smoothing
        self.smoothing = smoothing
        self.num_classes = num_classes

    def forward(self, pred, target):
        pred = pred.log_softmax(dim=-1)
        with torch.no_grad():
            true_dist = torch.zeros_like(pred)
            true_dist.fill_(self.smoothing / (self.num_classes - 1))
            true_dist.scatter_(1, target.unsqueeze(1), self.confidence)
        return torch.mean(torch.sum(-true_dist * pred, dim=-1))
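
Two quick sanity checks make these losses easier to trust. With `gamma=0` and `alpha=1`, focal loss should reduce to plain cross-entropy, and the smoothed target distribution should sum to 1 for every sample. The sketch below uses a functional rewrite of the focal loss and random tensors, both of which are assumptions made for this illustration.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=1.0, gamma=2.0):
    """Functional form of the FocalLoss module above (mean reduction)."""
    ce = F.cross_entropy(logits, targets, reduction='none')
    pt = torch.exp(-ce)   # model probability assigned to the true class
    return (alpha * (1 - pt) ** gamma * ce).mean()

torch.manual_seed(0)
logits = torch.randn(16, 5)
targets = torch.randint(0, 5, (16,))

ce_mean = F.cross_entropy(logits, targets)
fl_gamma0 = focal_loss(logits, targets, gamma=0.0)   # modulating factor becomes 1
fl_gamma2 = focal_loss(logits, targets, gamma=2.0)   # well-classified samples are down-weighted

print(f"CE: {ce_mean:.4f}, focal (gamma=0): {fl_gamma0:.4f}, focal (gamma=2): {fl_gamma2:.4f}")

# Label smoothing: the true class gets 1 - smoothing, the rest share smoothing equally
num_classes, smoothing = 5, 0.1
true_dist = torch.full((16, num_classes), smoothing / (num_classes - 1))
true_dist.scatter_(1, targets.unsqueeze(1), 1.0 - smoothing)
print(true_dist.sum(dim=1))
```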

Mixed Precision Training

# Note: recent PyTorch releases deprecate torch.cuda.amp in favor of torch.amp
from torch.cuda.amp import autocast, GradScaler

# Initialize gradient scaler for mixed precision
scaler = GradScaler()

def train_epoch_mixed_precision(model, loader, criterion, optimizer, scaler, device):
    model.train()
    running_loss = 0.0

    for inputs, targets in loader:
        inputs, targets = inputs.to(device), targets.to(device)

        optimizer.zero_grad()

        # Mixed precision forward pass
        with autocast():
            outputs = model(inputs)
            loss = criterion(outputs, targets)

        # Scaled backward pass
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()

        running_loss += loss.item()

    return running_loss / len(loader)
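
To see what autocast does without needing a GPU, the same context manager can be tried on the CPU with `bfloat16`. This is a small sketch assuming a PyTorch build with CPU autocast support; the layer and input sizes are arbitrary.

```python
import torch
import torch.nn as nn

layer = nn.Linear(16, 4)
x = torch.randn(8, 16)

# Inside autocast, eligible ops such as linear layers run in the lower-precision dtype
with torch.autocast(device_type='cpu', dtype=torch.bfloat16):
    y_amp = layer(x)

# Outside the context, the same layer computes in float32 again
y_fp32 = layer(x)

print(y_amp.dtype, y_fp32.dtype)
print(layer.weight.dtype)   # the stored parameters are not converted
```

The parameters themselves stay in full precision; only the computation inside the context is cast, which is why the training function above can keep using a regular optimizer.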

When to Use PyTorch

Perfect for:

  • Research and experimentation
  • Custom neural network architectures
  • When debugging and flexibility are important
  • Working with Hugging Face transformers
  • Academic projects and publications
  • Quick prototyping of new ideas

Not ideal for:

  • Production deployment without additional tools
  • Mobile deployment (though improving)
  • When TensorFlow ecosystem is required

Head-to-Head Comparison

Syntax Comparison

Creating a simple neural network:

# Scikit-learn (using MLPClassifier)
from sklearn.neural_network import MLPClassifier
model = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=1000)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

# TensorFlow/Keras
model = keras.Sequential([
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
model.fit(X_train, y_train, epochs=10)
predictions = model.predict(X_test)

# PyTorch
class Net(nn.Module):
    def __init__(self, input_size):
        super().__init__()
        self.fc1 = nn.Linear(input_size, 128)
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)

model = Net(input_size=X_train.shape[1])
optimizer = optim.Adam(model.parameters())
# Unlike the two libraries above, PyTorch has no built-in fit(); you write the training loop yourself
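
For completeness, here is roughly what that hand-written training loop looks like for a network of the same shape. It is a minimal full-batch sketch on synthetic data; `input_size`, the tensor sizes, and the epoch count are assumptions chosen to keep the example small.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
input_size = 20                      # assumed number of features
X = torch.randn(256, input_size)     # synthetic features
y = torch.randint(0, 10, (256,))     # synthetic class labels

# Same layer sizes as the Net class above, written with nn.Sequential for brevity
model = nn.Sequential(
    nn.Linear(input_size, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 10),
)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters())

losses = []
for epoch in range(20):
    optimizer.zero_grad()            # clear gradients from the previous step
    loss = criterion(model(X), y)    # forward pass on the full batch
    loss.backward()                  # backpropagate
    optimizer.step()                 # update the weights
    losses.append(loss.item())

print(f"First loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

In real projects the loop would iterate over a `DataLoader` in mini-batches, as in the image classification example earlier.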

Performance Comparison

Aspect                 Scikit-learn   TensorFlow   PyTorch
Training Speed (CPU)   Fast           Medium       Medium
Training Speed (GPU)   N/A            Fast         Fast
Inference Speed        Fast           Fast         Fast
Memory Efficiency      Good           Good         Good
Startup Time           Fast           Slow         Medium
Model Size             Small          Medium       Medium
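
The ratings above are qualitative, and real numbers depend heavily on hardware, data size, and model choice. A small timing harness is often the quickest way to check on your own machine. The sketch below times `fit` for two Scikit-learn models on synthetic data; the helper `time_fit` and the dataset size are assumptions made for this illustration, and the same pattern can wrap a Keras or PyTorch training call.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

def time_fit(estimator, X, y, repeats=3):
    """Return the best wall-clock fit time in seconds over a few repeats."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        estimator.fit(X, y)
        timings.append(time.perf_counter() - start)
    return min(timings)

# Synthetic stand-in data (an assumption for illustration)
X_bench, y_bench = make_classification(n_samples=2000, n_features=20, random_state=0)

candidates = {
    'Logistic Regression': LogisticRegression(max_iter=1000),
    'Random Forest': RandomForestClassifier(n_estimators=50, random_state=0),
}
fit_times = {name: time_fit(est, X_bench, y_bench) for name, est in candidates.items()}

for name, seconds in fit_times.items():
    print(f"{name}: {seconds:.4f} s")
```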

Ecosystem Comparison

Feature         Scikit-learn   TensorFlow      PyTorch
Visualization   Matplotlib     TensorBoard     TensorBoard/Weights & Biases
Serving         Limited        TF Serving      TorchServe
Mobile          No             TF Lite         PyTorch Mobile/ExecuTorch
Browser         No             TensorFlow.js   ONNX Runtime Web (formerly ONNX.js)
NLP             Limited        TF Hub          Hugging Face
CV              Limited        TF Hub          torchvision

Decision Framework

Choose Scikit-learn if:

  1. Working with tabular/structured data
  2. Using classical ML algorithms (not deep learning)
  3. Need quick prototyping and experimentation
  4. Dataset fits in memory
  5. Interpretability is important
  6. Team is new to ML

Choose TensorFlow if:

  1. Deploying to production at scale
  2. Need mobile/edge deployment
  3. Using Google Cloud/TPUs
  4. Building end-to-end ML pipelines
  5. Need comprehensive ecosystem
  6. Enterprise environment

Choose PyTorch if:

  1. Doing research or experimentation
  2. Building custom architectures
  3. Working with NLP (Hugging Face)
  4. Need maximum flexibility
  5. Debugging is important
  6. Academic environment

Using Multiple Libraries Together

In practice, you’ll often use multiple libraries:

# Preprocessing with Scikit-learn
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

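# Model with PyTorch
import numpy as np
import torch
X_train_tensor = torch.as_tensor(X_train_scaled, dtype=torch.float32)
y_train_tensor = torch.as_tensor(np.asarray(y_train), dtype=torch.long)
X_test_tensor = torch.as_tensor(X_test_scaled, dtype=torch.float32)
# ... define and train `model` (an nn.Module) on the training tensors ...

# Evaluation with Scikit-learn metrics
from sklearn.metrics import classification_report
model.eval()
with torch.no_grad():  # no gradients needed, and .numpy() requires a detached tensor
    predictions = model(X_test_tensor).argmax(dim=1).numpy()
print(classification_report(y_test, predictions))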

# Visualization with TensorBoard
from torch.utils.tensorboard import SummaryWriter
writer = SummaryWriter()

Conclusion

Each library has its strengths:

  • Scikit-learn: Best for classical ML, tabular data, and quick experimentation
  • TensorFlow: Best for production deployment and comprehensive ecosystem
  • PyTorch: Best for research, flexibility, and modern NLP

The “best” library depends on your specific needs. Many practitioners use all three, choosing the right tool for each task. Start with one, become proficient, then expand your toolkit as needed.

One reasonable learning path:

  1. Start with Scikit-learn: Learn ML fundamentals
  2. Add PyTorch or TensorFlow: Choose based on your focus (research vs. production)
  3. Master one deep learning framework: Go deep before going wide
  4. Learn the ecosystem: TensorBoard, Weights & Biases, MLflow
  5. Production skills: Docker, APIs, model serving

Further Resources

  • Scikit-learn: Official docs and user guide
  • TensorFlow: TensorFlow tutorials and Keras documentation
  • PyTorch: PyTorch tutorials and documentation
  • Fast.ai: Practical deep learning course (uses PyTorch)
  • Coursera/DeepLearning.AI: Comprehensive ML courses