Deployed System

HR Attrition Prediction System

A supervised machine learning system that predicts employee attrition and provides actionable insights through both predictive modeling and an interactive analytics dashboard.

Python, Scikit-learn, Machine Learning, Streamlit, Power BI

Problem And Solution

Problem

Organizations struggle to identify employees likely to leave, leading to unexpected attrition and increased hiring and training costs.

Solution

Built a machine learning model that predicts attrition risk, complemented by visual dashboards for easier decision-making.

Data And Analysis

Dataset

Kaggle HR dataset with ~4,190 records and 41 features, including performance, satisfaction, and job-related attributes.

Key Insights

  • Sales & technical roles show higher attrition
  • Managers show higher retention
  • Low job satisfaction strongly increases attrition risk

Model Development

Pipeline

Built a full preprocessing pipeline using ColumnTransformer for encoding, scaling, and feature transformation.
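A minimal sketch of what such a pipeline could look like. The column names and the toy rows are hypothetical, chosen only for illustration; the project's actual feature lists and transformers are not given here.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical column names; the real dataset's 41 features are not listed here.
numeric_cols = ["Age", "MonthlyIncome", "JobSatisfaction"]
categorical_cols = ["Department", "BusinessTravel"]

# Scale numeric columns and one-hot encode categorical ones in a single step.
preprocessor = ColumnTransformer(
    transformers=[
        ("num", StandardScaler(), numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ]
)

# Chaining preprocessing and model keeps the same transformations
# applied at training time and at prediction time.
pipeline = Pipeline(
    steps=[
        ("preprocess", preprocessor),
        ("model", LogisticRegression(max_iter=1000)),
    ]
)

# Made-up rows, only to show that the pipeline fits and predicts.
X = pd.DataFrame(
    {
        "Age": [25, 41, 33, 52, 29, 38],
        "MonthlyIncome": [3000, 9000, 4500, 12000, 3200, 7000],
        "JobSatisfaction": [1, 4, 2, 3, 1, 4],
        "Department": ["Sales", "R&D", "Sales", "HR", "R&D", "HR"],
        "BusinessTravel": ["Frequently", "Rarely", "Frequently",
                           "Rarely", "Frequently", "Rarely"],
    }
)
y = [1, 0, 1, 0, 1, 0]  # 1 = left, 0 = stayed

pipeline.fit(X, y)
predictions = pipeline.predict(X)
encoded = pipeline.named_steps["preprocess"].transform(X)
```

Here the three numeric columns plus three department categories and two travel categories yield eight encoded features.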

Models Compared

  • Logistic Regression (Best)
  • Support Vector Machine
  • Decision Tree
  • Random Forest
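One way such a comparison could be run is cross-validated recall over the four model families. This is a sketch on synthetic data from `make_classification`; the 16% positive rate, the hyperparameters, and the 5-fold setup are assumptions, not the project's actual configuration, so the scores it prints say nothing about the reported results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the HR data (assumed ~16% positives).
X, y = make_classification(
    n_samples=400, n_features=10, weights=[0.84, 0.16], random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Support Vector Machine": SVC(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Stratified folds keep the class ratio similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

recall_by_model = {}
for name, model in models.items():
    # Scaling sits inside the pipeline so it is refit on each training fold.
    candidate = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(candidate, X, y, cv=cv, scoring="recall")
    recall_by_model[name] = scores.mean()
    print(f"{name}: mean recall = {scores.mean():.3f}")
```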

Evaluation Strategy

Optimized for recall to minimize missed attrition cases (false negatives), the errors with the greatest business impact.
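For reference, recall is the share of actual leavers that the model flags: true positives divided by true positives plus false negatives. The short sketch below computes it by hand on made-up labels; the function name `recall` and the example labels are illustrative only.

```python
def recall(y_true, y_pred, positive=1):
    """Fraction of actual positive cases that were predicted positive."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    if true_pos + false_neg == 0:
        raise ValueError("recall is undefined when there are no positive cases")
    return true_pos / (true_pos + false_neg)


# Made-up labels: 1 = employee left, 0 = employee stayed.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 1, 0, 0]

# Four of the five leavers are caught, one is missed.
print(recall(y_true, y_pred))  # 0.8
```

Note that the false alarm on one employee who stayed does not affect recall; it would lower precision instead.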

Results

Achieved ~79–80% recall using Logistic Regression with class imbalance handling.
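The specific imbalance technique is not stated here. One common option in scikit-learn, shown below as an assumption rather than the project's confirmed method, is `class_weight="balanced"`, which weights each class inversely to its frequency. The 84/16 class split is made up for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils.class_weight import compute_class_weight

# Made-up label vector: 84 employees stayed (0), 16 left (1).
y = np.array([0] * 84 + [1] * 16)

# "balanced" uses n_samples / (n_classes * count_per_class).
weights = compute_class_weight(
    class_weight="balanced", classes=np.array([0, 1]), y=y
)
print(dict(zip([0, 1], weights.round(3))))

# The same weighting can be requested directly on the estimator,
# so errors on the minority class cost more during training.
model = LogisticRegression(class_weight="balanced", max_iter=1000)
```

With this split, each leaver counts 3.125 times in the loss versus roughly 0.595 for each employee who stayed, which typically pushes the model toward higher recall at some cost in precision.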

System Features

Prediction System

  • Employee likely to leave
  • Employee likely to stay

Dashboard

  • Attrition by department
  • Travel frequency impact
  • Workforce distribution insights
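The dashboard itself is a visual tool, but the aggregation behind a view such as attrition by department can be sketched in pandas. The column names and rows below are made up for illustration and do not reflect the real dataset or its figures.

```python
import pandas as pd

# Made-up rows; the column names are assumed, not taken from the dataset.
df = pd.DataFrame(
    {
        "Department": ["Sales", "Sales", "Sales", "Sales",
                       "R&D", "R&D", "R&D", "R&D", "HR", "HR"],
        "Attrition": ["Yes", "No", "Yes", "No",
                      "No", "No", "No", "Yes", "No", "No"],
    }
)

# Convert the Yes/No flag to 1/0 so the group mean is the attrition rate.
df["left"] = (df["Attrition"] == "Yes").astype(int)

attrition_rate = (
    df.groupby("Department")["left"].mean().sort_values(ascending=False)
)
print(attrition_rate)
```

The same pattern applies to the travel-frequency view by grouping on the travel column instead.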

Deployment

The model is deployed as a Streamlit app with a user interface for real-time predictions. The trained model is serialized with Pickle/Joblib so the app can load it for inference.
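A minimal sketch of the save-and-load step with Joblib, plus a helper that turns a prediction into one of the two labels shown in the interface. The feature names, the toy training rows, the file name, the 0.5 threshold, and the `predict_label` helper are all assumptions for illustration; in the app, the input row would come from the form fields.

```python
import os
import tempfile

import joblib
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Made-up training rows with hypothetical feature names.
X = pd.DataFrame(
    {
        "JobSatisfaction": [1, 1, 2, 3, 4, 4],
        "YearsAtCompany": [1, 2, 1, 6, 8, 10],
    }
)
y = [1, 1, 1, 0, 0, 0]  # 1 = left, 0 = stayed

model = make_pipeline(StandardScaler(), LogisticRegression()).fit(X, y)

# Persist the whole fitted pipeline so preprocessing travels with the model.
model_path = os.path.join(tempfile.mkdtemp(), "attrition_model.joblib")
joblib.dump(model, model_path)

# In the deployed app, this load would happen once at startup.
loaded = joblib.load(model_path)


def predict_label(fitted_model, row, threshold=0.5):
    """Map one employee record (a dict) to a display label and probability."""
    proba = fitted_model.predict_proba(pd.DataFrame([row]))[0, 1]
    if proba >= threshold:
        return "Employee likely to leave", proba
    return "Employee likely to stay", proba


label, proba = predict_label(loaded, {"JobSatisfaction": 1, "YearsAtCompany": 1})
print(label, round(proba, 3))
```

Saving the pipeline rather than the bare estimator avoids having to re-implement the preprocessing inside the app.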

Future Improvements

  • SHAP for model explainability
  • Improved feature engineering
  • Better handling of class imbalance