
# Sentiment Analysis with ALBERT + Explainable AI (XAI)

Python · Transformers · License: MIT · Colab

Fine-tuned ALBERT model for 3-class sentiment analysis (Negative · Neutral · Positive)
With comprehensive explainability using SHAP, LIME, and Captum (Integrated Gradients)
Evaluated on both Twitter data and IMDb movie reviews


## Overview

This project demonstrates:

- Fine-tuning ALBERT on a large ternary sentiment dataset (~500k tweets); a minimal inference sketch follows this list
- Achieving high performance on balanced 3-class sentiment classification
- Providing human-interpretable explanations using three leading XAI methods
- Cross-domain evaluation on IMDb (long-form text) to test generalization
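
Loading the fine-tuned model for inference is straightforward with `transformers`. A minimal sketch, assuming the checkpoint was saved locally (the path is a placeholder) and the label order neg=0 / neutral=1 / pos=2 implied by the IMDb mapping note below:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_PATH = "path/to/fine-tuned-albert"  # placeholder: your saved checkpoint
LABELS = {0: "Negative", 1: "Neutral", 2: "Positive"}  # assumed label order

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_PATH)
model.eval()

def predict(text: str) -> str:
    """Return the predicted sentiment label for a single string."""
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        logits = model(**enc).logits
    return LABELS[int(logits.argmax(dim=-1))]

print(predict("I absolutely love this phone, but the battery life is short."))
```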

## Model Performance

| Dataset | Accuracy | F1-Score (Macro) | F1 (Negative) | F1 (Neutral) | F1 (Positive) |
|---|---|---|---|---|---|
| Test Set (Tweets) | 92.4% | 0.91 | 0.93 | 0.89 | 0.94 |
| IMDb (Cross-domain) | 88.7% | 0.88 | 0.90 | n/a | 0.87 |

Note: IMDb has no true neutral class, so its labels are mapped into the ternary space as a binary problem (neg=0, pos=2); the Neutral column does not apply.
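
A sketch of the cross-domain evaluation, reusing `model` and `tokenizer` from the inference sketch above. The Hugging Face `imdb` dataset and the metric calls are assumptions; the label remap follows the note above, so any Neutral prediction simply counts as an error:

```python
import torch
from datasets import load_dataset
from sklearn.metrics import accuracy_score, f1_score

imdb = load_dataset("imdb", split="test")  # 25k labeled reviews

def predict_id(text: str) -> int:
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        return int(model(**enc).logits.argmax(dim=-1))

# Remap IMDb's binary labels into the ternary space: neg (0) -> 0, pos (1) -> 2.
y_true = [0 if y == 0 else 2 for y in imdb["label"]]
y_pred = [predict_id(t) for t in imdb["text"]]

print("accuracy:", accuracy_score(y_true, y_pred))
print("macro F1:", f1_score(y_true, y_pred, average="macro"))
```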


## Explainable AI Visualizations

Input: "I absolutely love this phone, but the battery life is short."

### Captum – Integrated Gradients

![Captum Integrated Gradients](Screenshot%202025-12-08%20141752.png)

Green = Positive contribution | Red = Negative | "but" strongly pulls toward neutral
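
A sketch of how attributions like these can be computed with Captum's `LayerIntegratedGradients` over the ALBERT embedding layer, reusing `model` and `tokenizer` from the inference sketch. The all-pad baseline and the Positive target class are common but assumed choices:

```python
import torch
from captum.attr import LayerIntegratedGradients

def forward_logits(input_ids, attention_mask):
    return model(input_ids=input_ids, attention_mask=attention_mask).logits

# Attribute through the embedding layer of AlbertForSequenceClassification.
lig = LayerIntegratedGradients(forward_logits, model.albert.embeddings)

text = "I absolutely love this phone, but the battery life is short."
enc = tokenizer(text, return_tensors="pt")
baseline = torch.full_like(enc["input_ids"], tokenizer.pad_token_id)

attributions = lig.attribute(
    inputs=enc["input_ids"],
    baselines=baseline,
    additional_forward_args=(enc["attention_mask"],),
    target=2,  # Positive class
)
scores = attributions.sum(dim=-1).squeeze(0)  # one score per token
tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
for tok, score in zip(tokens, scores):
    print(f"{tok:>12} {score:+.3f}")
```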

### SHAP Text Plot

![SHAP Explanation](Screenshot%202025-12-08%20141708.png)

SHAP correctly highlights "love" and "absolutely", but also picks up the contrast introduced by "but"
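
A sketch of the SHAP setup: recent `shap` versions can wrap a transformers `pipeline` directly (version-dependent behavior, so treat this as an assumption):

```python
import shap
from transformers import pipeline

# Wrap the fine-tuned model in a pipeline that returns scores for all classes.
clf = pipeline("text-classification", model=model, tokenizer=tokenizer, top_k=None)

explainer = shap.Explainer(clf)
shap_values = explainer(
    ["I absolutely love this phone, but the battery life is short."]
)
shap.plots.text(shap_values)  # per-token attribution plot (notebook output)
```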

### LIME Explanation

![LIME Explanation](Screenshot%202025-12-08%20141429.png)

LIME focuses heavily on positive words and is less sensitive to "but"
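
A sketch of the LIME setup with the standard `LimeTextExplainer`: LIME perturbs the sentence by dropping words and fits a local linear surrogate over the model's probabilities. It reuses `model` and `tokenizer` from above; `num_features` is an arbitrary choice:

```python
import torch
from lime.lime_text import LimeTextExplainer

def predict_proba(texts):
    """Return an (n, 3) array of class probabilities, as LIME expects."""
    enc = tokenizer(list(texts), return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        logits = model(**enc).logits
    return torch.softmax(logits, dim=-1).numpy()

explainer = LimeTextExplainer(class_names=["Negative", "Neutral", "Positive"])
exp = explainer.explain_instance(
    "I absolutely love this phone, but the battery life is short.",
    predict_proba,
    labels=(0, 1, 2),
    num_features=10,
)
print(exp.as_list(label=2))  # word weights for the Positive class
```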

## Explainable AI Comparison

We compared three popular interpretability methods on the same input:

Example:

"I absolutely love this phone, but the battery life is short."

| Rank | SHAP | LIME | Captum (Integrated Gradients) |
|---|---|---|---|
| 1 | exceeded | amazing | but |
| 2 | amazing | exceeded | absolutely |
| 3 | my | absolutely | is |
| 4 | product | product | i |
| 5 | This | This | life |

**Overlap Scores (Top-5 words):**

- SHAP vs LIME: 0.57
- SHAP vs Captum: 0.29
- LIME vs Captum: 0.14

Insight: SHAP and LIME agree more on positive indicators, while Captum highlights contrastive words like "but".
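
The exact overlap metric is not specified here; one simple, common choice is Jaccard similarity over the top-k word sets, sketched below:

```python
def top_k_overlap(words_a, words_b, k=5):
    """Jaccard similarity between two explainers' top-k word lists."""
    a = {w.lower() for w in words_a[:k]}
    b = {w.lower() for w in words_b[:k]}
    return len(a & b) / len(a | b)

# Hypothetical usage with two explainers' ranked word lists:
print(top_k_overlap(["love", "absolutely", "but"], ["love", "but", "short"]))
```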



## Environment Configuration

The notebook was developed on Google Colab with a T4 GPU. Relevant notebook metadata:

```json
{
    "colab": {
        "provenance": [],
        "gpuType": "T4",
        "authorship_tag": "ABX9TyNfmDrT2WuSM6k8BOIzE0ib"
    },
    "kernelspec": {
        "name": "python3",
        "display_name": "Python 3"
    },
    "accelerator": "GPU",
    "language_info": {
        "name": "python"
    }
}
```



---

## Key Features

- Trained on **500k+ labeled tweets** (3 classes)
- Uses **ALBERT-base-v2**, lightweight and fast
- Full **XAI pipeline**:
  - SHAP (Kernel + Gradient)
  - LIME (text perturbation)
  - Captum LayerIntegratedGradients
- Cross-domain testing on **IMDb** (25k reviews)
- Beautiful visualizations with `captum.attr.visualization` (rendering sketch below)
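
A rendering sketch for `captum.attr.visualization`, taking the `tokens` and `scores` from the Integrated Gradients sketch above; the prediction metadata values passed in are placeholders, and the output displays in a notebook:

```python
from captum.attr import visualization as viz

record = viz.VisualizationDataRecord(
    scores / scores.abs().max(),  # word attributions, normalized to [-1, 1]
    0.93,          # predicted probability (placeholder)
    "Positive",    # predicted class
    "Positive",    # true class (placeholder)
    "Positive",    # class the attributions target
    scores.sum(),  # total attribution score
    tokens,        # raw tokens to display
    None,          # convergence delta (not computed here)
)
viz.visualize_text([record])  # green/red token highlighting in the notebook
```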

---

## How to Run

### Option 1: Google Colab (Recommended)

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/your-link-here)

### Option 2: Local Setup

```bash
# Clone the repo
git clone https://github.com/mujahidmahfuz/Sentiment-Analysis-ALBERT-XAI.git
cd Sentiment-Analysis-ALBERT-XAI

# Install dependencies
pip install -r requirements.txt

# Run the notebook
jupyter notebook "Sentiment Analysis Project.ipynb"
```
