
# Amazon “Stars vs Sentiment” 📦⭐️📝

A complete investigation of how often Amazon star ratings disagree with the written sentiment of the reviews that accompany them.
Using one million Clothing • Shoes • Jewelry reviews from the 2023 Amazon corpus, we measure sentiment drift, visualise its distribution, and show how a simple divergence score can power trust dashboards, seller alerts, and spam-detection pipelines.

> **Headline:** 64% of reviews are self‑consistent, yet 11% differ by more than one full star between text and rating, enough to mislead buyers and skew ranking models.


## 🚦 Project Status

| Component | State |
|---|---|
| White paper (PDF) | ✅ uploaded (`Stars_vs_Sentiment_Final_WhitePaper.pdf`) |
| 10‑slide video (7 min) | `Stars_vs_Sentiment_Presentation.mp4` |
| Code & notebooks | ✅ pushed to `main` |
| MiniLM star‑prediction fine‑tune | 🔄 optional – planned Q4 2025 |

## 🔍 Research Questions

1. **Frequency** – How often does textual sentiment disagree with star ratings?
2. **Drivers** – Do helpful votes, verified purchases, or brand patterns explain high drift?
3. **Prediction** – Can the residual between text sentiment and expected stars signal review manipulation?

## 📊 Key Findings

| Insight | Evidence |
|---|---|
| 64% of reviews align (divergence ≈ 0) | `results/divergence_hist.png` |
| 11% exceed ±1 divergence → strong mismatch | histogram tails |
| Average drift of +0.04 → mild positivity bias | white paper §6 |
| Verified‑purchase reviews show 40% lower drift odds | Appendix C, Q4 |
| Divergence + meta features ⇒ precision 0.55 for spam | Appendix C, Q7 |

> **Divergence** = sentiment score − normalised star, where stars 1–5 map to −1…+1.
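For concreteness, here is a minimal pandas sketch of that computation; the frame and column names are illustrative, not the notebook's exact schema:

```python
import pandas as pd

# Illustrative rows: a 1-5 star rating and a signed sentiment score in [-1, 1].
df = pd.DataFrame({
    "rating":    [5, 1, 3, 4],
    "sentiment": [0.9, -0.8, 0.7, -0.2],
})

# Map stars 1-5 onto [-1, +1]: (star - 3) / 2 sends 1 -> -1, 3 -> 0, 5 -> +1.
df["norm_star"] = (df["rating"] - 3) / 2

# Divergence = sentiment - normalised star; 0 means text and stars agree,
# positive values mean the text reads warmer than the rating suggests.
df["divergence"] = df["sentiment"] - df["norm_star"]

print(df)
```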


## 🗂️ Repository Layout

```text
.
├── data/                         # parquet slices & model artefacts
├── notebooks/
│   └── 01_EDA.ipynb              # sentiment, divergence, figures
├── scripts/
│   └── get_data.py               # streaming download + sample
├── results/
│   ├── star_counts.png
│   ├── divergence_hist.png
│   ├── polarity_vs_rating.png
│   └── helpful_vs_divergence.png
├── Stars_vs_Sentiment_Final_WhitePaper.pdf
├── Stars_vs_Sentiment_Presentation.mp4
├── environment.yml
└── README.md                     # this file
```

## ⚙️ Quick Start

```bash
git clone https://github.com/<your-org>/stars-vs-sentiment.git
cd stars-vs-sentiment

# 1. Create environment
conda env create -f environment.yml
conda activate stars-sentiment

# 2. End-to-end run (~15 min CPU / ~4 min on a single A10 GPU)
make all
```

`make all` streams a one‑million‑row sample, scores sentiment with DistilBERT, computes divergence, and regenerates every figure under `results/`.
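`scripts/get_data.py` is not reproduced here, but the sketch below shows roughly what its streaming step looks like with the Hugging Face `datasets` library. The dataset id, config, split, and field names are assumptions based on the McAuley‑Lab release accompanying Hou et al. (2024), and the real script stratifies by star rating rather than taking the head of the stream:

```python
import pandas as pd
from datasets import load_dataset

# Stream the Clothing/Shoes/Jewelry slice of the 2023 Amazon Reviews corpus
# without downloading the full dump. Dataset id, config, split, and field
# names are assumptions based on the McAuley-Lab release on the HF Hub.
ds = load_dataset(
    "McAuley-Lab/Amazon-Reviews-2023",
    "raw_review_Clothing_Shoes_and_Jewelry",
    split="full",
    streaming=True,
    trust_remote_code=True,
)

# Take a one-million-row working sample from the stream and persist it.
rows = [{"rating": r["rating"], "text": r["text"]} for r in ds.take(1_000_000)]
pd.DataFrame(rows).to_parquet("data/reviews_1m.parquet")
```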


## 🛠️ Methodology at a Glance

| Stage | Tool / Model |
|---|---|
| Ingest | HF `datasets` → 1 M stratified rows |
| Clean | HTML strip • emoji removal • language filter |
| Sentiment | DistilBERT SST‑2 (binary ±1; sketch below) |
| Divergence | sentiment − (star − 3)/2 |
| EDA | pandas & seaborn |
| Spam test | XGBoost on divergence + length + account age (sketch below) |
| Compute | ≈15 min CPU / ≈4 min A10 GPU |
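The SST‑2 checkpoint emits a POSITIVE/NEGATIVE label plus a confidence; one way to fold that pair into the single signed score used above is sketched here. The folding rule is an assumption, not necessarily the notebook's exact choice:

```python
from transformers import pipeline

# DistilBERT fine-tuned on SST-2; each call returns
# {"label": "POSITIVE" | "NEGATIVE", "score": confidence in [0, 1]}.
clf = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

def signed_sentiment(text: str) -> float:
    """Fold label and confidence into one score in [-1, +1]."""
    out = clf(text, truncation=True)[0]
    return out["score"] if out["label"] == "POSITIVE" else -out["score"]

print(signed_sentiment("Runs small and the stitching came apart in a week."))
```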

Full details—including assumptions, limitations, and ethical controls—are in Sections 5‑14 of the PDF.
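The spam test itself has the shape of a small supervised experiment: |divergence| joins length and account‑age features in a gradient‑boosted classifier. The sketch below uses `xgboost`'s scikit‑learn wrapper with synthetic placeholder data, since the real features and labels live in Appendix C; it shows the shape of the experiment, not its numbers:

```python
import numpy as np
from sklearn.metrics import precision_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Synthetic stand-ins for |divergence|, review length, and account age (days);
# the label is loosely tied to divergence so the model has signal to find.
rng = np.random.default_rng(0)
n = 5_000
X = np.column_stack([
    rng.exponential(0.3, n),      # |divergence|
    rng.integers(20, 400, n),     # review length in tokens
    rng.integers(1, 3_000, n),    # account age in days
])
y = (X[:, 0] + rng.normal(0, 0.2, n) > 0.6).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1)
clf.fit(X_tr, y_tr)
print("precision:", precision_score(y_te, clf.predict(X_te)))
```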


## 📈 Reproduce Every Figure

```bash
jupyter nbconvert --execute --to notebook --inplace notebooks/01_EDA.ipynb
```

All PNGs refresh under `results/`, and the notebook records runtime logs.
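As an indication of how a figure such as `results/divergence_hist.png` is produced, here is a hedged seaborn sketch; the parquet path and `divergence` column are assumptions, and the notebook's exact styling may differ:

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Assumes the pipeline has already written a 'divergence' column to the sample.
df = pd.read_parquet("data/reviews_1m.parquet")

ax = sns.histplot(df["divergence"], bins=80)
ax.set(xlabel="divergence (sentiment − normalised star)", ylabel="reviews")
plt.tight_layout()
plt.savefig("results/divergence_hist.png", dpi=150)
```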


## 🛡️ Ethical & Fairness Guard‑Rails

Assumptions, limitations, and the ethical controls applied to review text are documented in Sections 5‑14 of the white paper.


## 🤝 Contributing

Bug reports and pull requests are welcome; see `CONTRIBUTING.md`.


## 📜 License


## 📚 References

- Hou, Y., et al. (2024). *Bridging Language and Items for Retrieval and Recommendation.* arXiv:2403.03952.
- Mukherjee, A., et al. (2013). *What Yelp Fake Review Filter Might Be Doing?* WWW Companion.
- Sanh, V., et al. (2019). *DistilBERT, a Distilled Version of BERT.* arXiv:1910.01108.