Final Results
- Tracks Analyzed: 114K+
- XGBoost Accuracy: 77.5%
- Peak ROC-AUC: 0.859
- Popular Song Recall: 84%
✨ Technical Highlights
🔧 Engineered Features
Engineered two composite features, Vocal Intensity and Energy Balance, to capture interactions between audio attributes.
🤖 Advanced Ensemble
Moved beyond the linear baseline (Logistic Regression, 0.621 ROC-AUC) to XGBoost (0.859 ROC-AUC), which captures non-linear relationships between features.
🔍 Explainable AI
Used SHAP analysis to open up the "black box," visualizing how each feature pushes a prediction toward or away from popularity.
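To make the idea behind SHAP concrete, here is a minimal sketch of exact Shapley values computed by brute force. It is not the `shap` library and not this project's pipeline: `toy_score`, its coefficients, the all-zero baseline, and the example track are all hypothetical, chosen only to show how a prediction is split into per-feature contributions.

```python
from itertools import permutations


def toy_score(features):
    """Hypothetical stand-in for a model: a score with one interaction term."""
    score = 0.3 * features["danceability"] + 0.2 * features["valence"]
    score += 0.5 * features["danceability"] * features["energy"]
    return score


def shapley_values(model, instance, baseline):
    """Exact Shapley values: average each feature's marginal contribution
    over every order in which features switch from baseline to instance."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for ordering in orderings:
        current = dict(baseline)
        previous = model(current)
        for name in ordering:
            current[name] = instance[name]
            updated = model(current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orderings) for name, total in totals.items()}


# Illustrative values only; not taken from the dataset.
baseline = {"danceability": 0.0, "valence": 0.0, "energy": 0.0}
track = {"danceability": 0.8, "valence": 0.5, "energy": 0.6}

contributions = shapley_values(toy_score, track, baseline)
for name, value in contributions.items():
    print(f"{name}: {value:+.3f}")

# Contributions add up to the gap between the track's score and the baseline's.
total_gap = toy_score(track) - toy_score(baseline)
print(sum(contributions.values()), total_gap)
```

Brute-force enumeration grows factorially with the number of features, so it only works for toy cases like this one; for tree ensembles, the `shap` library uses a much faster tree-specific algorithm.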
🎯 Precision Tuning
Shifted the decision threshold from the usual 0.5 default to 0.4436 to maximize the F1-Score, accepting a few more false positives in exchange for discovering more hits.
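A threshold like this can be found with a simple sweep. The sketch below is one possible approach, not necessarily the search used in this project: it tries every distinct predicted probability as a cutoff and keeps the one with the highest F1-Score. The helper names and the toy labels and probabilities are invented for illustration.

```python
def f1_at_threshold(y_true, y_prob, threshold):
    """F1-Score when every probability >= threshold is labeled popular (1)."""
    tp = fp = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_f1_threshold(y_true, y_prob):
    """Try each distinct predicted probability as a cutoff; keep the best F1."""
    candidates = sorted(set(y_prob))
    return max(candidates, key=lambda t: f1_at_threshold(y_true, y_prob, t))


# Toy data, invented for illustration only.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.10, 0.30, 0.42, 0.45, 0.48, 0.60, 0.55, 0.90]

threshold = best_f1_threshold(y_true, y_prob)
tuned_f1 = f1_at_threshold(y_true, y_prob, threshold)
default_f1 = f1_at_threshold(y_true, y_prob, 0.5)
print(threshold, tuned_f1, default_f1)
```

On this toy data the tuned cutoff falls below 0.5 and beats the default on F1. To avoid an optimistic estimate, the sweep should run on validation data rather than on the test set used for the reported scores.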
🏆 Model Leaderboard
| Model | Accuracy | ROC-AUC | F1-Score | Stability |
|---|---|---|---|---|
| XGBoost (Champion) | 0.775 | 0.859 | 0.780* | +/- 0.006 |
| Gradient Boosting | 0.766 | 0.850 | 0.760 | +/- 0.007 |
| Random Forest | 0.761 | 0.839 | 0.771 | +/- 0.010 |
| Logistic Regression | 0.577 | 0.621 | 0.590 | Baseline |
* Scores validated via 5-Fold Stratified Cross-Validation for robustness.
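For readers unfamiliar with stratified cross-validation, the following sketch shows the core idea in plain Python: each fold keeps the same share of popular and non-popular tracks as the full dataset. The function names, the toy labels, and the per-fold scores are hypothetical, and reading the Stability column as the standard deviation across folds is an assumption, not something this page states.

```python
import random
from collections import defaultdict
from statistics import mean, stdev


def stratified_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, preserving class ratios."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so each fold gets an equal share of the class.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def summarize(fold_scores):
    """Mean and sample standard deviation of per-fold scores."""
    return mean(fold_scores), stdev(fold_scores)


# Toy labels: 20 popular (1) and 30 not popular (0), invented for illustration.
labels = [1] * 20 + [0] * 30
folds = stratified_folds(labels, k=5)
for fold in folds:
    positives = sum(labels[i] for i in fold)
    print(len(fold), positives)  # every fold: 10 samples, 4 popular

# Hypothetical per-fold accuracies, NOT the project's real fold scores.
avg, spread = summarize([0.70, 0.72, 0.71, 0.69, 0.73])
print(f"{avg:.3f} +/- {spread:.3f}")
```

In practice a library routine such as scikit-learn's `StratifiedKFold` would replace the hand-written split; the manual version is shown only to make the mechanics visible.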
🔑 The "Hit" Ingredients
What Drives Popularity?
- Genre Encodings: The strongest predictor of initial popularity potential.
- Emotional Valence: Listeners prefer positive, upbeat tones over somber ones.
- Danceability: Rhythmic consistency remains a staple of chart-topping tracks.
- Energy Balance: Success favors a balanced mix of organic and electronic production.
- Strategic Length: Modern streaming favors brevity; 2.5–3.5 minutes is the current sweet spot.