The individual features lack descriptions.
Correlations are difficult to determine because of the large number of features.
The dataset contains Null, NaN and Infinity values.
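
One common way to handle the Null, NaN and Infinity values is to convert infinities to NaN and then drop the incomplete rows. The sketch below assumes the data is loaded into a pandas DataFrame. The helper name `clean_features` and the column names are hypothetical, and dropping rows (rather than imputing) is an assumption, not necessarily what this project does.

```python
import numpy as np
import pandas as pd


def clean_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with +/-Infinity turned into NaN and incomplete rows dropped."""
    # Treat infinities as missing so a single dropna() handles all three cases.
    cleaned = df.replace([np.inf, -np.inf], np.nan)
    # Assumption: rows with any missing value are discarded rather than imputed.
    return cleaned.dropna(axis=0, how="any").reset_index(drop=True)


# Hypothetical toy frame; the column names are made up for illustration.
raw = pd.DataFrame({
    "feature_a": [1.0, np.inf, 3.0, 4.0],
    "feature_b": [0.5, 2.0, None, 1.5],
})
cleaned = clean_features(raw)
print(cleaned)
```

If too many rows are lost this way, imputing the missing values (for example with the column median) is an alternative worth considering.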
Unsupervised model: Isolation Forest for anomaly detection.
Supervised models: Random Forest, Decision Tree, Logistic Regression, AdaBoost and Naive Bayes.
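
As a rough illustration, the models listed above could be set up with scikit-learn as follows. This is a minimal sketch on synthetic data from `make_classification`, not the project's actual pipeline. The hyperparameters, the train/test split, and the choice of `GaussianNB` as the Naive Bayes variant are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, IsolationForest, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real dataset (assumption: binary labels).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Supervised models, using default settings apart from seeds and max_iter.
supervised_models = {
    "Random Forest": RandomForestClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),  # assumption: Gaussian variant
}

predictions = {}
for name, model in supervised_models.items():
    model.fit(X_train, y_train)
    predictions[name] = model.predict(X_test)

# Unsupervised model: fitted without labels; predict() returns +1 (inlier) or -1 (outlier).
iso_forest = IsolationForest(random_state=0).fit(X_train)
anomaly_flags = iso_forest.predict(X_test)
```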
Results for each supervised model:
Model | Accuracy | F1-Score |
---|---|---|
Random Forest | 95.04% | 0.9507 |
Decision Tree | 91.95% | 0.9202 |
Logistic Regression | 81.66% | 0.8151 |
AdaBoost | 68.90% | 0.6917 |
Naive Bayes | 61.87% | 0.6104 |
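
For reference, the two metrics in the table can be computed from true and predicted labels as sketched below. The averaging scheme behind the F1 scores above is not stated, so this example assumes the plain binary F1 for a single positive class. The function names and the toy labels are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_binary(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels, made up for illustration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

acc = accuracy(y_true, y_pred)   # 6 of 8 correct -> 0.75
f1 = f1_binary(y_true, y_pred)   # precision = recall = 3/4 -> 0.75
print(f"Accuracy: {acc:.2%}, F1-Score: {f1:.4f}")
```

With scikit-learn, `accuracy_score` and `f1_score` from `sklearn.metrics` give the same quantities, and the `average` parameter of `f1_score` selects the averaging scheme for multi-class problems.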