Orange 3 Workflow
1. Source: File (XLSX) - white-wine-quality
2. Prepare: Data Table (inspect 4,898 rows) → Edit Domain (quality → Categorical)
3. Models: the six classifiers compared below
4. Evaluate: Test & Score (5-fold stratified CV) → ROC Analysis (AUC curves) → Confusion Matrix (per-class accuracy)
Model Comparison - 5-fold stratified cross-validation (columns: Model, AUC, Accuracy, F1, Precision, Recall, MCC)
Feature Importance (Decision Tree, depth 5)
Quality Score Distribution
Classes 5 & 6 = 74% of all 4,898 instances
Python Analysis - Jupyter Notebook Deep Dive
What is this section?
The Orange 3 analysis above gives us a visual workflow. Below, the same dataset was analyzed using Python (pandas, scikit-learn, matplotlib) in a Jupyter Notebook - replicating every result and going deeper. Each chart below includes an explanation so you can follow along even without a data science background.
Step 1 - Quality Score Distribution
What is this chart?
The bar chart (left) shows how many wines exist at each quality score from 3 to 9. The pie chart (right) groups them into Low (3–4), Medium (5–6), and High (7–9) buckets.
Why does this matter?
Quality scores 5 and 6 make up 74% of all 4,898 wines. This is called class imbalance - the model sees far more medium wines than low or high ones. As a result, accuracy alone is misleading: a model that just guesses '6' for everything would still be ~45% accurate. This is why we also look at AUC and F1.
Key Finding
Only 20 wines scored 3 and only 5 scored 9 - these rare classes are nearly impossible to classify correctly.
Python
vc = df['quality'].value_counts().sort_index()
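The ~45% baseline quoted above can be checked directly from the class counts. A minimal sketch, using the published per-class counts for this dataset in place of loading the XLSX (in the notebook, df comes from the white-wine-quality file):

```python
import pandas as pd

# Stand-in for the wine DataFrame: the per-class counts for the
# white-wine dataset (consistent with the figures quoted above:
# 4,898 rows, 20 threes, 5 nines, classes 5 & 6 = 74%).
df = pd.DataFrame({'quality': [3] * 20 + [4] * 163 + [5] * 1457
                   + [6] * 2198 + [7] * 880 + [8] * 175 + [9] * 5})

vc = df['quality'].value_counts().sort_index()

# Accuracy of a model that always predicts the most common class (6):
baseline = vc.max() / vc.sum()
print(f"Majority-class baseline accuracy: {baseline:.1%}")
```

This confirms that always guessing '6' already scores about 45%, which is why accuracy alone is not a trustworthy metric here.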
Step 2 - Feature Correlation Heatmap
What is this chart?
A heatmap shows the correlation between every pair of features. Red means positively correlated (when one goes up, so does the other). Blue means negatively correlated. White/neutral means no strong relationship.
Why does this matter?
Naive Bayes assumes all features are independent of each other - but this heatmap shows that's not true here. Density and residual sugar are strongly correlated (+0.84), and density and alcohol are strongly negatively correlated (-0.78). This explains why Naive Bayes performs poorly on this dataset.
Key Finding
Alcohol has the strongest positive correlation with quality (r = 0.44), confirming it as the top predictor across all models.
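A heatmap like the one described here is a few lines of pandas and matplotlib. A minimal sketch on a tiny synthetic stand-in (three features constructed so that density rises with residual sugar and falls with alcohol, as in the real data; the notebook uses all 11 chemical features from the XLSX):

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless rendering
import matplotlib.pyplot as plt

# Synthetic stand-in for the wine features.
rng = np.random.default_rng(0)
n = 300
sugar = rng.uniform(0.5, 20.0, n)      # residual sugar (g/L)
alcohol = rng.uniform(8.0, 14.0, n)    # alcohol (% vol)
density = 0.99 + 0.0004 * sugar - 0.001 * alcohol + rng.normal(0, 0.0005, n)
df = pd.DataFrame({"residual sugar": sugar, "alcohol": alcohol,
                   "density": density})

# Pairwise Pearson correlations, then render as a red/blue grid.
corr = df.corr()

fig, ax = plt.subplots(figsize=(4, 3))
im = ax.imshow(corr, cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns, rotation=45, ha="right")
ax.set_yticks(range(len(corr.columns)))
ax.set_yticklabels(corr.columns)
fig.colorbar(im, ax=ax)
fig.tight_layout()
```

The same red cell (density vs residual sugar) and blue cell (density vs alcohol) described above show up in this toy version.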
Step 3 - Decision Tree Visualization
What is this chart?
This is the actual decision tree trained on the wine data, limited to 3 levels deep so it's readable. Each node shows: the feature being split on, the threshold value, and how many samples land in each quality class.
Why does this matter?
Decision trees are powerful because they're interpretable - you can follow the branches and understand exactly why a prediction was made. This is what the Orange Tree Viewer shows visually. The root split (the very first question the tree asks) is on alcohol content.
Key Finding
The first question is: 'Is alcohol ≤ 10.85?' Wines with lower alcohol tend to score lower. This single feature alone creates strong separation between classes.
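Training and rendering such a tree is straightforward in scikit-learn. A minimal sketch on synthetic data built so that alcohol determines quality at the 10.85 threshold, as the real root split does (the notebook trains on the actual wine features):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in for X (chemical features) and y (quality scores);
# quality is driven entirely by alcohol here, mirroring the root split.
rng = np.random.default_rng(42)
alcohol = rng.uniform(8.0, 14.0, 500)
volatile_acidity = rng.uniform(0.1, 0.7, 500)
X = np.column_stack([alcohol, volatile_acidity])
y = np.where(alcohol > 10.85, 6, 5)

dt = DecisionTreeClassifier(max_depth=3, random_state=0)
dt.fit(X, y)

# Text rendering of the tree; plot_tree(dt) draws the graphical version.
print(export_text(dt, feature_names=["alcohol", "volatile acidity"]))
```

With a perfect alcohol rule in the toy data, the fitted tree's root split lands on alcohol near 10.85, just as described above.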
Step 4 - Model Comparison
What is this chart?
A horizontal bar chart comparing all 6 models by two metrics: Accuracy (how often it's exactly right) and F1 score (a balance of precision and recall, more useful when classes are imbalanced).
Why does this matter?
Looking at both metrics together gives a fairer picture than accuracy alone. A model might have high accuracy by always predicting the most common class, but a low F1 score would expose that weakness.
Key Finding
Tree has the highest accuracy (60.4%) in Python, consistent with the Orange results. Neural Network and LDA (CN2 proxy) are close behind. SVM has the widest gap between accuracy and F1, suggesting it struggles with rare classes.
Python
cross_val_score(model, X, y, cv=5, scoring='accuracy')
cross_val_score(model, X, y, cv=5, scoring='f1_weighted')
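The two calls above generalize to a loop over several models. A minimal sketch on synthetic multiclass data with three illustrative classifiers (the notebook runs all six models on the wine features):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic multiclass data as a stand-in for the wine features.
X, y = make_classification(n_samples=400, n_features=8, n_informative=5,
                           n_classes=3, random_state=0)

models = {
    "Tree": DecisionTreeClassifier(random_state=0),
    "kNN": KNeighborsClassifier(),
    "Naive Bayes": GaussianNB(),
}

# For classifiers, cv=5 means stratified 5-fold cross-validation.
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    f1 = cross_val_score(model, X, y, cv=5, scoring="f1_weighted").mean()
    print(f"{name:12s} acc={acc:.3f} f1_weighted={f1:.3f}")
```

Comparing the two columns side by side is exactly the accuracy-vs-F1 check discussed above: a large gap flags a model leaning on the majority class.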
Step 5 - Feature Importance
What is this chart?
A horizontal bar chart showing how much each feature contributes to the Decision Tree's decisions. The longer the bar, the more that feature is used to split the data at decision nodes.
Why does this matter?
Feature importance helps us understand what actually drives wine quality predictions. It can guide winemakers on which chemical properties to focus on - and tells us which features we could potentially remove without losing much accuracy (e.g. citric acid and total SO₂ contribute very little).
Key Finding
Alcohol (48.8%) and volatile acidity (20.4%) together explain nearly 70% of the tree's decisions. Density, pH, and chlorides contribute a smaller but meaningful portion.
Python
dt.feature_importances_
# Top feature: alcohol at 48.8%
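To get a chart like the one described, each importance needs to be paired with its feature name and sorted. A minimal sketch on synthetic data where alcohol carries most of the signal (the notebook uses the real tree and all 11 features):

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Stand-in data: alcohol dominates the label, volatile acidity helps a
# little, citric acid is pure noise here.
rng = np.random.default_rng(1)
X = pd.DataFrame({
    "alcohol": rng.normal(10.5, 1.2, 600),
    "volatile acidity": rng.normal(0.28, 0.1, 600),
    "citric acid": rng.normal(0.33, 0.12, 600),
})
y = (X["alcohol"] + 2 * X["volatile acidity"] > 11.0).astype(int)

dt = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X, y)

# Pair importances with names and sort, as in the bar chart;
# importances always sum to 1.
imp = pd.Series(dt.feature_importances_, index=X.columns)
imp = imp.sort_values(ascending=False)
print(imp)
```

`imp.plot.barh()` would then draw the horizontal bar chart directly.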
Step 6 - Confusion Matrix
What is this chart?
A grid where each row is the actual quality score and each column is what the model predicted. The diagonal (top-left to bottom-right) shows correct predictions highlighted in red. Off-diagonal cells are mistakes.
Why does this matter?
Accuracy gives one number, but the confusion matrix shows where exactly the model fails. You can see which classes get confused with each other - for example, quality 5 and 6 are frequently swapped because they're so similar and so common in training.
Key Finding
Class 3 (only 20 samples) and Class 9 (only 5 samples) are never predicted correctly - the model simply hasn't seen enough examples to learn them. Classes 5 and 6 dominate predictions.
Python
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y, y_pred)
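The "never predicted correctly" failure mode for rare classes is easy to see on a toy matrix. A minimal illustrative sketch (not the wine data): class 3 appears only twice and is always misclassified, so its row has a zero on the diagonal, exactly as happens for quality 3 and 9 in the real matrix:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy labels: class 3 is rare and never predicted correctly.
y_true = np.array([5, 5, 5, 6, 6, 6, 3, 3])
y_pred = np.array([5, 5, 6, 6, 6, 5, 5, 6])

# Rows = actual class, columns = predicted class, in label order [3, 5, 6].
cm = confusion_matrix(y_true, y_pred, labels=[3, 5, 6])
print(cm)
# Row for class 3 is [0, 1, 1]: zero correct predictions for the rare class.
```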
Step 7 - Does Normalization Help? (Bonus)
What is this chart?
A bar chart comparing SVM and kNN performance with raw features versus normalized features (using StandardScaler, which rescales every feature to have mean=0 and std=1).
Why does this matter?
SVM and kNN are distance-based algorithms - they measure how 'close' data points are to each other. If one feature ranges from 0.1 to 1.0 and another ranges from 1 to 300, the larger-scale feature dominates the distance calculation unfairly. Normalizing puts all features on equal footing.
Key Finding
kNN improved from 47.5% to 55.7% accuracy just by normalizing - a gain of 8.2 percentage points with zero other changes. SVM improved from 44.9% to 56.7%. This validates the suggestion in the report.