Computational Statistics

Chapter 5 - Visualization of Multivariate Data

Dr. Mehdi Maadooliat

Marquette University
MATH 4750 - Spring 2025

Scatterplot Matrices

  • Introduction to scatterplot matrices for multivariate data
  • Example using the iris dataset
data(iris)

# Scatterplot matrix for virginica species
pairs(iris[101:150, 1:4], main="Scatterplot Matrix for Virginica", col="blue")

Customizing Scatterplot Matrices

  • Customizing scatterplot matrices with density plots
# Adding density plots to scatterplot matrix
panel.d <- function(x, ...) {
    usr <- par("usr")
    on.exit(par(usr))
    par(usr = c(usr[1:2], 0, .5))
    lines(density(x), col="red")
}
pairs(iris[101:150, 1:4], panel = panel.d, main="Customized Scatterplot Matrix")

Correlation Plots

  • Visualizing correlations between variables
  • Example: Correlation plot with corrplot package
# Correlation plot for iris dataset
library(corrplot)
cor_iris <- cor(iris[, 1:4])
corrplot(cor_iris, method="circle", main="Correlation Plot for Iris Dataset")

3D Scatter Plots

  • Visualizing 3D relationships between variables
  • Example: 3D scatter plot using the scatterplot3d package
# 3D scatter plot for iris dataset
library(scatterplot3d)
scatterplot3d(iris$Sepal.Length, iris$Petal.Length, iris$Sepal.Width, color="darkgreen", pch=16, main="3D Scatter Plot of Iris Data")

Principal Component Analysis (PCA)

  • Introduction to PCA for dimensionality reduction
  • Example: PCA visualization using ggplot2
# PCA for iris dataset
library(ggplot2)
pca <- prcomp(iris[, 1:4], scale=TRUE)
pca_df <- data.frame(pca$x, Species=iris$Species)

# Visualizing PCA results
ggplot(pca_df, aes(PC1, PC2, color=Species)) +
  geom_point(size=2) +
  labs(title="PCA of Iris Dataset", x="Principal Component 1", y="Principal Component 2") +
  theme_minimal()

Conclusion

  • Recap of visualization techniques for multivariate data
  • Practice: Generate visualizations for other datasets and explore 3D plots