Deep Learning in R
Marquette University
SCoRT - Summer 2025
Why deep learning? Complex function approximation and automatic feature engineering.
Keras in R: high-level API for TensorFlow backend
Installation: install the keras3 package from CRAN, then run install_keras().
Backend setup: install_keras() automatically installs the TensorFlow backend into a Python environment.
Benefits: a single high-level R interface to a fast, GPU-capable backend.
Verify the setup with tensorflow::tf_config().
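A minimal installation sketch (keras3 is on CRAN; install_keras() provisions a Python environment with TensorFlow):
# Install the R package from CRAN
install.packages("keras3")
# Provision the TensorFlow backend in a dedicated Python environment
keras3::install_keras()
# Confirm the backend is visible from R
tensorflow::tf_config()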
[Diagram: Artificial Intelligence ⊃ Machine Learning ⊃ Deep Learning, shown as nested circles]

Training a neural network involves four ingredients (mapped to concrete keras3 calls below):
The input data and corresponding targets
Layers, which are combined into a network (or model)
The loss function, which provides feedback for learning
The optimizer, which determines how learning proceeds
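Each ingredient corresponds to a keras3 call; a schematic sketch (the names and layer sizes are placeholders, not the model used later):
mnist <- dataset_mnist()                             # 1. input data and targets
net <- keras_model_sequential() %>%                  # 2. layers combined into a model
  layer_dense(units = 10, activation = "softmax")
net %>% compile(loss = "categorical_crossentropy",   # 3. loss function
                optimizer = "rmsprop")               # 4. optimizer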


The first successful practical application of neural nets came in 1989 from Bell Labs, when Yann LeCun combined the earlier ideas of convolutional neural networks and backpropagation, and applied them to the problem of classifying handwritten digits.
The resulting network, dubbed LeNet, was used by the USPS in the 1990s to automate the reading of ZIP codes on mail envelopes.
LeNet-5 was later applied by several banks to recognize handwritten numbers on checks, digitized as 32×32-pixel images.

In 2011, Dan Ciresan of IDSIA (Switzerland) began winning academic image-classification competitions with GPU-trained deep neural networks.
In 2012, a team led by Alex Krizhevsky and advised by Geoffrey Hinton achieved a top-5 accuracy of 83.6% on ImageNet, a significant breakthrough (the 2011 winner had reached only 74.3%).
Three forces are driving advances in ML: hardware, datasets and benchmarks, and algorithmic advances.

VGG16 is a convolutional neural network model proposed by K. Simonyan and A. Zisserman from the University of Oxford.
The model achieves 92.7% top-5 test accuracy on ImageNet and was one of the well-known models submitted to ILSVRC-2014.
It improves on AlexNet by replacing large kernels (11×11 and 5×5 in the first and second convolutional layers, respectively) with stacks of consecutive 3×3 kernels.
VGG16 was trained for weeks on NVIDIA Titan Black GPUs.
Perceptron: the basic neuron, computing activation(input × weight + bias); see the sketch below
Layers: dense layers with activation functions (relu, sigmoid, softmax)
Forward & backward pass: feedforward computation, then gradient descent via backpropagation
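A minimal perceptron sketch in base R (the weights, bias, and inputs are illustrative values):
relu <- function(x) pmax(0, x)                 # activation function
perceptron <- function(x, w, b) relu(sum(x * w) + b)
x <- c(0.5, -1.2, 0.3)                         # illustrative inputs
w <- c(0.8, 0.1, -0.5)                         # illustrative weights
b <- 0.2                                       # illustrative bias
perceptron(x, w, b)                            # returns 0.33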
A Neural Network



Activation Function
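For reference, the common activations written out in base R (a sketch, not the keras3 implementations):
relu    <- function(x) pmax(0, x)
sigmoid <- function(x) 1 / (1 + exp(-x))
softmax <- function(x) { e <- exp(x - max(x)); e / sum(e) }  # shifted by max(x) for numerical stability
curve(sigmoid, from = -6, to = 6)  # visualize one of them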





Hidden layers disappear without nonlinear activations: a stack of purely linear layers collapses into a single linear transformation.
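A quick numeric check of this collapse (the matrices are illustrative):
W1 <- matrix(c(1, 2, 3, 4), 2, 2)       # first linear layer
W2 <- matrix(c(0.5, -1, 2, 0), 2, 2)    # second linear layer
x  <- c(1, -1)
W2 %*% (W1 %*% x)    # two stacked linear layers...
(W2 %*% W1) %*% x    # ...give the same result as one combined linear layer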







For local GPU support, additional setup steps are needed: https://tensorflow.rstudio.com/tools/local_gpu.html
library(keras3)
# Load MNIST (Modified National Institute of Standards and Technology) images datasets
c(c(x_train, y_train), c(x_test, y_test)) %<-% dataset_mnist()
# Flatten 28x28 images into 784-vectors and scale grayscale pixel values to [0,1]
x_train <- array_reshape(x_train, c(nrow(x_train), 784))
x_test <- array_reshape(x_test, c(nrow(x_test), 784))
x_train <- x_train / 255
x_test <- x_test / 255
# Convert class vectors to binary class matrices
y_train <- to_categorical(y_train, 10)
y_test <- to_categorical(y_test, 10)
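A quick sanity check of the preprocessed shapes (MNIST has 60,000 training and 10,000 test images):
dim(x_train)  # 60000 784
dim(y_train)  # 60000 10
dim(x_test)   # 10000 784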
model <- keras_model_sequential(input_shape = c(784))
model %>%
layer_dense(units = 256, activation = 'relu') %>%
layer_dropout(rate = 0.4) %>%
layer_dense(units = 128, activation = 'relu') %>%
layer_dropout(rate = 0.3) %>%
layer_dense(units = 10, activation = 'softmax')
summary(model)
># Model: "sequential"
># ┌───────────────────────┬──────────────────┬───────────┐
># │ Layer (type)          │ Output Shape     │   Param # │
># ├───────────────────────┼──────────────────┼───────────┤
># │ dense_11 (Dense)      │ (None, 256)      │   200,960 │
># │ dropout_3 (Dropout)   │ (None, 256)      │         0 │
># │ dense_10 (Dense)      │ (None, 128)      │    32,896 │
># │ dropout_2 (Dropout)   │ (None, 128)      │         0 │
># │ dense_9 (Dense)       │ (None, 10)       │     1,290 │
># └───────────────────────┴──────────────────┴───────────┘
># Total params: 235,146 (918.54 KB)
># Trainable params: 235,146 (918.54 KB)
># Non-trainable params: 0 (0.00 B)

Compile the model, then use fit() to train it for 10 epochs with batches of 128 images:
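The compile() and fit() calls are not shown on the slide; a sketch consistent with the log below (375 steps per epoch at batch size 128 implies validation_split = 0.2 on the 60,000 training images; the rmsprop optimizer is an assumption):
model %>% compile(
  loss = 'categorical_crossentropy',  # matches the one-hot targets
  optimizer = 'rmsprop',              # assumption: the slide does not name the optimizer
  metrics = 'accuracy'
)
history <- model %>% fit(
  x_train, y_train,
  epochs = 10, batch_size = 128,
  validation_split = 0.2              # 48,000 training / 12,000 validation images = 375 steps
)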
Epoch 1/10
375/375 ━━━━━ 3s 5ms/step - accuracy: 0.7831 - loss: 0.6970 - val_accuracy: 0.9513 - val_loss: 0.1640
Epoch 2/10
375/375 ━━━━━ 1s 3ms/step - accuracy: 0.9371 - loss: 0.2123 - val_accuracy: 0.9628 - val_loss: 0.1249
Epoch 3/10
375/375 ━━━━━ 1s 3ms/step - accuracy: 0.9539 - loss: 0.1540 - val_accuracy: 0.9666 - val_loss: 0.1098
Epoch 4/10
375/375 ━━━━━ 1s 3ms/step - accuracy: 0.9612 - loss: 0.1301 - val_accuracy: 0.9743 - val_loss: 0.0865
Epoch 5/10
375/375 ━━━━━ 1s 3ms/step - accuracy: 0.9663 - loss: 0.1145 - val_accuracy: 0.9730 - val_loss: 0.0921
Epoch 6/10
375/375 ━━━━━ 1s 3ms/step - accuracy: 0.9688 - loss: 0.1020 - val_accuracy: 0.9736 - val_loss: 0.0923
Epoch 7/10
375/375 ━━━━━ 1s 3ms/step - accuracy: 0.9726 - loss: 0.0940 - val_accuracy: 0.9770 - val_loss: 0.0822
Epoch 8/10
375/375 ━━━━━ 1s 3ms/step - accuracy: 0.9742 - loss: 0.0875 - val_accuracy: 0.9770 - val_loss: 0.0815
Epoch 9/10
375/375 ━━━━━ 1s 3ms/step - accuracy: 0.9750 - loss: 0.0791 - val_accuracy: 0.9785 - val_loss: 0.0810
Epoch 10/10
375/375 ━━━━━ 1s 3ms/step - accuracy: 0.9769 - loss: 0.0744 - val_accuracy: 0.9777 - val_loss: 0.0835 
plot(history)  # per-epoch loss and accuracy curves for training and validation
# Predicted digit classes for the first 100 test images
model %>% predict(x_test[1:100, ]) %>% apply(1, which.max) - 1
># 4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 957us/step
># [1] 7 2 1 0 4 1 4 9 6 9 0 6 9 0 1 5 9 7
># [19] 3 4 9 6 6 5 4 0 7 4 0 1 3 1 3 4 7 2
># [37] 7 1 2 1 1 7 4 2 3 5 1 2 4 4 6 3 5 5
># [55] 6 0 4 1 9 5 7 8 9 3 7 4 6 4 3 0 7 0
># [73] 2 9 1 7 3 2 9 7 7 6 2 7 8 4 7 3 6 1
># [91] 3 6 9 3 1 4 1 7 6 9
# Class probabilities for the first nine test images, rounded to 5 digits
round(model %>% predict(x_test[1:9, ]), 5)
># 1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step
># [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
># [1,] 0.00000 0.00000 0e+00 0.000 0.00000 0.00000 0.00000 1.00000 0e+00 0.00000
># [2,] 0.00000 0.00000 1e+00 0.000 0.00000 0.00000 0.00000 0.00000 0e+00 0.00000
># [3,] 0.00000 0.99983 1e-05 0.000 0.00001 0.00000 0.00000 0.00014 1e-05 0.00000
># [4,] 0.99986 0.00000 6e-05 0.000 0.00000 0.00000 0.00007 0.00000 0e+00 0.00000
># [5,] 0.00000 0.00000 0e+00 0.000 0.99995 0.00000 0.00000 0.00000 0e+00 0.00005
># [6,] 0.00000 0.99998 0e+00 0.000 0.00000 0.00000 0.00000 0.00002 0e+00 0.00000
># [7,] 0.00000 0.00000 0e+00 0.000 0.99984 0.00000 0.00000 0.00000 3e-05 0.00013
># [8,] 0.00000 0.00001 1e-05 0.002 0.00007 0.00007 0.00000 0.00044 4e-05 0.99737
># [9,] 0.00000 0.00000 0e+00 0.000 0.00000 0.30770 0.69230 0.00000 0e+00 0.00000





A small convolutional network (CNN) for the same task:
cnn <- keras_model_sequential(input_shape = c(28, 28, 1)) %>%
  layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = 'relu') %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_conv_2d(filters = 64, kernel_size = c(3, 3), activation = 'relu') %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_flatten() %>%
  layer_dense(units = 64, activation = 'relu') %>%
  layer_dense(units = 10, activation = 'softmax')
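Note that the CNN expects image-shaped input rather than the flattened 784-vectors used above; a minimal reshape sketch (the _img variable names are illustrative):
x_train_img <- array_reshape(x_train, c(nrow(x_train), 28, 28, 1))  # restore 28x28x1 shape
x_test_img  <- array_reshape(x_test,  c(nrow(x_test), 28, 28, 1))
# cnn is then compiled and fit exactly as before, using x_train_img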


Pretrained models and transfer learning: application_resnet50() loads a pretrained ResNet-50 for fine-tuning.
Saving and loading models: save_model_hdf5(), load_model_hdf5()
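A sketch of that workflow (the file name is illustrative; save_model_hdf5()/load_model_hdf5() come from the classic keras package, while keras3 also provides save_model()/load_model()):
base <- application_resnet50(weights = "imagenet", include_top = FALSE)  # pretrained backbone without the classifier head
freeze_weights(base)                      # assumption: keep pretrained weights fixed during fine-tuning
save_model_hdf5(model, "mnist_mlp.h5")    # persist the trained MLP
model2 <- load_model_hdf5("mnist_mlp.h5")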
Thank You