
Results and Discussion
This section compares the performance of state-of-the-art (SOTA) models in terms of accuracy, time, and loss.
Time Analysis
The training and testing times of the SOTA models are compared in Table 1. The results indicate that most DL models train in reasonable time, with the transformer-based model as the exception. The LSTM- and CNN-based models complete an epoch in well under two minutes, and the BoT-based model needs only about 13 s per epoch, whereas the transformer model requires roughly 28 min. The testing-phase results are aligned with the training phase. Although time analysis alone does not give a complete picture of a model, it reveals a considerable efficiency gap between BERT and the other models.
Table 1. Training and testing time comparison of SOTA models.

Epoch/Test | LSTM | BoT | CNN | Transformer
---|---|---|---|---
1 | 1 m 41 s | 0 m 14 s | 0 m 30 s | 28 m 4 s
2 | 1 m 40 s | 0 m 13 s | 0 m 30 s | 28 m 7 s
3 | 1 m 41 s | 0 m 13 s | 0 m 30 s | 28 m 6 s
4 | 1 m 40 s | 0 m 13 s | 0 m 30 s | 28 m 7 s
5 | 1 m 40 s | 0 m 13 s | 0 m 30 s | 28 m 7 s
6 | 1 m 41 s | 0 m 13 s | 0 m 30 s | 28 m 6 s
7 | 1 m 41 s | 0 m 13 s | 0 m 30 s | 28 m 4 s
8 | 1 m 40 s | 0 m 13 s | 0 m 30 s | 28 m 7 s
9 | 1 m 40 s | 0 m 13 s | 0 m 30 s | 27 m 56 s
10 | 1 m 40 s | 0 m 13 s | 0 m 30 s | 27 m 58 s
Testing | 15 ms | 9 ms | 10 ms | 35 ms
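For reference, per-epoch times like those in Table 1 can be collected by wrapping the training loop in a wall-clock timer. The following PyTorch sketch illustrates the idea; the model, dataset, and hyperparameters are placeholders chosen for illustration, not the configurations used in this study.

```python
import time
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-ins for the real dataset and model (illustration only).
data = TensorDataset(torch.randn(1024, 32), torch.randint(0, 2, (1024,)))
loader = DataLoader(data, batch_size=64, shuffle=True)
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(model.parameters())
criterion = nn.CrossEntropyLoss()

for epoch in range(10):
    start = time.perf_counter()  # wall-clock timer for this epoch
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
    elapsed = time.perf_counter() - start
    print(f"epoch {epoch + 1}: {int(elapsed // 60)} m {elapsed % 60:.0f} s")
```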
Validation and Test Losses
Validation loss is another critical metric for evaluating how well a model fits new data, and it is also a good indicator of overfitting. The models' validation, training, and test losses are shown in Fig. 1 and Table 2.
Fig. 1. Validation and training losses of the models.
The loss curve of the transformer-based model indicates that it converges faster than the other models, requiring fewer training epochs. This is likely a result of the pre-training of the transformer model.
Table 2. Test losses of the models.
Models | LSTM | BoT | CNN | Transformer
---|---|---|---|---
Loss | 0.323 | 0.391 | 0.344 | 0.209
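Test and validation losses such as those in Table 2 are typically computed as the average criterion value over a held-out split, with gradients disabled. A minimal sketch, assuming the same model, loader, and criterion conventions as the timing sketch above:

```python
import torch

@torch.no_grad()
def evaluate_loss(model, loader, criterion):
    """Average per-sample loss over a held-out split."""
    model.eval()
    total, count = 0.0, 0
    for x, y in loader:
        # criterion returns the batch mean, so scale by batch size
        total += criterion(model(x), y).item() * y.size(0)
        count += y.size(0)
    model.train()
    return total / count
```

Calling such a routine after every epoch and plotting it alongside the training loss yields curves like those in Fig. 1; a validation loss that rises while the training loss keeps falling is the classic overfitting signal.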
Validation Accuracy
Validation accuracy, in combination with validation loss, can be used to assess a model's generalization ability. The validation and testing accuracies of the models are given in Table 3. The validation accuracy reveals that five epochs of training are enough to obtain good results, which is in line with the validation loss. The testing accuracy is aligned with the validation accuracy, with the transformer-based model achieving the best performance.
Table 3. Validation and testing accuracies of the models.
Epoch/Test | LSTM | BoT | CNN | Transformer
---|---|---|---|---
1 | 73.28% | 72.08% | 77.03% | 91.93%
2 | 82.19% | 76.29% | 84.56% | 91.76%
3 | 79.64% | 80.50% | 86.06% | 92.02%
4 | 87.71% | 83.55% | 86.78% | 90.74%
5 | 87.81% | 85.47% | 86.99% | 91.31%
6 | 89.27% | 85.47% | 87.23% | 91.31%
7 | 89.65% | 87.09% | 87.16% | 90.89%
8 | 91.52% | 87.68% | 87.30% | 91.19%
9 | 88.06% | 88.07% | 86.96% | 92.15%
10 | 89.69% | 88.46% | 87.40% | 91.85%
Testing | 86.96% | 85.18% | 85.04% | 91.58%
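The validation and testing accuracies above correspond to the fraction of correctly predicted labels on the held-out split. A minimal sketch under the same assumptions as the earlier snippets:

```python
import torch

@torch.no_grad()
def evaluate_accuracy(model, loader):
    """Percentage of correct predictions on a held-out split."""
    model.eval()
    correct, count = 0, 0
    for x, y in loader:
        # take the class with the highest logit as the prediction
        correct += (model(x).argmax(dim=1) == y).sum().item()
        count += y.size(0)
    model.train()
    return 100.0 * correct / count
```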
Observations derived from the performance comparisons are outlined below.
Observation 1: The BoT-based model is faster than the other DL models.
Observation 2: The transformer-based model takes substantially longer than the others to train and predict.
Observation 3: The optimal number of epochs can be determined from the accuracy and loss of the training and validation phases; in our experiments, five epochs of training are sufficient (see the sketch after this list).
Observation 4: The transformer-based model converges in fewer epochs than the other models.
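To make Observation 3 concrete, the sketch below picks the stopping epoch from per-epoch validation losses with a simple patience rule. The loss values in the usage example are hypothetical, not taken from our experiments.

```python
def best_epoch(val_losses, patience=2):
    """Return the 1-indexed epoch with the lowest validation loss,
    stopping once the loss has not improved for `patience` epochs."""
    best, best_idx, stale = float("inf"), 0, 0
    for i, loss in enumerate(val_losses):
        if loss < best:
            best, best_idx, stale = loss, i, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_idx + 1  # epochs are 1-indexed, as in Tables 1 and 3

# Hypothetical per-epoch validation losses:
print(best_epoch([0.55, 0.42, 0.36, 0.33, 0.31, 0.32, 0.33, 0.34]))  # -> 5
```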