Summary of Mitigation Strategies

This paper has reviewed numerous techniques for big data analytics and the impact of uncertainty on each technique. Table 2 summarizes these findings. The first column categorizes each AI technique as ML, NLP, or CI. The second column describes how uncertainty affects each technique, both through uncertainty in the data and through uncertainty in the technique itself. The third column summarizes proposed mitigation strategies for each uncertainty challenge. For example, the first row of Table 2 shows that uncertainty can be introduced in ML through incomplete training data. One approach to overcoming this specific form of uncertainty is active learning, which trains on a subset of the data chosen to be the most significant, thereby countering the problem of limited available training data.

Table 2 Uncertainty mitigation strategies

Artificial intelligence | Uncertainty | Mitigation
Machine learning | Incomplete training samples; inconsistent classification; learning from low veracity and noisy data | Active learning, deep learning, fuzzy sets, feature selection
Machine learning | Learning from unlabeled data | Active learning
Machine learning | Scalability | Distributed learning, deep learning
Natural language processing | Keyword search | Fuzzy, Bayesian
Natural language processing | Ambiguity of words in POS | ICA, LIBLINEAR and MNB algorithm
Natural language processing | Classification (simplifying language assumption) | ICA, open issue
Computational intelligence | Low veracity, complex and noisy data | Fuzzy logic, EA
Computational intelligence | High volume, variety | Swarm intelligence, EA, fuzzy-logic based matching algorithm
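
To make the active learning mitigation in the first row of Table 2 concrete, the following is a minimal sketch of pool-based active learning, not an implementation taken from any of the surveyed works. It assumes uncertainty sampling as the criterion for choosing the "most significant" samples, a toy one-dimensional threshold classifier, and a hypothetical `oracle` function that stands in for a human annotator. All names and values (`oracle`, `fit_threshold`, `active_learn`, the boundary at 62) are illustrative assumptions.

```python
def oracle(x, boundary=62):
    """Hypothetical labeler standing in for a human annotator (assumption)."""
    return 1 if x >= boundary else 0


def fit_threshold(labeled):
    """Fit a toy 1-D classifier: the midpoint between the two classes.

    Assumes `labeled` holds at least one example of each class.
    """
    highest_zero = max(x for x, y in labeled.items() if y == 0)
    lowest_one = min(x for x, y in labeled.items() if y == 1)
    return (highest_zero + lowest_one) / 2


def active_learn(pool, seed_labels, budget):
    """Request up to `budget` labels, each for the most uncertain pool point."""
    labeled = dict(seed_labels)
    queried = []
    for _ in range(budget):
        threshold = fit_threshold(labeled)
        unlabeled = [x for x in pool if x not in labeled]
        if not unlabeled:
            break
        # Uncertainty sampling: the point nearest the decision threshold
        # is the one the current model is least sure about.
        query = min(unlabeled, key=lambda x: abs(x - threshold))
        labeled[query] = oracle(query)
        queried.append(query)
    return fit_threshold(labeled), queried


def accuracy(pool, threshold):
    """Fraction of pool points whose predicted label matches the oracle."""
    correct = sum((1 if x >= threshold else 0) == oracle(x) for x in pool)
    return correct / len(pool)


pool = list(range(100))                      # unlabeled pool of 100 points
seed_labels = {0: oracle(0), 99: oracle(99)}  # one labeled example per class
threshold, queried = active_learn(pool, seed_labels, budget=6)
print(queried, threshold, accuracy(pool, threshold))
```

In this toy setting the learner queries the points 49, 74, 61, 67, 64, and 62, and recovers the decision boundary using 8 labels in total (2 seed labels plus 6 queries) out of 100 pool points. Real data are rarely this cleanly separable, so the sketch illustrates only the selection principle, not the label savings to expect in practice.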

Note that we discussed each big data characteristic separately. However, combining two or more big data characteristics will incur exponentially more uncertainty, thus requiring even further study.