Read this article. What are the main challenges of using big data?
Hadoop makes it possible to process, manage, and analyze extremely large volumes of data with varying structure or no structure at all. However, Hadoop also has some limitations.
The Generation of Multiple Copies of Big Data. HDFS was built for efficiency through replication, so data is stored in multiple copies: at least three copies are generated by default, and as many as six copies may be kept to sustain performance through data locality. As a result, the big data grows even larger.
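To make this concrete, the replication factor can be inspected and adjusted per file through the HDFS client API. The following is only a minimal sketch, assuming a standard Hadoop client classpath and a reachable cluster; the file path /data/events.log is hypothetical.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();   // picks up core-site.xml / hdfs-site.xml
        FileSystem fs = FileSystem.get(conf);

        Path file = new Path("/data/events.log");   // hypothetical example file

        // Default replication factor configured for the cluster (commonly 3).
        short defaultReplication = fs.getDefaultReplication(file);
        System.out.println("Default replication: " + defaultReplication);

        // Current replication factor of this particular file.
        short current = fs.getFileStatus(file).getReplication();
        System.out.println("Current replication of " + file + ": " + current);

        // Raising the factor multiplies the stored volume of the same data.
        fs.setReplication(file, (short) 6);

        fs.close();
    }
}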
Challenging Framework. The MapReduce framework is complicated, particularly when complex transformational logic must be applied, as the sketch below illustrates. Open-source modules such as Pig and Hive attempt to simplify the framework, but they rely on their own specialized languages that must be learned.
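To show how much boilerplate even a trivial per-key aggregation requires under the classic MapReduce API, here is a rough sketch; the class names, input format ("region,amount" lines), and paths are illustrative assumptions, and it presumes the standard hadoop-mapreduce-client libraries.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SalesByRegion {

    // Mapper: parse each input line ("region,amount") and emit (region, amount).
    public static class RegionMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            context.write(new Text(fields[0]), new LongWritable(Long.parseLong(fields[1])));
        }
    }

    // Reducer: sum the amounts emitted for each region.
    public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                throws IOException, InterruptedException {
            long total = 0;
            for (LongWritable v : values) {
                total += v.get();
            }
            context.write(key, new LongWritable(total));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "sales by region");
        job.setJarByClass(SalesByRegion.class);
        job.setMapperClass(RegionMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

The same aggregation would be a single GROUP BY clause in SQL, which is precisely why higher-level modules were created, and why the limited SQL support described next matters.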
Very Limited SQL Support. Hadoop combines open-source projects and programming frameworks across a distributed system. Consequently, it offers only limited SQL support and lacks basic SQL functions such as subqueries and "group by" analytics.
Lack of Essential Skills. Data mining libraries are implemented inconsistently as part of the Hadoop project. Thus, knowledge of algorithms and skill in developing distributed MapReduce programs are necessary.
Inefficient Execution. HDFS lacks a query optimizer and therefore cannot execute an efficient cost-based plan. As a result, Hadoop clusters are often significantly larger than a comparable database would need to be for the same workload.