Using Text Mining Techniques to Identify Research Trends
Site: | Saylor Academy |
Course: | BUS610: Advanced Business Intelligence and Analytics |
Book: | Using Text Mining Techniques to Identify Research Trends |
Description
Combining text mining techniques and bibliometric analysis can help uncover hidden information in scientific publications and unseen patterns and trends in research fields. Text mining may help researchers gain a more comprehensive understanding of the knowledge of a field that is hidden in a large amount of scientific literature. Clustering can provide a more detailed structural overview of a field. Social network analysis (SNA) explores core themes and allows researchers to better understand how a field has developed. How do you think SNA enables companies to understand your purchasing decisions? What are some text mining techniques companies might use to find connections among customer demographic characteristics? Using one of the free tools listed here, map your own interactions with friends and the mutual brands advertised to you. What similarities do you see?
Abstract
The research goal of this paper is to identify major academic branches and to detect research trends in design research using text mining techniques. In this paper, information about the scientific literature in design research is processed. A combination of clustering and bibliometric analysis is used to identify four academic branches and to summarize each of them. Then, the research trends and evolution of each academic branch are explored. We perform a two-dimensional text mining approach, including bibliometric and network analysis, in order to detect trends of the major academic branches. Specifically, the bibliometric characterization aims to assess design research area outputs, while the network analysis intends to reveal research trends in each academic branch of design research and the evolution of core research themes.
Keywords: text mining; bibliometric analysis; trend analysis; design research
Source: Binling Nie and Shouqian Sun, https://www.mdpi.com/2076-3417/7/4/401/htm
This work is licensed under a Creative Commons Attribution 4.0 License.
1. Introduction
The volume and diversity of scientific literature are expanding every day. The abundant archived scientific literature records in detail a basis for future development. Data mining tries to discover information hidden in scientific literature that is not accessible by simple statistical techniques. Text mining techniques are a significant subset of data mining that aims to extract knowledge from unstructured or semi-structured textual data, and they have widespread applications in analyzing and processing textual documents. Hence, combining text mining techniques and bibliometric analysis can help us discover more unseen patterns in research fields than simple bibliometric analysis alone.
Great design is crucial to making products more aesthetically pleasing and more practical and to providing more adaptable services. The Design Management Institute reported that many design-centric companies like Apple and Target outperformed the S&P 500 by 224% between 2003 and 2013. Therefore, building a systematic view of design research becomes increasingly essential. Until now, analyzing the design research area from the perspective of scientometrics has remained largely unexplored. To the best of our knowledge, only one paper has applied a quantitative approach to investigating the evolution and future trends of design research, by analyzing citations of papers in the journal Design Studies. Additionally, few existing studies have presented a qualitative method to analyze the design research area. Yet, there has been no in-depth study that keeps track of the current advances in the design research area.
Consequently, we use text mining techniques to identify major academic branches and detect research trends in design research. Our work is perhaps the first attempt of this kind to present a detailed bibliometric analysis of a research field.
2. Materials and Methods
In this section, we demonstrate how to implement the text mining process. First, we prepare all of the needed data and preprocess them into the required format (Section 2.1). Then, we discuss a combination of clustering and bibliometric analysis to identify major academic branches of design research and summarize each academic branch (Section 2.2). Finally, we describe how a two-dimensional text mining approach including bibliometric and network analysis is used to detect research trends from different perspectives (Section 2.3).
2.1. Data Preparation
To make the research goal of this paper clear, we define design as follows: design may be a substantive reference to the aesthetic, functional, economic and sociopolitical dimensions of both the design object and the design process. The data are provided by Web of Science (WOS). The retrieval strategy is as follows: the subject is "design research" or "design studies"; the publication period is 2004 to 2015 (the data were retrieved during May and June 2016); the reference type is "Article"; the language is "English"; and the data category is refined to "engineering multidisciplinary or engineering manufacturing or engineering industrial". To better meet the definition of design, we expand the set with papers from the main journals discussed in "Quality perceptions of design journals: The design scholars' perspective". For obviously unrelated journals, for example Designs, Codes and Cryptography, we exclude all of their papers. A total of 20,218 publications is obtained. We further divide the data into two parts: BASE data and Citation data. BASE data store the following items: title, authors, journal name, keywords, abstract and publishing date. The items in BASE data are fed into the BASE table. Citation data (i.e., references cited by the articles included in the core dataset) store the following items: the cited article's title, the cited article's author, the cited article's journal, the total number of citing articles and the number of citing articles per year during 2004–2015. Similarly, the Citation data are fed into the Citation table.
2.2. Identification of Major Academic Branches
The major academic branches are identified in three steps:
- Preprocess the document set, including word segmentation and stop word removal.
- Use Latent Dirichlet Allocation (LDA) to extract features and determine the optimal number of topics (features).
- Run the K-means algorithm with k = 2, 3, 4, 5, 6 and 7, then combine the Sum of Squared Error (SSE) and the inter-cluster distance to obtain the best classification result. We use the distance between cluster centroids to measure inter-cluster distance. The SSE is given by Equation (1).
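$$\mathrm{SSE} = \sum_{i=1}^{k} \sum_{x \in C_i} \lVert x - m_i \rVert^{2} \qquad (1)$$
where $k$ is the number of clusters, $C_i$ is the $i$-th cluster, $x$ represents a document that belongs to the cluster $C_i$ and $m_i$ is the clustering centroid of the cluster $C_i$.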
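As a rough illustration of the K-means step, the sketch below runs a plain Lloyd-style K-means on toy two-dimensional points and then computes the SSE and the pairwise distances between cluster centroids. It is a minimal sketch, not the authors' implementation: the function names (`kmeans`, `sse`, `centroid_distances`), the toy data, the Euclidean distance and the random initialization are all assumptions made for illustration. In the paper, the points would be the per-document topic distributions produced by LDA.

```python
import math
import random


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd-style K-means; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Assumption: initialize the centroids with k randomly chosen input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def sse(points, labels, centroids):
    """Sum of squared distances from each point to its cluster centroid."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


def centroid_distances(centroids):
    """Pairwise centroid distances, keyed by 1-based cluster pair."""
    n = len(centroids)
    return {
        (i + 1, j + 1): math.dist(centroids[i], centroids[j])
        for i in range(n)
        for j in range(i + 1, n)
    }


# Hypothetical toy data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]

for k in (1, 2):
    labels, centroids = kmeans(points, k)
    print(k, round(sse(points, labels, centroids), 4), centroid_distances(centroids))
```

Comparing the SSE across several values of k, together with the centroid distances, mirrors the selection logic described in the third step: a k beyond which the SSE stops dropping sharply, and clusters whose centroids are not nearly coincident.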
We briefly introduce LDA below. LDA is a hierarchical Bayesian model used to model corpora of documents that can be represented as bags of words. The generative process assumes a set of topics, where each document is sampled from a mixture of topics and each topic is a discrete probability distribution that defines how likely each word is to appear in that topic. Because the number of topics affects how well the LDA model fits, we use perplexity, a common evaluation criterion, to determine the optimal number of topics. Normally, the lower the perplexity, the better the LDA model performs. Perplexity is calculated as follows,
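$$\mathrm{perplexity}(D) = \exp\left(-\frac{\sum_{d=1}^{M} \log p(w_d)}{\sum_{d=1}^{M} N_d}\right)$$
where $M$ is the number of documents, $N_d$ represents the length of the document $d$ and $p(w_d)$ is the probability that the LDA model generates the document $d$.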
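To make the perplexity criterion concrete, here is a small sketch that computes it from per-document log-likelihoods and document lengths. The function name `perplexity` and the toy numbers are assumptions for illustration; in practice the log-likelihoods would come from a fitted LDA model.

```python
import math


def perplexity(doc_log_likelihoods, doc_lengths):
    """exp(-(sum of log p(w_d)) / (sum of N_d)); lower is better."""
    return math.exp(-sum(doc_log_likelihoods) / sum(doc_lengths))


# Sanity check with a hypothetical model that assigns every word a
# probability of 1/4: a document of length N has log-likelihood N * log(1/4).
doc_lengths = [3, 5]
doc_log_likelihoods = [n * math.log(0.25) for n in doc_lengths]
uniform_perplexity = perplexity(doc_log_likelihoods, doc_lengths)
print(uniform_perplexity)
```

A model that spreads probability uniformly over V words has a perplexity of V (up to floating-point rounding), which gives a useful reference point when comparing different numbers of topics.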
There are many methods to estimate the model parameters; we use the Gibbs sampling algorithm in this paper. The standard approach to "smoothing" the multinomial parameters is to assign positive probability to all vocabulary items, whether or not they are observed in the training set.
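The paper does not spell out its sampler, so the following is only a sketch of the widely used collapsed Gibbs sampler for LDA, written with the Python standard library. The function name `lda_gibbs`, the symmetric hyperparameters `alpha` and `beta`, the iteration count and the toy corpus of integer word ids are all assumptions for illustration. The `beta` term plays the smoothing role described above: every vocabulary item receives positive probability in every topic, even if it was never assigned to that topic.

```python
import random


def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iter=50, seed=0):
    """Collapsed Gibbs sampling for LDA; docs is a list of lists of word ids."""
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]                # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                              # tokens per topic
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            t = rng.randrange(n_topics)
            z_d.append(t)
            doc_topic[d][t] += 1
            topic_word[t][w] += 1
            topic_total[t] += 1
        assignments.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                t = assignments[d][i]
                doc_topic[d][t] -= 1
                topic_word[t][w] -= 1
                topic_total[t] -= 1
                # Conditional distribution over topics for this token.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                t = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = t
                doc_topic[d][t] += 1
                topic_word[t][w] += 1
                topic_total[t] += 1

    # Smoothed estimates: document-topic (theta) and topic-word (phi).
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + n_topics * alpha) for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta) for w in range(vocab_size)]
        for k in range(n_topics)
    ]
    return theta, phi


# Hypothetical toy corpus: word ids 0-2 and 3-5 tend to appear together.
docs = [[0, 1, 2, 0, 1], [3, 4, 5, 3, 4], [0, 2, 1, 1], [5, 4, 3, 3]]
theta, phi = lda_gibbs(docs, n_topics=2, vocab_size=6)
print(theta)
```

Each row of `theta` is a document's topic distribution, which is the kind of real-valued feature vector that can then be passed to K-means.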
Based on the classification results, we calculate traditional bibliometric indicators such as total outputs, total citations and citation-based impact assessment to provide an architectural overview of design research.
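A small sketch of these indicators, assuming each branch is represented simply by a list of per-paper citation counts: total publications (TP), total citations (TC) and average citations per paper (ACPP = TC / TP). The function name `branch_indicators`, the branch labels and the numbers are hypothetical and are not taken from the paper's data.

```python
def branch_indicators(citations_by_branch):
    """Return TP, TC and ACPP for each branch."""
    indicators = {}
    for branch, citations in citations_by_branch.items():
        tp = len(citations)   # total publications
        tc = sum(citations)   # total citations
        indicators[branch] = {"TP": tp, "TC": tc, "ACPP": round(tc / tp, 2)}
    return indicators


# Hypothetical per-paper citation counts for two made-up branches.
example = {"branch A": [3, 0, 9], "branch B": [4, 6]}
indicators = branch_indicators(example)
print(indicators)
```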
2.3. Detection of the Research Trends of Design Research
In this subsection, we describe how a two-dimensional text mining approach including bibliometric and network analysis is used to detect research trends from different perspectives. The bibliometric characterization aims to assess academic outputs trends and the development trends of the design research area. The network analysis intends to find out research trends in each academic branch of design research and the evolution of core research themes.
Bibliometric analysis involves total outputs and citation-based impact assessment. The number of research publications reflects the impact and usefulness of scientific research output. Additionally, the citation count is a good measure of the influence of a research paper. A high citation count indicates more effectiveness, usefulness and productiveness. A previous study computed the values of the Citation Function (CF) and the Co-Citation Function (CCF) to identify the changes of thematic clusters over time. The profiles of both functions are similar, and CCF is more expressive. However, CCF requires a large amount of data collection and computation. Therefore, we apply an approach that is based on CF. CF is computed from the following quantities: the number of documents in the analyzed thematic cluster; the number of citations of each document in each year; the total number of citations of each document; the maximum value of the citations in the analyzed documents; the period during which each document has been cited; and the maximum period during which a document can be cited for the data under analysis. The value of CF stands for a research domain's development level and is related to the number of citations only. It provides easy identification of the developing research areas and offers an alternative view on the subject development.
Network analysis is carried out to further analyze the social relations among the core themes. Keywords reflect some important information about research trends. In this paper, high frequency keywords are extracted from the abstract and title of scientific literature in each academic branch. After that, networks based on the keyword co-occurrence matrix are constructed to visualize the internal dynamics of major academic branches in the design research area by using Ucinet 6.0.
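The keyword co-occurrence matrix that feeds the network can be sketched as follows. This is an illustrative assumption about how such a matrix might be built, not the authors' pipeline: the function names `cooccurrence` and `to_matrix` and the sample keyword lists are hypothetical. Two keywords co-occur once for every document in which both appear.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(keyword_lists):
    """Count, for each unordered keyword pair, the documents containing both."""
    pair_counts = Counter()
    for keywords in keyword_lists:
        # Deduplicate within a document and sort so each pair has one key.
        for a, b in combinations(sorted(set(keywords)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def to_matrix(pair_counts):
    """Turn pair counts into a sorted vocabulary and a symmetric matrix."""
    vocab = sorted({word for pair in pair_counts for word in pair})
    index = {word: i for i, word in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for (a, b), n in pair_counts.items():
        matrix[index[a]][index[b]] = n
        matrix[index[b]][index[a]] = n
    return vocab, matrix


# Hypothetical keyword lists, one per document.
keyword_lists = [
    ["ergonomics", "biomechanics", "low back pain"],
    ["ergonomics", "biomechanics"],
    ["ergonomics", "safety", "ergonomics"],
]
pair_counts = cooccurrence(keyword_lists)
vocab, matrix = to_matrix(pair_counts)
print(vocab)
print(matrix)
```

The matrix entries can serve as edge weights in a co-word network, so that a larger count corresponds to a thicker line between two keywords.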
3. Results
Since this is the first quantitative study that analyzes the design research area in detail, it might be useful to reveal some general findings that provide an overview of the design research area. We list the locations of the authors and assess the journals' influence among the international design research community in Appendix A. Then, we combine text mining techniques and bibliometric analysis to obtain major academic branches of design research (Section 3.1) and identify the development trends of major academic branches (Section 3.2).
3.1. Major Academic Branches of Design Research
Major academic branches of design research are identified in this subsection. Firstly, the corpus of the LDA model is built from the BASE table and consists of 4893 unique words. To improve accuracy, a special stop word list for academic articles in design research is created to remove the stop words. Next, we repeatedly calculate the perplexity of the LDA model as the number of topics varies from 10 to 100, which aims to obtain an optimal topic set. Results are shown in Figure 1. When the number of topics is 30, the performance is the best. Thus, we apply LDA to remodel the preprocessed corpus by setting the number of topics to 30. Then, we take the topic distribution of each article as its real-valued features. Finally, we use K-means to perform classification with the selected features and compare the classification results, which are evaluated by SSE with different k values. Additionally, we compute the inter-cluster distance with different k values, as listed in Table 1. In Figure 2, we can see that the value of SSE gets smaller as the value of k gradually increases from 2 to 8. The value of SSE reaches a relatively stable state when the value of k is greater than four. From Table 1, we can observe that the inter-cluster distances between Cluster 1 and Cluster 4, Cluster 2 and Cluster 3, Cluster 2 and Cluster 5, and Cluster 3 and Cluster 5 are obviously smaller than the other inter-cluster distances when k is seven. We can merge Clusters 1 and 4 into one cluster and Clusters 2, 3 and 5 into another, and obtain four clusters. The same analysis applies when k is five and six. Thus, it is reasonable to classify all of the research articles into four clusters. Then, we calculate the four cluster centroids. According to the top three topics in each cluster centroid in Table 2 and the top ten words of each topic in Table A3 of Appendix B, each cluster can be interpreted as an academic branch of design.
Figure 1. The relationship between the degree of perplexity and the number of topics.
Figure 2. The relationship between SSE and the value of k in K-means.
Table 1. Inter-cluster distance with different k values.
k Value | Inter-Cluster Distance |
---|---|
2 | 1–2: 0.1771 |
3 | 1–2: 0.1770; 1–3: 0.2720; 2–3: 0.1991 |
4 | 1–2: 0.1222; 1–3: 0.2166; 1–4: 0.2831; 2–3: 0.1992; 2–4: 0.1832; 3–4: 0.2780 |
5 | 1–2: 0.1246; 1–3: 0.2202; 1–4: 0.2855; 1–5: 0.1820; 2–3: 0.1509; 2–4: 0.1836; 2–5: 0.0165; 3–4: 0.2800; 3–5: 0.1582; 4–5: 0.2438 |
6 | 1–2: 0.1073; 1–3: 0.1412; 1–4: 0.0220; 1–5: 0.1823; 1–6: 0.0400; 2–3: 0.2401; 2–4: 0.1786; 2–5: 0.2838; 2–6: 0.1460; 3–4: 0.2196; 3–5: 0.3037; 3–6: 0.1906; 4–5: 0.2594; 4–6: 0.0312; 5–6: 0.2331 |
7 | 1–2: 0.2880; 1–3: 0.2234; 1–4: 0.0266; 1–5: 0.1854; 1–6: 0.2598; 1–7: 0.2591; 2–3: 0.0120; 2–4: 0.1838; 2–5: 0.0476; 2–6: 0.1395; 2–7: 0.1373; 3–4: 0.2815; 3–5: 0.0506; 3–6: 0.2318; 3–7: 0.2308; 4–5: 0.2454; 4–6: 0.3172; 4–7: 0.3176; 5–6: 0.2014; 5–7: 0.1970; 6–7: 0.2533 |
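The merging argument can be checked mechanically on the k = 7 row of Table 1. The sketch below joins every pair of clusters whose centroid distance falls below a cutoff and reports the resulting groups. The cutoff of 0.1 and the helper name `merge_close_clusters` are assumptions chosen for illustration; the paper describes the close pairs only as "obviously smaller" and does not state a threshold.

```python
def merge_close_clusters(n_clusters, distances, threshold):
    """Group clusters linked by any pairwise distance below the threshold."""
    parent = list(range(n_clusters + 1))  # 1-based cluster ids

    def find(x):
        # Follow parent links to the group representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), dist in distances.items():
        if dist < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for c in range(1, n_clusters + 1):
        groups.setdefault(find(c), []).append(c)
    return sorted(groups.values())


# Inter-cluster distances for k = 7, copied from Table 1.
k7 = {
    (1, 2): 0.2880, (1, 3): 0.2234, (1, 4): 0.0266, (1, 5): 0.1854,
    (1, 6): 0.2598, (1, 7): 0.2591, (2, 3): 0.0120, (2, 4): 0.1838,
    (2, 5): 0.0476, (2, 6): 0.1395, (2, 7): 0.1373, (3, 4): 0.2815,
    (3, 5): 0.0506, (3, 6): 0.2318, (3, 7): 0.2308, (4, 5): 0.2454,
    (4, 6): 0.3172, (4, 7): 0.3176, (5, 6): 0.2014, (5, 7): 0.1970,
    (6, 7): 0.2533,
}

merged = merge_close_clusters(7, k7, threshold=0.1)  # 0.1 is an assumed cutoff
print(merged)  # [[1, 4], [2, 3, 5], [6], [7]]
```

With this cutoff, the seven clusters collapse into four groups, which matches the merge described in the text.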
Table 2. Top three topics in cluster centroids.
Classification Number | Top Three Topics in Cluster Centroids |
---|---|
Cluster 1 | Topic 20, Topic 21, Topic 1 |
Cluster 2 | Topic 9, Topic 7, Topic 19 |
Cluster 3 | Topic 12, Topic 3, Topic 29 |
Cluster 4 | Topic 14, Topic 24, Topic 0 |
Cluster 1 includes words such as "interaction", "human-computer", "user" and "interface". They reveal a discipline of design that pays attention to satisfying the needs and desires of the majority of people who will use the product. Therefore, Cluster 1 is called interaction design. Interaction design is about users' behaviors. It focuses on creating engaging interfaces by understanding how users and the web communicate with each other. User experience is vital to all kinds of products and services. On the web, user experience becomes even more important.
The most frequently-used words in Cluster 2 are "ergonomics", "musculoskeletal", "work", "physical", "back", "pain", etc. These words are involved in harmonizing the functionality of tasks with the human requirements of those performing them. As a result, Cluster 2 is named ergonomic design. It is a branch of science drawing from physiology, engineering and psychology studies. It is often the most difficult variable to factor into the early stages of the design process. With poor ergonomic design, users may even be injured. Ergonomic design is said to be human-centered design focusing on usability. It seeks to ensure that human restrictions and capabilities are met and supported by design options.
The words "product", "design", "process", "theory", etc., are contained in Cluster 3. The content of these studies includes the process of creating a new product. These words also reflect the methods, strategies, research, analysis and management of design. To conclude, Cluster 3 belongs to product design. It presents an in-depth study of structured design processes and methods, which has many benefits in education and industry. On the industry side, a structured design process is mandatory to effectively decide what projects to bring to market, schedule this development pipeline in a changing, uncertain world and efficiently create robust, delightful products. On the educational side, the benefits of using structured design methods include concrete experiences with hands-on products, applications of contemporary technologies, realistic and fruitful applications of applied mathematics and scientific principles, studies of systematic experimentation, exploration of the boundaries of design methodology and decision making for real product development.
The words "visualization", "information", "data", "mining", etc., describe the practice of presenting information effectively with graphic design. Overall, Cluster 4 is taken to be information design. It is defined as the art and science of preparing information so that it can be used by human beings with efficiency and effectiveness. It aims to develop documents that are comprehensible, rapidly and accurately retrievable and easy to translate into effective action. Another objective is to design interactions with equipment that are easy, natural and as pleasant as possible. This involves solving many problems in the design of the human-computer interface.
In addition, the research outputs of each branch are computed to preliminarily reflect the focus of design research. The result is listed in Table 3. The publication output of product design has been at the lowest level of the four branches. This is owing to the fact that there are fewer doctoral programs in product design at domestic and foreign institutions. Correspondingly, there is a small number of publications on product design, not to mention articles that are indexed by the Science Citation Index (SCI). In general, students who specialize in product design prefer to be designers, not researchers. In contrast, two other branches, information design and ergonomic design, are more involved in engineering. More and more interdisciplinary researchers participate in the studies of information design and ergonomic design. Besides, it appears that the developing branch "interaction design" has attracted more research attention than the veteran branch "product design". It owes this to the rapid development of networks. Towards the end of the 1980s and in the early 1990s, the introduction of computer networks had a large impact on interaction design. As graphics and computational capabilities increased, it became possible to interactively visualize large-scale data. With growing awareness of the importance of product usability in a competitive environment, interaction design has grown gradually over the years. It can be seen that technology has broadened the boundaries of design research. Additionally, "ergonomic design" has a great impact on other academic branches. "Interaction design" and "information design" affect each other. Much of the experience of "product design" is used for reference by "ergonomic design".
Table 3. Research outputs of major academic branches.
Branch | TP | TC | ACPP | CDS |
---|---|---|---|---|
Interaction design | 7272 | 71,482 | 9.83 | Ergonomic: 1091; Product: 1874; Information: 10,200 |
Ergonomic design | 5293 | 46,368 | 8.76 | Interaction: 12,084; Product: 4002; Information: 8291 |
Product design | 2617 | 22,076 | 8.44 | Interaction: 4911; Ergonomic: 14,044; Information: 3167 |
Information design | 5112 | 44,750 | 8.12 | Interaction: 19,793; Ergonomic: 1454; Product: 2612 |
TP: Total Publications; TC: Total Citations; ACPP: Average Citations Per Paper; CDS: Cross-Domain Citations.
3.2. Trends of Major Academic Branches
To detect the trends of major academic branches, we perform a two-dimensional approach including bibliometric and network analysis. The bibliometric characterization aims to assess design research area outputs' trends. Additionally, the network analysis intends to find out research trends in each academic branch of design research and the evolution of core research themes.
3.2.1. Identifying Research Trends with the Bibliometric Characterization
The time-trend analysis of the four branches is presented in this subsection. Trends in publication outputs and citations in design research are shown in Figure 3 and Figure 4. Firstly, an obvious rise can be seen in the annual publication output related to interaction design. There is no doubt that the popularity of the Internet motivates the rapid development of interaction design. The value of CF grows until 2009, revealing that interaction design has shown tremendous growth. After 2009, it stays flat apart from two local minima near 2010 and 2013. This indicates continued development of interaction design. Secondly, it also appears that the annual publication output of the traditional branch, product design, has grown steadily. The value of the CF of product design grows until 2015. However, there are temporary drops in the values between 2009 and 2010 and between 2013 and 2014, understood as temporary interruptions in the development of product design. Thirdly, the annual publication output of ergonomic design sometimes slightly increases and sometimes slightly decreases. Overall, it is growing. According to Figure 4, the value of CF grows at a steady rate, which indicates stable development of ergonomic design. Finally, the annual publication output of information design keeps increasing year by year. Nevertheless, the value of CF first increases and then decreases. Information design seems to have become stagnant after 2009.
Figure 3. Publication outputs of major academic branches in design research.
Figure 4. Trend of the Citation Function (CF) in design research.
Additionally, the topic evolution and citations of the main topics in each branch are computed to reflect the influence of each branch. From Figures 5–12, we can see that the influence of each branch increases yearly. As can be observed from Figure 5 and Figure 6, Topic 9 is ascending with rapid growth, while Topic 7 and Topic 19 keep up with and surpass Topic 9. The tendency reveals that ergonomic design is concerned with the scientific study of products and the environment, so that they are safe, healthy and comfortable for human beings. As shown in Figure 7 and Figure 8, the three main topics alternately increase, and Topic 3 keeps expanding rapidly. It shows that product design tends to be focused on developing the understanding of the design process. Comparing Figure 9 and Figure 10, Topic 21 keeps its predominance, and Topic 1 tends to catch up. This indicates that research about interaction in social media is a global trend in interaction design. According to Figure 11 and Figure 12, Topic 14 maintains a competitive advantage with a slight drop, and Topic 0 has grown rapidly. This shows that information design is concerned with how data are processed and what data you see.
Figure 5. The main topics' evolution in ergonomic design.
Figure 6. The citations of the main topics in ergonomic design.
Figure 7. The main topics' evolution in product design.
Figure 8. The citations of the main topics in product design.
Figure 9. The main topics' evolution in interaction design.
Figure 10. The citations of the main topics in interaction design.
Figure 11. The main topics' evolution in information design.
Figure 12. The citations of the main topics in information design.
3.2.2. Identifying Research Trends with Network Analysis
We further analyze keywords to reveal the trend of core themes in each academic branch. We divide the full period into three periods: 2004–2007, 2008–2011 and 2012–2015. For each branch, the co-occurrence relationships among the top 50 high-frequency keywords in each period are shown in Figure 13, Figure 14, Figure 15 and Figure 16. A line represents the connection between two words. The thicker the line, the stronger the connection. Additionally, the size of a node marks the degree to which it is core or peripheral. The colors represent the k values listed in Figure 13, Figure 14, Figure 15 and Figure 16. The bigger the k value, the more important the node. Core themes in each four-year period are represented by red nodes.
Figure 13. Co-word network of interaction design during 2004–2015.
Figure 14. Co-word network of ergonomic design during 2004–2015, emg: Electromyography.
Figure 15. Co-word network of product design during 2004–2015.
Figure 16. Co-word network of information design during 2004–2015.
As can be observed from Figure 13, "social media" is the biggest node during 2004–2015, except for "human-computer interaction". This suggests that, with the popularization of computer networks, social network sites are paying more attention to exploring the interaction between the user and the user interface. Nodes such as "personality", "gender", "motivation", "emotion", "technology" and "e-learning" are red, and their size remains stable in the three periods, which indicates that researchers continue to study the use of computers from a psychological perspective. Some researchers address human interactions with computers. Others emphasize the psychological effects of computers on phenomena such as learning, cognition, personality and social interactions. Additionally, the k values of the "virtual reality" nodes in the three periods are 12, 13 and 17. The rising tendency reveals growing research interest in the design of 3D interaction. "Mobile phone" is an emerging theme during 2012–2015. With the increasing penetration of mobile phones, quite a few studies focus on the interaction patterns of smartphone users and examine factors that influence problematic smartphone use.
As shown in Figure 14, the biggest node is "ergonomics", which is the core theme during 2004–2015. Most of the research in ergonomic design applies ergonomics/human factors in the design, planning and management of technical and social systems at work or leisure. The second most frequently used word is "biomechanics". This indicates that some studies in ergonomic design quantitatively assess the impact of biomechanical factors on the risk of injury in order to control occupational injuries. Nodes such as "musculoskeletal disorders", "postures", "fatigue", "mental workload", "low back pain", "comfort", "heart rate" and "safety" continuously appear as red nodes in the co-word networks, and their sizes have remained stable over the past 12 years, which reveals that they are the core themes during 2004–2015. This suggests that, due to increasingly complex technology, the growing range of work activities required of people is leading to a broader range of physical stress, as well as psychological stress. Most of the studies hold the view that the principle of ergonomic design is to ensure the comfort, health and safety of people at work. To derive ergonomic design principles, many researchers draw on fundamental knowledge of human capabilities and limitations and a basic understanding of the cognitive, physical, physiological and social aspects of human behavior. Furthermore, "situation awareness" is an emerging theme during 2012–2015. An increasing number of studies explore how situation awareness affects human interaction.
Comparing the three pictures in Figure 15 side by side, "product design", "design theory", "design process", "design methodology", "design cognition", "design activity", "design research", "design tools" and "design models" are the most frequently used words during 2004–2015. This reveals that a multitude of researchers have focused on the theory of design. The branch presents in-depth studies of the methods, strategies, research and analysis of design. Besides, it is related to the management of the design strategy, process and implementation. The focus of many publications includes basic theoretical advances, case studies, new methodologies and procedures, as well as empirical studies. Red nodes such as "design theory", "design process", "design research" and "design cognition" continuously appear in the co-word networks over the past 12 years, and their size remains stable, which marks them as the core themes during 2004–2015. It demonstrates that the IT evolution has influenced the design professions. Many researchers invent new tools, methods and techniques to design and manufacture products. The core theme "design education" shows rising importance during 2008–2015. More and more researchers are developing innovative approaches to design education. The design research area affects the development of the economy and society. "Problem solving", "collaborative design", "design management" and "protocol analysis" are emerging themes during 2012–2015. It demonstrates that several researchers provide insight into the process structure and organization of design to enhance the quality of products. Additionally, the research method "protocol analysis" captures researchers' attention. Nevertheless, a few themes, such as "consumption" and "ethnography", have become obsolete with the passage of time.
Figure 16 shows that "information visualization", "data visualization", "visual analytics", "volume visualization", "dimensionality reduction" and "feature extraction" have been core themes during 2004–2015. It reveals that information design is concerned with visualizing information with informational graphics, such as charts and diagrams. Information design is also associated with making complex data easier to understand and to use. Therefore, information design can be integrated into the research process by processing data into a visual format. Beyond that, the k values of the red nodes in the three periods are 15, 16 and 17, revealing that the relationship between core themes is increasingly tight in information design. The core theme "visual analytics" shows increasing importance. Quite a few studies concentrate on solutions that give the user more freedom to visually represent information, rather than standard solutions, for which the structure of the visualization is fixed. Still other researchers address the challenge of visual analytics for big data, given the enormous growth of data in the last few decades. "GIS" continuously appears during 2004–2015. "Flow visualization" and "user interface" are emerging themes during 2008–2012. This indicates that information design is widely used in different fields. Some researchers chart the workflow of information to increase the efficiency of data curation tasks. Others are devoted to studying the impact of information design on the user's focus of attention.
In conclusion, the size of the red nodes in Figure 13, Figure 14 and Figure 16 has changed little over the three periods. We can infer that these research hotspots are developing sustainably. In Figure 15, "design practice" gradually becomes the biggest node while the other nodes get much smaller, which reveals that "design practice" has gradually come to occupy a dominant position in product design.
4. Conclusions
This study presents a bibliometric, network-theoretic and text-based analysis of the design research area over a 12-year period. Our work is the first in-depth study that keeps track of the current advances in the design research area by using text mining techniques. Furthermore, the result shows that the developed methods are general and could be applied to manage the knowledge of various research fields. Additionally, the text mining techniques used in this study could help researchers gain a comprehensive understanding of the knowledge of a field that is hidden in a large amount of scientific literature. The clustering method provides a more detailed architectural overview of a field. Additionally, social network analysis further explores the core themes to help researchers better understand how a field has developed.
For further studies, we will apply the author-topic model, a probabilistic model to link authors to observed words in the scientific literature of the research fields (a case study of design research), which will provide a general framework for exploration, discovery and query-answering in the context of the relationships of author and topics.