Big Data is a powerful tool in any data-driven decision-making scenario. Read this article to learn how it can be utilized in new product development to unlock needs customers may or may not directly state.
Findings
In this section, we present key insights and lessons learned from the in-depth analysis of the longitudinal case study of STE. The company currently uses big data to improve customer involvement in the development of a new product. In this case, the term 'big data' typically refers to the following types of data: (a) traditional enterprise data, (b) social media data, and (c) machine-generated/sensor data (e.g. weblogs, cloud files, smart meters, manufacturing sensors, equipment logs).
Stage 1: Using big data to determine customer profile
STE used to have little direct feedback from customers. Only recently did STE start to monitor customer comments on social media about its products. The Company has implemented the product innovation strategy of 'Customer Demand-Orientation' and analysed customers' behaviour with machine learning, using both qualitative and quantitative data. STE gathered feedback from its customers and partners about their preferences. In order to identify each wearable electronic headset product and generate new ideas, the company collected data from different sources, such as videos, photos, number of comments and number of followers, from the most popular websites (i.e. Amazon, Facebook, eBay) by using web page cleaning, web crawler and HTML parsing technologies. It is worth mentioning that this information is produced and shared in vast amounts every second, and most of it is unstructured data (e.g. photos, videos or social media posts), which means it cannot easily be put into tables. Moreover, taking Facebook posts as an example, data quality and accuracy are difficult to control. Thus, in order to harvest value from big data, the trustworthiness of the data is a significant issue that STE needs to address. The Company pointed out that data quality can be verified by checking that the data are complete and accurate and include the values and variables relevant to the purpose for which they were collected.
Additionally, STE has developed a highly customisable application. It allows partners and customers to upload their ideas and suggestions to facilitate the company's NPD and the invention of new features. The potential targets were thus customers with an interest in becoming involved in developing the new wearable device. Some of these customers were highly innovative and able to offer valuable new product ideas. Visualising the customer data collected over the past 3 months showed that early adopters of the high-end wearable device requested functions such as Dynamic Streaming Technology, telematics, multi-channel recording, and voice control (see Fig. 1).
Fig. 1 Levels of customer involvement for each new product function

When making use of social data, managers face the challenge of extracting useful information from terabytes of text. Unlike other data sources, social media data capture opinions expressed without a second intent, but the data density is extremely low, i.e. the useful information is buried in a mass of unstructured data. In order to uncover the hidden information, data scientists need to adopt data-mining techniques such as frequency analysis, cluster analysis, and sentiment analysis. For example, when customers discuss new ideas for certain functions, a multidimensional scaling (MDS) diagram can effectively illustrate how the opinions in the discussion cluster into groups. This is a useful approach because people tend to use similar wording but different sentence structures to express their ideas. Thus, cluster analysis with MDS diagrams is useful for identifying the major opinion groups in Facebook data. In addition, the username is useful 'metadata' for identifying the gender of the post owner, which makes it easier to investigate how different gender groups respond to the proposed NPD ideas. The STE information department had the job of processing, in parallel, the attributes that different customers suggested for a new product. In particular, it applies different conventional data techniques to harvest useful information from big data. Examples include Apache Mahout for machine learning algorithms in business, Tableau for big data visualisation, Storm for real-time computation, and InfoSphere for big data mining and integration.
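As a rough illustration of the frequency analysis mentioned above, the following Python sketch counts how often each term appears across a set of customer comments. It is not STE's pipeline: the stop-word list, the function name `term_frequencies` and the sample comments are all invented for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "is", "it", "to", "of", "i",
              "for", "with", "would", "be", "my", "me"}


def term_frequencies(comments):
    """Count how often each non-stop-word term appears across all comments."""
    counts = Counter()
    for comment in comments:
        # Lower-case the text and split it into alphabetic tokens.
        tokens = re.findall(r"[a-z']+", comment.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts


# Invented example comments, not data from the STE case study.
comments = [
    "Voice control would be great for the headset",
    "I want voice control and multi-channel recording",
    "Recording quality is important to me",
]

freq = term_frequencies(comments)
for term, count in freq.most_common(3):
    print(term, count)
```

Terms that recur across many comments (here "voice", "control" and "recording") point to the functions customers talk about most. Cluster or sentiment analysis would normally be applied on top of such counts.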
Stage 2: Using big data to identify information sources
STE's information department is able to analyse customer data captured from different sources. STE has set up three research centres and innovation laboratories to acquire large volumes of data from its operators and target customers. Social media is a particularly important data source. For example, the main forum on STE's official website carries more than 10,000 topics per day (in different formats), including new product information, announcements, feedback and discussions. In this open community, customers contribute tens of thousands of posts every week, some of which are in-depth reports on product use. By integrating and analysing the information in those posts, STE can acquire information on customers' demands at low cost and with high efficiency, providing innovative ideas for the research and development of new products.
The Company emphasised that social media is an important big data source that can create relationships and provide a better understanding of customers and their product usage, and in that way improve development. In particular, the information department can differentiate 'lead users' from 'normal customers' through RFM analysis, a data mining technique that quantifies customer value by examining how recently (recency), how often (frequency) and how much (monetary value) a customer purchases. The information department analysed the customer information and activities around the STE brand (e.g. platform, communities, apps, and official websites). Figure 2 shows the STE network for customer connection and interaction. STE connects to its customers through a wide range of sources at low cost (e.g. official web forums, mobile apps, website communities). Using the same means, customers can also interact with the company and with each other in real time. The latest product information is posted to these STE sources on a daily basis, partly to attract more customers and partly to gain feedback for further developments. A large number of these channels were cultivated for different groups of interested customers. In this way, the information department can collect a wide range of customer information from different channels and platforms extremely quickly. Data requirements may differ according to each organisation's needs and problems. A number of data pre-processing techniques, including data cleaning, data integration, data transformation and data reduction, can then be applied to remove noise and correct inconsistencies in the data sets. After that, data mining techniques can be used to help managers extract useful information.
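The RFM analysis described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the information department's actual method: the purchase records, the function names `rfm_summary` and `is_lead_user`, and all three thresholds are hypothetical.

```python
from datetime import date


def rfm_summary(purchases, today):
    """Return {customer: (recency_days, frequency, monetary)} from purchase records.

    Each record is a (customer, purchase_date, amount) tuple.
    """
    summary = {}
    for customer, purchase_date, amount in purchases:
        last, freq, total = summary.get(customer, (None, 0, 0.0))
        if last is None or purchase_date > last:
            last = purchase_date  # keep the most recent purchase date
        summary[customer] = (last, freq + 1, total + amount)
    return {c: ((today - last).days, f, m) for c, (last, f, m) in summary.items()}


def is_lead_user(rfm, max_recency_days=30, min_frequency=3, min_monetary=500.0):
    """Flag a customer as a 'lead user' using hypothetical thresholds."""
    recency_days, frequency, monetary = rfm
    return (recency_days <= max_recency_days
            and frequency >= min_frequency
            and monetary >= min_monetary)


# Invented purchase records, not data from the STE case study.
purchases = [
    ("alice", date(2024, 5, 20), 300.0),
    ("alice", date(2024, 5, 28), 250.0),
    ("alice", date(2024, 4, 2), 120.0),
    ("bob", date(2024, 1, 15), 900.0),
]

rfm = rfm_summary(purchases, today=date(2024, 6, 1))
for customer, values in rfm.items():
    print(customer, values, "lead user" if is_lead_user(values) else "normal customer")
```

In practice the thresholds are often replaced by quantile-based scores (for example, 1 to 5 on each dimension), so that customers are ranked relative to one another instead of against fixed cut-offs.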
Fig. 2 The STE network for customer involvement

Stage 3: Using big data to improve customer involvement in product design
In designing a customer involvement process, customers' individual qualities as well as their inspirations should be considered. According to Füller and Matzler, the best solution to customer involvement is dependent on the specific situation: the offered incentives (e.g. monetary compensation, supply of proprietary information, excitement factor, or even just the kudos of being called a 'co-developer'); the degree of use of multi-media data (e.g. virtual product presentations, short videos, animations); the intensity of interaction (e.g. duration, frequency and number of participants); the applied tools (e.g. open discussion forums, toolkits, virtual stock markets, virtual concept testing or competitions); and the communication style (e.g. anonymity of the interacting parties, informal/formal; uni-, bi- or multi-directional). In particular, STE installs feedback software and sensors in every new product, combining the advantages of its technology and hardware. Based on the data transmitted from customers' smart devices, the functional design of products can be adjusted so that new products with improved features and functions, in line with customers' demands, can be launched. The company has mastered core big data technologies (e.g. Spark SQL and Hadoop Cluster). The company was therefore able to apply data analytics with its big data technology and to react quickly, acquiring a large number of loyal customers by adopting a reasonable product portfolio, accurate market orientation and well-matched function design. These inputs also allow R&D teams to quickly develop a new version of a product, with improved functions and features.
The Company explained that customer involvement can be facilitated by utilising big data to provide new or more precise insights. The insights can be gained in digital form, such as use tests, in order to understand the customers and adjust decisions about the products accordingly. Big data therefore allows users' behaviours to be examined so that their demands can be met. Since customers were recruited from diverse information sources, it was imperative for the design to align with the corporate identity. In addition, the design guided customers to share their ideas and expertise in a simple and enjoyable way.
Stage 4: Using big data to enable customer access and participation
In the programming and testing of the customer involvement platform, customers were reached (communicated with) by different means. Banners, emails, pop-up windows and short articles were considered as ways of communicating with customers encountered on the Internet and notifying them of their responsibility in the NPD project (Füller and Matzler 2007). For example, a pop-up window invited every 50th online user to the STE official forums over a period of 1 month. Email and app subscribers were recruited simultaneously.
Within the involvement process, the company connects with its customers through its own Talend big data platform. It gains a better understanding of customers' behaviours and needs by acquiring datasets from 12 processes that run simultaneously and draw on sources including third parties, social networking feeds and application programming interfaces (APIs). Customers were asked for help or raised certain questions, which required appropriate processing. A wide range of customers enjoyed giving direct feedback and expressing preferences in NPD. Meanwhile, customer inputs can be analysed to initiate improvements and to obtain first results. If the anticipated quantity or quality of customer information is insufficient, the process of collecting it needs to be reconsidered. Customers engaging with the company for the first time should be assessed for background information on their preferences, their willingness to contribute again, and their expectations regarding further NPD projects. For customers who have already engaged with the company several times, a relationship can be seen to have been established, and company managers should consider creating a specific community for them.
The role of big data in the NPD process is directly affected by how data-driven the organisation is overall. In particular, as seen at STE, its products and solutions, as well as its product development, are highly dependent on the collection and processing of large customer data sets. To handle the problem of data storage, STE has set up a cloud computing data centre that specialises in storing and handling big data. By rebuilding its decision-making and analysis system around big data, and by establishing a product innovation and feedback mechanism based on big data and cloud computing, STE improved the competitiveness of its products and the satisfaction of its customers. Furthermore, the Company explained the importance of centralising data so that all interested actors within the company are able to access and process it. To organise and manage big data successfully, organisations should establish an innovation ecosystem and build "data-alliances" with stakeholders, including partners, suppliers and other actors with common interests.
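The "every 50th online user" rule is a form of systematic sampling and is simple to express in code. The Python sketch below assumes a running 1-based visitor counter. The function name `should_invite` and the counter itself are assumptions made for this illustration, not details reported by STE.

```python
def should_invite(visitor_index, interval=50):
    """Return True for every `interval`-th visitor, counting from 1."""
    return visitor_index > 0 and visitor_index % interval == 0


# Simulate the first 200 visitors and list those who would see the pop-up.
invited = [n for n in range(1, 201) if should_invite(n)]
print(invited)  # [50, 100, 150, 200]
```

A fixed interval keeps the invitation rate predictable (2% of visitors here) without requiring any information about the individual user.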
Results
In the case of STE, more than 26,000 customers overall participated in NPD in some form. Over 13,700 new product ideas were recorded from different sources (e.g. images, text, video and voice mails), more than 127,400 comments were made on the specific functionalities required, and over 3200 visions of future similar devices were gathered. A total of 15,943 participating customers (around 61.2%) would like to get involved in future NPD projects.
As suggested by Matzler et al. and Füller and Matzler, a functional/dysfunctional examination was conducted to identify whether the new functions identified were considered exciting or simply basic. Take Dynamic Streaming Technology as an example. Customers were asked how they would react if dynamic streaming technology (high-quality videos in 360° virtual reality) were provided in the new device and how they would react if it were not. Functional/dysfunctional ratios were then calculated from their answers to indicate whether the new function is an excitement factor or a basic factor. As Fig. 3 shows, providing Dynamic Streaming Technology would have a significant impact (0.71) on overall satisfaction with the new device, greater than the impact on dissatisfaction if the function were not provided (−0.39). The finding therefore shows that Dynamic Streaming Technology is an excitement factor that exceeds customer expectations: if it is delivered in the new device it brings excitement, whereas its absence causes only low dissatisfaction among customers.
Fig. 3 Dissatisfaction and satisfaction level due to absence or presence of Dynamic Streaming Technology
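The text does not state the exact formula behind the functional/dysfunctional ratios. One widely used formulation in Kano-style analysis is the pair of customer satisfaction coefficients, which is assumed here purely for illustration. In the Python sketch below, each respondent's pair of answers has already been classified as attractive (excitement), one-dimensional, must-be (basic) or indifferent. The respondent counts are invented and are not the STE survey data.

```python
def satisfaction_coefficients(attractive, one_dimensional, must_be, indifferent):
    """Compute Kano-style satisfaction and dissatisfaction coefficients.

    satisfaction    =  (A + O) / (A + O + M + I)   ranges from 0 to 1
    dissatisfaction = -(O + M) / (A + O + M + I)   ranges from -1 to 0
    """
    total = attractive + one_dimensional + must_be + indifferent
    if total == 0:
        raise ValueError("at least one classified response is required")
    satisfaction = (attractive + one_dimensional) / total
    dissatisfaction = -(one_dimensional + must_be) / total
    return satisfaction, dissatisfaction


# Hypothetical respondent counts per Kano category (invented for this example).
sat, dissat = satisfaction_coefficients(
    attractive=40, one_dimensional=20, must_be=10, indifferent=30
)
print(f"satisfaction if provided: {sat:.2f}")
print(f"dissatisfaction if absent: {dissat:.2f}")
```

Under this formulation, a function whose satisfaction coefficient is clearly larger in magnitude than its dissatisfaction coefficient behaves as an excitement factor, which matches the pattern reported for Dynamic Streaming Technology.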

To examine whether the customer involvement provided by STE actually allows customers to better understand the value of the new products and to articulate their needs, several questions were asked. In particular, a five-point Likert scale (1 = very positive; 2 = positive; 3 = neither; 4 = negative; 5 = very negative) was used to identify the perceptions of customers. The results show that most participating customers strongly agreed with the statement that if Dynamic Streaming Technology was supported by the device, its functions and features would satisfy customers (mean = 1.23; SD = 0.81). The functionality and interactivity of the information department also helped in the articulation of individual needs and wants (mean = 1.96; SD = 0.94). Overall, customers stated that they positively and actively made contributions to the NPD (mean = 2.08; SD = 0.78).
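For readers who want to reproduce summary statistics of this kind, the Python sketch below computes the mean and standard deviation of a set of Likert responses. The responses are invented. The use of the sample (n − 1) standard deviation is also an assumption, since the text does not say which variant was reported.

```python
from statistics import mean, stdev


def summarise_likert(responses):
    """Return (mean, sample standard deviation) for 1-5 Likert responses."""
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("responses must be integers from 1 to 5")
    return mean(responses), stdev(responses)


# Invented responses on the scale 1 = very positive ... 5 = very negative.
responses = [1, 1, 2, 1, 3, 1, 2, 1, 1, 2]

m, sd = summarise_likert(responses)
print(f"mean = {m:.2f}; SD = {sd:.2f}")
```

Because 1 is the most positive point on this scale, a lower mean indicates a more favourable perception.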
Engaging with customer-derived big data helps STE to understand its customers as well as the market. In this situation, big data supports STE's customer involvement by revealing the factors that might influence customer loyalty (i.e. how to keep customers coming back again and again). By applying big data analytics, STE can identify optimal investment opportunities across different information sources, and keep optimising its marketing strategies through analysis, measurement and testing. The different information and communication technologies applied offer unstructured, semi-structured, and structured input to the R&D teams. In the case study project, structured and large-scale data sets were gathered during the earlier NPD phases, in order to attain more insight into customer contexts and needs, through dialogues, collaboration and online surveys. For example, STE utilises customer dialogue to shape its NPD through customer data capture, crowd-sourcing and large forums. This structured information was often based on customer stories or dialogues, and customers were able to consciously and actively support the development of new products and functionalities. Semi-structured and unstructured rich data were captured in the later phases of NPD, when a feature or product had been launched on the market and customers were able to use it. For instance, STE applies natural language processing (NLP) to unstructured content (captured from apps and social networks) to identify customer satisfaction and preferences. The rich data from a variety of sources provided a different type of information to the NPD process, and included real behavioural data based, for example, on the click behaviour of customers using a system. In such circumstances, the customer was not actively involved in giving feedback; instead, information was captured automatically by analysing customers' online behaviour. Organisations are paying more and more attention to gathering this type of data, to the extent that discussions are arising in social media about ethics and customer privacy. This aspect needs to be taken into consideration when concentrating on capturing customer data for NPD. Structured, semi-structured and unstructured data are common in customer involvement studies in all phases of product development. However, in the case study, the data in the first two phases (generation of ideas and concepts; design and engineering) are more connected to feedback, while in the later phase of test and launch larger amounts of data are captured through actual use and customer behaviour.
Discussion
The in-depth case study indicates how big data is used by STE to improve customer involvement in NPD. In particular, customer involvement has to meet both managers' expectations and those of the participating customers. That is, it is necessary to ensure that honest and valuable feedback and input are gained from a diverse body of customers; if they are not provided with some motivation, they will stop contributing or be tempted to provide incorrect information. From the case study, it is evident that STE, through the use of big data, allows customers not only to embrace inventive products but also to suggest novel ideas via trial and error. Big data enables customers to experiment with novel products and new features, which enables them not only to share their tacit knowledge, but also to articulate their explicit needs. Through its efforts in addressing customers' needs, STE enhances the participation of customers in future company-initiated NPD projects.
Today, customer expectations are high: they want the latest technology and cutting-edge functionality, but at an unprecedentedly low price and with immediate service. At the same time, they have little brand loyalty and keep comparing the product with others. The proposed customer involvement approach can provide companies with guidance on handling data from various sources and in various formats, as well as on pushing intelligence from these data to various channels so as to support NPD. It is meaningful for the development of products and services with short product life cycles, notably in the consumer electronics industry and social media applications, where demand is driven mainly by lifestyle trends. STE indicates that big data in customer involvement can lead to a greater understanding of how products or users behave, and enable accurate recommendations for existing or new products. Further, the Company explained that big data constitutes a key element of some of its products. In this context, the recommended customer involvement, based on big data, has to support a company's key competencies and be in line with its objectives. On the other hand, it is necessary to balance the expected benefits against the potential costs of using big data to interact with customers. Furthermore, the case also highlighted that achieving customer involvement requires considerable coordination and planning through the different NPD phases. Thus, support from top management through a product champion and tight interfacing with the target market are essential components of customer involvement. In addition, the existing organisational culture (e.g. a hierarchical corporate culture) can represent a barrier to firms seeking to capitalise on big data to improve customer involvement. STE referred to the significance of a data-driven culture and its impact upon the connected operations.
Brynjolfsson and McElheran explain that in a data-driven culture, data holds a central function and promotes a fact-based decision-making process, while data-driven decisions are more informed and effective. Interviewees from STE argued that a data-driven culture is more important than big data technology, since this is where limitations often occur. However, small or young firms are likely to favour the proposed big-data-supported customer involvement, given their weak inherent 'culture'. Another issue is that not all firms (especially small ones) have adequate infrastructure or professionals to analyse big data.