Developing Insights from Social Media

Discussion and conclusions

In this paper, we develop a generic procedure that enables researchers to discover social trends from the collective voice of target users on Twitter. Our approach provides comprehensive guidance on how to identify a target audience of users on Twitter and discover the social trends represented by their hashtags, insights that we believe are unique and hard to acquire otherwise. We chose Twitter over other social media platforms primarily because of its open and data-friendly nature, which has attracted not only a large user base but also many researchers interested in public opinion and social trends. We first address the problem of identifying users who meet certain criteria from a large pool of random Twitter users, leveraging the wide range of user profiling techniques proposed to date for various purposes. When basic user profiling is not satisfactory, we propose, where possible, customized user profiling: developing a machine learning solution for the specific user profiling task. Once the target users have been identified, we mine hashtags from the tweets they created. Our two in-depth case studies, one on women interested in fashion and the other on people who reacted to the Me Too movement, demonstrate that the findings obtained with our approach offer unique perspectives and opportunities for social trend analysis.
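As a rough illustration of the hashtag mining step, the following minimal Python sketch counts how many distinct target users used each hashtag. It is not the exact implementation used in this paper: the tweet fields `user_id` and `text`, the function name `hashtags_by_user_count`, and the simplified hashtag pattern are all assumptions made for the example, and counting distinct users (rather than raw occurrences) is only one reasonable way to represent a collective voice.

```python
import re
from collections import Counter

# Simplified hashtag pattern (an assumption): '#' followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def hashtags_by_user_count(tweets, target_users):
    """Count, for each hashtag, how many distinct target users used it.

    `tweets` is an iterable of dicts with hypothetical keys "user_id" and
    "text"; `target_users` is a set of user ids identified beforehand.
    """
    users_per_tag = {}
    for tweet in tweets:
        user = tweet["user_id"]
        if user not in target_users:
            continue  # skip tweets from users outside the target audience
        for tag in HASHTAG_RE.findall(tweet["text"]):
            # Lowercase so that "#OOTD" and "#ootd" count as the same hashtag.
            users_per_tag.setdefault(tag.lower(), set()).add(user)
    return Counter({tag: len(users) for tag, users in users_per_tag.items()})
```

Calling `hashtags_by_user_count(tweets, target_users).most_common(20)` would then list the hashtags shared by the largest number of target users.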

This work has a potential limitation, which we call the target user update problem. Some user attributes, such as gender, race/ethnicity, and personality traits, are relatively stable, whereas others, such as location and interests, are prone to change. Furthermore, Twitter users can update their profiles, so a user identified as having a certain attribute value based on their bio at one point may no longer be identified as having that value later, after changing the bio. This can be critical for studies that aim to track a social trend over time, because users who are no longer accurately identified as target users may continue to distort the analysis. This is a good example of the coverage error mentioned by Hsieh et al. In this case, a decision needs to be made on whether to retain such users throughout the study or to update the set of target users at every time point. Updating the users requires updating the entire tweet data, including hashtags, which can produce a new version of the customized user profiling and, in turn, different user attribute values.
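To make the target user update problem concrete, the sketch below compares the target users identified from two profile snapshots. It is only an illustration under stated assumptions: `identify_targets` is a deliberately naive, hypothetical stand-in for user profiling (a keyword match on the bio), and the snapshot format (a dict mapping user id to bio text) is assumed for the example.

```python
def identify_targets(bios, keyword):
    """Naive stand-in for user profiling: users whose bio mentions `keyword`.

    `bios` maps a user id to that user's bio text at one point in time.
    """
    keyword = keyword.lower()
    return {user for user, bio in bios.items() if keyword in bio.lower()}


def compare_cohorts(users_t0, users_t1):
    """Split users into those retained, dropped, and added between snapshots."""
    users_t0, users_t1 = set(users_t0), set(users_t1)
    return {
        "retained": users_t0 & users_t1,
        "dropped": users_t0 - users_t1,  # no longer identified as target users
        "added": users_t1 - users_t0,    # newly identified as target users
    }


def churn_rate(users_t0, users_t1):
    """Fraction of the original target users that no longer qualify."""
    users_t0 = set(users_t0)
    if not users_t0:
        return 0.0
    return len(users_t0 - set(users_t1)) / len(users_t0)
```

A study that keeps a fixed cohort would continue to analyze the original set of users, whereas a study that updates its users would switch to the newer set and re-collect the associated tweets; the churn rate gives a rough sense of how far the two choices diverge.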

It is worth mentioning that our proposed method for customized user profiling does not work in all cases. It relies specifically on the hashtags used by the users and is limited to user profiling tasks that can be framed as classification. Nevertheless, we believe it is useful in many cases, because many user profiling tasks, such as gender or political orientation classification, are classification problems, and because it can complement existing solutions that fail to assign an attribute value to every user. We also acknowledge that the current study is only a starting point that can lead to deeper research on text analysis in a variety of disciplines.
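For readers who want a concrete picture of hashtag-based classification for user profiling, the following self-contained sketch trains a multinomial naive Bayes classifier with add-one smoothing on the hashtags of labeled users. The choice of naive Bayes is an assumption made purely for illustration and is not necessarily the model used in this paper; the class name `HashtagNaiveBayes` and the example labels are hypothetical.

```python
import math
from collections import Counter, defaultdict


class HashtagNaiveBayes:
    """Multinomial naive Bayes over hashtags with add-one smoothing."""

    def fit(self, user_hashtags, labels):
        """`user_hashtags[i]` is the list of hashtags used by labeled user i."""
        self.class_counts = Counter(labels)
        self.tag_counts = defaultdict(Counter)
        self.vocab = set()
        for tags, label in zip(user_hashtags, labels):
            self.tag_counts[label].update(tags)
            self.vocab.update(tags)
        return self

    def predict(self, tags):
        """Return the most probable label for a user with the given hashtags."""
        total_users = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_users in self.class_counts.items():
            # Log prior: share of labeled users belonging to this class.
            score = math.log(n_users / total_users)
            total_tags = sum(self.tag_counts[label].values())
            for tag in tags:
                if tag not in self.vocab:
                    continue  # ignore hashtags never seen during training
                count = self.tag_counts[label][tag]
                score += math.log((count + 1) / (total_tags + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

In this sketch, users whose attribute is already known (for example, from their bio) serve as training data, and the fitted model assigns a label to users whose attribute is missing. A user with no recognizable hashtags is classified by the class prior alone, which reflects the limitation noted above: the method depends on users actually using hashtags.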