Area 2: SocialSense: Mining the Sense of Organizations, People and Other Entities from Social Media

Social media portals such as Twitter, Facebook, Google+, and various forum sites contain the everyday thoughts, opinions, and experiences of their users. Part of this user-generated content (UGC) reveals information about organizations such as companies, banks, government organizations, and universities. UGC about an organization provides important and timely indicators of the spontaneous and often genuine views of its users, fans, and customers. It is thus invaluable for organizations to keep track of such views to obtain live feedback from their users, and to perform live analytics on the data to discover market insights and foresights and to provide better services.

Given an organization, SocialSense discovers users' views in terms of emerging and evolving topics/events and provides them to the organization as alerts and general online feedback. To achieve this, we need to address the four research challenges listed below:

  • How can we obtain a more representative distribution of relevant data about the target organizations considering that most of the social media services have unknown sampling methodology and limit the amount of data that can be crawled?
  • What are the emerging and evolving topics about the organization? In particular, how early can we predict the emerging topics (alerts) before they become viral?
  • How can we identify the user community of the organization? In particular, who are the key influencers, and who are those that share the same interests as the organization (interest communities)?
  • How can we distinguish the user communities and topics of different organizations that share the same acronym?


To address the first challenge, we gather data from several sources of information, including fixed and dynamic keywords, known users, and automatically identified key users of the target organization. The second challenge can be addressed by learning the organization's topics online over time; for this we propose a sparse coding algorithm that can quickly identify emerging topics, keep track of evolving topics, and purge trivial ones as time passes. The third challenge can be addressed by identifying the active users who regularly tweet about the organization, initiate major discussions, and have many followers within the organization's user community. Finally, the last challenge can be addressed by utilizing the context of the target organization, which can be determined from the content of data already known to be relevant and from the organization's user community.
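
As a rough illustration of the third point, the three signals named above (regular posting, initiating major discussions, and followers within the community) could be combined into a single ranking score. The sketch below is not SocialSense's actual method. The `Post` record, the `rank_key_users` function, the equal default weights, and the choice to treat everyone who posted about the organization as "the community" are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    """Hypothetical record for one post about the target organization."""
    author: str
    day: int               # day index on which the post was made
    is_thread_start: bool  # True if the post opened a discussion
    replies: int           # number of replies the post received


def _normalise(values):
    """Scale a dict of non-negative numbers into [0, 1] by its maximum."""
    top = max(values.values(), default=0)
    return {user: (v / top if top else 0.0) for user, v in values.items()}


def rank_key_users(posts, followers, weights=(1.0, 1.0, 1.0)):
    """Rank users by a weighted sum of three normalised signals.

    posts     -- iterable of Post objects about the organization
    followers -- dict mapping a user to the set of users following them
    weights   -- (regularity, initiation, reach) weights; assumed, not tuned
    """
    # Assumption: the community is everyone who posted about the organization.
    community = {p.author for p in posts}

    active_days = defaultdict(set)  # user -> distinct days with a post
    initiated = defaultdict(int)    # user -> replies drawn by threads they started
    for p in posts:
        active_days[p.author].add(p.day)
        if p.is_thread_start:
            initiated[p.author] += p.replies

    regularity = _normalise({u: len(active_days[u]) for u in community})
    initiation = _normalise({u: initiated[u] for u in community})
    # Only followers who are themselves community members are counted.
    reach = _normalise(
        {u: len(followers.get(u, set()) & community) for u in community}
    )

    w_reg, w_init, w_reach = weights
    scores = {
        u: w_reg * regularity[u] + w_init * initiation[u] + w_reach * reach[u]
        for u in community
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

Normalising each signal by its maximum keeps any single raw count (for example, a very large follower number) from dominating the score; in practice the weights would have to be chosen or learned from data.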


Figure 1: The general architecture of SocialSense


Figure 2: What is contained in a topic: relevant posts, user community and sentiments


Figure 1 depicts the general architecture of SocialSense, and Figure 2 presents some sample analytics that it can generate.

In addition to organizations, the same approach can be applied to mining people, topics, and other entities.