Empirical Evaluation of Social and Traditional Search Tools for Adhoc Information Retrieval

The nature of the World Wide Web (WWW) has evolved over time. Easier and faster Internet access has given rise to huge volumes of data available online. Another cause of this growth is the emergence of online social networks (such as Facebook and Twitter), which have changed the role of users from data consumers to data generators. The increasing popularity of these online social networks has also changed the way other web services are used. For example, Facebook messaging has had some impact on email usage, and Twitter usage affects (positively or negatively) the reading of online newspapers. Both platforms are heavily used for information searching. In this paper, we evaluate the role of Facebook and Twitter for academic queries and compare the findings with the Google search engine to find out whether these online social networks are likely to replace Google in the near future. A query set selected from the standard AOL dataset is used for experimentation. Academic queries are selected and classified by expert users. The results returned by Google, Facebook and Twitter for these queries are compared using Mean Average Precision (MAP) as the evaluation metric. The results show that Google dominates, with a better MAP than Facebook and Twitter.

Keywords: AOL Query Log; Facebook; Twitter; Social Search


I. INTRODUCTION
The nature of the web has evolved over time. A major breakthrough in this regard is the emergence of online social networks. Online social networks have not only changed the role of users from content consumers to content generators but have also changed the way users search the web. Social media websites, which host various forms of consumer generated content (CGC) such as virtual communities, blogs, social networks, wikis, collaborative tagging and media files shared on sites like Flickr and YouTube, have gained substantial popularity [1]. Social network sites (SNSs) such as Facebook, MySpace, Bebo, and Cyworld have also attracted millions of users, many of whom have integrated these sites into their daily practices [2]. Most sites support the maintenance of preexisting social networks, but others help strangers connect based on shared interests, activities, or political views. Some sites cater to diverse audiences, while others attract people based on a common language or on shared racial, sexual, religious, or nationality-based identities. These sites also vary in the extent to which they incorporate new communication and information tools, such as blogging, mobile connectivity, and photo/video sharing [3].

This exponentially increasing interest in online social networks (see Figure 1) has resulted in the generation of a huge amount of data on the web every day. Traditionally, search engines are used to find relevant information on the web. However, there has been an increasing trend of searching for information using online social networks. This is where the concept of social search emerges.

A. Social Search
Social search refers to the use of social mechanisms to seek information on the web. Many search engines provide a facility for social search, either by linking to social content (e.g., public Twitter posts) or simply as part of result ranking [4]. The output of social tagging systems can serve as the basis for online social search engines such as Delicious (delicious.com). Evan et al. [6] identify the stages of the search process at which people need to be in contact with others. Morris et al. [7] survey Twitter and Facebook users who post questions as status messages to their social networks. Social searching behavior has also been studied on Q&A sites (e.g., Harper et al. [8], Liu et al. [9]), where a question is posted to a large community of users, who normally have no direct relation to the asker, and any of them can answer. Expertise-finding systems such as Aardvark [10] or Collabio [11] can help to find a person who is qualified to address a given information need. Reference librarians can provide professional assistance to numerous searchers [12]. Social search asserts that (a) social network links can be leveraged to improve the quality of search results, and that (b) a growing body of Internet content cannot be retrieved by traditional web search because it is not well connected to the hyperlinked Web [13], [14].

It is said that current web search engines are not able to find relevant information available on online social networks. Therefore, there is a trend of using online social networks for information seeking. In this work, we analyze this trend by looking at the relevance of the results that both kinds of search tools return. We try to find out how successful online social networks are at providing relevant results and whether there is any chance of online social networks replacing traditional search engines. We select a set of academic queries for this purpose because academic information is among the most frequently searched on online social networks. The prime objective of this work is to compare and evaluate the effectiveness of online social networks and traditional search engines for academic queries. A diverse set of topics relating to different academic information needs is selected from a standard query log. The rest of the paper is organized as follows: Section II reviews related work, Section III describes the experiments and discusses their results, and Section IV concludes the paper.

www.ijacsa.thesai.org

II. RELATED WORK
Most other works focus on social search in general. For example, Dodds et al. [16] report a successful experiment on the exploration of social search, in which more than 60,000 email users attempted to reach one of 18 target persons in 13 countries by forwarding messages to acquaintances. It was found that targets can be reached in a median of 5 to 7 steps. Bao et al. [17] try to improve web search using social aspects; they use social annotations for this purpose. Others have analyzed the impact of users' social networks on personalization [18]. Some works evaluate a specific online social network for social search. For example, Scale et al. [21] evaluate the role of the Facebook platform as a social search engine. They found that Facebook returns irrelevant results for unknown persons or groups. Another very popular work in this regard is that of Tancer [22]. Tancer presents a case study of a user's information need that was satisfied by friends on Facebook, removing the need to use a traditional search engine. Tancer's experience offers insight into how people in a social networking site (SNS) environment can collaborate and participate to meet a user's information needs.

One of the most important contributions to social search is the proposal of models for social search. The work of Evan et al. [19], [20] is considered a significant effort in this regard. According to Teevan et al. [23], roughly 50% of users engage in Status Message Question Asking (SMQA) behavior, which has made SMQA an active research area. In terms of SMQA usage, Facebook ranked first with 50% of users, Twitter second with 33%, and LinkedIn and Google third with 25%.

Some works are very close to ours in the nature of the problem they address. The work of Morris et al. [15] is one of the initial works focusing on social search. It is the work most closely related to ours; however, there are major differences in methodology and target domain. Whereas we use the online search option of the social networks, they used status messages as the information-seeking mechanism. Similarly, Zheng et al. [24] evaluated online social networks for travel queries. The goal of their study was to examine the extent to which social media results appear in search engine results in the context of travel-related searches. Their research design simulated a traveler's use of a search engine for travel planning by using a set of predefined keywords in combination with nine U.S. tourist destination names. Their comparison of search results reveals that social media makes up a significant portion of the results, as people now rely on and use social media communities more than ever before. That study provides evidence that social media plays an increasing role in online search. Another work that we find close to ours is that of Alan et al. [25]. In that paper, the authors examined the potential of using online social networks to enhance Internet search. They analyzed the differences between social networking systems and the Web in terms of the mechanisms they use to publish and locate useful information, and discussed the benefits of integrating the mechanisms for finding useful content in both social networks and the Web. Initial results from their social networking experiment suggest that such integration has the potential to improve the quality of the Web search experience. Our work, in contrast, evaluates which of the platforms Google, Facebook and Twitter is suitable for which category of academic query.

III. EXPERIMENTS
Our experiments largely follow standard methods and measures. They start with the selection of academic queries [26]. We consider a query to be academic if it seeks any information relating to academic needs (ranging, for example, from admission information to expert search). We use a real-world search engine query log for the extraction of academic queries.

A. Selection of Queries
On August 4, 2006, AOL (America Online) released a large query log collection (covering some 500,000 people) that recorded real users' search interactions with AOL and was intended for the academic (noncommercial) domain. AOL reacted quickly and removed the data from its site on August 7, 2006, but it was too late: within this short time span the files had been copied and shared all over the Internet. The collection comprised about 36 million web search queries typed by approximately 657,000 users over a three-month period (from March 1, 2006, to May 31, 2006). It consisted of a compressed 439 MB download that expanded to 2.12 GB. Going through the AOL dataset sequentially allowed us to select a subset of academic queries. We identified these queries and categorized them under different information need labels to make them more understandable. Table 1 lists 36 queries covering 6 different information needs. The queries given in Table 1 are used to compare search engines with online social networks.

We select three platforms for our experiments. Google represents search engines, while Facebook and Twitter represent online social networks. This selection is based on the popularity of each platform (see Figure 1) among those that can be used for textual information search.
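The sequential pass over the query log described above was carried out and judged by expert users. Purely as an illustration, a keyword pre-filter could shortlist candidate queries for such a manual review. The sketch below assumes the log is tab-separated with a header row containing a `Query` column; the seed terms and the function name `candidate_academic_queries` are hypothetical and are not taken from the paper.

```python
import csv
import io

# Hypothetical seed terms; the paper does not list the cues its experts used.
ACADEMIC_TERMS = {"admission", "scholarship", "university", "thesis", "professor"}


def candidate_academic_queries(lines, terms=ACADEMIC_TERMS):
    """Return distinct queries containing a seed term, in first-seen order."""
    # Assumption: tab-separated rows with a header that names a "Query" column.
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    seen = set()
    candidates = []
    for row in reader:
        query = (row.get("Query") or "").strip().lower()
        if not query or query in seen:
            continue
        # Whole-word match against the seed terms.
        if any(word in terms for word in query.split()):
            seen.add(query)
            candidates.append(query)
    return candidates


# Tiny in-memory sample standing in for a log file.
sample = io.StringIO(
    "AnonID\tQuery\tQueryTime\tItemRank\tClickURL\n"
    "1\tuniversity admission requirements\t2006-03-01 10:00:00\t\t\n"
    "2\tcheap flights\t2006-03-02 11:00:00\t1\thttp://example.com\n"
    "3\tscholarship for phd\t2006-03-03 12:00:00\t\t\n"
    "4\tuniversity admission requirements\t2006-03-04 09:00:00\t\t\n"
)
shortlist = candidate_academic_queries(sample)
print(shortlist)
```

A shortlist produced this way would still need human judgment to decide which queries are truly academic and to assign information need labels.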

B. Returned Results and Evaluations
In the next phase of experimentation, we use the search interface of each selected platform to search with the selected list of queries (see Table 1). The top 20 documents for each query are downloaded for each selected platform, as shown in Figures 2, 3 and 4. To evaluate these returned results, an extensive user evaluation exercise is conducted.

C. User Evaluations
We recruit five different users for an unbiased evaluation of the returned results for each platform. Each user is between 24 and 30 years of age and is a computer science graduate. Users are asked to thoroughly understand the queries so that their evaluation of the returned results is unbiased. They are presented with a web interface to mark each returned result as relevant (1) or not relevant (0). Evaluations are performed sequentially: first the results of all queries are evaluated for Google, and then the same process is repeated for Twitter and Facebook. Fleiss' kappa [27] is used to measure inter-annotator agreement for each selected platform, as shown in Table 2. The values are high enough to be considered reliable inter-annotator agreement. We use mean average precision (MAP) [28] as the metric for evaluating the performance of each platform. For a set of queries, MAP is the mean of the average precision scores of the individual queries:

$$\mathrm{MAP} = \frac{1}{Q} \sum_{q=1}^{Q} AP(q)$$

where Q is the total number of queries and AP(q) is the average precision for a given query q.
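For readers unfamiliar with the agreement statistic mentioned above, the following minimal sketch computes Fleiss' kappa from binary relevance labels given by a fixed number of raters. The function names and the toy labels are illustrative assumptions and do not reproduce the study's data.

```python
def binary_label_counts(labels_per_item):
    """Turn per-item lists of 0/1 labels into [not_relevant, relevant] counts."""
    return [[row.count(0), row.count(1)] for row in labels_per_item]


def fleiss_kappa(counts):
    """counts[i][j] = number of raters who assigned item i to category j."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must be rated by the same number of raters")
    n_categories = len(counts[0])
    # Share of all assignments that fall into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Per-item agreement: fraction of rater pairs that agree on the item.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        return float("nan")  # undefined when every label is in one category
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy example: four results, each judged by five raters (1 = relevant).
toy_labels = [
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0],
    [0, 0, 0, 1, 1],
]
kappa = fleiss_kappa(binary_label_counts(toy_labels))
print(round(kappa, 4))
```

In a setting like the one described, each item would be one returned result for one query, so a platform's kappa would be computed over all of its judged results.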
To compute average precision, it is assumed that there are twenty relevant documents in the collection for each query. The following tables provide the MAP for each selected platform, computed from the labels given by each user. Looking at the individual results for Google (Figure 2), Facebook and Twitter, it can be concluded that the MAP values for Google are the highest and are consistent across different categories as well as different users. The MAP values for Twitter are much lower but are also consistent across categories and users. For Facebook, however, we see inconsistency among users as well as among categories. Comparing the MAP results for all three platforms using Figure 2, we can conclude that Google produced the best results for all academic queries, while Facebook beat Twitter in most of the categories. Table 5 shows the variance of MAP across users and across categories, which also indicates that Google's results are more consistent for all query types. It can therefore be concluded from all results that Google still holds its position for academic information searching. However, there is now a trend of turning to online social networks for information search, which did not exist earlier. We also observed that Facebook proved to be more helpful than Twitter when searching for academic information. The main reason for this result is the presence of many Facebook pages and groups that share academic information such as admissions and scholarship opportunities.
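The metric used in this evaluation can be sketched as follows. This is one reading of the stated setup, assuming binary labels for the ranked results of each query and the fixed figure of twenty relevant documents per query as the normalizer; the function names and toy rankings are illustrative and are not the study's data.

```python
def average_precision(labels, n_relevant=20):
    """Average precision for one ranked list of 0/1 relevance labels.

    Precision is accumulated at each rank that holds a relevant result
    and divided by the assumed number of relevant documents for the query.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(labels, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / n_relevant


def mean_average_precision(labels_per_query, n_relevant=20):
    """Mean of the per-query average precision values."""
    scores = [average_precision(labels, n_relevant) for labels in labels_per_query]
    return sum(scores) / len(scores)


# Toy rankings of 20 results each (1 = relevant, 0 = not relevant).
all_relevant = [1] * 20
first_half_relevant = [1] * 10 + [0] * 10
none_relevant = [0] * 20

map_score = mean_average_precision(
    [all_relevant, first_half_relevant, none_relevant]
)
print(map_score)
```

With a fixed normalizer of twenty, a query reaches an average precision of 1.0 only if all twenty returned results are judged relevant, so sparse result lists are penalized regardless of where the relevant items appear.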

IV. CONCLUSION
In this paper, we compared social search with traditional search for academic queries. The main objective was to evaluate which performs better now that online social networks have been dominant among web users for years. For this purpose, we selected Facebook and Twitter to represent online social networks and the Google search engine to represent traditional search. We used the AOL dataset for the selection of queries. The experimental results reveal that Google maintains its dominance in academic information searching. Comparing Facebook and Twitter, we found that Facebook provides its users with much more relevant information for academic queries than Twitter.

Fig. 2. MAP Comparison of Google, Twitter and Facebook for all Query categories

TABLE III. COMPARISON OF MAPS FOR ALL PLATFORMS (GOOGLE, TWITTER, FACEBOOK)

TABLE IV. FLEISS KAPPA FOR DIFFERENT PLATFORMS