Systematic Evaluation of Social Recommendation Systems: Challenges and Future

Information overload can be managed effectively with the help of an intelligent system that proactively guides users toward relevant or useful information in a tailored way, by pruning the large space of possible options. The key challenge lies in deciding what information can be collected and assimilated to make effective recommendations. This paper discusses the reasons for the evolution of recommender systems, which led to the transition from traditional to social information based recommendations. A Social Recommender System (SRS) exploits social contextual information in the form of users' social links, social tags and user-generated data, which contain a large amount of supplemental information about the features of items, or about items and services that are expected to be of interest to the user. Such information therefore has tremendous potential for improving recommendation quality. A systematic literature review of SRS has been carried out by categorizing the various kinds of social-contextual information into explicit and implicit user-item information. This paper also analyses key aspects of any generic recommender system, namely Domain, Personalization Levels, Privacy and Trustworthiness, and Recommender Algorithms, to give a better understanding to researchers new to this field.

Keywords—Social Recommender System; Social Tagging; Social Contextual Information


I. INTRODUCTION
The exponential growth and increasing sophistication of information on the web result from the blurring line between producers and consumers of data, as well as from the growing trend of pervasive computing: "information anywhere, anytime". To deal with information overload, recommender systems have evolved progressively over the years. With countless items available, users cannot be expected to browse through all of them to find what they might like; filtering has therefore become a popular technique to connect supply and demand [1].
In the mid-1990s, a great deal of research went into improving Collaborative Filtering (CF) [2], [3], [4], [5], [6], which was, and still is, one of the most popular recommendation methods. One of the major problems with CF is the cold start problem, which occurs when there are initially too few ratings to make any reliable recommendation.
To overcome this, new recommendation methods were explored, such as demographic filtering and content-based filtering (CBF) [3], [6]. At that point in time, recommender systems used only the explicit ratings from users, along with demographic information (e.g., sex, age, country) and the limited content-based information or item attributes (e.g., genre, album, singer for music) available to recommender engine designers. In some domains (e.g., videos, photos, blogs) it is very difficult to generate reliable attributes for items. Since pure CBF depends on content analysis of items, pure CBF implementations are rare [3]. Another major drawback of CBF is the overspecialization problem: by its nature, it tends to recommend items similar to those the user already knows, thereby losing the novelty factor in its recommendations. To overcome the shortcomings of the existing methods, hybrid methods [7] came into existence, which exploit the merits of each of these techniques. Efforts to improve hybrid methods continued. However, the data sparsity problem inherent in traditional recommendation systems adversely affects recommendation quality. In addition, many traditional recommendation algorithms could not be applied to large datasets [3], [8].
The basic premise of traditional recommender systems is that users are independent and identically distributed; this ignores the social trust relationships among users, which are an important aspect. With the help of social contextual information (e.g., the user's social trust network, tags issued by users or associated with items), more accurate suggestions can be made. This led to the second phase in the evolution of recommendation systems. With the rapid expansion of Web 2.0, these systems incorporated social information [8], [9], [10], [11], [12] along with the information used in traditional recommendation systems, leading to the development of social based recommendation systems [2], [13], [14], [15], [16], [17], [18]. This social information was related to the virtual social circle of the user. At the same time, user-generated information in social networks (e.g., comments, posts, tags, photos, videos) also started being used for recommendation [11]. Bobadilla et al. (2013) [3] have shown the evolution of recommendation systems from the first phase, based on the traditional Web, to the present second phase, based on the Social Web, which has almost progressed to a third phase based on the Internet of Things. The evolution so far suggests that these systems will continue to develop as more, and more diverse, types of information are assimilated and integrated.
So far, this paper has provided an overview of how recommender systems evolved over the years and highlighted the reasons behind that evolution. The rest of the paper is structured as follows. Section II analyzes recommendation systems along 8 key dimensions for better understanding. Section III focuses on Social Recommendation Systems (SRS), which take social information (e.g., tags, posts, opinions and social links of users) as input to the recommendation engine; a systematic literature review has been carried out for these systems. Section IV provides an overview of next-generation recommendation systems and highlights the challenges posed by existing systems. Lastly, Section V concludes the paper by comparing the performance of social-contextual recommendation systems with that of traditional recommendation systems, based on the review. This paper aims to give researchers a deep insight into SRS, acquaint them with the latest advancements, and provide a foundation on which future work on these systems could be based.

II. DIMENSIONS OF RECOMMENDER SYSTEM
In order to design a new recommendation system, improve an existing one, or simply understand one, it is necessary to understand the generic framework of any recommender system. The key elements of any generic recommendation system (User, Items, Ratings, Community) are linked as depicted in Figure 1. Users express preferences for items in a system. They express their opinions in the system via ratings (e.g., on a scale of 0-5, ratings in the form of stars, fun boards). The space in which these key elements make sense is called the community. Konstan [2] discussed 8 dimensions of analysis for recommendation systems. These dimensions are different aspects of such systems, and they make it easier for researchers to understand how the systems function. Below, these dimensions are discussed in the present scenario to explore future opportunities, as commercial recommendation systems strive to offer the best content and quality of recommendations as well as the greatest variety of services [3].
• Domain - Recommendation systems have proved important in diverse areas, and with the popularity of the internet the number of areas is still growing [3]. Based on the research carried out in paper [19] [...]
• Purpose - The compelling reason for implementing recommendations in the e-commerce domain is that they have turned out to be serious business tools: they enhance sales by improving cross-selling through suggestions of additional products, and they build customer loyalty that results in repeat business [20]. In a university digital library, a recommender system has been proposed to disseminate information based on quality, helping users access relevant research resources among the thousands of resources that are available but hard to find [21], [22].
• Recommendation Context - This refers to the context in which the recommendation is made. It answers the question: what is the user doing when the recommendation is made? Examples include hanging out with friends, or looking for a place to eat near the user's location. Recommendation systems that take a group of users as input are becoming more common and are used in areas such as tourism, music and the web. Currently, mobile applications use GPS to fetch the current geographic location of the user and employ recommender systems [23], [24] that use this information for generating recommendations, e.g., the Zomato app. Moon-Hee Park, Jin-Hyuk Hong and Sung-Bae Cho (2007) proposed modelling user preferences for restaurants from context-aware information and the user profile, implementing map-based personalized recommendations using a Bayesian network [25].
• Privacy and Trustworthiness - [...] information about its registered users. How much of the user's personal information should be revealed? For the sake of privacy preservation, a certain level of ambiguity must be introduced into the predictions. A tradeoff must be maintained between accuracy and privacy [3].
Recommendation systems are highly vulnerable to external manipulation, especially in e-commerce, where rating bias can be introduced by companies that wish to have their products recommended more than those of their competitors (shilling attacks).
• Interfaces - The output of a recommendation algorithm can take the form of, e.g., predictions, recommendations, or filtered information. The input to these algorithms can be broadly categorized into user data and item data. Initially, these algorithms made use of explicit information (e.g., user ratings for various items) to filter out items that could be recommended to other users with similar interests. This was not sufficient to make reliable recommendations because of the initial lack of ratings for a new item, a new user or a new community; this is known as the cold start problem. Algorithms then incorporated implicit information, typically by monitoring the user's behavior (e.g., songs heard, books read, applications downloaded). Now, input from diverse areas is being used to make accurate recommendations. Among these techniques, Collaborative Filtering (CF) has been the most popular recommendation algorithm. It is based on the assumption that an active user's preferences will be in accordance with the preferences of other, similar users. It allows users to rate a set of items, generating a sparse user-item matrix. Based on this matrix, the generic CF procedure has three steps. First, the similarity between users is computed (e.g., using k-nearest neighbors). Second, the rating of an item that the active user has not yet rated is predicted from the ratings given to that item by the user's neighbors (filling in missing values). Third, promising items are selected for recommendation based on the user's similarity with other users.
Collaborative filtering techniques can be implemented in two ways. In user-based collaborative filtering, a neighborhood of people with similar taste is selected and their opinions are used for making predictions. In item-based collaborative filtering, the similarity among items is pre-computed from ratings, and the user's own ratings are used to triangulate recommendations. In other words, item-based CF works on the user-item matrix represented by its column vectors.
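The three steps of the generic CF procedure can be illustrated with a minimal Python sketch of user-based CF. The toy ratings, the function names, the use of cosine similarity over co-rated items and the similarity-weighted average are assumptions made for illustration; they are not taken from any of the cited systems.

```python
import math

# Hypothetical toy user-item ratings; a missing key means "not rated".
ratings = {
    "alice": {"i1": 5, "i2": 3, "i3": 4},
    "bob":   {"i1": 4, "i2": 2, "i3": 5, "i4": 4},
    "carol": {"i1": 1, "i2": 5, "i4": 2},
}

def cosine_similarity(a, b):
    """Cosine similarity computed over the items rated by both users."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def nearest_neighbors(ratings, user, k):
    """Step 1: the k users most similar to `user`, as (similarity, name)."""
    sims = [(cosine_similarity(ratings[user], r), other)
            for other, r in ratings.items() if other != user]
    sims.sort(reverse=True)
    return sims[:k]

def predict(ratings, user, item, k=2):
    """Step 2: similarity-weighted average of the neighbors' ratings."""
    voters = [(s, u) for s, u in nearest_neighbors(ratings, user, k)
              if s > 0 and item in ratings[u]]
    if not voters:
        return None  # cold start: no neighbor has rated this item
    total = sum(s for s, _ in voters)
    return sum(s * ratings[u][item] for s, u in voters) / total

def recommend(ratings, user, k=2, top_n=3):
    """Step 3: rank the items the user has not rated by predicted rating."""
    all_items = {i for r in ratings.values() for i in r}
    scored = [(predict(ratings, user, i, k), i)
              for i in all_items - set(ratings[user])]
    scored = [(p, i) for p, i in scored if p is not None]
    scored.sort(reverse=True)
    return [i for _, i in scored[:top_n]]

def transpose(ratings):
    """Item-based view: the column vectors of the user-item matrix."""
    by_item = {}
    for user, row in ratings.items():
        for item, value in row.items():
            by_item.setdefault(item, {})[user] = value
    return by_item
```

Calling `recommend(ratings, "alice")` ranks the single item that alice has not rated. An item-based variant could apply the same `cosine_similarity` to the rows of `transpose(ratings)`, which correspond to the column vectors mentioned above.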
Content-based filtering (CBF) makes recommendations based on choices the user made in the past (e.g., if a user rated a rom-com movie positively on a movie recommendation site, the system would probably recommend more recent rom-com movies that the user has not yet seen or rated). A user model is created from the user's ratings (for the watched movies) and item attributes (for movies, attributes such as cast, type of movie and genre). This model is applied to predict which kinds of movies the user would like in the future. CBF is also known as information filtering.
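A minimal sketch of such a user model is given below. It assumes, purely for illustration, that item attributes are sets of genre labels and that a rating above a neutral midpoint of 3 counts in favor of an attribute; the movie identifiers and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical item attributes (genres) and one user's past ratings (1-5).
items = {
    "m1": {"romance", "comedy"},
    "m2": {"action"},
    "m3": {"romance", "comedy", "drama"},
    "m4": {"action", "thriller"},
}
user_ratings = {"m1": 5, "m2": 1}

def build_profile(user_ratings, items, neutral=3):
    """User model: each attribute accumulates (rating - neutral)."""
    profile = defaultdict(float)
    for item, rating in user_ratings.items():
        for attr in items[item]:
            profile[attr] += rating - neutral
    return dict(profile)

def score(profile, attrs):
    """Score an item by summing the profile weights of its attributes."""
    return sum(profile.get(a, 0.0) for a in attrs)

def recommend(user_ratings, items, top_n=2):
    """Rank the items the user has not rated by their profile score."""
    profile = build_profile(user_ratings, items)
    unseen = [i for i in items if i not in user_ratings]
    unseen.sort(key=lambda i: score(profile, items[i]), reverse=True)
    return unseen[:top_n]
```

Because the profile only rewards attributes the user has already liked, the top suggestion here is another romantic comedy, which also illustrates the overspecialization problem of CBF.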
Another variant of CBF is knowledge-based filtering, in which item attributes form a model of the item space and users navigate that space. In the case of personalized news feeds, for example, the user reads certain news articles, the recommender system learns the user's preferences and, based on the item model, recommends similar news articles to the user.
Hybrid techniques use different combinations of CF and CBF [3] to exploit the merits of each of these techniques. Hybrid techniques are usually based on probabilistic methods such as genetic algorithms, Bayesian networks and clustering.

III. SOCIAL RECOMMENDER SYSTEM
With the rapidly increasing popularity of Web 2.0 applications and the advent of the Social Web, social contextual information (e.g., social links of users in the form of friend lists, followed and followers, and users' interest groups) has become available. It contains a large amount of supplemental information about the features of items, or about items and services that are likely to be of interest to the user, and therefore provides a tremendous opportunity to improve recommendation quality.
There have been constant efforts to explore social contextual information (e.g., the user's social trust network, tags issued by users or associated with items) and to devise methods that capture this information and incorporate it into recommender systems. Social recommendation works on the principle that users trust their network of "elective affinities" more than generic suggestions made by impersonal entities unknown to them [13]. Put simply, when you ask a trusted friend which book to read or which movie to watch, you rely on that friend's recommendation rather than on suggestions from someone you do not know or trust. This is a kind of verbal social recommendation. Along similar lines, users in a social trust network are more likely to follow the interests of the friends and people they trust.

Knowledge and content sharing systems (e.g., news, articles, bookmarks) have also been gaining momentum and generate huge amounts of shared data, along with user-created data in the form of comments, blogs, ratings, labels etc. Discovering relevant content in such a shared data space has become a nightmare; it is like finding a needle in a haystack. In such systems, for example an e-news website where users can read news articles from around the globe, there should be some practical means to assess the quality and authenticity of the news going into users' personalized news feeds. There should also be parameters for checking the trustworthiness of the sites that publish news articles and for assessing the reputation of sites before making recommendations.
Trust and reputation are therefore of great importance in the social web. A network of trust is a social network in which nodes are interconnected based on their trust relations [26]. Researchers have devised various approaches to measure trust. Both user trust and item trust can be measured either implicitly or explicitly.
User trust can be computed from explicit information (e.g., trust networks [14], [26], distrust analysis [27], personality-based similarity measures [4]) or from implicit information obtained from the social network (e.g., a trust propagation mechanism [5]). Item trust can be obtained explicitly by assessing the reputation of items through user feedback in the online community, or implicitly by studying the relationship between users and items [3].
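To make the idea of implicit trust concrete, the following sketch propagates explicit trust values along paths in a trust network. It is only one simple possibility and not the mechanism of [5]: the multiplicative decay along a path, the choice of the strongest path, the depth limit and the toy values are all assumptions made for illustration.

```python
# Hypothetical explicit trust statements: trust[a][b] in [0, 1].
trust = {
    "a": {"b": 0.8, "c": 0.25},
    "b": {"c": 0.5},
}

def propagated_trust(trust, source, target, max_depth=3):
    """Strongest multiplicative trust path from source to target.

    Trust decays along a path as the product of its edge values;
    only paths with at most max_depth edges are considered.
    """
    best = 0.0
    stack = [(source, 1.0, {source})]
    while stack:
        node, strength, visited = stack.pop()
        for nxt, weight in trust.get(node, {}).items():
            if nxt in visited:
                continue  # avoid cycles
            s = strength * weight
            if nxt == target:
                best = max(best, s)
            elif len(visited) < max_depth:
                stack.append((nxt, s, visited | {nxt}))
    return best
```

In the toy network, the indirect path from a to c through b (0.8 × 0.5) is stronger than the direct statement (0.25), so the propagated value is 0.4.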

A. Types of Information Sources in Social Context
Traditional recommender algorithms exploit explicit user feedback as an information source on which recommendations to similar users, or of similar items, can be based. Similarly, the various explicit and implicit information sources that help capture social information for users and items are shown in Table I. Researchers have proposed different ways in which the social information of a user can be captured and used in the recommendation process. In social media communities, explicit social networks are created by a complex web of relationships: users make friends with other users, follow or are followed by others, and join common interest groups. These networks are useful for forecasting users' inclinations, because a user's interests may be influenced by friends or by neighbors in interest groups. A lot of work has already been undertaken on utilizing friendship relations for recommendation [9]. The social filtering of links in a social network to discover a user's trust network constitutes the inherent implicit data of the user.

B. Social Links
Wolfgang Woerndl and Georg Groh (2007) added the social context of the user as another dimension to the item-user matrix of CF, thereby broadening the domain of the rating mapping (R) to a 3-dimensional space represented by U, I, C (U: user, I: item, C: context). They used a real dataset in which a subset of users from Lokalisten, a Munich-based German community for making friends, rated restaurants via an online survey. Their evaluation showed that the proposed social neighborhood based recommender outperformed traditional collaborative filtering algorithms (using the kNN method) in this scenario. A limitation is that it remains doubtful whether these results generalize to all domains [28].
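The difference from kNN-based CF can be sketched in a few lines: the neighborhood is taken from the user's social links instead of being computed from rating similarity. The sketch below is a hypothetical simplification with made-up data and an unweighted average; it is not the algorithm evaluated in [28].

```python
# Hypothetical restaurant ratings and friendship links.
ratings = {
    "ann": {"r1": 4},
    "ben": {"r1": 5, "r2": 2},
    "cat": {"r2": 4},
    "dan": {"r2": 1},
}
friends = {"ann": ["ben", "cat"]}

def predict_from_friends(ratings, friends, user, item):
    """Average the ratings that the user's friends gave to the item."""
    votes = [ratings[f][item]
             for f in friends.get(user, [])
             if item in ratings.get(f, {})]
    return sum(votes) / len(votes) if votes else None
```

Here `predict_from_friends(ratings, friends, "ann", "r2")` uses only the ratings of ben and cat; dan's rating is ignored because dan is not in ann's social neighborhood.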
Fengkun Liu and Hong Joo Lee (2010) combined social network information with CF methods to generate recommendations from suggested neighbor groups. They collected data about users' preference ratings for homepage skins (a digital item) and about their social relationships from Cyworld, a social networking community in South Korea. Next, they developed approaches for selecting neighbors using Pearson's correlation and augmented them with friends' data. The resulting model generated item recommendations using the proposed CF with the suggested neighbor sets [29].
Kazienko et al. (2011) analyzed a multimedia sharing system (MSS), the photo sharing system Flickr, as a multi-relational social network (MSN), exploring the relation layers based on contact lists, tags, groups, favorites and opinions. These layers were aggregated to form a comprehensive multidimensional social relationship between users, which made it possible to merge the semantic and the social background of the user concerned. The model was used to recommend other users to the active user in the MSS. Additionally, some system and personal weights were adjusted for better accuracy. The experiment was conducted in two stages, which led to two separate recommendation lists. The first list was computed under the assumption of equal personal weight values for the layers, i.e., for each layer k and each user ui. Users were expected to rate these suggestions. Once the first lists were rated, layer contributions were applied and an adaptation mechanism adjusted the suggestions to each user, which produced the second recommendation list. An analysis of the personal weight values after adaptation revealed that the social layers based on the indirect reciprocal contact list R coc and on author-opinion R ao gained much in their contribution, by 220% and 65% respectively, whereas other layers lost importance. The tag-based layer R t increased by 8% on average [16].
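The role of per-layer personal weights can be illustrated with a small sketch. The weighted average used here, the layer names and all numeric values are assumptions made for illustration only; the actual aggregation and adaptation formulas are those defined in [16] and are not reproduced here.

```python
def aggregate_relation(layer_strengths, personal_weights):
    """Weighted average of per-layer relation strengths between two users.

    layer_strengths:  {layer_name: strength in [0, 1]} for one pair of users
    personal_weights: {layer_name: non-negative weight} for the active user
    """
    total = sum(personal_weights[layer] for layer in layer_strengths)
    if total == 0:
        return 0.0
    weighted = sum(personal_weights[layer] * strength
                   for layer, strength in layer_strengths.items())
    return weighted / total

# Hypothetical relation strengths between the active user and a candidate.
layers = {"contacts": 0.5, "tags": 1.0}

# Stage 1: equal personal weights for every layer.
equal = {"contacts": 1.0, "tags": 1.0}
# Stage 2: weights after a (hypothetical) adaptation to the user's feedback.
adapted = {"contacts": 3.0, "tags": 1.0}
```

With equal weights the aggregated relation is the plain mean of the layers (0.75 here); after adaptation, layers with larger personal weights dominate the result (0.625 here), which changes the ranking of candidate users.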
Xin Liu and Karl Aberer (2013) proposed SoCo, a social network aided context-aware recommender system. First, they partitioned the original user-item rating matrix into groups with similar rating contexts by using random decision trees. Next, they predicted the missing preference of a user for an item in the partitioned matrix by using matrix factorization. A social regularization term was added to the matrix factorization objective function; it infers a user's preference for an item by learning from the interests of his or her friends, who are expected to share similar tastes. The model was evaluated on a real dataset from Douban, one of the largest Chinese social platforms for sharing reviews and recommendations for books, movies and music. The dataset contains time and date related information, other inferred contextual information, and social relationship information. SoCo outperformed a contemporary context-aware recommender system and a social recommendation model by 15.7% and 12.2% respectively [30].
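The effect of a social regularization term can be sketched with a small stochastic gradient descent routine. This is a generic illustration and not the SoCo objective from [30]: it omits the context partitioning step, pulls each user's latent vector toward the average of the friends' vectors, and uses hypothetical data and hyperparameters.

```python
import random

def train_social_mf(ratings, friends, n_users, n_items, k=2,
                    lr=0.02, reg=0.05, beta=0.1, epochs=300, seed=0):
    """Matrix factorization with a simple social regularization term.

    ratings: list of (user, item, rating) triples with integer ids
    friends: {user: [friend ids]}
    Each update moves P[u] toward the mean latent vector of u's friends.
    Simplification: only the term of user u itself is differentiated.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            fr = friends.get(u, [])
            for f in range(k):
                social = 0.0
                if fr:
                    social = P[u][f] - sum(P[v][f] for v in fr) / len(fr)
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu - beta * social)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def rmse(ratings, P, Q):
    """Root mean squared error of the factor model on the given ratings."""
    se = sum((r - sum(a * b for a, b in zip(P[u], Q[i]))) ** 2
             for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5

# Hypothetical toy data: 3 users, 3 items; user 0 is a friend of user 1.
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
friends = {0: [1]}
```

The coefficient `beta` controls how strongly friends influence a user's latent vector; setting it to 0 recovers plain regularized matrix factorization.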

C. Social Tagging
With the popularity of Web 2.0, there has been progressive growth in the creation, modification and sharing of online content over social network communities such as YouTube, Facebook and Flickr, and social tagging systems (STS) provide a powerful way for users to organize, administer, consolidate and search for innumerable kinds of resources. Tags [8], [9], [12], [15], [17], [18] carry interesting information about the preferences of the users who assign them and about the labelled items themselves. For example, Last.fm allows users to showcase their preferences by tagging artists, albums or music tracks, and Del.icio.us allows users to tag webpages. By introducing a tag, users annotate items such as photos, videos and blogs, for which it is otherwise difficult to generate attributes. A set of triples (user, item, tag) forms an information space referred to as a folksonomy [12]. The collection of all tag assignments of a user is called that user's personomy, and the collection of all personomies constitutes the folksonomy. Recommending tags can serve various purposes, such as increasing the chances of getting an entity annotated, reminding a user what an entity is about, and consolidating the annotation across users [15].
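The relationship between tag assignments, personomies and the folksonomy maps directly onto a simple data structure. The sketch below uses hypothetical users, items and tags.

```python
from collections import Counter, defaultdict

# Hypothetical tag assignments as (user, item, tag) triples.
assignments = [
    ("u1", "song_a", "rock"),
    ("u1", "song_b", "indie"),
    ("u2", "song_a", "rock"),
    ("u2", "song_a", "classic"),
]

def personomy(assignments, user):
    """All (item, tag) assignments made by a single user."""
    return {(item, tag) for u, item, tag in assignments if u == user}

def folksonomy(assignments):
    """The collection of all personomies, keyed by user."""
    result = defaultdict(set)
    for user, item, tag in assignments:
        result[user].add((item, tag))
    return dict(result)

def tag_counts(assignments, item):
    """How often each tag was assigned to an item, across all users."""
    return Counter(tag for _, i, tag in assignments if i == item)
```

For the toy data, `tag_counts(assignments, "song_a")` shows that "rock" was assigned twice and "classic" once, which is the kind of aggregate that count-based tag recommenders build on.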
Jäschke et al. (2008) compared several approaches to tag recommendation in the domain of social bookmarking systems. They evaluated an adaptation of user-based collaborative filtering, a graph-based recommender built on top of the FolkRank algorithm, and several simpler approaches based on tag counts. They computed the complexity of these algorithms and compared them on three real-world folksonomy datasets from del.icio.us, BibSonomy and last.fm. They found that the "most popular tags ρ-mix" approach outperformed the other simple approaches: it can almost reach the quality of FolkRank (which was powerful but cost-intensive) and is extremely cheap to generate [15]. These approaches were evaluated on small datasets; their performance on big datasets has not been evaluated.
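A count-based recommender of this kind can be sketched as follows. The sketch mixes the normalized tag counts of the item with the normalized tag counts of the user through a parameter ρ; the exact normalization, the role of ρ and the toy data are assumptions made for illustration and may differ from the definition in [15].

```python
from collections import Counter

# Hypothetical (user, item, tag) assignments.
assignments = [
    ("u1", "a", "rock"),
    ("u2", "a", "rock"),
    ("u2", "a", "live"),
    ("u3", "b", "jazz"),
    ("u3", "c", "jazz"),
    ("u3", "c", "live"),
]

def normalized_counts(tags):
    """Relative frequency of each tag in an iterable of tags."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}

def recommend_tags(assignments, user, item, rho=0.5, top_n=3):
    """Mix the item's popular tags (weight rho) with the user's (1 - rho)."""
    by_item = normalized_counts(t for _, i, t in assignments if i == item)
    by_user = normalized_counts(t for u, _, t in assignments if u == user)
    scores = {t: rho * by_item.get(t, 0.0) + (1 - rho) * by_user.get(t, 0.0)
              for t in set(by_item) | set(by_user)}
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]
```

Only two counting passes over the assignments are needed per request, which is why approaches of this kind are cheap compared with graph-based ranking.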
Stefan Siersdorfer and Sergej Sizov (2009) represented Web 2.0 folksonomies in an IR-like vector space model and, on top of it, implemented known recommender methodologies, namely collaborative filtering and content-based filtering, using additional social relations obtained from folksonomy features such as posts, contacts and favorites. They provided a large-scale experimental study of photo and contact recommendations on Flickr, comparing a variety of object representations. The study showed that the common relationship model between users, items and annotations is often not sufficient for constructing accurate recommendation algorithms in folksonomies. Personalized models that consider the user's personal data and local neighborhood provide higher accuracy at noticeably lower computational and modeling costs [12].
Nan Zheng and Qiudan Li (2011) investigated the usefulness of tag and time information in predicting users' preferences and integrated this information into CF to build an effective resource-recommendation model for Social Tagging Systems (STS). They realized this model in three phases. In the first phase, they generated ratings based on tag weight, time weight and tag-time weight. In the second phase, they used the generated rating information to calculate user similarity. In the third phase, they recommended resources. They supported their research with empirical results on a real-world dataset, and proposed evaluating their model on other datasets as future work [18].
Ma, Zhou, Lyu and King (2011) proposed a generic framework that combines the user-item rating matrix with users' social trust network through probabilistic matrix factorization. They further extended the framework to incorporate social tag information. They conducted experiments on two different datasets: Epinions for the social trust network and MovieLens for tag information. The experimental results show that their approach outperformed other contemporary CF algorithms, and the complexity analysis indicated that it scales to huge datasets. One limitation is that, when consolidating the social trust network information, the approach ignores the diffusion or propagation of information between users. In addition, a more general framework could be designed to incorporate tags with users and items simultaneously, rather than associating tags with users and items individually [8].
Tan, Bu, Chen, Xu, Wang and He (2011) proposed the music recommendation hypergraph (MRH) algorithm, in which they incorporated various kinds of social media information and acoustic-based music content. They used a hypergraph to build a unified framework that takes all objects and relations into account. Recommendation was reduced to a ranking problem on this hypergraph. To evaluate their algorithm, they collected data from Last.fm. They also compared the MRH algorithm with MRH variants and some traditional methods, and found that the proposed algorithm significantly outperforms its variants and traditional recommendation algorithms [9].
Jian Jin and Qun Chen (2012) proposed a top-K recommender system based on a social tagging network. In this system, tag information is the representation of an item. A feature matrix (item-tag) is constructed by gathering information on all items annotated by tags, so the more tags an item has, the more complete its semantic information. This matrix forms the basis for computing item similarity. A user-tag matrix is then constructed, which records the number of times user i uses tag ij when tagging item j in the item set. This matrix is used to calculate user similarity based on tagging the same item. From it, the trust value between two friends can be derived and the trust-based social network completed. The recommendation algorithm used is MWalker (a modified TrustWalker), applied to a Last.fm dataset. This approach outperformed two traditional CF algorithms. It removes the need for users to state trust values in the network explicitly, which is a subjective process, and it addresses the cold start problem of traditional CF. The one limitation reported is redundancy between tags [17].
Bastian et al. (2014) presented their experience of developing "Skills and Expertise", a data-driven feature on LinkedIn that allows users to tag themselves with topics representing their areas of proficiency. They developed a large-scale topic extraction pipeline on the Hadoop platform, in which they constructed a folksonomy of skills and expertise to help users standardize the information in the skills section and to provide type-ahead suggestions. They then created a skill inference algorithm, a collaborative filtering (CF) skills recommender based on profile attribute data, which directly suggests additional skills to members.
One of the major benefits was that a large number of members added skills to their profiles as a result of the recommender system. The authors also suggested that extending the feature to other languages would be a compelling challenge [31].

D. User Generated Information
User-generated content (e.g., comments, blogs, posts, opinions), along with users' social network links, has started a new trend in improving recommendation results. Semantic resemblance, social closeness and popularity are some of the additional aspects that can be employed as information sources for measuring social information.
Yung-Ming Li, Tzu-Fong Liao and Cheng-Yang Lai (2012) modelled SNMC (Social Network-based Markov Chain), which integrates semantic similarity, expertise and social intimacy into its analysis to generate discussion thread and expert recommendations for knowledge sharing in online forums [11]. The systematic review of the literature on Social Recommender Systems is summarized in Table II.

IV. FUTURE OF RECOMMENDER SYSTEM AND CHALLENGES
The discussion so far shows that the success of any good recommender system depends on considering a comprehensive set of information sources. The kind of information source used has a great impact on recommendation quality. With the advent of Web 3.0, context-aware information (e.g., geo-social information) and information from a variety of sensors (e.g., sensors measuring various health data) are therefore expected to be incorporated along with the information discussed above. Currently, only the geographic location [24] of the user is included in recommendation systems. Other data that could be incorporated include RFID and surveillance data [3]. The future of recommender systems lies in the Internet of Things.
Understanding the dependencies and correlations between preferences in different domains has led to the transfer and exploitation of knowledge acquired about a user in one domain to several other domains. Tobias et al. [32] conducted a survey that highlights various cross-domain recommender systems.
The growing size of folksonomies poses interesting challenges and opportunities for searching and mining useful content and for finding other users who share the same interests [12]. The analysis of such "Big" data is one of the key issues faced by designers of recommender systems.
Another is privacy, an important issue because these systems exploit information from social networking sites, which contain a lot of information about their registered users. The sharing of such information by companies may pose identity threats.
A further issue is the difficulty of acquiring feedback from users: users often do not rate items (as CF requires), so it is almost impossible to determine whether a recommendation was correct. Recommender systems (mainly in e-commerce) also experience shilling attacks, which generate rating bias; for example, the product of company X receives positive ratings while products from competitors receive negative ratings. These systems are highly susceptible to such external influences.

V. CONCLUSION
From the literature review, it can be concluded that recommender systems aided by social information have outperformed most traditional systems in making effective recommendations. Social link information has been captured from real-time social networking sites and used in devising hybrid approaches that build on fundamental CF methodologies [16], [28], [29], [30]. As online content is progressively created, edited and shared over social network communities, social tagging provides a powerful way for users to organize, administer, consolidate and search for innumerable kinds of resources. Tags are considered an expression of a user's preference for a certain resource over time, and a denotation of the user's interest drift [18]. To the best of our knowledge, limited literature is available on social tag recommendation across different domains. It has been explored in areas such as bookmarking, bibliographic references, music, movies, books and skills. Recommending better tags depends on the creation of an improved folksonomy. In addition, a rich set of structures and annotations can be used for mining for a variety of purposes; these include a range of descriptive metadata, such as the author, a textual description of the media item, tags expressing the theme of an item, historical and geographic information pertaining to an item, and comments and preference logs by other users.

Fig. 1. Linking of key elements of Recommender System

Fig. 2 shows a snapshot of the different input data, sub-categorized into explicit-implicit data and user-item data. As indicated in the figure, aggregated explicit-implicit user and item data is used in traditional recommendation systems, while the input data used in SRS is a superset of the data used in traditional recommendation systems, including some additional data.
• Recommendation Algorithms - In general, recommendation algorithms are based on 2 basic filtering techniques: Collaborative Filtering and Content-Based Filtering. These two approaches can be combined in different ways to form Hybrid techniques [3]. These filtering techniques (Collaborative, Content-Based, Hybrid) can be applied on databases (non-public commercial databases or public databases) to yield accurate predictions and recommendations of items to the taste of users.

Fig. 2. Categorization of Input Data

In a widely accepted taxonomy, Recommender Systems can also be divided into Memory-based approaches (similarity measures, aggregation approaches) and Model-based approaches (clustering methods, genetic algorithms, Bayesian classifiers, neural networks, fuzzy systems, latent features). A Memory-based approach can be applied only on the user-item matrix, while in a Model-based approach the data is used to model the system.

TABLE II. SYSTEMATIC REVIEW OF LITERATURE OF SOCIAL RECOMMENDER SYSTEM