Electronic Commerce Product Recommendation using Enhanced Conjoint Analysis

When searching for a product, buyers find many identical products sold in the marketplace, so they usually compare the items according to their preferences, for example, price, seller reputation, product reviews, and shipping cost. Buyers weigh each preference subjectively to make a final decision on which product to buy. With hundreds of thousands of products to compare, the buyer may not find the product that best meets these preferences. To that end, we propose the Enhanced Conjoint Analysis (ECA) method. Conjoint analysis is a common method for deriving a marketing strategy for a product or analyzing the important factors of a product. Because of this feature, it can also be used to analyze the important factors of a product in the marketplace based on price. We convert the importance percentage of each factor into a coefficient, use the coefficients to calculate a weight for every attribute, and sum the weights. To evaluate the method, we compared ECA with other prediction algorithms: generalized linear model (GLM), decision tree (DT), random forest (RF), gradient boosted trees (GBT), and support vector machine (SVM). In our experiments, the running time of ECA is 6.146 s, compared with GLM (5.537 s), DT (1 s), RF (10.119 s), GBT (45.881 s), and SVM (11.583 s). These results show that the proposed method can be used to create recommendations as an alternative to neural network or machine learning approaches.
Keywords: Enhanced conjoint analysis; marketplace; e-commerce


I. INTRODUCTION
E-commerce, in its first form, was established 40 years ago. Since then, new technology, advances in internet connectivity and security, payment gateways, and widespread consumer and business acceptance have all aided e-commerce's growth. Michael Aldrich launched electronic shopping in 1979 by connecting a customized television to a transaction-processing computer through a telephone line. Amazon, launched in 1995, was one of the earliest e-commerce sites for books, and PayPal followed as an e-commerce payment system. Alibaba was founded in 1999 as an online e-commerce platform. It transformed the business model from single-store e-commerce to a multistore marketplace.
In the 2000s, Shopify and BigCommerce launched cloud e-commerce platforms that help retailers run their own e-commerce shops. Previously, a retailer's system had to be built from scratch, but Shopify and BigCommerce allow retailers to open an e-commerce store by renting the platform. Since 2010, numerous payment systems have arisen, including Stripe, Apple Pay, Samsung Pay, and other payment methods. Now there is also social commerce, where people can buy or sell through social media such as Facebook and Instagram, and even through messenger applications such as WhatsApp, Telegram, and Signal.
In Indonesia, Bhinneka and Sanur were the first e-commerce sites, launched in 1996. Doku was then introduced as a payment gateway system in 2007. A new era of e-commerce started in 2010, when Tokopedia and Bukalapak launched as e-commerce marketplaces. Many more marketplaces launched later, such as Blibli, JD.ID, Lazada, Qoo10, and others. E-commerce is a new way of conducting business transactions through the internet, covering the lease or auction of goods [1]. E-commerce marketplaces in Indonesia commonly act as mediators. The marketplace platform acts as a bridge between customer and seller, and it obtains the data generated by all its participants [2]. The author of [3] formulates e-commerce as an integrated platform from the upstream supplier (individual industry, cooperative, and production) to the end customer. This platform consists of intermediate channel providers (distributor, importer), one or a few categories of product, called vertical e-commerce, and the O2O (online-to-offline) model, where customers can browse the product through e-commerce and buy it in a physical store. E-commerce today is not only a rigid platform for interaction between buyers and sellers, where the information received by buyers comes only from the platform; it also provides direct access between buyers and sellers to create an interactive environment [4]. E-commerce also offers transactions across more than one country; this is called cross-border e-commerce, a new concept of resource integration, supply and demand matching, joint operation, and supply chain logistics between countries [5]. People buy items at online stores for a variety of reasons: they can buy at any time, they save time by not having to go to the store, they can easily compare prices, and they do not have to bargain because the price of the product is fixed.
Customers in an online shop have more information and more chances to compare goods and prices among thousands of products, along with better product selection, convenience, and ease of discovering desired products [6]. In the marketplace, consumers usually search for products and filter them according to their needs, for example, location, price, discount, and delivery courier. All sellers whose products match the keywords will appear. After choosing the product, consumers choose the delivery courier and payment method. Finally, after payment is completed, consumers wait for their product to arrive. Consumers can complete their order when the product arrives, and the funds are then released to the seller. The flow of an e-commerce transaction is shown in Fig. 1. Currently, transactions in a marketplace are determined by many factors:
1) Location of buyers and sellers. The location of the buyer and seller is usually considered when buying goods because it relates to shipping cost.
2) Delivery services that the seller activates. Every buyer has an opinion or experience using certain shipping services.
3) Price. One of the essential aspects for purchasers is the pricing of goods. Buyers hunt for the best deal on everyday items.
4) Review. Reviews about products, services, and shipping from the transactions of previous buyers.
5) Promotion. Promotions are provided by sellers, such as discounts, cashback, or free shipping.
Indonesia is regarded as one of Asia's most developed e-commerce markets. Several reasons contribute to the fast growth of Indonesia's e-commerce sector. First, smartphone and Internet development continues to accelerate. Second, Indonesia has a large population with rising purchasing power due to the country's solid macroeconomic growth. Third, Indonesia's populace is young and tech-savvy [2]. There were 175.4 million active Internet users in January 2020 [3]. With 2,201 start-ups, Indonesia can adopt and develop products based on new technology rapidly [4]. In addition, the government supports the sector by opening large-scale domestic and overseas investment in developing Micro, Small and Medium Enterprises and e-commerce, which has made e-commerce grow very fast. This growth gives e-commerce practitioners such as Tokopedia, Bukalapak, Blibli, and others the chance to grow their business [5]. In Indonesia, buyers spend an average of 4 minutes in a marketplace looking at products before deciding whether to buy an item. In general, many identical products are sold in the marketplace, so buyers usually compare the items according to their preferences, for example, price, seller reputation, product reviews, and shipping cost. Buyers weigh each preference subjectively to make a final decision on which product to buy. With so many products in e-commerce, finding the optimal product is a problem because buyers have to compare products manually, one by one.
Motivated by this problem, the following research question arises and is discussed in this paper: what type of consumer strategy should be implemented to get the optimal product?
To answer this question, we propose a method that helps buyers find the optimal product based on price, shipping cost, and insurance cost. An enhanced conjoint analysis method is chosen to recommend the optimal product. Conjoint analysis is a well-known research technique in marketing and consumer research. This approach has been used to address a wide range of marketing challenges, including predicting product demand, developing a new product line, and calibrating pricing sensitivity/elasticity. The approach entails presenting respondent consumers with a carefully crafted collection of hypothetical product profiles (defined by the necessary levels of relevant qualities) and gathering their preferences for those profiles in the form of ratings, rankings, or selections [6].
Conjoint analysis is usually used to identify the variables that are important for a product. For example, [7] uses adaptive choice-based conjoint analysis to identify the surcharge for outdoor apparel, [8] evaluates domestic express coach service using conjoint analysis, and [9] identifies the critical attributes of smartphones using conjoint analysis. Conjoint analysis is thus widely used to evaluate the essential attributes of a product. By knowing the critical attributes of the product, a company can create a new marketing strategy, a new version of the product, or a new product that is closer to what consumers need. In this research, conjoint analysis is used to obtain the essential attributes of a consumer product. The importance value of each attribute is converted into an attribute weight. We then multiply the value of each product attribute by its weight to get a score and choose the product with the minimum score as the recommendation. We assume that an e-commerce marketplace exists; when the consumer is choosing a product, our system recommends which one the consumer should choose.
The remainder of the paper is structured as follows. Section II examines research in a similar field. Section III introduces the notation, assumptions, formulations, and system used to address the problem. Section IV presents the simulation, which we create, test, and analyze to assess the performance of our method. Section V contains the limitations and conclusion of this research.

II. LITERATURE REVIEW
Several studies have addressed pricing strategies in the marketplace. The authors of [10] offer an empirical analysis and an analytical model that show how an online shop may achieve a competitive advantage by designing an optimal mix of product price and shipping price. A unique algorithmic approach for estimating optimal prices in e-commerce scenarios using noisy and sparse data is proposed in [11]. Its structure is broad and can be tailored to a particular issue. Experiments demonstrated that using the methodology in practice can result in significant increases in profit and revenue. A multi-armed bandit algorithm for dynamic pricing with a customer-centric strategy is recommended in [12], based on the notion that systems track client behavior and that price affects whether or not a purchase is made.
There is also research on e-commerce recommender systems. Collaborative filtering, knowledge-based reasoning, content-based filtering, demographic-based filtering, and hybrid techniques remain the favored primary methods to explore. In [13], the recommender system is improved using previous search information, user behavior analysis, and current search information. The proposed method combines content and web usage mining techniques and then measures the accuracy of the system. In [14], collaborative filtering is improved to recommend trendy items to customers; to demonstrate the performance of the algorithm, the authors deployed the system in an actual retail mall. They claim that their system outperforms traditional collaborative filtering in terms of efficiency. A recommender system based on web mining is developed and implemented in [15]; its recommendation engine takes data from web mining as input. For the implementation of a recommender system, [16] employed demographic data from the client registration form as an essential source for the data mining algorithm.
They proposed a four-phase recommender system workflow: 1) acquiring information implicitly and explicitly, 2) information processing, 3) recommendation processing, and 4) presenting the outcome to the clients. They also employed diverse data, such as online market data, query data, server log data, hyperlinks inside web pages, client registration data, and web pages. From this research, the author concludes that the proposed approach improves the quality of recommendation efficiently.
To construct their recommendation systems, [17], [18], and [19] used a knowledge-based filtering (KBF) strategy. In [19], KBF is used to help a customer make a decision. The length of time spent shopping and the kind of things purchased by the consumer are used as input to choose the most important product. It was discovered that KBF could improve decision-making by reducing the length of shopping and the effort required to choose the right product. In [18], the results of data clustering analysis are used to develop a web mining framework that delivers recommendations for e-commerce recommender systems. The author also stores knowledge rules using pattern analysis on suitable media, and the recommender engine learns from web mining. The author concluded that the suggested method could manage massive data quantities while substantially improving response time and scalability.
There is also research on e-commerce recommender systems that use content-based filtering methods. In [20], content analysis is suggested for improving the recommender system. The proposed method recommends sample websites that meet user requirements. However, it has a high operation cost and response time because websites differ in structure, tag naming, and information. In [21], BPR (Bayesian Personalized Ranking) is introduced to explore user preferences by ranking. The authors employed a global score function to determine the user's preference for various goods.
In [22], a strategy that uses collaborative filtering (CF) and content-based filtering (CBF) is proposed to achieve excellent recommendation accuracy. The author proposes a three-part weighted combination filtering recommender system: gathering the required data, creating recommendations, and communicating the findings to the user.
One of the most popular recommender system research techniques is the hybrid technique, which joins various algorithms and methods. Several different algorithms are used to collect and process information, which improves the quality of recommendations. In [23], a weighted parallel technique is developed for an e-commerce recommender system in Indonesia; the author uses a combination of CF and CBF approaches. In [24], a personalized hybrid approach is used to evaluate standard algorithms in order to enhance user interest modelling, variety, and scale.
To the best of our knowledge, no research has used the conjoint analysis (CA) method to create a recommender system, especially in e-commerce. We therefore assess CA as a base method for creating recommendations, especially for e-commerce in Indonesia.

III. NOTATION, FORMULATION, AND SYSTEM DESIGN
For convenience, the notation adopted in this paper is summarized in Table II. We first conducted a preliminary study to determine which attributes to use in this research. We investigated three e-commerce marketplaces in Indonesia: Tokopedia (https://www.tokopedia.com), Bukalapak (https://www.bukalapak.com), and Shopee (https://shopee.co.id). The explicit product attributes of each are shown in Table I. In addition to determining the essential attributes, the examinable performance level for each attribute must be defined using the following criteria [25]: 1) attribute levels should be as similar as feasible to the real-life experience of the customer, 2) attribute levels should be related to the product that is accessible to the consumer, and 3) attribute levels should contain characteristics that are regarded as essential competences. We also require that the primary attributes used by this method be numeric.
Product score, sold, price, shipping cost, and insurance cost are chosen as the primary attributes of our method. When not all preferences can be satisfied, the customer must make a trade-off among the primary attributes during the selection process. We address this using the Multi-Attribute Decision Making (MADM) problem formulation, which is as follows [26]: 5) a set of statements expressing the decision maker's preferences, P = {P_1, P_2, ..., P_t}: this piece of data must be elicited from the decision maker prior to or during the encounter. Different decision makers may have different preference statements, and during the search for the best answer, certain preferences may be violated for trade-off purposes.
In this paper, we consider N products indexed by i (i = 1, 2, 3, ..., N). Every product i has the attributes product score (c_i), sold (d_i), price (p_i), shipping cost (s_i), and insurance cost (u_i). Each attribute has a weight coefficient, denoted by K, L, M, N, and O, respectively. The notation is shown in Table II. Conjoint analysis is a well-known research technique in marketing and consumer research. This methodology, which allows for the understanding of consumer preferences, has been used to solve a wide range of marketing challenges, including forecasting product demand, creating a new product line, and calibrating pricing sensitivity/elasticity. Respondent customers are presented with a well-crafted collection of hypothetical product profiles (defined by the stated levels of the necessary qualities), and their preferences are collected in the form of ratings, rankings, or selections for those profiles [6]. The technique of ordinary least squares (OLS) regression is often used to estimate preference functions. According to research, the efficiency (predictive power) of this technique is often close to that of more complex techniques, while its findings are easier to grasp. We use the OLS equation for this estimation. The global utility of a given attribute is determined using (2), where O_p is the relative significance of the product attribute, max u_p is the utility of the attribute's most preferred level, and min u_p is the utility of the attribute's least preferred level. The operation is continually changing and based on a variety of factors. We then construct the enhanced equation (3) to calculate the weight of each product from all of its attributes, and we take the minimum weight over the products i as the recommendation of the enhanced conjoint analysis method.
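As a reading aid, the conventional conjoint-analysis form of relative importance, written with the symbols defined above, and one plausible linear form of the product weight can be sketched as follows. The summation index q, the percentage scaling, the symbol W_i, and the purely additive form of the weight are assumptions made for illustration; they follow the textual description (multiply each attribute value by its coefficient, sum, and take the minimum) rather than reproducing the numbered equations.

```latex
% Relative importance of attribute p from its utility range (conventional form)
O_p = \frac{\max u_p - \min u_p}{\sum_{q} \left( \max u_q - \min u_q \right)} \times 100\%

% Assumed additive weight of product i and the recommended product i^*
W_i = K c_i + L d_i + M p_i + N s_i + O u_i,
\qquad
i^{*} = \arg\min_{i} W_i
```

Under this reading, the importance percentages supply the coefficients K, L, M, N, and O, and the sign of each coefficient determines whether a larger attribute value raises or lowers W_i.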
A. Designed System
The designed recommender system is shown in Fig. 2. The flow of the recommendation process is described as follows:
1) The user enters the product name as a keyword; the aggregator then collects the data from the marketplace and saves it to the database.
2) The pre-processor gets the data from the database and cleans it (for example, by removing NULL values and unused attributes) before passing it to the conjoint analyzer.
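The cleaning step in 2) and the hand-off to the conjoint analyzer can be sketched as follows. This is a minimal Python sketch, not the authors' implementation: the field names, the record layout, and the sign convention (negative weights for attributes where larger is better, so that the minimum score wins) are assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical field names; the paper lists these attributes but not a schema.
NUMERIC_FIELDS = ("score", "sold", "price", "shipping_cost", "insurance_cost")


def clean(records: List[dict]) -> List[dict]:
    """Drop rows with missing numeric values and keep only the needed fields."""
    cleaned = []
    for row in records:
        if any(row.get(field) is None for field in NUMERIC_FIELDS):
            continue  # remove rows that contain NULL values
        kept = {"name": row.get("name")}
        # Unused attributes (for example an ETA string) are simply not copied.
        kept.update({field: float(row[field]) for field in NUMERIC_FIELDS})
        cleaned.append(kept)
    return cleaned


def weighted_score(row: dict, weights: Dict[str, float]) -> float:
    """Sum of attribute value times attribute weight for one product."""
    return sum(weights[field] * row[field] for field in NUMERIC_FIELDS)


def recommend(records: List[dict], weights: Dict[str, float]) -> Optional[dict]:
    """Clean the records, then return the product with the minimum score."""
    rows = clean(records)
    if not rows:
        return None
    return min(rows, key=lambda row: weighted_score(row, weights))
```

In practice the weights would come from the conjoint analysis step; any values passed in when trying this sketch are placeholders.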

B. Dataset Processing
Several Indonesian marketplaces could serve as the data source, such as Tokopedia, Shopee, Bukalapak, Lazada, Jakmall, or Blibli. Bukalapak was chosen as our data source because it provides an API for obtaining product delivery information instead of requiring it to be taken from the website. We assume that users search with the keyword "gegep tekiro" to retrieve the related products. The search page for the keyword "gegep tekiro" is shown in Fig. 4. From the search page, we then go to the detail page of every product. The information on the product detail page is shown in Fig. 5. From the product detail page, we retrieve the product name, link, price, sold, reviews, location, delivery services, shipping cost, insurance cost, and ETA (estimated time of arrival). An aggregator is created to retrieve only the needed information based on the HTML tags on the product detail page. Unessential parameters are eliminated, and the remaining values are inserted into the database table. Parameter processing is shown in Fig. 3.

IV. SIMULATION
In this experiment, we limit the data to 240 sellers. The five cheapest products are selected from the dataset and are shown in Table IV. In this simulation, we compare the Enhanced Conjoint Analysis method against selection by price. There are three scenarios with growing data: 80, 160, and 240 records.
670 | P a g e www.ijacsa.thesai.org
The results of the three scenarios are described as follows:
1) Scenario 1: The attribute importance from the conjoint analysis process is shown in Table V. Here, the most important attribute is insurance cost, followed by shipping cost, sold, and reviews.

Total sellers: 80
Buyer location: Bandung
In this scenario, the obtained result is different from the product in Table VI.
2) Scenario 2: The attribute importance from the conjoint analysis process is shown in Table VII. Here, the most important attribute is insurance cost, followed by shipping cost, sold, and reviews.

Keywords: Gegep tekiro
Total sellers: 160
Buyer location: Bandung
In this scenario, which uses twice as much data as scenario 1, the obtained result is the same as in scenario 1, but the equation used to create the recommendation is different. The amount of data changes the importance factor values and the affected variables used in (5). Our system chooses a product priced at Rp 39.999 with a shipping cost of Rp 8.000, an insurance cost of Rp 216, 52 units sold, and a 5.0 reputation, as shown in Table VIII.
3) Scenario 3: The attribute importance from the conjoint analysis process is shown in Table IX. Here, the most important attribute is insurance cost, followed by shipping cost, sold, and reviews.

Keywords: Gegep tekiro
Total sellers: 240
Buyer location: Bandung
In this scenario, which uses three times as much data as scenario 1, our system still chooses the product with the same combination as in scenario 2, as shown in Table X. For each scenario, our system calculates the coefficient of every preference depending on the amount of data, so the result can differ. Most interestingly, insurance cost is the most critical factor in a product besides price. When choosing a product in the marketplace, more than one attribute is needed to determine which product is optimal in price.
We also compared our method with other methods to observe its performance in terms of relative error, correlation, and root mean square error (RMSE). The methods compared are the Generalized Linear Model (GLM), Decision Tree (DT), Random Forest (RF), Gradient Boosted Trees (GBT), and Support Vector Machine (SVM).
As Fig. 6 shows, our method outperforms the other methods in relative error. In our experiments, the relative error of ECA is 0.0001, compared with GLM (0.008), DT (0.018), RF (0.048), GBT (0.014), and SVM (0.031). Unlike the other methods, ECA does not need training and test data. Using OLS regression, we obtain an equation and apply it directly to the dataset, while the other methods must predict using training and test data. Relative error is related to correlation: with the smallest relative error, ECA also has the highest correlation, as shown in Fig. 7. Since ECA has the smallest relative error, it also has the smallest RMSE, as shown in Fig. 8. The RMSE of ECA is 1.121, compared with GLM (1.458), DT (2.139), RF (4.051), GBT (1.773), and SVM (4.156). Although the ECA method has a small error, the decision tree has the best running time, as shown in Fig. 9. The running time of ECA is 6.146 s, compared with GLM (5.537 s), DT (1 s), RF (10.119 s), GBT (45.881 s), and SVM (11.583 s). The results of ECA and GLM differ only slightly because both are extensions of linear models: ECA uses ordinary least squares as its linear model, and GLM uses linear regression.
Since this dataset does not have many features, the decision tree does not have to build a large tree, so it can search down the branches quickly. Our method, although it has a regression equation, must still calculate a value for each record and then sort the values. The larger the data, the more time the calculation needs.
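The three quality measures used in this comparison can be computed as in the sketch below. The exact definitions behind the reported figures are not stated, so the sketch assumes common textbook forms: mean absolute relative error, root mean square error, and the Pearson correlation between actual and predicted values. The function names are illustrative.

```python
import math


def relative_error(actual, predicted):
    """Mean of |predicted - actual| / |actual| (assumes no actual value is 0)."""
    return sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def pearson(actual, predicted):
    """Pearson correlation coefficient between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)
```

Predictions that track the actual values closely yield a small relative error and RMSE together with a correlation near 1, which matches the pattern reported for ECA.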

V. LIMITATIONS AND CONCLUSION
Conjoint analysis is a standard method for deriving a marketing strategy for a product or analyzing the essential factors of a product. This method can also analyze the critical factors of a product in the marketplace based on its price. We convert the importance percentage of each factor into a coefficient to calculate a weight for every attribute. The simulation results show that the enhanced conjoint analysis method works well in choosing a product from multiple attributes to get the best price.
Although our method performs well, it has some limitations. ECA needs time to calculate a value for each record and then sort the values. As the data grows, our method needs more time because ECA calculates over all the data. We can reduce computing time by sampling the data to be calculated, but this can also reduce the accuracy of the result. Another option is to calculate the values with parallel computing, but this needs further study.
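The sampling trade-off mentioned above can be illustrated with a small sketch. It assumes the per-product scores are already available as a mapping from a hypothetical product name to its score; the function names and the fixed seed are illustrative choices, not part of the method.

```python
import random


def best_by_score(scores):
    """Name of the product with the minimum score over all data."""
    return min(scores, key=scores.get)


def best_from_sample(scores, k, seed=0):
    """Name of the minimum-score product within a random sample of k products."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    names = rng.sample(sorted(scores), k)
    return min(names, key=scores.get)
```

The sampled result can never score lower than the full-data result, and it matches the full-data result only when the best product happens to fall inside the sample, which is the accuracy loss that sampling risks.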
Since ECA uses OLS regression to create its equation, it chooses the optimal value for each attribute. The optimal value can be the largest or the smallest value of an attribute. In a real-world implementation, an online shop with many sales, many reviews, and low prices will be recommended continuously by our method.
Despite its limitations, ECA gives an optimal recommendation because processing all attributes with OLS regression yields an optimal coefficient for every attribute. ECA has the smallest relative error (0.0001) and good correlation and RMSE. Accuracy and optimization are traded off against computing time: ECA has a running time of 6.146 s, while the decision tree has the best time at 1 s. With this result, ECA can be used as an alternative recommendation approach alongside neural network or machine learning approaches.