Using Learning Analytics to Understand the Design of an Intelligent Language Tutor – Chatbot Lucy

The goal of this article is to explore how learning analytics can be used to predict and advise the design of an intelligent language tutor, chatbot Lucy. With its focus on using student-produced data to understand the design of Lucy to assist English language learning, this research can be a valuable resource for language-learning designers seeking to improve second language acquisition. In this article, we present students’ learning journey and data trails, the chatting log architecture, and the resultant applications to the design of language learning systems.


I. INTRODUCTION
The past decade has witnessed a great deal of interest in technology-driven language learning. Various technologies, such as interactive websites, artificial intelligence, synchronous chat, and virtual environments, have been developed in many settings to provide assistance to language learners. Among them, artificial intelligence agents such as the chatbot have tremendous potential, but this technology "is least explored in regard to its efficacy in second language learning due to the fact that the technology in this function is still under development and has not been widely applied yet" [1].
Based on the belief that chatbot technology is distinguished from other types of computer applications through simulating an intelligent conversation with human users via auditory or textual methods, language learning can take advantage of chatbots that may offer "intelligent conversational agents with complex, goal-driven behavior" [2].
This article presents how student-produced data can be used to understand the design of an intelligent language tutor, chatbot Lucy, to assist English language learning.

II.

A. Communicative Approach to Second Language Acquisition
In an effort to improve English language learning in British Columbia, Canada, there has been a renewed pedagogical emphasis on the communicative approach to teaching English throughout the province. This communicative approach requires natural communication and meaningful interaction in the target language, in which speakers are concerned not with the form of their utterances but with the messages they are conveying and understanding.
One way to foster positive English learning outcomes is to provide learners with comprehensible input in low-anxiety situations, containing messages that learners really want to hear [3]. This suggestion of comprehensible input lays a solid foundation for the instructional model that is now commonly known as the communicative approach to language acquisition. This approach does not force early production in language learning, but allows learners to produce sentences when they are ready. It recognizes that improvement comes from supplying communicative and comprehensible input instead of forcing and correcting production [3]. Unlike the behaviourist-centered perspective of the 1960s, which emphasizes stimulus and response as in the audio-lingual method, the communicative approach "stresses the importance of authentic and meaningful practice in reality-based simulative environments, with the ultimate goal of communicative competence in mind, rather than knowledge of grammar rules" [4].

B. Artificial Intelligence: Chatbot Technology
Chatbots are computer programs that simulate a human conversation using natural language. A wide variety of terms have been used, including chatterbots, virtual assistants, virtual agents, intelligent agents or web-bots. Chatbot architecture integrates a language model and computational algorithms to emulate informal chat communication between a human user and a computer using natural language. Users can chat through text or voice input over a computer screen with chatbot text output or audio/voice output.
Chatbots are developed for a variety of reasons. They can be created for fun, such as virtual characters and entertainers, or as part of interactive games, such as game players. They can also be designed to provide specific information and direct dialogue to specific topics, such as website guides, frequently asked questions (FAQ) guides, virtual support agents, virtual sales agents, survey takers, quiz hosts, learning tutors and chat-room hosts.
Among hundreds of ways of using a chatbot, its potential role as a language tutor has been widely explored in the computer-assisted language learning (CALL) field. As a language learning tutor/facilitator, a chatbot may re-create the learner-teacher bond through providing learners a character that does not get bored or lose patience. www.ijacsa.thesai.org

C. Learning Analytics
Learning analytics has close ties to the fields of business intelligence, web analytics, educational data mining and academic analytics [5]. As an emerging field at the intersection of learning and information technology, learning analytics uses student-produced data and analysis models to discover information and social connections, and to predict and advise on learning [6].
The interpretation of a wide range of data produced by and gathered on behalf of students can not only be used to assess academic progress, predict future performance, and spot potential issues [7], but also can be used to predict and advise the design of innovative learning technologies.

III. REVIEW OF RELATED WORKS
Initially, chatbots were developed for fun. They were designed to use simple keyword matching techniques to find a match for users' input [8]. ELIZA was one such chatbot: it extracted keywords from users' input, rephrased users' statements as questions and posed them back to users based on Rogerian analysis, a 1960s innovation in counselling.
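The keyword-and-rephrase idea can be illustrated with a minimal Python sketch. It is not ELIZA's actual script: the two rules, the reflection table and the fallback sentence are hypothetical and chosen only to show how a matched keyword lets the program turn a statement into a question.

```python
import re

# Hypothetical keyword rules: (regex over the user's input, question template).
RULES = [
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
]

# First-person words are swapped for second-person ones when echoing input.
REFLECTIONS = {"my": "your", "me": "you", "i": "you", "am": "are"}


def reflect(fragment):
    """Rewrite a fragment of the user's statement from the bot's point of view."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(user_input):
    """Return a question built from the first rule whose keyword matches."""
    text = user_input.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    # No keyword found: fall back to a neutral prompt (hypothetical wording).
    return "Please tell me more."


print(respond("I am worried about my exam."))
```

Because the reply is assembled from the user's own words, the program needs no understanding of the topic, which is also why such systems break down as soon as the input contains no known keyword.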
After ELIZA, other chatbot systems were developed using different pattern-matching algorithms [9] to simulate fictional or real personalities. Examples include PARRY, which used matching on simple internal affective states (fear, anger and mistrust), and MegaHAL, which used a Markov model, a more linguistically sophisticated approach [10].
The exponential growth in text and natural-language interface research in the late 80s encouraged the creation of many new chatbot architectures such as Jabberwacky and ALICE [11].
Jabberwacky, a chatbot that is operated entirely through user interaction, is designed on the principle that the system learns from all its previous conversations with human users. There are no fixed rules or principles programmed into the system. Jabberwacky stores everything that is said to it and uses contextual pattern matching techniques to select the most appropriate response. Hence, Jabberwacky relies entirely on previous conversations [7].
The widely used ALICE won the Loebner Prize competition in 2000, 2001 and 2004. Developed by Dr. Richard Wallace using an XML-based language called AIML (Artificial Intelligence Markup Language), ALICE aims to entertain users. ALICE is one of the chatbots on Pandorabots, the largest free open-source chatbot community on the Internet. An ALICE-style chatbot stores its knowledge of conversation patterns in AIML files. AIML is a derivative of Extensible Markup Language (XML) [9]. AIML consists of data objects called AIML objects, which are made up of units called topics and categories [9]. The topic is an optional top-level element, which contains a name attribute and a set of categories related to that topic [9]. The basic unit of knowledge in AIML is called a category. There are three types of categories, namely, atomic categories, default categories and recursive categories. Each category is a rule for matching an input and converting it to an output. It consists of a pattern, which is matched against the words or sentences provided to the chatbot, and a template, which is used to generate the ALICE chatbot's answer once the most appropriate pattern for the user's input has been found [9].
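The three category types can be sketched in a few lines of Python. This is a simplified illustration rather than a real AIML interpreter: the category table and its responses are hypothetical, only a single `*` wildcard per pattern is supported, and the `SRAI:` prefix is an invented stand-in for AIML's `<srai>` redirection.

```python
import re

# Hypothetical knowledge base: pattern -> template.
CATEGORIES = {
    "HI LUCY": "Hi there! May I help you?",               # atomic: no wildcard
    "HELLO LUCY": "SRAI:HI LUCY",                         # recursive: redirect
    "GOOD MORNING *": "SRAI:HI LUCY",                     # recursive with wildcard
    "*": "Sorry, I did not catch that. Could you rephrase?",  # default
}


def normalize(text):
    """Uppercase the input and drop punctuation before matching."""
    return re.sub(r"[^A-Za-z0-9 ]", "", text).upper().split()


def match(pattern, words):
    """Return the words captured by '*' (possibly empty list), or None."""
    parts = pattern.split()
    if "*" not in parts:
        return [] if parts == words else None
    star = parts.index("*")
    head, tail = parts[:star], parts[star + 1:]
    if len(words) < len(parts):  # '*' must absorb at least one word
        return None
    if words[:len(head)] != head:
        return None
    if tail and words[-len(tail):] != tail:
        return None
    return words[len(head):len(words) - len(tail)]


def respond(text, depth=0):
    """Try exact patterns first, then wildcard patterns, longest first."""
    words = normalize(text)
    ordered = sorted(CATEGORIES, key=lambda p: ("*" in p, -len(p)))
    for pattern in ordered:
        captured = match(pattern, words)
        if captured is None:
            continue
        template = CATEGORIES[pattern]
        if template.startswith("SRAI:"):
            if depth >= 5:  # guard against circular redirects
                return ""
            target = template[5:].replace("*", " ".join(captured))
            return respond(target, depth + 1)
        return template
    return ""


print(respond("Hello, Lucy!"))
```

Ordering exact patterns before wildcard ones is what lets the bare `*` category act as a last resort, which is the role the default category plays in this description.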
Chatbots used for language education are not new. Fryer and Carpenter [11] presented six potential advantages and applications of Jabberwacky chatbots for foreign language learning and teaching. According to Fryer and Carpenter [11], chatbots can help language learners in six ways: (1) students tend to feel more relaxed talking to a computer than to a person; (2) chatbots are willing to repeat the same material with students endlessly, and they do not get bored or lose their patience; (3) many bots provide both text and synthesized speech, allowing students to practice both listening and reading skills; (4) bots are new and interesting to students; (5) students have an opportunity to use a variety of language structures and vocabulary that they ordinarily would not have a chance to use; (6) chatbots could potentially provide quick and effective feedback on students' spelling and grammar.
Jia [12] described the CSIEC system, which had advantages over the old ELIZA-like keyword matching mechanism. According to Jia [12], the CSIEC system was developed based on logical reasoning and inference directly through syntactical and semantic analysis of textual knowledge. His paper explored an NLML approach to generating communicative responses, and presented the CSIEC system architecture and underlying technologies as well as its educational application results. His statistical analysis of the experiment indicated that users preferred the unique chatting function in the CSIEC system, which was lacking in other chatbot systems [12].
Wang [13] reported an ethnographic study that investigated ESL learners' experiences with a commercial chatbot English tutor. Her study identified four conditions for effective chatbot-supported English learning, namely, communicative practice, multimodal interface, emotional design and individualized content. Her findings revealed the promise of chatbot technology in terms of its communicative function for creating an optimal interactive English learning environment.
Lehtinen [14] discussed a research study that used Jabberwacky, God, ALICE and George to learn English. His findings showed an overall positive outlook in interacting with chatbots. His research demonstrated that regardless of the structured or unstructured use, AI chatbots had great potential to be used inside and outside a language classroom as they might allow language learners to practice language and develop confidence in an individualized stress-free manner at their own pace and preference.
Coniam [15] evaluated six chatbots available either online or for purchase: Cybelle, Dave, George, Jenny, Lucy and Ultra Hal Assistant. His evaluation examined the chatbots' interfaces as human-looking or human-sounding partners to chat with, and their usability as pieces of software suitable for ESL learners. Coniam concluded that chatbots had matured considerably since the early days of ELIZA, but they still had a long way to go before they could interact with students in the way that researchers such as Atwell [16] envisaged.
Williams and Compernolle [17] investigated interactions between a chatbot and French learners at various levels of proficiency as well as a native speaker of French. Their study responded to Fryer and Carpenter's [11] six potential advantages and applications of chatbots for foreign language learning and teaching by arguing that the discourse of the particular chatbot represented a less-than-ideal communicative model for learners. Chatbots, as peers or tools for language learners, might offer some potential for language learning, but at present, post-interaction tasks based on transcripts appear to hold the most promise for language awareness and development.

IV. MATERIALS AND METHODS
Building on the research literature on using chatbots for language education, this research explored the instructional design process of an intelligent language tutor, chatbot Lucy, through critical analysis of student-produced data. In particular, guided by the findings of Williams and Compernolle [17], this research built upon their response to Fryer and Carpenter's six potential advantages and applications of chatbots for foreign language learning and teaching.

A. Commercial Chatbot Lucy
The commercial chatbot Lucy is a digital language tutor that can carry on extensive conversations with learners as they speak into their computers through a microphone. Using an advanced speech recognition system, Lucy can give learners feedback on their pronunciation and guide them through useful exercises to improve their pronunciation and accuracy. Lucy's world is where learners meet Lucy. In each world, Lucy offers users over 1000 sentences on a specific subject. Each of Lucy's worlds focuses on a different topic including helping visitors, hotel English, giving directions, English for traveling and restaurant English (Figure 1).

B. Intelligent Chatbot Lucy
Intelligent chatbot Lucy, hosted on the Pandorabots website, is an online language robot created to help English 101 learners review English grammar and vocabulary learned from Lucy's world. It is an offshoot of the "Dr. Wallace's A.L.I.C.E. - March 2002" ALICE artificial intelligence program. Lucy is designed to be more of a "language tutor" than ALICE. She is trained based on the commercial chatbot Lucy's world (Figure 1). Besides this, a default response category is built into Lucy as an Input Pattern. In addition, a recursive category is built in to allow learners to express the same meaning using different sentence structures.

C. Method: Discourse Analysis
Language is structured according to different domains of social life [18]. Discourse analysis is the analysis of these patterns [18].
Computer-mediated discourse is "the communication produced when human beings interact with one another by transmitting messages via networked computers" [19]. Computer-mediated discourse uses discourse analysis to address the focus of language and language use in computer networked environments [19]. This research focused on the discourse between language learners and chatbot Lucy. The analysis of the conversational patterns saved in Lucy's logs is a key to understanding how language learning happens when using a chatbot and how it can be better designed based on language learning trials.
Drawing on computer-mediated discourse analysis (CMDA), we examine the conversation logs of these interactions. CMDA in this study aims to understand the learning nature of the online communication between language learners and Lucy. Such an understanding is facilitated by the fact that language learners engaging in meaningful learning activities in an online context typically leave a textual trace, making the interactions accessible to scrutiny and reflection and enabling researchers to employ empirical, micro-level methods to shed light on macro-level phenomena [20]. By critically analyzing learning dialogues, we identify patterns of learning activities that correspond to meaningful learning and knowledge construction. The approach to analyzing logs of verbal interaction [20], in search of indicators of learning and design clues, allows us to transform student-produced data into a new and coherent depiction of the affordances of chatbots for language education and of how we should design a chatbot's responses and feedback to engage language learners.

V. RESULTS AND DISCUSSION
Results of understanding the design of Lucy to assist English language learning are presented in parallel with the discourse analysis of communication logs saved in Lucy.

A. The Design of Chatbot Lucy
Intelligent chatbot Lucy is initially designed as an offshoot of the "Dr. Wallace's A.L.I.C.E. - March 2002" artificial intelligence program (Figure 4). She is trained to play five characters: travel agency assistant, hotel assistant, tour guide, waitress and call center assistant. Conversations from Lucy's world are converted to AIML using Pandorawriter (Figure 5). AIML files are then uploaded onto Lucy's AIML file logs (Figure 6).

B. Learning Procedure
English 101 language learners are asked to interact with the commercial chatbot Lucy first. They are required to learn vocabulary, grammar and sentences in Lucy's world. Learners are then asked to communicate with Lucy online, with a focus on reviewing the vocabulary, grammar and sentences learned from Lucy's world.

C. Discourse Analysis of Verbal Interaction Logs and Its Application in Training Lucy
Logs of verbal interaction reflect language learners' learning journey through interacting with intelligent chatbot Lucy. In coding logs of language learners' discourse, we found that Lucy needs to be trained to provide language learners not only with meaningful responses but also with feedback that targets their common errors.

1) Intelligent Chatbot Lucy's Ability to Repeat
One of Lucy's important features in this study is her ability to repeat sentences. English 101 is designed for intermediate level language learners. Logs of verbal interaction show a repetition pattern used by learners. Lucy is willing to repeat the same materials with students endlessly; literally, chatbots do not get bored or lose their patience [11] (Figure 7).
Intermediate or lower-level learners benefit from this repetition, which may provide them an opportunity to understand sentence structures thoroughly.
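A repetition pattern of this kind could be surfaced from a conversation log with a short script. The sketch below is only an illustration: it assumes a hypothetical log format (a list of speaker and utterance pairs with the speaker labels "learner" and "lucy"), which is not the actual format of Lucy's logs.

```python
def repeated_inputs(log, min_repeats=2):
    """Map each learner sentence to its longest run of consecutive repeats.

    `log` is assumed to be a list of (speaker, utterance) tuples; only
    turns whose speaker is "learner" are considered.
    """
    runs = {}
    previous, count = None, 0
    for speaker, utterance in log:
        if speaker != "learner":
            continue
        # Ignore case and extra spaces so retyped sentences compare equal.
        key = " ".join(utterance.lower().split())
        count = count + 1 if key == previous else 1
        previous = key
        if count >= min_repeats:
            runs[key] = max(runs.get(key, 0), count)
    return runs


# Hypothetical log excerpt used only to exercise the function.
sample_log = [
    ("learner", "There is a bad smell in my room"),
    ("lucy", "What is it like?"),
    ("learner", "There is a bad smell in my room"),
    ("lucy", "What is it like?"),
    ("learner", "there is a bad smell in my  room"),
    ("learner", "Thank you"),
]

print(repeated_inputs(sample_log))
```

Sentences that surface with long runs are candidates for closer reading in the log, since a run may reflect either deliberate practice or a learner who is not getting the expected response.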

2) Chatbot Lucy's Ability to Match
Lucy conducts conversations with learners by matching patterns to find the most appropriate response to input. Hence, learners may get confused by responses that differ from the commercially trained examples. When this happens, some learners retype the same sentence into Lucy but get the same response (Figure 8).
As shown in Figure 8, when a learner encounters responses different from those in Lucy's world, he/she may repeat the same sentence many times. The learner assumes that Lucy should generate the same chat response as the commercial example by replying to the learner: "I will have someone to check it immediately." As shown in Figure 9, the input words are "There is a bad smell in my room". This Input Pattern should be matched by Lucy's output response. We redefined the output response by typing "I will have someone to check it immediately" into Lucy's training interface. Lucy searches for a path of linked nodes that matches the Input Pattern. We used the Advanced Alter Response as shown in Figure 10 to add a new template to the AIML category. We changed the labeled template into "I will have someone to check it immediately" and saved our change. When we returned to the training interface, we clicked on the Ask Again button to cycle through the complete set of responses. The potential variation, such as the above example in this study, is immense. Like Williams and Compernolle [17], we also discovered that "the lexicon is determined by the amount of time spent by the botmaster entering data and the level of sophistication of the software".
Figure 11 shows that when a learner does not get a response from Lucy in the way that he/she expects, he/she may stop the interaction.
Fig. 11. Lucy asking for donation
When the learner says "I need your help", Lucy presents him/her a response regarding a donation to the ALICE AI Foundation. The learner continues his/her response to Lucy by saying "I do not have money". This conversation begins with a topic off the learning track; hence, the learner stops the conversation with Lucy.
Figure 12 is another example where Lucy responds to a learner through a random matching. The learner aims to practice Restaurant English in this conversation. He/she asks Lucy "Are you a waitress?", which is different from the way that Lucy is trained. So Lucy questions the learner by replying "Am I a waitress? No".
In spite of being refused, the learner now starts to use the sentence example from Lucy's world: "I'd like to have a menu please".
Lucy does not respond to the learner with something meaningful. Lucy's response, "How much would you pay for it?", is incoherent with the learner's illocutionary act.
The learner continues his/her turn by saying "I want to have a menu". In response to Lucy's question "You want only one?", the learner repeats "I want a menu." Lucy asks again, "You want only one?" The learner seems tired of this conversation, so he/she says "Yes". Again, Lucy does not reply to the learner with a meaningful response.
The learner decides to have another try by starting over the conversation using exactly the same sentence from Lucy's world with the hope of continuing the Restaurant English conversation. Unfortunately, Lucy fails to respond to the learner. As a result, the learner stops the conversation.
Lucy's random match to the Input Pattern is problematic. Although Lucy has five characters built into the system, the Default Responses in the Knowledge Web randomly select something meaningful as the chatbot's response to learners. In order to avoid this problem, we redesigned Lucy to be Small Talk Lucy, Hotel Lucy, Waitress Lucy, Tour Guide Lucy and Travel Agency Lucy as shown in Figure 12. There is no initial content built into Small Talk Lucy, Hotel Lucy, Waitress Lucy, Tour Guide Lucy and Travel Agency Lucy. We converted learning content from Lucy's world into AIML and uploaded it into each corresponding Lucy. The example below is the AIML file generated using Pandorawriter. This example only uses atomic categories, which contain only a Pattern and a Template and do not have wildcard symbols, _ or *.

<?xml version="1.0" encoding="UTF-8"?>
<aiml version="1.0">
<category>
<pattern> Hi Lucy </pattern>
<template> Hi there! May I help you? </template>
</category>
<category>
<pattern> There is something wrong with my room </pattern>
<template> What seems to be the problem? </template>
</category>
<category>
<pattern> There is a bad smell in my room </pattern>
<template> I will have someone to check it immediately. </template>
</category>
<category>
<pattern> Can I get another room instead </pattern>
<template> Sure. I will make sure the air conditioning is working in your room. </template>
</category>
</aiml>

We also designed default categories to allow Lucy to respond to learners if the Input Pattern is not found in the Knowledge Web. We used Lucy's training interface and Advanced Alter Response to add some randomly selected, possibly meaningful responses. Besides default categories, we designed recursive categories, which may allow learners to experience some different ways to express the same meaning.
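The effect of splitting Lucy into separate modules can be sketched in Python: each module loads only its own AIML categories, so an input can never be matched against content from another role. This is a minimal illustration under stated assumptions, not how Pandorabots works internally. The module names, the waitress category and both default responses are hypothetical, the XML declaration is omitted for brevity, and only atomic categories are handled.

```python
import re
import xml.etree.ElementTree as ET

# Two of the hotel categories from the example above.
HOTEL_AIML = """<aiml version="1.0">
<category><pattern> Hi Lucy </pattern>
<template> Hi there! May I help you? </template></category>
<category><pattern> There is a bad smell in my room </pattern>
<template> I will have someone to check it immediately. </template></category>
</aiml>"""

# Hypothetical waitress category; the response text is invented.
WAITRESS_AIML = """<aiml version="1.0">
<category><pattern> I would like to have a menu please </pattern>
<template> Here is the menu. </template></category>
</aiml>"""


def normalize(text):
    """Lowercase, strip punctuation and collapse spaces before lookup."""
    return " ".join(re.sub(r"[^a-z0-9 ]", "", text.lower()).split())


def load_module(aiml_text):
    """Parse atomic categories into a {normalized pattern: template} table."""
    root = ET.fromstring(aiml_text)
    return {
        normalize(category.findtext("pattern", "")):
            category.findtext("template", "").strip()
        for category in root.iter("category")
    }


MODULES = {
    "hotel": load_module(HOTEL_AIML),
    "waitress": load_module(WAITRESS_AIML),
}

# Hypothetical per-module default responses (the role of a default category).
DEFAULTS = {
    "hotel": "Could you tell me more about the problem with your room?",
    "waitress": "Would you like to see the menu?",
}


def respond(module, text):
    """Answer from the chosen module only, falling back to its default."""
    return MODULES[module].get(normalize(text), DEFAULTS[module])


print(respond("hotel", "There is a bad smell in my room."))
```

Keeping the fallback inside each module means that even an unmatched input receives a reply that stays on the topic the learner chose to practice.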
The five modules of Lucy's logs show that learners at lower levels of language proficiency benefit from the interaction with Lucy. Learners in this study seem better suited to communicate with Lucy due to the fact that learning outcomes in this study do not require learners to use a variety of language structures and instead require them to practice and review exactly the same sentence structures and grammar as what they learn from examples in Lucy's world.

3) Chatbot Lucy's Ability to Provide Feedback
Another important feature that we designed in Lucy is her ability to provide spelling and grammar correction feedback. Continuous feedback is difficult to mimic, much less to produce in a random fashion. The main difficulty for a chatbot in checking spelling and grammar is that an optional list of candidate words cannot be built into the system. Hence, we used logs to identify learners' spelling and grammar errors and entered the data into Lucy. This means that the more learners use Lucy, and the more spelling and grammar feedback data is entered into Lucy, the more robust Lucy becomes. For example:

<category>
<pattern>I WOULD LIKE TO HAVE A SMKING ROOM</pattern>
<template>Do you mean smoking? <think> <set name="it"> <set name="want"><set name="topic">to have <person/></set></set> </set> </think></template>
</category>
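Part of this log-driven process could be assisted by a script that flags unmatched learner inputs that are very close to a trained pattern and drafts a feedback category for the botmaster to review. The sketch below is an assumption-laden illustration, not the procedure used in this study: it relies on Python's standard `difflib` similarity matching, the pattern list and the similarity cutoff are hypothetical, and the generated template omits the `<think>` and `<set>` elements shown in the example above.

```python
import difflib
from xml.sax.saxutils import escape

# Hypothetical list of patterns Lucy has already been trained on.
KNOWN_PATTERNS = [
    "I WOULD LIKE TO HAVE A SMOKING ROOM",
    "THERE IS A BAD SMELL IN MY ROOM",
]


def suggest_corrections(unmatched_inputs, known_patterns, cutoff=0.9):
    """Pair each near-miss input with the trained pattern it resembles."""
    suggestions = {}
    for text in unmatched_inputs:
        key = " ".join(text.upper().split())
        close = difflib.get_close_matches(key, known_patterns, n=1, cutoff=cutoff)
        if close and close[0] != key:
            suggestions[key] = close[0]
    return suggestions


def feedback_category(wrong, right):
    """Draft an AIML category that asks about the word(s) that differ."""
    wrong_words, right_words = wrong.split(), right.split()
    if len(wrong_words) == len(right_words):
        hint = " ".join(r for w, r in zip(wrong_words, right_words) if w != r)
    else:
        hint = right  # word counts differ: echo the whole intended sentence
    return (
        "<category><pattern>{}</pattern>"
        "<template>Do you mean {}?</template></category>"
    ).format(escape(wrong), escape(hint.lower()))


# Hypothetical unmatched inputs taken from a log.
log_inputs = ["I would like to have a smking room", "Thank you"]

for wrong, right in suggest_corrections(log_inputs, KNOWN_PATTERNS).items():
    print(feedback_category(wrong, right))
```

A high cutoff keeps the suggestions conservative, so that only inputs differing by a letter or two from a trained sentence are proposed; the botmaster still decides which drafts are entered into Lucy.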

D. Discussion
The design of Lucy aims to help learners review and practice exactly the same sentence structures learned from Lucy's world. In response to Fryer and Carpenter [11] and Williams and Compernolle [17], we find that a chatbot can be designed to repeat the same material with learners endlessly. We believe that this is one of the affordances that a chatbot may provide to language learners at intermediate or lower levels. Lucy has a speech recognition system installed, which aims to help learners practice both listening and reading skills. Lucy does not help learners review spellings and sentence structures.
In our case, we redesigned Lucy to provide learners an opportunity to use a variety of language structures and vocabularies that they ordinarily would not have a chance to use. Learners do not have opportunities for generating their own output due to the fixed sentence structures designed in Lucy's world. Lucy's world doesn't provide affordances for learners to negotiate Lucy's expressions. We opened up some opportunities for learners to try limited language structures with the hope of engaging them in active conversation. But Lucy is very limited in providing learners an opportunity to use a variety of language structures and vocabulary because of the amount of time it requires to enter data into the system and the level of sophistication of the software.
Furthermore, "feedback is a classical concept in learning, whose importance is acknowledged across different learning theories" [19]. Data analyzed in our study shows that chatbots can provide effective feedback for learners' spelling and grammar, but it depends on extensive entry of error data into the system.

VI. CONCLUSION
Discourse analysis of learning trails plays an important role in designing interactive intelligent language tutor systems for language learners at intermediate or lower levels.
We found great advantages in chatbot technologies in that they offer language learners realistic opportunities for individual tutoring. Language learners can tailor a chatbot for their own pace of learning: They can enter an answer to each question, repeat a sentence without pressure, or skip sentences that do not make sense to them or are difficult to identify.
Chatbots can potentially simulate human-like communication. Language learning implies corresponding cultural learning, and understanding culture is a key to understanding language use in context. Chatbot Lucy does not contain this feature. Language learners who eventually communicate with native speakers require cultural knowledge of the target language.
Another issue that we experienced in this study is continuous feedback. Continuous feedback requires very fast interpretation of learners' input on the fly. We found that chatbot technology has a limitation in how to quantify and model continuous feedback and handle the fast integration and interpretation.
By applying learning analytics to understand the design of the intelligent chatbot Lucy, this study generates important findings for scrutinizing student-produced data and learning trails for the design of learning technologies. This study opens up possibilities for connecting and analyzing students' data trails. Approaches developed in this study can be useful in studying an instructional innovation through the lens of text-based messages [18]. Insights gained from this study can also inspire additional learning technology research.