Potential Data Collections Methods for System Dynamics Modelling: A Brief Overview

System Dynamics (SD) modelling is a highly complex process. Although the SD methodology has been discussed extensively in both seminal and current literature, data collection methods for SD modelling are not explained in detail in most studies. To date, comprehensive descriptions of knowledge extraction for SD modelling are also still scarce. In an attempt to fill this gap, the three primary groups of data sources proposed by Forrester, namely (1) the mental database, (2) the written database and (3) the numerical database, were reviewed, together with the potential data collection methods for each database, taking into account advances in current computer and information technology. The contributions of this paper are threefold. First, this paper highlights the potential data sources that deserve to be acknowledged and reflected in the SD domain. Second, this paper provides insights into the appropriate mix and match of data collection methods for SD development. Third, this paper provides a practical synthesis of potential data sources and their suitability according to the SD modelling stage, which can serve as modelling practice guidelines.
Keywords: System dynamics modelling; data collection methods; data source; system dynamics methodology


I. INTRODUCTION
System Dynamics (SD) was developed in 1956 by Jay W. Forrester, a former electrical engineer and researcher at the Massachusetts Institute of Technology. He successfully incorporated knowledge of control theory from electrical engineering into management science through simulation models [1]-[3].
Generally, in simulation, the word "system" refers to "what, from the real world, is being simulated" [4]. The "what" can be people, machines and/or resources. A model is a "representation of an event and/or things that are real (a case study) or contrived (a use case)". A simulation is "a method for implementing a model over time" [4]. With SD, real-world problems or systems of interest are modelled through conceptual (qualitative) and quantitative methods [5]. The available information about the system of interest is collected and organised in SD software to form computer simulation models [6].
Interestingly, pieces of information can come from various sources and types. They are not just in numerical form, but also comprise mental knowledge and other qualitative forms as well [3], [7], [8]. Modellers or SD experts have to depend on their expertise and skills to collect and synthesise this information and transform it into an SD model through SD methodology [9].
Although the SD methodology has been discussed extensively in most classic literature, methods to incorporate qualitative and quantitative data during the modelling process are not explained in detail by the most influential authors [10]. To date, there is still no fixed guide or comprehensive description of how to incorporate them in SD development [11]. This has raised a few questions.
What method should be used to gather data as a suitable information source? At what stage in the modelling process are these data useful? How are qualitative data and numerical (quantitative) data linked to SD methodology? This paper therefore aims to provide an overview of potential data sources and possible data collection methods that can be practically helpful in SD model development.
In an attempt to answer these questions, an initial search was conducted in online publication databases. Related papers on data sources and data collection methods for SD modelling were compiled for review [12]. Relevant articles were collected from search engines and publishers including Google Scholar, System Dynamics Review, Science Direct, Taylor & Francis, Sage Publications and Emerald Publishing. The keywords used were 'knowledge elicitation for System Dynamics', 'knowledge elicitation for System Dynamics modelling', 'data collection methods for System Dynamics', 'data collection methods for System Dynamics modelling' and 'data source for System Dynamics'. The papers were further analysed to connect any identified keywords with the related questions. Further elaborations were added based on expert suggestions.
The remainder of this paper is divided into three sections. Section II presents a literature digest based on the three primary data sources for SD modelling. Section III explains in detail the potential data collection methods based on four SD methodology stages. Lastly, Section IV gives the concluding remarks.

II. DATA SOURCE FOR SYSTEM DYNAMICS MODELLING
Forrester, the founder of SD, suggested three important data sources: the mental database, the written or textual database, and the numerical database [13] (see Fig. 1). The first two are crucial in defining the non-linear relationships that control and generate normal behaviour [6]. Unlike these two databases, the numerical database does not reveal the cause-and-effect directions of the variables. It is nevertheless crucial in model testing and as an input for running the SD model [14].
*Corresponding Author. www.ijacsa.thesai.org
Although these three types of data source are well known in SD modelling, their descriptions are still too general, and the methods for obtaining the required data are not explained in detail. Moreover, with today's technology, new methods may prove more beneficial than traditional ones. Several researchers have made suggestions, but their papers focus on only one or two specific data sources, not all three. For example, some researchers suggested that potential data sources be retrieved through social science data collection and analysis methods [8], [10], but this works well mostly with the mental and written databases. In separate papers, other researchers suggested borrowing methods from the Artificial Intelligence, Data Science and Big Data domains [15]-[17], which suit the numerical database well. Therefore, this section provides a revised literature review on the data sources and data collection methods for SD modelling based on the three aforementioned categories. Each is explained in Sections A, B and C, respectively.

A. Mental Database
The first data source is a mental database. The mental database is the knowledge that lies inside the stakeholder's head [2], [13], [18]. This type of expertise involves the internal representations of reality that stakeholders use to understand, believe, reason about, and predict events [2], [6], [19], [20]. It is commonly expressed in oral linguistic communication by the stakeholders [21].
The stakeholders are the leading players or actors in SD modelling projects. Usually, the actors are the problem owners or clients, analysts, modellers, facilitators and other experts involved in the case study of interest. Stakeholders are generally a valuable primary source of information [13]. The value of their information resides in their local contextual knowledge, perspectives, preferences and values. It is also noted that stakeholders' reasoning, observation and imagination are not bounded by scientific rationality. On the one hand, this can be beneficial when dealing with poorly structured and complex problems [22]. On the other hand, some may question its accuracy in representing reality [20]. Forrester acknowledged that the mental database is trickier because it is very rich in knowledge, often missed and hard to elicit [6], [23]. In line with Luna-Reyes and Andersen's suggestion, most SD researchers agree that social science methods are a suitable approach for extracting the mental database [10]. Fig. 2 shows knowledge elicitation from the mental database to the written database and numerical database. The data collection methods can be applied whenever possible.

Fig. 2. Extracting and Collecting Mental Database for Written Database and Numerical Database [10].
As further explanation, listed below are ten suggested data collection methods for the mental database.

1) Interviews:
Interviews allow for two-way communication between interviewer and interviewee(s) [10]. Interviewees are free to share their stories and opinions and to provide descriptions in their own words. Ethically, any recording should be made only with permission. Interviews can be carried out in four ways.
First, interviews can be conducted face to face. Usually, this type of interview is arranged by appointment, as agreed by both parties. Interview sessions can be recorded using a voice recorder or written down in a notebook. Secondly, interviews can be done via a communication medium such as the phone or Voice over Internet Protocol (VoIP) applications like Skype and Internet Phone [24]. The conversations can be recorded with supporting software. Thirdly, digital interviews can be conducted using text applications like Telegram, WhatsApp [25], Facebook or electronic mail (e-mail) [26]. The conversation is carried out in textual form, so no transcribing effort is needed. Fourthly, interviews can be held through video conferencing platforms such as Zoom, Cisco Webex and Microsoft Teams. These platforms have proven useful, especially when the interviewer and interviewee are geographically apart. The interview sessions can be digitally recorded and safely stored (depending on the application) with permission.
After the interviews are over, all the collected interview data are transformed into text. The text is analysed for patterns, themes, definitions, stories or any key aspects that the researcher is looking for. This method is incredibly useful in discovering and building a dynamic hypothesis and in understanding the overall system process.

2) Oral history:
Oral history [27] helps to obtain specific information or gain perspectives where no written evidence is available. This method helps discover and provide a basis for building dynamic hypotheses about how the system works and how changes happened [10].

3) Focus group:
Focus groups are group interview sessions with eight to twelve individuals. This method can also be paired with either in-depth individual interviews or surveys [28]. It is useful for discovering and building dynamic hypotheses and for understanding how the overall system works based on respondents' shared beliefs.

4) Delphi groups:
Delphi is similar to the focus group method and extends it, and it can also be accompanied by surveys or interview analysis [29]. Besides face to face, Delphi can also be done through online discussions. The Delphi method helps the researcher reach a good understanding of critical issues, fact-finding, exploration, or discovering what is actually known or not known about the problem situation, including the group's consensus and disagreements.

5) Observation:
Observation can provide a great deal of information regarding social structures, cultures, processes, and human interactions [30]. Observations need to be recorded in written form, either on paper or digitally. This method requires strong dedication, as an observer may need to observe and collect data for a long time. A skilful observer will capture useful observational data that can satisfy the requirements of the SD model.

6) Participant observation:
In this method, the researcher is visible to participants under the non-strict assumption that the researcher will interact with the subjects of study in their study situation. During observations, researchers may collect data through diaries, notebooks, notes, or any other documents produced by the participants being studied. These are precious sources of information, as they can be used to support primary data sources (i.e. interview data) [10].

7) Experimental approach:
Data collected through an experimental approach can take many forms, both numerical and qualitative. If the findings show a different view of the issues, the modeller can contact the actors and discuss the other views. If possible, the modeller may record the differences for further analysis [10].

8) Questionnaires:
Some modellers begin by building a small SD model first and then giving a questionnaire to the participants to get their feedback. The questionnaires can be closed-ended or open-ended. The closed-ended type is primarily employed when the modeller wants to see whether respondents agree or disagree with specific issues. Open-ended questions are mainly used when the modeller wants respondents to brainstorm (identifying variables), rank order information, and produce causal reasoning. A questionnaire can also be used as a means to find parameter values for variables [31]. Questionnaires are suitable for a group of people who are geographically dispersed, or when the number of people in the interest group is large [23]. This type of data collection can also be employed within the Delphi approach and in multiple focus groups.

9) Group building:
Using this method, selected group members or stakeholders are gathered physically in one place to brainstorm together [23]. The aim is to build the model in a team where team members can communicate and share their mental databases [9], [20], [32]-[35]. Group building might involve one or more sessions to build the conceptual model. Through communication, stakeholders from different domains can share knowledge, build understanding and reach consensus [34], [36]. In SD, developing a model in a team is known under several names, such as Participatory Modelling, Participatory Simulations, Mediated Modelling, Group Model Building, Shared Vision Planning, Collaborative Learning and perhaps many more [32]. In Operational Research (OR), this can be established in two ways: building with an expert, and building with a facilitator [37]. Building with an expert involves an OR consultant handling the client's problem or situation. Appointments and meetings are set up, and based on the information shared, the OR consultant builds a model to develop an optimal solution. In the facilitator form, the consultant and the client co-develop a model together, perhaps in a series of workshops. Although the two approaches differ slightly, both are very interactive, making them suitable for engaging stakeholders.

10) Meetings in social media:
This method requires all participants to have reliable internet access, the same applications installed on their computers or phones (e.g. Zoom, Cisco Webex and Microsoft Teams), and registered accounts. These platforms have proven useful, especially when participants are physically far apart [38]. This method is believed to be able to replace physical meetings whenever required, for example during lockdowns, quarantine periods or work-from-home arrangements during COVID-19 outbreaks. Social media meetings for data collection can be combined with other approaches such as interviews, focus groups, Delphi, group model building and perhaps many more. Several SD experts have recently promoted social media as a platform for data collection and as a communication medium for SD development.

B. Written or Textual Database
The written database contains information in the form of text. Some researchers recommend the use of text analytics tools and text analysis software [39], also known as Computer-Aided Qualitative Data Analysis Software (CAQDAS), to support analysis activities [40]; examples include NVivo and ATLAS.ti. With such software, non-textual evidence such as videos, images, audio recordings and pictorials can also be analysed and used as supporting evidence. This still poses challenges, as modellers are required to move between qualitative and quantitative data. However, qualitative analysis helps modellers ground textual information and apply it in the model building process. It also allows modellers to create a storyline of the system of the case [10] (see Fig. 3). A close example of this approach is the document-model-building strategy [41]. Five qualitative analysis methods from the social sciences, as suggested by Luna-Reyes and Andersen [10], seem suitable here. This paper adds one more, Semantic Analysis [42], to the group. The six methods are briefly explained in (1) to (6).

1) Hermeneutics
• Description: A qualitative analysis of any written text from documents, transcribed conversations, images, analogue recordings (audiotapes or videotapes), digital audio recordings and video recordings.
• Purpose: To find meanings and patterns of relationships. This includes how they are linked to specific characteristics or expressions of specific themes in a particular study context, and whether they support or contradict one another.

2) Discourse Analysis
• Description: A qualitative method used to study people's interactions in their natural settings. This method is suitable for use with observation as the data collection method.
• Purpose: To understand interactions and patterns of behaviour.

3) Grounded Theory
• Description: A set of techniques employed to spot themes or concepts across texts. The methods can be performed on any textual data such as promotional adverts, interview transcriptions, memoranda, memorabilia and meeting minutes.
• Purpose: To link these concepts and to generate meaningful theories.

4) Ethnographic decision model
• Description: The researcher's interviews are oriented toward a specific decision or policy in the system.
• Purpose: To understand the reason behind a person's decision in a particular circumstance. This approach can help the modeller to build a decision tree (or dendrogram) describing the decision alternatives and processes.

5) Content analysis
• Description: Consists of a deductive coding technique, where the researcher chooses and defines a set of codes to be used. Researchers then organise their data into a matrix of codes and texts according to the unit of analysis selected for the study. The matrix data can be analysed using almost any statistical method, for example to test the level of agreement between coders, or analysed qualitatively.
• Purpose: To analyse the meaning of content, or causal relationships within texts, photographs, films or digital resources. This is carried out quantitatively using statistical methods, or by qualitative methods.

6) Semantic analysis
• Description: Consists of a process of extracting meaning from text or digital resources such as video recordings. This process is useful in obtaining an understanding of the system and meaning from documents.
• Purpose: To build or validate knowledge representations about the problem domain in a particular context.
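As an illustration of the coder-agreement test mentioned under content analysis, the following minimal Python sketch cross-tabulates the codes assigned by two coders to the same text segments and computes Cohen's kappa. The code labels, the segment data and the choice of Cohen's kappa are assumptions made for this example only; the reviewed literature does not prescribe a particular statistic.

```python
from collections import Counter

# Hypothetical codes assigned by two coders to the same six text segments.
coder_a = ["delay", "feedback", "delay", "stock", "feedback", "stock"]
coder_b = ["delay", "feedback", "stock", "stock", "feedback", "stock"]


def agreement_table(a, b):
    """Cross-tabulate the two coders' codes: (code_a, code_b) -> count."""
    return Counter(zip(a, b))


def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    freq_a, freq_b = Counter(a), Counter(b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)


table = agreement_table(coder_a, coder_b)
kappa = cohen_kappa(coder_a, coder_b)
print(table[("delay", "stock")])  # segments coded "delay" by A but "stock" by B
print(round(kappa, 2))            # 0.75 for the data above
```

A kappa close to 1 suggests that the coding scheme is applied consistently, whereas a low value suggests that the code definitions should be revisited before the coded text is used to derive model variables.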

C. Numerical Database
The third data source is a numerical database. According to Forrester, the numerical database is valuable in several ways [6]. Firstly, the numerical database is useful for setting parameter values, which mainly serve as the input of the model. Secondly, numerical data can summarise characteristic behaviour between variables. Thirdly, numerical data can contain time-series information, which is often best used for comparison with model output. Fourthly, numerical data allow SD simulation to be used for quantitative analysis. This brings out the quantitative side of SD, which can provide insights for possible improvements [43].
In some sense, it is believed that numerical data can provide more accurate and reliable insights than qualitative data [14]. At the same time, numerical data are often discriminated against when determining model parameters [6]. Sterman specifically highlighted in his book Business Dynamics that "…no numerical data are available for many of the variables known to be critical to decision making…" [14]. This has been true for years. As a solution, Sterman suggested using proper statistical methods to estimate parameters and to assess the model's ability to replicate historical data when numerical data are available, and looking for alternative ways of measurement whenever no numerical data are available [44].
In its early years, SD was not designed for numerically data-intensive applications. SD was initially intended for small-data or poor-data situations [45]. Traditional SD applications are usually fed with data from spreadsheets such as Microsoft Excel or from CSV files [46]. However, in the era of Big Data (BD), Data Science (DS), the Internet of Things (IoT), Business Intelligence and Analytics (BIA) and Industry 4.0, the opportunities for SD to expand its capability seem promising and very inviting [16], [47]-[52]. With the blooming … (3) to analyses and interpret model-generated data" [15]. Thus, the use of these available data sources deserves to be reflected. Not every case can be considered a big data case; some numerical data are not big but adequate. This therefore depends on the case study.
Apart from the mental database and written database, databases and data warehouses have become new goldmines of potential numerical data. Although many more specialised tools have yet to be developed [46], several SD software packages support database connectivity, for example (1) Vensim (DSS version) [53], (2) AnyLogic [54], (3) Powersim and (4) iThink. Some, such as STELLA, use CSV file transfer [46]. There are also free open-source tools such as SimSyn, which comes with a graphical user interface (GUI) and connects Vensim to a PostgreSQL database [46]. Other examples include third-party tools such as PySD, which can connect traditional SD models from Vensim, iThink or STELLA with databases and other models [16]; XMILE (eXtensible Model Interchange LanguagE), which allows an SD model to connect with databases and other analytical tools [55]; and the DDE (Dynamic Data Exchange) protocol, which allows data transfer between SD and other models, tools and databases.
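To make the idea of database connectivity concrete, the following minimal Python sketch reads a time series from a relational database and exposes it as an interpolated lookup that an SD model could sample at each time step. It uses an in-memory SQLite database from the Python standard library as a stand-in for an operational database; the table name, column names and values are hypothetical, and the sketch does not reproduce the interface of any of the SD tools named above.

```python
import sqlite3

# In-memory database standing in for an operational database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demand (t REAL PRIMARY KEY, units REAL)")
conn.executemany(
    "INSERT INTO demand VALUES (?, ?)",
    [(0.0, 100.0), (1.0, 120.0), (2.0, 90.0), (3.0, 110.0)],
)


def load_series(connection):
    """Return the demand series as (time, value) pairs sorted by time."""
    rows = connection.execute("SELECT t, units FROM demand ORDER BY t")
    return [(t, v) for t, v in rows]


def lookup(series, t):
    """Linearly interpolate the series at time t; hold end values outside the range."""
    if t <= series[0][0]:
        return series[0][1]
    for (t0, v0), (t1, v1) in zip(series, series[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return series[-1][1]


series = load_series(conn)
print(lookup(series, 0.5))  # value between the first two records
```

Holding the end values outside the recorded range is one possible design choice; a modeller may prefer to raise an error instead, so that a simulation horizon longer than the available data does not go unnoticed.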
Besides databases, outputs from other simulation models can also be potential data sources for an SD model [9], [15], [16], [48], [56]. If one model's output becomes the input to a second model in a single flow, 'loose coupling' between the two models seems a good approach. For more complex interactions between two (or more) models, there are many possible ways to couple them, including a multi-directional flow of data. The choice depends on the functional suitability of the modelling approaches [9]. Potential data sources can also come from Data Science methods such as data mining. Machine learning can be moulded into techniques that extract selected data from a pool of data and use them as inputs to feed SD models. Beyond Big Data, some studies have already moved on to real-time data streams [15], [48], [56], [57]. It is no longer a surprise that many SD researchers have already started initiatives to use big data sources [17], [46], [56]-[58].
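The loose coupling described above, where one model's output feeds a second model in a single direction, can be sketched as follows. Both models are hypothetical placeholders written for this illustration: the upstream function stands in for any other simulation model, and the downstream function is a single-stock SD model integrated with a simple Euler scheme.

```python
def upstream_model(periods):
    """Stand-in for another simulation model: arrivals per period (hypothetical)."""
    return [10.0 + 2.0 * k for k in range(periods)]


def sd_model(inflows, initial_stock=0.0, residence_time=4.0, dt=1.0):
    """Single-stock SD model driven by an external inflow series (Euler integration)."""
    stock = initial_stock
    history = [stock]
    for inflow in inflows:
        outflow = stock / residence_time  # first-order drain of the stock
        stock += (inflow - outflow) * dt
        history.append(stock)
    return history


arrivals = upstream_model(3)  # output of the first model ...
backlog = sd_model(arrivals)  # ... becomes the input of the SD model
print(backlog)
```

Because data flow in one direction only, the two models can be run one after the other and exchanged through a file or a database table. A multi-directional coupling would instead require the models to exchange values at every time step.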

III. ALIGNING DATA COLLECTIONS METHODS WITH SYSTEM DYNAMICS METHODOLOGY
SD researchers classify SD methodology into two main streams: Qualitative SD and Quantitative SD. Some SD researchers argue that developing qualitative models alone may not be enough to address the problem fully, because SD relies on quantitative data to generate feedback models in simulation. This feedback provides insights for further improvements and provides a sense of certainty in prediction [21]. However, some researchers have claimed to utilise both types of SD in their work. The rationale is that the early stages of SD methodology emphasise qualitative knowledge, which then becomes the foundation of the quantitative approach [43]. This view seems to be shared among SD experts. Therefore, this paper focuses on the combination of both types.
Based on classic literature, SD methodologies are organised in several stages, ranging from three to seven [10], [59]. Although they have different numbers of stages, the foundations of the modelling process are quite similar [10]. For this paper, the four stages of SD methodology proposed by [9] and [10] are adopted together. The stages are distinguished as follows: (1) problem conceptualisation, (2) model formulation, (3) model verification and validation, (4) model use and application. This SD methodology framework is used as a reference frame to discuss how qualitative data and numerical (quantitative) data are linked to the SD methodology and at what stage they are useful. As a result of this alignment, potential data collection methods for SD methodology are organised in Table I.

A. Problem Conceptualisation
The problem conceptualisation stage is considered a 'qualitative stage' by most SD researchers [10], [39], [60]. In this stage, the SD model's purpose needs to be determined and justified through problem identification activities [9], [60]. The problem conceptualisation process involves framing and structuring the problem of the case. How stakeholders see and perceive the problem situation can be diverse and very subjective. If uncertainty is of concern, then the uncertainty elements must be considered in the context of the model's purposes. This process strongly relies on experts' or modellers' ability to extract the knowledge that resides in the heads of experts, modellers, and the rest of the stakeholders [61].
After critical stakeholders are identified, meetings and appointments are set up and scheduled. This is important because a qualitative understanding of the problem case can only be gained through communication and interaction with stakeholders [33]. These activities may include social learning by interest groups, knowledge elicitation and review, data assessment, discovering coverage, limitations, gaps, inconsistencies and many more, as explained in [9].
The data collection methods suggested at this stage are mostly qualitative. Examples are group model building, interviews, oral history, focus groups, hermeneutics, discourse analysis and content analysis [10]. This is important to fulfil the SD model requirements. Suitable data are collected and selected for model development in the early stage. The rationale is to ensure that the case data are sufficient to describe, at a minimum, the key variables. The system feedback also needs to be understood well enough to provide plausible estimates for representing the relationships mathematically [9].
Traditionally, face-to-face communication is encouraged throughout the SD stages, especially in the early stage. It is the most effective way to increase understanding, with better engagement and fewer distractions. However, one may opt for social media meetings as an alternative if physical contact is impossible. This is useful, particularly during quarantine periods due to COVID-19 [38]. Although online discussion may seem a promising solution, most communication theories argue that it is not as effective as face-to-face discussion [62]. Therefore, if this approach is chosen, modellers have to embrace the advantages of the technology, bear its disadvantages and plan their data collection as best as they can.

B. Model Formulation
Model Formulation is a stage where the concept of a dynamic hypothesis model is translated into the formal quantitative model. In other words, this can be described as the transfiguration of a qualitative conceptual model to a quantitative numerical model.
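As a minimal illustration of this translation, consider a purely hypothetical dynamic hypothesis: "the more adopters there are, the more people hear about a product, but the pool of potential adopters is limited". The Python sketch below expresses it as one stock (adopters) and one flow (adoption rate) and integrates it with the Euler method. The variable names, parameter values and equation form are assumptions chosen for this example and are not taken from the reviewed literature.

```python
def simulate_adoption(total=1000.0, adopters=10.0, contact_rate=0.5,
                      dt=0.25, steps=80):
    """Euler-integrate one stock (adopters) fed by one flow (adoption rate)."""
    path = [adopters]
    for _ in range(steps):
        potential = total - adopters  # balancing loop: the pool runs out
        # Reinforcing loop: more adopters generate more word of mouth.
        adoption_rate = contact_rate * adopters * potential / total
        adopters += adoption_rate * dt
        path.append(adopters)
    return path


path = simulate_adoption()
print(round(path[0], 1), round(path[-1], 1))  # initial and final adopters
```

The qualitative statement fixes the structure (which variables exist and how they influence each other), whereas the parameter values such as the contact rate still have to be obtained from the mental, written or numerical databases.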
Formulating and designing a model is not a straightforward process. In this process, modellers need to use their understanding and judgmental data to build the model. The initial SD model slowly evolves and expands iteratively over more than one attempt. The modeller's judgment in selecting methodologies for developing SD models is critical to the model's results. Since different SD mappings lead to different results, the selection depends on how well a specific SD mapping can support the model objective [9], [63]. It also determines whether the method can address the possible interests, decision options, and impacts. This is because the formulation of linear and non-linear functions is a highly qualitative process. In this stage, modellers must gather as much information as possible.
Most of the time, modellers have to utilise whatever they can to incorporate variables and parameters into the model. Usually, modellers look at published and unpublished academic and industry documents, including reports, to obtain parameter values and model variables, or to get ideas about the structures of similar models and the components of a system. A systematic or non-systematic review framework can be employed in document collection to seek relevant resources in a more organised manner [41]. In qualitative modelling, especially the document model building approach in SD, hermeneutics, content analysis methods like the Decision Making Trial and Evaluation Laboratory (DEMATEL) [64], and text analysis [39] are helpful for addressing the cause-and-effect relationships among components of a system [41]. Similarly, grounded theory and ethnographic decision models can guide and enrich the identification of critical structures and formulations based on meaning and connections [10], [39]. In some cases, statistical analysis, such as regression analysis, helps address the relationships between components from multiple sources [65].
With today's technology, knowledge is available in more than just plain textual form. Digital information can be a valuable pool of information too, for example information in pictorial form such as infographics from social media (Facebook, Twitter) or online newspapers. Furthermore, essential information can lie inside video or audio recordings, whether analogue or digital. Knowing where to search and how to capture and analyse information is therefore critical, because the types of available data can shape the model's mapping [9].
It is also widespread practice for the modeller to consider variables and non-linear relations for which quantitative data are not available. This process can be accompanied by additional qualitative techniques to add formality. Vital sources can come from interactions with individuals, groups, and clients [37]. For mental database elicitation, a number of methods appear to be more beneficial for obtaining the system structures, parameters, and policies to be included in the model [10], such as interviews, focus groups with Delphi, observation, participant observation [14] and many more. Besides physical communication, online communication [38] can also play an essential role in data collection, such as online meetings via social media, including online interviews, phone interviews and interviews via e-mail, or in combination with other methods such as focus groups with online interviews, or surveys. Typically, all of the collected information is transcribed into text and analysed.
Later in this stage, qualitative data may become less useful and quantitative data start to take over [10]. The most common way to elicit parameter values from stakeholders is through interviews, group sessions, or Delphi. The modeller can ask group members to estimate an unknown parameter individually. After collecting the initial individual judgments, the modeller gives back a summary of the values gathered. Besides mental and written sources, numerical data can be retrieved from CSV files [46], through direct connections to databases and data warehouses [65], or from devices like sensors or meters [66]. These data are usually favoured because of their completeness. The numerical figures can be presented as single numbers, as time series [16], [66], or as streaming data [15], [46], [65]. If multiple simulation models are involved, the output data from other simulation models can also serve as input for SD, depending on the model objectives [9], [16]. Apart from that, modellers have to determine the adequacy of the data in terms of size. If real-time simulation is part of the model objective, then suitable tools must be used to feed the SD model smoothly.
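As a simple illustration of how regression analysis can quantify a relationship between two components, the Python sketch below fits a straight line by ordinary least squares. The variable names and data points are hypothetical; in practice the observations would come from the documents or numerical sources described in this section, and a linear form is only one possible assumption about the relationship.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations of two model components.
price = [1.0, 2.0, 3.0, 4.0]
demand = [9.0, 7.0, 5.0, 3.0]

slope, intercept = fit_line(price, demand)
print(slope, intercept)  # -2.0 and 11.0 for the data above
```

The fitted slope could then serve as a parameter in the corresponding model equation, subject to the modeller's judgment about whether the relationship is plausibly causal.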
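The group elicitation step described above, in which individual estimates are collected and a summary is given back to the group, can be sketched as follows. The estimates, the second round and the use of the median as the final parameter value are assumptions made for this illustration; the source only states that a summary of the gathered values is returned to the group.

```python
import statistics


def summarise(estimates):
    """Summary of one elicitation round, to be reported back to the group."""
    return {
        "median": statistics.median(estimates),
        "low": min(estimates),
        "high": max(estimates),
        "spread": max(estimates) - min(estimates),
    }


# Hypothetical individual estimates of an unknown delay parameter (in weeks).
round_1 = [4.0, 6.0, 5.0, 9.0, 6.0]
# Hypothetical revised estimates after members have seen the round-1 summary.
round_2 = [5.0, 6.0, 5.5, 7.0, 6.0]

feedback_1 = summarise(round_1)
feedback_2 = summarise(round_2)
converging = feedback_2["spread"] < feedback_1["spread"]
parameter_value = feedback_2["median"]  # candidate value for the SD model
print(feedback_1, feedback_2, converging)
```

A remaining wide spread after several rounds may itself be useful information, as it indicates a parameter that deserves sensitivity analysis.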

C. Model Verification and Validation
The validity of a model is assessed according to the purpose for which it is developed [42]. Knowing the purpose of the model can help determine which data patterns are important for model evaluation. This stage often iterates back and forth with the "Model Formulation" stage if there are additional changes in the model structure. To be useful, simulation models must resemble the problem owner's environment in the real world. Generally, there are two common types of test used to increase confidence in the SD model [9], [42], [67]: structural testing and behavioural testing.
In structural testing, the model is compared directly with the real system structure. These tests are performed to see how well the model's logic represents the system's real-world structure [9], [67]. They also examine whether the model, including its mathematical equations, makes sense. Evaluation of the model structure is often hard to formalise and quantify, so it is usually conducted qualitatively. To ensure the logic is right, stakeholder verification is needed [35]. The stakeholders can be experts, analysts, or problem owners. Verification can be done through face-to-face or online interviews, focus groups, Delphi groups, experimental approaches, walk-throughs, formal inspections, or semantic analysis [10], [42]. Cross-checking with secondary sources such as reports, statistical yearbooks, and observations can increase the model's reliability and validity [68].
In behavioural testing, the aim is to assess how closely the model outputs replicate the behaviour of the real-world system. Typically, this is achieved by looking at general patterns produced by the model, for example growth, decline, and oscillation [67]. One way of doing this is a statistical comparison of the data against the model output. Where adequate data are available, this is usually done through goodness-of-fit measures such as the correlation coefficient, root mean squared error, mean absolute relative error, maximum relative error, and discrepancy coefficient. In this context, numerical data from historical records, operational data, databases or data warehouses, or any data observed in the field are valuable for testing.
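Several of the statistical comparisons named above can be sketched in a few lines of Python. The example below is a minimal illustration using only the standard library. The function names and the two short series are hypothetical, and the discrepancy coefficient is left out because its exact definition varies between sources.

```python
import math

def correlation(observed, simulated):
    """Pearson correlation coefficient between data and model output."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    dev_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed))
    dev_s = math.sqrt(sum((s - mean_s) ** 2 for s in simulated))
    return cov / (dev_o * dev_s)

def rmse(observed, simulated):
    """Root mean squared error."""
    n = len(observed)
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(observed, simulated)) / n)

def relative_errors(observed, simulated):
    """Absolute relative error per time point (observed values must be non-zero)."""
    return [abs(s - o) / abs(o) for o, s in zip(observed, simulated)]

def mean_absolute_relative_error(observed, simulated):
    errors = relative_errors(observed, simulated)
    return sum(errors) / len(errors)

def max_relative_error(observed, simulated):
    return max(relative_errors(observed, simulated))

# Hypothetical historical data and model output at the same time points.
observed = [100.0, 110.0, 121.0, 133.0]
simulated = [100.0, 108.0, 123.0, 130.0]

print("r    =", correlation(observed, simulated))
print("RMSE =", rmse(observed, simulated))
print("MARE =", mean_absolute_relative_error(observed, simulated))
print("MaxRE=", max_relative_error(observed, simulated))
```

The sketch assumes that the observed and simulated series are already aligned on the same time points. In practice, model output usually has to be sampled or interpolated to match the timestamps of the data first.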
Both the structural test and the behavioural test are highly needed, especially in poor data situations. After all, most SD practitioners agree that it is rare to have adequate data for every SD model variable [9], [60]. Therefore, formal model testing should be done whenever necessary. At the same time, other evaluations such as sensitivity analysis, peer review, results from pattern analysis, and model comparison analysis can be used as complements [42]. All of these serve to ensure that the output produced by the model is reasonable and acceptable.
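To illustrate what a basic sensitivity analysis can look like, the following Python sketch varies one parameter of a deliberately simple one-stock model and records the effect on the final stock value. The model, the function names simulate_stock and sensitivity_sweep, and all parameter values are hypothetical. Euler integration with a fixed time step is assumed purely for brevity.

```python
def simulate_stock(initial, growth_rate, dt, steps):
    """Euler integration of a single stock with inflow = growth_rate * stock."""
    stock = initial
    trajectory = [stock]
    for _ in range(steps):
        inflow = growth_rate * stock
        stock += inflow * dt
        trajectory.append(stock)
    return trajectory

def sensitivity_sweep(base_rate, variations, initial, dt, steps):
    """Run the model once per relative variation of the growth rate.

    Returns a mapping from each variation (e.g. -0.1 for minus 10 %)
    to the final stock value of that run.
    """
    results = {}
    for variation in variations:
        rate = base_rate * (1.0 + variation)
        results[variation] = simulate_stock(initial, rate, dt, steps)[-1]
    return results

# Vary an assumed growth rate of 0.05 per period by -10 %, 0 % and +10 %.
sweep = sensitivity_sweep(
    base_rate=0.05,
    variations=[-0.1, 0.0, 0.1],
    initial=100.0,
    dt=1.0,
    steps=20,
)
for variation, final_value in sweep.items():
    print(f"{variation:+.0%} -> final stock {final_value:.2f}")
```

A parameter whose variation changes the outcome strongly is a natural candidate for more careful data collection, whereas a parameter with little influence may be adequately covered by an expert estimate.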

D. Model Use and Application
The key activities in this stage are model simulation, decision analysis, and discussions. At this stage, the model is believed to be ready for use and to serve its purposes. In order to run the model, numerical data should be made available and ready for simulation, either in real-time mode or otherwise, for example in CSV files, from databases or data warehouses, or as output from other models. The size of the data may vary. Sometimes this can take a series of simulation runs.
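As a small illustration of feeding numerical data into a simulation run, the Python sketch below reads flow values from CSV text and accumulates them into a stock. The column names time, inflow and outflow, the sample values, and the function names are hypothetical. An in-memory string stands in for a real file so that the example is self-contained.

```python
import csv
import io

# Stand-in for a CSV file exported from a database or another model.
SAMPLE_CSV = """time,inflow,outflow
0,10,4
1,12,5
2,9,6
"""

def load_flows(text_stream):
    """Read (time, inflow, outflow) rows from a CSV stream with a header line."""
    reader = csv.DictReader(text_stream)
    return [
        (float(row["time"]), float(row["inflow"]), float(row["outflow"]))
        for row in reader
    ]

def run_stock(initial, flows, dt=1.0):
    """Accumulate the net flow of each row into a stock (Euler integration)."""
    stock = initial
    history = [stock]
    for _, inflow, outflow in flows:
        stock += (inflow - outflow) * dt
        history.append(stock)
    return history

flows = load_flows(io.StringIO(SAMPLE_CSV))
history = run_stock(initial=50.0, flows=flows)
print(history)
```

For a real file, the io.StringIO object could be replaced by a file opened with open(path, newline=""). Running the same model against several such files would correspond to the series of simulation runs mentioned above.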
Model-based simulation, like SD, enables the analysis of various situations by modelling a system and simulating the model over time within a computer program [4]. With the SD model, the decision-maker can design and simulate a series of tests for system change [69]. Thus, they can test specific policies, narrate insightful stories about policy experiments, and generate discussion among the problem actors about the results. This means that decision analysis can be carried out through experimental approaches and evaluated qualitatively through active discussions [9]. That is why, in this stage, the uses of qualitative data and data analysis in SD are rich, and could be richer still.
Apart from the experimental approach above, oral history and grounded theory can help make sense of the simulation results and of the modelling process itself by providing a record of how variables or pieces of the structure were formulated or reformulated along the way [10]. With today's communication technology, group discussion methods (face to face or online) such as Delphi or focus groups are useful for generating discussion among actors about the meaning of the model results of the policy experiments and the stories generated by the model. In oral history, discourse analysis, and grounded theory, the modeller also uses the learning accumulated during the modelling process. If a survey or questionnaire is needed, it can also be applied here, depending on its suitability.

IV. CONCLUDING REMARKS
Data sources are very important for SD development. Based on the three primary pools of data (mental, written and numerical) suggested by SD forefather Forrester, it is noted that no data are entirely perfect. Whether mental, written or numerical, all of them tend to be flawed, biased and unreliable to some degree. The mental database is hard to capture because it resides in people's heads, and it needs to be shared so that the context of the problem case is understood. The written database requires mental digestion by the modeller to translate it into knowledge mapping in SD. Numerical data are generally regarded as reliable, but in the past they were not as widely available as they are today. Due to technological advancements, the traditional perceptions of the numerical database no longer fit today's era, and this deserves to be given attention and highlighted. For this reason, all three sources are worth considering, depending on the problem case.
In this paper, suggestions of potential data collection methods are discussed in detail and aligned with the four-stage SD modelling methodology. According to Sterman, system dynamics modellers should master the state of the art, use these tools, follow new developments as the tools continue to evolve, and innovate to develop new methods appropriate for their models. Hopefully, the alignment of the SD methodology with the potential data collection methods can benefit the entire modelling procedure whilst respecting the key components of the traditional SD modelling approach. The available options are shown in Table I, from which the most suitable methods for an SD modelling process can be selected. In summary, this paper provides a brief overview of the currently existing knowledge extraction methods for SD modelling. It mainly emphasises the potential data sources and their suitability for each stage or step in the modelling process. It also gives a good foundation for understanding the existing alternatives in the field of SD modelling. Moreover, the adaptation presented in Table I of suggested data sources for each modelling phase represents a practical synthesis of existing choices that can serve as guidelines for current practice. Since different cases might face different types of problem situations, the selection of data collection methods should be based on what is feasible and on how the methods can complement or compensate for each other.