Artificial Intelligence: What it Was, and What it Should Be?

Artificial Intelligence was embraced as the idea of simulating abilities unique to humans, such as thinking, self-improvement, and expressing feelings in different languages. The idea of "Programs with Common Sense" was the central goal of Classical AI, which was mainly built around an internal, updatable cognitive model of the world. However, almost all recently proposed models and approaches lack reasoning and cognitive models and have become more data-driven. In this paper, different approaches and techniques of AI are reviewed, specifying how these approaches strayed from the main goal of Classical AI and emphasizing how to return to its main objective. Additionally, the most common terms and concepts used in this field, such as Machine Learning, Neural Networks, and Deep Learning, are highlighted, and the relations among them are determined in order to remove the mystery and ambiguity around them. The transition from Classical AI to Neuro-Symbolic AI and the need for new cognitive-based models are also explained and discussed.
Keywords: Classical AI; machine learning; Neuro-Symbolic AI; Cognitive-based AI; deep learning


I. INTRODUCTION
Artificial intelligence (AI) is a wide range of models that empower people to integrate and analyze data in order to produce insights and predictions that can be used in decision making, which normally requires a sufficient level of human expertise. In its early decades, the main challenge facing AI researchers was to teach machines how to relate different states to a set of recognizable conditions, an approach retained in the earlier models. During the 1980s, AI models proved valuable for probabilistic explanations over a set of discrete variables; e.g., a machine can infer that a patient with specified symptoms may have a certain disease.
One of the main objectives of AI models was to help people anticipate problems or deal with issues as they come up, and to operate in an intentional and adaptive way. Despite the importance of this objective, current systems often fall short of it. In 2015, Google apologized to software engineer Jacky Alciné after he pointed out that the image recognition algorithms in Google Photos were classifying his black friends as "gorillas". Google was also found to be algorithmically biased, showing a job advertisement to men rather than to women [reported in the Washington Post on July 6, 2015]. Another example of AI failure is a street-sign recognition system used by self-driving cars that mistook slightly defaced stop signs for speed-limit signs. All these examples show how current AI systems misbehave compared to humans, who can learn logical relations and make choices with little information. AI techniques, on the other hand, are more restricted in their abilities and require specific details to do their work.
The central objective of this paper is to illustrate how the AI field has changed and deviated from its main goal, which is why robust intelligence has not been achieved. The paper also asserts that this robustness will never be achieved without developing systems that are able to represent and reason about the external world and to draw on substantial knowledge about its dynamics. Recently, many researchers have realized the importance of moving towards more adaptive, dynamic, and cognitive models, and have provided comprehensive studies of the past, present, and future of the AI field, such as the work in [1], [2].
The rest of this paper is organized as follows. Section 2 presents an overview of the different disciplines of AI, while Section 3 provides an overview of the history of the AI field. Section 4 demonstrates how data-driven models displaced the main goal of Classical AI. Three types of AI (Narrow, General, and Super AI) are highlighted in Section 5, and the main challenges facing current AI are outlined in Section 6. The differences between knowledge-based systems, cognitive-based models, and consciousness exploration are shown in Section 7. The importance of using a hybrid approach is discussed in Section 8, and Section 9 concludes the study.

II. THE DISCIPLINES AND TERMS OF AI
In this section, the different disciplines that contributed to the emergence of AI are outlined. The main terms used in the field are also reviewed, with the aim of removing the ambiguities associated with them in the literature.
According to Russell & Norvig [3], several disciplines, including Philosophy, Mathematics, Economics, Neuroscience, Computer Engineering, Control Theory, and Linguistics, together contributed to formalizing the principles of AI. Philosophy formulated a precise set of laws governing the rational part of the mind, which allows conclusions to be generated mechanically from initial premises. Mathematics is the second foundation; it formalizes logic, computation, and probability. Economics, which studies how people make choices that lead to preferred outcomes and how to make decisions that maximize profit, is another foundation of AI, together with Decision Theory, which combines probability theory with utility theory to give a complete framework for decisions made under uncertainty.
70 | P a g e www.ijacsa.thesai.org
Neuroscience, which studies the nervous system, is also one of the main foundations: Camillo Golgi [4] was the first to develop a staining technique allowing the observation of individual neurons in the brain, and Nicolas Rashevsky [5] was the first to apply mathematical models to the study of the nervous system. Computer Engineering answers the question of how to build an efficient computing machine. Finally, Control Theory answers the question of how machines can operate under their own control, and Linguistics is mainly concerned with how language relates to thought.
The terms AI, Machine Learning (ML), Deep Learning (DL), and Artificial Neural Networks (ANNs) are often used interchangeably, but they are not equivalent. Fig. 1 illustrates the relations between these terms and shows the relation between Symbolic Artificial Intelligence (discussed in detail in Section 7) and current AI.
Artificial Intelligence, sometimes called Narrow or Weak AI, is the broader concept; briefly, it studies how machines can simulate the human way of thinking and perform mental functions in an "intelligent" way.
Machine Learning is a set of AI techniques that study how machines can learn from a dataset and make new predictions based on that prior learning. Deep Learning, or Artificial Neural Networks (ANNs), sometimes called Connectionist AI (due to its structure of connections), includes algorithms that simulate mental functions to detect patterns and classify information. Current DL techniques include Supervised, Unsupervised, Semi-supervised, and Active Learning, each with different algorithms. On the other hand, Rule-Based AI is a synonym for Symbolic AI, the traditional way of representing a problem by applying specified rules to an input, so that the output is governed by those rules. Fig. 2 lists different algorithms for these categories.
In this section, we give a brief historical overview of AI and its main idea. John McCarthy is an influential figure in AI, and Dartmouth is considered the birthplace of the field: McCarthy, Minsky, and others organized a two-month workshop at Dartmouth in the summer of 1956 [6] and invited American researchers interested in automata theory, neural nets, and the study of intelligence. The proposal for the workshop stated that "The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it" [7]. Their main target was to find how to make machines use language, form abstractions, solve problems reserved for humans, and improve themselves. Furthermore, they aimed at developing machines that would function autonomously in complex, changing environments.

Different definitions of Artificial Intelligence have been proposed along multiple dimensions. In the 1950s, Alan Turing provided an operational definition of intelligence that measures how humanly a computer acts through his proposed test [8]. Briefly, passing the test requires the following abilities from the computer:
• Ability to communicate successfully in English using natural language processing.
• Ability to store what it knows or hears using knowledge representation models.
• Automated reasoning ability which uses the stored information to answer questions and to draw new conclusions.
• Ability to adapt to new circumstances and detect and extrapolate patterns.
Wilson and Keil [9] presented another definition in 1999, known as the Cognitive Modeling definition, which is based on the dimensions of how humanly a computer can think and act, in contrast with how rationally it can think and act; their classification is shown in Fig. 3.
Early predictions about intelligent machines were optimistic: "their ability to do these things is going to increase rapidly until - in a visible future - the range of problems they can handle will be coextensive with the range to which the human mind has been applied" [10]. This overconfidence stemmed from the promising performance of the early AI systems on simple examples, which was achieved by simple syntactic manipulation of the problem, an approach that does not suit large and difficult problems. Unfortunately, the early systems failed miserably when tried on wider selections of problems and on more difficult ones. A typical example of early AI failure is the translation project initiated in 1957 by the U.S. National Research Council. The project aimed to translate Russian scientific papers on space using simple word replacement based on Russian and English grammar. The project was canceled, and it was stated that no machine could be used to translate human languages. The failure was explained as follows: word replacement alone is not sufficient to get the right meaning, because translation requires good background knowledge in order to resolve ambiguity and establish the content of the sentence.
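The limitation of simple word replacement described above can be illustrated with a minimal Python sketch. The lexicon, the transliterated sentence, and the function names are hypothetical toy choices made for illustration; they are not taken from the 1957 project. The sketch only shows that replacing each word with its first dictionary sense gives a wrong result as soon as a word is ambiguous.

```python
# Toy lexicon (hypothetical): each transliterated Russian word maps to its
# possible English senses. "luk" is ambiguous: it can mean "onion" or "bow".
LEXICON = {
    "okhotnik": ["hunter"],
    "vzyal": ["took"],
    "luk": ["onion", "bow"],
}


def translate_word_by_word(sentence):
    """Replace each word with the first sense in the lexicon (no context used)."""
    words = sentence.lower().split()
    return " ".join(LEXICON.get(word, [word])[0] for word in words)


def ambiguous_words(sentence):
    """Return the words whose sense cannot be chosen without background knowledge."""
    return [w for w in sentence.lower().split() if len(LEXICON.get(w, [])) > 1]


if __name__ == "__main__":
    source = "okhotnik vzyal luk"  # intended meaning: "the hunter took the bow"
    print(translate_word_by_word(source))  # hunter took onion
    print(ambiguous_words(source))         # ['luk']
```

Choosing "bow" over "onion" here requires knowing what hunters usually carry, which is the kind of background knowledge the early systems lacked.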
IV. DATA-DRIVEN MODELS BUILT ON RUINS OF CLASSICAL AI
"Programs with Common Sense" was the central concern of Classical AI. John McCarthy noted the value of commonsense knowledge in his pioneering paper [11], and Doug Lenat provided a representation of commonsense knowledge in a machine-interpretable form in his work [12], [13], [14]. Classical AI was mainly built around an internal, updatable cognitive model of things, such as individual people and objects, their properties, and their relationships with one another. However, almost all recent models and approaches lack both reasoning and rich cognitive models of the world [15]. This may be due to the following reasons:
1) It was thought that reasoning and cognitive models were suitable only for small problem instances, while scaling to larger problems was seen as a matter of sufficient hardware and larger memories.
2) To some extent, building human knowledge into machine learning systems has even been viewed within machine learning circles as cheating, and certainly not as desirable.
3) The complexity of the world is endless, and human minds are very complicated.
4) There is a lack of the essential methods needed to capture arbitrary complexity by finding good approximations of the world; new AI systems are needed that can discover as a human does, not reinvent what is already known.
Moreover, many saw this lack of explicitly encoded knowledge or detailed cognitive models as an advantage rather than an anomaly, as they moved away from Classical AI and its core towards different, more data-driven paradigms.

V. GENERAL, NARROW AND SUPER AI
The paper written by John McCarthy in 1958 introduced the concept that has more recently been named Artificial General Intelligence (AGI). It described a hypothetical program, the Advice Taker, which is considered the first complete AI system. In this system, axioms were defined to allow the model to generate a plan to drive to the airport. The program was also designed to react autonomously to unexpected situations without being reprogrammed.
Narrow Intelligence, also known as "Weak AI", includes systems that perform a single narrow goal extremely well (e.g., chess playing). They are centered around a single task and are neither robust nor transferable to even modestly different circumstances. Such systems often work impressively well when applied to the exact environments on which they were trained, but in many cases they are not reliable when the environment differs from the one on which they were trained. Such systems have been shown to be powerful in the context of games, but have not yet proven adequate in the dynamic, open-ended flux of the real world.
Artificial Super Intelligence (ASI) will be achieved when AI systems outperform the best human brains [16]. In a public talk, Andrew Ng, one of the key figures of AI, said that "the distance between AGI and ASI is very short; it may happen in mere months, weeks, or maybe the blink of an eye and will continue at the speed of light". Scientists hold different views about when this will happen.

VI. PROBLEMS OF THE CURRENT AI
Despite the remarkable achievements of AI in numerous applications, many of its pioneers, including McCarthy [17], Marvin Minsky, and Judea Pearl, have acknowledged that the field has strayed from its principal idea of "machines that think, that learn and that create", as expressed in Simon's first workshop.
Classical AI begins to break down when it has to deal with the messiness of the world, for example in image processing applications, in which computers are used to derive high-level abstractions from digitized images. Consider what is needed to make a program recognize a cat: how many rules would it take? Another example is the difficulty of specifying the rules that a self-driving vehicle needs in order to identify all the different pedestrians it may encounter.
To illustrate the idea, consider the picture in Fig. 4, which is known as a "Bongard problem", named after its creator, the Russian researcher Mikhail Moiseevich Bongard [18]. The problem presents two sets of pictures (six on the left and six on the right), and the objective is to spot the key difference between the two sets. As shown in the figure, pictures in the left set contain one object, and pictures in the right set contain two objects. Although it is simple for people to draw such inferences from so few examples, there is still no neural network that can solve Bongard problems. In one study conducted in 2016, AI researchers trained a neural network on 20,000 Bongard samples and tested it on 10,000 more; its performance was much lower than that of average humans.
In the literature, a set of arguments and objections has been raised in answer to the main question, "Can a machine be intelligent?" Turing himself examined a wide variety of possible objections to building intelligent machines. In the following, some of these objections, including nearly all of those that have been raised in the last 50 years, are highlighted.

A. Argument of Disability
The argument from disability claims that "a machine can never do X", where X may be any of a set of soft skills, for instance: be kind, have initiative, have a sense of humor, make mistakes, fall in love, enjoy strawberries, learn from experience, or do something really new.

B. The Mathematical Objection
According to Gödel's incompleteness theorem, which is related to the Halting Problem and undecidability, philosophers have asserted that machines are mentally inferior to people. Machines are formal systems that are limited by the incompleteness theorem; they cannot establish the truth of their own Gödel sentence, while people have no such limitation [19], [20]. Briefly, for any formal axiomatic system F, whose axioms are assumed to be true without formal proof, a Gödel sentence G(F) can be constructed with the following properties:
• G(F) is a sentence of F, but it has no proof within F.
• If F is consistent, then G(F) is true.
However, Gödel's incompleteness theorem applies only to formal systems, and this includes Turing machines; but Turing machines are infinite, whereas computers are finite. This implies that any computer can be described as a (very large) system in propositional logic, which is not subject to Gödel's incompleteness theorem.

C. The Informality of Behavior Objection
AI is subject to what is called the "qualification problem": the philosopher Hubert Dreyfus claimed that computers are unable to capture everything as a set of logical rules [21]. In this view, human behavior, including human expertise and knowledge, is very difficult to represent by a set of rules; because computers merely follow these incomplete rules, they cannot generate behavior as intelligent as that of humans. Similar criticisms regarding this objection were also raised in [22], [23], [24].

VII. KNOWLEDGE-BASED SYSTEMS VERSUS COGNITIVE-BASED MODEL AND CONSCIOUSNESS EXPLORATION
From the 1960s to the early 1980s, the field was dominated by what was named "Symbolic Artificial Intelligence" (Symbolic AI), or "Rule-Based AI", which involves transferring human behavior and explicit knowledge into a set of coded rules. This approach is very efficient for systems where the rules are clear and the input can be represented by symbols. Symbolic AI uses symbols to define things (chairs, cats, trucks, etc.) and can also represent conceptual objects (transfer statements) or things that are not tangible. Fig. 4 outlines some of the algorithms used by Symbolic AI compared to those of ML.
Despite all this success of AI models, according to Yoshua Bengio, their key weakness is the lack of methods for defining objects in a conceptual way [25], [26]. This becomes obvious when a model is required to generalize beyond its training distribution. In principle, if a task can be broken down into objects, any AI model will be able to learn it; however, there is no way to give the model every conceivable labeled example of the problem [27]. This leads to the need for models based on cognition and consciousness exploration.

1) Cognitive models:
The Natural History Museum of Vienna criticized Facebook after a user was blocked from posting a photograph of a nude ancient figurine of a woman that dates back 29,500 years; Facebook replied that the ban was just a mistake. Such failures demonstrate that there is no hope of accomplishing a complete intelligent system without first developing systems with what could be called deep understanding, which would involve not only the ability to correlate and recognize subtle patterns in complex data sets but also the capacity to look at any scenario and address unexpected situations. These limits become increasingly clear in practical applications of current AI. DL algorithms, for example, are data-driven, with no symbolic or knowledge representation; consequently, they are difficult to apply to systems that require reasoning and thinking [25]. Additionally, all DL models are prone to algorithmic bias because they derive their behavior from their training data. This implies that any hidden or explicit biases embedded in the training examples will also find their way into the decisions the deep learning algorithm makes.
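The way bias in training examples carries over into decisions can be shown with a minimal Python sketch. The data below are synthetic and invented purely for illustration, and the "model" is a deliberately trivial majority-vote learner rather than a deep network; the names `train` and `TRAINING_DATA` are hypothetical. The point is only that a purely data-driven learner reproduces whatever skew its training data contain.

```python
from collections import Counter, defaultdict

# Synthetic, invented history of past decisions (group, decision).
# The skew between the two groups is an assumption made for illustration.
TRAINING_DATA = (
    [("male", "show_ad")] * 8 + [("male", "skip")] * 2
    + [("female", "show_ad")] * 3 + [("female", "skip")] * 7
)


def train(examples):
    """Learn, for each feature value, the label seen most often in training."""
    counts = defaultdict(Counter)
    for feature, label in examples:
        counts[feature][label] += 1
    return {feature: c.most_common(1)[0][0] for feature, c in counts.items()}


if __name__ == "__main__":
    model = train(TRAINING_DATA)
    # The learned decisions mirror the skew in the data, not any notion of fairness.
    print(model)  # {'male': 'show_ad', 'female': 'skip'}
```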
There is a need to move to AI approaches that use cognitive models to overcome these limitations. Psychologists define cognition in terms of a kind of cycle: humans take in perceptual data from the surrounding environment, build internal cognitive models based on their perception of that data, and make their decisions accordingly. Cognitive scientists recognize that such models may be imperfect, but they consider them to be key to how humans see the world [28], [29]. However, the computational requirements of systems capable of reasoning in a robust fashion still need to be studied.

2) Consciousness exploration:
The Consciousness Prior theory defines consciousness as "The perception of what passes in a man's own mind or awareness of an external object or something within oneself". It specifies that elements of our consciousness are selected by attention mechanisms and then broadcast to the rest of the brain, strongly affecting downstream perception [30]. After cognitive neuroscience, Yoshua Bengio turned his attention to consciousness; he asserts that now is the right time for ML to explore consciousness, which he says could bring "new priors to support abstraction and good speculation" [31]. Bengio hopes that such a research direction could allow AI systems to grow from what current systems are very good at to more rational, sequential, logical, and intelligent models [32]. For his work, he used only those parts of consciousness that involve how humans express their feelings in their own languages.
He used attention as a mechanism for generating a set of related sequences for each event or thought; such a sequence can be represented abstractly as an algorithm. In that way, consciousness can provide inspiration on how to build general models in which agents do something at a particular time and place and have a specific effect [33]. With the right abstractions, that effect could have consistent consequences everywhere in the universe.

VIII. DISCUSSION (THE NEED FOR HYBRID APPROACH)
As illustrated in the previous section, both cognitive and consciousness models are considered vital components for building a new, robust AI system. Basically, "general knowledge" can be classified into two main categories. The first includes all known real-world factual knowledge that is based on direct evidence, actual experience, or observation. The other reflects "common sense", which is the sort of knowledge that humans are assumed to know intuitively without being told. For example, the simple fact "once a baby is born, he is alive" cannot be inferred by any AI system. The main weakness of AI systems is that they do not grasp causation: they can see that some events are associated with other events, but they do not find out which things directly cause other things to occur.
Fig. 5 illustrates the transition process of AI and its evolution over the last decades, together with its features and challenges. Rule-Based systems had deductive reasoning, logical inference, and a search algorithm used to find a solution within the constraints of the specified model. They also used specified rules to deduce conclusions from the input data in order to achieve a certain goal. In current AI, by contrast, the rules of the model are not predefined; rather, the data are provided, and ML algorithms discover the rules during training by applying statistical methods to adapt and tune different parameters until the optimal values are found.
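The contrast between predefined rules and rules discovered from data can be made concrete with a minimal Python sketch. Everything in it is an illustrative assumption: the fever example, the 38.0 threshold, the six labeled samples, and the function names are hypothetical, and the exhaustive threshold search is a deliberately simplified stand-in for the parameter tuning performed by real ML algorithms.

```python
def rule_based_fever(temp_c):
    """Rule-based view: an expert writes the rule (threshold) by hand."""
    return temp_c >= 38.0


# Invented labeled samples: (temperature in Celsius, has_fever).
SAMPLES = [
    (36.5, False), (37.0, False), (37.4, False),
    (38.1, True), (38.6, True), (39.2, True),
]


def learn_threshold(samples):
    """Data-driven view: tune one parameter to minimize training errors."""
    def errors(threshold):
        return sum((temp >= threshold) != label for temp, label in samples)

    candidates = sorted(temp for temp, _ in samples)
    return min(candidates, key=errors)


if __name__ == "__main__":
    threshold = learn_threshold(SAMPLES)
    print(threshold)                # 38.1
    print(rule_based_fever(38.05))  # True: the hand-written rule fires
    print(38.05 >= threshold)       # False: the learned rule depends on the data seen
```

The learned threshold is only as good as the samples it was tuned on, while the hand-written rule is only as good as the expert who wrote it; this is the trade-off the hybrid approach tries to balance.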
Recently, influential steps have been taken towards building integrated models that join features of the symbolic approaches with insights from ML, in order to obtain efficient techniques able to extract and generate abstract knowledge from stochastic data [34], [35]. For example, Geoffrey Hinton and others [36] use the backpropagation algorithm to address the problem of how synapses are adjusted in order to improve performance. Backpropagation learns rapidly using synaptic updates and uses feedback connections to transmit error signals. So, a hybrid approach could be used to formalize the messiness of the problem in a symbolic representation, then find all the correlations and induce some reasoning from it.
Central work on Neuro-Symbolic models is presented in [37], which analyzed the mappings between symbolic frameworks and neural systems, indicated significant limits on the kinds of knowledge that could be represented in ANNs, and showed the value of developing hybrid systems. Battaglia has produced a number of interesting papers on physical reasoning with systems that integrate symbolic graphs and deep learning [38]. Much similar work, such as [39], [40], [41], uses ANNs to derive answers from the messiness of the real world by learning; the symbolic part then forms internal symbolic representations and creates explainable rules that formalize everyday knowledge, as shown in Fig. 5 [for clear resolution of the figure, refer to the last page].
In the history of AI, one of the largest efforts to create common-sense knowledge in a machine-interpretable form was launched in 1984 by Doug Lenat and is known as the CYC Project [42]. The main idea of the project was to build a massive knowledge base containing static facts and heuristics, together with the cognitive and reasoning models needed to create what could be called common-sense reasoning.
According to Lenat, in order to simulate human thinking, CYC's team expected to encode millions of facts spanning all areas of human experience, including science, society and culture, atmosphere and climate, money, health care, history, and politics. It was estimated that the CYC project would require a huge number (maybe thousands) of people to capture facts about psychology, politics, economics, science, and many other areas, all in logical form. Simple declarative semantic models are used for knowledge representation, incorporating conjunctions, disjunctions, quantifiers, and equality and inequality operators. The CYC project has been described as "one of the most criticized projects of Artificial Intelligence". Machine learning researcher Pedro Domingos described the project as a "catastrophic failure" for several reasons, including the endless amount of data required to produce any viable results and its inability to evolve on its own.
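The hybrid approach discussed in this section, in which a learned component turns messy input into symbols and a symbolic component reasons over them with explicit rules, can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not the method of any cited work: the `perceive` function is a hard-coded stub standing in for a trained neural network, and the symbols, confidence values, rules, and the 0.9 cutoff are all hypothetical.

```python
def perceive(image_id):
    """Stub for a trained neural network: returns (symbol, confidence) pairs.

    The outputs are hard-coded for illustration; a real system would compute them.
    """
    fake_outputs = {
        "img_001": [("octagon", 0.97), ("red", 0.95), ("sticker", 0.60)],
    }
    return fake_outputs.get(image_id, [])


# Explicit, human-readable rules: (set of premises, conclusion).
RULES = [
    ({"octagon", "red"}, "stop_sign"),
    ({"stop_sign"}, "must_stop"),
]


def forward_chain(facts, rules):
    """Symbolic part: apply rules repeatedly until no new fact can be derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts


def decide(image_id, min_confidence=0.9):
    """Keep only confident symbols from perception, then reason over them."""
    symbols = {s for s, conf in perceive(image_id) if conf >= min_confidence}
    return forward_chain(symbols, RULES)


if __name__ == "__main__":
    print("must_stop" in decide("img_001"))  # True
```

Because the conclusion is derived through named rules, the decision can be explained ("red" and "octagon" imply "stop_sign"), and a low-confidence distraction such as the sticker does not change it.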

IX. CONCLUSION
Many AI systems have become extremely powerful in areas such as medical diagnosis, language translation, and image recognition, where they can even outperform humans in many complicated applications; however, they can be duped or confounded by situations they have not seen before. Sometimes the behavior of AI systems in their specialized domains is chaotic and strange, as none of them has commonsense knowledge. This lack makes them brittle: they break down when confronted by problems that were not foreseen by their designers. In this paper, we argue for studying how to integrate human experience and cognitive models with current AI approaches in order to obtain models that are more adaptive to change. Such models can interact with people, services, and devices and can understand, identify, and extract contextual elements.
As future work, to enter the next decade of AI, more effort must be made to build reliable AI systems that match basic human reasoning and can offer abstract solutions using insight, common sense, and relatively little information. In the next decade of AI, there is a need to redefine and refine the learning concepts that are considered the main part of AI models. Additionally, rich cognitive models must be studied intensively in order to build models with rich prior knowledge and sophisticated reasoning techniques.