Urban Addressing Practices and Geocoding Algorithm Validity in Developing Countries

Addressing systems play a key role in understanding and managing economic connections and social conditions, especially in urban territories. Developing countries need to learn from previous experiences and adapt solutions and techniques to their local contexts. A review of the World Bank's experience in addressing cities in Africa during the 1990s provides valuable lessons. It gives an understanding of the operational issues and the key success factors of such operations. It also helps to understand the conceptual components of these systems and the efforts required to build them in the field before their IT infrastructure is created. An addressing experience from a private sector initiative in Casablanca, Morocco is also reviewed, where the effort concerned the creation of a comprehensive database of addresses. The methods used to collect the data in the field are presented, as well as the conceptual model for its integration. The validity of geocoding techniques, which are the core computing tools of addressing systems, is discussed. In the Moroccan context, the official addressing rules follow Western models and standards, which geocoding algorithms use by default. The study of data collected in Casablanca, processed with GIS tools and algorithms, shows that the percentage of cases not respecting these rules is far from negligible. The analysis focused on the two main criteria of address numbers, "parity" and "respect of intervals", analyzed by street segment. Both conditions were met in only about 53% of cases. It is therefore concluded that a geocoding system based on a linear model is not sufficiently validated in the Moroccan context.

Keywords: Addressing system; geocoding; Geographic Information System (GIS)


I. INTRODUCTION
Addresses are essential data for citizens, administrations and companies. Through an address, citizens gain access to civil rights and public services, administrations can manage their territories efficiently, and companies can manage and optimize supply chains. It was once believed that about 80% of information, especially that used by local authorities, has a geographical component related in one way or another to address locators [1], [2]. In the age of the IoT (Internet of Things), it is hardly possible to find data without spatial coordinates. While recent geocoding literature deals with the latest techniques, such as machine learning [3] and deep learning in particular [4], the classical issues related to historical address structure and standardization remain relevant [5], [6]. The general literature covers geography-related applications in many countries and regions, such as Australia [7], Brazil [8], China [9], Croatia [10], Cuba [11], Germany [12], India [13], Morocco [14], Quebec [15], South Africa [16], Turkey [17], etc. Applications based on address locators are evolving faster than ever, and the need for reliable and accurate address systems has never been greater. Unfortunately, while such systems have already reached maturity in developed parts of the world [18], [19], they remain a real issue in developing countries. Yet it is there that they are most needed, for basic applications already discussed in research in other contexts, such as health studies [20]-[22], politics [23], crime [24], traffic accidents [25], emergency dispatching [26], etc.
In developing countries, addressing systems present major challenges: the standards applied in the field, the availability and quality of reference data, and the reliability of geocoding techniques. In the field, address numbers and street names should be assigned according to logical and consistent methods. The quality of geocoding, which consists of transforming a set of descriptive elements into a geographic position [27], then depends on the quality of both the reference address database and the methods used.

A. The World Bank Experience in Africa
The addressing process is a critical issue for the city. It is a challenge to be taken up by several stakeholders, including town planners who plan the base of future addresses, local authorities who assign formalized addresses, install and maintain signs for street names and squares, utilities who use addresses when providing services or billing, postal operators who deliver mail to an address, as well as residents who maintain the numbering plates of their buildings and can correct errors in their addresses.
This complex operation includes the formalization of the reference rules for the addressing process. It consists of creating and updating standardized addresses in the city. The two important operations carried out in the field are the naming of the streets and the numbering of the buildings. The World Bank carried out several addressing experiments during the 1990s in different African cities [28]. Table I presents an overview of the technical features of addressing practices in Burkina Faso (Ouagadougou and Bobo-Dioulasso), Cameroon (Yaoundé and Douala), Guinea (Conakry) and Niger (Niamey). The addressing projects financially and technically supported by the World Bank extended to several other African cities in Mali, Mauritania, Mozambique, Senegal, Benin, Rwanda, Djibouti, Togo, and Côte d'Ivoire (Ivory Coast). Fig. 1 shows the countries concerned. The World Bank's recommendations for addressing operations are based on three observations: that it is almost impossible to name all streets, that addressing operations are first of all a municipal action, and that addresses are to be defined in relation to the streets and not in relation to the blocks.
The key success factors drawn from these experiences are: organization and motivation of the addressing unit; the involvement of the municipalities, decision-makers and technical services (which must have the necessary means and skills); financial efficiency during the project (while keeping good control); the simplicity of the database and the software developed, to facilitate the transition after the project phase (in particular to ensure that addresses are updated); controlling the scope of the project (concentrating efforts on the sole objective of addressing); and good coordination with stakeholders, in particular utilities and post offices.
The main indicators that were used to assess the outcome of these projects are: the budget of the operation, the number of street signs installed, the number of buildings "addressed" (percentage of households concerned) and cost per capita and per addressed door. In the long term, the growth rates of local services such as tax collection and postal services should confirm the success of these operations.

B. Private Sector Initiative in Casablanca-Morocco
The first known addressing project aiming to create a comprehensive database of addresses, inventorying all address locators of a major Moroccan city, was initiated in Casablanca in the late 2000s. The initiative came from the private company ensuring the delegated management of water and electricity utilities in Grand Casablanca. In the absence of providers of such data, which are critical for its operations, the company had to collect more than 400,000 address locators. Fig. 2 shows the project area. After planning and the acquisition of reference data, such as streets and the necessary base map data, 200 operators were sent to the field, equipped with 3,500 printed maps in total. Each map concerns a specific sector and contains the reference data necessary to recognize the address locators to collect: streets, plots, remarkable places, neighborhoods and sector limits [29].
The targeted area is divided into sectors that are well known to and mastered by the field operators. Each sector is printed as a map in a suitable format, with the sector reference included. Once the field survey is performed, capturing both the positions and the descriptive information required for matching with the company's operational database, the maps are handed to the back office to fill the project database. A subset of these maps is then used for quality control. In the end, all maps are scanned and archived.

C. Discussion of Addressing Field Operations Practices
In order to create an addressing database (as in the case of the private initiative in Casablanca), or to prepare an inventory to improve addressing in the city (as in the case of initiatives assisted by the World Bank in Africa), complex field survey operations are required. Two missions can be distinguished, to be carried out successively, since the first makes it possible to better prepare the second by providing its necessary reference data: mission I, the survey of streets and their signs, then mission II, the survey of the numbering of buildings.
Before starting the field operations, a data model of the information to be collected must be designed. Table II shows the key data for the two missions.
Once identified, these data should be integrated into a larger addressing data model, such as the one presented in Fig. 3 using the UML formalism.

1) Mission I: Streets survey:
Given that the area to be covered is often very large (for instance, more than 1,220 km² in the case of the Casablanca project), the field surveys must be organized by geographic elements that can be mastered and easily navigated, in order to keep operations in the field fluid. All data that can be acquired before the field mission, such as the street layouts, must be integrated beforehand, so that only the strict minimum of data has to be gathered in the field.
Choosing street segments as the basic survey elements has the following advantages: it allows control of the total coverage of the study area; it optimizes the routes followed in the field; it allows different sections of the same street to be allocated to several operators (which corresponds to the logic of the administrative division of cities); it gathers information that may change from one street segment to another, such as width; and it prepares for the city sign plan, which is designed by street segment.
2) Mission II: Numbering survey:
For the same reason of optimization, the use of street segments as basic elements for organizing this second mission remains relevant. It would also be possible to combine this logic with an administrative or business division of the project area, in order to define routes that are easily recognizable by the field operators.
Another reason to use street segments for this second mission is to anticipate the preparation of basic data for the development of a geocoder, whose quality improves when street segments are used instead of whole streets.

A. Geocoding Techniques
The principle of geocoding methods is to compare a list of descriptive address elements with a well-structured reference database. This operation is done in three steps that make up the general geocoding algorithm, which is well documented in the literature [30]-[32] and shown in Fig. 4.
The geocoding process manipulates "Input data" in order to get "Output data" through "Matching algorithms" using a structured "Reference database"; these are the four parameters of geocoding, and their dynamics are described hereafter. As input, the data to geocode is introduced in the form of a list of descriptions, such as the details of a postal address. As output, georeferenced data is returned, with the geometry that is supported by the processing algorithm. It is often a two-dimensional point, but it can also be a more complex form of data such as 3D objects [33]. It is the matching algorithm that decides on the corresponding result, based on the input data, the reference data and the matching rules. This is why research focuses especially on matching algorithms.
When the locations of all address points are available, as in the case of a comprehensive reference database, the matching operation becomes straightforward. A simple search request is enough to find the exact coordinates to return. On the other hand, in the case of the linear alternating numbering model, the returned position is calculated. It is an interpolated position based on the elements available in the linear referencing database, notably in the streets and street segments, such as the intervals and parity of numbers on each side of the street segment [34], [35].
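The lookup case described above can be illustrated with a minimal Python sketch. It assumes a comprehensive reference database reduced to (street, number, point) records. The names `normalize`, `build_index` and `geocode_exact`, as well as the sample street and coordinates, are hypothetical and are not taken from the Casablanca project.

```python
from typing import Dict, Iterable, Optional, Tuple

Point = Tuple[float, float]  # (x, y) in any planar coordinate system


def normalize(street: str) -> str:
    """Lowercase and collapse whitespace so trivial variants still match."""
    return " ".join(street.lower().split())


def build_index(
    records: Iterable[Tuple[str, int, Point]]
) -> Dict[Tuple[str, int], Point]:
    """Index the reference database by (normalized street, number)."""
    return {(normalize(street), number): point for street, number, point in records}


def geocode_exact(
    index: Dict[Tuple[str, int], Point], street: str, number: int
) -> Optional[Point]:
    """Return the stored coordinates, or None when the address is unknown."""
    return index.get((normalize(street), number))


# Hypothetical reference records, for illustration only.
reference = [
    ("Rue Exemple", 12, (100.0, 200.0)),
    ("Rue Exemple", 14, (110.0, 200.0)),
]
index = build_index(reference)

print(geocode_exact(index, "  rue   EXEMPLE ", 14))  # (110.0, 200.0)
print(geocode_exact(index, "Rue Exemple", 99))       # None
```

In this form the matching step is a dictionary lookup, so no assumption about parity or number intervals is needed; the cost is that every address point must have been surveyed beforehand.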

B. Geocoding Algorithm Preconditions
In the widely prevailing case of the linear alternating numbering model, the geocoding algorithm begins by determining the street segment that corresponds to the descriptive list of the searched location. This first step uses the address number interval stored in the attributes of the street segments. Once the segment is determined, the address side (right or left) is deduced from the parity types (odd or even) of the numbers on either side of the segment and the parity of the number in the searched address. The exact position on the corresponding side is then calculated. Other parameters, such as the distance and angle from the segment and the offsets from its ends, can fine-tune the accuracy of the estimated position.
Other data can also improve this accuracy, such as information on the number of buildings on each side and their geographical distribution [36], [37]. For the geocoding algorithm to be effective, two main prerequisites must be satisfied by addresses in the field: consistent number intervals per street segment and consistent parity on each side. The uniformity of the distribution of buildings on each side of the street segment improves the accuracy of the calculated position.
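The steps described above (segment selection by number interval, side selection by parity, then interpolation with an optional lateral offset) can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm of any particular GIS product: segments are straight lines, numbers are assumed to be uniformly distributed along each side, and the `Segment` class, its field names and the `interpolate` function are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Segment:
    street: str
    start: Point
    end: Point
    left_range: Tuple[int, int]   # (first, last) number on the left side
    right_range: Tuple[int, int]  # (first, last) number on the right side


def _matches(number: int, rng: Tuple[int, int]) -> bool:
    """Number lies in the interval and shares the parity of that side."""
    lo, hi = min(rng), max(rng)
    return lo <= number <= hi and number % 2 == rng[0] % 2


def interpolate(
    segments: List[Segment], street: str, number: int, offset: float = 0.0
) -> Optional[Tuple[str, Point]]:
    """Return (side, estimated point) or None when no segment matches."""
    for seg in segments:
        if seg.street != street:
            continue
        for side, rng in (("left", seg.left_range), ("right", seg.right_range)):
            if not _matches(number, rng):
                continue
            first, last = rng
            # Fraction of the way along the segment (uniform spacing assumed).
            t = 0.5 if first == last else (number - first) / (last - first)
            dx, dy = seg.end[0] - seg.start[0], seg.end[1] - seg.start[1]
            x, y = seg.start[0] + t * dx, seg.start[1] + t * dy
            length = (dx * dx + dy * dy) ** 0.5
            if offset and length > 0:
                # Shift perpendicular to the segment, towards the chosen side.
                sign = 1.0 if side == "left" else -1.0
                x += sign * offset * (-dy) / length
                y += sign * offset * dx / length
            return side, (x, y)
    return None


# Hypothetical segment: odd numbers 1..21 on the left, even 2..22 on the right.
segments = [Segment("Rue Exemple", (0.0, 0.0), (100.0, 0.0), (1, 21), (2, 22))]

print(interpolate(segments, "Rue Exemple", 11, offset=5.0))  # ('left', (50.0, 5.0))
print(interpolate(segments, "Rue Exemple", 12, offset=5.0))  # ('right', (50.0, -5.0))
print(interpolate(segments, "Rue Exemple", 40))              # None
```

The sketch makes the two prerequisites visible: if parity is mixed on one side, `_matches` selects the wrong side or none at all, and if number intervals overlap between segments of the same street, the first matching segment wins regardless of where the building actually stands.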

IV. METHODOLOGY
This section evaluates the main prerequisites of geocoding, namely the parity and intervals of address numbers, in the city of Casablanca.

A. Study Area and used Data
The main data used are the streets shapefile and the address locators collected in the field as part of the private initiative project in Casablanca, presented in Section II. This database of 63,833 address locators, covering the communes of Anfa, Maarif, Mers Sultan, Sidi Belyout and El Fida, represents approximately 15% of the total number of address locators in Grand Casablanca (Fig. 6). Street segments are generated from the streets shapefile using GIS software. Then, for each address point, the relative position (left or right) with respect to its street segment is calculated. Finally, the geocoding parameters are calculated for each street segment.
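The source does not state how the GIS software computes the relative position of an address point. One common way to do it, shown here as an assumption, is the sign of the 2D cross product between the segment direction and the vector from the segment start to the point. The function name `side_of_segment` is hypothetical, and planar coordinates with a straight segment are assumed.

```python
from typing import Tuple

Point = Tuple[float, float]


def side_of_segment(start: Point, end: Point, point: Point) -> str:
    """Classify a point as 'left', 'right' or 'on' relative to start -> end."""
    cross = (end[0] - start[0]) * (point[1] - start[1]) - (
        end[1] - start[1]
    ) * (point[0] - start[0])
    if cross > 0:
        return "left"
    if cross < 0:
        return "right"
    return "on"


# A segment digitized from west to east: north is on the left.
print(side_of_segment((0.0, 0.0), (10.0, 0.0), (5.0, 3.0)))   # left
print(side_of_segment((0.0, 0.0), (10.0, 0.0), (5.0, -3.0)))  # right
```

The result depends on the digitizing direction of the segment, so left and right swap if the same street is stored from east to west.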

B. Geocoding Preconditions Analysis Method
To assess the validity of the main prerequisites of geocoding presented previously, two main questions need to be answered for each street segment: do its address numbers respect parity on each side, and do they fall within the number range of the segment?

Let $E_L$ and $E_R$ denote an even number located on the left and right side respectively, and $O_L$ and $O_R$ the same for an odd number. For an odd number and an even number belonging to a street segment $S$, the probability that the even number is on the left side and the odd number is on the right side is:

$P_S(E_L \text{ and } O_R) = P(E_L) \times P(O_R)$ (1)

Similarly:

$P_S(E_R \text{ and } O_L) = P(E_R) \times P(O_L)$ (2)

Since the street segments that meet the parity condition are those that have all odd numbers on one side and all even numbers on the other side, these segments are those verifying:

$P_S(E_L \text{ and } O_R) = 1$ or $P_S(E_R \text{ and } O_L) = 1$ (3)

Among the address locators belonging to such a street segment, those which meet the number range condition are those whose numbers fall exclusively within the range of this street segment (taking into account all segments belonging to the address's street).

Table III shows that 87% of street segments verify (3) and therefore fulfil the first condition: the consistency of the parity of the address numbers. The percentage of address locators belonging to these street segments is 83%, as shown in Table IV. Among these points, 61% have a number that belongs to one and only one range of segment numbers, and therefore also meet the second condition: the consistency of the ranges of numbers (Table V). If we consider all the address locators with a street toponym in the study area, the percentage of compliance with the two rules is around 53%. This percentage drops to only 47% if we consider all address locators, regardless of the type of toponym, since 6% of address locators are not linked to a street toponym.
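The parity and range conditions described above can be sketched in Python as follows. This is one possible reading of the two tests, not the authors' GIS workflow: each segment is reduced to a list of (number, side) pairs, the range of a segment is taken as the minimum and maximum of its observed numbers, and the function names and sample data are hypothetical.

```python
from typing import Dict, List, Tuple

Locator = Tuple[int, str]  # (address number, "left" or "right")


def parity_consistent(locators: List[Locator]) -> bool:
    """All numbers of one parity on one side, the other parity on the other."""
    left = {n % 2 for n, side in locators if side == "left"}
    right = {n % 2 for n, side in locators if side == "right"}
    return len(left) <= 1 and len(right) <= 1 and not (left & right)


def number_range(locators: List[Locator]) -> Tuple[int, int]:
    """Observed (min, max) address number of a segment."""
    numbers = [n for n, _ in locators]
    return min(numbers), max(numbers)


def range_consistent(number: int, street: Dict[str, List[Locator]]) -> bool:
    """True when the number falls in exactly one segment range of its street."""
    hits = 0
    for locators in street.values():
        lo, hi = number_range(locators)
        if lo <= number <= hi:
            hits += 1
    return hits == 1


# Hypothetical street made of two segments, for illustration only.
street = {
    "s1": [(1, "left"), (3, "left"), (2, "right"), (4, "right")],
    "s2": [(5, "left"), (7, "left"), (6, "right"), (3, "right")],
}

print(parity_consistent(street["s1"]))  # True
print(parity_consistent(street["s2"]))  # False: odd 3 sits on the even side
print(range_consistent(1, street))      # True: only s1 covers 1
print(range_consistent(3, street))      # False: s1 and s2 both cover 3
```

A segment with locators on a single side passes the parity test in this sketch as long as that side holds one parity only; whether the study treated such segments as compliant is not stated in the text.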

V. RESULTS DISCUSSION AND CONCLUSION
These results indicate that the conditions necessary for the use of standard geocoding, based on the linear alternating numbering model of Western standards, are not met in the Moroccan case. Thus, a national addressing system based on geocoding could not ensure the quality required by the applications that depend on it.
This problem can be explained by the lack of application of addressing standards in the field: temporary addresses allocated to housing development projects that become permanent, streets that are not assigned official names for long periods of time (years in some cases), and addressing services that lack resources and coordination in cities whose rapid expansion further complicates the situation.
That said, the problem is primarily organizational rather than technical. So, in order to set up a reliable reference addressing system, some solutions must be proposed urgently and others must be established over time.
Thus, in the short term, addressing campaigns like those of the World Bank in Africa must be carried out to overcome the lack of address references in the field and to promote the addressing of cities. Such campaigns will also make it possible to have up-to-date and more general data on the addressing situation throughout the country, since the latest available data concern only the city of Casablanca and already date back almost ten years. Addressing information systems must be built around comprehensive databases and not on interpolation algorithms as long as the addresses in the field do not comply with the standards.
Over time, the addressing standards themselves, as well as the address attribution procedures, must be improved. They must be integrated into the urbanization and management processes of the city from the early stages in order to avoid temporary, non-compliant solutions that last. This surely cannot be done without good governance, led by specialized organizations that collaborate with businesses and research institutions.