“Dr.J”: An Artificial Intelligence Powered Ultrasonography Breast Cancer Preliminary Screening Solution

Breast cancer has the highest incidence rate among all malignant tumors in women globally. Early detection through regular preliminary screening is critical to reducing the breast cancer fatality rate. However, the promotion of preliminary screening is limited by human diagnostic capacity, cost, and technical reliability in China and most of the world. To meet these challenges, we developed a solution named “Dr.J” that features an innovative division-of-labor model combining artificial intelligence (AI) with ultrasonography and cloud computing. “Dr.J” applies AI to process the real-time live video feed from ultrasonography, an imaging method that is physically safe and more suitable for Asian women. It automatically detects and highlights suspected breast cancer lesions and provides BI-RADS (Breast Imaging-Reporting and Data System) ratings to assist human diagnosis. “Dr.J” does not require its frontline operators to have a prior medical or IT background and thus significantly lowers the manpower threshold for promoting preliminary screening. Furthermore, its cloud computing platform stores detailed breast cancer data such as images and BI-RADS ratings for later use in medical treatment, research, and health management, and supports a tiered medical service network for this disease. “Dr.J” therefore significantly enhances the availability and accessibility of preliminary breast cancer screening at the grassroots level.

Keywords: Breast cancer preliminary screening; lesion detection; ultrasonography; artificial intelligence; deep learning; cloud computing; BI-RADS


I. BACKGROUND
Breast cancer is one of the most common tumor diseases among women and causes hundreds of thousands of deaths globally every year [1]. In 2017, China recorded nearly 270,000 new breast cancer cases, keeping breast cancer at the top of all malignant tumors found in Chinese women. The average age of Chinese female patients is 49, compared with over 60 in western countries. Moreover, incidence is growing in the younger age groups, and incidence in the urban population is twice as high as in the rural population. Women with higher education and income also show higher incidence. Breast cancer incidence among women aged 55-69 in China is projected to rise from fewer than 60 cases per 100,000 women to more than 100 cases per 100,000 women by 2021 [2], with an anticipated total of 2.5 million cases by 2021.
Although universal health coverage in China is growing quickly, low- and middle-income cancer patients are under increasing pressure. Incidence and mortality statistics are limited because the national cancer registries cover only about 13% of the population in China [3], in contrast with 96% in the USA and 32% in the European Union [4,5]. Breast cancer is the most common cancer in Chinese women, with an age-standardized rate (ASR) of 21.6 cases per 100,000 women reported by GLOBOCAN [6]. According to the Chinese National Cancer Registry, breast cancer is the most frequent cancer among urban women and the fourth most frequent in rural areas. In parts of central China, by contrast, the ASR for breast cancer is as low as 7.94 cases per 100,000 women. In 2008, 16.6% of breast cancer patients in China were aged 65 or older, in contrast with 42.6% in the USA.
It is estimated that by 2030, 27.0% of breast cancer patients [7] will be aged 65 or older. In Chinese women, breast cancer risk is strongly associated with the known risk factors for women in high-income countries [8]. As in western women, hormonal and reproductive factors, for example a long menstrual lifespan (determined mainly by a younger age at menarche and an older age at menopause), an older age at first live birth, nulliparity, and limited breastfeeding as a result of having fewer children, are associated with an increased risk of breast cancer in the Chinese population [9][10][11][12].
The benefit of mammography [13] remains uncertain for women younger than 50, yet 57% of Chinese patients with breast cancer [14] are aged up to 50. A national screening program for breast cancer was attempted in 2005 with the objective of screening 1000 women with both mammography and ultrasound, but it was ended because of a lack of funding and concern about false-positive [15] diagnoses. Findings from a study in Beijing [16] showed that only 5.2% of new cases were

II. INTRODUCTION
The survival rate for breast cancer is over 90% if it is found and treated in its early stages, which makes early detection key to reducing the leading cancer fatality rate in women. However, in China, only 15.7% of breast cancer patients were in BI-RADS category 1 when diagnosed, while 44.9% and 18.7% were in BI-RADS category 2 and 3 respectively. Chinese patients from lower income and social backgrounds were normally at BI-RADS category 3 or 4 when diagnosed, while those with higher income and social status were at BI-RADS category 1 or 2 [18].
The increase in the breast cancer fatality rate is alarming and highlights the urgent need for a preliminary breast cancer screening system for detection and diagnosis [19] that is available, accessible, and affordable to the general public at the community level. Current major screening and diagnosis methods include imaging, clinical breast examination, and tissue sampling. Imaging tools include ultrasonography, magnetic resonance imaging (MRI), X-ray mammography, and CT scanning. We chose ultrasonography as the medical examination tool for our integrated solution. We excluded clinical breast examination, tissue sampling, mammography, MRI, and CT scanning because of their inadequate sensitivity, invasiveness, radiation exposure, high false-positive rates, poor cost-effectiveness, or high hardware installation threshold. These disadvantages, alone or combined, hinder their promotion in community or grassroots scenarios.
Ultrasonography is physically safe, technically reliable, and cost-effective. Ultrasound imaging is neither invasive like tissue sampling nor a source of radiation exposure like mammography. When examining dense breasts, which are characteristic of Asian women, ultrasound penetrates better than mammography and thus supports better diagnostic accuracy. The procurement and maintenance costs and the operating requirements of ultrasonography are also lower than those of other imaging tools such as MRI and mammography. Among women aged up to 50, the accuracy of ultrasonography in detecting breast masses is 27% better than that of mammography [20,21].
However, ultrasonography has its limitations. Its image resolution is not as fine as that of other imaging tools, which creates a potential risk of missing small tumors. Its diagnostic accuracy is subject to human factors, including the individual radiologist's experience and physical and psychological state. In China, there is an enormous and seemingly unbridgeable gap between the very limited supply of breast cancer screening service, provided by fewer than 130,000 ultrasonography radiologists, and the demand from hundreds of millions of women, not to mention that these radiologists are also responsible for examining other diseases.
In the past two decades, significant research has been done to reduce the fatality rate of breast cancer and to increase the accuracy of detection systems such as computer-aided detection (CAD) systems [22]. CAD systems were introduced in different countries, especially the United States. However, their diagnostic effectiveness [23,24] depends on their detection algorithms and on the availability of databases for processing and storing the related medical images. A few such resources were developed in the past, such as the Digital Database for Screening Mammography (DDSM), which was completed in the late 1990s and became the main source for researchers performing image analysis of breast cancer. A comparable database of ultrasonography imaging for breast cancer is not yet available.

"Dr.J" is a game changer because it introduces an AI-empowered division-of-labor model for breast cancer screening. Ordinary frontline staff without an ultrasonography background, such as nurses or technicians, can perform the initial scanning of subjects while the AI makes the preliminary diagnosis, and subjects with suspected lesions detected by the AI are filtered out for remote diagnostic confirmation by human ultrasonography radiologists. "Dr.J" minimizes the risks of misdiagnosis and missed diagnosis by having the AI analyze the real-time ultrasound video feed at pixel level, instead of sampled static images. Its AI system can recommend categories for breast examination based on the Breast Imaging Reporting and Data System (BI-RADS), which was developed and published by the American College of Radiology (ACR) [25]. We also developed an automatic tracing system that ensures complete scanning of both breasts, takes a snapshot of the location where the ultrasound probe detects a lesion, and displays the location of the lesion on a clock position diagram. The operation of "Dr.J" is simple and does not require its users to have prior medical education or training.
Since the diagnosis is made by deep learning, its diagnosis service is reliable, stable, and almost unlimited in capacity. "Dr.J" is integrated with a cloud computing platform that enables remote data processing and real-time delivery of preliminary screening services. The platform also serves as a big data platform for storing the large volume of collected breast cancer screening data, such as patient information, lesion images, and BI-RADS classifications.
This innovative model significantly optimizes the use of the constrained resource of ultrasonography radiologists and enhances the availability and accessibility of breast cancer screening services to the public, especially in communities. It also brings considerable value through continuous improvement of the algorithms and neural networks, as well as through other in-depth uses such as public and personal health management. It is China's first AI ultrasonography preliminary screening solution for breast cancer and its first AI-powered regional tiered medical service network for breast cancer.

A. Development of "Dr.J"
The development of "Dr.J" involved deep learning, image processing, cloud computing, big data, computer vision, and object detection [26][27][28][29][30][31][32][33][34]. Fig. 1 shows the design model for its development. It follows three tracks of capacity building: training a deep neural network model for lesion detection and classification, real-time processing of the breast ultrasound video feed, and a cloud-based big data platform for sharing and storing information.

 Convolutional Neural Networks (CNNs)
AI-enabled detection and diagnosis of breast tumors are the essential value of the "Dr.J" solution. Applying deep learning models to medical image analysis has been a topic of profound interest for researchers. Advanced AI algorithms and neural network models can be critical to the development of a breast cancer screening system and to breakthroughs in machine detection and diagnosis of lesions. Deep learning and artificial neural networks have already matched or exceeded human performance in a number of high-profile contests, most notably Google's AlphaGo. Rapid technological innovation has made neural networks stronger and faster to train.
Convolutional neural networks (CNNs) can be applied to develop AI-based screening and diagnosis systems and have already proved effective in radiologists' working environments [35,36]. Research has been done using different image analysis algorithms based on deep CNNs [37,38] for image recognition and lesion detection in both mammography and ultrasonography [39][40][41]. In this study, we formalized the operation and features of ultrasound-based screening and applied CNNs to develop a set of algorithms for ultrasound image analysis that detect and diagnose breast findings including malignant tumors, benign tumors, cysts, and lymph nodes. The model was trained with over 60,000 breast ultrasound images and tested on numerous real patients. A two-phase clinical validation of "Dr.J" showed that the reliability and accuracy of its algorithms match those of human radiologists. To exclude the operational risk of missing parts of the breast target areas, we also applied our own patent-pending visual tracing technology, which uses a camera to automatically track the ultrasound probe and displays the scanned and remaining target areas on a monitor. Such comprehensive smart capabilities based on AI and other advanced information technologies can effectively empower staff in community clinics, where radiologist services are usually not available.

Along with training a neural network model for lesion detection, we also worked on its capability to process the video feed of breast ultrasound images in real time. The system needs to decode the live ultrasound video feed, which runs at 24-32 frames per second (FPS), for digital image processing and object detection, as shown in Fig. 2. The deep convolutional neural network then analyzes each decoded image and identifies and highlights malignant tumors, benign tumors, cysts, and lymph nodes. This real-time processing repeats until the breast scanning is finished.
Comprehensive coverage contributes significantly to the reliability of "Dr.J". The ultrasonography scanning of a subject normally takes five to ten minutes. During this process, at least 7,200 ultrasound images from the live video feed (five minutes at 24 FPS) are thoroughly processed in real time by "Dr.J". The neural network of "Dr.J" has strong generalization ability and can recognize and process ultrasound images from various machine models made by different manufacturers. In comparison, currently available AI-powered analysis of X-ray, CT, and MRI examines far fewer samples, fewer than 1,000 static images, in a non-real-time mode. Furthermore, the image analysis of "Dr.J" is conducted at the finest possible level, that of individual pixels. Together, these strengths maximize the coverage of the targeted areas and minimize the risk of missed diagnosis.
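The frame counts above follow from simple arithmetic, and the per-frame processing can be pictured as a loop over decoded frames. The following Python sketch is illustrative only: `frames_in_scan`, `process_feed`, and the `detect` callable are hypothetical names, and `detect` merely stands in for the trained CNN, whose actual interface is not described in this paper.

```python
from typing import Callable, Iterable, List, Tuple

# A detection is (label, x, y, width, height); labels follow the four
# finding types named in the text. The tuple layout is an assumption.
Detection = Tuple[str, int, int, int, int]


def frames_in_scan(minutes: float, fps: int) -> int:
    """Number of frames produced by a scan of the given length at a fixed frame rate."""
    return int(minutes * 60 * fps)


def process_feed(
    frames: Iterable[object],
    detect: Callable[[object], List[Detection]],
) -> List[Tuple[int, List[Detection]]]:
    """Run the detector on every decoded frame and keep the frames with findings."""
    flagged = []
    for index, frame in enumerate(frames):
        detections = detect(frame)  # hypothetical stand-in for the trained CNN
        if detections:
            flagged.append((index, detections))
    return flagged
```

At 24 FPS, a five-minute scan gives `frames_in_scan(5, 24)`, that is 7,200 frames, which matches the lower bound quoted above; a ten-minute scan at 32 FPS gives 19,200 frames.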

B. Training of Lesion Identification and Classification Model
The next model to train was the lesion identification and classification model. When processing the live video feed, the neural networks of "Dr.J" diagnose the detected lesions. They can identify four types of breast findings, namely malignant tumors, benign tumors, cysts, and lymph nodes. An essential capability of "Dr.J" is to detect tumors and make an initial judgment of their malignant or benign status, their cystic or solid status, and their BI-RADS classification. Detected lesions are highlighted with rectangles, each with a side label indicating its nature as judged by the AI. If the AI detects no lesion, the system treats the subject as healthy and raises no alert for further processing; if a lesion is diagnosed, the lesion classification function is performed in accordance with BI-RADS. "Dr.J" uses four BI-RADS categories, Category 1 to Category 4, depending on the malignant or benign status, cystic or solid status, and size of the lesions. The full BI-RADS assessment scale has seven categories: 0) Incomplete information to diagnose; 1) Negative, meaning healthy; 2) Benign finding(s); 3) Probably benign; 4) Suspicious abnormality; 5) Highly suggestive of malignancy; 6) Known biopsy-proven malignancy.
For the purpose of preliminary screening, we exclude Category 0 and combine Category 5 with Category 4: a subject classified as either Category 4 or Category 5 definitely needs further examination, so the distinction makes little difference in preliminary screening. We also exclude Category 6, since it is assigned only after biopsy. "Dr.J" alerts such findings for consideration of further processing, including radiologist review and tissue sampling. Fig. 3 shows the BI-RADS classification. The lesions detected in a subject are displayed with a description of their sizes and locations.

In the first phase of validation, shown in Fig. 4, subjects were examined with the same ultrasound equipment by the same radiologist. The live ultrasound video feed was analyzed by the radiologist and by the AI system of "Dr.J" simultaneously, the radiologist's results were compared with those of "Dr.J", and suspected positive results from both were verified through biopsy.

Fig. 5 represents the second phase of validation, in which breast ultrasonography examination was still conducted with the same ultrasound equipment. In the first step, a radiologist examined the subject; in the second step, a non-radiologist operator scanned the subject's breasts under the monitoring of "Dr.J", and the ultrasound data was analyzed by the AI system of "Dr.J". The radiologist's results were compared with those of "Dr.J", and suspected positive results from both were verified through biopsy.
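The reduction of the seven BI-RADS categories to the four used for preliminary screening (Category 0 and Category 6 excluded, Category 5 merged into Category 4) can be written as a small mapping. This is a minimal sketch for illustration; the names `BI_RADS` and `screening_category` are hypothetical and do not come from the "Dr.J" software.

```python
from typing import Optional

# The seven BI-RADS assessment categories as listed in the text.
BI_RADS = {
    0: "Incomplete information to diagnose",
    1: "Negative",
    2: "Benign finding(s)",
    3: "Probably benign",
    4: "Suspicious abnormality",
    5: "Highly suggestive of malignancy",
    6: "Known biopsy-proven malignancy",
}


def screening_category(bi_rads: int) -> Optional[int]:
    """Map a full BI-RADS category to the reduced 1-4 screening scale.

    Returns None for Category 0 and Category 6, which are excluded
    from preliminary screening.
    """
    if bi_rads not in BI_RADS:
        raise ValueError(f"unknown BI-RADS category: {bi_rads}")
    if bi_rads in (0, 6):
        return None
    return min(bi_rads, 4)  # Category 5 is merged into Category 4
```

Categories 1 to 4 map to themselves, so only the three special cases change the value.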

C. Big Data Platform Based on Cloud Computing
Another highlight of "Dr.J" is its big data capability, supported by cloud computing technologies for efficient and effective sharing and storage of breast health information. In the "Dr.J" solution, breast cancer screening data are stored and processed with cloud computing for data sharing and big data analysis. The information normally includes a subject's age, gender, name, ID number, and place of residence, the captured images of lesions, the types and locations of lesions, the recommended BI-RADS ratings of the breasts, and the time of the screening. The scope of collected data can be extended to other related information, such as family history of breast cancer, dietary preferences, and career background, to serve public and personal health management and scientific research. The number of images collected from a subject varies with the health condition of her breasts. The lesion information collected includes the locations and sizes of lesions (width, height, area). For healthy breasts, no image is captured. The captured images of lesions are saved in JPEG format. Such information can also be downloaded and printed from the system. Multiple lesions can be detected on one ultrasound image.
For each frame in which a lesion is detected, a group of three images is captured: the original ultrasound frame without any labeling by the AI system, the same frame with the labeling by the AI system, and the clock position diagram generated by the ultrasound probe tracing system at the moment of detection, which shows the positions of the lesions. This information helps physicians locate the lesions quickly in the follow-up examination. Supported by virtually unlimited storage space in the cloud, "Dr.J" can save a huge amount of information for future development, including image analysis and statistical analysis [30]. The database can efficiently filter the data, including the detected images, according to the desired query. The data can also serve the continuous deep learning of the various neural networks of "Dr.J". Convenient information sharing enabled by cloud computing gives radiologists remote access to the breast cancer screening data. The ultrasound images with lesions detected by "Dr.J" are uploaded with lossless compression to the data repository on the cloud, and, as a means of quality control, remote radiologists can download and verify these images. Fig. 6 shows a radiologist remotely examining images with suspected lesions as detected by "Dr.J" and displayed on her end.
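The three-image capture group and the clock position diagram described above can be pictured with a short Python sketch. Everything here is an assumption made for illustration: the paper does not specify how the probe tracing system computes the clock position, so `clock_position` simply converts a hypothetical probe offset from the nipple into a clock hour, and `LesionCapture` is a hypothetical record, not the actual "Dr.J" data model.

```python
import math
from dataclasses import dataclass


def clock_position(dx: float, dy: float) -> int:
    """Clock hour (1-12) for a probe offset measured from the nipple.

    Assumed convention: dy points toward 12 o'clock and dx toward
    3 o'clock, as seen on the diagram. A zero offset is reported as 12.
    """
    angle = math.degrees(math.atan2(dx, dy)) % 360  # clockwise from 12 o'clock
    hour = round(angle / 30) % 12                   # 30 degrees per clock hour
    return 12 if hour == 0 else hour


@dataclass(frozen=True)
class LesionCapture:
    """The group of three images saved for one frame with a detected lesion."""
    original_frame: str  # path to the unlabeled ultrasound frame
    labeled_frame: str   # path to the frame with the AI labeling
    clock_diagram: str   # path to the clock position diagram
    clock_hour: int      # 1-12, position of the lesion on the diagram
```

Storing the clock hour next to the three image paths keeps the position searchable without reopening the diagram image.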
After the preliminary screening, "Dr.J" automatically generates an AI Breast Cancer Preliminary Screening Report, as shown in Fig. 7, which displays the images with detected breast cancer lesions and the BI-RADS categories suggested by the system for both breasts. The report can be accessed on smart devices via WeChat, the most popular social media application in China. Fig. 8 is the workflow diagram of the solution. Before the breast cancer screening, the "Dr.J" system needs to be connected to the ultrasonography equipment in the healthcare institution to receive its ultrasound video feed.

D. "Dr.J" in Operation
During the screening, "Dr.J" analyzes the breast ultrasound video signal. When a lesion is detected, the ultrasound images are captured and labeled, compressed with a lossless algorithm, and uploaded to the data repository on the cloud. Remote radiologists or physicians can download the breast cancer screening data for quality control verification or referral for medical treatment.

IV. BREAKTHROUGHS ACHIEVED BY "DR.J"
"Dr.J" innovatively reshapes the division of labor in preliminary breast cancer screening and significantly enhances the availability and accessibility of this service in communities. Its AI-enabled diagnosis capability lowers the threshold for preliminary breast cancer screening and empowers frontline manpower. In the traditional screening model, breast scanning and diagnosis are conducted by the same radiologist, who has to screen all subjects equally, whether or not they have breast health issues, in order to filter out those who do. Such practice is not an effective use of the valuable time of ultrasonography professionals, who are in great shortage, especially when the majority of screening subjects are healthy and only very few have suspicious breast masses that need a radiologist's close attention. With the "Dr.J" solution, the labor-intensive preliminary breast scanning and upfront software operation can be performed by ordinary staff who are not required to have a medical or IT background. These staff can fulfill the frontline tasks after a few hours of training. The preliminary diagnosis of screening subjects is made by the solution's neural network, and only the cases with suspected positive results are sent to radiologists for further processing. "Dr.J" therefore maximizes the value of radiologists by focusing their professional diagnostic service on those who need it most.
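The effect of forwarding only suspected positive cases to radiologists can be illustrated with simple arithmetic. The sketch below is a hypothetical model, not a result from this study: `subjects_covered` and its parameters are illustrative names, and the numbers used with it are assumptions.

```python
def subjects_covered(reviews_per_day: int, flagged_fraction: float) -> float:
    """Screened subjects that one radiologist's daily reviews correspond to.

    flagged_fraction is the share of screened subjects forwarded for
    radiologist review; 1.0 models the traditional practice in which
    the radiologist examines every subject.
    """
    if not 0 < flagged_fraction <= 1:
        raise ValueError("flagged_fraction must be in (0, 1]")
    return reviews_per_day / flagged_fraction
```

With a hypothetical flagged fraction of 5%, a radiologist who reviews 40 cases a day would cover about 800 screened subjects, compared with 40 when every subject is examined personally. The gain is the reciprocal of the flagged fraction.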
Consequently, relative to the number of patients with breast health issues, the number of subjects that a radiologist can effectively cover multiplies compared with the traditional screening practice, as indicated in Fig. 9.

The "Dr.J" system makes it possible to establish a tiered medical network for breast cancer screening, diagnosis, treatment, recovery, and follow-up, as shown in Fig. 10. Such a network comprises community health service providers at the grassroots, regional centers at the higher level, and medical institutions such as general and specialty hospitals at the top. In such a network, preliminary screening can be conducted by ordinary staff at community clinics using "Dr.J". When a suspected tumor is detected, the community health service provider can refer the case, via the cloud computing platform of "Dr.J", to radiologists stationed in a regional center for remote or on-site diagnosis. When the positive result is confirmed, the patient can be referred to a general or specialty hospital for medical treatment. When the treatment is completed, the hospital can refer the patient back to a community clinic for rehabilitation, where the patient is monitored periodically with "Dr.J". The two layers of regional center and medical institution can be combined into one, depending on local resource availability and needs. "Dr.J" thus serves as a digital big data platform for efficient information storage and sharing across this complete screening-treatment-recovery ecosystem of breast health management.
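The referral path through the tiered network described above can be sketched as a small transition function. This is an illustrative sketch under stated assumptions: `Tier` and `refer` are hypothetical names, and the return of a case to the community clinic when the regional center does not confirm the finding is an assumption, since the text only describes the positive path.

```python
from enum import Enum


class Tier(Enum):
    COMMUNITY = "community clinic"
    REGIONAL = "regional center"
    HOSPITAL = "general or specialty hospital"


def refer(tier: Tier, positive: bool) -> Tier:
    """Tier that handles the case next.

    positive means 'suspected' at the community tier and 'confirmed'
    at the regional tier; it is ignored at the hospital tier.
    """
    if tier is Tier.COMMUNITY:
        return Tier.REGIONAL if positive else Tier.COMMUNITY
    if tier is Tier.REGIONAL:
        # Assumption: an unconfirmed case goes back to routine screening.
        return Tier.HOSPITAL if positive else Tier.COMMUNITY
    # After treatment, the hospital refers the patient back for follow-up.
    return Tier.COMMUNITY
```

Where the regional center and the hospital are combined into one layer, as the text allows, the two upper tiers would collapse into a single value.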
The "Dr.J" system can be used in diverse scenarios, including community health service centers (ordinary clinics), beauty parlors, mobile screening stations (ambulances), and community mobile breast cancer preliminary screening operators, as presented in Fig. 11. Experts can remotely access related information, such as images and suggested BI-RADS classifications, and provide their diagnostic feedback via the "Dr.J" cloud platform. "Dr.J" is available in both Chinese and English, so it can serve women not only in China but also in other parts of the world. Further information is available at http://www.800ai.com.

V. RESULTS
During the evaluation of the neural networks of "Dr.J", tens of thousands of captured breast cancer ultrasonography images were used, and the results show that the developed deep learning solution can efficiently identify the breast lesions of breast cancer patients. The data was collected from several hospital partners, including Duanzhou District Women and Children's Hospital, Guangdong, China. These images were labeled by radiologists, with the lesions of various breast problems highlighted.
To achieve useful sensitivity and specificity, training also involved distinguishing lesions from normal tissues such as fat and gland. For the training and testing of the neural networks, the images are in PNG format with a size of 300x300 pixels. The training also applies advanced computer vision and object detection techniques, and computation is performed at the pixel level of the images. Fig. 12 shows successful lesion detection by "Dr.J". Newly labeled images are constantly fed into the training of the detection model, whose accuracy and reliability have consistently improved with repeated training on more images. We captured all images together with the exact time and location during the screening process, as shown in Fig. 13.
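Sensitivity and specificity, mentioned above, are computed from the counts of true and false positives and negatives. The sketch below uses the standard definitions; the function names are illustrative, and any counts passed to them here are hypothetical, not results of this study.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of subjects with lesions that the detector flags: TP / (TP + FN)."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no positive cases to evaluate")
    return true_positives / total


def specificity(true_negatives: int, false_positives: int) -> float:
    """Share of healthy subjects that the detector clears: TN / (TN + FP)."""
    total = true_negatives + false_positives
    if total == 0:
        raise ValueError("no negative cases to evaluate")
    return true_negatives / total
```

Distinguishing lesions from normal tissue such as fat and gland matters for specificity in particular, because every normal structure flagged as a lesion counts as a false positive.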

VI. CONCLUSIONS
The combination of "Artificial intelligence + Internet + Cloud computing + Ultrasonography" can effectively break the bottleneck of insufficient large-scale screening capacity, address the uneven availability and quality of existing breast cancer screening services, and empower community clinic staff. It greatly optimizes the use of professional service resources by letting radiologists focus where they are most needed. It will effectively minimize the risks and costs that women face from exposure to and treatment of deadly breast cancer.
In the future, as the data processed by "Dr.J" grows, its detailed labeling of detected lesions, suggested BI-RADS classifications, neural network structure, and performance will be discussed in depth. Its accuracy and reliability will continue to improve, helping it serve most women. The "Dr.J" model can be applied not only in the fight against breast cancer but also to the preliminary screening of other diseases such as thyroid cancer.