Mammogram Segmentation Techniques: A Review

There has been significant development in computer-aided detection (CADe) and computer-aided diagnosis (CADx) systems in recent years. This development coincides with the evolution of computing power and the growth of data. CAD systems support the detection and diagnosis of significant diseases, including cancer. Breast cancer is one of the most prevalent cancers affecting women and a leading cause of death around the world. Early detection of breast cancer has a significant effect on treatment. A typical CAD system goes through several steps, including image segmentation, feature extraction, and image classification. Image segmentation plays an important role in CAD systems and simplifies further processing. This review explores popular mammogram segmentation techniques. A mammogram is a medical image produced by a low-dose X-ray system to show the inner tissues of the breast. Many techniques are used to segment medical images. These techniques can be grouped into five main categories: region-based methods, boundary-based methods, atlas-based methods, model-based methods, and deep learning. A ground truth image is needed to measure the performance of a segmentation algorithm. Different performance measurements are used to evaluate the segmentation process, including accuracy, precision, recall, F1 score, Hausdorff Distance, Jaccard Index, and Dice Index. Research in mammogram segmentation has yielded promising results, but there is room for improvement.
Keywords: Mammogram; medical imaging; segmentation; preprocessing; breast cancer


I. INTRODUCTION
Over the years, Artificial Intelligence (AI) algorithms have improved and now have an impact on every aspect of human life. In recent years, there has been significant development in machine learning techniques and high-performance computers, along with a massive increase in digital data in various fields. Diagnosing diseases through radiology is an important medical application of AI algorithms. Examples of this application are CADe and CADx systems. Clinicians and radiologists use CAD systems to assess patients' diagnostic images. Most CAD systems consist of the following steps: image preprocessing, segmentation, feature extraction, and classification. Many studies have used CAD systems to detect and diagnose breast cancer from medical imaging [1]. This review discusses different aspects of mammogram segmentation. The rest of the review is organized as follows: Section II provides background on medical imaging and mammograms. Section III describes some public mammogram datasets. Section IV explains the performance measurements used in mammogram segmentation. Section V discusses different segmentation techniques applied to mammogram images. Section VI discusses the studies mentioned in the review. The last section concludes the paper.

II. BACKGROUND
This section provides background about medical image analysis, breast cancer, and mammogram images.

A. Medical Image Analysis
Medical images are different from regular photos; they represent physical features measured from the human body. Therefore, the analysis of medical images must be guided by particular expectations and follow a medical reference. AI has been used in medicine since the 1980s [2], and its medical applications have expanded continuously since then. Nowadays, medical image analysis has become a branch of artificial intelligence, with its own books, academic journals, and conferences. There are various types of medical images, including X-ray imaging, magnetic resonance imaging, ultrasound, nuclear imaging, and optical microscopy. X-ray imaging uses electromagnetic waves with wavelengths much shorter than those of visible light (that is, with frequencies above the visible spectrum) to produce a diagnostically meaningful image. Fluoroscopy and angiography, Computed Tomography (CT), and mammography are kinds of X-ray imaging [1].

B. Breast Cancer
Breast cancer is a disease caused by an abnormal growth of breast cells [3]. It is one of the most prevalent cancers affecting women and a leading cause of death around the world [4]. Early detection of this disease increases the recovery rate significantly [5]. Three main types of examination are commonly used to detect breast cancer: 1) self-examination performed by the patient herself, 2) a clinical examination conducted by well-trained specialists, and 3) a radiology examination conducted by a radiologist using visual evaluation. Studies show that the most accurate radiologic procedure for early detection of breast cancer is the mammogram [4].

C. Mammogram Images
A mammogram is a medical image that shows the inner tissues of the breast using a low-dose X-ray system [3]. There are two imaging modalities of mammograms: digital mammography and screen-film mammography. Screen-film mammography (SFM) consists of conventional analog mammography films. SFM images usually contain labels and markers in the background, which are considered noise and need to be removed. Digital mammograms are also called Full-Field Digital Mammography (FFDM) images. FFDM is more recent and does not include labels [2]. Moreover, mammogram images can be found in several formats, including LJPG, DICOM, PGM, and TIFF. In the standard examination, two X-ray images are taken of each breast. Therefore, four images of both breasts need to be examined. These four images are called LEFT CC, LEFT MLO, RIGHT CC, and RIGHT MLO [6], [7]. The craniocaudal (CC) view is taken from above with the breast compressed horizontally (a head-to-foot picture). The CC view captures the medial portion and as much of the breast's outer lateral region as possible. The mediolateral oblique (MLO) view, a side view, captures the whole breast and usually contains the lymph nodes along with the pectoral muscle. Fig. 1(a) and 1(b) show examples of the CC and MLO views, and Fig. 2 illustrates the angle of each view [8].
520 | Page www.ijacsa.thesai.org

III. PUBLIC MAMMOGRAM DATASETS
There are several mammogram datasets publicly available. Following is a brief description of the most used datasets, which are referenced in studies cited in the review.

A. Mammographic Image Analysis Society (MIAS)
The Mammographic Image Analysis Society (MIAS) is a research group from the UK interested in studying mammograms. This group generated a small mammogram database in 1994 called mini-MIAS or MIAS for short. The mini-MIAS consists of 322 digitized films stored in the PGM image format. Every image has a resolution equal to 1024 × 1024 pixels [9].

B. Digital Database for Screening Mammography (DDSM)
The DDSM project is a collaborative effort between the Massachusetts General Hospital, the University of South Florida, and Sandia National Laboratories. The dataset includes 2620 cases. A case consists of between 6 and 10 files. These are an 'ics' file, an overview "16-bit PGM" file, four image files compressed with lossless JPEG encoding, and zero to four overlay files [10].

C. INbreast
INbreast is a full-field digital mammographic database. The cases were collected from Centro Hospitalar de S. João (CHSJ), Breast Centre, in Portugal, in 2011. The database includes 115 cases with a total of 410 images. The images have a resolution of 3328 × 4084 or 2560 × 3328 pixels and are saved in the DICOM format. The region of interest (ROI) was annotated by two specialists and stored in separate .roi and .xml files [11].

D. Breast Cancer Digital Repository (BCDR)
The IMED Project supported the creation of BCDR. The IMED project was supported by FMUP-CHSJ University of Porto, Portugal, INEGI, and CETA-CIEMAT Spain, from March 2009 until March 2013. The BCDR includes 1734 cases with mammography and ultrasound images. It also includes clinical history, mammogram lesion segmentation, and selected pre-computed image-based descriptors. The dataset is subdivided into the Full Field Digital Mammography-based Repository (BCDR-DM) and the Film Mammography-based Repository (BCDR-FM). Mammogram images are saved in the TIFF format. The BCDR-FM images have a resolution of 720 × 1168 pixels and a depth of 8 bits, while the BCDR-DM images have a resolution of 3328 × 4084 pixels and a depth of 14 bits [12].

E. Curated Breast Imaging Subset of DDSM (CBIS-DDSM)
CBIS-DDSM is an updated and standardized version of the Digital Database for Screening Mammography (DDSM), stored in the DICOM file format [13].

IV. PERFORMANCE MEASUREMENTS
There are several ways to measure the performance of a segmentation technique. If a ground truth image of the target area is available, then the Dice similarity coefficient or the Jaccard Index can be used. The Dice Similarity Coefficient (DSC or Dice) is equal to twice the number of elements common to both sets divided by the sum of the number of elements in each set. DSC is commonly used for auto-segmentation models and is computed by Equation (1):
$$\mathrm{DSC}(A, B) = \frac{2\,|A \cap B|}{|A| + |B|} \tag{1}$$
where A represents the segmented image produced by the algorithm, and B represents the ground truth image. The Jaccard Index, or Intersection over Union (IoU), is another similarity measurement. IoU computes the similarity of two sets, A and B, as the number of elements in the set intersection over the number of elements in the set union:
$$\mathrm{IoU}(A, B) = \frac{|A \cap B|}{|A \cup B|} \tag{2}$$
The Hausdorff Distance is also used to assess the performance of medical image analysis algorithms. This measurement is used when outliers need to be taken into account. The Hausdorff distance, h(A, B), is given by Equation (3):
$$h(A, B) = \max_{a \in A} \, \min_{b \in B} \, d(a, b) \tag{3}$$
where d(a, b) is the Euclidean distance between the points a and b [14].
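The three overlap and distance measurements described above can be sketched in a few lines of NumPy. This is a minimal illustration and not code from any of the cited works; the function names are hypothetical, the masks are assumed to be binary arrays of equal shape, and the Hausdorff function is the directed form h(A, B), computed by brute force over all pixel pairs.

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient of two binary masks."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    # Convention (assumption): two empty masks are treated as identical.
    return 2.0 * np.logical_and(a, b).sum() / total if total else 1.0

def iou(a, b):
    """Jaccard index (intersection over union) of two binary masks."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0

def directed_hausdorff(a, b):
    """Directed Hausdorff distance h(A, B) between foreground pixels."""
    pa = np.argwhere(np.asarray(a, dtype=bool))
    pb = np.argwhere(np.asarray(b, dtype=bool))
    if len(pa) == 0 or len(pb) == 0:
        raise ValueError("both masks need at least one foreground pixel")
    # Pairwise Euclidean distances, shape (len(pa), len(pb)).
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=2)
    # For each point of A take the nearest point of B, then the worst case.
    return d.min(axis=1).max()

# Toy example: two 2 x 2 squares shifted by one column.
seg = np.zeros((4, 4), dtype=bool)
seg[0:2, 0:2] = True
truth = np.zeros((4, 4), dtype=bool)
truth[0:2, 1:3] = True
print(dice(seg, truth), iou(seg, truth), directed_hausdorff(seg, truth))
```

The brute-force distance matrix is only practical for small masks or contour points; for full-resolution mammograms a spatial index or a library routine would be the more realistic choice.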
Other performance measurements used in CAD systems are accuracy, sensitivity, specificity, precision, recall, and F1 score. Accuracy is computed as:
$$\mathrm{Accuracy} = \frac{\text{Number of examples identified correctly}}{\text{Total number of examples}} \tag{4}$$
To compute precision and recall, the confusion matrix must be created first. A confusion matrix is a table used to describe the performance of a classification model. Table II illustrates the confusion matrix. Positive here means the target class; for example, in the breast cancer detection problem, the positive class is a cancer or abnormal mass, and the negative class is a mammogram with no cancer detected [15]. With the true positive (TP), false positive (FP), true negative (TN), and false negative (FN) counts taken from the confusion matrix, the remaining measurements follow their standard definitions:
$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \mathrm{Sensitivity} = \frac{TP}{TP + FN}$$
$$\mathrm{Specificity} = \frac{TN}{TN + FP}, \qquad F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$

V. MEDICAL IMAGING SEGMENTATION
Image segmentation aims to simplify further processing by partitioning the digital image into regions that share similar characteristics. Fig. 3 shows a block diagram of the standard CAD system. Many segmentation techniques are used to segment mammograms. These techniques can be categorized into five primary types: region-based methods, boundary-based methods, atlas-based methods, model-based methods [16], and deep learning [17], [5].

A. Region-based Segmentation
In region-based methods, segmentation is done based on similarities between regions. Thresholding, region growing, watershed, split and merge, and clustering are types of region-based segmentation methods [16].

1) Thresholding:
Thresholding is mostly used to separate an image into background and foreground objects. First, a specific value T is selected as the threshold based on the image histogram and local properties. All pixels below T are considered background, and all pixels equal to or greater than T are considered foreground. Multilevel thresholding, which uses more than one threshold value, can give better results.
The authors in [18] proposed a CAD system for detecting suspicious mass lesions in mammograms. The proposed system starts with three preprocessing steps. First, median filtering with a 3 × 3 window is used to remove noise. Second, morphological operations are applied to remove artifacts and the background. In the last preprocessing step, a single-seeded region-growing algorithm is used to remove the pectoral muscle. The second phase of the proposed CAD system detects masses using dual-stage adaptive thresholding. The performance was measured by sensitivity and false positives per image (FP/image). The evaluation was done on the DDSM and MIAS datasets. The results were sensitivity = 93% and FP/image = 0.84.
The work [19] proposed a hybrid approach based on Otsu's multi-thresholding and Watershed Segmentation (WSS) to mine the suspicious sections from mammograms. The authors used images from the MIAS dataset and measured the performance with several measurements, including Root Mean Square Error (RMSE) and Normalized Absolute Error (NAE). Different thresholding levels were tested; th = 4 gave the best results, with RMSE = 21.7732 and NAE = 0.2429.
The authors in [20] developed a fully automated pectoral muscle segmentation method. This method consists of four steps. First, a small rectangular region in the top-left corner of the mammogram is captured and enhanced using the fractional differential method. Second, a rough binary boundary of the pectoral muscle is segmented in the rectangular region using an improved iterative threshold method. Third, a rough contour is fitted with the least-squares method based on points of the rough boundary. Finally, a local active contour is used to acquire the final pectoral muscle segmentation line. The dataset consists of 720 MLO images, all of which are FFDM. The overall Dice coefficient of this method was 0.986 ± 0.005.
The authors in [21] proposed multilevel thresholding based on the electromagnetism optimization (EMO) technique to segment pectoral muscles. EMO is an evolutionary method that mimics the attraction-repulsion mechanism among charges to evolve the members of a population. The first step is to crop the mammogram image. The second step is extracting a region of interest (ROI) using the stepwise contrast limited adaptive histogram equalization (CLAHE) algorithm. The CLAHE method is also used to enhance contrast in mammogram images. The third step is to enhance the image using the histogram equalization technique. In the fourth step, the EMO algorithm is applied with the Otsu and Kapur objective functions. Finally, straight-line estimation is used to identify the pectoral muscle. This segmentation was tested on the MIAS dataset and gave an accuracy of 96.58%.
The work [22] proposed an adaptive hysteresis thresholding method to detect mammogram masses. This method was applied to the MIAS and DDSM datasets and gave sensitivities of 96.6% and 96.4%, respectively.
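As an illustration of the basic thresholding rule described in this subsection, the sketch below selects a single threshold T with Otsu's criterion (maximum between-class variance) and then labels pixels at or above T as foreground. It is a minimal single-level version written for this review, not the multilevel, adaptive, or optimization-based variants of the cited works; the function names are hypothetical, and the image is assumed to hold non-negative integer gray levels below `levels`.

```python
import numpy as np

def otsu_threshold(image, levels=256):
    """Return T such that pixels >= T form the foreground (Otsu's criterion)."""
    img = np.asarray(image)
    hist = np.bincount(img.ravel(), minlength=levels).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                      # P(pixel <= k)
    mu = np.cumsum(prob * np.arange(levels))     # partial mean up to level k
    mu_total = mu[-1]
    # Between-class variance, defined only where both classes are non-empty.
    sigma_b = np.zeros(levels)
    valid = (omega > 0.0) & (omega < 1.0)
    sigma_b[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / (
        omega[valid] * (1.0 - omega[valid])
    )
    k = int(np.argmax(sigma_b))                  # best split: {<= k} vs {> k}
    return k + 1

def apply_threshold(image, t):
    """Pixels below t are background (False); pixels >= t are foreground (True)."""
    return np.asarray(image) >= t

# Toy example with a dark background and a bright object.
toy = np.array([[10, 10, 200],
                [10, 200, 200]], dtype=np.uint8)
t = otsu_threshold(toy)
print(t, apply_threshold(toy, t))
```

A multilevel variant would search for several thresholds at once, which is where optimizers such as the EMO technique mentioned above become useful.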
2) Region growing:
In region-growing segmentation, the algorithm starts with seed points representing each class of the image (e.g., background and foreground classes). Each class grows according to the homogeneity of neighboring pixels; this process continues until homogeneous and connected regions are reached [23].
The work [24] proposed an automated mammogram segmentation based on region growing and a sliding window algorithm (SWA). First, the authors prepared the MIAS dataset by removing artifacts and labels using the opening morphological operator and a binary mask. Then they removed the pectoral muscle using SWA and segmented the mammogram ROI using the Dispersed Region Growing Algorithm (DRGA). The overall accuracy of this approach was 91.3%.
The authors in [25] proposed a pectoral muscle segmentation and tumor detection approach. This approach starts with the Otsu method to remove artifacts. The region-growing method is used to eliminate the pectoral muscle. Then the number of classes is estimated based on the LBP technique, and mammogram objects are classified using K-means clustering. Finally, the tumor is extracted by a hidden Markov model. The proposed approach was tested on the MIAS dataset, and the overall accuracy was 91.92%.
The work [26] applied an adaptive fuzzy region-growing algorithm to two private FFDM datasets to segment suspicious lesions and characterize them. The performance was measured by sensitivity and specificity, and the results were 91.67% and 58.33%, respectively. After detecting suspicious lesions, k-NN and SVM classifiers were used to classify masses as benign masses or malignant tumors. The k-NN and SVM classifiers achieved sensitivity = 84.44% and 85.56%, specificity = 91.11% and 91.67%, and FPsI = 0.54 and 0.55, respectively.
The authors in [27] proposed another method to detect lesion boundaries in mammogram images based on the region-growing algorithm. The MIAS dataset was used in this work. The performance was measured by accuracy, specificity, sensitivity, and overlap, and the results were 91%, 97%, 83%, and 79%, respectively.
The work [28] proposed an automatic breast cancer detection approach consisting of four main phases applied to the MIAS dataset. The first phase is preprocessing, which enhances images and removes noise using median filtering. The second phase is mammogram segmentation using region growing. The third phase is feature extraction. Finally, the classification phase uses an optimized fuzzy logic classifier. The performance was measured for the segmentation and classification phases. The segmentation accuracy was 0.98 (98%), and the fuzzy classifier accuracy was 0.91667 (about 91.7%).
The authors in [29] proposed a segmentation method based on region-growing techniques. The proposed method included four main steps and was applied to MIAS and DDSM. The first step was extracting the region of interest from mammogram images. In the second step, automatic thresholding was applied to binarize the image. The third step was determining the seed points automatically using the density of the pixel values. Finally, the threshold value for region creation in seeded region growing was calculated. The results show a Dice Similarity Coefficient (DSC) of 94.8 and 94.6 and a Relative Overlap (RO) of 90.2 and 89.8 for MIAS and DDSM, respectively.
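The general idea of single-seeded region growing can be sketched as a breadth-first traversal from the seed pixel. This is a minimal illustration under stated assumptions, not the algorithm of any cited work: the function name is hypothetical, 4-connectivity is assumed, and the homogeneity criterion is simply that a pixel's intensity lies within a fixed tolerance `tol` of the seed intensity.

```python
from collections import deque

import numpy as np

def region_grow(image, seed, tol):
    """Grow a connected region from `seed` (row, col) over similar pixels."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    seed_val = img[seed]
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Visit the four direct neighbors (assumed 4-connectivity).
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < h and 0 <= nc < w
            if inside and not mask[nr, nc] and abs(img[nr, nc] - seed_val) <= tol:
                mask[nr, nc] = True
                queue.append((nr, nc))
    return mask

# Toy example: the bright pixel at (2, 2) is similar but not connected.
toy = np.array([[100, 100, 10],
                [100, 10, 10],
                [10, 10, 100]])
print(region_grow(toy, seed=(0, 0), tol=5))
```

Comparing against the seed value keeps the sketch short; comparing against the running mean of the region is a common alternative that adapts as the region grows.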

3) Watershed:
The key idea behind using the watershed transform for segmentation is to transform the image into another image whose catchment basins are the target objects. The watershed algorithm is based on simple morphological operations [23].
The work [30] proposed a two-phase microcalcification segmentation approach. First, microcalcifications were detected using morphological operations. Second, the microcalcification shape was extracted using the watershed. This approach was applied to DDSM, and the overall Dice (similarity index) was 80.5%.
The authors in [31] proposed another segmentation approach based on the watershed algorithm. This approach consists of four stages. In the first stage, the ROI images were cropped to 200 × 200 pixels, and the background was removed. In the second stage, the Principal Component Analysis (PCA) method was applied to the cropped image to remove noise. In the third stage, Fuzzy C-Means (FCM) was applied to partition the ROI images into foreground and background clusters. The foreground includes the abnormality region, which is used in the final stage. In the last stage, marker-controlled watershed segmentation was performed with three different structuring elements: disk, diamond, and octagon shapes. The dataset used in this study was obtained from the National Cancer Society of Malaysia.
The work [32] proposed a semi-automatic segmentation of masses from mammogram images. The proposed approach includes three main stages. First, the median filter was applied to enhance image quality. Second, an initial segmentation was composed based on the Canny and watershed algorithms. Finally, the boundaries of tumors were extracted using the region-growing algorithm. The MIAS dataset was used, and the performance, measured by overlap value, was 81.3%.
4) Split and merge:
Split and merge relies on a quadtree structure: the image is split successively into quadrants based on a homogeneity criterion. Then similar regions are merged to create the segmented result.
The work [33] used a blended approach that combines a region-based method with the splitting and merging technique. The proposed approach is applied to MIAS mammogram images. First, morphological operations are used to remove noise from the images. Then, the splitting step is performed based on the region-growing method (seed points). Finally, in the merge step, the binary values are reconstructed to form a structured mammogram image. The structured image is then used to find the seed point and grown points. The performance was measured by five statistical parameters: mean = 0.0759, variance = 0.0702, entropy = 6.521, standard deviation = 0.2649, and correlation = 0.7869.
5) Clustering:
In clustering, pixels are grouped into clusters such that pixels in the same cluster are more similar to each other than to those in different clusters. The two types of clustering used in image segmentation are K-means clustering and fuzzy C-means clustering [23], [16].
a) K-means clustering: The K-means clustering algorithm starts by setting K centroid points (or pixel values). Each remaining pixel is then assigned to its closest cluster center. Based on the resulting clusters, the centroid of each cluster is recomputed. These two steps repeat until the algorithm meets the chosen stopping criterion. In segmentation, the value of K depends on the number of objects to be extracted.
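The two alternating K-means steps described above (assignment and centroid update) can be sketched for gray-level images as follows. This is an illustrative sketch only: the function name is hypothetical, clustering is done on pixel intensity alone, and the centroids are initialized at evenly spaced intensities between the image minimum and maximum, which is an assumption made here to keep the example deterministic.

```python
import numpy as np

def kmeans_intensity(image, k, iters=100):
    """Cluster pixel intensities into k groups; returns (label image, centroids)."""
    img = np.asarray(image, dtype=float)
    x = img.ravel()
    # Assumed initialization: evenly spaced intensities (not random seeds).
    centroids = np.linspace(x.min(), x.max(), k)
    labels = np.zeros(x.shape, dtype=int)
    for _ in range(iters):
        # Assignment step: each pixel joins its nearest centroid.
        labels = np.argmin(np.abs(x[:, None] - centroids[None, :]), axis=1)
        # Update step: each centroid becomes the mean of its members.
        new_centroids = centroids.copy()
        for j in range(k):
            members = x[labels == j]
            if members.size:
                new_centroids[j] = members.mean()
        converged = np.allclose(new_centroids, centroids)
        centroids = new_centroids
        if converged:
            break
    return labels.reshape(img.shape), centroids

# Toy example with two clearly separated intensity groups.
toy = np.array([[10, 12, 200],
                [11, 198, 202]])
labels, centers = kmeans_intensity(toy, k=2)
print(labels, centers)
```

In practice the feature vector per pixel often includes more than intensity (for example, texture descriptors), and the initial centroids are usually chosen randomly or with a seeding heuristic.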
b) Fuzzy C-means clustering: Fuzzy C-means (FCM) is a kind of clustering in which a single data point can belong to more than one cluster [34]. The authors in [35] applied the FCM and K-means clustering algorithms to the MIAS dataset to segment mammogram images. The performance was measured by accuracy. The accuracy of FCM was 94.12%, while the accuracy of K-means was 91.18%.
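To make the notion of soft membership concrete, the sketch below implements the standard FCM iteration on one-dimensional intensity values: memberships are computed from inverse distances to the cluster centers, and centers are recomputed as membership-weighted means. It is a minimal sketch and not the implementation used in [35]; the function name, the fuzzifier value m = 2, the evenly spaced initial centers, and the small constant added to distances are all assumptions made for illustration.

```python
import numpy as np

def fuzzy_c_means(values, c, m=2.0, iters=100, tol=1e-6):
    """Return (memberships of shape (c, n), cluster centers of shape (c,))."""
    x = np.asarray(values, dtype=float).ravel()
    centers = np.linspace(x.min(), x.max(), c)   # assumed initialization
    u = np.full((c, x.size), 1.0 / c)
    for _ in range(iters):
        # Distances to every center; the small constant avoids division by zero.
        dist = np.abs(x[None, :] - centers[:, None]) + 1e-12
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=0, keepdims=True)  # memberships sum to 1 per pixel
        um = u ** m
        new_centers = (um @ x) / um.sum(axis=1)   # membership-weighted means
        shift = np.max(np.abs(new_centers - centers))
        centers = new_centers
        if shift < tol:
            break
    return u, centers

# Toy example: each value receives a degree of membership in both clusters.
u, centers = fuzzy_c_means([10, 12, 11, 198, 200, 202], c=2)
print(centers)
print(u.round(3))
```

A hard segmentation can be obtained afterwards by assigning each pixel to the cluster with the largest membership, for example with `u.argmax(axis=0)`.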

B. Boundary-based Segmentation
Unlike region-based segmentation, boundary-based (or edge-based) segmentation depends on differences between regions. There is a variety of boundary-based segmentation techniques; Roberts, Sobel, Prewitt, Laplacian, and Canny edge detection are examples [23], [16]. The work [36] proposed a method that automatically segments the breast boundary and pectoral muscle in mediolateral oblique (MLO) views of mammograms. The proposed method consists of three main stages. The first stage removes noise from mammogram images by applying median and anisotropic diffusion filters. The second stage segments the mammogram using Canny edge detection. Finally, the overestimated boundary caused by artifacts is handled by a proposed post-processing stage. Three public datasets were used to evaluate this method: MIAS, INbreast, and BCDR. Experimental results show that the Dice similarity coefficients for breast boundary and pectoral muscle estimation were 98.8% and 97.8% for MIAS, 98.9% and 89.6% for INbreast, and 99.2% and 91.9% for BCDR, respectively.
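As a small illustration of edge-based segmentation, the sketch below computes the Sobel gradient magnitude and marks pixels whose magnitude reaches a threshold. It uses the Sobel operator named among the examples above, not the full Canny pipeline of [36] (which adds smoothing, non-maximum suppression, and hysteresis thresholding); the function name, the edge-replicating border handling, and the fixed threshold are assumptions.

```python
import numpy as np

def sobel_edges(image, thresh):
    """Return (boolean edge map, gradient magnitude) using 3 x 3 Sobel kernels."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")   # replicate border pixels
    kx = np.array([[-1.0, 0.0, 1.0],
                   [-2.0, 0.0, 2.0],
                   [-1.0, 0.0, 1.0]])      # horizontal intensity change
    ky = kx.T                              # vertical intensity change
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    # Accumulate the 3 x 3 weighted sums one kernel position at a time.
    for i in range(3):
        for j in range(3):
            window = padded[i:i + h, j:j + w]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    magnitude = np.hypot(gx, gy)
    return magnitude >= thresh, magnitude

# Toy example: a vertical step edge between columns 2 and 3.
toy = np.zeros((4, 6))
toy[:, 3:] = 100.0
edges, mag = sobel_edges(toy, thresh=200.0)
print(edges.astype(int))
```

Because the gradient responds to differences between neighboring pixels, both columns adjacent to the step are marked; thinning the result to a one-pixel contour is what non-maximum suppression does in the Canny method.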

C. Atlas-based Segmentation
Atlas-based segmentation aims to extract the relevant anatomy from medical images and to present it in an appropriate view [37]. The atlas-based approach is suitable for segmenting images in which the association between regions and pixel intensities is unclear [38]. The authors in [14] proposed an atlas-based algorithm to segment the breast area from mammogram images. The preprocessing step standardizes the mammogram images by flipping the left breast mammograms so that all mammograms have the same orientation. The images are then made square by padding on the left and right, after which the padding is removed and the breast region is determined. The main algorithm consists of two stages. In the first stage, a set of atlas mammogram images is selected using the K-means clustering algorithm. The number of clusters is determined by applying a 2D projection using t-distributed Stochastic Neighbor Embedding (t-SNE). In the second stage, the atlas mammogram images are used with a deformable registration algorithm to segment the images. The algorithm was tested on the mini-MIAS and DDSM datasets. The reported performance was Hausdorff Distance = 13.34 and Jaccard Index = 0.94.

D. Model-based Segmentation
Model-based segmentation methods, also described in terms of energy functions, are based on deformable models. Deformable models can be defined as curves that deform under external or internal forces [16]. This group of segmentation techniques can integrate high-level knowledge with information from low-level image processing. There are two classes of deformable models: parametric and non-parametric. The parametric deformable model is also called the active contour model.
The work [39] proposed a bimodal level-set formulation-based approach for mammogram segmentation. The authors used the mini-MIAS dataset and drew ground truths manually using a hand-based polygonal tool. Compared with the Chan-Vese and Zhang models, the proposed approach achieved an average precision of 0.9448 and an average recall of 0.975 within only 4-6 iterations, while the other two models required more than 60 iterations to reach such results.
The authors in [40] proposed a preprocessing method for mammogram CAD systems that includes pectoral muscle segmentation. The proposed method consists of four phases. First, noise is removed using median and mean filters. Second, image quality is enhanced using the CLAHE algorithm. Third, radiopaque artifacts and labels present in mammograms are removed by applying thresholding and morphological operations. Finally, active contours are used to remove the pectoral muscle. This preprocessing approach was tested on two datasets, mini-MIAS and INbreast. The results show accuracies of 90% and 98.75% for mini-MIAS and INbreast, respectively.
The work [41] provided an automatic mammogram image segmentation approach based on the Chan-Vese model. The target ROI consists of three classes. The mass class contains pixels pertaining to the mass. The background class includes background pixels that do not pertain to the mass class. The remaining pixels, which separate the mass from the background, belong to the contour class. The proposed approach consists of a contour initialization step, a fuzzy contour estimation step, and a contour optimization step. In the contour initialization step, gamma correction was used to improve the image contrast; then, Otsu thresholding was applied to binarize the mass region. The fuzzy contour estimation step aims to refine the initial contour. The last step is contour optimization using the Chan-Vese model. The accuracy of the proposed method was 93.96%, while precision and recall were 88.08% and 91.12%, respectively.

E. Deep Learning
Deep learning is an artificial intelligence technique that can learn patterns from raw data [42]. A typical deep learning model is the artificial neural network (ANN), such as the multi-layer perceptron (MLP). An ANN is a mathematical model developed to mimic the operations of the human neurophysiological structure [43].
The authors in [44] proposed a neural network framework to deal with complex shape variations of the pectoral muscle boundary in mammogram segmentation. This framework consists of a convolutional neural network inspired by the Holistically Nested Edge Detection network (HND). The main benefit of HND is that it can deal with edge and boundary ambiguities of the object. The performance of this approach was measured by computing the Jaccard and Dice metrics. Four public datasets were used in the study: MIAS, INbreast, BCDR, and CBIS-DDSM. On average, the Jaccard index was 94.6% and the Dice similarity was 97.5%.
The work [45] proposed automated mass segmentation from mammograms based on a multi-level nested pyramid network (MNPNet). The proposed MNPNet is divided into three sections and employs an encoder-decoder framework. In the first section, atrous spatial pyramid pooling (ASPP) module encoding was used to address intra-class inconsistency. In the second section, the multi-level feature pyramid produced by the CNN was used to address inter-class indistinction. In the last section, different ResNet structures were used to perform feature extraction, and ResNet34 performed best. The proposed segmentation was applied to INbreast and DDSM-BCRP and achieved Dice indices of 91.10% and 91.69%, respectively.
The authors in [46] proposed an approach to segment breast tumors within the ROI of mammograms using a conditional Generative Adversarial Network (cGAN). They tested the model on two datasets: INbreast and a private dataset from Hospital Sant Joan de Reus. The cGAN learns a complex pattern from simple data and has two subnetworks, a generative network and an adversarial network. The generative network recognizes the tumor area and generates the binary mask that delineates it, while the adversarial network distinguishes between real (i.e., true) and synthetic segmentations. The performance of the cGAN model on the mammogram segmentation task was measured using the Dice and intersection over union (IoU) metrics. For the INbreast dataset, the Dice and IoU on the full mammogram image were 68.69% and 52.31%, respectively. After the mask images are generated, they are fed to a Convolutional Neural Network (CNN) that classifies the tumor into one of four types: irregular, lobular, oval, and round. The overall accuracy of the CNN classifier was 80%.
The work [47] proposed an automated segmentation to detect microcalcification in mammograms. This approach consists of five steps: image enhancement, removing the skin and air boundary, segmenting the pectoral region, selecting suspicious regions, and U-Net segmentation. First, the Laplacian filter was applied to enhance mammogram images. Second, skin and air boundaries were removed using horizontal line fitting and image erosion. Third, the breast region was segmented from the pectoral region by K-means pixel-wise clustering. Fourth, suspicious regions were selected using the fuzzy C-means clustering algorithm and were divided into positive and negative patches. Fifth, the U-Net was trained on the positive patches from the previous step. Finally, the trained U-Net was applied to segment the microcalcification regions automatically from mammograms. This approach was applied to the DDSM dataset and achieved F-measure = 98.5%, Dice score = 97.8%, and Jaccard index = 97.4%.
The authors in [48] used the Dense U-Net algorithm to segment suspicious breast masses from mammograms. The evaluation was done on the DDSM dataset. The performance was measured by F1 score, sensitivity, specificity, and overall accuracy, with results of 82%, 77.89%, 84.69%, and 78.38%, respectively.
The work [49] also used U-Net to detect masses in mammograms. Moreover, the authors used different data augmentation techniques, such as image zoom, extraction of nine regions of interest, and horizontal reversal. The training and evaluation were done on the DDSM dataset. The overall accuracy was 85.95% and the Dice was 79.39%.

VI. DISCUSSION
The investigations mentioned in this review were selected to meet the following criteria:
1) The date of publication should be after 2016.
2) The work should be published in an ISI journal or by the ACM or IEEE.
3) The work should contain precise quantitative performance results.
Several segmentation techniques were discussed in this review. Table III summarizes them. As seen from Table III, segmentation has different targets. Some segmentation techniques aim to detect suspicious lesions (masses or tumors). Other methods aim to remove the background or pectoral muscles. Moreover, some segmentation techniques target microcalcification. Table III categorizes the segmentation techniques based on the target area. Table IV lists the preprocessing techniques used with each segmentation. The preprocessing techniques include filtering, applying morphological operations, performing data augmentation, and some other enhancement methods. The median filter, Gaussian filter, Bayesian non-local mean filter, anisotropic diffusion filter, and Laplacian filter were used to enhance mammogram images. Morphological operations were also used in most of the works mentioned. Data augmentation was mostly used with deep learning-based segmentation techniques. Moreover, intensity and gamma transformations and the CLAHE method were used in some papers [18], [49], [21], [40]. Thresholding is mostly used as a segmentation technique, but it can also be used as a preprocessing step to remove unwanted regions in mass detection techniques [26], [41]. Principal component analysis (PCA) was also used to denoise images [31]. To simplify processing, many researchers resize or crop the images in the preprocessing step.
Some papers combine different segmentation techniques, such as watershed and region growing. Among the mass segmentation techniques, improved region growing [28] gave the highest accuracy, while Canny edge detection [36] outperformed other approaches in pectoral muscle segmentation. Although studies in mammogram segmentation have yielded good results, there is room for improvement.
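Median filtering recurs as a noise-removal step in several of the reviewed works (for example, the 3 × 3 window used in [18]). A minimal sketch of a 3 × 3 median filter is shown below; the function name and the edge-replicating border handling are assumptions made for illustration, not details taken from those papers.

```python
import numpy as np

def median_filter_3x3(image):
    """Replace each pixel by the median of its 3 x 3 neighborhood."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")   # replicate border pixels
    # Stack the nine shifted views so the median runs along axis 0.
    neighbors = np.stack(
        [padded[i:i + h, j:j + w] for i in range(3) for j in range(3)],
        axis=0,
    )
    return np.median(neighbors, axis=0)

# Toy example: a single bright outlier in a flat region is removed.
toy = np.full((5, 5), 50.0)
toy[2, 2] = 255.0
print(median_filter_3x3(toy))
```

Unlike a mean filter, the median discards isolated outliers instead of spreading them into neighboring pixels, which is one reason it is a common choice for impulse-like noise.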

VII. CONCLUSION
This review has covered mammogram segmentation techniques in detail. First, we provided an overview of medical image analysis and described mammogram images. Then we gave a brief description of the MIAS, DDSM, INbreast, BCDR, and CBIS-DDSM datasets. We discussed region-based segmentation, boundary-based segmentation, atlas-based segmentation, model-based segmentation, and deep learning approaches to segmentation, and we gave examples from recent papers for each of these techniques. We also explained the most-used performance measurements in the segmentation process. Finally, we summarized different mammogram segmentation works in Table III, including the preprocessing step, dataset(s), and performance results for each one.