
Improving Breast Cancer Diagnosis using Mask RCNN

Nguyen Duc Thang *, Nguyen Hoang Phuong*, Nguyen Viet Dung **, Tran Vinh Duc**, Anh Nguyen***

* Thang Long University
Nghiêm Xuân Yêm Rd., Hoang Mai District, Hanoi, Vietnam
{[email protected]; [email protected]}
**Hanoi University of Technology, Hanoi, Vietnam
{[email protected]; [email protected]}
*** Auburn University, Auburn, AL, USA
{[email protected]}

Abstract – In this paper, we propose an approach to detecting breast cancer on screening mammograms based on the Mask region-based convolutional neural network (Mask R-CNN), which extends Faster R-CNN with a ResNet-50 model as the classifier by adding a branch that predicts an object mask in parallel with the existing branch for bounding-box recognition. The proposed method is evaluated on the CBIS-DDSM dataset, which contains four views of screened breasts (L/R-CC, L/R-MLO) from 1,240 patients. Our proposed method achieves strong results, with a per-image AUC score of 0.85; while some other methods report higher scores, they only localize tumors and classify them as normal or malignant, whereas our method additionally segments each tumor instance.

Keywords – Breast cancer diagnosis, mammography, convolutional neural networks, deep learning, Mask R-CNN, segmentation, localization.

1. INTRODUCTION
Recent studies show that breast cancer was the leading cause of death among women in Vietnam aged between 45 and 55 in 2017. According to the "World Health Statistics 2012" published by the WHO, approximately 11,060 cases of female breast cancer were diagnosed in Vietnam, an increase of 30% compared to 2002, with 64.7% of the cases occurring below age 50. With the increasing incidence of this disease, early detection of breast cancer, leading to early treatment, has contributed to a reduction in breast cancer mortality rates. Screening mammography is one of the most common tools for early detection of breast cancer and has become significantly more commonplace. However, screening mammography has low sensitivity, which means that a small fraction of breast cancer cases remains undetected.

Therefore, for the early detection of breast cancer, we propose a computer-aided diagnosis (CAD) system designed to be used as a "second opinion" that helps the radiologist give more accurate diagnoses. Such a system typically relies on machine learning and deep learning techniques to detect tumors in digital mammogram images, and recent advances in deep learning have made this approach increasingly practical for breast cancer diagnosis.

Many recent studies have shown the potential of applying such networks to medical imaging, including breast screening mammography. However, those studies treated the task purely as an object detection problem (localization, recognition, and classification) and did not address semantic or instance segmentation.

To address these issues, we augment the dataset, which improves the accuracy rate, and we implement Mask R-CNN to generate a bounding box and a segmentation mask for each tumor instance in the image. Mask R-CNN is built on a Feature Pyramid Network (FPN) with a ResNet backbone.

2. THE DATA
a). Collection
Our dataset comes from the Curated Breast Imaging Subset of DDSM (CBIS-DDSM), which is an updated and standardized version of the Digital Database for Screening Mammography (DDSM). The CBIS-DDSM is a database of 1,566 patients, each including both the mediolateral oblique (MLO) and craniocaudal (CC) views of each breast. It contains normal, benign, and malignant cases with verified pathology information. The images have been decompressed and converted to DICOM format. Updated ROI segmentations, bounding boxes, and pathologic diagnoses for the training data are also included, as shown in Figure 1.
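Since the CBIS-DDSM images are distributed as DICOM files, they must first be decoded into pixel arrays before any preprocessing. The loader below is a minimal sketch using the pydicom library; it is our illustration rather than the authors' exact code, and the min-max normalization is an assumption.

import numpy as np
import pydicom  # pip install pydicom

def load_mammogram(path):
    """Read a DICOM mammogram and rescale it to 8-bit grayscale."""
    ds = pydicom.dcmread(path)                   # parse the DICOM file
    img = ds.pixel_array.astype(np.float32)      # raw detector values
    img = (img - img.min()) / (img.max() - img.min() + 1e-8)  # min-max normalize (our choice)
    return (img * 255).astype(np.uint8)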

b). Data preprocessing
All the images were preprocessed as follows (a sketch of the first two steps appears after the list):
– Noise removal using morphological operations and a median filter.

– Label removal.

– Contrast-enhanced digital mammography.

– Pectoral muscle suppression.
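The following is a minimal sketch of the noise-removal and label-removal steps above, assuming OpenCV; the kernel sizes, the threshold, and the largest-connected-component heuristic are illustrative assumptions, not the authors' exact settings.

import cv2
import numpy as np

def remove_noise_and_labels(img):
    """img: 8-bit grayscale mammogram."""
    img = cv2.medianBlur(img, 5)  # median filter suppresses salt-and-pepper noise
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
    opened = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)  # drop thin artifacts
    # Label removal: keep only the largest connected component (the breast region).
    _, mask = cv2.threshold(opened, 10, 255, cv2.THRESH_BINARY)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask, connectivity=8)
    largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])  # skip background label 0
    return np.where(labels == largest, img, 0).astype(np.uint8)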
c). Patch generation
We only consider mammograms containing masses, which are the most common type of malignant breast tumor; this leaves 1,318 mammogram images from the total of 1,566 patients. These images are high-dimensional (from 2000×2000 to 4000×3000 pixels) and high-resolution, while the tumorous regions (ROIs) are typically a small portion, representing approximately 1% of the overall mammogram. We therefore divided the original mammograms into 256×256 patches using a sliding window and sorted each patch into a background, malignant, or normal folder, giving 2,636 patches (1,318 patches containing mass tumors and 1,318 background patches without any tumor). The extraction step is sketched below.
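As an illustration of the patch-generation step, here is a minimal sliding-window extractor; the paper does not state its stride, so a non-overlapping stride of 256 is our assumption.

import numpy as np

def extract_patches(img, size=256, stride=256):
    """Yield (y, x, patch) for every full size×size window of img."""
    h, w = img.shape[:2]
    for y in range(0, h - size + 1, stride):
        for x in range(0, w - size + 1, stride):
            yield y, x, img[y:y + size, x:x + size]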

d). Data Augmentation
In deep learning techniques, the neural network models need to learn a large number of parameters. The chance of overfitting the training data increases due to the model complexity. Augmentation of data is an upright way to avoid this action. Some augmentation techniques that are popularly used like flipping, rotation, scale, crop, translation, Gaussian noise.
We applied 5 rotations by angles (0, 45, 135, 225, 315) and outward by (-20%, -10%, 0%,10%, 20%) and then flipped all of those images horizontally and vertically and created a total of origin dataset from 2636 to 131800 cases. The dataset was performed an 80% – 10% – 10% split into training set, validation set, test set: 105440, 13180 and 13180, respectively.
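A sketch of this augmentation scheme (5 rotations × 5 scales, each flipped horizontally and vertically, i.e., 50 variants per patch, giving 2,636 × 50 = 131,800 cases) is shown below, assuming OpenCV; the border handling is our choice and is not specified in the paper.

import cv2

ANGLES = (0, 45, 135, 225, 315)
SCALES = (0.8, 0.9, 1.0, 1.1, 1.2)  # -20% ... +20%

def augment(patch):
    h, w = patch.shape[:2]
    out = []
    for angle in ANGLES:
        for scale in SCALES:
            M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, scale)
            rot = cv2.warpAffine(patch, M, (w, h), borderMode=cv2.BORDER_REFLECT)
            out.append(cv2.flip(rot, 1))  # horizontal flip
            out.append(cv2.flip(rot, 0))  # vertical flip
    return out                             # 5 × 5 × 2 = 50 variants per patch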

3. METHODS
a). Mask R-CNN (Mask region-based convolutional neural network) was introduced in 2017 as an extension of Faster R-CNN. Faster R-CNN is one of the most popular frameworks for object detection, and Mask R-CNN extends it with instance segmentation. Mask R-CNN consists of two stages: the first stage scans the image and generates proposals, i.e., regions likely to contain an object; the second stage classifies the proposals, refines the bounding boxes, and predicts object masks.

We derive our Mask R-CNN architecture from the Feature Pyramid Network (FPN) variant of Mask R-CNN. A Feature Pyramid Network extracts features at different scales based on their levels in the feature pyramid. The ResNet-FPN backbone provides good accuracy without sacrificing speed. Our model has over 63 million trainable parameters and over 111k non-trainable parameters. The overall architecture of our model is summarized in the layer listing below.

__________________________________________________________________________________________________
Layer (type)                        Output Shape             Param #     Connected to
==================================================================================================
input_image (InputLayer)            (None, 256, 256, 3)      0
__________________________________________________________________________________________________
zero_padding2d_1 (ZeroPadding2D)    (None, 262, 262, 3)      0           input_image[0][0]
__________________________________________________________________________________________________
conv1 (Conv2D)                      (None, 128, 128, 64)     9472        zero_padding2d_1[0][0]
__________________________________________________________________________________________________
bn_conv1 (BatchNorm)                (None, 128, 128, 64)     256         conv1[0][0]
__________________________________________________________________________________________________
activation_1 (Activation)           (None, 128, 128, 64)     0           bn_conv1[0][0]
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)      (None, 64, 64, 64)       0           activation_1[0][0]
__________________________________________________________________________________________________
res2a_branch2a (Conv2D)             (None, 64, 64, 64)       4160        max_pooling2d_1[0][0]
__________________________________________________________________________________________________
bn2a_branch2a (BatchNorm)           (None, 64, 64, 64)       256         res2a_branch2a[0][0]
__________________________________________________________________________________________________
activation_2 (Activation)           (None, 64, 64, 64)       0           bn2a_branch2a[0][0]
__________________________________________________________________________________________________
res2a_branch2b (Conv2D)             (None, 64, 64, 64)       36928       activation_2[0][0]
__________________________________________________________________________________________________
bn2a_branch2b (BatchNorm)           (None, 64, 64, 64)       256         res2a_branch2b[0][0]
__________________________________________________________________________________________________
activation_3 (Activation)           (None, 64, 64, 64)       0           bn2a_branch2b[0][0]
__________________________________________________________________________________________________
res2a_branch2c (Conv2D)             (None, 64, 64, 256)      16640       activation_3[0][0]
__________________________________________________________________________________________________
res2a_branch1 (Conv2D)              (None, 64, 64, 256)      16640       max_pooling2d_1[0][0]
__________________________________________________________________________________________________
bn2a_branch2c (BatchNorm)           (None, 64, 64, 256)      1024        res2a_branch2c[0][0]
__________________________________________________________________________________________________
bn2a_branch1 (BatchNorm)            (None, 64, 64, 256)      1024        res2a_branch1[0][0]
__________________________________________________________________________________________________
add_1 (Add)                         (None, 64, 64, 256)      0           bn2a_branch2c[0][0]
                                                                         bn2a_branch1[0][0]
__________________________________________________________________________________________________
res2a_out (Activation)              (None, 64, 64, 256)      0           add_1[0][0]
__________________________________________________________________________________________________
res2b_branch2a (Conv2D)             (None, 64, 64, 64)       16448       res2a_out[0][0]
__________________________________________________________________________________________________
bn2b_branch2a (BatchNorm)           (None, 64, 64, 64)       256         res2b_branch2a[0][0]
__________________________________________________________________________________________________
activation_4 (Activation)           (None, 64, 64, 64)       0           bn2b_branch2a[0][0]
__________________________________________________________________________________________________
res2b_branch2b (Conv2D)             (None, 64, 64, 64)       36928       activation_4[0][0]
__________________________________________________________________________________________________
bn2b_branch2b (BatchNorm)           (None, 64, 64, 64)       256         res2b_branch2b[0][0]
__________________________________________________________________________________________________
activation_5 (Activation)           (None, 64, 64, 64)       0           bn2b_branch2b[0][0]
__________________________________________________________________________________________________
res2b_branch2c (Conv2D)             (None, 64, 64, 256)      16640       activation_5[0][0]
__________________________________________________________________________________________________
bn2b_branch2c (BatchNorm)           (None, 64, 64, 256)      1024        res2b_branch2c[0][0]
__________________________________________________________________________________________________
add_2 (Add)                         (None, 64, 64, 256)      0           bn2b_branch2c[0][0]
                                                                         res2a_out[0][0]
__________________________________________________________________________________________________
res2b_out (Activation)              (None, 64, 64, 256)      0           add_2[0][0]
__________________________________________________________________________________________________
res2c_branch2a (Conv2D)             (None, 64, 64, 64)       16448       res2b_out[0][0]
__________________________________________________________________________________________________
bn2c_branch2a (BatchNorm)           (None, 64, 64, 64)       256         res2c_branch2a[0][0]
__________________________________________________________________________________________________
activation_6 (Activation)           (None, 64, 64, 64)       0           bn2c_branch2a[0][0]
__________________________________________________________________________________________________
res2c_branch2b (Conv2D)             (None, 64, 64, 64)       36928       activation_6[0][0]
__________________________________________________________________________________________________
bn2c_branch2b (BatchNorm)           (None, 64, 64, 64)       256         res2c_branch2b[0][0]
__________________________________________________________________________________________________
activation_7 (Activation)           (None, 64, 64, 64)       0           bn2c_branch2b[0][0]
__________________________________________________________________________________________________
res2c_branch2c (Conv2D)             (None, 64, 64, 256)      16640       activation_7[0][0]
__________________________________________________________________________________________________
bn2c_branch2c (BatchNorm)           (None, 64, 64, 256)      1024        res2c_branch2c[0][0]
__________________________________________________________________________________________________
add_3 (Add)                         (None, 64, 64, 256)      0           bn2c_branch2c[0][0]
                                                                         res2b_out[0][0]
__________________________________________________________________________________________________
...                                 (intermediate ResNet / FPN / RPN layers omitted)
__________________________________________________________________________________________________
mrcnn_class_conv1 (TimeDistributed) (None, 1000, 1, 1, 1024) 12846080    roi_align_classifier[0][0]
__________________________________________________________________________________________________
mrcnn_class_bn1 (TimeDistributed)   (None, 1000, 1, 1, 1024) 4096        mrcnn_class_conv1[0][0]
__________________________________________________________________________________________________
activation_68 (Activation)          (None, 1000, 1, 1, 1024) 0           mrcnn_class_bn1[0][0]
__________________________________________________________________________________________________
dropout_1 (Dropout)                 (None, 1000, 1, 1, 1024) 0           activation_68[0][0]
__________________________________________________________________________________________________
mrcnn_class_conv2 (TimeDistributed) (None, 1000, 1, 1, 1024) 1049600     dropout_1[0][0]
__________________________________________________________________________________________________
mrcnn_class_bn2 (TimeDistributed)   (None, 1000, 1, 1, 1024) 4096        mrcnn_class_conv2[0][0]
__________________________________________________________________________________________________
activation_69 (Activation)          (None, 1000, 1, 1, 1024) 0           mrcnn_class_bn2[0][0]
__________________________________________________________________________________________________
pool_squeeze (Lambda)               (None, 1000, 1024)       0           activation_69[0][0]
__________________________________________________________________________________________________
mrcnn_class_logits (TimeDistributed) (None, 1000, 4)         4100        pool_squeeze[0][0]
__________________________________________________________________________________________________
mrcnn_bbox_fc (TimeDistributed)     (None, 1000, 16)         16400       pool_squeeze[0][0]
__________________________________________________________________________________________________
mrcnn_class (TimeDistributed)       (None, 1000, 4)          0           mrcnn_class_logits[0][0]
__________________________________________________________________________________________________
mrcnn_bbox (Reshape)                (None, 1000, 4, 4)       0           mrcnn_bbox_fc[0][0]
__________________________________________________________________________________________________
input_image_meta (InputLayer)       (None, None)             0
__________________________________________________________________________________________________
mrcnn_detection (DetectionLayer)    (None, 100, 6)           0           ROI[0][0]
                                                                         mrcnn_class[0][0]
                                                                         mrcnn_bbox[0][0]
                                                                         input_image_meta[0][0]
__________________________________________________________________________________________________
lambda_3 (Lambda)                   (None, 100, 4)           0           mrcnn_detection[0][0]
__________________________________________________________________________________________________
roi_align_mask (PyramidROIAlign)    (None, 100, 14, 14, 256) 0           lambda_3[0][0]
                                                                         fpn_p2[0][0]
                                                                         fpn_p3[0][0]
                                                                         fpn_p4[0][0]
                                                                         fpn_p5[0][0]
__________________________________________________________________________________________________
mrcnn_mask_conv1 (TimeDistributed)  (None, 100, 14, 14, 256) 590080      roi_align_mask[0][0]
__________________________________________________________________________________________________
mrcnn_mask_bn1 (TimeDistributed)    (None, 100, 14, 14, 256) 1024        mrcnn_mask_conv1[0][0]
__________________________________________________________________________________________________
activation_71 (Activation)          (None, 100, 14, 14, 256) 0           mrcnn_mask_bn1[0][0]
__________________________________________________________________________________________________
mrcnn_mask_conv2 (TimeDistributed)  (None, 100, 14, 14, 256) 590080      activation_71[0][0]
__________________________________________________________________________________________________
mrcnn_mask_bn2 (TimeDistributed)    (None, 100, 14, 14, 256) 1024        mrcnn_mask_conv2[0][0]
__________________________________________________________________________________________________
activation_72 (Activation)          (None, 100, 14, 14, 256) 0           mrcnn_mask_bn2[0][0]
__________________________________________________________________________________________________
mrcnn_mask_conv3 (TimeDistributed)  (None, 100, 14, 14, 256) 590080      activation_72[0][0]
__________________________________________________________________________________________________
mrcnn_mask_bn3 (TimeDistributed)    (None, 100, 14, 14, 256) 1024        mrcnn_mask_conv3[0][0]
__________________________________________________________________________________________________
activation_73 (Activation)          (None, 100, 14, 14, 256) 0           mrcnn_mask_bn3[0][0]
__________________________________________________________________________________________________
mrcnn_mask_conv4 (TimeDistributed)  (None, 100, 14, 14, 256) 590080      activation_73[0][0]
__________________________________________________________________________________________________
mrcnn_mask_bn4 (TimeDistributed)    (None, 100, 14, 14, 256) 1024        mrcnn_mask_conv4[0][0]
__________________________________________________________________________________________________
activation_74 (Activation)          (None, 100, 14, 14, 256) 0           mrcnn_mask_bn4[0][0]
__________________________________________________________________________________________________
mrcnn_mask_deconv (TimeDistributed) (None, 100, 28, 28, 256) 262400      activation_74[0][0]
__________________________________________________________________________________________________
mrcnn_mask (TimeDistributed)        (None, 100, 28, 28, 4)   1028        mrcnn_mask_deconv[0][0]
==================================================================================================
Total params: 63,744,170
Trainable params: 63,632,682
Non-trainable params: 111,488

b) Transfer learning
Transfer learning and fine-tuning reuse the pre-trained weights of a CNN as the initialization for a new task of interest. We used a Mask R-CNN pretrained on the COCO dataset, as sketched below.
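Assuming the widely used matterport/Mask_RCNN Keras implementation (whose layer names match the summary above), COCO-weight initialization might look like the following sketch; the file paths, config values, and training schedule are illustrative assumptions, and train_set / val_set stand for mrcnn.utils.Dataset subclasses built from the CBIS-DDSM patches (construction not shown).

from mrcnn.config import Config
from mrcnn import model as modellib

class MammoConfig(Config):
    NAME = "mammo"
    NUM_CLASSES = 1 + 3    # background + 3 classes, matching the 4-way head above
    IMAGE_MIN_DIM = 256    # our 256×256 patches
    IMAGE_MAX_DIM = 256

config = MammoConfig()
model = modellib.MaskRCNN(mode="training", config=config, model_dir="./logs")

# Reuse the COCO backbone; skip the heads whose shapes depend on NUM_CLASSES.
model.load_weights("mask_rcnn_coco.h5", by_name=True,
                   exclude=["mrcnn_class_logits", "mrcnn_bbox_fc",
                            "mrcnn_bbox", "mrcnn_mask"])

# Fine-tune the randomly initialized heads first, keeping the backbone frozen.
model.train(train_set, val_set, learning_rate=config.LEARNING_RATE,
            epochs=20, layers="heads")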

4. RESULTS
The most frequently applied performance metric is the AUC (area under the ROC curve); our model achieves a per-image AUC of 0.85 on the CBIS-DDSM test set.
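For reference, the per-image AUC can be computed with scikit-learn as in the snippet below; the labels and scores shown are illustrative placeholders, not our experimental outputs.

from sklearn.metrics import roc_auc_score

y_true  = [0, 0, 1, 1, 0, 1]                    # 1 = image contains a malignant mass
y_score = [0.10, 0.35, 0.80, 0.65, 0.20, 0.90]  # model's per-image malignancy score
print("AUC = %.2f" % roc_auc_score(y_true, y_score))  # prints AUC = 1.00 for this toy data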

Computational environment
The research was carried out on a Linux machine with a single NVIDIA K80 GPU with 12 GB of memory, hosted on AWS. The deep learning framework is Keras 2 with TensorFlow as the backend.

5. CONCLUSIONS
In this paper, we have proposed a Mask R-CNN-based approach that localizes, classifies, and segments breast tumors in screening mammograms, achieving a per-image AUC of 0.85 on the CBIS-DDSM dataset.

Acknowledgements
We would like to thank ….. for providing comments on the manuscript.
References
[1] Li Shen, "End-to-End Training for Whole Image Breast Cancer Diagnosis Using an All Convolutional Design."

[2] Trieu, Phuong Dung, et al., "Female breast cancer in Vietnam: a comparison across Asian specific regions," Cancer Biology & Medicine (2015).