According to the World Health Organization, breast cancer is the most commonly diagnosed cancer and is the leading cause of cancer deaths among women worldwide. On average, a woman is diagnosed with breast cancer every two minutes and one woman dies of it every 13 minutes worldwide. In 2019, an estimated 268,600 new cases of invasive breast cancer are expected to be diagnosed in women in the U.S. alone.

Since 1989, early detection and diagnosis have increased treatment success and survival rates. Screening or detection is generally conducted by self-examination or clinical breast palpation, followed by mammography or ultrasound imaging. This typically identifies the presence of lesions or lumps that could be cancerous. Finally, a conclusive breast tissue biopsy and histopathological analysis ascertain the presence, type, grade and malignancy of cancer.

Pathologists typically use a light microscope to manually identify various cellular markers. Such visual diagnosis, however, is tedious and subjective, and average diagnostic concordance among pathologists is relatively low.

The recent introduction of slide scanners that digitize the biopsy into multi-resolution images, along with advances in deep learning methods, has ushered in new possibilities for computer-aided diagnosis of breast cancer. Artificial intelligence (AI), machine learning (ML) and computer vision can potentially automate several steps, helping to make diagnosis more accurate, reliable, efficient and cost-effective.

We have developed a deep-learning-based breast cancer grading approach aimed at automating the error-prone pre-diagnostic steps pathologists perform manually, enabling them to make faster and more accurate diagnoses.

AI approach for computer-assisted diagnosis

As noted, one of our goals is to automate error-prone pre-diagnostic grading steps that pathologists perform manually. We have applied deep convolutional neural networks (ConvNets) to localization and segmentation tasks, and pathologists can use the results to perform further quantitative analysis and grading of biopsy tissue. Deep networks require large training data sets, however, while the available public breast cancer data sets are small. We therefore use data augmentation and transfer learning to offset the scarcity of training data.

Tumor localization

Histopathology slides feature image sizes up to several gigapixels. Processing these very large images is computationally expensive, so common practice is to identify the regions of the slides that are of interest prior to performing more detailed analysis. “Localization” refers to identifying the regions that require further analysis. For pathologists, this is a tedious and time-consuming undertaking (see Figure 1).

Figure 1
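
To make the idea of localization concrete, the sketch below shows one simple way a large slide could be split into fixed-size patches so that only tissue-bearing patches are passed on for detailed analysis. It is an illustration rather than our production pipeline: the function names, the 256-pixel patch size and the brightness-based tissue test are all assumptions, and a small NumPy array stands in for a real gigapixel slide.

```python
import numpy as np

def tile_coordinates(height, width, patch=256, stride=256):
    """Yield the (row, col) top-left corner of every full patch on the slide."""
    for r in range(0, height - patch + 1, stride):
        for c in range(0, width - patch + 1, stride):
            yield r, c

def tissue_patches(slide, patch=256, stride=256, background=220, min_tissue=0.1):
    """Return corners of patches in which enough pixels look like tissue.

    Assumption: `slide` is an RGB uint8 array with a bright, near-white
    glass background, so pixels darker than `background` count as tissue.
    """
    gray = slide.mean(axis=2)  # crude grayscale conversion
    keep = []
    for r, c in tile_coordinates(*gray.shape, patch, stride):
        window = gray[r:r + patch, c:c + patch]
        tissue_fraction = (window < background).mean()
        if tissue_fraction >= min_tissue:
            keep.append((r, c))
    return keep

# Toy "slide": white background with one stained block of tissue.
slide = np.full((512, 768, 3), 255, dtype=np.uint8)
slide[0:256, 256:512] = (150, 80, 160)
print(tissue_patches(slide))  # [(0, 256)]
```

Only one of the six patches in the toy slide survives the filter, which is the point of the step: the expensive models never see empty glass.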

Automatic detection of the relevant regions focuses pathologists’ attention on the significant areas, reducing their workloads while ensuring that no critical region is overlooked. Our localization approach is a three-step process:

  1. Benign/malignant classification.

  2. Patch benign/malignant classification (this helps the pathologist decipher the cancerous tissue at a granular level, reducing false positives).

  3. Malignant region segmentation. Here a fine-tuned classifier, optimized for sensitivity, produces a saliency map to segment malignant regions for further analysis.
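
The third step can be pictured as follows. In this minimal sketch, patch-level malignancy probabilities (assumed to come from a classifier that is not shown) are arranged on the patch grid to form a coarse saliency map, and a deliberately low threshold favors sensitivity over specificity. The function names, the grid layout and the 0.3 threshold are hypothetical choices made for illustration.

```python
import numpy as np

def saliency_map(patch_probs, grid_shape):
    """Arrange per-patch malignancy probabilities (row-major order) on the patch grid."""
    return np.asarray(patch_probs, dtype=float).reshape(grid_shape)

def segment_malignant(saliency, threshold=0.3, patch=256):
    """Threshold the saliency map and expand it to a pixel-level boolean mask.

    A low threshold (an assumption here) flags more regions, accepting some
    false positives in exchange for fewer missed malignant areas.
    """
    coarse = saliency >= threshold
    return coarse.repeat(patch, axis=0).repeat(patch, axis=1)

# Made-up probabilities for a 2 x 3 grid of patches.
probs = [0.05, 0.35, 0.90,
         0.10, 0.20, 0.60]
mask = segment_malignant(saliency_map(probs, (2, 3)), threshold=0.3, patch=4)
print(mask.shape, int(mask.sum()))  # (8, 12) 48
```

Raising the threshold shrinks the flagged area; where to set it depends on how the cost of a missed region compares with the cost of extra review.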

Nuclei segmentation

Nuclei segmentation remains a challenging problem because imperfections in the staining process make tissue appearance highly variable. Additionally, nuclei may be overlapping, clustered or tightly clumped, which makes them difficult to distinguish. In our approach, images are preprocessed by color normalization, and training data is augmented using random cropping, flipping, rotation, scaling and other techniques. We leveraged U-Net, a convolutional neural network developed for biomedical image segmentation. Its architecture is modified and extended to work with fewer training images and to yield more precise segmentations.
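
For segmentation, augmentation has to transform each image and its annotation mask identically, or the labels no longer line up with the pixels. The sketch below shows that idea for random cropping, flipping and 90-degree rotation using NumPy; scaling and color normalization are omitted for brevity. The function name, crop size and toy data are assumptions for illustration, not our training code.

```python
import numpy as np

def augment(image, mask, crop=64, rng=None):
    """Apply the same random crop, horizontal flip and 90-degree rotation to image and mask."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = mask.shape
    top = int(rng.integers(0, h - crop + 1))   # upper bound is exclusive
    left = int(rng.integers(0, w - crop + 1))
    image = image[top:top + crop, left:left + crop]
    mask = mask[top:top + crop, left:left + crop]
    if rng.random() < 0.5:                     # horizontal flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    k = int(rng.integers(0, 4))                # number of 90-degree turns
    image = np.rot90(image, k, axes=(0, 1))
    mask = np.rot90(mask, k, axes=(0, 1))
    return np.ascontiguousarray(image), np.ascontiguousarray(mask)

# Toy example: the "nucleus" is drawn into the image exactly where the mask is True.
mask = np.zeros((128, 128), dtype=bool)
mask[30:60, 40:90] = True
image = np.zeros((128, 128, 3), dtype=np.uint8)
image[mask] = 255

aug_image, aug_mask = augment(image, mask, rng=np.random.default_rng(0))
print(aug_image.shape, aug_mask.shape)  # (64, 64, 3) (64, 64)
```

Because both arrays go through the same crop, flip and rotation, the bright pixels in `aug_image` still coincide with the `True` pixels in `aug_mask`.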

Tubule detection and segmentation are even more challenging, primarily because tubules take such diverse forms and publicly available annotated training data sets are lacking. Currently, tubules must be analyzed and assessed by pathologists; this represents a future opportunity for AI, ML and computer vision.

Quantitative analysis and grading

Pathology is still mostly a subjective, semi-quantitative scientific discipline performed by expert human pathologists. Grading nuclear pleomorphism and mitotic staging are specialized tasks still performed by expert pathologists. This may change as annotated data sets become available, AI advances further, and trust and adoption grow. But it must be noted that many challenges remain. For example, automated image analysis of certain tissue sections has failed to provide generally reproducible results because staining intensity varies from slide to slide. Variability in several steps of tissue collection and preparation significantly influences the outcomes of automated image analysis, whereas human pathologists can effectively integrate these variables into their decision process.
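
The color normalization step mentioned earlier is aimed at this kind of stain variation. As a rough illustration, the sketch below matches each color channel's mean and standard deviation to those of a reference image. This is a deliberately simplified stand-in: established stain-normalization methods typically operate in other color spaces or on separated stain channels, and the function name and synthetic images here are assumptions.

```python
import numpy as np

def match_channel_stats(image, reference, eps=1e-6):
    """Shift and scale each RGB channel of `image` to the reference's mean and std."""
    img = image.astype(np.float64)
    ref = reference.astype(np.float64)
    img_mean, img_std = img.mean(axis=(0, 1)), img.std(axis=(0, 1))
    ref_mean, ref_std = ref.mean(axis=(0, 1)), ref.std(axis=(0, 1))
    out = (img - img_mean) / (img_std + eps) * ref_std + ref_mean
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

# Synthetic example: a "faded" copy of a reference image is pulled back toward it.
rng = np.random.default_rng(0)
reference = rng.integers(100, 200, size=(32, 32, 3), dtype=np.uint8)
faded = (reference.astype(np.float64) * 0.5 + 60).astype(np.uint8)
normalized = match_channel_stats(faded, reference)
print(np.abs(normalized.mean(axis=(0, 1)) - reference.mean(axis=(0, 1))).max() < 1.0)  # True
```

Matching simple statistics cannot remove every source of variability in tissue preparation, which is one reason the challenges described above remain.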


Our approach (in limited proofs of concept and pilots with hospitals and other clinical research partners) enables pathologists to be more efficient, consistent and accurate by using AI to automate certain steps in the breast cancer grading process. It achieves the following:

  • High reliability and accuracy, leading to fewer misdiagnoses (false positives) or missed diagnoses (false negatives).
  • Reduced diagnostic variability, leading to informed treatment and decision-making, hence facilitating better outcomes and prognoses.
  • Automatic region-of-interest detection and nuclei segmentation, which decreases the workload for pathologists, making them more efficient and enabling them to devote their time to more complex tasks.
  • Broad applicability, since AI-based tools for computational pathology can be applied to similar types of cancer with minimal changes.
  • Lower cost of diagnosis due to higher efficiency and reliability.
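
The two error types named in the first item map onto two standard metrics: sensitivity falls as false negatives rise, and specificity falls as false positives rise. The short sketch below computes both from binary labels. The function name and the ten labels are made up purely for illustration and are not results from our pilots.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = malignant."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malignant, caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malignant, missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, cleared
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one malignant and one benign case")
    return tp / (tp + fn), tn / (tn + fp)

# Made-up labels for ten cases, purely for illustration.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
preds = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(truth, preds)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")  # sensitivity=0.75, specificity=0.83
```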

Looking ahead

AI- and ML-based technologies have the potential to transform healthcare by delivering new and important insights from the vast amount of clinical data. High-value applications include faster disease detection, more accurate diagnosis, and the development of personalized diagnostics and therapeutics. The availability of digitized wellness, clinical and medical imaging data along with the widespread success of deep-learning methods is accelerating this transformation. From radiology and drug discovery to disease risk prediction and patient care, deep learning is transforming healthcare from every angle.

However, AI-based pathology and oncology diagnostics are still in a nascent state; they represent an opportunity, not an actuality. Pathology is an excellent candidate for AI, especially for labor-intensive tasks such as histologic cell counts, structure and morphological estimation. Computational pathology leverages the power of AI, ML, image analytics and clinical big data integration to enhance the diagnostic precision of pathologists.

Future directions in this area include enhanced histopathological imaging, clinical decision support and even pathology diagnosis at home. Clinical decision support, for example, leverages statistical and ML models that can assimilate information from various sources such as pathology, cytology and radiology to facilitate accurate and timely decision-making.

AI and ML are beginning to transform disease diagnosis and treatment processes. Challenges such as the reliability of diagnosis, patient safety, privacy and security must still be addressed to realize the technology’s full potential and mass adoption.

To learn more, read our white paper “Applying Deep Learning to Transform Breast Cancer Diagnosis,” or contact us.