He combined medicine with technology. This is the first algorithm this effective at diagnosing breast cancer

Jan Witowski, MD, PhD, is the co-creator of an algorithm used to diagnose breast cancer. The model can detect cancers that are difficult for radiologists to see. Thanks to the system, thousands of women have not had to undergo painful and often unnecessary tests, such as breast biopsies.

Joanna Biegaj, “Wprost”: What is the early breast cancer detection system that you and your team of scientists created, and how does it work?

Jan Witowski, MD*: Our system was built to improve breast cancer diagnosis during mammography screening. The goal is both to detect cancer more accurately and to shorten the time needed to make a diagnosis. Although screening mammography is an effective method of early breast cancer detection, in many patients the suspicious lesions turn out to be benign. Unfortunately, these patients undergo unnecessary and often painful tests, such as breast biopsies. Accurate diagnostic support systems can help avoid unnecessary procedures, detect very early and hidden cancers, and speed up diagnosis.

Can you say that a completely new model has been created?

Our team has developed a new artificial intelligence (AI) model based on neural network technology. Developing AI systems, including ours, mainly involves showing neural networks a large number of examples: what radiological examinations of patients with breast cancer look like, and what radiological examinations of patients without breast cancer look like. Artificial intelligence is very good at recognizing differences between examples and can learn on its own what a diseased and a healthy organ look like in a radiological image. A key requirement here is preparing a sufficiently large data set.

We trained our model on hundreds of thousands of mammography exams from New York City, from women both with and without breast cancer. It is this scale that allows AI models to be highly accurate: they can see many more examples than a radiologist will see in a lifetime. The idea of using artificial intelligence in breast cancer diagnosis is not new, but as part of our research we modified neural networks in a unique way, so that they are adapted to various types of mammography examinations. Only appropriate adaptation of standard neural networks makes it possible to build systems that work effectively in clinical practice.

How accurate is the algorithm for detecting breast cancer? What makes it effective?

Our model can diagnose breast cancer with an accuracy higher than that of the average radiologist and as high as that of the best radiologists. The two most important factors behind such high accuracy are data and algorithms. Data, because in artificial intelligence, the more data we show the algorithms, the more accurate they become. We have managed to collect one of the world’s largest datasets, containing over one million breast imaging studies. Algorithms, because to make the most of such a powerful data set, it is not enough to use an off-the-shelf methodology that has not been adapted to radiological examinations. We have made appropriate modifications that allow the neural networks to distinguish very well between healthy and sick patients.

Is this algorithm the most effective in the world? Why and how can this be determined?

We have compared our model several times with systems developed by other research groups and private companies, and it clearly outperformed them. Such a comparison is made in a very simple way: each AI model is run on the same standard group of patients whose breast cancer diagnosis is already known. This way we can measure how accurately each of these models predicts whether a patient is sick or not.
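
As an illustration only (this is not the team's code), the comparison described above can be sketched as follows. The interview does not say which metric was used; this sketch assumes the area under the ROC curve (AUC), a common choice for this kind of comparison. The labels and the scores of `model_a` and `model_b` are invented for the example.

```python
def auc(labels, scores):
    """Probability that a randomly chosen cancer case receives a higher
    score than a randomly chosen non-cancer case (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical shared test set: 1 = cancer confirmed, 0 = no cancer.
labels = [1, 0, 1, 0, 0, 1, 0, 0]

# Hypothetical scores produced by two different models on the SAME exams.
model_a = [0.90, 0.20, 0.80, 0.10, 0.30, 0.35, 0.40, 0.05]
model_b = [0.60, 0.50, 0.90, 0.20, 0.70, 0.40, 0.30, 0.10]

results = {
    "model_a": auc(labels, model_a),
    "model_b": auc(labels, model_b),
}
for name, value in results.items():
    print(f"{name}: AUC = {value:.3f}")
```

Because both models are scored on identical exams with identical ground truth, any difference in the metric reflects the models rather than the patients.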

Based on what data was the system created?

Our AI systems are trained on databases from the New York University network of facilities. Over the years of our scientific work, we have collected over a million radiological breast examinations, including several hundred thousand mammography examinations.

Is the system already working? Is it already saving women’s lives?

Yes. The system has been implemented in the New York University Langone Health facilities that perform screening mammography. That is over 10 locations in New York, where several hundred patients are examined for breast cancer every day. The radiological examinations of each of these patients are interpreted by radiologists who use our system to improve the diagnosis. Recently, one of the analyses of our system showed that it helped find breast cancer in over 10 patients in whom the radiologist had not initially noticed it. Additionally, the number of patients who have avoided an unnecessary breast biopsy is already in the thousands.

Apart from effectiveness, what are the other benefits of using the created model?

The “accuracy” or “effectiveness” of our model improves two aspects of breast examination. First, using our system makes it possible to avoid unnecessary breast biopsies and additional diagnostic tests. This is protection against false positive results. It works like this: our AI model can identify with very high accuracy (>99.9%) those mammography examinations in which there is no suspicious lesion. Radiologists can then have greater confidence in their decisions and are less likely to send patients for additional tests. As I mentioned in my previous answer, thousands of patients have already avoided an unnecessary breast biopsy thanks to our system.
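
The idea of confidently marking exams as free of suspicious lesions can be sketched roughly as below. This is a simplified illustration and not the deployed system: it assumes the model outputs a suspicion score between 0 and 1, that exams scoring below a chosen threshold are flagged as likely normal, and that the quoted accuracy is read as the share of flagged exams that truly have no cancer (the negative predictive value). The function `rule_out_summary`, the data and the thresholds are all hypothetical.

```python
def rule_out_summary(labels, scores, threshold):
    """Summarize exams flagged as 'likely normal' (score below threshold).

    labels: 1 = cancer confirmed, 0 = no cancer (ground truth)
    scores: model suspicion scores in [0, 1]
    """
    ruled_out = [y for y, s in zip(labels, scores) if s < threshold]
    if not ruled_out:
        return {"ruled_out": 0, "missed_cancers": 0, "npv": None}
    missed = sum(ruled_out)  # cancers wrongly flagged as normal
    npv = (len(ruled_out) - missed) / len(ruled_out)
    return {"ruled_out": len(ruled_out), "missed_cancers": missed, "npv": npv}


# Hypothetical exams with known outcomes and model scores.
labels = [0, 0, 0, 0, 1, 0, 1, 0, 0, 1]
scores = [0.01, 0.03, 0.02, 0.40, 0.85, 0.04, 0.08, 0.06, 0.55, 0.92]

strict = rule_out_summary(labels, scores, threshold=0.05)
loose = rule_out_summary(labels, scores, threshold=0.10)
print("strict:", strict)
print("loose: ", loose)
```

A stricter threshold flags fewer exams as normal but misses fewer cancers among them; a looser one rules out more exams at the cost of a lower negative predictive value.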

What is this second aspect?

Second, our model can find tumors that are difficult for radiologists to see. This is protection against false negative results. The AI system can mark suspicious areas on an X-ray image and alert the doctor to the lesion. As I mentioned earlier, according to the latest analysis, at least 10 patients had cancer identified that was not initially noticed by radiologists. It is worth noting that the first task, reducing false positives, is more difficult than the second. Mammography is very sensitive, i.e. radiologists can detect cancer with very high accuracy, and it is rare for cancer to go unnoticed. A bigger clinical problem is that too many patients needlessly go through stressful and painful tests even though they do not have cancer.

Is the model effective even when dealing with a very rare subtype of cancer? Is its effectiveness reduced by any factors, such as the patient’s age or race?

The model is accurate across all cancer subtypes and patient groups. There are differences in model accuracy depending on the subgroup, but these are a direct result of biological variability, not of weaknesses or biases of the artificial intelligence. For example, patients of Asian origin more often have so-called “dense breasts”, i.e. breasts with little fatty tissue, and cancer detection in them is therefore more difficult than in other women. This problem is common to both radiologists and the AI model: the cancer is simply harder to see. Nevertheless, the model performs as well as or better than radiologists in these patients.

To what extent is diagnosis by an algorithm more effective than diagnosis by a doctor?

Let me start by saying that currently, artificial intelligence models like ours do not perform diagnostics fully on their own. They are “decision support systems”, which means that they only inform doctors, who make the final decision about diagnosis or treatment. AI models are clearly as effective as or more effective than radiologists in a growing range of diagnostic problems, and over time they will begin to replace doctors in particular tasks.

What’s next? When will it be possible to say that the model actually works? When will the tool be available for use by doctors?

To return to the previous question: although AI models, especially in radiology, work very well and are often better than doctors, their entry into clinical practice will take longer. There are several reasons. Perhaps the most important is that artificial intelligence models, like any other new medical device or drug, must undergo some form of verification. The first step is to gather sufficient scientific evidence to demonstrate clearly that these models are highly accurate. Because artificial intelligence is a hot topic today, the threshold to pass is often higher than in other areas of medicine. Producers of artificial intelligence software must prove that their AI models work in different hospitals and with different patients. This evidence must be produced by independent scientists from different centers. The entire procedure takes time, often years.

What specialists does the team working on the algorithm consist of?

Building a unique, well-tested system with artificial intelligence models requires a team of anywhere from several to a dozen or more people from various fields. Our research group at NYU is a good example: it consists of over 10 computer scientists, including researchers, PhD students and students. Additionally, we cooperate with over 10 radiologists and medical students who support us in the medical analysis of our test results and in the development of databases.

How do the two worlds of technology and medicine connect here? What is the essence of cooperation between these fields?

For me, the most important thing in this cooperation is to identify problems that are really important in clinical practice. Most of the projects that technical teams work on will have no impact on the diagnosis or treatment of patients, because they concern problems that are not really important for doctors or patients. We need the voice of experts – doctors and patients who are able to indicate which problems require the help of artificial intelligence systems. Every well-defined medical problem can be solved – it’s a matter of time and money.

*Jan Witowski, MD, PhD – physician and scientist working on computer processing of medical images, in particular cancer imaging. In the years 2019-2023, he worked as a research assistant professor (so-called “post-doc”) in scientific groups at Harvard University/Massachusetts General Hospital in Boston and at New York University. As a scientist in the USA, he worked primarily on the application of artificial intelligence in breast cancer imaging. He developed AI models that automatically recognize suspicious lesions in mammography, ultrasound and magnetic resonance imaging of the breast. Some of the models developed by Jan Witowski were implemented in clinical practice. His AI model for screening mammography was implemented in the network of New York University clinics (over 10 facilities) and made it possible to reduce the rate of false-positive results and to detect several cancers unnoticed by radiologists. The results of research co-led by Jan Witowski were published in prestigious scientific journals such as Nature Medicine, Science Translational Medicine and Nature Communications.