Better Leukemia Diagnostics Through AI (BELUGA)

Munich Leukemia Laboratory (Industry)
Overall Status: Recruiting

Study Details

Study Description

Brief Summary

To the best of our knowledge, BELUGA will be the first prospective trial investigating the usefulness of deep learning-based hematologic diagnostic algorithms. Drawing on an unprecedented collection of diagnostic samples consisting of flow cytometry data points and digitized blood smears, the trial will prospectively compare the AI-based categorization of as yet undiagnosed patient samples with the current state-of-the-art diagnosis at the Munich Leukemia Laboratory (hereafter MLL). In total, a collection of 25,000 digitized blood smears and 25,000 flow cytometry data points will be used prospectively to train an AI-based deep neural network (DNN) for correct categorization. Subsequently, the superiority of the DNN will be tested on the primary endpoints: sensitivity and specificity of diagnosis, most probable diagnosis, and time to diagnosis. The secondary endpoints will compare routine diagnosis and AI-guided diagnostics with respect to their consequences for further diagnostic work-up and, thus, clinical decision making. BELUGA will set the stage for the introduction of AI-based hematologic diagnostics in a real-world setting.

Condition or Disease | Intervention/Treatment | Phase
  • Diagnostic Test: Automated AI-Guided Diagnosis of Hematological Malignancies

Detailed Description

In numerous recent studies, deep neural networks (DNN) have been used to examine the usefulness of artificial intelligence (AI) for diagnostic purposes. In essence, they have been shown to recapitulate state-of-the-art diagnoses currently performed by humans.

Specifically, the use of artificial intelligence for pattern recognition showed that DNN could assign complex and composite data points, chiefly images, to a specific pathogenic condition or disease with high fidelity. The majority of these studies are based on extensive training sample collections that were categorized a priori. This "training" then provided the necessary input to classify newly delivered specimens into the correct subgroups, frequently even outperforming independent human investigators. So far, these studies have thus provided the rationale for the use of DNN in real-world diagnostics. However, the prerequisite for using DNN in a real-world setting, where specimen sampling and analysis would need to outperform human diagnosis prospectively, would be a blinded and prospective trial. Such prospective data are currently lacking, so the notion that DNN can outperform state-of-the-art human-based diagnostic algorithms remains unproven. Here we want to investigate the validity and usefulness of AI-based diagnostic capabilities prospectively in a real-world setting.

Hematologic diagnostics rely heavily on multiple methodically distinct approaches, of which phenotyping aberrant blood or bone marrow cells from affected patients represents a cornerstone for all subsequent methods, such as chromosomal or molecular genetic analyses. At the MLL, five different diagnostic pillars are required to reliably provide diagnostic evidence for a specific malignant blood disorder: cytomorphology and immunophenotyping first, which guide more specific methods such as cytogenetics, FISH, and a variety of molecular genetic assays.

+++ Objectives +++

Phenotyping of blood cells is primarily based on two distinct assessments: (1) the morphological appearance and abundance of specific cell types and (2) the presence of particular lineage markers detected by flow cytometry. These two methods are critical for each subsequent decision-making process and thus, ultimately, for the final diagnosis. At the same time, these two methods are ideally suited for automated analysis by DNN due to their inherent image-based nature. This has recently been illustrated in a publication by Marr and colleagues (Matek et al., 2019;

In BELUGA, we want to investigate whether the automated analysis of smears (from peripheral blood and bone marrow aspirates) and of flow cytometry data can provide a benefit for diagnostic quality and, ultimately, patient care. Moreover, BELUGA will provide evidence on how image-based diagnostic tools can cooperate with other pillars of hematologic diagnostic decision making, such as genetic and molecular genetic characterization.

BELUGA therefore consists of three parts (A-C) (see Figure in the attached File). In part A, we want to train a DNN with an unprecedented collection of blood smears and flow cytometry data points collected over the course of 15 years. These samples cover all hematological malignancies recognized by the current WHO classification for hematologic malignancies. Because the incidences of these entities vary, the total number of training items per entity over the 15 years ranges from 1,000 to 20,000. However, we deem this discrepancy a benefit to this trial's overall aims, because this diverse spectrum will inform us about the number of training items needed to outperform state-of-the-art diagnostics in cytomorphology or flow cytometry.

In part B, we will prospectively evaluate the overall performance of our trained DNN on new, as yet undiagnosed samples arriving at our laboratory (see the main section for details). The superiority of DNN-based categorization will be tested on the pre-defined outcome parameters: accuracy with respect to state-of-the-art diagnostics, mismatch rate, and time needed to provide a diagnostic probability.

Lastly, in part C, we will investigate whether leveraging our trained DNN to aid downstream diagnostic methodologies, such as chromosomal analysis or panel sequencing of patient samples, leads to faster and more accurate diagnoses.

Study Design

Study Type:
Anticipated Enrollment :
25000 participants
Observational Model:
Time Perspective:
Official Title:
A Case-Control Study To Determine The Suitability Of Artificial Intelligence For Leukemia Diagnostics
Actual Study Start Date :
Jan 5, 2020
Anticipated Primary Completion Date :
Jul 31, 2022
Anticipated Study Completion Date :
Jul 31, 2022

Outcome Measures

Primary Outcome Measures

  1. Sensitivity and Specificity of AI Guided diagnostics in Hematology [08-01-2020 until 07-31-2021]

    As a primary endpoint, we will examine the ability of the DNN to classify disorders (after an initial disease/healthy assessment) in agreement with the gold-standard diagnosis. The gold-standard diagnosis is defined as an integrated diagnosis, including cytomorphology, flow cytometry, cytogenetics, FISH, and molecular genetics. The DNN will independently provide a bi-directional (probabilistic) diagnosis, including the most probable diagnosis. The primary analysis will include a direct comparison between the human cytomorphological examination and the pattern recognition software. Secondly, this result will be provided to downstream diagnostic departments to assess the usefulness of the phenotypic diagnosis for genetic characterization. We hypothesize that the turn-around time will be significantly shortened, providing diagnostic quality at an earlier time point.
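
    The sensitivity and specificity named in this endpoint can be illustrated with a minimal Python sketch. It assumes a one-vs-rest comparison per diagnostic category between the gold-standard label and the DNN's most probable diagnosis. The function name, the label strings, and the example data are hypothetical and are not taken from the study protocol.

```python
from collections import Counter


def sensitivity_specificity(gold, predicted, positive_label):
    """One-vs-rest sensitivity and specificity for a single category.

    gold: gold-standard labels (assumed to come from the integrated diagnosis)
    predicted: most probable diagnosis per sample (assumed DNN output)
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    counts = Counter()
    for g, p in zip(gold, predicted):
        actual_pos = g == positive_label
        called_pos = p == positive_label
        if actual_pos and called_pos:
            counts["tp"] += 1
        elif actual_pos:
            counts["fn"] += 1
        elif called_pos:
            counts["fp"] += 1
        else:
            counts["tn"] += 1

    positives = counts["tp"] + counts["fn"]
    negatives = counts["tn"] + counts["fp"]
    # Undefined when a category has no positive (or no negative) samples.
    sensitivity = counts["tp"] / positives if positives else float("nan")
    specificity = counts["tn"] / negatives if negatives else float("nan")
    return sensitivity, specificity


# Hypothetical labels, for illustration only.
gold = ["AML", "AML", "CLL", "healthy", "healthy", "AML"]
pred = ["AML", "CLL", "CLL", "healthy", "AML", "AML"]
sens, spec = sensitivity_specificity(gold, pred, "AML")
print(f"AML sensitivity={sens:.2f}, specificity={spec:.2f}")
```

    Repeating the call for every category gives a per-entity view, which may be useful given the uneven number of training items per entity described above.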

Secondary Outcome Measures

  1. comparison of clinical consequences [08-01-2020 until 07-31-2021]

    We will compare the clinical recommendation obtained after routine gold-standard diagnostics with that obtained after AI-guided categorization for all samples enrolled in this study.

  2. predictive diagnostic value [08-01-2020 until 07-31-2021]

    We will assess the predictive value of unsupervised categorization and diagnosis in comparison to gold-standard routine testing.

  3. turn-around-time [08-01-2020 until 07-31-2021]

    We will measure the turn-around-time of gold-standard diagnostics in comparison to AI-guided diagnosis.

  4. enumerate entity-specific benchmarks (e.g., blast count in leukemia) [08-01-2020 until 07-31-2021]

    We will assess secondary disease-specific values determined by AI/DNN-based unsupervised diagnosis versus routine testing.
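
    As a rough illustration of two of the secondary measures above, the following Python sketch computes positive and negative predictive values from confusion counts and a median paired turn-around-time difference. The record does not state which statistics will be used, so both functions, their names, and the example numbers are assumptions made only for illustration.

```python
from statistics import median


def predictive_values(tp, fp, tn, fn):
    """Positive and negative predictive value from confusion counts."""
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


def median_turnaround_gain(routine_hours, ai_hours):
    """Median of paired differences (routine minus AI), in hours.

    A positive value means the AI-guided result was available sooner.
    Both lists are assumed to be ordered by the same samples.
    """
    if len(routine_hours) != len(ai_hours):
        raise ValueError("both lists must cover the same samples")
    return median([r - a for r, a in zip(routine_hours, ai_hours)])


# Hypothetical counts and times, for illustration only.
ppv, npv = predictive_values(tp=40, fp=10, tn=45, fn=5)
gain = median_turnaround_gain([48, 72, 24], [1, 2, 3])
print(f"PPV={ppv:.2f}, NPV={npv:.2f}, median gain={gain} h")
```

    Unlike sensitivity and specificity, predictive values depend on how common each entity is among incoming samples, so they would need to be read against the case mix of the laboratory.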

Eligibility Criteria


Ages Eligible for Study:
18 Years and Older
Sexes Eligible for Study:
Accepts Healthy Volunteers:
Inclusion Criteria:
  • Patients with a suspected hematological disorder

  • The suspected diagnosis constitutes a primary diagnosis

  • Only samples from patients at least 18 years of age will be used

  • Samples must meet the quality requirements listed under "Exclusion Criteria"

Exclusion Criteria:
  • The sample is not fit for state-of-the-art diagnosis or fails initial quality control. For quality assurance, we will exclude samples collected in heparin instead of EDTA. Samples damaged by environmental conditions (freeze-thaw damage or elevated temperature) will be excluded.

  • Samples with too little material, which would jeopardize routine gold-standard diagnosis, will be excluded.

  • Bone marrow aspirates without sufficient material to assess malignant or healthy hematopoiesis will be excluded.
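
The inclusion and exclusion criteria above could be applied to incoming samples with a simple screening function. The Python sketch below is one possible encoding. The `Sample` record and all of its field names are hypothetical, and the laboratory's real intake data model is not described in this record.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    # All field names are assumptions for illustration.
    patient_age_years: int
    suspected_hematological_disorder: bool
    is_primary_diagnosis: bool
    anticoagulant: str  # e.g. "EDTA" or "heparin"
    passed_quality_control: bool
    temperature_damaged: bool  # freeze-thaw or elevated temperature
    sufficient_material: bool


def exclusion_reasons(sample):
    """Return the list of criteria a sample fails (empty if eligible)."""
    reasons = []
    if sample.patient_age_years < 18:
        reasons.append("patient under 18")
    if not sample.suspected_hematological_disorder:
        reasons.append("no suspected hematological disorder")
    if not sample.is_primary_diagnosis:
        reasons.append("not a primary diagnosis")
    if sample.anticoagulant.strip().lower() == "heparin":
        reasons.append("collected in heparin")
    if not sample.passed_quality_control:
        reasons.append("failed initial quality control")
    if sample.temperature_damaged:
        reasons.append("temperature damage")
    if not sample.sufficient_material:
        reasons.append("insufficient material")
    return reasons


def is_eligible(sample):
    return not exclusion_reasons(sample)


# Hypothetical samples, for illustration only.
ok = Sample(54, True, True, "EDTA", True, False, True)
bad = Sample(54, True, True, "Heparin", True, False, True)
print(is_eligible(ok), exclusion_reasons(bad))
```

Returning the reasons rather than a bare yes/no keeps a record of why a sample was left out, which may help when reporting how many samples were excluded per criterion.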

Contacts and Locations


Site 1: MLL Munich Leukemia Laboratory
City: Munich
State: Bavaria
Country: Germany
Postal Code: 81377

Sponsors and Collaborators

  • Munich Leukemia Laboratory


  • Principal Investigator: Wolfgang Kern, Prof. Dr., MLL Munich Leukemia Laboratory

Study Documents (Full-Text)

More Information

Additional Information:


None provided.
Responsible Party:
Torsten Haferlach, Prof. Dr. Dr., Munich Leukemia Laboratory
Identifier:
Other Study ID Numbers:
  • MLL_001
First Posted:
Jul 10, 2020
Last Update Posted:
Apr 5, 2022
Last Verified:
Apr 1, 2022
Individual Participant Data (IPD) Sharing Statement:
Plan to Share IPD:
Studies a U.S. FDA-regulated Drug Product:
Studies a U.S. FDA-regulated Device Product:
Keywords provided by Torsten Haferlach, Prof. Dr. Dr., Munich Leukemia Laboratory
Additional relevant MeSH terms:

Study Results

No Results Posted as of Apr 5, 2022