MAI-DxO: Microsoft AI Diagnostic Orchestrator. First Impressions and Comments
Introduction
Microsoft recently reported on the evaluation of its artificial intelligence tool for clinical diagnosis. No summary publication is available yet, but preliminary information is accessible online. Some characteristics of the tool are described here.
Origin of the Clinical Cases
The evaluation used 304 case records from The New England Journal of Medicine. The published cases were adapted to a more algorithmic, iterative format (the exact details are not yet specified), so that the program can follow a logical path to the diagnosis in something close to a natural dialogue.
Mechanism
The Microsoft application draws on the expertise of other AI tools, such as GPT, Llama, Gemini, Claude, Grok and DeepSeek, in the spirit of the collaborative work of doctors who reach a diagnosis as a team. This is why the word “Orchestrator” is part of the tool's name.
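To make the idea of orchestration concrete, here is a minimal sketch in Python. It is not Microsoft's method, whose details are not described in the preliminary information. It assumes the simplest possible scheme, a majority vote, and the panelists (`model_a`, `model_b`, `model_c`) are hypothetical stand-ins that return fixed answers instead of calling real models.

```python
from collections import Counter
from typing import Callable, Dict

# Hypothetical stand-in for a real model: a function that maps a case
# description to a single proposed diagnosis.
Panelist = Callable[[str], str]


def orchestrate(case: str, panel: Dict[str, Panelist]) -> str:
    """Ask every panelist for a diagnosis and return the most common one."""
    proposals = [ask(case) for ask in panel.values()]
    # most_common(1) returns [(diagnosis, count)]; ties go to the first seen.
    return Counter(proposals).most_common(1)[0][0]


# Illustrative panel with fixed answers (an assumption, not real model output).
panel = {
    "model_a": lambda case: "pneumonia",
    "model_b": lambda case: "pulmonary embolism",
    "model_c": lambda case: "pneumonia",
}

consensus = orchestrate("fever, cough, pleuritic chest pain", panel)
print(consensus)
```

A real orchestrator would presumably do much more than vote, for example deciding which questions to ask and which tests to order, but the sketch shows the basic point: the final answer comes from the panel, not from a single model.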
Outcome
Diagnostic performance was evaluated on the clinical cases from The New England Journal of Medicine. The tool reached 85.5% diagnostic accuracy when paired with OpenAI's o3 model. It was compared with the diagnostic performance of 21 family physicians from the UK and USA, each with 5-20 years of clinical practice. The mean diagnostic accuracy of this group of doctors on the same clinical cases was 20%.
Another interesting aspect is that the tool is designed to select diagnostic tests in the most efficient way, with the intention of reducing healthcare spending.
Analysis and Comments
In my opinion, the major advance represented by this new Microsoft application is that clinical diagnosis is entering a new era. A new wave of work is beginning to improve this aspect of clinical practice, giving importance not just to the therapeutic side of medicine but also to the diagnostic process.
We should highlight the tool's good diagnostic performance, with an accuracy of 85.5% of cases, in a similar range to other tools such as ChatGPT. However, the preliminary information from Microsoft contains some paradoxical data that are worth commenting on.
First, the tool has been evaluated in a “controlled” environment, with a set of complex, selected clinical cases, outside the real clinical world.
Second, the diagnostic accuracy of the physicians used for comparison is low, and there are few of them. I believe the sample does not have enough power to represent physicians as a group. The low diagnostic accuracy may also reflect a problem of context: every physician is used to working in a particular clinical field and lacks expertise in other clinical situations. Here the advantage of an electronic tool is clear, since it is not limited by memory lapses or by difficulty handling large amounts of data.
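The concern about the small number of physicians can be illustrated with the standard error of a mean, which shrinks only with the square root of the sample size. The sketch below is an illustration under stated assumptions: the spread between physicians (`ASSUMED_SD`) is a hypothetical value, not a figure reported by Microsoft, and the physicians are treated as independent observations.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard error of a mean estimated from n independent observations."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sd / math.sqrt(n)


# Hypothetical standard deviation of accuracy between physicians
# (an assumption for illustration, not a reported value).
ASSUMED_SD = 0.10

# Compare the actual group size (21) with larger hypothetical groups.
for n in (21, 100, 400):
    print(n, round(standard_error(ASSUMED_SD, n), 4))
```

Whatever the true spread is, a group of 21 gives a noticeably less precise estimate of the average physician's accuracy than a group of a few hundred would.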
In summary, every new AI development is good news. The power of these tools in terms of diagnostic accuracy is improving day by day, but the most important step is to convince doctors and nurses to incorporate them into practice, in the same way that they use the stethoscope or the blood pressure monitor.
Microsoft AI Diagnostic Orchestrator is very promising, and the outcome in terms of diagnostic accuracy and control of healthcare spending is impressive. We need to know more about its performance in the real clinical world.
We are waiting for more publications about this promising tool.
Author: Lorenzo Alonso Carrión
FORO OSLER