Myelodysplastic Syndrome to Acute Myeloid Leukemia Progression Prediction

Updated: Jul 3, 2023

MDS stands for Myelodysplastic Syndrome, a group of disorders in which something disrupts the production of blood cells. The chance of developing MDS grows with age. Among people younger than 40, fewer than 1 in 100,000 develop MDS; among people aged 70 to 80, 21 in 100,000 do.

An MDS patient's prognosis depends heavily on whether they develop Acute Myeloid Leukemia (AML for short), so we are looking at MDS patients' potential to develop AML. Patients with a high probability of developing AML face a poor prognosis, mostly because AML is hard to treat and requires a long course of treatment. In addition, these are older patients who might have background diseases that also need to be addressed.

On the other hand, some patients have a low risk of developing AML. These patients will probably have follow-up visits with their oncologists once every few months, and if the disease shows signs of progression, the follow-up visits might increase in frequency.

On average, about 50% of patients are classified as low-risk, while 23% are classified as high-risk or very-high-risk. Treating AML patients is challenging; therefore, the medical team tries to prevent progression to AML as much as possible. One way to do so is to treat high-risk MDS patients with a bone marrow transplant. This procedure has a mortality rate higher than 10%, so it needs to be considered carefully.

The dilemma is even greater when a high-risk patient has some likelihood of not developing AML, or when a low-risk patient might develop AML within a year. The go/no-go decision on a bone marrow transplant needs to be as informed as possible.

How can we get a high level of confidence in the decision to go through with a bone marrow transplant? One way, adopted by the medical community, is to collect clinical information about MDS patients around the world and look for clinical criteria that discriminate between low-risk and high-risk patients.

ORT's approach to this question is also to collect data from MDS patients, just as the current patient classification system does. ORT's data, however, is genomic: it consists of RNA-sequencing data collected from MDS patients at the time of diagnosis. We are looking at both these patients' RNA-sequencing data and the information on whether they progressed to AML.

RNA-sequencing, or RNA-seq for short, is a technique that reads the sequence of the RNA in a sample and measures its quantity. Analyzing the RNA indicates which of the genes encoded in the DNA are turned on or off, and to what extent. For the MDS cases, we are planning to work with measurable and comparable gene expression in the patients' blood cells.

We are collecting blood RNA-seq samples from MDS patients whose outcome is known, that is, whether or not they progressed to AML. Both types of patients are important for the success of the prediction. We want to use this information to learn which patients are going to progress to AML and which are not. We feed this data into a learning machine that we have built.

Eventually, at diagnosis time, this machine will be able to indicate whether a patient is going to progress to AML.

Oncologists want to have confidence in the patient's progression classification. They want the prediction to be highly accurate, so they can make a clinical decision based on the patient’s current medical condition, such as their background diseases, and a highly accurate prediction of their potential to develop AML.

Currently, the ORT model has an accuracy of 90%. We are constantly working to improve the accuracy of this prediction by curating more relevant genomic data, improving our proprietary prediction models, and validating the model in a clinical setting.

The remainder of this episode will elaborate on the ORT platform and how we are able to make such predictions.
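To make an accuracy figure like this concrete: accuracy is the share of patients whose predicted progression status matches what actually happened. The sketch below is only an illustration in Python. The labels are made up (they are not ORT data), and the function names `confusion_counts`, `accuracy`, and `sensitivity` are hypothetical helpers, not part of the ORT platform.

```python
def confusion_counts(actual, predicted):
    """Return (tp, tn, fp, fn); True means 'progressed to AML'."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)
    tn = sum(1 for a, p in pairs if not a and not p)
    fp = sum(1 for a, p in pairs if not a and p)
    fn = sum(1 for a, p in pairs if a and not p)
    return tp, tn, fp, fn


def accuracy(actual, predicted):
    """Fraction of all patients that were classified correctly."""
    tp, tn, fp, fn = confusion_counts(actual, predicted)
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(actual, predicted):
    """Fraction of patients who progressed that the model flagged."""
    tp, _, _, fn = confusion_counts(actual, predicted)
    return tp / (tp + fn)


# Made-up example: 10 patients, 3 of whom progressed to AML.
actual = [True, True, True] + [False] * 7
# The hypothetical model misses one of the three progressors.
predicted = [True, True, False] + [False] * 7

print(f"accuracy:    {accuracy(actual, predicted):.2f}")
print(f"sensitivity: {sensitivity(actual, predicted):.2f}")
```

In this toy example, nine of the ten patients are classified correctly, yet one of the three patients who progressed is missed. That is why a measure such as sensitivity is often read alongside accuracy.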

The ORT Platform:

This platform is the execution engine of everything we do for our clients and our development. The ORT platform is composed of four main parts:

The first part is the ORT database, which includes the clinical and genomic information of the patients.

The second part is the annotation pipeline, which takes care of all the annotation information processing.

The third part is the genomics pipeline. This part takes care of all the genomic data processing.

The fourth and last part is the modeling part, where we build the prediction models, such as the model to predict whether MDS patients will progress to AML.

Let's talk about the first part.

The database:

The ORT database is maintained by an automatic curation mechanism that adds sample annotations to the database and tags the samples based on their genomic criteria. For example, a customer might be looking for samples from patients with a specific disease who responded or did not respond to therapy. For these patients, we want to have RNA-seq data, so we can use it to derive clinical insights for future patients.

ORT's tagging system can easily and quickly extract these patients' data.

The ORT database holds more than one million annotated samples and can find data based on any criteria that might be relevant to your needs.
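As a rough illustration of what tag-based extraction can look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Sample` class, the tag names (`disease`, `tissue`, `data_type`, `progressed_to_aml`), and the `find_samples` function are hypothetical and do not describe the actual ORT database schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """A sample with free-form key/value tags (hypothetical schema)."""
    sample_id: str
    tags: dict = field(default_factory=dict)


def find_samples(samples, **criteria):
    """Return the samples whose tags match every given criterion."""
    return [
        s for s in samples
        if all(s.tags.get(key) == value for key, value in criteria.items())
    ]


# Made-up records for illustration only.
samples = [
    Sample("S1", {"disease": "Myelodysplastic Syndrome", "tissue": "blood",
                  "data_type": "RNA-seq", "progressed_to_aml": True}),
    Sample("S2", {"disease": "Myelodysplastic Syndrome", "tissue": "blood",
                  "data_type": "RNA-seq", "progressed_to_aml": False}),
    Sample("S3", {"disease": "Acute Myeloid Leukemia", "tissue": "bone marrow",
                  "data_type": "RNA-seq"}),
]

mds_blood = find_samples(
    samples,
    disease="Myelodysplastic Syndrome",
    tissue="blood",
    data_type="RNA-seq",
)
print([s.sample_id for s in mds_blood])
```

Because every criterion is a plain tag comparison, adding another filter, such as `progressed_to_aml=True`, only narrows the same query.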

In the case of MDS patients, we were looking at patients' blood samples to determine whether they will develop AML.

Another aspect of the ORT database that makes it a valuable resource for deriving clinical insights is the use of a common ontology. In simple words, that means all the samples are described with the same dictionary. A disease has one name in the database, with one spelling, and no synonyms are allowed. Clinical information has many synonyms, and we are continuously expanding ORT's dictionary so that more sample descriptors are covered, with one term per possible value. For example, all MDS patients' samples are described with the full name, Myelodysplastic Syndrome. Any other name is converted to this common name in the database.
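The idea of converting every synonym to one common name can be sketched in a few lines of Python. This is only an illustration: the synonym table and the `normalize_disease` function are hypothetical, and the real ORT dictionary is certainly larger and may work differently.

```python
# Hypothetical synonym table: lowercase variants map to one canonical name.
CANONICAL_TERMS = {
    "myelodysplastic syndrome": "Myelodysplastic Syndrome",
    "myelodysplastic syndromes": "Myelodysplastic Syndrome",
    "mds": "Myelodysplastic Syndrome",
    "acute myeloid leukemia": "Acute Myeloid Leukemia",
    "acute myeloid leukaemia": "Acute Myeloid Leukemia",
    "aml": "Acute Myeloid Leukemia",
}


def normalize_disease(raw_name):
    """Map a free-text disease name to its single canonical term.

    Case and extra whitespace are ignored. Unknown names raise an error
    instead of being stored as-is, so no synonym slips into the database.
    """
    key = " ".join(raw_name.lower().split())
    try:
        return CANONICAL_TERMS[key]
    except KeyError:
        raise ValueError(f"No canonical term for disease name: {raw_name!r}")


for name in ["MDS", "  myelodysplastic   syndromes ", "Myelodysplastic Syndrome"]:
    print(repr(name), "->", normalize_disease(name))
```

Rejecting unknown names, rather than passing them through, is one possible design choice: it forces the dictionary to be extended before a new synonym can enter the data.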

This is very important when you want to be able to compare samples that share common clinical characteristics. The second part of the platform, which is the annotation pipeline, is part of the effort to align the database into a uniform and harmonized dictionary.

The third part of the ORT platform is the genomics pipeline. This is a fully automated computational pipeline that takes the sequencing information and calculates genomics-based measurements for the cells of the sample. In the case of MDS patients, we were looking at the patients' RNA-seq data. To simplify, an RNA-seq result is a file with a list of RNA pieces, together with information about their presence in the cell and about the sequencing process that reports their abundance values.

The genomics pipeline processes this data to calculate a measurement with clinical impact. In the case of RNA-seq, we calculate how many copies of specific RNA pieces were present in the cells at the time of sequencing. This is a complex sum operation that takes into account the variability of the biological process used to obtain such data from the cell. This and other well-defined restrictions are used to make sure the data reported on the samples is as accurate as possible.

In the case of MDS patients, the ORT platform processed the RNA-seq data of all the selected samples through the same pipeline, with the same restrictions enforced. This ensures that the analysis uses harmonized, comparable samples.
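To give a feel for what "summing RNA pieces" and making samples comparable can mean, here is a deliberately simplified Python sketch. It counts reads per gene and scales the counts to counts per million (CPM), a common and very basic normalization. This is a stand-in chosen for illustration, not the ORT pipeline's actual calculation, which the text describes as more complex. The gene names, read lists, and function names are all made up.

```python
from collections import Counter


def count_reads_per_gene(read_assignments):
    """Sum how many sequenced RNA pieces (reads) were assigned to each gene."""
    return Counter(read_assignments)


def counts_per_million(counts):
    """Scale raw counts so that samples sequenced at different depths
    can be compared: each value is reads per million total reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("Sample has no assigned reads")
    return {gene: count * 1_000_000 / total for gene, count in counts.items()}


# Two made-up samples with the same composition but different depth.
shallow_sample = ["GENE_A"] * 6 + ["GENE_B"] * 3 + ["GENE_C"] * 1
deep_sample = ["GENE_A"] * 60 + ["GENE_B"] * 30 + ["GENE_C"] * 10

shallow_cpm = counts_per_million(count_reads_per_gene(shallow_sample))
deep_cpm = counts_per_million(count_reads_per_gene(deep_sample))

print(shallow_cpm)
print(deep_cpm)
```

The raw counts of the two samples differ tenfold, but after scaling they report the same values, which is the point of running every sample through one pipeline with the same rules.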

The fourth part is the cherry on top. This is the reason we are collecting these samples' data so meticulously. This part is our way to ask clinical and biological questions and find answers based on genomic data. We are using multiple analysis tools, including machine learning-based methods, to make discoveries for ORT's services and our clients. The model for predicting the progression of MDS patients to AML is based on a deep neural network, a machine learning method, that takes the patients' RNA-seq data and clinical information as input.
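For readers who want to picture how a neural network can take both RNA-seq and clinical information as input, the following Python sketch shows the general shape: the two kinds of features are joined into one input vector and passed through a small feed-forward network that outputs a probability. This is a toy under stated assumptions, not ORT's model. The layer sizes, the weights, the feature values, and the names `dense` and `predict_progression` are all invented, and a real model would learn its weights from training data.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(weights . inputs + bias)."""
    outputs = []
    for row, bias in zip(weights, biases):
        if len(row) != len(inputs):
            raise ValueError("Weight row length must match the input length")
        outputs.append(activation(sum(w * x for w, x in zip(row, inputs)) + bias))
    return outputs


def predict_progression(expression, clinical, params):
    """Return a value between 0 and 1, read here as a progression probability.

    expression: normalized gene expression values (hypothetical features)
    clinical:   numeric clinical features (hypothetical features)
    """
    features = list(expression) + list(clinical)  # one joint input vector
    hidden = dense(features, params["w1"], params["b1"], relu)
    (probability,) = dense(hidden, params["w2"], params["b2"], sigmoid)
    return probability


# Invented weights: 4 inputs -> 2 hidden units -> 1 output.
params = {
    "w1": [[0.5, -0.2, 0.1, 0.3], [-0.4, 0.6, 0.2, -0.1]],
    "b1": [0.0, 0.1],
    "w2": [[0.8, -0.5]],
    "b2": [-0.2],
}

expression = [1.2, 0.4]   # e.g. two scaled gene expression values
clinical = [0.7, 1.0]     # e.g. scaled age and one binary clinical flag

print(f"{predict_progression(expression, clinical, params):.3f}")
```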

In the next episode, I will talk about the use of the ORT platform to predict immune response. Please let me know if there is anything specific you would like me to discuss. You can contact me at Stay tuned, take care and talk to you soon.
