In this episode we welcome Davide Cirillo. He represents the Barcelona Supercomputing Center, a partner in iPC which, by the way, stands for Individualized Paediatric Cure. We talk about how explainable AI and machine learning are key to tailoring treatments for kids while minimizing the risks.
The iPC project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 826121.
This is a Technikon podcast.
There is a digital transformation all around us. We know this from news reports, but even more so we know this because we are a part of it. How many of your daily activities are done on your computer or mobile telephone? So what about the big picture, like treating cancer in children? How has this changed for the better in our digital world? I'm Peter Balint from Technikon, and today we explore this very topic with Davide Cirillo from the iPC project. He represents the Barcelona Supercomputing Centre, a partner in iPC, which, by the way, stands for individualized paediatric cure. Doctors, clinicians, oncologists, biomedical engineers and computer scientists are working together in iPC to use data, human samples and artificial intelligence to tailor treatments for kids while minimizing the risks. Let's have a listen. Thanks, Davide, for coming on today.
Thank you. Thank you. It's a pleasure to be here. Thank you for having me.
First, let's look at what iPC is trying to do, but frame your answer around this quote and this is from the iPC webpage: "The future of pediatric cancer treatment is in personalization."
Yes. So in this sentence, there are two words that really stand out and summarize very well the goals of the iPC project, and those words are personalization and future. So in the health domain, by personalization, we generally refer to personalized medicine, which is a medical paradigm centered around the patient in order to deliver better therapeutic strategies and preventive solutions. So have you ever noticed that if you have two patients that have been diagnosed with the same condition and you give exactly the same treatment to them, the outcome of this treatment will be different? So maybe in one of the two people the drug is working very well, but in the other it is maybe not working very much. So the reason behind this is that the variability in the human population is huge. So we are very different from one another, and the personalized medicine approach is actually looking for those differences in order to provide better therapies and better diagnostic solutions. And so it is looking for individual information like, for instance, demographic characteristics like age, sex and ethnicity, the clinical history, maybe previous diseases, genetic factors like mutations, and also the socioeconomic context like, for instance, the environmental conditions in which the person lives or the psychosocial factors. And so this is about personalization. And then we have the future of pediatric cancer, and again, in the health domain, when we talk about the future of biomedical research, we refer to the applications of artificial intelligence, and we are witnessing a digital transformation of health care. But I would say a digital transformation of our lives in general. So we have artificial intelligence algorithms embedded in all the devices that we are using, from our smartwatches to our mobile phones.
We have algorithms that are able to recommend products to us based on our tastes and preferences, for instance on Amazon, or videos on YouTube, and stuff like that. So in biomedicine, and in particular in pediatric cancer research, we are trying to work with those artificial intelligence systems in order to realize this personalized medicine approach, and to use those algorithms and create those models in order to provide a better health service.
I see, so we could almost say that this is using our digital world to sort of connect with the world of cancer research or maybe even medicine in general for better outcomes, so that in this case, children don't have to suffer with treatments that don't necessarily work.
Exactly.
OK, so then we have to ask the question what's the role of Barcelona Supercomputing Centre in iPC?
Yes. So the Barcelona Supercomputing Centre is the national supercomputing centre in Spain, and we are experts in HPC, which stands for high performance computing and which is basically the use of computational resources with a high level of parallelism and scalability. And one of those resources, the biggest one, is called MareNostrum 4, which is a supercomputer that we host. Just to give you an idea, the peak performance of this supercomputer is almost 14 petaflops, which corresponds to more than 13,000 trillion operations per second. So this is a big machine that is really able to run programs fast. Yeah. And so this is MareNostrum 4. But the Barcelona Supercomputing Centre will soon host MareNostrum 5, which is one of the three so-called pre-exascale supercomputers selected in 2019 by an initiative that is called EuroHPC. And so do you remember the peak performance of almost 14 petaflops of MareNostrum 4? Well, with MareNostrum 5, this will become 200 petaflops. So these machines are growing. Yes.
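As a quick check of the figures quoted in this answer, the unit conversion can be sketched in a few lines of Python. This is only an illustration: the value 13.7 is an assumed stand-in for "almost 14 petaflops", and the function and constant names are made up for this sketch.

```python
# One petaflop is 10^15 operations per second, i.e. 1,000 trillion (10^12).
TRILLIONS_PER_PETA = 1_000

def petaflops_to_trillion_ops(petaflops: float) -> float:
    """Convert petaflops to trillions of operations per second."""
    return petaflops * TRILLIONS_PER_PETA

MARENOSTRUM4_PFLOPS = 13.7   # assumption: illustrative value for "almost 14"
MARENOSTRUM5_PFLOPS = 200.0  # figure quoted in the conversation

mn4_trillions = petaflops_to_trillion_ops(MARENOSTRUM4_PFLOPS)
speedup = MARENOSTRUM5_PFLOPS / MARENOSTRUM4_PFLOPS

print(f"MareNostrum 4: about {mn4_trillions:,.0f} trillion operations per second")
print(f"MareNostrum 5 peak is roughly {speedup:.1f} times higher")
```

Under that assumption, the result lands between 13,000 and 14,000 trillion operations per second, which matches the "more than 13,000 trillion" mentioned above.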
Oh, wow.
So yes, this is what we do. We use those computational resources in order to provide these HPC resources to different communities and to generate knowledge in different areas such as, for instance, engineering, climatology and, of course, life sciences. So the role of BSC in iPC is to develop research in artificial intelligence using these HPC resources and to provide, of course, the computational infrastructure required for such a big project.
Mm-Hmm. You're using artificial intelligence then in iPC, and I think many people might not understand that, and they might immediately jump to the fact that they think that there are ethical issues associated with this. How is that dealt with in iPC? Or is that even the case?
It definitely is the case for many reasons, and not only for the aspects related to artificial intelligence, but also those. So in general, ethics is fundamental to the iPC project and, as I said, to the applications of artificial intelligence to health in general. The issue is that thinking of a form of medicine that goes hand-in-hand with artificial intelligence is fueling a very strong debate nowadays. So there are some opinions, for instance, on the possible dystopic future for healthcare where, you know, machines take over doctors' jobs and things like that. But actually the most likely scenario for the next decades is an increase in flourishing human-machine interaction, where a human doctor will be accountable for any decision about the patient, without a direct, and in particular without an unauthorized, intervention of any machine. So yes, the ethical aspects are crucial to this type of project, and BSC has dedicated resources to ethics in the project, and we also participated in many workshops devoted to these themes organized by the consortium. Then there is an important aspect that is more related to the actual development of artificial intelligence, which is that all the data that we can use to train artificial intelligence systems can also be extremely biased towards certain groups. So we really have to be careful that the data that we are using is not excluding under-served groups and minorities, and situations of this kind. Biases can be hidden in all the steps of the lifecycle of artificial intelligence development. For instance, we can have historical biases in the documents and data, which can retain cultural aspects, beliefs or stereotypes. We can have representation biases if we select a specific group of patients, excluding others.
We can have biases in the way the values and the parameters of the data that we are collecting are measured, and we can have different biases in the way those systems are evaluated and finally deployed in the real world, because you can release a model, but depending on where you are deploying the system and who is going to use it, you can create a bias there as well. So it's really a complex landscape of biases and issues that we have to be very careful to address when we start an artificial intelligence based project like iPC, for instance.
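One of the biases described here, representation bias, can be illustrated with a very small check. The sketch below is not taken from iPC; the function name, the group labels and the reference shares are all hypothetical. It simply compares how often each group appears in a dataset with the share expected in a reference population.

```python
from collections import Counter

def representation_gaps(samples, reference_shares):
    """Return, per group, the dataset share minus the reference share.

    A strongly negative value suggests the group is under-represented.
    """
    counts = Counter(samples)
    total = len(samples)
    return {
        group: counts.get(group, 0) / total - share
        for group, share in reference_shares.items()
    }

# Hypothetical cohort: 20 female and 80 male patients.
cohort_sex = ["F"] * 20 + ["M"] * 80
# Hypothetical reference population with equal shares.
gaps = representation_gaps(cohort_sex, {"F": 0.5, "M": 0.5})

for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

A check like this only covers one stage of the lifecycle; the measurement, evaluation and deployment biases mentioned in the answer would need their own checks.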
Yeah. And it seems like ethics is a really necessary part of the framework in a project like this.
Yes.
So in many existing personalized treatment plans for cancer, large datasets are used to help inform clinicians about treatment plans. Why is this not the case when examining the paediatric patient population? And what does iPC propose to do about this?
Yes. So the main conundrum here is that many pediatric tumours are rare, and a rare disease, by definition, affects a small number of individuals compared to the general population. And so, as a consequence, the datasets that we have are characterized by being of small size, so we have few data points, and this limits our ability to be statistically confident about any finding that we might identify in our research. So one solution to overcome this limitation is to augment the data by generating synthetic instances. And indeed, BSC is very much involved in this aspect. This is a very advanced application of artificial intelligence for synthetic data generation, and it is emerging as a dominant AI solution for personalized medicine, since it makes it possible to address those types of challenges: for instance, creating the data volumes that are needed to deliver accurate results, and also correcting for possible biases, as we were discussing before, and also complying with increasingly restrictive privacy regulations. And this is, of course, very much relevant for pediatric cancer research. I'm talking about privacy here because you have to imagine that when you are creating a synthetic version of a patient, you are kind of detaching from the real person. So you can then work on this digital twin and you can, for instance, test different perturbations. You can see if a drug is actually working or not working, but without doing this on the real patient, only in a virtual environment inside a computer. So this is the main idea, and this is the main advantage of synthetic data generation. And regarding this, there is an entire field of research that is working on this particular area. And this is because synthetic data generation uses mainly deep learning nowadays. And the problem with deep learning is that it is something called a black box.
We hear more and more about this term, and a black box is basically a system in which all the complexities and the nonlinearities that are used to model the data are not intelligible to humans. And so there is a lot of research focusing on the explainability of those systems, the explainability of artificial intelligence, which basically means how to convert a black box into a white box, so something that we can actually understand. We would like to see the mechanism behind the learning process and behind the algorithm, so that we can really see what is going on and why the machine reached a certain solution, a certain outcome.
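To make the idea of augmenting a small dataset with synthetic instances more concrete, here is a minimal sketch. It deliberately swaps in a much simpler technique than the deep learning models mentioned in the conversation: each synthetic point is a random interpolation between two real points. The function name and the toy data are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def augment_by_interpolation(X, n_synthetic, seed=0):
    """Create synthetic rows as random convex combinations of pairs of real rows.

    This is a simple stand-in for a generative model, not a deep learning method.
    """
    rng = np.random.default_rng(seed)
    i = rng.integers(0, len(X), size=n_synthetic)  # first real sample of each pair
    j = rng.integers(0, len(X), size=n_synthetic)  # second real sample of each pair
    t = rng.random(size=(n_synthetic, 1))          # mixing weight in [0, 1)
    return X[i] + t * (X[j] - X[i])

# Hypothetical small cohort: 10 patients, 3 numeric features each.
real = np.random.default_rng(42).normal(size=(10, 3))
synthetic = augment_by_interpolation(real, n_synthetic=50)
augmented = np.vstack([real, synthetic])
print(augmented.shape)
```

Because every synthetic row lies between two real rows, this kind of augmentation cannot invent values outside the observed range, which is one reason more expressive generative models are of interest for this task.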
I see. So it's not just about the machine making a decision. You also have to understand why it made a decision.
Exactly.
And when we talk about data and the lack of data, I'm guessing that besides data just not being there because there aren't so many cases, because pediatric cancer is rare. In the cases where there is data, it may be protected and difficult to access, perhaps due to the laws that protect minors to a greater extent than adults.
Yes, exactly. We also have to consider this. So we are talking about children, and of course, in this case, the regulation is much stricter. And I mean, this is understandable, and of course it's completely right. One of the main issues most of the time, and this is not just related to pediatric cancer research but in general, is that those regulations are very strict on data access and the management of sensitive data, also for the people that are working with this data. So it's generally very difficult to share, for instance, the data among different hospitals or among different research institutes. And so in my opinion, personally speaking, I think that creating a safe environment for data sharing is crucial to advance research in this field. So all the regulation is OK, we all agree that it is right and it must be there, but we also have to guarantee that the researchers can actually work. So there are some solutions to this, in particular in the artificial intelligence area. One of them is called federated learning. So basically, imagine that you have many different hospitals in different countries, and each hospital has its own constraints on data sharing. So instead of bringing the data to a centralized place, what federated learning proposes is to bring the models in situ, so you train the artificial intelligence system inside the hospital and share only the parameters that have been learned during this local training. So in this way, basically, you can train a general model that is accounting for all the data, while the data stays in the periphery, stays in those places and doesn't move, but you can still learn from it.
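The federated learning idea described in this answer can be sketched with a toy example. The code below is an illustration under stated assumptions, not the approach used in iPC: it uses a plain linear model, simulated "hospital" datasets, and a sample-weighted average of the locally learned parameters (commonly known as federated averaging). All function names are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def local_train(weights, X, y, lr=0.1, epochs=50):
    """Gradient descent on one hospital's private data; only weights leave the site."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of the mean squared error
        w -= lr * grad
    return w

def federated_average(local_weights, sizes):
    """Combine local parameters, weighting each site by its number of samples."""
    return np.average(np.stack(local_weights), axis=0,
                      weights=np.asarray(sizes, dtype=float))

def run_federated(hospitals, n_features, rounds=20):
    """Repeat: send the global model out, train locally, average the results."""
    global_w = np.zeros(n_features)
    for _ in range(rounds):
        local_ws = [local_train(global_w, X, y) for X, y in hospitals]
        global_w = federated_average(local_ws, [len(y) for _, y in hospitals])
    return global_w

def make_hospitals(seed=0, sizes=(30, 50, 20)):
    """Simulate three sites whose data follow the same hidden linear relation."""
    rng = np.random.default_rng(seed)
    true_w = np.array([2.0, -1.0])
    hospitals = []
    for n in sizes:
        X = rng.normal(size=(n, 2))
        y = X @ true_w + rng.normal(scale=0.01, size=n)
        hospitals.append((X, y))
    return hospitals, true_w

hospitals, true_w = make_hospitals()
global_w = run_federated(hospitals, n_features=2)
print("learned:", global_w, "hidden:", true_w)
```

In this sketch the raw `X` and `y` arrays are only ever read inside `local_train`, which mirrors the point made above: the data stay at each site, and only the learned parameters are shared and combined.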
So previously you mentioned in iPC, the virtual patient or the digital twin, and this is used to test treatments while preventing any harm to any humans. Tell us more about this.
Yes, a digital twin is a synthetic version of some characteristics of a patient. So I think it's important to say that when we say digital twins, we do not mean a digital physical version of a person. You know, this is probably what most people think when you say digital twin. What we are simulating are specific characteristics. So the scale is important, because we can simulate the macroscopic or the microscopic, and there are different modeling approaches depending on the scale that is more adequate for what we want to study. So we can reproduce multicellular systems like, for instance, pieces of tissue, using coarse-grained simulations with a lower resolution. Or we can go much finer and we can, for instance, reproduce the expression levels of the genes that are sitting on the DNA of a patient, with a very high resolution. So no matter the scale, what we can always do is basically to test what-if scenarios. So again, if we can reproduce something, if we can simulate something, we can then perturb it and see what the effect is inside a computer and not inside the real patient.
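A what-if scenario of the kind described here can be illustrated with a deliberately simple toy model. The sketch below is not one of the iPC models: it assumes logistic tumor growth with a constant drug-induced kill rate, and every parameter value and name is hypothetical. The point is only to show the pattern of simulating the same "twin" with and without a perturbation.

```python
def simulate_tumor(cells0, growth_rate, capacity, kill_rate, days, dt=0.1):
    """Euler integration of dN/dt = r*N*(1 - N/K) - k*N, a toy growth model."""
    cells = cells0
    for _ in range(round(days / dt)):
        growth = growth_rate * cells * (1 - cells / capacity)
        kill = kill_rate * cells
        cells = max(cells + dt * (growth - kill), 0.0)
    return cells

# Hypothetical twin: parameters that would normally be fitted to one patient.
twin = {"cells0": 1e6, "growth_rate": 0.2, "capacity": 1e9}

# Same twin, two scenarios: no treatment versus a drug with an assumed kill rate.
untreated = simulate_tumor(**twin, kill_rate=0.0, days=60)
treated = simulate_tumor(**twin, kill_rate=0.3, days=60)

print(f"untreated after 60 days: {untreated:.3g} cells")
print(f"treated after 60 days:   {treated:.3g} cells")
```

With these assumed values the kill rate exceeds the growth rate, so the treated scenario shrinks while the untreated one grows; changing `kill_rate` is the "perturbation" that is tested inside the computer rather than on the patient.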
So if you look ahead a bit, what is the best outcome for iPC? And I mean not only from the domain of science and medicine, but also from the side of pediatric patients and their families?
Yes. So the best outcomes for iPC would be, first of all, to generate knowledge about the pediatric tumors that are under study, especially because most of them present many unanswered questions and aspects that need to be further studied. So this is an important aspect of this project, which is fostering research in specific pediatric tumors. Then, to provide an infrastructure where all the data and the tools that have been created can be accessed in a secure and privacy-preserving environment, and also reused for other pediatric tumors and other rare diseases as well. And finally, to find theoretical but also practical solutions to technological limitations that are related to the typical scenarios of personalized medicine applications of artificial intelligence, and we talked about that before, like, for instance, the small-sized datasets or the explainability of the black box models. So all of those would definitely be the best outcomes that we can have from a project like iPC.
OK, well, it sounds like iPC is a really important project for making inroads into research and treatment options when it comes to pediatric cancer. And I want to say thank you for taking some time with us today and sharing your knowledge in the project and how it works.
Thank you so much. It has been a pleasure.
For more information about iPC, go to ipc-project.eu. The iPC project has received funding from the European Union's Horizon 2020 Research and Innovation Programme under grant agreement Number 826121.