
Can Your Phone Tell When You're Getting Sick?

Published Mar 14, 2024, 4:30 AM

What does sickness sound like? Sometimes it’s obvious, like a cough, sniffle, or stuffy nose. But some conditions cause subtle changes that only a trained ear – or AI – can detect. Dr. Yael Bensoussan is a professor of otolaryngology and the director of the Health Voice Center at the University of South Florida. Her problem is this: How do you build a giant, public database of thousands of voice recordings, and use it to train AI tools that can hear when people are getting sick?

Pushkin. There are a lot of reasons that I'm excited about today's show. I'm going to tell you three right now. Number one, the show is about this whole dimension of medicine that I essentially didn't know existed: acoustic biomarkers, basically using a person's voice to assess their health. Second thing I'm excited about: the show is about the intersection of AI and healthcare, one of my top, say, five intersections. Love that intersection. And three, today's guest, Dr. Yael Bensoussan, gave me what was truly the best excuse that anyone has ever given me for canceling an interview at the last minute.

Yeah, so I'm really sorry for having to cancel on you yesterday.

What was the surgery you had to do yesterday?

So yesterday, we call it airway surgery, where I take a patient to the OR and I have to open up their windpipe, or their trachea, because there's scar tissue that's blocking them from breathing. So I have to go with a laser and cut the scar tissue out, and then take a balloon and open up their windpipe so that they can wake up and breathe better, and that translates to a different sound when they're breathing.

So when they're not breathing well because of the scar tissue, they can sound like, you know, very noisy breathing. We call it the Darth Vader breathing. And then when they wake up from surgery and they're done, they have silent breathing, which means that I know that I did a good job.

I'm Jacob Goldstein, and this is What's Your Problem, the show where I talk to people who are trying to make technological progress. Dr. Yael Bensoussan runs the Health Voice Center at the University of South Florida, and she is also leading a team of researchers that's building a giant database of human voices and breaths and health information. Her problem is this: how do you record the voices of thousands of people, without violating patient privacy laws, while building a giant public database that could someday allow your phone to warn you, based solely on your voice, that you may be getting sick? Yael told me that she got into this field in part because she used to be a singer.

So, growing up, you know, I always was in a very musical family. I took singing lessons when I was a kid, and then I started singing more professionally around eighteen years old, and I had a short but exciting singing career. I wrote pop folk music. We had a band, and we toured. We had an album out in two thousand and twelve. Yeah, and I mean, it was a lot of fun. And actually, the reason I was able to have that short and exciting career was because I met a speech pathologist when I was fifteen. So I was taking singing classes, and one day my teacher looked at me and she said, there's something wrong with your voice.

Go get checked.

And I met a laryngologist who put a camera down and said, you have nodules on your vocal cords and you might not be able to sing again if you don't take this seriously. And I went to see a speech pathologist, I did rehabilitation with my voice for six months, and I was able to sing again. And I mean, that's what led me to then become a speech pathologist, and then eventually go to med school and then decide to become a laryngologist.

So it was kind of all interconnected.

So I know that your research now, and most of what I'm really interested to talk with you about, is around acoustic biomarkers. So just to start, I mean, what's an acoustic biomarker?

Very good question. So what is a biomarker, first? A biomarker is something that indicates the presence of a disease, right? So if you think about a biomarker for a cancer — different cancers have different types of biomarkers. For example, for ovarian cancer, we're looking for a specific thing, you know, called CA-125, in your blood. For different types of cancers, they could take a blood draw and find a specific biomarker. It's an indicator of a disease. An acoustic biomarker is something that can indicate the presence of a disease, but that you can hear. So that's the definition of an acoustic biomarker. So I always say, you know, when you have people in your family that are not well, you will always notice first, and you'll say, you don't sound good, right, or you sound funny. And I have the luxury to know that because I'm a voice doctor. So then people will bring me their family members, or people will come saying, I don't know what's wrong with me, but my wife told me to come because my voice is not good. And sometimes it's because their vocal cords are not working, but a lot of times it's because they can have a neurological issue or a cardiac issue that is affecting their voice.

So, more broadly, what's going on with AI and acoustic biomarkers.

Yeah, so, so many exciting things are going on. I think that's the first answer. There are so many startups, so many companies, industry researchers, academic researchers that are working and looking into voice AI. And the reason is, it's really cheap to collect. Right? Think about this: if you have a phone, it's really cheap to collect, compared to...

You don't have to take a blood sample. Exactly. You've got the phone, you've got the device literally in your hand already. All you have to do is talk, and you're talking already.

And you're talking already. So it's cheap — that's why pharmaceutical companies are also very interested, and there are a lot of pharmaceutical projects around it. So there are a lot of projects going on, and the current landscape is that there are tons of people working on very similar things, in various interesting diseases. So I kind of categorize them in three categories of diseases that are being studied. One is the diseases that affect the voice box. Okay, so vocal cord paralysis — absolutely, it's intuitive, there are going to be vocal biomarkers in that. Voice box cancer, right, that's easy. Then there are voice- and speech-affecting disorders, so disorders that don't affect the voice box, but that have an impact on the voice and the speech. Parkinson's is one of them, right. Alzheimer's is one of them. A stroke: somebody having a stroke, they don't have a problem with their voice box, but their speech is going to be altered. So these are voice- and speech-affecting conditions. So lots of work is being done in that field. And the third one is diseases that you don't think would affect speech, but still people are doing research on that. So there was a really interesting study on diabetes. There was a group that published that they could distinguish people that were diabetic versus non-diabetic based on their speech.

So this third group is one presumably where there's at least the potential for AI to detect differences that even experts like you cannot detect, right? I mean, is that what's going on there?

So AI is not magical, you know. I think it does a lot of things. But what AI does that the layperson doesn't do is that it can analyze a lot more data, faster.

Yeah.

Right. So AI has the possibility, if you have a large data set, to then find small differences in these data sets that we can't. I mean, I would have to listen to, you know, thousands and thousands of voices and compare them statistically.

It might, right. It might also be able to detect differences that are not even audible.

It could, exactly. I can give you an example. There's a company looking at atrial fibrillation, and I cannot validate their data, because that's one of the limitations that we're going to talk about — obviously their data set is not public. But they're saying that they can diagnose atrial fibrillation based on the voice, and their explanation is that our voice vibrates to the sound of our heartbeats.

Big if true? Fun if true?

I mean, you know, again, the limitation here is that there are a lot of things you can't validate. But they say that they've been validating it with EKGs and that they can see it. They can hear a difference in the voice between patients with AFib.

Atrial fibrillation. It puts you at risk for a stroke, right, and it can go undiagnosed. So, like, if this works, that would be very helpful to many people, right? Absolutely, absolutely. So that's super interesting, and it's interesting more generally. So you're building a giant database, right, and I find that interesting for a lot of reasons. Have you come across the work of Fei-Fei Li? Absolutely, yes. So I talked to Fei-Fei Li for this show not long ago. Wow. Right, she's like nerd famous, right? Yeah. And so, as you know, she built this giant database of images, about ten years ago, a little more now, called ImageNet. And that giant database was what allowed these early machine learning models, AI models, to, you know, start recognizing images. And so the database was this necessary tool, necessary thing, for the AI to really work. And so are you building the acoustic biomarker version of that?

So the short answer is yes, but I'd like to start by saying that I am not building it; it's our consortium.

Yes, yes, you all are.

Actually, I'll just first start by recognizing here that it's a huge team. The Bridge2AI Voice Consortium is a team of fifty investigators across the US and Canada. We're funded by the NIH through the Bridge2AI program. And the goal — actually, this is the first time I hear the analogy to the ImageNet database.

I like it.

I usually give the example of the genomic database, the Human Genome Project, huge.

Project, more famous, more famous, they're.

Both very famous. But I like this analogy.

Well, ImageNet is maybe a little bit closer of an analogy, but maybe less, yeah, sexy, yeah.

Well, but I mean, it's interesting, because the genome project also has very interesting ethical particularities, like voice, right? Images have a little bit less of the ethical constraints.

When we talk about whole genome sequencing or genomics data, people kind of understand that voice has similar concerns in terms of process.

We want to get to the concerns, but I want to first talk about what you're doing, and then we can talk about, you know, not doing anything wrong. Yeah. So broadly, if it becomes the thing you hope it will be, what is it going to be? What is the Bridge2AI voice database going to be?

So it's going to be this large database of thousands of human voices, linked to other health information, that is going to be available to researchers, and potentially people other than researchers as well, to be able to make discoveries, right? To learn to use voice AI, to train, you know, the next generation of people on how to build models on voice AI, to help pharmaceutical companies develop products, or even learn to develop products, right? And the other really important thing is to teach people what type of standards we need. Right now, across a lot of different projects, there's really a lack of standards. People collect voice in different ways; that's why it's really hard to pool data together. So our dream was really to say, like, hey, you want to do voice research? Here's a manual, my friend. Like, here is how you collect the voice to make it accurate. These are the protocols, with the tasks that we think, based on our studies, give the best biomarkers. These are the types of biomarkers you can look for, and this is the data you can train on. So really, to create a manual of operations for people to be able to make discoveries. And that's the goal: to have the most impact on patient care.

So what are the biomarkers? What are you asking people to do? What are you collecting?

So I separate things out. There are respiratory biomarkers, voice biomarkers, speech biomarkers, and linguistic biomarkers, and they're all different. So let's go through why these are different. Respiratory is easy, right? We ask people to breathe, to cough, to take big breaths in, and that has a lot of information on our pulmonary capacity, on how our windpipe is shaped. Okay, that's respiratory. Then voice and speech — what's the difference? Voice is really the sound that we make when our vocal cords come together. So we say, like, birds can voice, but they can't speak. If you have a bird that speaks, then you'll be very...

Rich, or you have a parrot.

So when we do voice tasks, we ask patients to say E or...

Ah or I.

Get the difference.

Voice biomarkers will be impacted when our voice box is changed or our respiration is changed, right? So somebody with pneumonia probably cannot hold a note for very long. So that's voice biomarkers. When we talk about speech biomarkers, then you go into articulation. Some people, for example, who have neurological deficits or whose mouth is not working correctly, are going to have trouble articulating. They're going to have trouble saying some words. So these are biomarkers we can extract. And then lastly there are linguistic biomarkers: what types of words are people using, what type of semantics, how fast do they speak, for example? These are all different types of biomarkers...

That we can extract.

So to give you a very tangible example, I was reading a paper from a group looking at biomarkers of depression, and rate of speech was one of the important biomarkers they found. So people who are sad or depressed will speak at a slower pace, so words per second is lower. When you think about it, it's a simple biomarker, right? So that's to give a tangible example. In terms of — I think I didn't answer your question fully — what are we asking patients? We ask people to do all these tasks, so coughing, breathing, ah, ee. Then we make them read these validated passages, and we also ask open questions. And when we ask open questions, we have to ask some questions that make them emotional and some that don't make them emotional, because if you trigger emotion, that causes a bias in how your voice will sound.
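To make that rate-of-speech idea concrete, here is a minimal sketch — not the study's actual method — of how a words-per-second feature could be computed from a recording and its transcript. The file name and transcript below are hypothetical, and a real pipeline would use a speech recognizer and handle pauses more carefully.

```python
# Minimal sketch of a rate-of-speech feature (words per second), the kind of
# simple linguistic biomarker described above. Hypothetical file and transcript;
# not the actual protocol from the study being discussed.
import wave


def words_per_second(wav_path: str, transcript: str) -> float:
    """Crude rate-of-speech estimate: word count divided by audio duration."""
    with wave.open(wav_path, "rb") as wav:
        duration_s = wav.getnframes() / wav.getframerate()
    words = len(transcript.split())
    return words / duration_s if duration_s > 0 else 0.0


# A lower value (slower speech) was one of the markers the depression study
# reportedly found.
rate = words_per_second("interview_prompt.wav", "tell us about why you are here today")
print(f"{rate:.2f} words/second")
```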

What question do you ask to make people emotional?

So it's really interesting.

So at first we would ask — you know, our first question was — can you talk to me about something that makes you sad? It could be somebody that died in your family, or, you know. So that was our prompt. And then our question without emotion was: tell us about your disease.

Only a doctor. What'd think that's for that emotional question?

Exactly?

I mean, but it's like, when you think about it, our consortium is like tons of experts that put their minds together to develop this.

Tell me about having Parkinson's. That's the unemotional question we're going to ask.

And then we, I mean, we asked, like, why are you here?

I think it was not that obvious, but it's like, tell us about why you're here to see your doctor today. And then, analyzing the data — because we do pilots, right, we audit our data — we realized that people were starting to tear up. Like, we had people crying while talking about why they were coming to the doctor today, which is...

Supposed to be the example of unemotional.

Sure, correct, So we had to change that.

Yes, interesting. Okay, this is great. So you're getting a lot of auditory information from every patient. What other information are you getting from each person?

So much. To give you an idea, our full protocol is about one hour.

Okay, so, with the patient.

With the patient, and an iPad. Everything is based on an iPad, and there's a helper right now, a research assistant. So we collect data. We collect very extensive demographics, in terms of, you know, age, race, geographical location. We collect language: what language do you speak, how many languages do you speak, what languages do you write, you know, what part of the world are you from? That's really important. Then we collect about disabilities: are you hearing impaired, are you visually impaired? Because that makes a change in your voice. Your smoking status, your hydration status, your fatigue status. So we kind of thought about anything that could affect voice, right? Your socioeconomic status, because if you think about it, that's going to affect, you know, your linguistics as well. And then, other than that extensive demographics, we collect confounders — so we think about anything that could change your voice. Do you have allergies? Do you have dental issues? Do you wear braces? And everybody gets a basic test for whether they are depressed. So no matter what disease you have, you kind of get the basic tests for all the other diseases, to measure whether it's possible that you have concurrent diseases at the same time.
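As a rough sketch of the kind of structured metadata she is describing — demographics, languages, disabilities, and confounders attached to each recording — here is a hypothetical record layout. The field names are invented for illustration and are not the consortium's actual schema.

```python
# Hypothetical participant record illustrating the kinds of fields described
# above. Invented field names; not the Bridge2AI consortium's actual schema.
from dataclasses import dataclass, field
from typing import List


@dataclass
class ParticipantRecord:
    participant_id: str
    age: int
    languages_spoken: List[str]
    geographic_region: str
    hearing_impaired: bool
    visually_impaired: bool
    smoking_status: str                  # e.g. "never", "former", "current"
    confounders: List[str] = field(default_factory=list)  # allergies, braces, dental issues...
    depression_screen_score: int = 0     # basic screen given to every participant


record = ParticipantRecord(
    participant_id="P0001",
    age=62,
    languages_spoken=["English", "Spanish"],
    geographic_region="US-Southeast",
    hearing_impaired=False,
    visually_impaired=False,
    smoking_status="former",
    confounders=["seasonal allergies"],
)
```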

Because presumably because people are in fact complex, and there are many people who have depression and Parkinson's and you want to understand what's going on there.

I mean, most people are complex, right? It's really rare not to be, and people that go to the doctor are not twenty years old and healthy. Most of the people who will use our technology, or will benefit from this database, will be your typical sixty-year-old chronic disease patient that comes into the doctor, and they don't have a clean bill of health.

How many people do you want to have in the database? Like, is there a final number you're going for?

So at the beginning, we were aiming for thirty thousand, which is extremely ambitious, I think, to be fair. I mean, if after four years we get to ten thousand, I think it'll be a huge success. Okay. And you know, the data collection — I think what we're learning is that data collection is very resource intensive. To have good data is very resource intensive.

So what happened that made you realize that thirty thousand was maybe harder than you thought?

So, I think we thought that we wanted to collect as much data as possible, and our original plan was to collect a lot of shorter protocols, you know, like shorter clips. But as we started working with patients, we realized that by getting more data from the same patients, we can actually have a lot more information, and it provides a lot of interesting, you know, biomarkers. So we're focusing more on getting more data from a smaller number of patients — really the right data, with a lot of clinical information attached to it.

After the break, what the world will look like in a few years if everything goes well. So this is a big project that Yael and her colleagues have embarked on. It's a four-year project. They're about a year in, and there will be interim data releases along the way. So I asked her: how long will it take for this project to advance the state of the science in acoustic biomarkers?

Yeah, I would say the end of the four years would probably be the best answer. I think at the end of the four years. But I don't think, you know, you can just say, oh, we'll just start training models at the end of the four years once we have all the data, right? It's not just about, you know, building one model that will answer your question; it's about continuously training models to understand which biomarkers to extract, and then building products that work.

So, if things go well, what will this world look like in, whatever, five years?

Yes. So, I mean, there are a few things that this can help with — in general, voice biomarkers, let's not talk about just our project. Diagnosis is one thing, right, early diagnosis, but that's probably the hardest thing. Huh. Screening is almost more important. So when we think about screening, it means, let's say you live really far away, you don't have access to a doctor, but your doctor has an iPhone and you can talk into the iPhone and it can say, hey, something's wrong, you know, you need a neurological specialist, for example. So to help screen and triage — I think that's probably what we're looking at in the next five years, something definitely possible. The other product that I think will be very possible within five years is tracking of diseases, if you want to monitor the evolution of Parkinson's or how people respond to drugs. That's why pharmaceutical companies are very interested.

Right. So the acoustic biomarker is not just a binary signal of disease, no disease. It can tell you a lot about the status of disease. Is it getting better, is it getting worse?

Evolution, especially if you train it on your own voice, right? It's even easier to detect changes in somebody's voice as they progress — like your Siri, for example, or Alexa, that learns and listens to your voice. So that's going to be a really good tool for pharmaceutical companies. That's why they're investing in it, right — to see how you respond to a drug, how you respond to a treatment. And when you think about telehealth at home, right, more and more we're going to talk about remotely monitoring people. There are just too many people on this earth to all be in hospitals when we're sick.

Well, and if you can stay out of the hospital when you're sick, that's better, right? You don't want to go to the hospital unless you have to.

Or your, you know, your Alexa detects when your voice starts deteriorating and sends you a nurse before you need to go to the hospital.

So there's a more general version of that, right, that you could imagine, which is you get your, whatever, your iPhone, your Android phone, and you have a choice when you're setting up your phone, like, do you want to opt in to the phone listening, to tell you if you need to go talk to your doctor, right? Just a very broad-based thing that you could opt into. Like, I would probably opt into that. I mean, is that a thing that you think about?

So, I mean yes, I'm sure that you know Apple is working on that already.

They are, you know.

The question is there has to be technology that's being developed as well to ensure privacy of not only you, but your environment. Right, because when it's your phone, then it's your environment as well.

So you brought up privacy in that context; we can knock out privacy in the context of the database as well here. How could it go wrong? Building a database of thousands of people's voices with tons of data about them — it sort of answers itself.

Yeah, it can go wrong in many ways. And I just came out of, like, two hours of meetings on this. So at the Bridge2AI program, we have a huge group of bioethicists, and one of our big aims as a group is really to ensure patient privacy and to answer these questions of how we protect patient privacy in the context of open data. So, you are absolutely right, tons of things can go wrong. People can potentially be reidentified through their voice. So one of our biggest goals this year is to determine what part of the voice is identifiable and which part is not, okay? And all of this is based on the HIPAA law. HIPAA is from the nineteen nineties.

HIPAA — the law that governs sharing and security of people's medical information.

Correct — protected health information, PHI, we call that. And that law was made in the nineteen nineties, right? And back then, they made a list of what they called PHI, or identifiers, that cannot be shared openly and that should stay in the hospital. And voice prints are listed. When you go into what the definition of a voice print is, it's very nebulous. You know, we don't know. So because of that nebularity...

Is that a word? Nebulosity? I'm afraid I don't know.

Because it's so nebulous, a lot of institutions, a lot of hospitals will say, well, you know, voice is not an identifier as long as you don't say, hi, I'm John Doe and I live at four twenty five blah blah blah. Other universities will say, no, no, no, voice is always an identifier; you can never really release voice data. So what our group is doing right now is really looking at why the HIPAA law says this, and what the actual legal implications of sharing voice are. And we always grade it in terms of risk, right? If I talk about all the things that we collect, you can think that the respiratory sounds are probably very safe to share, versus a speech sample — what we call free speech is probably the most identifying, if you have to grade it, right? And we're kind of looking at, well, where is the balance? How much can we release? And also we can transform the data. So, for example, we can change the audio data into what we call visual spectrograms.

Like a waveform.

Yeah, it's a sort of waveform that machine learning can use. We can extract acoustic features, right, like loudness, frequency, stuff like that.
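To give a sense of what turning audio into "visual spectrograms" and acoustic features like loudness and frequency can look like in practice, here is a minimal sketch using the open-source librosa library. It illustrates the general technique, not the consortium's actual pipeline, and the file name is hypothetical.

```python
# Sketch of extracting a spectrogram plus simple acoustic features (loudness,
# fundamental frequency) from a recording, using librosa. Illustrative only;
# "sample.wav" is a hypothetical file, not the consortium's pipeline.
import librosa
import numpy as np

y, sr = librosa.load("sample.wav", sr=16000)      # mono waveform at 16 kHz

# Mel spectrogram: the image-like representation a model can be trained on.
mel = librosa.feature.melspectrogram(y=y, sr=sr)
mel_db = librosa.power_to_db(mel, ref=np.max)

# Loudness proxy: root-mean-square energy per frame.
rms = librosa.feature.rms(y=y)[0]

# Fundamental frequency (pitch) estimate per frame.
f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr)

print(mel_db.shape, float(rms.mean()), float(np.nanmean(f0)))
```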

And basically you're trying to figure out how to make a person not identifiable based on their voice without messing up the database. Like, that's the balance, right? If you monkey with their voice too much, then you're monkeying with the data, the database that we care the most about. That seems like a hard trade-off. So if we go farther out into the future: you solve all these problems, you build your giant database, the models get really good. All of these things seem like things that may well happen. I'm curious about, you know, AI doing some chunk of what you do now. We see this happening, say, in radiology already. AI is clearly very good at doing some of the technical work that radiologists do in diagnosing scans of patients. How do you think about the future of AI, you know, using acoustic biomarkers to make diagnoses in a way that is similar to what you do now as a human being?

Yeah. So, I mean, I don't think I'm going to lose my job yet, because I would say that my primary goal as a doctor is not necessarily to do that, right? Like, my primary goal is, yes, to diagnose, but it's to treat patients. For now, AI is not going to treat the patient. So I think what it's going to do is support a lot of the workforce. So for example, I'm an academic laryngologist. I'm a super specialist.

Right.

For people to get to me, they often see like five different doctors.

So instead of going through four specialists who can't figure it out, you go to your primary care doctor, or even you just talk to your phone, and your phone says you better talk to your primary care doctor, and your primary care doctor sends the patient directly to you.

Correct, correct. Right, to say, like, hey. Because again, most of what we do, for a very long time, will need a gold standard diagnosis, right? So often, you know, it's a biopsy, or it's imaging.

You need a gold standard.

The acoustic biomarker is not a clear enough diagnostic technique. You need something more reliable.

So I don't think, you know — no doctor will say, oh, well, based on this, this is your diagnosis, start chemotherapy. That's not where we're going. I wouldn't take chemotherapy based on an acoustic biomarker. But it's hopefully going to support a lot of primary care and access to care, to get to the right person faster.

Great anything else we should talk about.

The one thing we didn't talk about — I guess I talk about this all day, so sometimes it's hard to remember what I've said and what I haven't said — but the implication is probably for all this new telehealth, you know, online world that we live in. A lot of industries are already integrating tools. So, for example, Canary Speech is a startup that sold a product. I think they're working with Teams to capture whether there are signs of depression in your voice.

Teams meaning Microsoft Teams, like Microsoft's version of Zoom.

Yeah, yeah. So I think — and don't quote me on the particulars, maybe I'm not getting it exactly right — but I know there are a few startups that are starting to integrate products in Zoom or in Teams to let employers know that, hey, your employee is not doing well based on his voice, for example. Right?

What is your view of the efficacy of those?

So, I mean, the quick answer is it probably works partially. Yeah. But the question is not whether it works fully or not. The question is, does it make a difference? Right? So let's say...

Whether it does what they say it does is a question that matters to me, right? Like, are the claims valid? Seems like a reasonable starting point. Yeah.

I think so.

So, I just reviewed an article from one of the startups that's looking at, like, depression, and I mean, their numbers look great. I do think that the results a lot of these projects are getting are definitely positive and promising.

Absolutely. We'll be back in a minute with the lightning round.


Now, as promised, we're back with the lightning round. What was your band called?

Ha. My stage name was Ella Bence, Ella Bence, because my last name is Bensoussan, so that was too long.

And Yael became Ella.

Yeah, I could say my first name.

And you had a hit song in French? What was it called?

I wouldn't call it a hit song. It was called "Un aller simple," which means, I guess in English, it's like a one-way flight.

Can you sing a line?

No, that's my previous life.

Can you just say a line? Yeah?

Uh, it's in French, though. I know, it'll sound great. Yeah: "Un aller simple..."

It means: get me a one-way flight to the other side of the world; I hope people are really happy there.

Well, you're here now, you're in Tampa now. Did it work out as hoped?

Oh, yeah. I mean, I have the best job in the world, you know. My mom raised us, me and my brother, saying, you guys need two jobs: one that makes money and the other one that makes you really happy. And if you manage to have both in one job, then you'll have made it, you know. And I get to be a surgeon and work with voice and voice professionals, and, you know, it's been my passion pretty much all my life.

So yeah.

Do you work with professional singers as a physician?

Absolutely. I mean, I love treating my professional singers, so yeah, I love that part of my job.

Have you treated anybody famous?

Yes, but I can't tell.

What's Taylor Swift really like?

Oh that I don't know. I wish No.

She sounds fine, though she probably doesn't need a laryngologist.

What's the best cure for a sore throat?

Voice rest and Advil. It takes the inflammation away.

Uh huh. Advil — just ibuprofen — and don't talk, voice rest.

Yes?

Okay. Are you just always involuntarily diagnosing people based on their voice, all the time?

So, funny fun fact.

Two months ago, my girlfriend from residency called me. I hadn't spoken to her in like a year and a half and she called me and she said hi, And I said, you're pregnant?

Really?

And I could hear it, because pregnancy gives you, like, this — you know, you get stuffy in a certain way in your nose. We call it rhinitis of pregnancy.

And I knew her voice very well.

She was my girlfriend for a long time, you know, we studied together, and I just knew it, you know. And I think she says, hey, how are you, I wanted to talk to you, and I'm like, you're pregnant, and she's like, how did you know?

Yes, that's amazing.

I mean, I was listening to the political debates, you know, and I'm like, ooh, this guy needs a laryngologist. I'm diagnosing people all the time.

Well, they should give you a call. Yeah, absolutely, Okay, lovely to talk with you.

It was one of the funnest interviews I've done.

Yael Bensoussan runs the Health Voice Center at the University of South Florida. She's also a principal investigator on the Bridge2AI Voice project. Today's show was produced by Gabriel Hunter Chang, edited by Lydia Jeane Kott, and engineered by Sarah Bruguer. Just a quick note: we're going to be taking a break for the next couple of weeks, but we will have an episode in our feed next week from our colleagues over at The Happiness Lab that is timed, not coincidentally, to World Happiness Day, which I'm informed is on March twentieth. I'm Jacob Goldstein, and we'll be back soon with more episodes of What's Your Problem.
