Having a good data strategy can streamline the way a company does business. In this episode of Smart Talks with IBM, Malcolm Gladwell takes on this topic with Ronald Young Jr., host of Solvable, and guest Nicholas Renotte, Data Science and AI Technical Specialist at IBM. They discuss how data literacy can help make a business more efficient, the fundamentals of data management, and why data is step one to AI solutions. A study quoted by Nicholas and referenced in this episode can be found here. Some of Nicholas’ guidance on machine learning can be found here.
This is a paid advertisement from IBM.
Hey everyone, it's Robert and Joe here. Today we've got something a little bit different to share with you. It is a new edition of the Smart Talks podcast series, which is produced in partnership with IBM. This season of Smart Talks with IBM is all about new creators: the developers, data scientists, CTOs, and other visionaries creatively applying technology in business to drive change. They use their knowledge and creativity to develop better ways of working, no matter the industry. Join hosts from your favorite Pushkin Industries podcasts as they use their expertise to deepen these conversations. Malcolm Gladwell will guide you through this season as your host to provide his thoughts and analysis along the way. Look out for new episodes of Smart Talks with IBM every month on the iHeartRadio app, Apple Podcasts, or wherever you get your podcasts. And learn more at ibm.com/smarttalks.
Hello, hello. Welcome to Smart Talks with IBM, a podcast from Pushkin Industries, iHeartRadio, and IBM. I'm Malcolm Gladwell. This season, we're talking to new creators: the developers, data scientists, CTOs, and other visionaries who are creatively applying technology in business to drive change. Channeling their knowledge and expertise, they're developing more creative and effective solutions, no matter the industry. Our guest today is Nicholas Renotte, senior data science and AI technical specialist at IBM. Nicholas's job is to help companies formulate a data strategy that streamlines the way they do business and prepares them to use sophisticated AI technologies. But beyond his day-to-day, Nick is also a content creator on YouTube, where his channel has over a hundred thousand subscribers. His videos explain computer science concepts in a way beginners can understand, and he often demonstrates how to use machine learning and data science to solve novel problems.
On today's show: how Nicholas learned data science from the bottom up, the fundamentals of data management, and how an innovative data strategy can help businesses create novel solutions. Nick spoke with Ronald Young Jr., host of the Pushkin podcast Solvable. Along with being a frequent contributor to NPR, Ronald also hosts and produces the podcasts Time Well Spent and Leaving the Theater. Okay, let's get to the interview.
So tell me a little bit about how you got into data and when you found out the power that it really harnesses. Do you have a story or anything about what first piqued your interest in data?
My first interaction with data and with coding was actually when I was around about eleven years old. So this was really just getting started with just looking at spreadsheets. My dad would come home after working a nine-to-five job, and he actually started investing in stocks and doing value-based trading that way. I'll always remember I walked up to his desk one time and he said, Nick, if there's one thing that you should learn, I'm seeing all these people work on these things called macros in spreadsheets, and these people are like wizards inside of my business. I know that you're still in high school, but I really think you should learn this stuff. And I started dabbling in some Excel spreadsheets and started just recording macros and tweaking stuff, and that's where it all started. But from there, it's always been a recurring vein throughout my career that I've done some sort of wizardry with data. Whether it's coding or business intelligence or data science, it's always had a bit of a strain throughout whatever I've done, whether that's startups or YouTube or what I'm doing now at IBM.
Your dad was right.
Let me just say that, because as someone who's trying to put together a spreadsheet just to manage my personal finances, trying to look up the formula to actually bring a value from one sheet to another is enough of a struggle for me. So I'm glad to do it. Really, it absolutely is. So knowing that this was how you started getting into spreadsheets, looking at stocks and all of that, can you talk to me about how you found out the importance of data literacy, how you began to value understanding what the numbers meant and what power that could have?
I got a cadetship at one of the big four accounting firms and started out as an auditor there, which is pretty much data focused. So I saw that these numbers ultimately fed into a significantly bigger picture, which was a formal annual report, and numbers being wrong in an annual report can move markets. Right? Those numbers need to be absolutely bang on. But I think that is sort of where it started. Where it really culminated was when I started doing some work at the Reserve Bank of Australia. And those numbers don't just impact the metrics for a particular organization, they impact the entire country's metrics. Getting those numbers wrong on a particular chart or getting them right on a particular chart can move entire organizations or can shift an entire country. It's kind of crazy, the value that doing things correctly with data has. So when you're presenting a metric, you have to ensure that you are portraying the appropriate message. It's not just about the raw number, because correlation does not necessarily imply causation. So understanding what it is that you're saying is so, so important, and it is so much more powerful now that we've got so much more data available at our fingertips. It's really easy to go and grab a bunch of metrics and go, hey, I'm gonna grab this data from over here, grab that data from over here, and put them together.
Hey, look, these two lines follow the same trend. They must be related.
Do you find yourself ever looking at data points and saying, I don't understand this chart? Where did they pull this from? Do you find yourself doing that a lot in your regular life?
Oh, yeah. There are some great charts out there as well that you always see, and they plot the number of Nicolas Cage movies against the GDP of Bolivia or something, and it's like, well, they're going in the same direction. They must have some relationship. But people can really quickly look at a picture and make an assumption about what that is saying without actually interpreting it. Hey, are these on the same scales? What time period is being displayed? What am I actually looking at here? And I find myself doing this more and more often when I just see a chart. I'm like, hold on, let's just not make any assumptions. What is this chart actually trying to say? What is it actually trying to portray? Because you can lie with statistics if you know what you're doing. They're so powerful and people can gloss over them so quickly. We've got attention spans that are so much shorter these days that it can be very, very easy to take away the wrong message.
So you also produce content across various platforms, including YouTube and your personal blog. As a content creator, how did you get started in that field and what type of content are you creating?
Yeah, that's a crazy story, right? So I always wanted to get into tech and said, hey, I'd really, really like to work for IBM. I saw what they were doing with Watson, and I'm like, why aren't people talking about this more? And I had no affiliation with IBM at the time, and I'm like, well, this is so cool.
There used to be this service available on the cloud platform called Personality Insights, and you could plug in a little bit of text and from that piece of text, it would analyze that particular person's personality based on the Big Five personality traits. And there actually used to be this demo app where you could hook it up to a Twitter account, so I could pass through Oprah's Twitter account or LeBron's Twitter account and it would actually analyze their profiles. And this is so cool. It was nuts, and I was like, a lot of people don't know how to use this. So that was quite possibly one of the first tutorials that I made on YouTube, and I actually used a bunch of videos that I made following that to finally land a job at IBM. I actually spammed a bunch of links in my resume and my cover letter. I was like, hey, I'm already working with this stuff and I could do it. And the person that hired me, she actually said that that was such an amazing way to portray what you love about what you do, and that it had such an influencing factor in actually getting the job. But yeah, I did it because, one, the tech was so cool and I thought it was so interesting and so powerful, and yeah, eventually that helped me land that job.
So you do a lot of tutorials where you're breaking down complex topics for a wider audience. Why is that important for you to do?
Yeah, I think one of the amazing things about knowledge is it's one of the things that you can give away and never lose, right? And I think one of the trickiest things about the whole data science and machine learning field is that it can be pretty tricky to get started, and sometimes we get hung up with learning from the bottom up, right? And there's nothing wrong with learning fundamentals and learning foundations and really getting stuck in. But in order to stick with something, you have to find it interesting.
So if you can see the end result and then work your way back up and work out how that's worked, then it is so much more attractive because you get that instant gratification and go, hey, I've just built this machine learning app that is able to decode sign language. It's so cool. Now I'm going to go and work out the tech behind it. Admittedly, not everyone goes and works out the tech behind it, but what I'm trying to do is make it so that more people can get involved and get started with it. Lately, I've been doing these things called Code That challenges, and they're kind of crazy, right, but I love doing them. So I have to build entire machine learning or data science applications without looking at any reference code, Stack Overflow, or any documentation, within fifteen minutes. So it is literally just like a trial by fire. I'll have my phone, I'll set a timer, and I'm like, all right, guys, we're on. The edit is literally just coding nonstop and me explaining on the go. But it allows people to see and explain my thought process as I'm developing it. That's obviously super fun, right, because it's highly engaging and it shows people that, hey, you can get started in this relatively quickly.
Nicholas is the kind of person whose passion for data science is so great it spills over from his professional life onto his YouTube channel. But when he's not making videos, he's using that same expertise to help his clients make their businesses work better. At IBM, Nicholas works with businesses to formulate a data strategy, preparing them to get the most out of technology like machine learning or deep learning. He explained to Ronald why thinking critically about the data it generates can help a company run more efficiently.
So there's a quote that you've used in your presentations that says firms are trying to become insights driven, but only one third report succeeding.
What is the role of creativity in the successful one third, and how are you at IBM helping to increase that number?
I remember going to a talk by our previous CEO, and she said that there's a large number of organizations that are just experimenting with random acts of digital, so they're just testing out some of these new technologies and seeing what's possible. But the ones that are truly being successful are the ones that are getting their data ready, that data strategy in play. They're the ones that are starting to collect their data. They're starting to get it ready and organized. They're starting to take a look at it and starting to iterate and prototype, and in a structured manner, they're starting to roll this stuff out. The journey to get something as sophisticated as machine learning into production is a lot more difficult than I think people realize, because you're now building a box that has its own rules. You haven't defined those rules yourself. So how do you explain that when something goes right? But how do you explain when something goes wrong? And having governance around that is absolutely critical, which is really where the data strategy does come into play.
So let's get into more business-focused data strategies. Why is it so important to have a data strategy in place to fuel AI modeling, and how does data literacy play a role in getting value from these models?
We've got algorithms left, right, and center these days, but I think the thing that people forget is that you can't use any of these algorithms unless you've got data. So ensuring that you have a structure in place to, one, collect your data; two, organize it; three, analyze it; and then four, infuse machine learning or deep learning into it is absolutely critical, because if you don't collect it, you can't do anything with it. If you don't organize it, you can't discover what you've actually got, what the quality looks like.
If you don't analyze it, you don't know whether or not you can trust it. And then infuse is always like the icing on the cake, right? The machine learning, the deep learning, all the cool buzzwords that people throw around. That is like the last step, and it is always the coolest step. But you can't ever get to that last cool step unless you've gone through the hard work that's come before.
Let's expand a little bit on the pain points for companies when they're developing or implementing a data strategy. What do those pain points look like?
Honestly, the biggest pain point that I see in organizations, actually the top two that I see them coming back to over and over again, is collecting and organizing their data. So let's say, for example, you've got a manufacturing type organization, and what they want to do is improve the production quality on a particular manufacturing line. So ideally, if they see that they've got defective products on the manufacturing line, they want to get rid of those sooner rather than later, because they don't want to be shipping them out to the customer and going through the whole warranty and claims process that just costs a ton of money. So they're like, well, it would be great to use some computer vision or some deep learning to detect when we've got defects on the product line, and then we can grab those and rip them out. Somebody along the line is like, great, let's go and do it. The first stumbling block that you're going to trip up at is, hold on, do you have any images of defective products from, for example, cameras that are looking at that production line? So if you haven't gone and collected images of that or video of that, there is no way in hell that you can actually go and build that system to improve your organizational productivity. So knowing well in advance what data you're likely to need is absolutely critical. It is the first step in the data science life cycle.
So collecting, understanding, and exploring your data is the absolute first step. The second one is a little bit more interesting. So let's say, for example, you want to get in on the craze that is data science or machine learning, and you bring on a data science team. The next biggest stumbling block that I find a lot of organizations trip up on is discovering their data. They've got a ton of data, but nobody knows what they've got. So being able to find, search, discover, rate, review, and rank that information is paramount, because you'll have people come in and go, okay, a line manager has approached me and said that we want to take a look at our top performing customers and we want to build a retention strategy so we're not losing customers anymore. Well, your data scientist is then going to go, well, do we have data of customers that have left previously? If you can't easily search and find out what you've got, that makes it pretty hard to go and build those models. So collecting, organizing, and discovering are really absolutely critical, but they can be a little bit tricky to handle in a large number of organizations.
What kind of supporting technology and new solutions do we need to meet growing data management issues?
It really comes down to a few things. So ensuring that you can, one, collect the types of data that you're looking at. I think when people think of data, they're always thinking, hey, it's just going to be a bunch of spreadsheets. It might just be stuff that we can throw into a database. But there is so much more out there, right? There's video. How do we store that? How do we hold that? There are images, there's natural text, like we're just talking about. Ensuring that you've got appropriate processes in place to be able to store, hold, and catalog that, I think, is absolutely critical. We talked a little bit about data cataloging and the need to be able to search and discover that data. That is absolutely paramount.
Once you've got it collected, how do you find it?
What is IBM's unique approach to facilitating access to data within companies?
So one of the biggest things, and one of my favorite things that I get to work with, is a particular tool set, right, and this tool set is called Cloud Pak for Data. So, without getting too pitchy, the absolutely amazing thing about this is that those stages that I was talking about, right, so collect, organize, analyze, and infuse, it actually helps facilitate each one of those stages. So you can actually collect, store, and hold your data in a secure and governed place. You've got data cataloging capabilities, which allow you to search. One of my favorite things is that you might have a data set, right? So I might be a data scientist, and then we might have another data scientist on the team. I can have a data set inside of there, and I can actually rank it and add comments and go, hey, just be wary of this column, it's got certain features that you need to be mindful of. And that provides additional metadata to understand what my data actually looks like and things that I should be mindful of.
So I'm Joe Employee. How can data be helpful to me?
Great question. So, I mean, data is impacting everyone, right, whether you like it or not. And more often than not, what you're going to find is that you can improve whatever it is that you do by looking at that data. Let's take an organization out of it. If you use sleep trackers, you can begin to see when you're getting good quality sleep versus when you're getting bad quality sleep. You can start to collect additional data points like, hey, am I drinking enough water during the day? Am I doing certain things like looking at my phone just before I go to bed? Are these things influencing my sleep? And is that causing a negative impact on my quality of life? So that's taking a broader view of it.
But when you step into a team or a business view, data can make your life a billion times easier. If you know that there's a particular issue in a system earlier on in a data pipeline, before something crosses your desk, you might go and say, hey, look, if we just changed how we collected these pieces of information, if we just transformed what we actually did with it, this is going to streamline my entire workflow and help me out. But not only that, right? So I work a little bit with the automation team, and they're really big on robotic process automation. Let's say you're doing something each and every single day. You're copying a file from here to there. You're grabbing some information from a website. You're throwing it into a form, and you have to do that twenty times a day. There are tools that can automate that entire process for you, and they're smart. They're not just looking at where you're clicking on the page. They're looking at what applications you're opening. They're looking at what fields you're pulling data out of. You can automate those entire workflows. That means that you don't have to do that repetitive, kind of boring work that you don't really want to. You can palm that off to the bot and do the stuff that you actually really want to get involved in.
As Nicholas said, the way a company leverages its data has an impact on every level of the business. Data informs how we do our jobs day to day and how we plan for the future. Having an open mindset about data makes it easier for a business to come up with creative solutions. In the next part of their conversation, Ronald asked Nicholas how data science and creativity come together.
So let's talk a little bit more about creativity. We talked a little bit about your YouTube channel and how you use that to help people get started with data science. What does creativity mean to you? And do you see your work as creative?
I definitely see my work as creative, and I think creativity is truly thinking outside of the box and looking at just different ways of doing things. I think the biggest thing that I try to embody is having an open mindset and really never being willing to shut something down or not look at a particular solution or option, because you really never know where a particular solution might come from. If you look at where some of the advancements in the medical field are coming from, it's because they're being open to new ideas, new materials, new ingredients, new recipes, new technologies. Having an open mindset really helps improve that ability to solve complex problems. And I think for me, creativity is really just having that open mindset.
Tell me a little bit about how you approach novel problems. What do you do when you get stuck?
I think the most important thing is that I really like when I push myself to do something that I've personally never done before, and a lot of the time that yields new solutions to problems that might be really difficult to solve. It doesn't necessarily need to be using this particular set of techniques. It's, what else can we do to solve this problem? And sometimes it'll be staring you in the face and you'll just have no idea until you go, hey, I'm going to throw everything out of the box and just give it a crack and see what is possible. But sometimes it does require that little bit of grit to push yourself to see just what is possible. And I think that's when I've come up with some of my favorite things that I've ever done, so it's something that I'm trying to adopt in my daily life. And I'm reading a lot more about stoicism and philosophy, and I'm seeing that you kind of really just got to push through sometimes to see what's on the other side.
We talked a little bit earlier about how folks can take bits of data and kind of tell their own story with it, especially if they know the story that they're trying to tell. But let's talk about using that for good. How does creativity play a role in data storytelling?
I think there's just so much good that you can do with data that if you have that in your core ethos, then the world's your oyster, right? I always come back to my favorite project that I've ever done, and that was using computer vision to try to decode sign language. It is by no means a state-of-the-art model, but I figured, hold on, why has nobody ever approached this, or at least shared how they've tried to do it? And I've kind of just had to get real creative in trying to build that. I literally spent weeks just trying to install stuff, then trying to get it running on my computer, before I even got anywhere near building that particular model. And it's super hard going in terms of trying to get it set up. But there are so many opportunities for good, whether that's improving accessibility to certain technologies or improving the quality of life for people that could benefit from us using data a little bit better. There's a large body of work with a bunch of different data scientists where they're actually building language translation models for languages which aren't hyper popular or aren't as widely spread as we might see in our day-to-day lives. If you look at India, there are a ton of dialects. If you look at even where my parents are from, Mauritius, there's a whole, completely separate dialect where, if you've never heard it before, you're like, it's just slang French. But no, it's its own whole separate language. That obviously improves the ability for people to tap into data and do a little bit of good.
But there's so much. I mean, people are using medical image data to improve medical segmentation and improve diagnoses. There's just so much amazing work that's happening in that space. There is obviously the temptation to use data for bad, but I'd like to think that the large majority of the community are really trying to use it for good.
You started talking about it a little bit just now, but what are some future trends and challenges and future topics or projects you're excited about, anything in particular?
Looking real further forward, what I'm super excited about, and I still don't know how it's necessarily going to impact me, whether or not that's going to change my experience as a developer, is that we've got quantum computers coming, right? There's a ton of work that's happening in that space. It's going to radically shift how large a machine learning model we're able to create, how fast we're able to train them. I'm just excited to see what happens in that space. I'm not a quantum physicist by any means, but I'm still excited to see what I'll be able to do with them in the future.
I love that, as y'all continue to build this technology, you're excited to play with it after it's built, which I'm totally on board with. I don't want to have to build it. Nicholas Renotte, thank you so much for talking with me today.
It's been an absolute pleasure. Thank you so much for your insightful questions. It's been awesome, Ronald.
Nick made a point that I think is important to remember when it comes to technology's ability to improve our businesses, or make our jobs easier, or even do social good: a thoughtful data strategy is always the first stepping stone. Without good data, using machine learning or artificial intelligence to create innovative solutions becomes much, much harder. Our technology gets more sophisticated every day, but that doesn't mean we should lose sight of the fundamentals.
If we want to get the most out of smarter technologies (better business decisions, more optimized technology, fresh and unexpected insights), we're going to need a smarter data strategy.
On the next episode of Smart Talks with IBM: the power of Salesforce to transform the customer experience. We talk with Phil Weinmeister, head of product for Salesforce Americas at IBM Consulting, about transforming digital experiences with the power of Salesforce and IBM.
Smart Talks with IBM is produced by Matt Romano, David Jha, Royston Beserve, and Edith Russolo with Jacob Goldstein. We're edited by Sophie Crane. Our engineers are Jason Gambrell, Sarah Bruguiere, and Ben Tolliday. Theme song by Gramoscope. Special thanks to Carly Migliori, Andy Kelly, Kathy Callaghan, and the 8 Bar and IBM teams, as well as the Pushkin marketing team. Smart Talks with IBM is a production of Pushkin Industries and iHeartMedia. To find more Pushkin podcasts, listen on the iHeartRadio app, Apple Podcasts, or wherever you listen to podcasts. I'm Malcolm Gladwell. This is a paid advertisement from IBM.