Rajiv Maheswaran is the co-founder and president of Second Spectrum. Rajiv and his company figured out how to turn raw sports data into useful information for coaches. Today, the company works with basketball and soccer teams in the NBA, the Premier League and Major League Soccer.
Rajiv's problem: How do you teach a computer to understand sports?
If you’d like to keep up with the most recent news from this and other Pushkin podcasts, be sure to subscribe to our email list.
Pushkin. About a decade ago, special cameras in the rafters of NBA arenas started following all the players around the court and tracking the ball everywhere it went. It was this huge new trove of data, but nobody really knew what to do with it. This is a very familiar problem in the modern world: tons of data that just kind of sits there. Then a few computer scientists came along and had an idea for how to solve the problem. They thought they knew how to take all that data and turn it into profoundly useful information, and so they started a company called Second Spectrum to see if their idea would work. I'm Jacob Goldstein, and this is What's Your Problem, the show where entrepreneurs and engineers talk about how they're going to change the world once they solve a few problems. My guest today is Rajiv Maheswaran. He's the co-founder and president of Second Spectrum. The company started out turning video of NBA games into information that coaches could use. Today, they work with every NBA team, with Major League Soccer teams in the US, and with Premier League teams in the UK. Rajiv's problem is this: how do you teach a computer to understand sports? Rajiv and his co-founders started Second Spectrum back in twenty thirteen. Last year, the company was acquired by a sports analytics company called Genius Sports for two hundred million dollars. My conversation with Rajiv focused on his work with the NBA. That's where Second Spectrum has been working the longest. But he started out talking about a bigger idea: how do you turn data, really any kind of data, into something useful? So I think this happens everywhere. There are a lot of companies that collect data, faster data, more data. But what generally happens is you get big piles of data. I feel like they're like grain. If people don't know what to do with them, they just stick them in the closet, and it's like, yep, I've got piles of grain in there, and it keeps piling up. And I think what we said was, you know, we brought a bunch of machine learning and a bunch of other stuff to it, and said: this data, this massive amount of coordinate information, no one can work with it. Coaches can't work with it, the leagues have a tough time working with it, the media has a tough time working with it. We are going to grind that grain into story elements that people can actually use. So we turn it into, oh, that's a pass, or that's a shot, or that's a pick and roll, or that's a between-the-lines pass, or that's a blitz, and so we start turning it into words with which people can tell stories. Because the grain is not useful until you turn it into, like, bread and donuts. Is there some particular example you can give me from the early days of the company of basically a problem you solved, of a thing you set out to do? Maybe it was hard to do, maybe it didn't work the way you thought it would, but in the end you made some useful thing for coaches, teams, whatever. There are two things that we did that were a big game changer, right. So one was the idea that people were taking shots and you couldn't tell if they were a good shot or a bad shot. You would use some rules. It's like, oh, if you took it here, you know, if people weren't guarding you too closely, it was good. But you know, if you dribbled a lot and you jumped in somebody's face, it was bad. I think we were able to use math and say, oh, that shot should go in forty two percent of the time.
That shot goes in forty nine percent of the time for an average player, and at a different rate for this particular player. So we were able to quantify the quality of a shot, which was basically the core thing in basketball: you want to get high-quality shots and prevent high-quality shots, and there was no way of measuring it. And we basically said, no, no, we know exactly what the quality of the shot is, and then we also know which players add value beyond that shot. So players like Steph Curry and Kevin Durant can take a shot that might be a forty five for an average player, and for them it goes in fifty five. And that was a big deal. That was the first big thing. Just to step through that a little bit more slowly: before you came along, what did people know? What did coaches know about the sort of percentage probability that a shot will go in? They would just track it much more coarsely. So they would say, oh, from the corner three you would shoot this, or from above the break you would shoot this, or in the lane you would shoot this. But they didn't know things like, hey, there's a really tall defender running straight at you, or you're moving sideways. You know, and these things also matter. Yeah, and the raw data allowed us to know all that, and then what we did was start to build models that basically predicted the likelihood of various shots going in. So one of the things you did was come up with a better metric for shot quality that has more inputs. The other thing you did, is this right, was around the pick and roll? So I think there are all these events in basketball, and you can get pretty esoteric. There's a pick and roll, but there are many types of pick and rolls. There are many types of ways to defend pick and rolls. Just to be clear, let me just say, we won't go into detail, but a pick and roll is just a play. It's an offensive play. Yeah, it's somewhat complicated. It involves two people on offense and two people on defense. Let's just say that's all you need to know. Okay. So what did you do with the pick and roll? So I think the first thing we did was we had a machine be able to identify it by staring at the data. So before us, it was basically humans who would watch the video and just start counting how many happened and counting what the type of defense was, and basically they would just collate all that information. But I think the big thing we discovered when machines did it, the big example, was there was a year where the human collection said Chris Paul led the league running eight hundred pick and rolls, and we said, well, Chris Paul led the league running four thousand pick and rolls, and one of these is not right. But then we could just say, well, here's our four thousand, and then you're saying, oh, okay, you're missing sort of, you know, eighty percent of them. And I think once we did that, we could automate a lot more complicated things, like having the machine understand all the defenses that you could play against various types of pick and rolls. And then once we solved those, we sort of built this engine that could generate words accurately, and then we started growing with teams, and each one of them would say, hey, can you do these words, can you do these words, can you do these words?
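To make the shot-quality idea from this exchange concrete: the kind of model Rajiv describes takes tracking-derived context for each shot and turns it into a make probability, against which individual shooters can then be compared. The sketch below is only an illustration with invented feature names and toy data, not Second Spectrum's actual model, which the interview does not spell out.

```python
# Minimal sketch of an "expected make probability" model for shots.
# Features, labels, and data are hypothetical; the real model is not public.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: [shot distance (ft), nearest defender distance (ft),
#            shooter speed (ft/s), catch-and-shoot flag]
X = np.array([
    [23.5, 6.0, 2.0, 1],
    [24.1, 2.5, 8.0, 0],
    [4.0,  1.5, 5.0, 0],
    [22.8, 4.0, 1.0, 1],
])
y = np.array([1, 0, 1, 0])  # 1 = made, 0 = missed (toy labels)

model = LogisticRegression().fit(X, y)

# Probability that a new shot goes in for an "average" shooter,
# given its tracking-derived context.
new_shot = np.array([[25.0, 3.0, 6.5, 0]])
expected_make = model.predict_proba(new_shot)[0, 1]
print(f"expected make probability: {expected_make:.2f}")

# One way to summarize a shooter's added value, in the spirit of the
# interview: actual makes minus the sum of expected make probabilities
# over that player's shots.
```

The conversation suggests the hard part is not the final regression step but the inputs: knowing, for every shot, where the defenders were, how tall they were, and how the shooter was moving.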
Like, say some of the words they were asking for. So they're sort of, like, you know, flares and loops and zippers and jams and trails and whips and, you know, veer-backs. And these things are basically what, plays? They're their defenses, their moves. That's right. So there's, like, the particular kind of screen that they used to get Steph Curry free off on the side, versus the name of a defense that the person running after him might choose to play while he's doing that thing, right, to avoid that screen. Yeah. And when they say learn a word, they really mean, can you teach your machine, or get your machine to learn, to recognize this play, this thing? Because they want to know, every time they do this, two questions: every time they do this and we do this, how efficient is it? And, every time they do this and we do this, show me a video of every time that happened over the last several years. Those are the questions they tend to ask. Okay. And so it has a bunch of quirks, you know, working with it, and it was not easy, but we've done it, and now we've created sort of five hundred of these words that, you know, coaches and players use on a daily basis. And so I can think of two parts of this that would have been sort of hard problems to solve, one being the kind of mathy computer part, like building the machine, and the other part being getting people to believe you and use your insights. So tell me a hard thing from each, right? So what's a hard thing from the mathy computer part? I think the math side is just that you don't have a lot of data. So, for example, a lot of people who do machine learning will say, like, oh, give me millions of examples of a thing and I will learn it. Well, there aren't millions of pick and rolls that you're going to get. So how do you solve that, right? Small sample size, classic problem. Small sample size. So, you know, how do you solve it? Well, there's some mathematical chicanery that we invented. The answer is to be very clever, like, we're really good at math. Is that the unsatisfying answer? Yeah. Let me ask you this: were there some things you tried that didn't work? Was it just, like, one equation didn't work and then another did? I mean, yeah, yeah. I think, over the course of our life, we've used everything in the AI textbooks to try and solve the problem, and they've evolved over the years to get better. But, you know, I think that's part of the adventure of sort of trying to figure that stuff out. It wasn't like we used method X and the answer popped out; we've been constantly evolving the ability to do it. But I think part of it, I mean, whatever the secret is, you have to leverage the structure of the problem. You have to leverage the fact that you know something about the fact that it's a bunch of people playing a particular sport. When you don't have enough data, you have to leverage structure in some way. Yeah, so you're able to, because it's such a constrained environment, because you know the rules, because you know the things you wanted to find, you can sort of say: this is a pick and roll, this is a pick and roll, this is a pick and roll, this is not a pick and roll, this is not a pick and roll. That kind of thing. Yeah. Good. So, okay, so that's the math, sort of technical side.
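The "leverage the structure of the problem" point can be illustrated with a toy detector: because the rules and geometry of basketball are known, a first pass at finding candidate ball screens can be written directly against the (x, y) coordinates, before any learned model is involved. The thresholds, data layout, and function names below are invented for illustration only; Second Spectrum's actual detectors are machine-learned and not described in the interview.

```python
# Toy ball-screen (pick) detector over raw tracking coordinates.
# All thresholds and the data layout are made up for this sketch.
from dataclasses import dataclass
from math import dist

@dataclass
class Frame:
    ball_handler: tuple[float, float]
    on_ball_defender: tuple[float, float]
    teammates: list[tuple[float, float]]  # other offensive players

SCREEN_RADIUS_FT = 4.0   # screener must be this close to the on-ball defender
HANDLER_RADIUS_FT = 8.0  # and reasonably close to the ball handler

def candidate_screens(frames: list[Frame]) -> list[int]:
    """Return indices of frames where some teammate looks like a screener."""
    hits = []
    for i, f in enumerate(frames):
        for mate in f.teammates:
            near_defender = dist(mate, f.on_ball_defender) < SCREEN_RADIUS_FT
            near_handler = dist(mate, f.ball_handler) < HANDLER_RADIUS_FT
            if near_defender and near_handler:
                hits.append(i)
                break
    return hits

# Example: one frame where a teammate stands right next to the defender.
frames = [Frame((30.0, 20.0), (32.0, 20.0), [(33.0, 21.0), (60.0, 40.0)])]
print(candidate_screens(frames))  # -> [0]
```

A geometric filter like this is the kind of structure you can lean on when labeled examples are scarce; the conversation makes clear the real system layers learned models on top, in order to distinguish the many pick and roll variants and the defenses played against them.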
Then there's the getting people to believe you and, you know, buy what you're selling, ultimately, both metaphorically and literally. Tell me about that side. Like, were there any interesting, I don't know, people who didn't want to buy it, little stories from that side? Almost every coach that we talked to couldn't believe that we did what we said we did. So the stories generally involved just sitting there in front of these coaches. You'd go up a couple of levels. So I think that, you know, this happened with two coaches, it's a very similar story. They were very well-known coaches, you know, long resumes, had been around the NBA for a long time, won lots of championships. You know, it took several meetings to even get an audience with them. You would go through sort of layer one, then layer two, then layer three, and then you would get to them. It's like meeting the Pope or something. Exactly. And then you would sort of get to them, and there they were, and they would just basically say, like, you have no idea what I intend for the players to do. There's no way this machine can understand what we need. And we would say, try us. And they would say, okay, show me all the times that the play started with a pick and roll and there were three passes and I took a shot in the corner. It's like, okay, here it is. Show me all the times there was a screen. I mean, like, you have a laptop and you have your software running or something? Yep, that's right. We would have tables and charts, and we would project onto a screen, and they would basically just give us the quiz. They would put us through the wringer for, like, two hours. So it's basically an impromptu grilling from the coach: make your machine dance for me and show me the correct answers, and any time you come up with an answer that I don't agree with, you know, I would think you're an idiot. But we held up. In fact, sometimes these coaches would say, wait, show me this list, and if these two guys aren't number one and number two, I think your thing is wrong. And then we would do it, and, like, thank goodness, the right players would end up number one and number two. And then normally, after sort of two hours of grilling, they would be pretty good, and they would just leave the room, and then we knew we would have a contract. And is the output, the thing that you actually see on the screen when you're running your software, is it lists? Is it little videos where all the players are dots? Is it both of those things? It's everything. So we have sort of ranking tables that can answer who's the best at X, Y, and Z. We have sort of a variety of visualizations that show breakdowns of various actions. You can always click on anything and show all the video of every moment of anything that you asked it. So we've built lots of visualizations and data formats to make it easy for various people in the organizations to use them. There are instances in the world, in different domains, where, like, a data-driven approach suggests doing one thing, but that thing is contrary to conventional wisdom, right? Like in football, it seems pretty clear that the data suggests coaches should go for it on fourth down more often than they do, and it seems like, as a result, coaches have started going for it more on fourth down. But there is this thing where, if a coach goes for it on fourth down and doesn't make it, the team doesn't get the first down, then the coach gets pilloried. Right.
So even though the data says it makes sense to go for it, in a kind of personal-incentives way it might not make sense; in a social way, which is real and important, it might not make sense. Is there a basketball version of that? Is there a version of that you have observed, where your data suggests that coaches should be doing something, but they're reluctant to do it because it's contrary to sort of conventional wisdom? No, I think there's a lot more nuance in what we do. I think the public is basically aware, and they say, oh, people should take more threes. But I think the reality is much more complex than that, because it's not about just simply taking a three. What you want to do is get the best possible shot for yourself. Sometimes that's a three, sometimes that's something else. Different kinds of threes are not the same, and your talent is not the same. In fact, if you look at, you know, LeBron James and Kevin Durant and Steph Curry and Giannis and Jokic, all these players, they are very different players, and teams are built around these players, and what you want to do is optimize the team built around these very unique players. And so a lot of it is in the details of how you structure your offense and your defense to sort of use them to get your team the best possible shot. So a lot of it is that these coaches do a lot of work, basically on the micro level, to sort of create these plays. It's not just sort of walk up the court and take a three. They do a lot of work to try and put their players in positions to have a lot of good options to get the best possible shot. So I think that it's really a lot more nuanced in the actual execution of it. Fans would be really surprised at the level of sophistication entirely across the league, and in many leagues in sports, especially the top ones. There's a degree of sophistication in how they use data and video that I think would surprise almost everyone. You think that the sort of play-calling side of coaching will become more and more delegated to the machine? I mean, I could see coaches, as essentially psychologists, being persistently useful. But do you see a time when the sort of core strategic decision making will just be done better by a machine than by a coach? I don't. I don't think so. Our thesis has always been we build Iron Man suits, and different people will want, you know, different Iron Man suits, because you need somebody in the middle of the Iron Man suit directing it, using everything they know about the world, but you want them to be a lot more powerful. And I think that's the model we've always used: we build Iron Man suits for everybody, and then different coaches and different assistant coaches are going to use them, you know, to be a lot more powerful. That's what we want to do. We want to build Iron Man suits for everybody. Does basketball look different? Is it played differently because of your work? I certainly know that there have been examples where, you know, in a couple of NBA Finals, strategies were changed because of the data that we provided. You know, close ones, where one team basically discovered a pick and roll defense strategy that was effective and employed that, to, you know, very good effect.
Another one was one of these sort of off-ball screen defense strategies, where they found the data showed that there were particular strategies that would shave off a couple of points of efficiency from the other team, and they employed that very heavily, and they won a very close series. So I think... Can you say what teams it was? I shouldn't. Okay. I guess people looking at it could probably figure it out. The point that I want people to realize is, there are a lot of people who come out and think, like, oh, these teams are sort of the geniuses and these teams are the Luddites, and that's not the case. Almost every team is sophisticated. Like, people would be surprised that, at the minimum... That's fine. I mean, every player in the NBA is good at playing basketball, but some are better than others. And so, similarly, you might imagine that every sort of quant team in the NBA is good at doing quantitative analysis, but some are probably better than others. So I would say you're right, there are some who are better than others, but the minimum level is a lot higher than everyone would think. Sure, yeah. And can you say who's better than others? I cannot. This, the reason that all these coaches trust us, this is our lifeblood: we don't leak. So I'm sorry. It would be fun. One day, one day I'll tell all the stories, but not today. Okay, I'm sorry. Rajiv is not going to name names today, alas, but he is going to talk about a big, interesting problem that Second Spectrum is close to solving but has not quite nailed yet. That's in a minute. Now back to the show. I want to talk about things you haven't figured out yet. Like, what are the next problems you're trying to solve? Like, what is the frontier? So I think the frontier is we're trying to basically get full-body pose of the human, not just a little dot. We're trying to get their entire skeleton. So the basic sort of product you have, the basic thing you do, turns each player into a dot, just like in the classic X's and O's diagram or whatever. And so you can watch little dots moving around, and you can see how far the dots are from each other. But there's a lot going on that you don't see in the dot, right? You might want to know, like, what are the things that are important that you don't see in the dot? That's exactly right. So the challenge we're working on right now is turning the dot into a human skeleton, and then having that skeleton generate data in, you know, one hundred to two hundred milliseconds. So that's the challenge that we're working on right now, and we've actually made a lot of progress on that. And so what's an example of why that would be useful? What are some of the ways you could learn from seeing each player as a whole body rather than as a dot? So I think there are a lot of things you can do. One is, obviously, you know, with health and fitness, where you can figure out, did they land on one foot, did they land on two feet? You know, did they take off from this foot or that foot? So there's a lot of health and fitness stuff that you can figure out. Health and fitness meaning learning to reduce the risk of injury, learning that certain kinds of moves, or subtle distinctions in moves, make a player more or less likely to be injured? That's right. And so that's one class of things we can do. Another class is sort of just understanding moves better.
So if a player made, you know, dribble moves between their legs and crossovers and step-backs, and did all kinds of things with their hands and feet to get open, now we can categorize those things. So there's that, there's health, there's strategy, there's media, there's officiating. You can help with officiating if you start knowing where people are. So there's a lot of things that you can do once you understand more of the human skeleton. And let me ask you this: why is that hard? It sounds hard, but, like, tell me about it being hard. So basically, you have a bunch of cameras in a stadium, and they see these sort of blotches that you tell them are a human, and you try to say, like, oh, that's a knee, and that's an ankle, and that's a waist. So you're teaching a machine to see a human from scratch. And, you know, there's been a lot of progress in computer vision over the years that is making this problem easier and easier. A lot of it is also just doing it extremely quickly. You want this machine to be able to do it on the order of one hundred to two hundred milliseconds, to be able to enable a lot of things to happen, and happen in real time in sports. One hundred or two hundred milliseconds, meaning you might want it to be only delayed from reality by a tenth or two of a second. Let me ask a dumb question: why can't you just let it take its time to figure it out? You're not trying to do it in the middle of the game, right? Or are you trying to do it in the middle? So, this is part of the reason why machine understanding of sports is so hard. In sports, it's not like you want to analyze a video and then know what happened a long time later. That might be the case for coaches, but for a lot of media applications or refereeing applications, you want to be able to do it basically instantaneously. Yeah, right. And so it's almost like the self-driving car: you need it to be able to do its thing very, very quickly to be able to react to it. You know, is self-driving car research helpful to you? Are you borrowing from that? Well, yes, because I think that's generally pushing the field of computer vision forward, because it's basically the same problem. You need to identify people and what they're doing very, very quickly. One is to sort of avoid running into them; the other is to figure out what's going on at a sporting event. And so I think the general trend of computer vision research moving to solve these problems quickly is helpful. Have you figured out anything yet about injury? Like, do you have any useful injury insights? I mean, I've always been, to be honest, wary of injury, because it's such a hard thing to do. Wary of studying it, wary of trying to... To be honest, the teams use the data we give them in that way, even though I don't want it to happen, because any information is better than no information. And so I've always, you know, the whole nine years, I've said, we'll give you the information, but I'd be careful. But, you know, it's such an important thing, keeping athletes healthy. You know, why are you so wary of trying to figure out injury risk? I just, I don't know. At this point,
I think I've seen stuff that is, uh, "super predictive" that, as a scientist, I would sort of look down on. I mean, I know there are a bunch of other companies who have... So it's because it's such a hard problem, because you don't feel like you can reliably predict it yet? Yeah. What is it about injury risk that makes it so much harder to understand than the sort of things that are more kind of within the game itself? I mean, I just think that, you know, it's almost a sample-size thing. You know, people get injured in all sorts of, you know, unique ways, and not super often, right? And so, you know, I said there aren't that many pick and rolls, but there are way more pick and rolls than there are, you know, ACL tears or whatever. And presumably injury is more complex. Yeah, it's a more complex problem, and it happens way less often. That's right. Which makes it harder on both sides. That's right. I want to talk a little bit about the applications or potential applications of your work beyond basketball, the extensions of your work beyond sports. I mean, sports and games more generally seem interesting as a sort of testing ground. Obviously they're useful in and of themselves, but also as a testing ground, I guess, for AI in particular, right? I mean, if you think of chess famously, and then Go, you know, you had DeepMind figure out chess and then Go, and then they solved the protein folding problem, this profound problem in biochemistry, basically, right? Tell me about that. I mean, why are games and sports a good place to start? Yeah, so I think that one of the things that we're doing in sports is the general problem of human activity recognition, right? So that's really what we're doing. People do a bunch of stuff in a space, and we want to figure out what they're doing. And, you know, we didn't actually put sports in the name of the company, because we thought we might go beyond sports. But we did not; there was so much stuff to do in sports, we sort of have stayed here. Sports is interesting because it has the ability to create a lot of data capture, and the activities are sort of bounded and well known. So you can just have this intense capture of this rectangle, various rectangles, based on sports, and then it's very clear what some of the activity recognition is, and so it's a great place to sort of start with human activity recognition. So what is the obvious adjacent thing in the world where what you do might be useful? I think that if you look around the world, this can be applied everywhere. So, for example, you know, in your house, right? If you have cameras in your house, you'll be able to figure out, like, oh, you know, someone has fallen down, or, you know, the kids are fighting. Right, my house... it feels a little creepy to me, like, I don't particularly want it in my house, I will say. That's fine. Yeah, I'm not necessarily against it. But, like, it doesn't feel creepy at all in the NBA, to note that. So a lot of the other things are sort of, you know... I've seen there are other companies that basically watch how people move around stores, so you can say, oh, this is where we should put food, or this is where people are congregating, or this is where we get blockages, or these are the travel patterns inside a store. So I know a bunch of companies doing that. So I know there's sort of security and stores.
There's a bunch of companies out there, you know. We were approached by people who were like, go to concert venues and, okay, can you put up a bunch of cameras and figure out how people move in and out of concerts, and figure out where the bottlenecks are. Exactly, how does the foot traffic flow. So we have been approached by lots of different industries to say, can you apply what you're doing to us? And what have you said when those people have approached you? I always say I would like to get to it, but I think sports has kept us so busy over the years that we've just had plenty of work to do in sports. In a minute, the lightning round, with lots of questions about basketball and a little bit about soccer. That's the end of the ads. Now we're going back to the show, and the lightning round. Lightning round! Who is your favorite NBA player of all time? Larry Bird. What does the data say about Larry Bird as a player? He was very good. That lines up. So it seems like, still, at least to some extent, there is this cultural divide in the NBA and in other sports between the data people and the sort of old-school sports people. And I'm curious, what do the data people not get about the sort of traditional sports people? I think that what happened was, you know, the traditional sports people spoke in words and the data people spoke in numbers, and I think that's what needed to bridge. Like, basically both people had to speak in words and numbers, and I think that's happened. So you think that divide is done now, it no longer exists? Yeah. Is it still fun for you to watch basketball, or does it just feel like work? Oh no, it's fun. It's fun. According to the data, who is the greatest basketball player of all time? We're unable to do that, because the data capture only goes back five years. Who's the greatest player of the last five years? The last five years? Steph Curry. Who's the greatest soccer player of the last five years? Messi. Can you compare Steph Curry and Messi in some quantitative way? Yeah, I think that all the great players have some number where they perform something far above expectation. I mean, is it a dumb question to say who's better, Steph Curry or Messi? It's not, but it's not a question that coaches tend to worry about. Well, sure, but if it's not a dumb question, I'm going to ask you: who's better, Steph Curry or Messi? The reason I can't answer is that we haven't built it. Almost every question requires building a set of tools to answer it. We could, but no one has asked us to build those tools. So you could answer that question, but I'd have to pay you to do it, basically. I think, like, there is a way to do it, because basically the question you're asking is, how much of an outlier was this particular person compared to everyone? Yeah, how much value did he add, would be a clumsy way to say it. Yeah, and I think that there are definitely ways to answer that. That's sort of not where we have spent our time, but I don't think it's unanswerable. I think that there are interesting ways you can go about answering that question. This one goes to sort of data versus kind of public acclaim among NBA players. Like, based on the data versus what people think in general, who's the most underrated player right now? I'd say Chris Paul. I mean, he's not really underrated, but he's not quite at the pantheon. You can be highly rated but still underrated.
Underrated does not mean low rated. It means not rated high enough. That's right. Who do you think is the most overrated player? Oh, that's a good question. It's tough. I don't know. I'm saying no one just jumps to mind, because I think... Well, also, you could get in trouble, right? You're going to call out one of your clients' stars? Well, there are two answers. One is, like, hey, I thought of someone, but I don't want to say it. But right now I can't actually think of one, because people are much better at rating nowadays. I think that's a lot of what has changed over the years: because there are so many more numbers, the error bars on ratings have gotten a lot narrower. Players have gotten more appropriately rated. Because fans are savvier? Exactly. So one thing that has happened is that big centers are not as valuable in basketball, and you actually see that, like, they're not rated as highly, you know. And so a lot of the big centers who would have had, you know, massive contracts ten-plus years ago aren't getting those now. But that's because, you know, the ratings have adapted to value them less. The data shows that big centers aren't as valuable as people thought. That's right. Do you think you're going to work at Second Spectrum forever, for as long as you're working? It's our baby. It's been the baby of many, many people. I mean, babies grow up. I think that there are problems I want to solve, and sure, I would love to be able to solve all those problems, and then once they're solved, I'll ask, what will I do next? But I think there are certainly plenty of problems to be solved. Rajiv Maheswaran is the co-founder and president of Second Spectrum. Today's show was produced by Edith Russolo, engineered by Amanda Kay Wong, and edited by Robert Smith. I'm Jacob Goldstein, and we'll be back next week with another episode of What's Your Problem.