The word algorithm pops up a lot in tech discussions, but what the heck is an algorithm? From learning about an Arabic genius to discovering a way to figure out what day of the week any given date falls on, we look at algorithms.
Welcome to TechStuff, a production from iHeartRadio. Hey there, and welcome to TechStuff. I'm your host, Jonathan Strickland. I'm an executive producer with iHeartRadio. And how the tech are you? I'm not doing so great myself right now. I'm running a low fever. I'm hoping that I just have like a stomach bug or something and that it's not a second case of COVID, because I don't want to be that guy. And that's a good reminder that if you are in an area where the COVID numbers have not really slacked off, it's still a good idea to wear a mask and, you know, limit your exposure to other folks. Seriously, this is no fun. You don't want to go through it if you can avoid it. So yeah, just, you know, I care about you guys, so take care of yourself. Also, you know, that means that I'm not really able to write a new episode, because I can't read good when I got a fever. So we're gonna have a little bit of a rerun. I apologize for that. I wish I could avoid it, I really do. But I also feel like this is an episode that is timeless. It's one that is important. It's called "What Is an Algorithm?", and it originally published last year on July twenty-first. And the reason why I think it's important is that a lot of people know the word algorithm, but they don't have a deeper appreciation of what that actually means. We almost just use it as a placeholder. It might as well be magic, right? The Facebook algorithm determines what you see and what you don't see, but that doesn't really explain what an algorithm is. So I hope that you find this episode interesting. And if you would like to leave a suggestion for an episode topic for me in the future, there are a couple different ways of doing it. One is you can go over to the iHeartRadio podcast app, which is free. You can just navigate over to the TechStuff page, and there is a little microphone icon which allows you to leave a talkback message up to thirty seconds in length, so you can always leave me a message if you like. You can indicate if I could use the audio in an upcoming episode, which would be a lot of fun. But I'm not gonna just assume that you're cool with that. You'll have to tell me. And a lot of people have been doing that. They've been leaving messages. It's been great. You've all been so kind and so enthusiastic, and I love it. Keep them coming. Also, if you don't want to use the app, I totally get it. You can always leave me a message on Twitter. The handle for the show is TechStuff HSW. Now let's get to this episode that originally published July twenty-first, two thousand twenty-one. What is an algorithm? I frequently speak about algorithms on the show. I talk about them all the time, about Facebook's algorithm or YouTube's algorithm. We talk about search algorithms and recommendation algorithms and tons of others. But I haven't really spent a ton of time talking about what algorithms actually are. It's kind of become shorthand for "do not look behind the curtain," right? It doesn't really help you if you don't know more about it. I might go as far as to say it's a set of rules or instructions to carry out a task, but that's pretty reductive, and it doesn't really help you understand what algorithms actually do. So today I thought we'd talk a little bit about algorithms and what they do in general, and then maybe chat about a few famous examples and some fun, little quirky stuff. So before I dive into this, we do need to establish a few things, and arguably the most important of those things is that I am not a mathematician.
I am not a computer scientist. I'm a guy who loves tech, and I'm decent at research, and I do my best to communicate topics in an informative and, on a good day, entertaining way. So with that in mind, I fully admit that this is a subject that is outside of my narrow expertise, and so my descriptions and explanations will be from a very high level. You might even be able to get something like this just by glancing at a pamphlet about a computer science introductory course. So for all you folks out there who are deep in the computer sciences, this is going to be like a children's storybook for you, possibly one where you don't like the illustrations very much. Another thing that I need to establish is that algorithm itself is a very broad term. In that sense, it's kind of like the word program or app or software, in that these words are categories that can have a wide variety of specific instances. For example, let's just take a look at software. If I'm talking about software, I could be talking about a very simple program that serves as a basic calculator that can add and subtract and divide and multiply. That could be a piece of software, albeit a very simple one. Or I might talk about something a little more complicated, like a word processing program. Or I could talk about an advanced three-D shooter video game, or an incredibly complicated computer simulation program, or anything in between. All of those could fall into the realm of software. Well, algorithms similarly come in a wide variety of functions and complexity. There are some that are elegant in their simplicity, and there are some that, when I look at them written out in what is clearly well-defined terminology, I'm just left wondering what the heck it is I just saw. They might as well be hieroglyphics to me, in other words. But let's start off by talking about the word algorithm itself. Now, unlike a lot of words that we use in English, algorithm does not have its roots in Latin vocabulary. There are words in Latin that look similar, like algor, but that's not where algorithm comes from. Instead, it's a Latinized version of an Arabic name. Algorithm comes from Algoritmi, which is the name that Latin scholars gave to a brilliant man named al-Khwarizmi, and my apologies for the butchering of the pronunciation of that name. No disrespect is intended. It's merely American ignorance. Anyway, this particular mathematician, philosopher, and scholar served as a very important astronomer and librarian in Baghdad in the ninth century. His work in mathematics was truly foundational, so much so that we get the word algebra from a work that he published. He treated algebra as a distinct branch of mathematics, when before it was just kind of lumped in with other stuff. And he did a lot of work in showing how to work various types of equations and how to balance equations. And on a personal note, this was something that I used to love to do in my mathematics courses. I could really see the beauty in determining if two sides of an equation truly were equal. It was like an elegant puzzle, and I really loved that. But again, just to be clear, my love of algebra and trigonometry is pretty much where math and I parted ways, because when calculus came along, I was pretty much done. So at that point it was beyond my ken, as they say.
Anyway, back to our Arabic scholar. He highlighted the importance and effectiveness of using a methodical approach toward the solution of problems, and that kind of gets at the heart of what algorithms actually are. The basic definition, which I kind of alluded to earlier, is, quote, "a process or set of rules to be followed in calculations or other problem-solving operations, especially by a computer," end quote. That's from Oxford Languages. But you know this can apply to all sorts of stuff. For example, if you've ever seen anyone who can solve Rubik's Cubes, those puzzles that are in cube form where you've got all the little different colors of squares and your job is to make each side of the cube the same color. Well, there are people who can solve Rubik's Cubes in a fraction of a minute, like well under a minute. It's amazing to watch them go. It's so fast, and it's hard to keep up with what they're doing. They are essentially following algorithms, these processes that will get to a dependable and replicable outcome based on what you're doing. So, an algorithm is really just a way for us to problem-solve. Let's say we have a particular outcome that we want to achieve. The algorithm is the recipe that we follow to go from whatever our starting condition is, our starting state, to the outcome we want, our target state. It needs to be well defined. Algorithms have to be well defined because computers are not really capable of implementing vague instructions. It needs to have a pretty solid approach so that the computer, when following the algorithm, quote unquote "knows" exactly what to do based upon the current state of the problem. It also needs to be finite. Computers are not capable of handling infinite possibilities, so the algorithm has to work within certain parameters, and those parameters depend upon the nature of the problem you're trying to solve and the equipment that you have to solve it with. So let's take a problem that computers typically struggle with, and that is random number generation, or RNG. Now, you would think that creating a random number would be easy. I mean, we can do it just by throwing some dice, for example, right? You can throw a die, and whatever number comes up, that's your random number. But computers have to follow specific instructions and execute operations that by definition need to have predictable and replicable outcomes. So computers find random number generation really tricky. Typically, the best we can hope for when it comes down to this sort of thing is pseudo-random numbers. So these are numbers that, to an extent, appear to be random, right? Like, on casual observation, it looks like the computer is generating random numbers. However, if you were to really seriously dig down into the process, you might discover that they are not really random at all. But we just need them to be random enough, right? We need them to be close enough to seeming random to fulfill the function we need. If it's not truly random, for a lot of applications that ends up being, you know, negligible. On a related note, there are some types of mathematical problems that have many degrees of freedom, lots of variables, for example. So a problem that does have a lot of variables could have a lot of potential solutions, and so it could be a challenge for computers to solve.
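Since pseudo-randomness just came up, here's a minimal sketch in Python of one classic way a computer fakes randomness: a linear congruential generator. The multiplier and increment here are common textbook constants, so treat the specifics as illustrative rather than anything a real system would necessarily use.

```python
# A minimal pseudo-random number generator: a linear congruential generator.
# The multiplier and increment are common textbook constants; the point is
# that the entire "random" sequence is completely determined by the seed.

def lcg(seed, count):
    """Yield `count` pseudo-random values between 0 and 1."""
    modulus = 2 ** 32
    multiplier = 1664525
    increment = 1013904223
    state = seed
    for _ in range(count):
        state = (multiplier * state + increment) % modulus
        yield state / modulus  # scale into the range [0, 1)

print(list(lcg(seed=42, count=3)))
print(list(lcg(seed=42, count=3)))  # same seed, exact same "random" numbers
```

Run it twice with the same seed and you get the identical sequence, which is exactly why it's pseudo-random rather than truly random: the output looks scattered, but it's completely deterministic.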
We need algorithms to tackle these sorts of problems, to go at them in ways that go beyond deterministic computation, which I guess I should define. So, deterministic computation refers to a process in which each successive state of an operation depends completely on the previous state. So what comes next always depends upon what came just before. And I'll give you an example. Let's say I give you a math problem, and I say you need to add these two numbers together, and let's say it's, like, five and six, and then from the sum of those numbers you have to subtract four. Well, you would first add those two numbers together, so five plus six, you get eleven, and then you would subtract four from that number, and so eleven minus four equals seven. And this process is deterministic because I gave you those specific instructions. Computers do really well with deterministic processes. These are pretty easy to follow. These sorts of processes are by their nature replicable. So in other words, if I ask you to do that same operation, if I tell you to add five and six together and then subtract four, it doesn't matter how many times you do that process, right? You're always going to come out with the same answer. No matter what you do, you're always going to get seven as your answer, just based on that same input. So there is a family of algorithms that collectively get the name Monte Carlo algorithms, and they're named after Monte Carlo, the area of Monaco that is famous for being the location of lots of high-end casinos. I've been there. It's not really my jam, but it is pretty darn flashy. So the reason these algorithms get the name Monte Carlo algorithms is that, like gambling, they deal with an element of randomization. So gambling is called gambling because you're taking a risk. You're gambling. You're betting that you're gonna beat the odds in whichever game it is that you're playing, and that you're going to achieve an outcome that's favorable to you, which you experience as a payout. So, whether it's throwing dice or playing a card game, or placing bets at a roulette wheel or playing a slot machine, you are engaged in an activity in which there is some element of chance, at least in theory, if the game is honest, and what you're hoping for is that chance lands in your favor. Similarly, Monte Carlo algorithms often focus on chance and risk assessments, such as what are the odds that if I do this one thing, I'm gonna totally regret doing it later? But, you know, more of a computational problem and not a lifestyle choice. Well, back in nineteen forty-seven, scientists at the Los Alamos Scientific Laboratory, including John von Neumann, Stanislaw Ulam, and Nick Metropolis, proposed what would become known as the Metropolis algorithm, one of the Monte Carlo algorithms, and the purpose of this algorithm was to approximate solutions to very complicated mathematical problems where going through a deterministic approach to find the definitive answer just wasn't feasible. So, in other words, for certain complicated math problems, figuring out the quote unquote real answer would really just take too long, perhaps to the point where you would never get there within your lifetime. So you need a way to kind of get a ballpark solution that hopefully is closer to being correct than incorrect. Doesn't sound like the sort of thing that computers would be particularly good at doing right off the bat, right?
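Just to pin down "deterministic" in code, here's about the smallest possible sketch of that toy problem; the function name is made up purely for illustration.

```python
def add_then_subtract(a, b, c):
    """Add a and b, then subtract c: a completely deterministic recipe."""
    return (a + b) - c

# Run it as many times as you like with the same input;
# the answer never changes.
for _ in range(3):
    print(add_then_subtract(5, 6, 4))  # always 7
```

That total predictability is exactly the thing the Monte Carlo approach has to work around when a problem calls for something random-ish.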
In fact, the algorithm allows computers to mimic a randomized process, so it's not truly random, but it's random-ish, if you like, and it uses this to approximate a solution to whatever the problem is that you're trying to solve that falls into this category. I once saw a simulation of a randomized process using the Monte Carlo method that I thought was pretty nifty, and it was determining the value of pi. Let's say that you don't know the value of pi and you're just trying to figure it out. And by the way, I mean pi as in p-i, not as in the delicious pie dessert treat. So here's the scenario. Imagine that you've got a table, and the table's, like, I don't know, three feet to a side, and positioned over this table is a robotic arm. And this robotic arm can pick up and drop large ball bearings onto the table, and the arm can move over the entire region of the table, so it's got a three-foot by three-foot range of motion. And on this table you set down two containers. You've got one that's a circular container and one that's square. And the square container measures six inches to a side. The radius of the circular container is also six inches, so that means it's got a twelve-inch diameter, right? So those are your containers, and you've got your situation with your robot arm above this three-foot-square table. So the robotic arm's job is to pick up ball bearings and then, at random, drop each ball bearing somewhere above the table. So some of those balls are going to fall into the square container, some of them are going to fall into the round container, and some of them are not going to fall into either. And if the process is truly random, meaning there's an equal chance that it will drop a ball bearing over any given point of that table, over time we would expect to see more ball bearings fall into the round container, because it has a greater area and there's more space for a ball bearing to fall into it. At the end of repeating this process many, many times, we should be able to take these two containers and then count the number of ball bearings in each of them. And if we divide the number of ball bearings in the circular container by the number of ball bearings that fell into the square container, we're gonna get a result that should be really close to pi. And you might think, wait, why would counting the ball bearings in one container versus another, and then dividing those numbers, get you pi? Well, if you have a process that is uniformly random across an area, like our hypothetical three-foot-square table, the probability that a dropped ball bearing will land in one of those two specific containers is proportional to that specific container's area, or cross section. So then we just ask, all right, well, how do we define area? Well, with the square it's really simple, right? We take one side of the square, in this case six inches, and we multiply it by itself, six times six. Because all sides of a square are equal, we know that the length and the width are each going to be six inches. We multiply those together. We get thirty-six square inches. That is the area, or cross section, of our square container. But for our circular container, we take the radius, which again in this case is six inches, and we square it, and then we multiply that number by pi. So the area for the circular container is six squared, that's six times six, times pi.
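Here's a minimal Python sketch of that ball-bearing setup, with the robot arm swapped out for a random number generator; where the two containers sit on the table is made up, since only their areas matter.

```python
import random

# A 36-by-36-inch "table." A circular container of radius 6 is centered at
# (12, 12), and a 6-by-6-inch square container sits between (24, 24) and (30, 30).
DROPS = 2_000_000

in_circle = 0
in_square = 0
for _ in range(DROPS):
    x = random.uniform(0, 36)
    y = random.uniform(0, 36)
    if (x - 12) ** 2 + (y - 12) ** 2 <= 6 ** 2:
        in_circle += 1
    elif 24 <= x <= 30 and 24 <= y <= 30:
        in_square += 1

# Hits are proportional to area: pi * 6^2 for the circle, 6^2 for the square,
# so the ratio of the two counts should land somewhere near pi.
print(in_circle / in_square)
```

It prints something in the neighborhood of 3.14, and a slightly different something every run, which is the approximate-but-good-enough flavor of Monte Carlo answers.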
And if we divide the cross section of the circular container by the cross section of the square container, the two six-squared elements cancel each other out, and so all we're left with is pi. So the end result is that when we count up these ball bearings that have landed randomly in these containers, the ratio that we get should be a close approximation of pi. Now, it's not going to be precise. Chances are the number of ball bearings in the two containers will just get you close to pi. But it illustrates how a randomized process can help you get to a particular solution that might otherwise be difficult to ascertain. There are videos on YouTube that show this particular simulation. It's kind of a famous one. So you can actually watch and get kind of an understanding, and if what I said to you makes no sense whatsoever, there are other options for you to look into that might help. And that is a great example. But what are the actual algorithms that fall into the Monte Carlo methods spectrum? What are they used to do? Well, they can be used to simulate things like fluid dynamics, or cellular structures, or the interaction of disordered materials. So, in other words, these are all processes that have a lot of potential variables at play, and you can think of variables as representing uncertainty, something that computers typically don't do well with. You know, from a general stance, computers excel at specific instances rather than potential instances. So algorithms like the Metropolis algorithm would lead computer scientists to develop methodologies to kind of work around the limitations of computers and leverage computational power in tackling very difficult problems. In the Monte Carlo family of algorithms, we typically assume that the results we get are going to have a potential for being wrong, like there'll be a small percentage of error that we have to take into account, and so, keeping that in mind, we need to frame results as being, again, approximations of an answer, rather than the definitive answer to whatever problem it is we're trying to solve. For me, this was one of those things that was very difficult to grasp for a long time, because I was equating it to math class, where you're supposed to get the answer. Now, when we come back, we'll talk about a few more famous algorithms and what they do. But first, let's take a quick break. Now, in our previous section, I talked about a family of algorithms that trace their history back to nineteen forty-seven, and so does the next one I want to talk about, which was created by George Dantzig, who worked for the RAND Corporation. And Dantzig created a process that would lead to what we call linear programming. So at its most basic level, linear programming is about taking some function and maximizing or minimizing that function with regard to various constraints. But that is super vague, right? I mean, it's not terribly helpful. It feels like I'm talking in a circle. But this is one that's really important for, say, business developers. So let's talk about a slightly more specific use case. Let's say that you are launching a business and you're trying to make some, you know, big decisions as you do this, and you're going to manufacture and sell a product of some sort, some physical thing.
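Skipping ahead just a little: the business decision described over the next couple of paragraphs boils down to a linear program, maximizing profit subject to constraints. Here's a hypothetical, toy-sized sketch using SciPy's linprog (assuming SciPy is installed; the blender models, hours, and dollar figures are all made up for illustration).

```python
from scipy.optimize import linprog

# Hypothetical scenario: two blender models. Model A earns $40 profit per unit,
# model B earns $30. Each A takes 2 hours of fabrication, each B takes 1 hour,
# and only 100 hours are available. Materials run $20 per unit of either model,
# with a $1,600 budget. How many of each should we make to maximize profit?

# linprog minimizes, so we negate the profits in order to maximize them.
profit = [-40, -30]
constraints = [[2, 1],    # fabrication hours used per unit of A and B
               [20, 20]]  # materials cost per unit of A and B
limits = [100, 1600]      # hours available, budget available

result = linprog(profit, A_ub=constraints, b_ub=limits,
                 bounds=[(0, None), (0, None)])
print(result.x)     # how many of A and B to make (20 and 60 here)
print(-result.fun)  # the maximum profit under these constraints (2600)
```

Change any one constraint, say the available fabrication hours, and the "best" plan shifts, which is the whole point of the exercise.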
Now, your whole goal as a business is to make money from the work you do, and to do that, you need to figure out what your costs are going to be and how you can not just offset those costs but make enough money to actually profit from it, so cover all your costs plus some extra. So in this case, what you're trying to do is figure out how you can minimize costs and maximize profits, right? The constraints come into play as you start to look at specifics. Like, maybe you're gonna partner with a fabrication company, and maybe you've actually got a few choices. You've got different companies that you could go with. Well, these choices could represent constraints in your approach. Perhaps one is more expensive than the others, but it can produce a greater volume of product, so you'd be able to get more product out to market in less time. Or maybe you've got one that's more expensive but it's also less likely to have fabrication errors, and errors can be both costly and time-consuming to address. So you have to quantify all these variables and use them as constraints to start figuring out your min-maxing approach here. Linear programming simply looks for the optimum value of a linear expression, and depending on the problem you're trying to solve, like maximizing profit versus minimizing costs, you're either looking for the highest value being optimum or the lowest value being optimum. Like with costs, you want that to be the lowest. And you have to look at what constraints are at play with any given expression. So Dantzig's approach to linear programming, called the simplex method, was efficient and elegant and far more simplified than other methodologies attempting to do kind of the same thing around that time. But it also had some limitations, and that's because the real world isn't as simple as math tends to be. Like, math tends to be pretty, you know, firm, and in the grand scheme of things the world is a little more wibbly-wobbly. So, a linear programming solution only works if the various relationships between the different elements, like the requirements and the restrictions and constraints that you've set up with this problem, are all linear, or, in other words, if for a given change in one variable, we see a given change in other variables. And that is super vague, so we're going to talk about two variables. We'll reduce it down to that. So we'll use the classic x and y as our variables. Those can stand in for all sorts of different values. And let's say that we see there's a relationship between x and y, and that if the value of x changes from one to two, we then see that the value of y goes from three to nine. Well, that doesn't tell us anything on its own, right? We just know that if x is one, y is three; if x is two, y is nine. If we then see that when x is three, y is fifteen, and when x is four, y is twenty-one, we start to see a linear relationship, because each time x increases by one, y increases by six. The rate of increase for each variable is constant; it doesn't change. So we've got this linear relationship here. We can contrast this with exponential relationships, in which we see not a constant increase, but a constant ratio in successive terms of one variable when you change another. So in this case, let's say that we find out that when x is one, y is three; when x is two, y is nine; when x is three, y is twenty-seven. Now we see that y is not increasing by six each time.
We're seeing that the value of y is multiplied by three each time x goes up by one. So we're seeing there's a constant ratio of change in y with regard to x. This gets into an exponential relationship. So when someone describes something as having exponential growth, what this is supposed to mean is that you should see a rate of growth that is increasing by some exponent for a given unit of time. So it's not just that the thing is growing. In fact, it's not even that the thing is growing quickly. It's that the thing is growing faster as time goes on, so that you would say, today it's growing, you know, say, twice as fast as it was growing yesterday, and tomorrow it will grow twice as fast as it was growing today. And a lot of people use this phrase incorrectly. We often really just mean, yo, that thing is growing wicked fast, but we don't necessarily mean, yo, that thing is growing wicked fast, and the rate of growth increases by a constant ratio over time. So, you know, it's an important distinction to make. Anyway, that was a bit of a tangent to say that linear programming works if the relationships between all the variables are in fact linear. But if it's not, if the relationship, say, is exponential or logarithmic or anything like that, linear programming doesn't apply. And the real world is pretty darn complicated. So linear programming relies on sort of kind of dumbing down what's really going on in the real world to, again, more of an approximation, and the solution that you arrive at might not be the ideal solution, but it might be good enough, depending upon what it is you're trying to do. Another limitation of linear programming, and this is specific to the simplex method, is that as you add more variables to a problem, it grows more difficult to lay out in a linear fashion. You're adding more uncertainty, and the methodology has trouble dealing with additional uncertainty. Perhaps ironically, the complexity of the linear function would grow exponentially as more variables joined a problem. So this process would be unsuitable for certain subsets of computational problems, because they just would get so complex that even the most advanced computers in the world wouldn't be able to handle them. Later on, other mathematicians would propose algorithms that could compensate for this particular limitation of the simplex method. For example, there are polynomial-time algorithms. We're gonna leave that for the time being, because there ain't no way I'm going to get that right. Anyway, I wanted to talk about those two algorithms in particular, the Metropolis algorithm and the simplex method, because they are the foundation for a lot of work that's been done in computer science. But let's get some really basic ideas in with the next one. Let's talk about sorting. Sorting is all about arranging a list of things in order with regard to some specific feature of those things. That seems pretty basic, right? I mean, there's a real-world example that I think most of us have run into in the past, which is that, let's say you're looking at an online store and you're searching for a specific product or a specific type of product. Let's say it's a blender, because I had to do this recently. Right? So you go to some online store and you're searching for blenders and you get results. Well, you might actually want to sort those results in a specific way.
Maybe you want to sort it by price, and you want to go low to high because you have a specific budget, so you just want to look at, you know, blenders that are X price or less. Maybe you want to sort it from high price to low, because maybe you're a fat cat tycoon type and you're just thinking, I want whatever is the most expensive. Or maybe you want to sort the results by customer review, because we know price and quality don't always go hand in hand. Maybe you just want to look at all the blenders in alphabetical order, or maybe you want to look at them in reverse alphabetical order. You do you. Sorting is one of the most basic functions and heavily studied concepts within computer science, so much so that a lot of programming language libraries have sorting functions built into them. But it's still one of those things that people look at to see if there are better ways to sort various types of data, particularly as you get into a larger number of sortable features, and it becomes really important. So let's take Google's search algorithm, for example. Now, that name suggests that the algorithm we're talking about is related to the process of searching the web for pages that relate to a given search query, and in fact, Google does have a search algorithm that does this. But most of the time I hear people talking about the search algorithm, they actually mean the algorithm that Google relies upon to determine the order of search results that a person sees when they search for, you know, whatever it might be. This is critically important, because multiple studies have shown that most people never bother to go beyond the first page of search results, which means that Google really needs to do its best to make sure that the most relevant results for any given search query show up toward the top of that list, because most people are never going to bother going any further than that. And if those people decide that Google's search results just aren't any good, they could conceivably use some other search engine. And let me tell you, Google hates that idea. So Google's goal is to present to users the search results that are most likely going to meet whatever their needs are based on their search criteria. And way back in the day, Google relied pretty much solely on an algorithm called PageRank, and from what I understand, Google still relies on PageRank, but also makes use of several other algorithms to augment PageRank. But the PageRank algorithm would take a ton of stuff into account in the process of determining where within a list of search results any given example should appear. So let's say that I'm searching for blenders on Google for some reason, and the search algorithm has gone out and indexed a ton of web pages that deal with blenders. Let's say that one of those results is a blog post that isn't that old, it's fairly recent, but it's from a mostly unknown source, and very few other documents are linking to this blog. Well, chances are that search result would appear really far down on the list, probably pages and pages and pages down in the results. But then let's say that there's another web page that belongs to an established entity, one that has a good reputation, that's been around for a while, and there are a ton of other pages that link to this particular website. Well, that one would probably rank fairly high on the list.
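To make the sorting idea concrete, here's a tiny Python sketch with some made-up blender listings; Python's built-in sort handles price, rating, or name just by swapping out the key.

```python
# Hypothetical store results; the products and numbers are made up.
blenders = [
    {"name": "WhirlMaster 3000", "price": 89.99, "rating": 4.1},
    {"name": "BudgetBlend",      "price": 24.99, "rating": 3.7},
    {"name": "ChefSupreme Pro",  "price": 249.00, "rating": 4.8},
]

cheapest_first = sorted(blenders, key=lambda b: b["price"])
priciest_first = sorted(blenders, key=lambda b: b["price"], reverse=True)
best_reviewed  = sorted(blenders, key=lambda b: b["rating"], reverse=True)
alphabetical   = sorted(blenders, key=lambda b: b["name"])

for b in cheapest_first:
    print(b["name"], b["price"])
```

Swap the rating out for some kind of relevance score and you've got the final step of a search ranking pipeline, sorting results by score, which is where the next bit picks up.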
So the PageRank algorithm essentially assigns a score to each search result, and the value of that score depends upon lots of the stuff I mentioned, as well as tons of other variables, and these variables can either boost a score higher or knock some points off. And then PageRank would take all these different search results and order them based upon those scores. Essentially, I'm really oversimplifying here, but the idea was that the more reliable, and therefore relevant, results would likely be the ones that were the most popular, so all those links that aimed at a page would count for a lot. If you happened to run the definitive page on blenders and everyone was linking to you, that would boost your PageRank score significantly. Now, clearly, just because something is popular doesn't necessarily mean it's the most relevant or valuable thing, but it's generally more true than it isn't true. It might not always be the absolute best result, but it might be reliably relevant, and that was what was important to Google and to Google's users. Again, it just needs to be good enough. So the PageRank algorithm was using a pretty complex sorting method to determine which results should be the most visible. You can, of course, page through search results. You don't have to go with whichever ones are on page one. And in theory, if you keep on scrolling through, going to page after page after page of search results, then over time you should see that the results you're getting are not as relevant as those that appeared earlier on the list, assuming everything is working properly. Again, this doesn't always happen, but generally it holds true. And yeah, I'm drastically oversimplifying. You could teach an entire computer science course on the PageRank algorithm and how it works. We're not going to do that. I would just get it wrong anyway. Speaking of search, however, let's talk about a much simpler form of search that's based on a basic algorithm, and this is called binary search. This is a process that's meant to decrease the amount of time it takes to search for a specific value within a sorted array of values. All right, so let's say that you've got a really big array, or list, of entries, right, and it's massive, and you're searching for a specific target that has a specific value associated with it. And a computer could go through this list item by item, comparing each one with your target and looking to see if there's a match. If there's not a match, it goes to the next item on the list and does that again. But if you're talking about a truly huge list, this could take ages, because if your target happens to be at the bottom of that list, you have to wait for the computer to go through every single other entry before it gets there. So a binary search cuts this short. It takes a value that's in the middle of the array, so like halfway down the list. Essentially, it just pulls that value and it compares it against your target, the thing that you're searching for. And if the array is a hundred thousand entries long, it's essentially looking at entry number fifty thousand and one. It compares this to your search. And let's say the target value of your search query is lower than the entry that appears at fifty thousand and one. Well, now the binary search just tosses everything that's fifty thousand and one or higher on the list, because the values are only going to go up from there, so it can get rid of half the list.
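Here's a minimal sketch of that halving process in Python; it assumes the list is already sorted, which is the whole precondition for binary search.

```python
def binary_search(sorted_values, target):
    """Return the index of target in sorted_values, or None if it's absent."""
    low, high = 0, len(sorted_values) - 1
    while low <= high:
        middle = (low + high) // 2
        if sorted_values[middle] == target:
            return middle
        elif sorted_values[middle] < target:
            low = middle + 1   # target is bigger: discard the lower half
        else:
            high = middle - 1  # target is smaller: discard the upper half
    return None

huge_list = list(range(0, 200_000, 2))    # 100,000 sorted even numbers
print(binary_search(huge_list, 137_138))  # found in about 17 comparisons
```

Seventeen-ish comparisons instead of up to a hundred thousand, which is the payoff of all that halving.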
Right there, you're left with fifty thousand items in this array, and the algorithm repeats this process. It goes for an entry smack dab in the middle of the array and compares it to the target. If the target's value is lower than the middle entry of this array, then you dump everything that's at a higher number than that. Or, if it's the other way around, if the target value is higher than the middle entry, then you would get rid of all the previous ones, like one through twenty-five thousand. For example, let's say that your target value is higher than the middle entry at twenty-five thousand and one. It's gonna dump everything from one to twenty-five thousand and start over again and halve it again. Again, I'm kind of oversimplifying here, but the idea of binary search is that rather than go through every single entry, it keeps cutting this array in half, comparing it with the target, and doing it again, which reduces the number of steps it takes to find the specific answer. You might have thought of, like, the experiment where you're given a penny on day one, and the next day you're given two pennies, or cents if you prefer, two cents. The next day you get four cents. The next day you get eight cents, and it doubles, and pretty quickly you see how that doubling can actually start to amount to what we would consider, you know, quote unquote real money. Same sort of thing: by halving an array over and over again, you very quickly whittle down the stuff you have to go through, and it speeds up the process of searching for a specific value within a huge data list. Binary search is just one form of search for various data constructs. There are lots of others. For example, there's depth-first search and breadth-first search. I won't get too deep into this, because it gets really technical, but I can describe what's happening here from a high level. So these search functions are used for data structures that consist of nodes. And this is easier to understand if we use an analogy, and I want you to think of a Choose Your Own Adventure book. In case you're not familiar with those, these are books where you would read a page, and at the end of the page you would actually be given a choice, like you could choose what the character you're reading about would do next. So you might get to the end of the first page of, like, a mystery book, and the instruction might say, if you want your protagonist to go down the hall, turn to page eight. If you want her to shut a door and walk away, turn to page twenty-three. And you would make your choice, turn to that page, and continue the story that way. And so the story branches, and you've got these branching pathways. Well, depth-first and breadth-first searches explore nodes in different ways. A depth-first search follows each branched path all the way down from the start to the end of that one branch before moving on to look at a different branch. So in our example, with the Choose Your Own Adventure book, a depth-first search would go to page eight to follow your protagonist down the hall. Then, whatever the options were at the end of that, it would go with the first option, and it would continue doing this until you get to the end of that pathway. Then it would come back up to the next most recent branch, follow that all the way to the end, and so on, until it had searched through every possible pathway in this graph, which is what we would call it.
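Here's a minimal sketch of a depth-first walk over a made-up little "book" graph; the page numbers loosely echo the example above, and the breadth-first version described in the next paragraph is included too, since the only real difference is a queue instead of a stack.

```python
from collections import deque

# Each page maps to the pages you can turn to next (a made-up, cycle-free tree,
# so we can skip the usual "visited" bookkeeping).
book = {
    1:  [8, 23],
    8:  [17, 52],
    23: [3, 46],
    17: [], 52: [], 3: [], 46: [],
}

def depth_first(graph, start):
    """Follow one branch all the way down before backing up (stack-based)."""
    order, stack = [], [start]
    while stack:
        page = stack.pop()                   # most recently discovered page
        order.append(page)
        stack.extend(reversed(graph[page]))  # reversed keeps choices in order
    return order

def breadth_first(graph, start):
    """Visit everything one layer of choices at a time (queue-based)."""
    order, queue = [], deque([start])
    while queue:
        page = queue.popleft()               # earliest discovered page
        order.append(page)
        queue.extend(graph[page])
    return order

print(depth_first(book, 1))    # [1, 8, 17, 52, 23, 3, 46]
print(breadth_first(book, 1))  # [1, 8, 23, 17, 52, 3, 46]
```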
Breadth-first search instead progresses equally across all possible paths. So breadth-first search would mean that you would go to page eight, and let's say at the end of page eight it says you can then turn to page seventeen or page fifty-two to make your next choice. But instead of continuing down that pathway, you back up to your starting position and then you go down to page twenty-three, you know, the other option, where you would close the door and walk away. You would read that, and maybe at the end of that page it says you can choose to either go to page three or to page forty-six. Well, then you would go back to your first branch. You would go down the list of pages seventeen and fifty-two and three and forty-six, and maybe those all end in different choices. You would then do all of those across the board next. You're searching across the width of the graph, the breadth of it. Both are valid forms of searching, and the method used typically depends upon the type of data being searched and what you're trying to do. We've got a lot more to cover with algorithms, but I feel like I'm gonna have to give you guys a break, so let's do that really quick, and I'll just give you one very short example after the break, and then we'll pick back up with algorithms again in a future episode. But first, let's take this quick break. All right, I know I've waxed poetic about algorithms, so much so that this episode is running longer than I anticipated. I have a whole extra section to talk about with algorithms, including things that are related to algorithms but aren't necessarily specifically algorithms, stuff like hashing and Markov models and things of that nature. But I want to finish up with one that I think is kind of fun, and this is called the Doomsday rule. And to be fair, this is not just a computational algorithm. This is something that you can do in your head if you learn the rule. I'm not saying I could do it. I haven't really committed this to memory yet, but it is possible. The Doomsday rule sounds pretty ominous; however, it's basically a way to figure out what day of the week any given date falls on. So if you were an expert with the Doomsday rule and I asked you, hey, what day of the week does June fifth, twenty twenty-five, fall on, you could use that rule and say, oh, that's gonna be a Thursday. That also, by the way, happens to be the day I'll turn fifty. Yikes, man, we're coming up on that one. All right, so where did this rule come from? And what the heck is this doomsday stuff about? Well, the doomsday thing is kind of just a cheeky name, and there are all these different ways we could describe it. But during any given year, there are certain dates that will always share the same day of the week, which is no big surprise, right? I mean, you might notice that Christmas, that is, December twenty-fifth, falls on the same day of the week as New Year's Day for the following year, you know, January first. It always is that way. But with the Doomsday rule, the dates we care about, which are called doomsdays, are things like April fourth, or four-four; June sixth, that's six-six; August eighth, that's eight-eight; and so on, like October tenth, that's ten-ten. These dates are all anchored to one day of the week, called the anchor day, or doomsday, and that day changes with each year, with leap years shifting things as well. But once you know that day, you know it's the same for all of those dates. So for twenty twenty-one, as I record this, this year's doomsday, or anchor day, is a Sunday.
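For the curious, here's a minimal Python sketch of the whole rule; the anchor-day formula and the memorable per-month doomsday dates are the standard ones attributed to Conway, but treat this as an illustration rather than a reference implementation.

```python
WEEKDAYS = ["Tuesday", "Wednesday", "Thursday", "Friday",
            "Saturday", "Sunday", "Monday"]  # the formula below returns 0 for Tuesday

def is_leap(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def anchor_index(year):
    """The weekday index of the year's doomsday (Sunday for 2021)."""
    return (year + year // 4 - year // 100 + year // 400) % 7

def doomsday_of_month(year, month):
    """A date in the given month that always falls on the anchor day."""
    memorable = {1: 3, 2: 28, 3: 14, 4: 4, 5: 9, 6: 6,
                 7: 11, 8: 8, 9: 5, 10: 10, 11: 7, 12: 12}
    day = memorable[month]
    if month in (1, 2) and is_leap(year):
        day += 1  # January 4th and February 29th in leap years
    return day

def weekday_of(year, month, day):
    offset = day - doomsday_of_month(year, month)
    return WEEKDAYS[(anchor_index(year) + offset) % 7]

print(weekday_of(2021, 8, 8))    # Sunday (a doomsday itself)
print(weekday_of(2021, 12, 25))  # Saturday
print(weekday_of(2025, 6, 5))    # Thursday
```

Same trick as doing it in your head: find the nearest doomsday in that month and count the difference over from the year's anchor day.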
So that means every doomsday is going to be a Sunday. I'm recording this in July, with our next doomsday being on August eighth, so that means August eighth should be a Sunday. And if you look at the calendar, you'll see that yes, August eighth is a Sunday. That means that October tenth should be a Sunday, and that December twelfth should be a Sunday. So the Doomsday rule involves a little bit more than just this, but not much more. And this means that if you learn the Doomsday rule and you have a general idea of what the anchor day is for each year, you can calculate what day of the week any given date will fall on, including in past years. The guy who came up with this was a mathematician named John Conway, and reportedly he could give you an answer in less than two seconds. If you gave him a date, he could tell you what day of the week it was going to fall on in two seconds or less. Sadly, he passed away last year at the age of eighty-two after he contracted COVID-19. But that Doomsday rule is one that I think is super cool. I've seen people who can do this, where you give them a date and they're just able to tell you what day of the week that falls on, and you can go and check it, and sure enough, they're right. Well, this is one process where you can do that. It's not the only one, by the way. There are other algorithms that also can help you determine what day of the week a specific date will fall on. But the Doomsday rule is one that works. And again, I don't have an expertise in this. I haven't practiced it at all, so I can't do it. If you came up to me and said, hey, Jonathan, May first, nineteen-something, what day of the week was that, I'd be like, I don't know. It's probably a fine day? I don't know, because I haven't done this yet. But I think it's really cool. I guess it illustrates how, for certain types of problems, you can come up with a mathematical approach to solving those problems, and it's applicable for all problems that fall into that category. And that's really the elegant and cool thing about algorithms. Also, the other cool thing is that people are coming up with new ones all the time, and they might be trying to refine earlier algorithms to make them even more efficient and effective, or they might be trying to come up with all-new algorithms to take advantage of emerging technology, stuff like quantum computing. I hope you enjoyed that episode. Like I said, it originally published last year, in two thousand twenty-one, and hopefully tomorrow I'll be back in shape and I'll be able to do another news episode. We'll have to wait and see. Like I said, I'm really hoping that this is just a stomach bug thing that I'm dealing with and not, like, round two of COVID. I do have a test that literally I think just arrived at my door, so I will check and make sure, because obviously I want to be responsible and safe for all of my fellow human beings out there. I hope all of you are well and staying safe and healthy, and that you're having a great summer so far. You know, I plan on having a great summer once I feel better. For now, I'm just having a kind of grouchy summer, because that's, I mean, that's my go-to mood for pretty much any situation, but particularly when I'm not feeling well. Anyway, take care, and I will talk to you again really soon. TechStuff is an iHeartRadio production. For more podcasts from iHeartRadio, visit the iHeartRadio app, Apple Podcasts, or wherever you listen to your favorite shows.