Listener The Gregolas asked if I might explain The Illegal Number. What makes a number illegal? Can numbers actually be illegal? Isn't that absurd? YES!
Welcome to TechStuff, a production from iHeartRadio. Hey there, and welcome to TechStuff. I'm your host, Jonathan Strickland. I'm an executive producer with iHeartRadio, and how the tech are you? I've been getting some requests on Twitter as well as through the talkback feature that's in the iHeartRadio app under the TechStuff podcast label. More on that at the end of the episode. I wanted to start tackling some of those requests this week. We also will have a special episode coming up a bit later this week, so it's gonna be a little different from a normal TechStuff week. That will actually carry over into next week and a bit beyond as well, because I'm going to take a short vacation from work. But I'm trying to record stuff in advance to at least cover some of that so that it's not all reruns. Anyway, let's get to the request for today's episode. Today's episode comes to us courtesy of the Gregorlis on Twitter, who asks, quote, could you do a TechStuff tidbit episode about the illegal number? End quote. And yes I can, the Gregorlis. Now, first I should say there's not just one illegal number, not if we follow the logic, and I use the term loosely, about what makes a specific number illegal. There is a particular number referred to as the illegal number, and I'll explain that as well. I will not be saying that number, not because I fear reprisal. It's not that I'm worried that the Feds are gonna kick down my door. It's because the illegal number is a one thousand, four hundred one digit number, and by the time I finished saying it, we'd be well into Tuesday. So this topic ties into some other stuff I've been talking about in recent episodes. And to understand how a number can be illegal, we have to cover a few different concepts. Some of those concepts deal with technology, some deal with politics, and some deal with business.
And as you might imagine, the last two categories there, politics and business, have a great deal of overlap, because we're talking about the United States in particular. Also, you could overlay the word stupid, or at least absurd, over those two. All right, I'm gonna cover the political and business stuff first because this ties into the specific case of the illegal number. So back in nineteen ninety eight here in the United States, then President Bill Clinton signed into law the Digital Millennium Copyright Act, or DMCA. The DMCA in turn incorporated two World Intellectual Property Organization, or WIPO, treaties. These were the WIPO Copyright Treaty and the WIPO Performances and Phonograms Treaty, both of which had very similar language in them. Now, WIPO itself is an international organization that is part of the United Nations, and it has been since nineteen seventy four. The purpose of WIPO is to create international agreement on certain aspects of copyright so that created works enjoy copyright protection across borders, though the extent of that protection and the penalties incurred by those who ignore it are largely up to each individual country's government. So enforcement and everything, that's not covered in the treaties, just the basic concepts that we want to make sure countries across the world agree on. So the DMCA is America's version of adhering to the rules established by those two WIPO treaties. The protections in the DMCA stem from those treaties, but the actual wording and implementation are uniquely American. The DMCA also has other stuff that wasn't directly covered by WIPO. For example, under Title five of the DMCA, there was a new form of protection created for, quote, the design of vessel hulls, end quote, as in boats. And I do mean boats, because the update to the Copyright Code only applies to boat hull designs with hulls that are no longer than two hundred feet. So yeah, this update got real specific.
But never mind that, we're not here to talk about boats. I'll save that for some future episode. The part we need to focus on is the bit that added section twelve oh one to Title seventeen of the US Code. That section covers, quote, the obligation to provide adequate and effective protection against circumvention of technological measures used by copyright owners to protect their works, end quote, or as we typically think about it, digital rights management, or DRM. Now, a lot of people, including myself, have oversimplified the intent of this piece of legislation to say that this protection means it's illegal to get around DRM. That means that here in the United States you are within your rights to make a backup copy of a work for your personal use, whether it's a digital music file or some software or a book. I mean, it can be a physical thing. It doesn't have to be digital. Whatever it is, you are allowed to make a personal copy for your own backup purposes as long as that's all it's for. But you are not allowed to circumvent DRM in order to do it, which is kind of like saying the stuff that's inside this locked safe is yours, and you can do whatever you want with the stuff that's in this safe. You can copy it as many times as you like, but you're not allowed to open the safe to get at it. Now, that analogy isn't perfect, because you can still use DRM'd material, but there can be some limitations depending on the implementation of DRM. But I think it gets the idea of the issue across. Well, the DMCA gets a bit more nuanced than just that. It's actually a little more complicated, though in practice it doesn't end up mattering very much. See, the DMCA actually differentiates between measures that control access to a copyrighted work and measures that prevent unauthorized copying of a copyrighted work. So, if you were making a personal backup copy of a copyrighted work, that is legitimate. It's a type of fair use.
So if the technological prevention of copying is all that's stopping you from making a copy, it's okay to circumvent that protection, assuming the copy you make is legitimate under the umbrella of fair use. So if you were archiving, say, a piece of software, and the only thing stopping you was an anti-copy piece of technology, there's no legal issue getting around it. However, if the DRM controls access to the copyrighted work, it's a different story. The DMCA is clear about that: it is illegal to circumvent, or develop tools meant to circumvent, DRM that controls access to a work. Well, here's the real problem. The way companies use DRM is tied directly to access. Yes, copying is part of that, but it's a subpart. So trying to get around it so that you can make your personal copy also means having to get around the access part, and that's illegal. Let's consider the DRM that Apple used to use on digital music files. They don't use it anymore, but they used to use a system called FairPlay. Apple was strong-armed into developing FairPlay by the major music labels of the time. The DRM put a limit on how many devices would be allowed to access any given digital file. That would prevent the unauthorized distribution of digital songs that were downloaded from Apple's store, because you would very quickly hit the small limit of devices that would be allowed to access the file, and creating or using a tool to strip those files of that protection would be against the law because of the DMCA. Now, there are a few other exceptions to the DMCA that would allow folks to get around DRM without it being illegal, but those are extremely limited in scope. For example, a nonprofit library would be allowed to circumvent DRM on some software, but only for the purposes of evaluating the copyrighted software if the library were considering obtaining a legit copy for itself.
So, in other words, this nonprofit library is saying, we're thinking about getting this, but we don't know yet. We want to evaluate it. We don't own a copy of it, so we have to use this method to circumvent DRM to evaluate the copy. That would be okay under those very specific circumstances. Security companies could circumvent DRM for the purposes of testing computer and network security. There are a few other exceptions, but they're all very limited, and they have specific criteria that have to be met in order to pass muster. So we arrived at a point where, because of the technological protection overlaid on top of some copyrighted works, US citizens were denied the right to create personal backup copies as is permitted under fair use, which is pretty absurd, right? You're allowed to do this thing, only there's a lock that prevents you from doing that thing, and it's illegal to get rid of the lock. So you're effectively saying it's illegal for me to make a backup, and the court would say, oh no, no, it's perfectly legal for you to make a backup. You just can't break the lock that prevents you from making a backup in order to make a backup. But if you can make a backup without getting around or breaking the lock, you're good to go. Well, that's patently absurd. You cannot do those things. It's the sort of situation you would expect to encounter in a novel like Catch-22. Now, companies were thrilled, right? The big music labels were thrilled. Movie and television studios were thrilled. Publishers were thrilled. Organizations like the Electronic Frontier Foundation were not thrilled. Also, while the measures were intended to curb piracy, that largely failed. Piracy was still running rampant, and in fact, there was a pretty strong argument that the pirates were ending up with better versions of the copyrighted works because their versions didn't have all that pesky DRM attached to them.
They had stripped it out, and DRM could cause other problems. For legit users, DRM could actually interfere with your access to material you had fairly and legally purchased. So legitimate customers who did not remove their DRM had inferior versions of the works that the pirates were enjoying. The DMCA did, however, give big software and media companies the equivalent of an enormous cannon they could point at people who were found or suspected to be downloading material without permission. So they fueled a lot of incredibly disproportionate lawsuits against people, and it did not win the industry any favors. Anyway, we're gonna pivot in a second. We're gonna come back to the DMCA and DRM toward the end of this episode, because they will play a pivotal role in the creation of the concept of the illegal number. First, we're gonna take a quick break, and when we come back, we're gonna talk about abstraction.
We're back, and it's time for us to talk about computer languages and machine language. It's a good jumping off point for the concept of abstraction. Now, I'm sure you all know that when you really get down to the hardware level of what's going on inside a computer, all the information passing through the system is in machine code, aka binary. That's zeros and ones, or bits. In other words, machines use zeros and ones to represent anything and everything, and we can use zeros and ones to represent different types of stuff. We just need to gather enough zeros and ones to be able to do it. So a collection of eight binary digits is a byte. Eight bits is a byte. That's something we arrived at after a lot of back and forth in the computer industry, but there's no need to rehash all of that right now. With eight binary digits, you can represent up to two hundred fifty six different values, or two to the power of eight. We have two possible states, zero and one, and we have eight total binary digits, so it's two to the power of eight.
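For anyone following along in text, here is a minimal Python sketch of that arithmetic. It is not from the episode, and the helper name `to_bits` is made up for illustration. It counts how many values eight bits can hold and prints a few byte values as eight-digit binary strings, assuming the standard ASCII codes for the two letters used as examples.

```python
BITS_PER_BYTE = 8

# Two possible states per bit, eight bits: 2 ** 8 distinct values.
values_per_byte = 2 ** BITS_PER_BYTE


def to_bits(value: int) -> str:
    """Render a byte value (0 to 255) as an eight-digit binary string."""
    if not 0 <= value < values_per_byte:
        raise ValueError("value does not fit in one byte")
    return format(value, "08b")


if __name__ == "__main__":
    print(values_per_byte)    # 256
    print(to_bits(0))         # 00000000
    print(to_bits(255))       # 11111111
    # Assumption: ASCII encoding, where "A" is 65 and "Z" is 90.
    print(to_bits(ord("A")))  # 01000001
    print(to_bits(ord("Z")))  # 01011010
```

The range check is there only to make the one-byte limit explicit: anything outside zero to two hundred fifty five needs more than eight bits.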
With eight binary digits, we could designate different letters and figures. For example, the letter A in binary is zero one zero zero zero zero zero one. The letter Z in binary is zero one zero one one zero one zero. And yes, it sounds like I'm singing Robots by Flight of the Conchords, but that's how we could designate A and Z in binary. Now, you've probably already noticed that using binary would be far too clunky for humans. We would quickly get lost while trying to spell a simple word, let alone create complex instructions for a computer to follow. For that reason, we have different ways, or abstractions, to deal with machine languages. Programming languages are an example. With programming languages, we can use formats that humans can read, some more easily than others, and machines cannot read these, not natively. Programming languages that aren't far removed from the instruction set that's used by the machine itself are called low level languages. Assembly, or sometimes assembler language, is the ultimate example of low level languages. These days, a utility program would take the instructions written in assembly, which would very much mirror the instruction set of the architecture of the machine that's running the stuff, and then it would convert the assembler language into executable machine code. So it takes this one representation of processes and converts it into a different representation of processes, and it's the representation that the machine can actually use. Higher level programming languages have more levels of abstraction in them. They are, generally speaking, easier for people to work with, so it's much easier to write a program in these languages. The instruction sets in these programming languages are further removed from the base architecture of the machine that you are programming for.
In fact, there are programming languages where you can program for all sorts of different machines, and you use a different utility for each machine to convert it into a language the machine can use. And because of all this, the computer code has to go through a process called compiling before a machine can actually execute the program. If there was no compiling process, the machine wouldn't know what to do with the sets of instructions. It would be as if you spoke only one language, someone else spoke only a different language, and you just kept asking the same question, like, eight different ways in English, for example, and the other person only speaks Mandarin. It wouldn't matter how many times you reworded the question. The languages are fundamentally different. There would be no communication. Same sort of thing here. Without compiling, the machine has no way of understanding what it is you wanted it to do. So the compiler is absolutely key in this case. Now, the compiler essentially takes one language and converts it into a different language. That language could be assembly language. So compiling could take your very complex, sophisticated program, break it down into all the different steps that would be required to make that program work in assembly language, and then through an assembler that would get converted into machine code, and then the computer would ultimately, quote unquote, understand what to do with your program. So you might write a program in a language like Python, and a compiler might take that and translate it into assembly language, and then through an assembler that gets converted into machine code, and a computer actually uses that. So really we can think of all this as a general process by which we humans take stuff that we can work with easily and convert that into stuff that a machine can work with easily. There's a lot more to it than that, and I am oversimplifying, but you get the idea.
The reason why I went through all that rigmarole is, one, to talk about the actual types of information that machines work with and why that's important, and also to get across this idea that there are so many different ways that we can represent numbers, and there are different ways we can represent stuff in numerical form. If we go back to a byte of information, those eight bits, we can describe that in several ways, right? We can say eight bits represents two to the eighth values, or we could say it represents two hundred fifty six values, or we can say it represents values from zero to two hundred fifty five, and so on. All of those are essentially saying the same thing, but we're saying it in different ways. Well, there are other ways to represent numbers as well. Another numeral system that's often used in computing is called hexadecimal. Hexadecimal is a base sixteen system. The system that you and I count in is base ten, right? You start at zero, you go to nine, then you move into the tens. So ten is just one zero, eleven is just one one. You go up to nineteen, then you move up to the twenties. That's two zero. So that's base ten. Well, how the heck do you have a base sixteen? How do you count in base sixteen? You only have ten single digit numerals, from zero to nine. Well, the way you create base sixteen is you start to borrow from letters. Hexadecimal designation goes from zero through nine, and then to represent values ten through fifteen, we switch to letters and use A through F. Now, in hexadecimal, a single digit represents four bits, so two hexadecimal digits are equal to a byte, or eight bits. So we can go from eight zeros to eight ones for the full range of expression with bits, right? Eight zeros to eight ones, that would be zero to two hundred fifty five.
That's if we think about it numerically, in the regular digits that we use today. In hexadecimal, we would represent that range as zero zero, which would be the same thing as eight zeros, up to F F, which would be the same thing as eight ones. So hexadecimal creates a slightly easier way to represent binary data than just working with zeros and ones. Now, the whole reason I brought all this up is that it really is important to understand there are so many different ways to represent numbers. You can convert one number system into another number system. You can convert numbers into other stuff too. You could create an image based off numbers, or vice versa, you could take an image and reduce it to numerical data representing the image. If we couldn't do that, computers would not work, or at least they wouldn't be able to do anything other than perform operations on numbers like a very simple calculator. But Ada Lovelace had it right. We can use numbers to represent all sorts of incredible things, from images to music to sophisticated programs like the Curse of Monkey Island. All right, when we come back, we're gonna go back to the DMCA, and we're gonna combine the things we learned about the DMCA and the things we learned about using numbers to represent different things, and figure out how a number can be illegal. But first, let's take another quick break.
All right, back to the DMCA. The DMCA makes it illegal to attempt to circumvent access controls on digital copyrighted works. That means it's illegal to develop tools expressly for the purposes of getting around digital rights management, or DRM. So let's say that someone goes out and writes a program that is essentially a workaround of DRM. That program is illegal unless it happens to be used in one of those very narrow exemptions I mentioned earlier. Distributing that program is also illegal.
You could reduce the program to something like its source code, and you could represent that source code in some other format. You could then post that format on a web page for anyone to see, and someone could see that representation, they could copy the representation, they could reverse the process you used to arrive at that representation, and then they would have the source code, which means now that person has possession of the illegal material. So, in other words, by changing the representation of the program, you have created a different means of displaying that information. Does that, in fact, make that display itself illegal? If we follow the DMCA rules to the letter, and we acknowledge that it is possible to share something that is illegal to possess if we just represent that thing as a numerical value, well, by extension, the only logical thing we can say is that that number itself is illegal. And in fact, you could go a step further. You could break down illegal material, such as an encryption key, for example, convert that into hexadecimal code, convert the hexadecimal code into an image, and use that image to distribute the code. This whole thing is still reversible. You have to know the steps that were used, but you could do those same steps in the reverse order and arrive at that source code. This could be a form of steganography, that's hiding a message inside something else, like an image. It doesn't have to be an image, but that's frequently what we think of when we think of steganography. Now, if we follow the DMCA to the logical conclusion, we would say, well, that would mean the image itself would be illegal. So really, when you break it down, the concept of illegal numbers is meant to show how absurd it is to try and legislate information, because information is mutable.
That is, we can change it from one format into a different format, one that would seem innocent and definitely abstracted from whatever it was representing, and any attempt to ban that information becomes absurd, because how far do you go with that? Let's say, for example, that I cracked the encryption used by some software company to protect its work. Right, it limits the access to their work, and I've cracked it, and I create a program that lets other people circumvent this protection. Then I represent the program as a hexadecimal value, which, when you convert it over, becomes the source code for this program. But then I take that hexadecimal value and I run it through another program that creates a very, very large number, and it's totally reversible. If you were to take that very, very large number and run it through a similar program, you would arrive at the hexadecimal value, which you could then convert into the source code, and you would be back where I started. So I post this very, very large number, which by itself, without any other context, is just a number. Can the software company actually claim that that very large number is in violation of its intellectual property? Because isn't that absurd? But because of the nature of digital information, all of this is entirely possible. Complicating matters is that sometimes, once you do all these conversions, you end up with a prime number. It depends on what you're using in order to create these. Like I said, you're usually using some form of software that takes the value of something and converts it into another format. There are some where, if you do this and you set things just right, then the end output you get is a prime number. If that prime number represents illegal material, does that make that prime number illegal? And if so, wouldn't that mean that listing that number in a database of prime numbers would be a crime?
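To make that round trip concrete, here is a small Python sketch. It is an illustration only, not the tool anyone in this story actually used: the function names are invented, and the one-line "program" is a harmless stand-in. It turns text into bytes, shows those bytes as hexadecimal, reads the same bytes as one large whole number, and then reverses the steps.

```python
def text_to_number(text: str) -> int:
    """Encode text as UTF-8 bytes, then read those bytes as one big integer."""
    data = text.encode("utf-8")
    return int.from_bytes(data, byteorder="big")


def number_to_text(number: int) -> str:
    """Reverse the process: integer back to bytes, bytes back to text.

    Caveat: leading zero bytes would be lost, so this sketch assumes the
    text does not start with a NUL character.
    """
    length = (number.bit_length() + 7) // 8
    return number.to_bytes(length, byteorder="big").decode("utf-8")


if __name__ == "__main__":
    # Hypothetical stand-in for source code; any text works the same way.
    source = 'print("hello")'
    as_hex = source.encode("utf-8").hex()
    as_number = text_to_number(source)

    print(as_hex)      # the hexadecimal representation
    print(as_number)   # the same bytes read as one decimal number

    # The hex string and the big number are two spellings of one value.
    assert int(as_hex, 16) == as_number
    # And the whole thing is reversible.
    assert number_to_text(as_number) == source
```

Nothing in the number itself says what it encodes. The meaning lives entirely in knowing which steps to run in reverse, which is the point being made here.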
But there are legit reasons you want to be able to get lists of prime numbers. Prime numbers are incredibly useful in computation. And the reason I talked so much about the DMCA is that the big instance of the concept of illegal numbers, in fact, the number that is referenced as the illegal number, popped up because of an issue that happened around two thousand one. That's when some programmers created software that they called DeCSS. This software could decrypt DVDs. DVDs had encryption that would prevent people from doing things like copying the DVDs. Now, there are a couple of reasons you might want to decrypt a DVD. One might be that you do want to copy DVDs. You want to create bootlegs and sell them at a markdown price on a street corner or something. That's the dread piracy route that the big companies were all scared of. But another reason you might want to get around DVD encryption would be to run the DVD on, say, a computer system that had an operating system that wasn't supported by DVD encryption, like Linux back in the day. You have a Linux computer. You want to watch a DVD on that Linux computer, but because Linux is not compatible with that encryption method, you need to strip the encryption off of it first. Now, according to the DMCA, the DeCSS program was illegal. It allows one to circumvent the access control of the copyrighted work on the DVD, and DeCSS prompted criminal cases against the programmers, only one of whom was ever identified, and that one was acquitted. But in the United States, DeCSS was deemed an illegal piece of software. Well, then a programmer named Phil Carmody got the nifty idea to convert the source code for DeCSS into a prime number. So he took the source code, which was originally written in the C programming language, and he used a Unix utility to reduce the file size. It's called gzip.
He then took the new file format and he ran it through a different program in order to arrive at a one thousand, four hundred one digit prime number. He would later actually boost that up to one thousand, nine hundred five digits, for reasons I'll explain in a second. Now, when that prime number gets converted into hexadecimal, it represents a gzip file of the original source code for DeCSS. His whole point was that he could create an archivable representation of this illegal software. You see, prime numbers are important, and they can be mathematically interesting. If you were to take information that was deemed illegal for whatever reason and use this conversion process to turn it into an interesting prime number, then that would be reason enough to archive it. It's important, you need to be able to archive it, which means that whether the information is illegal or not, you can save it, you can archive it. And yeah, sure, that number ultimately, if you go through this process, represents a way to decrypt DVDs illegally. But that same number can also be mathematically interesting on its own merits, so making it illegal to publish that number would be a real sticky wicket, as they say. And that's why he boosted it to one thousand, nine hundred five digits. The one thousand, four hundred one digit prime number, he argued, was not really mathematically interesting, and so it probably did not make a strong case for being archivable without being illegal. But by boosting it, he said, well, this number does have some interest mathematically, and so therefore it would be ludicrous to deem it illegal, even though it also represents this illegal process. Now, Carmody's thought was that the banning of pure information just doesn't make any sense, and he sees source code as being a type of pure information. And I think his argument is really strong, since we've seen there are so many different ways to represent pure information.
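Here is a rough Python sketch of the first half of that idea, compressing some source text with gzip and reading the compressed bytes as a single number. It is an assumption-laden illustration rather than Carmody's actual procedure: the function names and the tiny C snippet are placeholders, and the sketch makes no attempt at the hard part, which is arranging for the resulting number to be prime.

```python
import gzip


def source_to_number(source: str) -> int:
    """Gzip the source text, then read the compressed bytes as one integer."""
    compressed = gzip.compress(source.encode("utf-8"))
    return int.from_bytes(compressed, byteorder="big")


def number_to_source(number: int) -> str:
    """Turn the integer back into gzip bytes and decompress them."""
    length = (number.bit_length() + 7) // 8
    compressed = number.to_bytes(length, byteorder="big")
    return gzip.decompress(compressed).decode("utf-8")


if __name__ == "__main__":
    # Placeholder C program standing in for real source code.
    source = "int main(void) { return 0; }\n"
    number = source_to_number(source)

    print(len(str(number)), "decimal digits")
    # A gzip stream starts with the bytes 1f 8b, so the hex form does too.
    print(hex(number)[:6])

    # Anyone who knows the steps can recover the original text.
    assert number_to_source(number) == source
```

The round trip works because a gzip stream never begins with a zero byte, so no leading bytes are lost when the data is treated as a number.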
You could convert those numbers into music if you liked. You would just need to create a program that would follow specific rules in order to take this and turn it into music. And then would you say that that music, which represents these numbers, which ultimately, through however many layers of abstraction, represent illegal material, would you say the music itself is illegal? It's all very puzzling, and the more you think about it, the more it feels like you're slipping into an Alice in Wonderland situation. Lewis Carroll would have a field day with this stuff. But that is the concept of the illegal number, a number that ultimately represents a process to decrypt DVDs. And as I said, it is the illegal number, but there are other illegal numbers. Really, anything that could represent illegal material could be conceived of as an illegal number. It might be that the majority of people who see that number have no idea what it represents, nor would they necessarily know how to go through the process of converting that number into whatever the illegal material ultimately is. But the argument still stands that if you can successfully say that displaying this number is illegal, where does that end? How do we actually have a world that makes sense where you have made pure data illegal to distribute or to exhibit? That is the absurdity that's on display here, and yeah, it shows how the worlds of technology, politics, and business can come into conflict with one another. You can kind of understand the perspectives of the different parties here, but it still doesn't make it any less absurd. So I hope you enjoyed this episode. I know it was a bit of a convoluted one, but it's important to get all these different concepts in your head so that you can understand the challenges here when you're trying to balance things like copy protection versus the representation of pure information.
It does get messy, and there may not be any simple solution for all of this, but I definitely think that prosecuting people for showing a number would be absolutely ludicrous. Again, where does that stop? It's entirely possible to have something like that. With Carmody, he actually said, if you have a prime number that ultimately represents this illegal process, there are legitimate reasons for displaying that prime number. They have nothing to do with the process, right? It's all about the prime numbers, and therefore you cannot make it illegal to display, because if you did, then you would invalidate all of these legitimate purposes for showing that information, which makes no sense. It would be like saying, you know what, uranium is a dangerous element, so from now on, you're not allowed to show it on the periodic table of elements. That makes no sense either, right? That's kind of what we're getting at here. Anyway, thank you, the Gregorlis, for that suggestion. It was a lot of fun to go down that rabbit hole. And I welcome everyone to send in their suggestions for episodes like this. Like I said, I'm gonna be tackling a few of them in the upcoming episodes. There are two ways you can reach out to me. One is that you can get the iHeartRadio app, go to the TechStuff page in the app, and use the little microphone talkback feature there, which will let you record a message of up to thirty seconds in length. The only people who can hear that are Tori and myself, and if you like, we can even use that thirty seconds in an episode to launch into whatever the topic suggestion is. But if you don't want that, just let me know and I definitely won't play it. That is a possibility, though, and I love hearing from you. I'm gonna be using one of those pretty soon. So that's one way. The other way, of course, is to reach out via Twitter.
The handle for the show is TechStuffHSW. Just send me a message that way. That's how the Gregorlis did it, and I'll be sure to see that as well. That's it for this episode. Hope you enjoyed it, and I'll talk to you again really soon. TechStuff is an iHeartRadio production. For more podcasts from iHeartRadio, visit the iHeartRadio app, Apple Podcasts, or wherever you listen to your favorite shows.