
TechStuff Classic: The National Facial Recognition Database

Published Nov 24, 2023, 8:01 AM

Are you an adult citizen in the US? If so, there's a 50% chance your face is in a law enforcement database. We look at this problematic law enforcement tool.

Welcome to TechStuff, a production from iHeartRadio.

Hey there, and welcome to TechStuff. I'm your host, Jonathan Strickland.

I'm an executive producer with iHeart Podcasts and how the tech are you? It's time for another classics episode. This episode originally published on May nineteenth, twenty seventeen. It is called the National Facial Recognition Database. Pretty, I would say, controversial topic. Well, let's listen in now. Before I dive into the topic, I want to make a couple of things very clear at the very beginning. First is I'm biased. I think the use of facial recognition software is problematic even if you have regulations in place. But I'm mostly talking about unregulated use because really we haven't established the rules and policies to guide the use of facial recognition software in a law enforcement context. So that's problem number one: I have a very strong opinion about this and I'm not going to shy away from that. It's really unjustifiable to have unregulated use of facial recognition software in law enforcement contexts. So I want to make that clear out of the gate that I have this bias, and if that's an issue, that's fair. But at least I'm being honest, right? I'm not presenting this as if it's completely objective, unbiased information. I own this. You don't have to tell me. I know it already. Next, this is largely going to be a US centric discussion so that I can talk about details. But please know that there are a lot of these types of systems all over the world, not just in the United States, and a lot of these places have similar issues to the ones I'm going to be talking about here in the US. I'll just be focusing more on US stories to make specific points because this is where I live, and now to explain what I'm actually talking about here. So, back in twenty ten, the FBI undertook a project that cost more than an estimated one point two billion dollars, that's billion with a B, to replace what was called the Integrated Automated Fingerprint Identification System, or IAFIS. Now, IAFIS had been in place since nineteen ninety nine, and I've talked about fingerprints in a previous episode. IAFIS was an attempt to create a usable database of fingerprint records so that if you were investigating a crime and you had lifted some prints from the crime scene, you could consult this database and see if there were any matches in place to give you any leads on your investigation. The twenty ten project the FBI undertook was meant to vastly expand that capability by adding a lot more data to the database, not just fingerprints, but other stuff as well, and the new system is called the Next Generation Identification, or NGI. It includes not just fingerprints, but other biographical data and biometric information, including face recognition technology. So a lot of images are included in this particular database. So as part of this project, the FBI incorporated the Interstate Photo System, or IPS, so you have NGI-IPS, which is typically how it's referred to now. That system includes images from police cases as well as photos from civil sources that are not necessarily related to crimes. That's not the only way the FBI can scan a photograph that relates to a case in some way against this massive database for a match, but more on that in a little bit. Now, the general process of searching for a match follows a pretty simple pattern, although the details can be vastly different depending upon what facial recognition software you are using at the time. So you first start with an image related to a case, and this is called the probe photo.
It is the one you are probing with, for lack of a better term. You don't know the identity of the person in the photograph, typically, or at least you might have suspicions, but you don't necessarily know for sure. So you've got a picture of an unknown person in this photograph. You then scan that photo and you use facial recognition software to analyze the picture and to try and find a match in this larger database. It starts searching all of the images in the database looking for any that might be a potential match. Depending upon the system and the policies that are in use, you could end up with a single photo returned to you. You could end up with dozens of photos, so these would all be potential matches with different degrees of certainty for a match. You might remember in episodes I've talked about things like IBM's Watson that would come up with answers to a question and assign a value to each potential answer, and the one that had the highest value, assuming it's above a certain threshold, would be submitted as the answer. So it's not so much that the computer quote unquote knows it has a match. It suspects a match based upon a certain percentage as long as it's over a threshold of certainty. Or you might end up with no photos at all. If no match was found or nothing ended up being above that threshold, the system might say, I couldn't match this photo with anyone who's in the database. A study performed by researchers at Georgetown University found that one in every two American adults has their face captured in an image database that is accessible by various law enforcement agencies, including but not limited to the IPS. In fact, the IPS has a small number of photos compared to the overall number represented by databases across the US. Now, this involves agencies at all different levels: federal, state, even tribal law enforcement for Native American tribes. That ends up being about one hundred and seventeen million people in these databases, many of whom, in fact a large percentage of whom, have no criminal background whatsoever. Their images are also in these databases, and this raises some big concerns about privacy and also accountability. So in today's episode, we're going to explore how facial recognition software works, as well as talk about the implementation for law enforcement and the reaction to this technology, and you'll probably listen to me get upset and a little heated about the whole thing in general. All right. So first, before we leap into the mess of law enforcement, because it is a mess, that's just a fact, let's talk first about the technology itself. When did facial recognition software get started and how does it work? Well, it's related to computer vision, which is a subset of artificial intelligence research. If you look at artificial intelligence, a lot of people simplify that to mean, oh, this is so that you can teach computers how to think like people. But that's actually a very specific definition of a very specific type of artificial intelligence. When you really look at AI and you break it out, it involves a lot of subsets of abilities. One of those is the ability for machines to analyze imagery and be able to determine what that imagery represents. In a way, you could argue it's teaching computers how to understand pictures. It's also really challenging, and this is one of the object lessons that I use to teach people how artificial intelligence is really tricky. It requires more than just pure processing power.
I mean, processing power is important, but you can't solve all of AI's problems just by throwing more processors at it. You have to figure out from a software level how to leverage that processing power in a way that gives computers this ability to identify stuff based upon imagery. So a computer might be able to perform far more mathematical operations per second than even the cleverest of humans, but without the right software, they can't identify the picture of a seagull compared to say, a semi truck. You have to teach the computer how to do this. So let's say you develop a program that can analyze an image and break it down into simple data to describe that image, and then you essentially teach a computer what a coffee mug looks like. You take a picture of a coffee mug, you feed it to a computer, and you essentially say this data represents a coffee mug. You then would have to try and train the computer on what that actually means. The computer does not now know what a coffee mug is. It will recognize that specific mug in that specific orientation under those specific lighting conditions, assuming that you've designed the algorithm properly. But it's way more tricky than that. What if in the image that you fed the computer, the coffee mugs handle was facing to the left with respect of the viewer, but in a future picture the handle is off to the right instead of to the left, or it's turned around so you can't see the handle at all. It's behind the coffee mug. Well, if the mug is bigger or smaller, or a different shape, well if it's a different color. Image recognition is tough because computers don't immediately associate different objects within the same category as being the same thing. So if you teach me, Jonathan, what a coffee mug is, and you show me a couple of different examples saying, this is a coffee mug, but this is also a coffee mug, even though it's a different size and different shape and a different color, I'll catch on pretty quickly and it won't take very many coffee mugs for me to figure out. All Right, I got the basic idea of what a coffee mug is. I know what the concept of coffee mug is now, But computers aren't like that. You have to feed them thousands of images, both of coffee mugs and of not coffee mugs, so that the computer starts to be able to pick out the various features that are the essence of a coffee mug versus things that are not related to being a coffee mug. It takes hours and hours and hours of work of training these computers to do it, so it's a non trivial task, and this is true of all types of image recognition, including facial recognition. Now, to get around that problem, you end up sending thousands, countless thousands, millions maybe of images of what you're interested in while you're training the computer. And the nice thing is computers can process this information very very quickly, so while it takes a lot, it doesn't take relatively that long, it's not as laborious a process as it could be if computers were slower at analyzing information. So you might remember a story that kind of illustrates the point. Back in twenty twelve, there was a network of sixteen thousand computers that analyzed ten million images, and as a result, it could do the most important task any computer connected to the Internet should be expected to do. It could then identify cat videos because it now knew what a cat was, or at least the features that define catness. Catness as in the essence of being a cat, not a character from Hunger Games. 
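As a rough illustration of that training idea, here is a minimal sketch in Python of teaching a binary classifier to separate "mug" from "not mug" using many labeled examples. Everything in it is hypothetical: the random feature vectors stand in for images that have already been processed into numbers, and scikit-learn's LogisticRegression stands in for whatever model a real vision system would use.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Pretend feature vectors standing in for preprocessed images:
# 500 labeled "mug" examples and 500 labeled "not mug" examples.
mugs = rng.normal(loc=1.0, size=(500, 16))
not_mugs = rng.normal(loc=-1.0, size=(500, 16))
X = np.vstack([mugs, not_mugs])
y = np.array([1] * 500 + [0] * 500)

# The "training" step: the model learns which feature patterns separate the two labels.
model = LogisticRegression().fit(X, y)

# A new, unseen example; the model can only generalize from the examples it was fed.
new_example = rng.normal(loc=1.0, size=(1, 16))
print(model.predict(new_example))  # expected output: [1], meaning "mug"

The point is not the specific model; it is that the system only ever learns patterns from the examples it is fed, which is why thousands of varied examples, and counterexamples, are needed.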
Even then, there were times when a computer would get it wrong. Either it would not identify a cat as being a cat, or it would misidentify something else as being a cat because its features were close enough to cat like for it to fool the computer algorithm. A major breakthrough in facial recognition algorithms happened way back in two thousand and one. That's when Paul Viola and Michael Jones unveiled an algorithm for face detection, and it worked in real time, which meant that it could recognize a face as it appeared on a webcam. And by recognized, I mean it recognized that it was a face. It didn't assign an identity to the face. It didn't say, oh, that's Bob. It said, oh, that is a face that is in front of the webcam right now. The algorithm soon found its way into OpenCV, which is an open source computer vision framework, and the open source approach allowed other programmers to dive into that code and to make changes and improvements, and it helped rapid prototyping of facial recognition software, too. Other computer scientists who helped advance computer vision further were Bill Triggs and Navneet Dalal, who published a paper in two thousand and five about histograms of oriented gradients. Now, that was an approach that looked at gradient orientation in parts of an image, and essentially it describes the process of viewing an image with attention to edge directions and intensity gradients. That's a complicated way of saying the technique looks at the totality of a person, and then a machine learning algorithm determines whether or not that is actually a person or not a person. A bit later, computer scientists began pairing computer vision algorithms with deep learning and convolutional neural networks, or CNNs. To go into this would require an episode all by itself. Neural networks are fascinating, but they're also pretty complicated, and I've got a whole lot of topics to cover today, so we can't really dive into it. You can think of an artificial neural network as designing a computer system that processes information in a way that's similar to the way our brains do. The computers are not thinking, but they are able to process information in a way that mimics how we process information, or a semi-close approximation thereof. That's a really kind of weak way of describing it. But again, to really go into detail will require a full episode all by itself. Typically, facial recognition software uses feature extraction to look for patterns in an image relating to facial features. In other words, it searches for features that resemble a face, the elements you would expect to be present in a typical face, so eyes, nose, a mouth, those would be the major ones, right? Then the software starts to estimate the relationships between those different elements. How wide are the eyes, how far apart are they from each other, how wide is the nose, how long is the jawline, what shape are the cheekbones? These sorts of elements all play a part as points of data, and different facial recognition software packages weight these features in different ways. So it's not like I could say all facial recognition software looks at these four points of data as its primary source. It varies depending upon the algorithm that's been designed by various companies, and part of the problem that we're going to talk about is that law enforcement agencies across the United States are not relying on a single facial recognition software approach.
Different agencies have different vendors that they work with. So just because one might work very well doesn't necessarily mean its competitors work just as well. And that's part of the problem. Now, all of these little points of data I'm talking about, these nodal points and how they relate to one another, all of that gets boiled down into a numeric code that you could think of as a face print. This is supposed to be a representation of the unique set of data that is a compilation of all of these different points boiled down into numeric information itself. Then what you would do is you would have a database of faces. So if you wanted to find a match, you would feed the image you have, the probe image, into this database, and the facial recognition software would analyze the probe photo. It would end up assigning this numeric value and would start looking through the database for other numeric values that were as similar to that probe one as possible and start returning those images as potential matches or candidates. They tend to use the word candidate photos. Otherwise you'll either get no match at all or you get a false positive. You will end up getting an image of someone who looks like the person whose image you submitted, but is not the same person. That does happen. And that's the basic way that facial recognition software works. But keep in mind, different vendors use their own specific approaches, like I said, and some could be less accurate than others. Some might be accurate for specific ethnicities and not as accurate for other ones. That's a huge problem, so it gets complicated. Even when I'm talking in more general terms, you have to remember that there are a lot of specific incidents and specific implementations of facial recognition software that have their own issues. So I'm gonna be as general as I can. I'm not going to call out any particular facial recognition software vendors out there. I'm more going to talk about the overall issues that various organizations have had as they've looked into this topic. Now, there are plenty of applications for facial recognition that have nothing to do with identifying a person. I mentioned earlier that there was the one for a webcam that could identify when a face was in front of the webcam. This wasn't to identify anybody. It was again just to say, yes, there's somebody looking into the webcam at this moment, which by itself can be useful and have nothing to do with identification. There are plenty of digital cameras out there and camera phone apps that can identify when there's a face looking at the camera, and again it's not necessarily to identify that person, but rather to say, oh, well, this is a face. The camera is most likely trying to focus on this person, so let's make this person the point of focus and not focus on something in the background like a tree that's fifty yards back. Instead, let's focus on the person who's in the foreground. So that's pretty handy, and again there's nothing particularly problematic from an identification standpoint, because that's not the purpose of it. But then you also have other implementations, like on social media, which allow you to do things like tag people based upon an algorithm recognizing a person. So Facebook is a great example of this. Right, if you upload a picture of one of your Facebook friends onto Facebook, chances are it's giving you a suggestion to tag that photo with that specific friend in mind.
That may not be that problematic either, depending upon how your friend feels about pictures being uploaded to Facebook. Some people are very cautious about that, and of course you know, I always recommend you talk to anybody before you start tagging folks in Facebook photos, just to make sure they're fine with it. I say that as a person who has done it, and then noticed that some of my tags got removed by the people I tagged later on, which taught me I should probably ask first, rather than give them the feeling that they need to go and remove a tag or two. We've also seen examples of this simple implementation of facial recognition going awry. Google's Street View will blur out faces, for example, in an effort to protect people's identity while Street View cars are out and about taking images. This makes sense. Let's say that you are in a part of town that you normally would not be in. For whatever reason, you might not want your picture to be included on Google Street View, so that whenever anyone looks at that street from that point forward, they see your face on there, you know, coming out of, I don't know, a Wendy's. Maybe you are a manager for Burger King; that would look bad. Or you know, lots of other reasons that obviously can spring to mind as well. You don't want to violate someone's privacy. But Google Street View would also blur out images that were not real people's faces, like images on billboards or murals. Sometimes if it had a person's face on a mural, the face would be blurred out, even though it's not a real person, it's just a painting. In September twenty sixteen, CNET reported on an incident in which Google Street View blurred out the face of a cow. So Google was being very thoughtful to protect that cow's privacy. But what about matching faces to identities? So in some cases, again seemingly harmless if you want to tag your friends, but when it comes to law enforcement, things get a bit sticky, particularly as you learn more about the specifics. And we'll talk about that in just a second, but first let's take a quick break to thank our sponsor. All right, let's first start with the FBI's Interstate Photo System, or IPS, because this one has perhaps the least controversial elements to it. When you really look at it, it's still problematic, but not nearly as much as the larger picture. The system contains images from criminal cases, like mugshots and things of that nature, but it also includes some photos from civil sources, like ID applications, that kind of thing. Now, the Government Accountability Office, or GAO, and there are gonna be a lot of acronyms and initializations, or initialisms I should say, in this episode, so I apologize for that, but the Government Accountability Office did a study on this matter just in twenty sixteen, so not that long ago. They published a report on facial recognition software use among law enforcement, specifically the FBI, because the GAO is a federal agency, so they were concerned with the federal use of this. The database contained about thirty million photos at the time of the GAO study, so thirty million pictures are in this database. Most of those images came from eighteen thousand different law enforcement agencies at all levels of government, and that includes the tribal law enforcement offices. About seventy percent of all the photos in the database were mugshots. More than eighty percent of the photos in that database are from criminal cases, so that means that less than twenty percent were from civil sources.
In addition to that, there were some cases, plenty of them, where the database had images of people both from a civil source and from a criminal source. So I'll give you a theoretical example. Let's say that sometime in the past I got nabbed by the cops for grand theft auto, because I play that game. But let's say that I stole a car, which we already know is a complete fabrication because I don't even drive. But let's say I stole a car, and that I had moved the car across state lines. It became a federal case. Therefore, my criminal information is included. My mugshot would be included in this particular database. On a related note, my ID also is in that database as a civil image, not as a criminal image. Well, in my case, they would tie those two images together because they refer to the same person and I had been involved in a criminal act. So while I would have an image in there from a civil source, it would be filed under the criminal side of things. This is important when we get to how the probes work. Now, let's say you have been perfectly law abiding this whole time, and that your ID is also in this database, but it's just under the civil side of things. Since you don't have any criminal background, it's not connected to anything on the criminal side. So when it comes to probes using the IPS, your information will not be referenced, because the FBI policy is that when it's running these potential matches with a photo that's been gathered as part of the evidence for an ongoing investigation, they can only consult the criminal side, not the civil side, with the exception of any civil photos that are connected to a criminal case. As in my example, those are fair game. So it might run a match and it turns out that my photo from my state-issued identification card is a better match than the mugshot is. That's going to be fine because those two things were both attached to a criminal file in the first place. But let's say that it would have matched up against you. Since you didn't have a criminal background, and since the only record in there was from a civil source, the match would completely skip over you. It wouldn't return your picture because your image is off limits in that particular use. Very important, because it's an effort to try and make sure this facial recognition technology is focusing just on the criminal side, not putting law abiding citizens in danger of being pulled up in a virtual lineup, at least not using that approach. The problem is that that's not the only way the FBI runs searches. In fact, that might not be the primary way the FBI runs searches when they're looking for a match to a photo that was taken as part of evidence gathering in pursuing a case. But let's say that you are an FBI agent and you've got a photo, a probe photo, and you want to run it for a match. What's the procedure? You would send off your request to the NGI-IPS department, and you would have to indicate how many potential photographs you want back, how many candidates do you want? You can choose between two candidate photos and fifty candidate photos. These are photos of different individuals, by the way, not just here's a picture of Jonathan on the beach, here's a picture of Jonathan in the woods. No, it's more like, here's a picture of Jonathan. Here's a picture of a person who's not Jonathan, but also kind of matches this particular probe photo you submitted. And here are forty eight others.
The default is twenty, so if you don't change the default at all, you will get back twenty images that are potential candidates matching your probe photo, assuming that any are found at all. It is possible that you submit a probe photo and the system doesn't find any matches at all, in which case you'll just get a null result. You might get fewer than what you asked for if only a few had met the threshold for reliability. Now, we call them candidate photos because you're supposed to acknowledge the fact that these are meant to help you pursue a lead in a case. It is not meant to be a source of positive identification of a suspect. So in other words, you shouldn't run a facial recognition software probe, get a result back and say that's our guy, let's go pick him up. That's not enough. It's meant to be the start of a line of inquiry, and whether or not it gets used that way all the time is another matter. But the purpose of calling them candidate photos is to remind everyone this is not meant to be proof of someone's guilt or innocence. The FBI also allows certain state authorities to use this same database, and different agencies have different preferences. So in the GAO report that I talked about earlier, the authors noted that law enforcement officials from Michigan, for example, would always ask for the maximum number of candidate photos, particularly when they'd use probe images that were of low quality. So let's say you've got a picture captured from a security camera and the lighting is pretty bad and perhaps the person wasn't facing dead on into the camera. You might ask for the maximum number of candidate photos to be returned to you, knowing that the image you submitted was low quality, and therefore any match is only potentially going to be the person you're actually looking for. And again, this is all just to help you with the beginning of your investigation. It's not meant to be the that's-our-guy moment that you would see in, say, a police procedural that would appear on network television in primetime. The FBI also has a policy that all returned candidate photos must first be analyzed by human specialists before being passed on to other law enforcement agencies. Up to that point, the entire process is automatic, so you don't have people overseeing the process while it's probing all of the database, but once the results come in, human analysts, who are supposed to be trained in this sort of thing, are supposed to look at each of those returned candidates and determine whether or not they really do resemble the person in the probe photo that was submitted in the first place, and if they don't, they are not supposed to be passed on any further down the chain. Now, so far, this probably doesn't sound too problematic. The FBI has a database containing both criminal and civil photographs, but when it runs a probe, it can only use the criminal photos or the civil ones that are attached to criminal files. Candidate photos are supposed to only be used to help start a line of inquiry, not to positively identify suspects, and everything has to be reviewed by a human being. That sounds fairly reasonable. But even if you're mostly okay with this approach, which still has some problems we'll talk about in a bit, things get significantly more dicey as you learn more about the FBI's policies. For example, they have a unit called the Facial Analysis Comparison and Evaluation Services, or FACE. This is a part of the Criminal Justice Information Services department, CGIS.
Rather, I, yeah, apparently I can spell justice with a G. It doesn't make sense. No, it's CJIS. This is a department within the FBI, and FACE can carry out a search far more wide reaching than one that just uses the NGI-IPS database. FACE uses not only that database but also external databases when conducting a search with a probe photo. So let's say again, you're an FBI agent and you have an image that you want to match. You want to find out who this person is. Maybe it's just a person of interest, doesn't even necessarily have to be a suspect. Could be that, hey, maybe this person can tell us more about this thing that happened later on. Well, you could follow the NGI-IPS procedure, which would focus on those criminal photographs, or you could submit your image to FACE. FACE then would search dozens of databases holding more than four hundred eleven million photographs, many of which are from civil sources. So NGI-IPS has thirty million; all of them together have four hundred eleven million pictures. And again a lot of those pictures just come from things like passport IDs, driver's licenses, sometimes security clearances, that sort of stuff. So this database has a lot of law abiding citizens who have no criminal record, and the images have nothing to do with any sort of criminal act, but they're in these databases. These external databases belong to lots of different agencies, both at the federal level and the state level. So you've got state police agencies, you've got the Department of Defense, you've got the Department of Justice, you have the Department of State, and again these contain photos from licenses, passports, security ID cards, and more. So your submission would then go to one of twenty nine different biometric image specialists. They would take that probe photo and run a scan through these various databases and they would look for matches. Here's another problem. Each of these systems has a different methodology for performing and returning search results, which makes this even more complicated. For example, I talked about how the NGI-IPS system gives you a return of between two and fifty candidate photos, right? Well, the Department of State will return as many as eighty eight candidate photos if they are all from visa applications from people who are not US citizens. So you can get up to eighty eight pictures from visa applicants, or you could just get three images from US citizen passport applicants, because that's a hard limit. They can only return three candidate photos from US citizens who applied for passports, but they can return up to eighty eight visa application photos. The Department of Defense will whittle down all of their candidates into a single entry. So, in other words, with the Department of Defense, if you query that database with your probe photo, you will only get one image back, so they will cull all the other ones and give you the most likely match out of all the ones that they find in their search. Some states will do similar things where they will narrow down which images they will return to you. Some of them will just give you everything they've got. Every match that comes up, they'll just return it back to the FBI. So it's very complicated. You can't really be sure what methods people are using to be certain that the potential matches they have represent a good match, a good chance that the person that they've returned is actually the same one who is in the probe photo.
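To make those differing return policies easier to picture, here is a small Python sketch. The caps come straight from the numbers mentioned in the episode; every name and function in it is hypothetical and simply mirrors the described workflow, not any real FBI or agency API.

# Per-source caps as described in the episode; everything else here is illustrative.
RETURN_CAPS = {
    "ngi_ips": 50,             # the caller picks 2 to 50 candidates; the default request is 20
    "state_dept_visa": 88,     # up to 88 candidates from non-citizen visa application photos
    "state_dept_passport": 3,  # hard limit of 3 for US citizen passport applicants
    "dod": 1,                  # the Department of Defense whittles its results to a single entry
    # some state systems return everything they find, i.e. no cap at all
}

def query_databases(probe_photo, searchers):
    """searchers maps a database name to a function that returns ranked candidates."""
    results = {}
    for name, search in searchers.items():
        candidates = search(probe_photo)
        cap = RETURN_CAPS.get(name)  # None means the source returns everything it found
        results[name] = candidates if cap is None else candidates[:cap]
    # A FACE analyst would then whittle all of this down to one or two photos
    # before anything reaches the requesting agent.
    return results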
At any rate, you as an FBI agent wouldn't get all of these photos that would come back. They would come back to that biometric analyst over at FACE. So you send your request to FACE, and FACE takes care of the rest. They get back all these results. Then they go through the results they get back and they whittle that down to one or two candidate photos and they send those on to you, the FBI agent. So by the time you get it, you only see one or two out of the potentially more than one hundred images that were returned on this search. But you might ask, well, how frequently does this happen? I mean, how often is the FBI looking at images, including pictures of law abiding citizens, in these virtual lineups? It can't be that frequent, right? Well, again, according to that GAO report, the FBI submitted two hundred fifteen thousand searches between August twenty eleven, which is pretty much when the program went into pilot mode and started to be rolled out more widely, through December twenty fifteen. Two hundred and fifteen thousand. From August twenty eleven to December twenty fifteen, thirty six thousand of those searches were on state driver's license databases. So it happens a lot, thirty six thousand times. Chances are, if you are an adult in America, you got like a coin flip situation that your image was looked at at some time or another by an algorithm comparing it to a probe photo in the pursuit of information regarding a federal case or, in some cases, state cases, because the FBI has also allowed certain state law enforcement agencies access to this approach. Now, according to the rules, the FBI should have submitted some important documents to inform the public of their policies and to lay down the regulations, the rules, the processes that they would have to follow in order for this to be fair, for it to not encroach on your privacy or to violate civil liberties or civil rights. Without those rules, the use of the system is largely unregulated, which can lead to misuse, whether it's intentional or otherwise. The Government Accountability Office specifically pointed out two different types of notifications that the FBI either failed to submit or was just very late in submitting. The first is called a Privacy Impact Assessment, or PIA. Now, as that name suggests, a PIA is meant to inform the public about any potential conflicts with privacy with regard to methods for collecting personal information. The FBI did submit a PIA for its next generation system, but they did it back in two thousand and eight when they first launched the NGI-IPS. According to the Government Accountability Office, the FBI made enough significant changes to the system to warrant another PIA. Anytime you make a significant revision to your personal information systems, you have to submit a new one, because things have changed, and according to the GAO, the FBI failed to do that for way too long. Now, ultimately the FBI would publish a new PIA, but by that point, the Government Accountability Office said they had delayed so long that it made it more problematic as a result, because during the whole time that they were supposed to have submitted this, they were actively using this system. It wasn't like this was a system being tested. It was actually being put to use in real cases. And that kind of violates, well, it doesn't kind of. It violates the Privacy Act of nineteen seventy four, which states that when you make these revisions, you're supposed to file a PIA before you put it into use.
According to the GAO, the FBI failed to do so, and also, the longer you wait to file this, the more entrenched those uses become. So if you put a system in place, you build everything out, you've actually taken the time to do it, and then you publish a PIA, then to any objections that are raised you could say, well, we've got a system now, and it cost one point two billion dollars to put it in place. It's going to cost more money, taxpayer money, for us to alter it, to remove it, to change it. You could argue against any move to amend the situation. And the GAO says, that's not playing cricket, or playing fair, for my fellow Americans. So that's a problem. But then there's another one. There's a second type of report called a System of Records Notice, or SORN. The Department of Justice was required to submit a SORN upon the launch of NGI-IPS, but didn't do so until May fifth, twenty sixteen. The GAO criticized both the FBI and the Department of Justice for failing to inform the public of the nature of this technology and how it might impact personal privacy. But wait, there's more. The GAO report also accused the FBI of failing to perform any audits to make certain the use of facial recognition software isn't in violation of other policies, or even to make sure it doesn't violate the Fourth Amendment rights of US citizens. Now, for those of you who are not US citizens, you might wonder what this actually means. Well, the Fourth Amendment is supposed to protect us against unreasonable search and seizure, and part of that means law enforcement can't just demand to search you for no reason. And some have argued that using facial recognition software without a person's consent, using it invisibly and widely, essentially amounts to crossing that line. Now, in the United States, we've got plenty of examples of troublesome policies that seem to overstep the bounds that are established by the Fourth Amendment. But that's a tirade for an entirely different show, probably not a TechStuff, maybe a Stuff They Don't Want You to Know. There are a couple of laws in the United States that are important to take note of here besides that Fourth Amendment. One of them I just mentioned, the Privacy Act of nineteen seventy four, and the other one is the E-Government Act of two thousand and two. The Privacy Act sets limitations on the collection, disclosure, and use of personal information maintained in systems of records, including the ones that law enforcement agencies use. The E-Government Act is the one that requires government agencies to conduct PIAs to make certain that personal information is handled properly in federal systems, and the GAO report alleges that the FBI policy wasn't aligned with either of those. Now, part of this accusation depends upon the fact that the FBI was using FACE in investigations for years before they updated their SORN. According to the Privacy Act, agencies must publish a new SORN upon the establishment or revision of the system of records. This is what I was talking about earlier, except I think I said PIA earlier when actually I meant SORN. That's entirely my fault because I didn't write it in my notes and I was talking extemporaneously. But SORN is what I should have said. The FBI argued that it was continuously updating the database to refine the system, but the GAO's argument was that you could be continuously updating the system and argue, well, we don't want to publish a SORN after every tiny revision because it's wasteful and time consuming.
The GAO's counter to that is, yeah, but you were using this tool in actual cases. If you were developing this, let's say, in a department where you're not using real cases, where you're just gradually tweaking the system so that it's more and more accurate in a controlled environment, that's one thing. But if you're actively making use of the system in real world investigations, you absolutely must adhere to these laws, because to do otherwise is in violation of laws that were passed in the United States. So you can't have it both ways. You can't continuously tweak a system and put it to official use and not also file these reports. You could argue the FBI was trying to have its cake and eat it too. So that's an expression that I think I actually used properly. All right, we've got more to talk about, but it's time for us to take another quick break to thank our sponsor. All right. So, the Government Accountability Office criticizes the FBI and various other agencies for failing to establish the scope and use of their facial recognition technology. But that's just the tip of the iceberg, because the GAO report goes on to make an equally troubling point: that the FBI had performed only a few studies on how accurate these facial recognition systems were in the first place. So, in other words, not only was this a poorly defined and unregulated tool, but it's a tool of unknown accuracy and precision, which is terrifying when you think about it. Now, according to the report, the FBI did perform some initial tests before they deployed the NGI-IPS, and then occasionally did a couple of tests when they made some changes. But there were problems with these tests. For one thing, they were limited in scope and they didn't represent how the system might be used out in the real world. When they were actually running these tests, they ran on about nine hundred thousand photographs in the database, so they took a subset of the photos that they had. They took nine hundred thousand of them, and they ran probe tests using photos that they knew either were or were not represented in that group of nine hundred thousand. However, you've got to remember the full database is more than thirty million images, so something that works on a smaller scale may not work once you scale it up. For another, the tests did not specify how often incorrect matches would come back, so you didn't know how many false positives were there, because the FBI wasn't tracking false positives. They were only concerned with how frequently they were getting a match to an actual image. So the way they tested this is, you've got nine hundred thousand images, you've got a probe image, they know for a fact that the probe image is inside that database, and then they run the search to see if the system sends that image back. And their threshold was an eighty five percent detection rate for a positive match. So, in other words, it went like this. Let's say you need to conduct a test of this system. This is one way you would determine whether or not you had that eighty five percent detection rate. Let's say you have one hundred probe photos that you've taken of one person, and you know this person's face is in that database. You know it's going to be in among those nine hundred thousand or so images. So then you submit your query. If you have an eighty five percent detection rate, then eighty five of those probe photos should come back with a match, and that match should be the actual person you're looking for.
That's what they meant by an eighty five percent detection rate: that eighty five percent of the time, an image that is in their database would be pulled due to a facial recognition software search. Now, during this testing phase, the FBI reported that they met this threshold. They used that subset, which was actually nine hundred and twenty six thousand photos, when they were testing it, and they said that they had an eighty six percent detection rate. So they actually were exceeding what they had set as their threshold. But that just meant that eighty six percent of the time, the actual match for a probe photo showed up in a group of fifty candidate images, so you would get forty nine other images that were not your match. The match would be there eighty six percent of the time, along with forty nine other images. So we know that the system works if you were asking for the maximum number of candidates. Remember, in the FBI system, you can ask for between two and fifty, but fifty is the max. But what happens if you ask for fewer images? What if you said, no, I want twenty returns. What's the accuracy then? The FBI can't tell you, because they do not know. According to the FBI, they did not run tests to see what would happen if you decreased the number of candidate photos you asked for. They only ran tests on the maximum number of candidate photos. And keep in mind, the default for any search is twenty photos, so the default is less than what they tested, and they never tried to see if the eighty six percent detection rate held true at these lower numbers. That's a huge issue. On top of that, the FBI didn't go so far as to determine how frequently its system would return false positives to probes, so they never paid attention to how many times they got responses that didn't reflect the actual identity. They didn't keep track of it. So, according to the FBI, the purpose of the system is to generate leads, not to positively identify persons of interest. So it shouldn't come as a big surprise, or you shouldn't even care, if it returns a lot of false positives, because hey, this technology isn't meant to be the smoking gun that says, here's the evidence that will put this person away. It's meant to just create a lead, so why do you care how many false positives it returns? As if being looped in on an official inquiry when you had nothing to do with it isn't disruptive or stressful or anxiety-provoking. I don't know about you guys, but if I had a federal agent show up at my door asking me weird questions about a case that I had no connection to, because my image had popped up in one of these searches and I have nothing to do with it, and it just so happens that I look enough like a photo that's being used in the case to warrant this, I would probably find that pretty disruptive in my life, so I would care about false positives. The FBI, at least according to this GAO report, apparently didn't think it was that big a deal. Now, the GAO points out that it is a big deal, and that they're not the only ones to think so. The National Science and Technology Council and the National Institute of Standards and Technology both state that, in order to know how accurate a system is, you need to know two pieces of information: not just the detection rate, which the FBI claims is eighty six percent, at least when you're asking for fifty candidates, but also the false positive rate.
You have to know both of them in order to understand how accurate a system is. So only knowing one of those pieces of information isn't enough to state whether this system is accurate or not. You have to know both. So, not only does the FBI not have a grasp on how accurate their system is if you're asking for fewer than the maximum number of candidates, they also don't know how often it returns false positives. So the FBI has no way of knowing how accurate this facial recognition software is that's being used to actually further official investigations of the FBI and also of other state agencies that have access to the system. That is beyond problematic. If you cannot say with any degree of certainty that the system is above a certain threshold of accuracy, why are you using it? Because, I mean, it has the potential to dramatically impact people's lives and potentially lead people down a pathway that could result in false accusations and imprisonment. The person who is actually responsible might totally get away with something because of this. This is a real problem. And the thing is, it might be a perfectly accurate system, but we don't know that because we haven't tested it. So until we test it, we cannot just assume that it's accurate enough. Not when people's lives are at stake. This is where my bias doesn't so much creep in as it kicks open the door and makes itself at home on your couch. But I digress. The GAO report also goes into great detail about how this accuracy really can have a clear impact on people's privacy, their civil liberties, their civil rights. They also cite the Electronic Frontier Foundation, the EFF, which says that if a person is brought up as a defendant in a case and it is revealed that they were matched by a facial recognition system, it puts a burden on the defendant to argue that they are not the same person as was seen in a probe photo, that they are not the same one that the system has identified. And if you cannot reliably state how accurate your system is because you don't know how frequently it returns false positives, you have unfairly burdened the defendant. Like, if you're the FBI and you say, we have an eighty six percent detection rate, but you don't admit, oh, by the way, we don't know how many false positives we get on any given search, the implication you have given is that we're pretty sure that this is the right guy. And again, they argue that this is meant to be a point of inquiry, but you could easily see how it could also be used by a lawyer to argue that a defendant is in fact the person responsible for a crime, and they may not be. And because you don't know the accuracy of the system, using the system to argue for it is irresponsible. There's no accountability there. Now, not only has the FBI failed to establish the accuracy of its own NGI-IPS system, it has also not assessed the accuracy of all those external databases that are used whenever they use the FACE approach. There are no accuracy requirements for these agencies, so there's not like a threshold they have to prove that they meet in order to be part of this. That's a huge problem. While each agency might be accurate, with no testing procedure in place it's impossible to be certain of that. And since these databases include millions of people with no criminal background and they all use different facial recognition software products, this is a huge issue.
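As a rough sketch of what measuring both numbers could look like, here is a minimal Python example. The system.search call and the two test sets are entirely hypothetical; the point is only that a meaningful evaluation needs probes you know are in the database and probes you know are not.

def evaluate(system, probes_in_db, probes_not_in_db, limit=50):
    """probes_in_db: (probe_photo, true_identity) pairs known to be in the database.
    probes_not_in_db: probe photos of people known NOT to be in the database."""
    hits = sum(
        1 for probe, identity in probes_in_db
        if identity in system.search(probe, limit=limit)
    )
    false_alarms = sum(
        1 for probe in probes_not_in_db
        if len(system.search(probe, limit=limit)) > 0  # any candidate here is a false positive
    )
    detection_rate = hits / len(probes_in_db)                    # the 85 to 86 percent figure discussed above
    false_positive_rate = false_alarms / len(probes_not_in_db)   # the number that reportedly was never measured
    return detection_rate, false_positive_rate

Reporting only the first number says nothing about how often innocent lookalikes would be swept into a candidate list.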
You could be put in a virtual lineup simply because you look enough like someone else that a computer thinks you are in fact the same person. The GAO report concludes with a host of recommendations for future actions, including addressing the problem of the FBI being so slow to publish those updated PIAs, and creating a means to assess each system's accuracy. The Department of Justice read the report and then responded disagreeing with several points that the GAO report made, including arguing that the FBI and the Department of Justice published information when it made the most sense, when the system had been tweaked and finalized, more or less. However, by that time, again, they had been using that system for real world cases throughout the entire process. So it seems to me to be kind of a weak argument. You can't really say, like, hey, it wasn't finished until then, that's when we published it, if you also are saying, hey, we used that for realsies to go after actual people. You can't have it both ways and not maintain accountability. At any rate, that kind of gets us to the end of the Government Accountability Office report, but that's not the end of the story. In March twenty seventeen, Congress held some hearings about this, and boy howdy, were some congresspeople very upset with the FBI, on both sides of the aisle. You had Democrats and Republicans really chastising the FBI for their use of facial recognition software and arguing that it could amount to an enormous invasion of privacy as well as endangering the civil liberties of US citizens. So people who have dramatically different political philosophies were agreeing on this point. So it wasn't really a partisan issue in this case, and it got pretty ugly, but probably not as ugly as the Georgetown University report that was published in late twenty sixteen. This is an amazing report. Both the Government Accountability Office report and the Georgetown University report are available for free online. I will warn you, collectively they're about two hundred pages, so if you want some light reading you can check it out. They are quite good, both of them, and they're very accessible. Neither of them is written in crazy legalese that would make it impossible to understand. They're written in very plain English. And in the Georgetown University report, it was revealed that one in every two American adults has their picture contained in a database connected to law enforcement facial recognition systems. And this report goes far beyond just the FBI, all the way down to state and local systems that are implementing their own facial recognition databases, and many of them have no understanding of how it might impact the civil liberties or privacy of citizens. The report is the summary of a study that lasted a full year with more than one hundred records requests to various police departments. They looked at fifty two different law enforcement agencies across the United States, and the report assessed the risks to civil liberties and civil rights, because up until this report was filed, no such study had been made, which is a huge problem. You don't know the impact of the tool that you've created until after it's been put in use for a while. That's an issue. Ideally, you think all this out before you implement the procedure. And their findings were pretty upsetting.
For example, the report found that some agencies limit themselves to using facial recognition within the framework of a targeted and public use, such as using it on someone who has been legally arrested or detained for a crime. And in this case, you're talking about a totally above-board approach. You're assuming that everyone is following the law as regards apprehending and charging a suspect with a crime, and maybe that person is unwilling or unable to tell you what their identity is, and in that case, you would use this facial recognition software stuff in order to figure out who you are dealing with. That's largely a legitimate case. The Georgetown University study didn't say that's bad. They actually said, no, that makes sense. It's targeted, it's public. But you could have a more invisible approach, for example, using facial recognition software in real time on a closed circuit camera pointed at a city street, where you're literally picking up people as they pass by. They're not people of interest, they're just people going about their day. And if you're running facial recognition software on such a feed, you are potentially invading privacy and stepping on civil rights and civil liberties.

Hey, it's modern day Jonathan here, just cutting in to say we will have more about the National Facial Recognition Database after this break.

So even with this real time use, where you're just looking at people as they pass by, and maybe a little name pops up every now and then as the system recognizes a person that matches a file in the database, it's easy to imagine a scenario in which such a technology could be abused. It might pick up somebody mistakenly; it thinks it identifies someone, but in fact it's a totally different person, and then you end up establishing a person's location by mistake. Like, it's not really where they were, but because the system has identified a person as being at X place at Y time, you then have established, supposedly, that person's location, when in fact that person might be across town or not even in the same state. But it's because of a misidentification in the system. That's one problem. But think about this, because this is a scary scenario. Imagine a situation in which a group of people are discriminated against by a government agency. Let's say they have a legitimate gripe. It's completely legitimate. They're victims of unfair treatment. So a group of them and some of their allies get together in a public place for a peaceful protest, to raise awareness of this issue and to confront the government agencies that have discriminated against them. This is all perfectly legal according to the US Constitution. They're not doing anything illegal. They're assembling on public grounds in order to practice free speech. But it's not hard to imagine a government agency using a camera with this sort of facial recognition software to identify people who are in the crowd in order to use that as leverage in the future for some purpose or another, even if it's just to say we know you were there, and to put that kind of pressure on a person in order to essentially squelch people's freedom of speech. So this is a First Amendment issue, not just a Fourth Amendment issue. Now, that might sound like a dramatic scenario, something Big Brother-ish, Orwellian, but it's also entirely within the realm of possibility. From a technological standpoint, there's nothing that would prevent us from doing this or prevent an agency from doing this, and even without the evil empire scenario in place, you still have the problematic issue of treading on civil liberties just by having such technology available and unregulated. You don't have rules to guide this sort of stuff. The Georgetown report found that only one agency out of the fifty two that they looked at had a specific rule against using facial recognition software to identify people participating in public demonstrations or free speech in general. So only one agency actually has rules against that. Now, that doesn't mean the other fifty one agencies are regularly using this technology to monitor acts of free speech, but it also doesn't mean that they can't. They don't have rules against it. Only one agency out of the fifty two. People are being watched and identified without any connection to a crime in these cases, and it's pretty terrifying. The Georgetown report also found that no state had yet passed a law to regulate police use of facial recognition software. No state in the US. There are fifty of them, and none of them have passed any regulations, any laws to regulate the use of facial recognition software. So without rules, how do you argue whether someone's misused or abused a system? You have to have rules so that you know what is allowed and what is not allowed.
With no rules, the implication is that everything's allowed until it isn't. That's a huge, dangerous problem. The report also pointed out that most of these agencies lacked any sort of methodology to ensure that the accuracy of their respective systems was decent. The report stated that of all the agencies they investigated, only two, the San Francisco Police Department and South Sound nine one one from Seattle, had made decisions about which facial recognition software they were going to incorporate in their offices based on accuracy rates. That was not a consideration for all of the other agencies, at least not the ones that they asked. Moreover, the report points out that facial recognition companies are also trying to have it both ways. So, for example, they cite a company called FaceFirst. Now, FaceFirst advertises that it has a ninety five percent accuracy rate, but it simultaneously disclaims any liability for failing to meet that ninety five percent accuracy rate. So it's kind of like saying, we guarantee these tires; tires not guaranteed. Not quite like that, but similar. So again, this is according to the Georgetown University report. That's a problem, for a company to sell itself on a performance threshold but then say, hey, you can't hold us to that performance threshold that we sold you on. That's a little dangerous there too. Then the report goes on to talk about the human analysts, you know, the ones I was talking about earlier, that are supposed to be a safeguard. Human analysts are supposed to take the images that are returned by these automated systems and manually review them to make sure that they do or do not match that probe photo. That was the whole thing to begin with. But it turns out, according to this report, those human analysts are not that accurate. In fact, they're no better than a coin flip. Literally. The report cites a study that showed that if analysts did not have highly specialized training, they would make the wrong decision for a potential match fifty percent of the time. Literally a coin flip. That's ridiculous. Now, the report found only eight agencies out of the fifty two used specialized personnel to review images, in other words, people who presumably have actually received that highly specialized training necessary to make more accurate decisions regarding these photos. And the report states that there's no formal training regime in place for examiners, which is a major problem for a system that's already in widespread use. So not only do you need highly specialized training, there's no formalized approach to give or receive that highly specialized training. We know you need it, but we haven't developed the best practices to actually deliver on it. So meanwhile, you've got human analysts who are making mistakes half the time while reviewing these photos. And if you wonder whether facial recognition systems would disproportionately affect some ethnicities over others, the answer to that is a resounding and dismaying yes. The report found that African Americans would be affected more than other ethnicities.
According to an FBI co-authored study that was cited by this Georgetown University report, several facial recognition algorithms are less accurate for Black people than for other ethnicities, and there's no independent testing process to determine if there's a racial bias in any of these facial recognition systems. So no one has developed a test to make certain that a system is in fact accurate regardless of a person's age, gender, or race. Without being able to verify that it is accurate across all of those parameters, you have opened up an enormous can of worms, and you are disproportionately affecting people just because of their race, because your system does not address that properly. The report also points out that information about the systems in use had not been generally available to the public. In fact, of all the fifty two agencies they contacted, only four had publicly available use policies. So, in other words, only four of the fifty two could tell you what their general policy was as far as facial recognition software goes. That's less than ten percent of all of the agencies they looked at, and only one of those agencies, which was the San Diego Association of Governments, had legislative approval for its policy. All the others were just self-adopted policies that had not received any kind of official legislative approval. Finally, the report asserted that most of these systems did not have an official audit process to determine if or when someone misuses the system. Nine agencies said that they did have a process, but only one provided Georgetown with any evidence that they had a working audit system, and that was the Michigan State Police, by the way, who said, we have an audit system, and here's proof that it actually works the way we said it did. So good on you, Michigan State Police, for having that system in place and being able to back it up. Now, the Georgetown University report also urged some major changes in the way law enforcement uses facial recognition, including an appeal to Congress to create clear regulations that define the parameters of when such a system could be used. They also called for companies to publish processes that test their products' accuracy regardless of race, gender, and age, to remove that possibility of bias. And if we're being really super kind and generous toward law enforcement, we could say this is just another case where technology has clearly outpaced the law. We see that all the time: driverless cars, artificial intelligence, lots of different technologies are advancing far faster than legislation can keep up with. All right, that's fair, we see it happen. However, it's particularly troublesome that this is happening within law enforcement, which is already employing this technology before we've developed the policies to guide it. It's one thing to say someone's out there working on a driverless car and we need to start thinking about how we're going to regulate that in the future. Maybe right now we say you aren't allowed to operate your driverless car until we figure this out. That's fair. It's another thing to say there's this technology that could potentially impact people's lives, and we're allowing law enforcement to use it while we try and figure out the rules. That's at best a problem. And as I said at the top of the show, I'm really just talking about the United States with particulars here, but this is happening all around the world.
There are lots of governments around the world that are incorporating facial recognition software into law enforcement. So while I'm using specific US examples in this podcast, the same is true for lots of other places. Of course, the laws that protect citizens can be different from country to country, and in some cases there might not be very many outlets for citizens to voice their concerns, or it might even be dangerous to do so. But this is something I think we need to be aware of. I'm not generally the kind of person who tells you that you're being watched or, you know, that you should be paranoid. But I'm also not the person to just sit back and let something go on when I feel like it's potentially more of a problem than a solution.

Well, that was it for the episode I did on the National Facial Recognition Database back in twenty seventeen. It's a topic I should definitely revisit. Obviously, there's so much going on here. There are so many concerning things about it, from surveillance states, to privacy and security concerns, to the fact that we've been seeing lots of companies try to use facial recognition to match people against databases with varying degrees of success, and that for people of color in particular, those degrees of success are not good. And I think there's a lot that we need to talk about as far as this goes, when it comes to things like, you know, individual rights and authoritarian abuse of these kinds of technologies, and I think we do need to have another update on this, so I will put that on my list. I hope that you are all well, and I'll talk to you again really soon. Tech Stuff is an iHeartRadio production. For more podcasts from iHeartRadio, visit the iHeartRadio app, Apple Podcasts, or wherever you listen to your favorite shows.
