SYSK Selects: How the Deep Web Works

Published Apr 4, 2020, 9:00 AM

Perhaps you didn’t realize that when you search the web you’re only skimming the surface. In fact, the types of web pages that turn up in your search engine results represent only a mere fraction of the total web. Immerse yourself in the Deep web and its dark corners in this classic episode.

Learn more about your ad-choices at https://www.iheartpodcastnetwork.com

Hello, everyone, happy Saturday. Chuck here with another Saturday Selects pick. This week it's How the Deep Web Works, from January. This is a good one. Everyone knows the deep web is deep and dark and scary, or at least it can be, and we dove into that. It's changed a lot over the past six years, but this is a pretty good early peek at the deep web, and I was proud of this one. So give it a listen. I hope you enjoy it. Have a great weekend. Welcome to Stuff You Should Know, a production of iHeartRadio's HowStuffWorks. Hey, and welcome to the podcast. I'm Josh Clark, and there's Charles W. "Chuck" Bryant, and that makes this Stuff You Should Know. That's right, minus Jerry. That's right: we lose a Jerry, gain a Noel. Yeah. One step forward and another step forward? Oh, you've just been wailing on it. I'm not gonna say two steps back with Noel sitting five feet away, but it could be one and one: one step forward with Noel, one step back for not having Jerry. You're saying it's a step forward not having Jerry, and a step forward having Noel. I'm just trying to make everyone like me, and doing a poor job of it. You do a great job of it. Everybody loves the Chuck. Not everybody. I have some mortal enemies. Mortal enemies? Yeah. They're trying to kill you? Well, Chuck. Yes? I will tell you what: if they did want to kill you and they wanted to hire a hit man, the deep web is a good place to start looking. Quite a segue. It's been a while. I teed that one up. You did, unintentionally. Yeah, I spotted it and went after it. This is about both the deep and dark web, which are two different things. The dark web is part of the deep web. Thank you. But the deep web isn't necessarily all dark. Yeah, that's very well put. The dark web is the nefarious things that go on in the deep web. Not necessarily nefarious, but purposefully hidden. Yeah, that's true, because there are some good things in the dark web. I totally misspoke.
Yeah, well, you know what? I think it's great that you confessed to it. You feel better? I do, man. This is a really upfront kind of episode, isn't it? It's very honest. We're baring it all. So, do you have a fancy intro story? No. You'd think I would, right? My intro gets buried later on. It's a great intro. I'll just use it as the intro. Okay, go ahead. Okay. Chuck? Yes. Have you heard of our favorite band, Iron Maiden? Yeah, sure. So Iron Maiden is arguably the most awesome band of all time. Oh, dude. All right, you're not a huge fan, but you wouldn't be like, "I hate Iron Maiden, they suck." Of course not. No, because it'd make you crazy. Right. Iron Maiden's been around for a while. They're pretty smart. They know what they're doing. And recently they figured out a way to maximize their touring dollars. By flying their own plane? Well, Bruce Dickinson always did. Yeah, he's a certified pilot. It's gotta be efficient, I would imagine, plus fun. Unless Bruce was partying too hard and then they've got to fly to the next city that night. He wouldn't do that. I hope not, because that's dangerous. I mean, driving drunk is bad enough, but flying drunk, I can only imagine. Sure. And it's probably not just drunk, you know what I'm saying. No, no, no, he's straight. Has he always been? I don't know. I can't verify that. Well, anyway, Bruce and the boys figured out that a good way to decide where to tour would be to figure out where their music was getting pirated the most. That sounds reasonable. It does sound reasonable. It provides you with evidence of an established fan base, and a fan base that is unwilling to pay for your record but would probably pay to see you live. How do you reason that? They like your music, but they don't want to pay for your CD, so why would they pay to go see you live?
Because it's different. Seeing a live show is way different than buying a CD. You can't get a live show. You could get a video of a live show, but it's still not the same experience. A live show is a live show. Plus, everybody knows, or anybody involved in the old-guard music industry does, that a band doesn't make any money on its records; it makes it on touring. So going to see a band live is also kind of a true act of fandom, because you're contributing directly to the band that you like. So what they did was hire a company to look at BitTorrent sites and find the regions where their music was most pirated, and they created a tour map from it and went and played those regions. Do you have the number one Iron Maiden pirated region? I'm gonna say Rio. All right. They're huge in South America. That's my guess. We'll look it up afterward. I guess Rio. And so they were like, we're gonna start our tour in Rio. Yeah. And it wasn't just that one place; it was basically a tour built on the areas where the music was most pirated. It was a stroke of genius, but they couldn't have done it without harvesting the deep web. When you search for BitTorrent, the average search engine doesn't respond with a list of BitTorrent activity. It'll just send you to a BitTorrent site, which means that those pages of BitTorrent activity, which are web pages, and they do exist, are part of what's called the deep web. That's right. The surface web as we know it, and the search engines we all use, like Google and Bing, supposedly only have access to about point zero three percent of what is truly on the World Wide Web. It's scary and weird and thrilling all at the same time. Point zero three. And anything else that's buried is the deep web.
And the deep web isn't necessarily about purposely trying to hide things. It just may not be cataloged and indexed. It may have a password. Sure. Maybe it's one of those timed sites that don't let you access data after a certain amount of time. It could be anything with a CAPTCHA involved, anything that's not hyperlinked. There are lots of reasons something could find itself buried in the deep web. And you make a good point to separate the deep web and the dark web. Let me give you an example of deep web aside from those BitTorrent sites. There's this company called BrightPlanet that provides deep web harvesting, and they had this primer on what the deep web is. One of the examples they used was: if you look up government grants on a traditional search engine, it will probably give you www.grants.gov as one of the first returns, straight up. When you go onto grants.gov, you can then search and find pages of all these different government grants. You can search by keyword, you can browse, but those pages aren't going to come up on your normal Google search. You have to go to the site, which means that those pages of the actual grants are part of the deep web. Yeah. Your checking account online, if you have mobile banking or online banking, has a web page all its own right now. And if I searched "Chuck Bryant's checking account," it would not come back. I would not get that, because it's behind a password. It's a web page, but it's password protected. Therefore it's part of the deep web. Twitter, until it indexed tweets: you used to not be able to search individual tweets. Now you can, so that made them formerly a part of the deep web.
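The "not hyperlinked" point can be made concrete with a small sketch. What follows is a toy Python model, not how any real search engine crawls: the `SITE` dictionary and its page paths are made up for illustration, and the only assumption is that a basic spider discovers pages by following hyperlinks from a starting page.

```python
from collections import deque

# Hypothetical toy site: each page maps to the pages it hyperlinks to.
SITE = {
    "/": ["/about", "/search"],
    "/about": ["/"],
    "/search": [],              # a search form; results exist only after a query
    "/grants/12345": [],        # reachable only by submitting the search form
    "/account/checking": [],    # sits behind a login
}

def crawl(start, links):
    """Breadth-first crawl that only follows hyperlinks, like a basic spider."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

surface = crawl("/", SITE)      # pages a link-following crawler can find
deep = set(SITE) - surface      # pages that exist but are never discovered

print(sorted(surface))
print(sorted(deep))
```

In this toy model the grant page and the checking-account page exist on the site, but nothing links to them, so the crawler never sees them. That is the sense in which they are "deep" rather than hidden on purpose.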
Actual tweets. Or every company on the planet has some sort of internal employee pages, like internal dot discovery, that only we can access, and you can't Google search any of that stuff. Right. Or somebody could conceivably access it, maybe, it depends on the page, but you'd have to know the exact URL. So the idea is, if search engines are blind to it, it's part of the deep web. If search engines can index it and bring it back as a search result, it's part of the surface web. Yeah, because that's all the search engine is doing. We should maybe do a full podcast on search engines at some point, but the general thing is that there is an index of data, and they use spiders or crawlers, because it is a web, to crawl around and locate domain names and hyperlinks and basically index all that by what they think will be most helpful for what you're looking for. Right. So, "Chuck Bryant's bank account." Yeah. There are some web pages out there that contain information related to that keyword search. Yes. So a search engine will keep an index for that keyword search with the URLs, the locations, some of the page content, the meta tags or metadata, and other very brief, sketchy information about the pages associated with the keyword. Which means that when you type in "Chuck Bryant's bank account"... You've gotta quit saying that. Sorry, I thought about it as I was saying it that last time. But when you type in, um, "Birds of Paradise bank account," the search engine goes and accesses the index. It doesn't have to go all the way across every page on the web that it can find. It just goes to its indices, and that's how search results are returned so quickly. It's not going across the internet. It's already got the spiders, the crawlers, the bots doing that constantly. The search engine is just going to the indexes that the bots have created from their crawls.
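The index lookup described here is usually called an inverted index. Below is a minimal Python sketch of the idea, assuming the crawl has already happened: the `PAGES` dictionary, its URLs, and the helper names `build_index` and `search` are all hypothetical, and real search engines add ranking, stemming, and much more.

```python
from collections import defaultdict

# Hypothetical pages a crawler has already fetched (URL -> page text).
PAGES = {
    "example.com/birds": "birds of paradise photo gallery",
    "example.com/bank": "open a bank account online",
    "example.com/both": "birds of paradise bank account offer",
}

def build_index(pages):
    """Map each keyword to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return URLs containing every query word, using only the index."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))   # copy so the index is not mutated
    for word in words[1:]:
        results &= index.get(word, set())
    return results

index = build_index(PAGES)
print(search(index, "paradise bank account"))
print(search(index, "bank account"))
```

The point of the design is that `search` never touches `PAGES`. The slow work of reading pages is done once, ahead of time, and each query is only a few set lookups, which is why results come back quickly and also why a page that was never crawled can never be returned.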
Yeah, and it is super shallow. I mean, we said point zero three percent. Our whole job is researching online, mainly, and we run into this all the time, where you feel like you're getting a very slim portion of what you're trying to find out, because so many of the best medical journals and things like that don't just pop up. It's more likely to be some headline from CNN.com and not a Harvard medical journal paper that could really help you out. Yeah. And you can get deeper and deeper with your keyword skills and your search skills, but for the most part the first results, depending on what you search for, are going to be, like you said, superficial. Yeah. But even if you're a super sleuth, a Google master like we all think we are, how much can that be bumping it up? Point one? Yeah. Well, a lot of the problem too, Chuck, is that so much of science is behind a paywall. Yeah. Really, really expensive paywalls, too. Which is like, here are the first eight lines of this awesome medical research paper. Exactly. If you want it, pay us. Yeah, which is a problem in and of itself, not necessarily related to this. But with current search engine technology you have, like you said, a superficial result from a query. On the other end of the spectrum, and this is kind of what search engines are dealing with now, is the deeper you go into the deep web. Again, the surface web is point zero three percent of all of the web pages on the entire internet, so the further you go into it, the more data you have, and you eventually run into the problem of what's called big data. That's not capitalized B or D, which refers to companies like Google that can dig up, harvest and maintain a large amount of data. It's basically data that's so massive and so unwieldy you can't even process and search it. It's not even helpful.
Yeah, it's like a really bad internet search. Yeah. So the current state of search engine design is balancing that: figuring out how to get less superficial without running into the big data problem of incoherent data due to just massive amounts of returns. And you might think that these search engines do a great job, because "I can always find what I need." But you don't know what you're missing. Right. So it's not even correct to say "I always find what I need," because you may not even know you need it, because it's hidden. That's true. And you're missing quite a bit. Okay. There are apparently five hundred fifty million registered domains on the internet. Yeah, and I looked, and just in 2012 I think there were only like two hundred and fifty million or something. It seems like it's doubled in the last couple of years. Right. So there are five hundred fifty million domains, a lot more of it garbage, yes, but HowStuffWorks.com, for example, is one domain. And I asked Tracy Wilson, who's the site director and one of the co-hosts of Stuff You Missed in History Class, how many pages there are on HowStuffWorks. She said roughly, at least, so one domain out of five hundred fifty million has fifty pages itself. Right, so you kind of get an idea of the scope. The deep web is anywhere from four hundred to five hundred times bigger than the surface web. And like you said, you don't know what you're missing, because you don't know what's out there, because your search returns aren't bringing anything back. Yeah. There's a lot of important stuff out there. We talked about medical papers. Apparently there are engineering databases, financial information, a lot of things that could really help research, but you just can't find it. Right. Unpublished blog posts, or basically anything that a person creates on the internet. Yeah. If a page is created and isn't indexed, it's part of the deep web.
Yeah. Unless you take this stuff down, it's living there forever, just gathering dust. Exactly. And it's not necessarily just engineering databases or medical information. There's also a lot of shady stuff too. The dark web. That's the dark web. Yeah, the dark web is when the sites intentionally reroute you. Well, we'll get to how they do it, but basically it's intentional anonymity. It's not, "Oh, it just happens to be buried on the deep web because it's not indexed." It is purposely hidden from the surface web, so people can't track the person searching for something or the end website. Those are all just private, essentially. Right. And privacy advocates are way into it. You're not necessarily a child pornographer, although there is a lot of that kind of stuff on the dark web. There's also a lot of good that happens on the dark web. Yeah. The anonymity and privacy, and the desire for it, isn't in and of itself proof of wrongdoing. Of course not. No, though it's frequently pointed to as that, incorrectly. Yeah. Like, "I don't want the NSA in my business." People are like, "Well, what are you doing?" Right. Exactly nothing. Yeah, I just don't want them in my business. Precisely. That's an answer. That answer is good enough. And a lot of people say, "Well, then I need to go to the dark web to maintain anonymity." Or hire a hit man, right, to kill Chuck Bryant. That you could do. That's crazy. There was a site for a while, I don't know if you've heard of it or not, called the Silk Road. Yeah, which got shut down. And I mean, Chuck, I know you've heard of it. It's like the most famous dark website of all time. The Feds busted Ross Ulbricht, who may or may not be Dread Pirate Roberts, which was the online name they said belonged to the guy running it, and he is now saying, "Actually, that's not me."
"But all those bitcoins are mine, so you can't seize those bitcoins." And it's in the courts now. They're trying to determine whether or not it counts as something that you can seize as an asset from a criminal. And they're saying that this is literally a case that no court has ever heard before. Yeah, it's never been questioned whether you could seize cryptocurrency. And you should listen to our podcast on bitcoins, by the way, from not too many months ago. But it's essentially just encrypted digital currency. And they have a really fascinating, circumstantial case against Ulbricht, not just for operating the Silk Road site. Yeah. That's where you could buy drugs and things, by the way. Right, which, being the operator of that in and of itself shouldn't be a crime. I'm sure that they would have prosecuted him for that if they'd been able to get their hands on him for just that. But apparently they also have him for at least two hired contract killings. For one, I guess he hired an undercover cop to do it, and the guy went to the person he was taking the hit out on and said, "This guy's trying to kill you. I need you to cooperate, and I'm going to take pictures of you dead and send them to this guy." And Ulbricht apparently gave him like forty grand up front and another forty after he saw the photos. In bitcoins? No, I think in cash. Although no, it would have been in bitcoins. You're right. Yeah, so who knows? It could have been two bitcoins at the time, or five thousand. Well, Silk Road 2.0 launched in November. Is it out now? It's out. And there are other copycats, like Black Market Reloaded, which went down for a little while after Silk Road went down, but then it went back up.
I think, yeah. I don't know, man. I hate to say you shouldn't try to fight crime, but you're not going to stop this stuff. You cut off the head of one and another grows right back in its place. It's true, if the structure that's allowing for the anonymity can remain intact, which is the dark web. Right. But it's not just the dark web, it's how you traverse the dark web, like using Tor. Yeah, I guess we haven't explained that. The Onion Router, T-O-R, is what it's called, and it is software that you use to access the deep web and the dark web, if you choose to. It searches for these anonymous sites for you, like a search engine, but instead of .com or .org or .net, they end in .onion, the idea being that an onion has many layers. And that's how you access it, through Tor. You have to buy it and install it on your computer? No, they make it available for free. Yeah, Firefox had something that was basically a Tor bundle. It was the most popular one, and you could download it for free. But it's not a web browser itself. It's like an add-on to a web browser that allows anonymity. And it does two things. One, it bounces your trail all over the world from server to server, so it makes you and your activity extraordinarily difficult to track. It's not just "this computer went to this site." Right. That's the whole onion thing. There are so many layers that it's like, we don't know who this is or where they are, what they're doing, or anything like that. We just know that right now there's a user on Silk Road, but we don't know who it is. You can't track them because they're using Tor. The other thing is, you can't get into .onion domain sites, dark websites, unless you're using Tor. They won't let you in unless you're an anonymous user.
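The "many layers" idea can be sketched in a few lines of Python. This is a toy under loud assumptions, not Tor's actual protocol: the relay names, the `KEYS` table, the `example.onion` destination, and the helper functions are all hypothetical, and `seal` is a trivially breakable XOR stand-in for the real per-hop encryption. It only illustrates the structure, in which each relay can open one layer and learns nothing but the next hop.

```python
import base64
import json

# Hypothetical relays, each with its own key. Real Tor negotiates keys per circuit.
KEYS = {"relay-a": b"key-a", "relay-b": b"key-b", "relay-c": b"key-c"}

def seal(layer, key):
    """Toy stand-in for encryption (repeating XOR). NOT secure."""
    raw = json.dumps(layer).encode()
    mixed = bytes(b ^ key[i % len(key)] for i, b in enumerate(raw))
    return base64.b64encode(mixed).decode()

def unseal(blob, key):
    """Reverse of seal(): only the holder of the matching key gets a readable layer."""
    mixed = base64.b64decode(blob)
    raw = bytes(b ^ key[i % len(key)] for i, b in enumerate(mixed))
    return json.loads(raw.decode())

def build_onion(message, destination, route, keys):
    """Wrap the message once per relay, innermost layer first."""
    hop, payload = destination, message
    for relay in reversed(route):
        payload = seal({"next": hop, "data": payload}, keys[relay])
        hop = relay
    return hop, payload          # first relay to contact, and the outermost layer

def peel(relay, blob, keys):
    """What one relay does: open its own layer and read only the next hop."""
    layer = unseal(blob, keys[relay])
    return layer["next"], layer["data"]

hop, blob = build_onion("hello", "example.onion", ["relay-a", "relay-b", "relay-c"], KEYS)
path = []
while hop in KEYS:               # pass the packet along until it leaves the relays
    path.append(hop)
    hop, blob = peel(hop, blob, KEYS)

print(path, hop, blob)
```

In this sketch the first relay knows who sent the packet but sees only "pass this to relay-b", and the last relay sees the destination but not the sender. No single relay holds both ends, which is the property the hosts are describing when they say the trail bounces from server to server.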
So Tor has this kind of twofold thing, but there was recently a breach in it, and it turned out the FBI was using malware to break through the anonymity of Tor users. Yeah, and it found out a lot of people on some sites that were hosted by something called Freedom Hosting, which apparently had a horrible reputation for being the repository on the dark web for child pornography, and knowingly, basically just not doing anything about it. So the FBI hacked the Freedom Hosting servers and inserted this malware. If you went to a Freedom Hosting site, any of them, not necessarily a child pornography site but any site hosted by Freedom Hosting, which is like, say, GoDaddy for the dark web, you would get this malware package that exploited a hole in Firefox's Tor bundle. It went into your computer and said, "Hey, give me your MAC address," which is basically like your computer's hardware serial number, your computer's LAN tracking number, "and then also tell me where the computer is," and it sent that back to a mystery server in McLean, Virginia. And finally, after like a month, the FBI was like, "Yeah, that was us. We have the name and address and everything on everybody who went on that site." So that's been a huge ripple. Firefox fixed this loophole, but it's a huge ripple through the dark web, deep web community, saying, "Whoa, whoa. We were anonymous before, but now it's been shown definitively that the Feds can find out who we are." So the anonymity is reduced, if not taken away, which defeats the whole purpose. Yeah. So if you don't have that, then you can keep lopping the heads off of these things, and they're not going to grow back, because people will be afraid. They won't feel like they're anonymous any longer. Well, Tor has sort of an ironic background, which we will get to right after this message break.
All right, so we're back, and we left you with the nugget that Tor has an interesting background. And the background of Tor is, actually, the US Naval Research Laboratory launched this program in 2003 for political dissidents and whistleblowers, so they could get their message out without fear of reprisal. Right, and this is still a use of Tor. The New York Times, WikiLeaks, and some other news agencies have Tor sites, so if you want to contact the New York Times or WikiLeaks anonymously, you can go to their onion site and upload documents or say, "Hey, I have some information I want to share," and you can do it anonymously. So law enforcement is basically trying to track down criminals who are using the software that the government created to begin with. It's an interesting loop. But like we said, it's not all badness. If you live in a country where bad things are going on and you don't feel safe getting on the regular web as a political dissident, you can do so on the dark web. It offers a virtual meeting place. Sometimes people are trying to combat oppressive regimes in their countries, and they can't just hop on Facebook and organize a meeting, because they'll get smacked down. Right. If you're a person who values privacy, for whatever reason or no reason at all, the deep web and the dark web offer file-sharing services. Email is a big one too. I can't remember the name of the one Edward Snowden has been using, but I think it got shut down. Like, the whole company just shut down. "Sorry, you're out of business now because you're helping Edward Snowden." But there are other email services. Basically, for everything you have on the web, if you want to do it anonymously, you have to go to a company that operates on the dark web, that uses Tor to route its information, or your information. Yeah.
The University of Luxembourg did a study where they tried to rank the most commonly accessed stuff on the dark web, and sadly they did find a lot of things like child pornography. There were also a lot of sites and chat rooms for human rights and freedom of information, and just people who don't want to type in a search for how to grow marijuana and then, the next time they go to their Gmail account, there are a bunch of ads for grow lights, and you're going, "Huh, how'd that happen?" Well, it happened because you're searching the surface web with an IP address that can be traced back to you. And it's not even just illegal activities like that. You want to research a Fitbit bracelet, and then you go online and they say, "Hey, Chuck, are you fat? You want to lose weight? Why else would you want a Fitbit?" And yeah, it's definitely creepy. There's the Big Brother effect. I think everyone feels it. There's also the existence of the deep web, not necessarily the dark web, but just the deep web, all of those pages of information that are out there. Some companies have figured out how to exploit it, or rather the fact that normal search engines aren't doing a good job of looking into the deep web. That company I mentioned, BrightPlanet, has a Deep Web Harvester, which is basically a proprietary search engine algorithm that goes into websites and gets everything. It doesn't just build an index. It grabs every bit of text off of every site associated with a URL. That sounds like big data. It is, but they're doing it for companies like big pharma and big government, and saying, "Oh, you want to know what your competitor's up to? Well, here's every letter of every word of every strip of text on your competitor's website, including all internal stuff, everything. Please give us ten million dollars for that search."
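For a rough sense of what "grabbing every bit of text" off a page involves, here is a small Python sketch built on the standard library's `html.parser`. It is not BrightPlanet's method, which is proprietary and not described in the episode. The `TextHarvester` class, the `harvest_text` helper, and the `SAMPLE` page are hypothetical, and a real harvester would also have to fetch pages, submit search forms, and handle logins.

```python
from html.parser import HTMLParser

class TextHarvester(HTMLParser):
    """Collect every visible text fragment from an HTML document."""

    SKIP = {"script", "style"}   # tags whose contents are code, not readable text

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)

def harvest_text(html):
    """Return the text fragments of a page in document order."""
    parser = TextHarvester()
    parser.feed(html)
    parser.close()
    return parser.chunks

# Hypothetical page, standing in for one result behind a site's search form.
SAMPLE = (
    "<html><head><title>Grants</title><style>p{color:red}</style></head>"
    "<body><h1>Open grants</h1><p>Apply by <b>June</b>.</p>"
    "<script>var x = 1;</script></body></html>"
)

print(harvest_text(SAMPLE))
```

The contrast with an index is that nothing is summarized or keyed by keyword here. Every fragment of text is kept, which is why this approach runs into the volume problem the hosts call big data.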
There's also this site called Vocativ, which uses something like BrightPlanet's deep web harvesting, but it does it for journalism purposes. Basically, rather than searching Google like you or I would for a story idea, they're searching using a deep web harvester to find all this other information that we wouldn't be able to find, because we don't know how to search the deep web, and writing stories from that. And there's some pretty interesting stuff that site's put together already. Well, when you think about it, if you think the internet is cool and you're only getting point zero three percent of it... Yeah. Not bad. And the surface web is getting deeper. The deep web is getting deeper. Search engines are searching deeper, and people are trying to anonymize more effectively. So it's like this cyber war is going on. Oh, yes. You know, that was another good one we did. What did we do, Cyber War? Yeah. I knew I'd heard that before. So there you go. I would have to say that this is one of those episodes where we did it, but it is not done. No, no. Sometimes we do them and it's like, that's it, there's nothing more to say about this topic. Yeah. I'm interested to see what happens with Ulbricht, for sure. That's gonna be a monumental landmark case. If you want to know more about the deep web, you can type "deep web" into the search bar at HowStuffWorks. It'll bring back superficial results, only HowStuffWorks stuff, but it's pretty good, so you'll be happy. And since I said search bar, that means it's time for listener mail. All right, Josh, I'm gonna call this "birthday shout-out that we rarely do." Okay.
"Hey, guys. I'm a longtime listener, shamelessly writing to ask for a huge favor. Here's the sitch. I first became aware of your podcast with my last girlfriend, Natalie, who introduced me to it when we started dating, and I have her to thank for getting me hooked, as we spent a lot of time listening to your show and learning together. As huge supporters of your podcast, we were compelled last year to make the trip up from Virginia to New York when you were putting on your trivia night." And Natalie is the one who gave us the "mics on, pants off" T-shirts, and David's her boyfriend. They were super cool, super nice. They sat at the table right near us, so I got to know him a little bit. And he says, "Anyhow, here's where the favor comes in." She moved to Shanghai, China, to teach. She's teaching little kids English, and sadly they separated when she moved over there, which to me are always the saddest breakups, right? There's nothing wrong, she just moved to China. So they thought it was probably the thing to do. I inquired back to David, emailed him about this, like, "Oh no, you guys broke up?" And he said, "Yeah, but we still really support each other and care about each other, and hopefully our paths cross again one day." So anyway, Natalie is in China, and "because of this distance, I was at a loss when considering what to get her." He made a donation to Cooperative for Education in her name. "And I know you guys like to read the names of people who contribute, but in this case I was hoping you would do a little something more special by wishing her happy birthday." So on January, which I think should be very soon, Natalie, happy birthday. Yeah, happy birthday. We remember you. I wear that shirt all the time. My wife thinks it's funny. And I hope you're doing well in China, and don't give up on David just because he's here in the stupid United States. Her new Chinese boyfriend's like, "What? That guy?" She's like, "Nothing." "But wait, rewind that." So anyway, I hope you're doing well over there in China, and thanks again for all the support, and I hope your paths cross again one day.
That was very nice. That is from David Austin Bury. If you have a special request for Chuck or me or us, you can tweet to us at SYSKPodcast. You can join us on Facebook.com/StuffYouShouldKnow, and if you want to send an email to Chuck, Jerry and me, you can address it to stuffpodcast@howstuffworks.com. Stuff You Should Know is a production of iHeartRadio's HowStuffWorks. For more podcasts from iHeartRadio, visit the iHeartRadio app, Apple Podcasts, or wherever you listen to your favorite shows.
