How the Deep Web Works

Published Jan 23, 2014, 2:00 PM

Perhaps you didn't realize that when you search the web, you're only skimming the surface. In fact, the web pages that turn up in your search engine results represent a mere fraction of the total web. Immerse yourself in the deep web and its dark corners in this episode.


Welcome to Stuff You Should Know from HowStuffWorks.com. Hey, and welcome to the podcast. I'm Josh Clark, and there's Charles W. "Chuck" Bryant. Yeah, howdy. And that makes this Stuff You Should Know. That's right, minus Jerry. But with Noel. That's right, we lose a Jerry, gain a Noel. One step forward and another step forward. Oh, poor Jerry. You've just been wailing on her. Well, I'm not gonna say two steps back with Noel sitting five feet away, but it could be one and one. One step forward with Noel, one step back for not having Jerry. You're saying it's a step forward not having Jerry and a step forward having Noel. I'm just trying to make everyone like me. I'm doing a poor job of it. You do a great job of it. Everybody loves the Chuck. Not everybody. Who doesn't? I have some mortal enemies. Yeah, they want to kill you? Well, Chuck. Yes? I will tell you what: if they did want to kill you and they wanted to hire a hit man, the deep web is a good place to start looking. Quite a segue. I set that one up unintentionally. Yeah, I spotted it and went after it. Yeah. This is about both the deep and dark web, which are two different things. The dark web is part of the deep web. Thank you. But the deep web isn't necessarily dark. Yeah, that's very well put. The dark web is the nefarious things that go on in the deep web. And not necessarily nefarious, but purposefully hidden. Yeah, because there are some good things in the dark web. I totally misspoke. Yeah, well, you know what? I think it's great that you confessed to it. You feel better, don't you? Man, this is a really upfront kind of episode, isn't it? It's very honest. We're baring it all, man. So, do you have a fancy intro story? No? You'd think I would, right? My intro gets buried later on. It's a great intro. I'll use it as the intro. Okay, go ahead. Okay, Chuck. Yes? Have you heard of our favorite band, Iron Maiden? Yeah, sure. So,
Iron Maiden is arguably the most awesome band of all time. Oh dude, all right. I'm not a huge fan, but you wouldn't be like, "I hate Iron Maiden, they suck." Of course not, no, because it'd make you crazy. That's right. Iron Maiden has been around for a while. They're pretty smart. They know what they're doing. And recently they figured out a way to maximize their touring dollars. By flying their own plane? Well, Bruce Dickinson always did. He's a certified pilot. It's gotta be efficient, I would imagine. Plus fun, unless Bruce was partying too hard and then they've got to fly to the next city that night. He wouldn't do that. I hope not, because that's dangerous. I mean, driving drunk is bad enough, but flying drunk, I can only imagine. Sure. And it's probably not just drunk, you know what I'm saying? No, no, no, he's straight. Has he always been? I don't know. I can't verify that. Well, anyway, Bruce and the boys figured out that a good way to decide where to tour would be to figure out where their music was getting pirated the most. That sounds reasonable. It does sound reasonable. It provides you with evidence of an established fan base, and a fan base that is unwilling to pay for your record but would probably pay to see you live. How does that reason? Well, they like your music, but they don't want to pay for your CD, so why would they go see you live and pay? Because it's different. Seeing a live show is way different than buying a CD. You can't get a live show. You could get a video of a live show, but it's still not the same experience. A live show is a live show. Plus, everybody knows that in the very entrenched, old-guard music industry, a band doesn't make any money on its records. They make it on touring.
So going to see a band live is also kind of a true act of fandom, because you're contributing directly to the band that you like, you know? So what they did was they hired a company to look at BitTorrent sites and find the regions where their music was most pirated, and they created a tour map from it and went and played those regions. Do you have the number one Iron Maiden pirated region? I'm gonna say Rio. All right, they're huge in South America. That's my guess. We'll look it up afterwards, I guess. And so they were like, "We're gonna start our tour in Rio." Yeah, and it wasn't just that one place, but it was basically a tour that was built on the areas where the music was most pirated. It was a stroke of genius. But they couldn't have done it without harvesting the deep web, because when you search "BitTorrent," the average search engine doesn't respond with a list of BitTorrent activity. It'll just send you to a BitTorrent site, which means that those pages of BitTorrent activity, which are web pages and do exist, are part of what's called the deep web. That's right. The surface web as we know it, and the search engines that we all use, like Google and Bing, supposedly only have access to about point zero three percent of what is truly on the World Wide Web. It's scary and weird and thrilling all at the same time. Point zero three, yeah, and everything else that's buried is the deep web. And the deep web is not necessarily when you're purposely trying to hide things. It just may not be cataloged and indexed. A password? Sure. Maybe one of those timed sites that don't let you access data after a certain amount of time. It could be anything with a CAPTCHA involved, anything that's not hyperlinked. There are lots of reasons that something could find itself buried in the deep web. Right. And you make a good point to separate the deep web and the dark web.
So let me give you an example of the deep web aside from those BitTorrent sites. There's this company called BrightPlanet that provides deep web harvesting, and they had this primer on what the deep web is. One of the examples they used was: if you look up government grants on a traditional search engine, it will probably provide you with www.grants.gov as one of the first returns. Right. When you go on to grants.gov, you can then search and find pages of all these different government grants. You can search by keyword, you can browse, but those pages aren't going to come up on your normal Google search. You have to go to the site, which means that those pages of the actual grants are part of the deep web. Your checking account online, if you have mobile banking or online banking, has a web page all its own right now. And if I searched "Chuck Bryant's checking account," it would not come back. I would not get that, because it's behind a password. It's a web page, but it's password protected. Therefore it's part of the deep web. Twitter: until it indexed tweets, you couldn't search individual tweets. Now you can, so that made them formerly a part of the deep web. Actual tweets. Or every company on the planet has some sort of internal employee pages, like internal.discovery, that only we can access, and you can't Google search any of that stuff. Right. Or somebody could conceivably access it, maybe, depending on the page, but you'd have to know the exact URL. So the idea is, if search engines are blind to it, it's part of the deep web. If search engines can index it and bring it back as a search result, it's part of the surface web. Yeah, because that's all the search engine is doing. We might do a full podcast on search engines at some point.
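The surface/deep split described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any real search engine crawls: the `TOY_WEB` dictionary, the `.example` URLs, and the `crawl` function are all hypothetical. It shows a crawler that only follows hyperlinks and stops at logins, so a page produced by a site's own search form, or a page behind a password, never gets indexed.

```python
from collections import deque

# Hypothetical miniature web: each URL maps to its outgoing links and
# whether the page sits behind a login. None of these are real pages.
TOY_WEB = {
    "grants.example/": {"links": ["grants.example/about"], "login": False},
    "grants.example/about": {"links": [], "login": False},
    # Produced only by the site's own search form, so nothing links to it.
    "grants.example/search?q=education": {"links": [], "login": False},
    "bank.example/": {"links": ["bank.example/account"], "login": False},
    "bank.example/account": {"links": [], "login": True},
}


def crawl(seeds):
    """Follow hyperlinks breadth-first and return the pages a crawler could index."""
    indexed = set()
    queue = deque(seeds)
    while queue:
        url = queue.popleft()
        page = TOY_WEB.get(url)
        # Skip unknown pages, pages already seen, and pages behind a login.
        if page is None or url in indexed or page["login"]:
            continue
        indexed.add(url)
        queue.extend(page["links"])
    return indexed


surface = crawl(["grants.example/", "bank.example/"])
deep = set(TOY_WEB) - surface  # everything the crawler never reached

print("surface:", sorted(surface))
print("deep:", sorted(deep))
```

In this sketch the grant search page and the bank account page both exist, but they end up in `deep` because the crawler has no link to the first and no password for the second.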
But the general thing is that there's an index of data, and they use spiders or crawlers, because it is a web, to crawl around and locate domain names and hyperlinks and basically index all that in what they think will be most helpful to what you're looking for. Right. So, "Chuck Bryant's bank account": there are some web pages out there that contain information related to that keyword search. So a search engine will keep an index for that keyword search with the URLs, the locations, some of the page content, the meta tags or metadata, and other very brief, sketchy information about those pages associated with the keyword, which means that when you type in "Chuck Bryant's bank account"... sorry, I thought about it as I was saying it that last time. But when you type in, um, "birds of paradise bank account," the search engine goes and accesses the index. It doesn't have to go all the way across every page on the web that it can find. It just goes to its indices, and that's how search results are returned so quickly. It's not going across the internet. It's already got the spiders, the crawlers, the bots doing that constantly. The search engine is just going to the indices that the bots have created from their searches. Yeah, and it is super shallow. I mean, we said point zero three percent. Our whole job is researching online, mainly, and we run into this all the time, where you feel like you're getting a very slim portion of what you're trying to find out, because so many of the best medical journals and things like that don't just pop up. It's more likely to be some headline from CNN.com and not a Harvard medical journal paper that could really help you out. Yeah.
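The point that a search engine answers from its index rather than re-reading the web can be illustrated with a minimal inverted index. This is only a sketch under assumptions: the `PAGES` dictionary stands in for whatever the crawlers collected, the `.example` URLs are made up, and real engines store far more than a word-to-URL map.

```python
from collections import defaultdict

# Hypothetical crawled pages (URL -> page text); not real sites.
PAGES = {
    "birds.example/paradise": "birds of paradise live in new guinea",
    "bank.example/help": "how to open a bank account",
    "news.example/story": "birds of paradise bank account story",
}


def build_index(pages):
    """Build the index once, ahead of time: word -> set of URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Answer a query from the index alone, never touching the pages again."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


INDEX = build_index(PAGES)
print(sorted(search(INDEX, "birds of paradise bank account")))
```

A query is just a few set lookups and intersections, which is why results come back quickly. It also shows the limit discussed above: a page that was never crawled is not in `INDEX`, so no query can return it.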
And I mean, you can get deeper and deeper with your keyword skills and your search skills, but for the most part, yeah, the first results, depending on what you search for, are gonna be, like you said, superficial. Yeah. But even if you're a super sleuth, a Google master like we all think we are, I mean, how much can that be bumping it up? Point one? Yeah. Well, a lot of the problem too, though, Chuck, is that so much of science is behind a paywall. Really, really expensive paywalls, too. Which is like, "Here's the first eight lines of this awesome medical research paper." Exactly. "If you want it, give us..." Yeah, which is a problem in and of itself, not necessarily related to this. But with current search engine technology, you have, like you said, a superficial result from a query. On the other end of the spectrum, and this is kind of what search engines are dealing with now, is the deeper you go into the deep web. Again, the surface web is point zero three percent of all of the web pages on the entire Internet. So the further you go into it, the more data you have, and you eventually can run into the problem of what's called big data, not capitalized B or D, which refers to companies like Google that can dig and harvest and maintain a large amount of data. Yes, it's basically data that's so much and so unwieldy you can't even process and search it. It's not even helpful. Yeah, it's like a really bad Internet search. Yeah. So the current state of search engine design is balancing that: figuring out how to get less superficial without running into the big data problem of incoherent data due to just massive amounts of returns. And you might think that these search engines do a great job because "I can always find out what I need."
But you don't know what you're missing, you know? Right. So it's sort of not even correct to say, "I always find out what I need." You don't even know you need it, because it's hidden. That's true. And I mean, you're missing quite a bit. Like, okay, there are apparently five hundred fifty million registered domains on the Internet. Yeah, and I looked, and just in two thousand twelve, I think there were only like two hundred and fifty million or something. I mean, it seems like it's doubled in the last couple of years. Right. So there's five hundred fifty million domains, for example. A lot more garbage. Yes. But HowStuffWorks.com is one domain. And I asked Tracy Wilson, who's the site director and one of the co-hosts of Stuff You Missed in History Class, how many pages there are on HowStuffWorks. She said roughly fifty thousand at least. So one domain out of five hundred fifty million has fifty thousand pages itself. Right. So you kind of get an idea of the scale. The deep web is anywhere from four hundred to five hundred times bigger than the surface web. And like you said, you don't know what you're missing, because you don't know what's out there, because your search returns aren't bringing you back anything. Yeah. I mean, there's a lot of important stuff out there. We talked about medical papers. Apparently there are engineering databases, financial information, a lot of things that could really help research, but you just can't find it. Right. Unpublished blog posts, just basically anything that a person creates on the Internet. Yeah. If a page is created, it's part of the deep web. Yeah. Unless you take this stuff down, it's living there forever, just gathering dust. Exactly. And it's not just necessarily engineering databases or medical information. There's also a lot of shady stuff too. The dark web. That's the dark web. Yeah, that is the dark web. It's when these sites intentionally reroute you. Well, we'll get to how they do it.
But basically it's an intentional anonymity. It's not, "Oh, it just happens to be buried on the deep web because it's not indexed." It is purposely hidden from the surface web, so people can't track the person searching for something, or the end website. I guess those are all just private, essentially. Right, and privacy advocates are way into it. You're not necessarily a child pornographer, although there is a lot of that kind of stuff on the dark web. There's also a lot of good that happens on the dark web. Yeah. The anonymity and privacy, and the desire for it, isn't in and of itself proof of wrongdoing. Of course not, no. Which it's frequently pointed out as, but incorrectly. Yeah, like, "I don't want the NSA in my business." People are like, "Well, what are you doing?" Right. Exactly. "Nothing." Yeah, "I just don't want them in my business." Yeah. That's an answer that's good enough. That answer is good enough. And a lot of people say, "Well, then I need to go to the dark web to maintain anonymity." Or hire a hit man. Right, to kill Chuck Bryant. That you could do. That's crazy. You could. There was a site for a while, I don't know if you've heard of it or not, called Silk Road. Yeah, which got shut down. I'm teasing, Chuck, I know you've heard of it. It's like the most famous dark website of all time. Yeah. The Feds busted Ross Ulbricht, who may or may not be Dread Pirate Roberts, which was the online name of the guy they said was running this. And he is now saying, "Actually, that's not me, but all those bitcoins are mine, so you can't seize those bitcoins." And it's in the courts now. They're trying to determine whether or not it counts as something that you can seize as an asset from a criminal. And they were saying that this is literally a case that no court has ever heard before. Yeah, it's never been questioned whether you could seize cryptocurrency.
Yeah, and you should listen to our podcast on bitcoins, by the way, from not too many months ago. But it's essentially just encrypted digital currency. And they have a really, really fascinating circumstantial case against Ulbricht, not just for operating the Silk Road site. That's where you could buy drugs and things, by the way. Right. Which, being the operator of that in and of itself shouldn't be a crime. I'm sure that they would have prosecuted him for that if they'd been able to get their hands on him for just that. But apparently they also have him for at least two hired contract killings. In one, he, I guess, hired an undercover cop to do it, and the guy went to the person he was taking the hit out on and said, "This guy's trying to kill you. I need you to cooperate, and I'm going to take pictures of you dead and send them to this guy." And Ulbricht apparently gave him like forty grand up front and another forty after he saw the photos. So, like, in bitcoins? No, I think in cash. Although, no, it would have been in bitcoins. You're right. Yeah, so who knows? It could have been two bitcoins at the time, or five. Well, Silk Road 2.0 launched in November. It's out, and there are other copycats like Black Market Reloaded, which went down for a little while after Silk Road went down, but then it went back up, I think. Yeah. I don't know, man. I hate to say you shouldn't try and fight crime, but you're not going to stop this stuff. You cut off the head of one and another grows right back in its place. You know, it's true if the structure that's allowing for the anonymity can remain intact, which is the dark web. Right. But it's not just the dark web, it's how you traverse the dark web, like using Tor. Yeah, I guess we haven't explained.
The Onion Router, T-O-R, is what it's called, and it is software that you use to access the deep web and the dark web if you choose to. It searches for these anonymous sites for you like a search engine, but instead of .com or .org or .net, they end in .onion. Right. The idea is an onion has many layers, and that's how you access it, through Tor. You have to install it on your computer. Yeah, Firefox had something that was basically a Tor bundle. It was the most popular one, and you could download it for free. But it's not a web browser itself. It's like an add-on to a web browser that allows anonymity, and it does two things. One, it bounces your trail all over the world from server to server, so it makes you and your activity extraordinarily difficult to track. It's not just like, "This computer went to this site." That's that whole onion thing. There are so many layers. It's like, "We don't know who this is or where they are, what they're doing, or anything like that. We just know that right now there's a user on Silk Road, but we don't know who it is or anything." You can't track them because they're using Tor. The other thing is that you can't get into .onion sites, dark websites, unless you're using Tor. They won't let you in unless you're an anonymous user. So Tor has this kind of twofold thing. But there was recently a breach in it, and it turned out the FBI was using malware to break through the anonymity of Tor users. Yeah, and they found out a lot about people on some sites that were hosted by something called Freedom Hosting, which apparently had a horrible reputation for being the repository on the dark web for child pornography and knowingly, basically, just not doing anything about it. So the FBI hacked the Freedom Hosting servers and inserted this malware.
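The "many layers" idea behind onion routing can be sketched as nested wrapping: the sender wraps a message once per relay, and each relay can remove only its own layer, learning nothing but the next hop. The following is a toy illustration under loud assumptions: the relay names, keys, and the `wrap` and `peel` helpers are hypothetical, and the XOR keystream built from `hashlib.sha256` is a stand-in that is not secure and is not what Tor actually uses.

```python
import hashlib
import json


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream from SHA-256 in counter mode. NOT real cryptography."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor_layer(key: bytes, data: bytes) -> bytes:
    """XOR with the keystream; the same call adds or removes a layer."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(message: str, route):
    """Wrap the message once per relay, innermost layer for the last relay."""
    next_hop, data = "destination", message
    onion = b""
    for name, key in reversed(route):
        plaintext = json.dumps({"next": next_hop, "data": data}).encode()
        onion = xor_layer(key, plaintext)
        next_hop, data = name, onion.hex()
    return onion


def peel(key: bytes, onion: bytes):
    """A relay removes its own layer and sees only the next hop and a payload."""
    layer = json.loads(xor_layer(key, onion).decode())
    return layer["next"], layer["data"]


# Hypothetical three-relay route with made-up keys.
ROUTE = [("entry", b"key-entry"), ("middle", b"key-middle"), ("exit", b"key-exit")]

onion = wrap("hello from the surface", ROUTE)
hops, data, delivered = [], onion, None
for name, key in ROUTE:
    next_hop, payload = peel(key, data)
    hops.append((name, next_hop))
    if next_hop == "destination":
        delivered = payload  # only the last relay sees the message itself
    else:
        data = bytes.fromhex(payload)  # still wrapped for the next relay

print(hops)
print(delivered)
```

In this sketch the entry relay knows who sent the onion but sees only "middle" as the next hop, while the exit relay sees the message but not where it started. That separation is the intuition behind the "we don't know who this is" description above.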
So if you went to a Freedom Hosting site, any of them, not just necessarily child pornography, but any site hosted by Freedom Hosting, which is like, say, GoDaddy for the dark web, you would get this malware package that exploited a key hole in Firefox's Tor bundle. It went into your computer and said, "Hey, give me your MAC address," which is basically like your computer hardware's serial number, your computer's own tracking number, "and then also tell me where the computer is," and it sent it back to a mystery server in McLean, Virginia. And finally, after like a month, the FBI was like, "Yeah, that was us. We have the name and address and everything on everybody who went on those sites." So that's sent a huge ripple, and Firefox fixed this loophole, but it's sent a huge ripple through, you know, the dark web, deep web community, saying like, "Whoa, whoa. We were anonymous before, but now it's been shown definitively that the Feds can find out who we are." So the anonymity is reduced, if not taken away, which defeats the whole purpose. Yeah, so if you don't have that, then you can keep lopping the heads off of these things, and they're not going to grow back, because people will be afraid, because they won't feel like they're anonymous any longer. Well, Tor has sort of an ironic background, which we will get to right after this message break. All right, so we're back, and we left you with the nugget that Tor has an interesting background. And the background of Tor is, actually, the US Naval Research Laboratory in two thousand three launched this program for political dissidents and whistleblowers so they can get their message out without fear of reprisal.
Right, and this is still a use of Tor. The New York Times, WikiLeaks, and some other news agencies have Tor sites, so if you want to go and contact The New York Times or WikiLeaks anonymously, you can go to their onion site and upload documents or say, "Hey, I have some information I want to share," and you can do it anonymously. So the government, though, basically law enforcement, is trying to track down criminals using the software that the government created to begin with. So it's an interesting loop. But like we said, it's not all badness. If you live in a country where bad things are going on and you don't feel safe getting on the regular web as a political dissident, you can do so on the dark web. It offers a virtual meeting place. Sometimes people are trying to, you know, combat these oppressive regimes in their countries, and they can't just hop on Facebook and organize a meeting, because they'll get smacked down. Right. If you're a person who values privacy for whatever reason, or no reason at all, the deep web and the dark web offer file sharing services. Email is a big one too. I can't remember the name of the one Edward Snowden has been using, but it got shut down. Just the whole company shut down: "Sorry, you're out of business now because you're helping Edward Snowden." But there are other email services. Basically, for everything you have on the web, if you want to do it anonymously, you have to go to a company that operates on the dark web and uses Tor to route its information, or your information. Yeah. The University of Luxembourg did a study where they tried to rank the most commonly accessed stuff on the dark web, and sadly, they did find a lot of things like child pornography.
There were also a lot of sites and chat rooms for human rights and freedom of information, and just people that don't want to type in a search for how to grow marijuana and then, the next time they go to their Gmail account, there are a bunch of ads for grow lights, and you're going, "Huh, how'd that happen?" Well, it happened because you're searching the surface web with an IP that can be traced back to you. And not even illegal activities like that. You know, you want to research a Fitbit bracelet, and then you go and they say, "Hey, Chuck, are you fat? You want to lose weight? Why else would you want a Fitbit?" It's definitely creepy. You know, there's the Big Brother effect. I think everyone feels it. There's also the existence of the deep web, not necessarily the dark web, but just the deep web, all of those pages of information that are out there. Some companies have figured out how to exploit it, or the fact that normal search engines aren't doing a good job of looking into the deep web. That company BrightPlanet I mentioned, they have a deep web harvester, which is basically a proprietary search engine algorithm that goes into websites and gets everything. It doesn't form an index. It grabs every bit of text off of every site associated with a URL. Sounds like big data. It is, but they're doing it for companies like big pharma, big government, and saying like, "Oh, you want to know what your competitor's up to? Well, here's every letter of every word of every strip of text on your competitor's website, including all internal stuff, everything. Please give us ten million dollars for that search."
There's also this site called Vocativ, which uses something like BrightPlanet's deep web harvesting, but it does it for journalism purposes. Basically, rather than searching using Google, like you or I would for a story idea, they're searching using a deep web harvester to find all this other information that we wouldn't be able to find because we don't know how to search the deep web, and writing stories like that. And there's some pretty interesting stuff that that site's put together already. When you think about it, if you think the Internet is cool and you're only getting point zero three percent of it, yeah, not bad. And, you know, the surface web is getting deeper, the deep web is getting deeper. Search engines are searching deeper, and they're trying to anonymize more effectively. So it's like this cyber war is going on. Oh yes, you know, that was another good one we did. What, did we do one on cyber war? Yeah. I knew I'd heard that before. So there you go. I would have to say that this is one of those episodes where we did it, but it is not done. No, no. Sometimes we do them and it's like, that's it, there's nothing more to say about this topic. Yeah, I'm interested to see what happens with Ulbricht, for sure. That's gonna be a big landmark case. You know, if you want to know more about the deep web, you can type "deep web" into the search bar at HowStuffWorks. It'll bring back superficial results, only HowStuffWorks stuff. But it's pretty good, so you'll be happy. And since I said search bar, that means it's time for listener mail. All right, Josh, I'm gonna call this a birthday shout-out, which we rarely do. Okay. "Hey guys, I'm a longtime listener shamelessly writing to ask for a huge favor. Here's the sitch. I first became aware of your podcast when my last girlfriend, Natalie David, introduced me to it when we started dating, and I have her to
thank for getting hooked, as we spent a lot of time listening to your show and learning together. As huge supporters of your podcast, we were compelled last year to make the trip up from Virginia to New York when you were putting on your trivia night." And Natalie is the one who gave us the "mics on, pants off" T-shirts, and David's her boyfriend. They were super cool, super nice. They sat at the table right near us, so I, you know, got to know them a little bit. And he says, "Anyhow, here's where the favor comes in. She moved to Shanghai, China to teach," and she's teaching little kids English, and sadly they, you know, separated when she moved over there. Which to me are always the saddest breakups, right? Like, there's nothing wrong. She went to China, so they just thought it was probably the thing to do. Because I inquired back to David, emailed him about this, and was like, "Oh no, you guys broke up?" And he said, "Yeah, but we still really support each other and care about each other, and hopefully our paths cross again one day." So anyway, "Natalie David is in China, and because of this distance, I was at a loss when considering what to get her." He made a donation to Cooperative for Education in her name. "And I know you guys like to read those names of people who contribute, but in this case, I was hoping you would just do a little something more special by wishing her happy birthday." So in January, which I think should be very soon, Natalie, happy birthday. Yeah, happy birthday. We remember you. I wear that shirt all the time. Oh, I think it's funny. And I hope you're doing well in China. And don't give up on David just because he's here in the stupid United States. Her new Chinese boyfriend's like, "What? That guy? Wait, rewind that." So anyway, I hope you're doing well over there in China, and thanks again for all the support. And I hope your paths cross again one day. That was very nice. That is from David Austin Bury.
If you have a special request for Chuck or me or us, you can tweet to us at @SYSKPodcast. You can join us on Facebook.com/StuffYouShouldKnow. You can send us an email to StuffPodcast@discovery.com, and you can, as always, join us at our home on the web, StuffYouShouldKnow.com. For more on this and thousands of other topics, visit HowStuffWorks.com. This episode of Stuff You Should Know is brought to you by Lynda.com. Lynda.com offers thousands of engaging, easy-to-follow video tutorials taught by industry experts to help you learn software, creative, and business skills. Membership starts at twenty-five dollars a month and provides unlimited 24/7 access. Try Lynda.com free for seven days by visiting Lynda.com/sysk.
