DARPA Is Creating a Search Engine to Crawl the Deep Web
As reported by CBS News (video available here), Memex, a powerful new search tool that goes beyond the realm of Google, Yahoo, and Bing, has been launched by DARPA:
This powerful new search engine was developed by DARPA, the U.S. military’s Defense Advanced Research Projects Agency, and was announced last year. Chris White, the inventor of Memex, explains:
“The internet is much, much bigger than people think,” White said. “By some estimates Google, Microsoft Bing, and Yahoo only give us access to around 5% of the content on the Web.” That leaves a lot of room for bad actors to operate freely in the shadows.
White says that Memex goes far beyond the realm of traditional search engines and gives law enforcement a powerful new tool to search the “dark web,” where criminals buy, sell, and advertise in the illegal weapons trade and sex trafficking.
“The easiest way to think about Memex is: How can I make the unseen seen?” said Dan Kaufman, director of the information innovation office at DARPA.
“Most people on the internet are doing benign and good things,” Kaufman said. “But there are parasites that live on there, and we take away their ability to use the internet against us, and make the world a better place.”
According to published reports, including one from Carnegie Mellon University, the NYDA’s Office is one of several law enforcement agencies that have used early versions of Memex software over the past year to find and prosecute human traffickers, who coerce or abduct people (typically women and children) for the purposes of exploitation, sexual or otherwise.
“Memex” (a combination of the words “memory” and “index,” first coined in a 1945 article for The Atlantic) currently includes eight open-source, browser-based search, analysis and data-visualization programs as well as back-end server software that perform complex computations and data analysis.
DARPA has asked researchers to develop advanced web-crawler software to reach sites and resources that have sophisticated crawler defenses. Memex operators would then be able to access the indexed domain-relevant content with much greater precision and ease than is currently possible.
Memex, DARPA says, will be first employed against human trafficking, which, “especially for the commercial sex trade, is a line of business with significant Web presence to attract customers and is relevant to many types of military, law enforcement, and intelligence investigations.”
DARPA says that the dark places online where trafficking occurs enable “a growing industry of modern slavery” that can be stopped with Memex capabilities.
“An index curated for the counter trafficking domain, including labor and sex trafficking, along with configurable interfaces for search and analysis will enable a new opportunity for military, law enforcement, legal, and intelligence actions to be taken against trafficking enterprises,” DARPA’s solicitation announcement reads.
The agency does not explain, though, how DARPA would catch traffickers without “deanonymizing” someone. Nor does it address how far it is willing to go in outing people who hide in the deep web for legitimate reasons, such as journalists, whistleblowers, and activists.