Demystifying the Darknet

The darknet is sometimes perceived as a dark, shady corner of the internet, accessible only through complex network protocols, that leads to drug markets run by cartels, predators trading illicit content, and so on. The darknet does indeed house some of the most unsavory web content imaginable, but its shape and form are like those of a living, breathing organism. It shifts with the world’s major economies and political influences, surges with media coverage, as seen with recent law enforcement efforts, and unites a unique cross-section of society: those with an interest in things hidden, those with concerns about anonymity, and the possibly unscrupulous who want to transact illegally online.

Here is an in-depth overview of the vast and ever-changing world of hidden services, including answers to some of the most common questions we've heard and a look at what type of content our intelligence analysts are currently finding on the darknet. 

Peer to Peer (I2P) versus Tor

The concept of the darknet has exploded in size and scope over the past few years. A common question we hear from customers is “what exactly is the darknet?” A large percentage of what we currently call the darknet consists of hidden services hosted on Tor. Tor is a network that bounces traffic through a distributed set of relays, run all around the world, in order to conceal the client’s original IP address. The Tor Browser is a custom version of the Firefox browser with the Tor network communications protocol built in. Tor is not very different from peer-to-peer services such as I2P. I2P is a network of routers with many multi-directional inbound and outbound paths, based on a customized version of onion routing called garlic routing. DarkOwl’s Vision crawls and archives both I2P and Tor sites, reducing the need to browse them independently.

Key events affecting the darknet

At the beginning of the year, the total number of connections to Tor, as reported by the Tor Project (metrics.torproject.org), increased with the change of the US Presidential Administration and as more information circulated about how identifiable surfing the Internet may really be, given the spread of adware and targeted advertising.

In May, we watched one of the largest ransomware attacks, WannaCry, take over hospitals and key institutions across the world, nearly shutting down the UK’s NHS. During this time, Tor saw an uptick in connections, spiking by nearly 50%. Darknet usage reached a record high in June. Then, in early July, joint international law enforcement efforts brought down two of the largest darknet markets, AlphaBay and Hansa, and the darknet witnessed a dramatic reduction in direct connections, possibly as paranoia paralyzed many regular people who feared being prosecuted for purchasing goods on the markets. Since July, alternative darknets and non-Tor-based “nets” such as FreeNet, ZeroNet, and the Invisible Internet Project (I2P) have welcomed thousands of new peers and clients.

Since early September, Tor usage has been back up, possibly at the highest levels of 2017, as more darknet users are seemingly undeterred by the threat of law enforcement or the potential risk of de-anonymization posed by Tor.

Figure 1 - Average Numbers of Direct Connections Fluctuating since 1-Jan-2017 (source: metrics.Torproject.org)

Metrics of the darknet

The engine that powers our DarkOwl Vision platform is constantly and intelligently scraping the darknet for new content, using machine learning to categorize the services captured in an effort to understand the shape and feel of its current state. As of the 26th of September, the engine had successfully cataloged almost 62,000 sites across Tor and I2P, a 6.5% increase in the total number of sites since 1 July 2017.

Although Vision has archived well over 60,000 services over time, monitoring the “breadth” of these sites by tagging whether they have been “Recently Updated” becomes an important metric. Over the last three months, the average number of sites tagged as RECENTLY_UPDATED was 12,189, meaning that on any given day roughly 20% of darknet web content is changing, or is considered ACTIVE and ONLINE, and that 20% shifts widely across different darknet sites. August 2017 was the highest month both for new sites found on the darknet and for the average number of sites with RECENTLY_UPDATED content. This also means that Vision is observing significantly more activity across darknets than popular directory services such as DarkWebNews or Fresh Onions (Tor), which report that on average 4,500 Tor hidden services are online.
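The 20% figure follows from the two counts quoted above. As a quick check, the sketch below divides the average RECENTLY_UPDATED count by the rounded total of 62,000 sites; the variable names are ours and are used only for illustration.

```python
# Counts quoted in this report; the total is rounded ("almost 62,000").
total_sites = 62_000
avg_recently_updated = 12_189

# Share of cataloged sites that are active on an average day.
active_share = avg_recently_updated / total_sites
print(f"Share of cataloged sites active on an average day: {active_share:.1%}")
```

With these inputs the share comes out just under one fifth, consistent with the roughly 20% stated in the text.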

Who hosts darknet websites?

Darknet hidden services can easily be hosted on your very own home computer or even a simple Raspberry Pi, but many darknet users prefer to rely on experienced darknet hosting services to provide a webserver for the content they want to create on the darknet. For years, the primary Tor hosting service was Freedom Hosting, which on two occasions (most recently as Freedom Hosting II in spring 2017) was breached and had its site content corrupted for hosting child pornography and human trafficking content. Since the demise of Freedom Hosting, the largest darknet website provider is Daniel’s Hosting Service, which reports that it hosts 1,050 public sites and another 414 hidden sites across Tor, for a total of 1,464 sites.

Two other up-and-coming hosting services are H4x0rzHosting (http://h4x0rz.ddns.net), which reportedly hosts 177 sites across Tor, and Dark System Hosting (http://hostingr3ohiyhph[.]onion), with almost two dozen hidden sites. Both offer services similar to Daniel’s: the client receives their own .onion Tor site address, along with free anonymous web hosting and email support on a Linux Nginx webserver, with or without PHP or SQL support depending on the purpose and complexity of the user’s darknet site.[1]

Languages of the darknet

Most of the darknet is composed of English-language services, but Russian-language (Cyrillic) services make up almost a quarter of the sites captured by the DarkOwl Vision engine. The other 26 languages account for merely 4% of the Vision database, but are growing in number.

Figure 2 - Percentages of English and Russian URLs in DarkOwl Vision

The distribution of eepsites (websites hosted anonymously within the I2P network) and hidden services is widely scattered across international languages. Over the last three months, all languages saw an increase in the total number of darknet services, but there is a surprisingly small number of websites in Arabic and Chinese, with fewer than 25 I2P and Tor sites crawled by the Vision engine in those languages.[2] However, the total number of services in French, German, Portuguese, Estonian, and Italian increased by over 25% from July to September, and Thai and Dutch sites saw the greatest percentage increases in new services (nearly 82% for Dutch and 148% for Thai). In early July, the seizure of Hansa Market by Dutch officials triggered a concentrated international effort to crack down on drug dealing on the darknet, in which numerous Dutch vendors being pursued by law enforcement were listed by alias. This could have contributed to the lower number of Dutch hidden service URLs in July and the increase in URLs over the last 60 days.

Figure 3 - Total number of URLs in DarkOwl Vision engine (excluding Russian and English)

What type of content is on the darknet?

Hidden services on Tor frequently come and go, as criminals often change their onion addresses to avoid apprehension, and many servers are operated out of personal residences, with uptime fluctuating with their operators’ daily schedules. Quantitative analysis gives us an indication of whether we are successfully collecting data that is relevant to our customers, improves the greater understanding of the network, and offers an opportunity to fine-tune collection methodologies. Because of the darknet’s fluid and transient nature, any categorization of it at a given time is ever evolving; however, the majority of darknet sites can be categorized into a couple of dozen subjects, ranging from X-rated content to drugs and social media sites used for communication. The topics we selected for analysis are based on natural language processing and keyword matching algorithms.
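To give a feel for what keyword matching can look like, here is a minimal sketch. It is not DarkOwl's engine: the category names, keyword lists, and the categorize function are all hypothetical, and a production system would combine this with the natural language processing and machine learning mentioned in this report. The only label taken from the report is UNASSIGNED, used when no keyword of interest is found.

```python
import re

# Hypothetical categories and keywords, chosen only for illustration;
# these are not DarkOwl's actual categories or keyword lists.
CATEGORY_KEYWORDS = {
    "DRUGS": {"cannabis", "mdma", "pharmacy"},
    "HACKING": {"exploit", "botnet", "malware"},
    "CRYPTOCURRENCY": {"bitcoin", "wallet", "mixer"},
}

def categorize(page_text: str) -> str:
    """Return the category with the most keyword hits, or UNASSIGNED."""
    # Lowercase the text and split it into simple alphanumeric tokens.
    words = re.findall(r"[a-z0-9]+", page_text.lower())
    scores = {
        category: sum(1 for word in words if word in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    # A page with no keyword hits at all stays uncategorized.
    return best if scores[best] > 0 else "UNASSIGNED"

print(categorize("Bitcoin mixer with a built-in wallet"))
print(categorize("Bienvenue sur mon site"))
```

The second call shows why pages in an unrecognized language, or without any keywords of interest, end up uncategorized under a scheme like this.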

Reviewing the last quarter of darknet collection, the darknet sites we scraped fall into the following distribution:

Figure 4 - Current State of the Darknet - Distribution of All Darknet Sites by Topical Category from July to September 2017

While the chart above assigns each of the sites we’ve visited over the last three months to a category, we also monitor the “breath” of the darknet by tagging sites that have been recently updated, i.e. are online and active. This metric provides insight into where current events and developments are taking place. The following chart shows a breakdown of the most recently updated or modified segments of the darknet, looking only at sites that have uploaded new content over the last five (5) days.
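The five-day rule described above can be sketched as a simple date comparison. This is an illustration under our own assumptions, not the Vision implementation: the site names, timestamps, the is_recently_updated helper, and the STALE label are hypothetical, and only the RECENTLY_UPDATED tag and the five-day window come from this report.

```python
from datetime import datetime, timedelta

RECENT_WINDOW = timedelta(days=5)  # "the last five (5) days" in the text

def is_recently_updated(last_changed: datetime, now: datetime) -> bool:
    """True if new content was seen within the window ending at `now`."""
    return timedelta(0) <= now - last_changed <= RECENT_WINDOW

# Hypothetical crawl records: placeholder site name -> time of last content change.
now = datetime(2017, 10, 11)
crawl = {
    "site-a": datetime(2017, 10, 9),
    "site-b": datetime(2017, 9, 1),
}

# STALE is a made-up label for anything outside the window.
tags = {
    site: "RECENTLY_UPDATED" if is_recently_updated(ts, now) else "STALE"
    for site, ts in crawl.items()
}
print(tags)
```

Counting the RECENTLY_UPDATED tags per category on a given day would yield the kind of distribution shown in the chart.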

Figure 5 - Current State of the Darknet, Distribution of Recently Updated Darknet Sites from July to September 2017

The categorical distribution charts above represent averages of the total number of sites over a period of time, July to September. In a similar fashion, we can take a snapshot of the distribution of darknet activity by category on any given day. For example, on the 11th of October, the engine identified the following percentages of darknet sites, by category, and tagged them as RECENTLY_UPDATED.

Figure 6 - Current State Recently Updated Sites Across the Darknet - Snapshot Taken on 11  October 2017

If we look more closely at the average crawl data and calculate the delta, or percentage change, in the average number of recently updated sites in each category from, say, July to August and from August to September, we can start to infer which topical categories or subjects within the darknet are seeing more activity from month to month.
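The month-to-month delta is the standard percentage change, (new - old) / old x 100. A minimal sketch follows; the monthly averages in it are invented placeholders, not DarkOwl data, and percent_change is a helper name of our own.

```python
def percent_change(old: float, new: float) -> float:
    """Percentage change from `old` to `new`."""
    if old == 0:
        raise ValueError("percentage change is undefined when the old value is 0")
    return (new - old) / old * 100

# Hypothetical monthly averages of recently updated sites per category.
monthly_avg = {
    "File Sharing": {"Jul": 200, "Aug": 300, "Sep": 300},
    "Hacking": {"Jul": 400, "Aug": 500, "Sep": 450},
}

for category, avg in monthly_avg.items():
    jul_aug = percent_change(avg["Jul"], avg["Aug"])
    aug_sep = percent_change(avg["Aug"], avg["Sep"])
    print(f"{category}: Jul-Aug {jul_aug:+.1f}%, Aug-Sep {aug_sep:+.1f}%")
```

A positive value means more recently updated sites on average than the month before; a negative value means fewer.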

Figure 7 - Percentage Change in Average Number of Recently Updated Darknet Sites by Category from July to September 2017

The topical darknet categories of File Sharing (sites for uploading and downloading files), Hosting & Email, and Cryptocurrency-related content had dramatic growth from July to August, but saw little to no activity from August to September. Likewise, sites with Blackhat or hacking-related content were extremely active from July to August, but dropped off significantly across the month of September. We also recognize that some sites are considered “uncategorizable,” meaning the hosted web content did not contain any keywords of interest or was in an unrecognizable language. It is noteworthy for DarkOwl that the total number of UNASSIGNED sites such as these decreased by over 20%, implying that DarkOwl’s engine is getting smarter at topical assignments.

Final thoughts

In this quarter’s analytical darknet review, DarkOwl Vision revealed many items of interest about the shape and feel of the darknet.

  1. September saw the highest usage of Tor relays for 2017
  2. The total number of sites crawled by DarkOwl Vision increased 6.5% since 1 July
  3. On average 12,000 sites were active over the last three months
  4. The primary languages of the darknets are English and Russian, but from July to August, the language with the highest number of new sites was Dutch
  5. Hacking and Counterfeiting content make up the largest percentages of content across this quarter’s most recently updated sites

[1] There are also several other hosting providers not mentioned here for the sake of brevity. A future survey of all darknet providers may be published at a later time.

[2] DarkOwl analysts note that ZeroNet, another anonymous darknet whose integration is in development here at DarkOwl, is popular with Mandarin-speaking darknet users in China, and their content makes up most of the ZeroNet pages. Also, most of the countries that ban the use of Tor and peer-to-peer services are countries where Arabic or Chinese is the primary language.


Curious about something you've read on our blog? Want to learn more? Please reach out. We're more than happy to have a conversation.