Deep and Dark Web Data and Its Impact on Modeling Cybersecurity Risk


Of the numerous quantitative models that attempt to define and quantify the cybersecurity risk to organizations, very few consider risk indicators from the deep and dark web. Using ransomware as a case study, this presentation reviewed the content that exists on these hidden networks, and explored how data from the dark web can serve as an important data point for more comprehensive risk models. Further, Ramesh Elaiyavalli, CTO of DarkOwl, discussed the unique challenges and considerations that must be made when examining dark web data.

For those that would rather read the presentation, we have transcribed it below.

NOTE: Some content has been edited for length and clarity.

Kathy: Thank you, everybody, for joining us today for our webinar: Deep and Dark Web Data and Its Impact on Modeling Cybersecurity Risk. My name is Kathy, and I will be the host for today…And now I’d like to turn it over to our speaker today, Ramesh Elaiyavalli, our Chief Technology Officer here at DarkOwl, to introduce himself and to begin.

Ramesh: Alright! Thank you, Kathy. Appreciate the intro. Hi. Hello. My name is Ramesh. I go by Ramesh Elaiyavalli. I’m the Chief Technology Officer and am responsible for product and technology groups to set the strategic technical vision of DarkOwl, as well as kind of the day to day workings and implementation of our platform, our processes and our people.

So with that, today’s webinar, as Kathy mentioned, is to go over at a high level: what is the darknet and the deep web, and how risk modeling is relevant to dark web data. I will talk a little bit about ransomware as a darknet data multiplier. We’ll also review the security risk frameworks, and some of the stakeholders that need to be engaged as you look at risk modeling, and the application of darknet and deep web data as it relates to modeling and any future quantification efforts of darknet data.

We believe that the deep web and the darknet data have a significant impact in any type of cybersecurity risk modeling.

If you look at the dark web in general, think of it as an iceberg where the tip of the iceberg is the surface web that we all know and use every day. It originated back in the nineties. It was basically browser based, and we all know that a ton of publicly available content is on the surface web, with many types of content ranging from discussion boards to pay sites and so on.

The deep web is, simply put, anything that is not indexed by search engines like Google, and that typically sits behind some type of authentication or requires some type of human intervention. So this is where things like IRC, Telegram, criminal forums, and marketplaces all reside. And that kind of emerged in the mid-nineties.

[This takes us] all the way to darknet, which was founded as part of the Tor Project in 2006. So this is the intentional anonymizing of networks accessible only by a proxy or a specific peer-to-peer protocol. The best example is Tor, also called the Onion. And then we have I2P, ZeroNet, Freenet, Oxen, Yggdrasil, so the list goes on and on with a ton of such networks and protocols that only exist in the darknet. And they have become kind of a very important infrastructure for advanced threat intelligence and long defined risk.

When we talk about darknet data, the data is both diverse as well as dispersed all over the internet, the surface web as well as the dark web. So when you look at the diversity of data, data is available as email addresses or email breaches with passwords, which is really the authentication data. There is domain data, subdomains, IP addresses, tokens, common vulnerabilities, exploits and so on. There is source code available. There is content and text available about a company, which is the chatter across the threat actors. There is critical corporate data, contract and financial information, intellectual property, executive insights, as well as employee activity, phone numbers, PII data, banking data and so on and so forth.

So, as you can see, the data is very diverse. Also, the data is spread and dispersed across various sites that could be transient in nature. There are darknet data places, there are forums that criminals use for discussions, there are image boards or chans, there are blogs on ransomware, there are marketplaces where data is being sold in classifieds, and last but not least, there are Telegram and some of the IRC chatrooms.

Given the diversity and the dispersion of data, we also know that the data is really valuable when the data is at scale. And scale matters more so now than ever before. Why is this? Number one, there is a rapid digitization in our society overall. Everything that is paper and tribal knowledge is becoming a digital asset.

And, with COVID-19, the pandemic has changed the fundamental way in which we work. A lot of hybrid and work-from-home arrangements expose organizations to networks that are only as good as the weakest link. So, there is quite a lot of attack surface that has been exposed with the work-from-home networks and the garden variety wifi protocols that are out there.

The third one is [that] the Ukrainian-Russian conflict has significantly shifted the threat landscape. If you think the Ukraine Russia war is far off from you, think again, because a ton of supply chain risk exists today from vendors that you work with and you partner with. And they are directly impacted because of the war or because of the supply chain issues.

And, number four, there is an unprecedented number of never-before-seen malware and critical zero-day issues in the wild. There is a significant increase in ransomware attacks, and all of this has fundamentally changed the landscape in which we look at darknet. It has gone from a corner of the internet to now center stage. Dark web usage has jumped over 80% in the last three years. There are 2 million active users, if not more, on the Tor browser, and the sheer cost of ransomware was over 20 billion in 2021.

Now, ransomware-as-a-service is a term [increasingly] in vogue. And the threat actors have become very sophisticated in not only attacking and penetrating your organization, but they have the maturity to go after these ransomware-as-a-service providers to make the transaction more professional. You can transact on the internet, on the darknet, and the deep web, where you leverage these initial access brokers and third parties wherever possible. And the consultants would help in the victim negotiations as well as the target qualification, meaning they would know how big your company is, how much you can pay, and what your propensity is [to do so]. How badly do you want to be covering your exposures here? So based on that, they offer a service, which is the ransomware-as-a-service, and these are paid insider threat partners that criminals and threat actors work with.

[Lastly], with the Ukraine conflict, like I mentioned, there’s a fluctuation between the Ukraine conflict and the various international law enforcement operations. We’ve heard about Conti and Cooming and Stormous data, which were available immediately after the invasion. The Happy Blog, for example, returned despite the arrests by the FSB. LockBit, AlphV, Snatch – they all have increased activity. Victim data leaks continue at a very high volume. Conti pretty much disbanded and dispersed into not just one group, but various splinter groups. And such threat actors are directly contacting stakeholders to pressure the victims.

The bottom line is this ransomware as a darknet ecosystem is extremely well-structured. It is operationally very efficient. And the biggest fear is they are running this at scale with ransomware as a service. So this kind of changes the entire threat posture of a lot of companies out there.

And, if you were to be a victim of a ransomware attack… from a customer standpoint, you are completely shut off from your access points. There are messages that prevent you from getting in unless you’re willing to talk to the threat actors and pay the ransom.

Ransomware Shame Site on Tor

Now, [let’s talk about] ransomware as a threat signal and overall as a dataflow lifecycle. You start with a pre-cyber incident, and then there is an initial access where that campaign has been launched. There are then incident responses and negotiations as part of the public announcement over to the post cyber incident management and then the whole attack cycle restarts. So, that’s kind of a quick [overview of the] lifecycle of the entire ransomware threat signal and data flow.

And, 46% of the ransomware victims, unfortunately, have been compromised not once, but multiple times. Over 90% of the data leaks we observed in the last year were attributed in some way or the other to these ransomware actors.

Darknet Ransomware Threat Signal and Data Flow

Now in talking about ransomware, here’s another great example that we tell our customers about: Volvo.

As we all know, Volvo is a very large auto manufacturer. But interestingly, their ransomware attacks did not come from their own compromises, but from their supply chain. It started in November 2021, with Snatch and one of the Chinese Volvo corporations that had a breach. And then it went on to Denso, and then it went on to the Volvo Corp [...] to Mack Defense, over to StrongCo and so on.

So, various subsidiaries of Volvo, such as Mack, Mack Defense, Mack Trucks and so on, were exposed as part of this attack. And these impacts we are observing pretty much up and down the entire supply chain. And there is not just one threat actor, but multiple threat actors that are finding ways, finding vectors, finding threat surfaces to expose and bring down some of the largest companies that are out there, either directly or as part of their supply chain and their vendor relationships.

Now, when you look at the darknet and you look at security risks overall, we talked a little bit about ransomware, but there are other types of threats that you should be worried about. We all know about the phishing attacks and the malspam campaigns, the cyberattacks, all the way from the overt or covert malware, DNS hijacking, data exfiltration, cyber espionage, denial of service attacks, insider threats, and basically any type of information-based reputation attacks. So the types of threats have multiple dimensions, and ransomware has kind of bubbled up to the top. However, there are other threats that you need to pay equal attention to.

And, what are the consequences of these threats? It is data corruption, it is operational downtime, a tremendous amount of financial and revenue loss, regulatory issues and fines, damage to your virtual or physical infrastructure, issues with your shareholders and society as a whole, and the loss of customer confidence and a significant dent in your brand reputation. The consequences of ignoring these threats are significant and threats continue to evolve and [be a] cost concern for various organizations.

Having said that, how do you do threat modeling is not [the exact same as] how you look at risk modeling. Threat modeling is a subset of what you have to think from an overall risk modeling standpoint. Now, are there standards? [What are] the best practices for risk modeling? The good news is that there are some, but the bad news is there are plenty of them. There is no one single overarching standard for risk modeling. So, depending on your use case, depending on your company, your business, your operations, and your exposure to various security and methodologies, you can adopt one or more of these frameworks for your risk modeling.

The stakeholders for such risk modeling would pretty much be everybody in the organization and beyond. It starts with your SOC, your incident response teams, executives, data protection officers, the governance folks, CISOs, IT leadership.

If you are in the insurtech space, it very much applies if you are a broker, an engineer, an underwriter, or a reinsurer. All aspects of insurance underwriting and cyber security assessments need to be worried about risk modeling. It also applies to investors, private equity, and venture capital firms who are looking to fund startups or to do mergers and acquisitions type activity. So all of those decision makers need to be aware of this, including policy makers, security agencies, military decision makers and so on and so forth.

When it comes to risk modeling stakeholders, it is everybody who has some form of decision making capability and they are doing an assessment, they are underwriting the risk in a way. So the NIST really defines the cyber risk assessments as the ones that are used to identify and estimate and prioritize risk across your organization, your operations, your assets and the people that you have within the organization.

One of the things that we are interested in talking about, [and] is a question we get a lot, is how do you quantify risks? At DarkOwl, we spend a lot of time thinking about it, and we have come up with ways, strategies, products, and score models that help us objectify and quantify risk at scale. It’s not an absolute risk metric, but we see a very strong correlation with, and influence on, risk calculations and business decisions based on the exposure of data about you and the company that you represent as it relates to the darknet. So we call these “entities,” which are basically email credentials, domain names, or IP addresses: the set of entities that are easy to tokenize and quantify.
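To make the idea of an entity-based score concrete, here is a minimal sketch in Python. It is not DarkOwl's scoring model, which the talk does not describe in detail. The entity types come from the talk, but the weights, the log scaling, the 0 to 100 range, and the names `WEIGHTS`, `EntityCounts`, and `exposure_score` are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass

# Hypothetical per-entity weights; illustrative values, not from the talk.
WEIGHTS = {"email_credential": 3.0, "domain": 1.0, "ip_address": 1.5}


@dataclass
class EntityCounts:
    """How many exposed entities of each type were observed for one company."""
    email_credential: int = 0
    domain: int = 0
    ip_address: int = 0


def exposure_score(counts: EntityCounts) -> float:
    """Return a toy 0-100 exposure score; a lower score is better."""
    # log1p dampens very large counts so one huge breach does not dominate.
    raw = sum(w * math.log1p(getattr(counts, name)) for name, w in WEIGHTS.items())
    # Squash into the 0..100 range so companies of any size stay comparable.
    return round(100 * (1 - math.exp(-raw / 20)), 1)


small = exposure_score(EntityCounts(email_credential=10, domain=2))
large = exposure_score(EntityCounts(email_credential=5000, domain=40, ip_address=300))
print(small, large)
```

The only properties this sketch guarantees are that a company with no exposed entities scores 0 and that the score never decreases as more entities are exposed. A real model would also weigh the who, where, and when of each exposure.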

Like I mentioned, this model is not basically the threat modeling aspect, but much more. And, you know, you need to give a lot of considerations for all the external and influential factors, which is the who and the where and the when as it relates to getting your data exposed.

So here’s an example of Microsoft, whose overall risk profile, which we call the darknet score, has been trending upwards (pictured below). A lower score is better. So, when your score is going up, that is not a good thing. It could be a result of the amount of leaks that they have, or the documents that are being exposed and how much hackishness is in those documents. So risk quantification with scores is a very important way to measure and assess risk.

Microsoft darknet exposure score (DarkOwl Vision)
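One simple way to check whether a score series like this is trending upwards is to fit a straight line through it and look at the slope. The sketch below does that with an ordinary least-squares fit. The function name `trend_slope` and the sample values are hypothetical, and this is not necessarily how the Vision platform computes trends.

```python
from typing import Sequence


def trend_slope(scores: Sequence[float]) -> float:
    """Least-squares slope of scores over evenly spaced observations.

    A positive slope means exposure is growing from one period to the next.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two observations to estimate a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(scores) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(scores))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


# Hypothetical monthly scores for one company.
monthly_scores = [10.0, 20.0, 30.0]
print(trend_slope(monthly_scores))  # 10.0 points per month
```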

The next one I want to briefly touch on is on an experimental basis. We have Scores 2.0 that we are actively building. We are very excited about these scores, where we have used our own data, which is data from our entities, from our email breaches, credentials and so on, and we believe it has predicted 73% of the breaches overall and 100% of all the four ransomware cases that we analyzed in the past. So here’s an example of a company such as Okta, which is the largest security authentication company out there. And interestingly, their exposure on the darknet was partly due to their leaks and some of their breaches. But more importantly, their biggest supply chain vendor is Sitel, which is a call center company which had access to Okta data. And when Sitel got compromised, that bubbled up to Okta. So we always advise our clients to look carefully at their own company and their own data set, but also to make sure that they are monitoring their supply chain vendors. So this is a perfect example.
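The supply chain point can be sketched in code: a company's effective exposure should rise when a vendor with access to its data is more exposed than the company itself. The example below is an illustration only. The blending rule, the `vendor_weight` parameter, the threshold, and all vendor names and scores are assumptions, not part of any DarkOwl product.

```python
from typing import Dict, List


def effective_exposure(own: float, vendor_scores: Dict[str, float],
                       vendor_weight: float = 0.5) -> float:
    """Blend a company's own score with its most exposed vendor's score.

    If the worst vendor is more exposed than the company, a share of that
    gap (vendor_weight) is added on top of the company's own score.
    """
    worst = max(vendor_scores.values(), default=0.0)
    return own + vendor_weight * max(0.0, worst - own)


def vendors_to_watch(vendor_scores: Dict[str, float], threshold: float) -> List[str]:
    """Return vendor names at or above the threshold, most exposed first."""
    flagged = [name for name, score in vendor_scores.items() if score >= threshold]
    return sorted(flagged, key=lambda name: vendor_scores[name], reverse=True)


# Hypothetical vendors and scores.
vendors = {"call_center": 80.0, "payroll": 35.0, "hosting": 62.0}
print(effective_exposure(40.0, vendors))   # 60.0
print(vendors_to_watch(vendors, 60.0))     # ['call_center', 'hosting']
```

Taking the single worst vendor is a deliberately pessimistic choice that reflects the weakest-link framing. An average over vendors would hide exactly the kind of single compromised supplier described here.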

How do we see the future of quantifying darknet data? Right now is a very critical time, where we need to see a dialog among multiple organizations on what the best methods and the best practices are for quantifying darknet data, and how you do the risk modeling. We would love to see folks getting rid of questionnaires and checklists and, you know, making decisions based on data that is available in the open net or OSINT data.

We advocate for education on darknet and darknet data and how important it is for overall cybersecurity. There is a clear need we see in establishing a common language and a common set of mathematical models, be it the darknet score, or it could be something else. But, we want to see more such quantified risk models that are available in the industry.

There is a need for better understanding of the relationships between not just the threat actors, but between the personal and corporate risks that every company goes through. And [as we showed earlier] – you’ve got to take a closer look at the type of data that is being leaked by some of the ransomware groups and the threat actors. Some of it is because they may want money, but a lot of it is also that they’re trying to build reputation by leaking data.

[We advise that] you take a close look at what data types are being leaked and what the cohorts and the verticals in the industry are talking about. Also, the key question here is this: how do you measure the goodness or the effectiveness of your current cybersecurity risk model? Ask that question often, ask that question early, and ask that question constantly. Which is, is your risk model effective enough and is it good enough?
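One common way to ask whether a risk model is good enough is to backtest it: take past scores, pick a threshold, and compare what the model flagged against what actually happened. The sketch below computes recall (the share of real breaches that were flagged) and precision (the share of flags that were real breaches). The function name `backtest`, the threshold, and the sample history are hypothetical.

```python
from typing import Dict, List, Tuple


def backtest(history: List[Tuple[float, bool]], threshold: float) -> Dict[str, float]:
    """Compare threshold-based flags against known outcomes.

    history holds (score_before_the_period, was_breached) pairs, one per company.
    """
    flagged = [(score, breached) for score, breached in history if score >= threshold]
    breaches = sum(1 for _, breached in history if breached)
    true_positives = sum(1 for _, breached in flagged if breached)
    recall = true_positives / breaches if breaches else 0.0
    precision = true_positives / len(flagged) if flagged else 0.0
    return {"recall": recall, "precision": precision}


# Hypothetical outcomes for four companies.
past = [(80.0, True), (75.0, False), (30.0, True), (20.0, False)]
print(backtest(past, threshold=50.0))  # {'recall': 0.5, 'precision': 0.5}
```

Rerunning a check like this whenever new incident data arrives is one way to ask the question early and often, rather than only once.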

With that, if you want to know more about DarkOwl, please talk to us. Get in touch with us at [email protected]. Or you can follow us on various social media and you can also check us out on our blog or on our website. And if there are any other questions, I’m happy to address them. That’s the end of the presentation.

Kathy: Thank you, Ramesh. We have had a couple of questions come in. So let’s see if we can get to some of them. The first one we have is: Why do I need DarkOwl? Most of the darknet can be accessed by individuals.

Ramesh: It’s a great question. Darknet data can be accessed by any individual or any company for that matter, but I would not recommend doing this at home. The reason being that you’re dealing with data that is extremely sensitive in nature, you are potentially interfacing with criminals and threat actors, and it is a very dangerous place. So one very likely challenge that you would run into is that you may get attacked yourself when you expose yourself and your network, if you try to do it without much expertise.

At DarkOwl, we go to great lengths to make sure that our access to the darknet and our ways of ethically gathering data serve you as a customer, so that you can access data through our platform, with the safety and security that comes with our platform, as opposed to interfacing directly with the threat actors and the criminals. So I would always recommend going through a provider and avoiding direct contact.

Kathy: Great. Thank you. Another question that came in is: I want to access your data. What is the best way for me to do so?

Ramesh: Okay. The best way to access our data. The short answer is it depends. If the use case is that you are a cyber security analyst, or you’re looking for a very specific thing and you want to search the dark web on a limited basis, the best bet would be to leverage our Vision platform. The next step is if you’re a developer and, let’s say, you want to build on an API because you have a platform already built out, or you’re thinking of building a platform, or you’re in the cybersecurity and insurance business and you want to leverage darknet data for those types of use cases. There, we would recommend our API. And by the way, with our API, we offer a Search API, and we offer an Entity API for lookups on email credentials or crypto and so on. We also offer source via API, and we offer entities and searches also via API.

So, there’s a variety of APIs that you can leverage, assuming that you want to be building code and integrating dark data into your platform. And then all the way, if you’re a data science person, you are looking at large amounts of data and big data, right? And you have a data science team that is available. We would do what we call DataFeeds, which are snapshots in time where you can have either our entire dataset or a subset filtered on criteria that you provide. We can also do historic data dumps, and we can take snapshots in time and send them over in a secure transmission to you and your data science team. So it really depends on the use case. The bottom line is you can leverage our Vision UI platform, or you can leverage our API platform, or you can consume our big data via our DataFeeds.

Kathy: Great. Thank you so much…Ramesh, thank you so much for this insightful presentation to our attendees. If you’re interested in learning more about how darknet data applies to your use case, please feel free to request time with us using the link in the chat. We look forward to seeing you at another one of our webinars in the future. Thank you.

Ramesh: Thank you.