[Webinar Transcription] Track Your Relative Risk on the Darknet

May 16, 2023

With cyberattacks increasingly on the rise, organizations need better intelligence to safeguard themselves, employees and customers from incidents such as data breaches and ransomware attacks. This rise in illicit cyber activity only increases the need to protect against and determine the likelihood of these attacks.

Cue DarkSonar – DarkOwl’s latest product that serves as a relative risk rating that considers the nature, extent and severity of credential leakage on the darknet to provide a company with a signal that acts as a measurement for a company’s exposure.

In this webinar, attendees:

Reviewed the latest stats around the growth of cyberattacks
Learned why modeling risk is essential for all organizations of any size
Learned how DarkSonar can inform threat modeling, third party risk management, and cyber insurance
Saw first hand how DarkSonar can potentially predict the likelihood of cyberattacks

For those that would rather read the presentation, we have transcribed it below.

NOTE: Some content has been edited for length and clarity.

Kathy: I’d like to thank everyone for joining today’s webinar, Tracking Relative Cyber Risk on the Darknet. My name is Kathy. I will be your host. If you have any issues with hearing the audio or seeing the slides during the presentation, please feel free to ping me privately in the Zoom chat function or email me directly. Now I’d like to turn it over to today’s speaker, Ramesh, our Chief Technology Officer here at DarkOwl to introduce himself and to begin.

Ramesh: Great, thank you, Kathy. Good morning, good afternoon, good evening, everybody, wherever you are. So today I want to go over some very exciting innovation that we’re doing at DarkOwl as it relates to risk modeling. The topic for today is track your relative risk on the darknet. We’ll go over that in the next 35 to 40 minutes. Just a little bit about myself: I am the CTO at DarkOwl. I have over 25 years of software engineering and technology background, worked in a lot of firms as it relates to data risk mitigation, risk modeling, big data, real-time communications, and so on. So today’s agenda we’re gonna cover what is the darknet or the dark web, and where does dark, we’ll share some metrics and statistics about the cyberattacks and the growth of them over the last several years, what is risk modeling and why is it essential for your company and every organization that you partner with.

We have launched a new product, DarkSonar, which is a very interesting way to notify you about threat vectors and threat modeling, third party risk management, and if you are in the cybersecurity insurance business, this would be a very important topic as to how to quantify risk. And last, we will see some of the case studies and some of the firsthand insights that we have been gathering at DarkOwl on how DarkSonar could improve the likelihood of the prediction aspects as it relates to cyberattacks.

Darkweb 101

Okay, so without further ado, let me get started with what is the darknet. There are a lot of different terms that people throw around – some use darknet or dark web.

Essentially, these are the ones that you see in the bottom. So we go bottom to top. The darknet is a group of anonymized networks. They require proxies, and p2p type networks. They are specifically chosen by threat actors to hide their activity and they’re making a concerted effort to be a part of these networks. So that’s where you see the traditional ones, which are the onion or tor browser, I2P, ZeroNet. And there is a whole host of newer networks that get added every now and then. So that is truly the traditional darknet. But then what we have seen in the recent years is there is also a lot of activity in the deep web and the surface web. So, deep web is defined as anything that is behind an authentication wall, meaning anything that is behind a user ID and a password.

So that is where you would have things such as social media, your banking applications, but more importantly, as it relates to the darknet, a lot of the threat actors use the deep web for criminal forums and marketplaces, which we will talk more in detail, as well as any surface web links that you see that are only available once you have membership or credentials to access them. On the right hand side, you see a lot of chat platforms, for example, Discord and Telegram are very much in the news these days because that is where quite a lot of activity as it relates to darknet happens, whether it is exposing breaches or it is critical conversations, marketplaces, and so on. So chat platforms have grown in priority overall. Then the last, but not the least is what we all know as the surface web, which is everything that is indexed by search engines such as Google and Bing.

For the surface web, our focus at DarkOwl is more the high risk surface web, which is not just any webpage out there, but specifically, websites, domains, platforms that people use to collaborate, such as paste sites where you paste images or file sharing sites, discussion boards, GitHub is a big part of it. So the way it is color coded here is the ones that are in, um, the oranges and reds – they are all part of our current collection capability. The ones that are in deeper gray or black are the ones that we are having plans to go after and collect data. So that’s the picture you see on the right hand side. It truly is like an iceberg. We look at the world as what you see on the surface web – it is a very small sliver of the actual data and more interesting data and criminal activity happens on the deep web and the darknet.

Kathy: We have a question that has come in – how big is the dark net?

Ramesh: Great question. There’s no easy answer, but I would say that, a good one fourth of the web, about 25% is behind some form of user ID/password, and a subset of that would be darknet. So it’s really hard to quantify how much data there is in the darknet, but what I can tell you is as far as DarkOwl, we have over two 50 terabytes of data, which is specifically going after the deep and the dark web. So it’s quite extensive, but it is really hard to quantify as a percentage of the overall web.

DarkOwl by the Numbers

The next slide is little bit more about the metrics and numbers that we collect.

So at DarkOwl we have quite a lot of data that we’re actively collecting and we are essentially a data company, which is why we have several million documents. We have forums and marketplaces that are out there, and then we also have the Tor, I2P, ZeroNet, Telegram channels, and whatnot. So, given that, what I want to specifically draw your attention to is the amount of data that we have in terms of growth is clearly on the chat platforms side which which are pretty active as they contribute quite a lot of what we call entity data, which is the stuff that you see in the bottom: email addresses, IP addresses, credit card numbers, crypto wallets and crypto addresses and so on.

On any given day, there is between 1 to 3 million documents that we collect in a 25 hour period, and it’s a combination of crawling various sites and platforms and so on, as well as processing leaks. And leaks continue to keep increasing exponentially ever since the Ukraine conflict. So the bottom line here is the data is a whole variety of data. It is very disparate, and it all has to be collected from multiple places. We normalize it into a common data structure, and then we make it available. As far as the delivery channels, which is what you see on the right hand side, you could use our UI, which we call Vision UI. There’s a whole host of API endpoints that we make available for our customers.

You can search in the data, you can pull out entities and the recent docs, you can consume DarkSonar via API, same thing with ransomware. And if you have a customer or if you have a use case that would want our data, then we’re happy to license the data via data fees. So there’s a variety of mechanisms for you to consume all of this data that we’re collecting, curating, and making available to our customers. Okay.

Specifically this slide talks about what is in our data. So here are a few examples. So the high level way to look at it is the top left versus the bottom right, wherein the top left is the traditional things that you go after the clearnet, or even the Tor and the Onion browser.

Versus the bottom right is there where there is more and more need for us to have personas to do the authenticated access into the deep web services. So the things that you see up there, they’re just various categories of data that we have. So we go after crypto and pay sites, and darknet classifieds and blogs and ransomware gangs and so on. The authenticated ones include marketplaces, carding, going to chat rooms, ransomware gangs, social media, any discussion forms. I do wanna make sure that there’s a clear understanding – we do not touch any pornography content, or we call it CSA, which is child sexually explicit and adult material. That is not our focus. We do not want to be collecting images that are highly objectionable and criminal in nature.

Everything else, in terms of all the topics that I mentioned, we’re actively collecting. So some of the stats that you see up there, we have over 232 ransomware domains. We actively monitor 400, almost 500 marketplaces. Believe it or not, we not only have English, we have 51 other languages that we see in the darknet with Russian and Mandarin being at the top not surprisingly. So that’s kind of the diversity of data and how dispersed the networks are.

Cybercrime is Booming

Moving on, we all know that all of this data collection is there for a reason – because crime and cyber crime continues to keep booming and growing exponentially.

I’m not gonna read every one of these data points, but I think we all can agree that post Covid, the attack vector because of people working from home, the home networks are not as robust as corporate networks. So that just significantly increases the attack surface. The Russian Ukraine conflict has exploded, not just the Russian and the Ukraine side of the war and the data leaks that are out there between both parties, but it is truly a third party risk issue where every company who is dealing with a vendor who is in that part of the world is impacted one way or the other. And so we at DarkOwl keep seeing that this continues to grow, and customers and companies out there are struggling with the amount of alerts that they’re being subjected to their SOC teams and the XDR platforms and so on.

DarkOwl provides a more asymmetric and a unique insight that you don’t get from your traditional corporate security processes and procedures. So again, data is growing, crime is increasing. And we also see that ransomware gangs are becoming very sophisticated. They offer customer service and that’s why the term ransomware-as-a-service is in the vernacular these days, because it is a truly massive problem that we’re all being subjected to.

Kathy: We have had a couple of questions come in. How do you know when a company is being targeted on the dark web?

Ramesh: It’s a good question. There’s multiple things going on in the dark web. So one of the ways that a company can pay attention is, look into the darknet. For example, if you’re using our product or our Vision UI, you can set up an alert, which is basically a way to monitor your company domain and subdomains. And anytime there is any activity about your company, be it a conversation that is happening in a forum, or be it a marketplace where something is being sold, either your company credentials or your AWS keys or what have you, it’s always a good idea to set up these monitors so that you can pay attention to what’s going on. The other is, obviously we’re going to cover the DarkSonar, which is a numeric objective way to see what your risk tolerance levels are over time. You may be thinking you have really good security policies and practices, but it is super important for you to also look at products such as DarkSonar so that you know that you are either at or below the baseline of security and compliance that you should be at.

Kathy: Don’t threat actors only target larger companies?

Ramesh: You know, conventional wisdom is that threat actors would go after bigger companies that are much bigger in revenue, they have a bigger wallet and whatnot. However, you’d be surprised, there is a lot of targeting that happens with small businesses, with smaller educational institutions, from counties to hospitals to you name it, because the threat actor or the criminal is looking at two angles. One is, how much money can I make? And the other is, how little effort do I need to put? So a lot of the companies, the larger ones have gotten pretty sophisticated. So there needs to be a level of sophistication for the criminal to organize themselves and attack, versus there’s lots of much easier, smaller targets to go after. So I’d say the answer is, it really is all of the above. They go after the large ones. They also go after the small ones.

Risk Modeling

Okay. So let’s move on to what is risk modeling. Now, there’s lots of frameworks as it relates to risk modeling.

We’ve all heard of the NIST, which is the largest one in terms of a governing body that defines risk models, but there’s also other modeling tools available, such as ISO, CIS, ISACA, OWASP and so on. Depending on your company, depending on your needs, it would be good for you guys to pick your risk modeling strategy and a framework and then out of that framework, you also need to really pay attention to who are the stakeholders. Like, how do you want to make sure that between your SOC, your data protection folks, cyber governance, CISOs, if you’re in the cyber underwriting space, insurance brokers, underwriters, if you’re a startup, let’s say you might have VCs, investors, M&A things going on if you’re national security or a public government organization, the policy makers, any military operational decision makers.

It all depends on the type of stakeholders that you need to keep in mind as you build your risk modeling practice, right? So all of these type of assessments are, at end of the day, they’re defined by NIST, and these are to identify, estimate and prioritize the risk associated with your organization. So I’d highly encourage folks to take a look at these standards because they all try to achieve the same thing, which is be holistic, have a 360 view of your risk, rather than just pull in a hodgepodge of tools, to figure out what’s going on at any given point of time. So that’s kind of the risk modeling and the people that need to be be involved. And why does that matter? Because ultimately, when we talk about the darknet as it relates to ransomware just in and of itself is getting more and more sophisticated.

Ransomware

As I mentioned, there is a whole piece of the industry called ransomware as a service.

It starts with the threat signal, and you see the data flow associated with that. There is quite a bit of a lifecycle that that is involved when it comes to ransomware, and we’ve been watching the various ransomware groups and what we have seen is prior to ever executing a ransomware attack, the reconnaissance occurs either by members of the ransomware group or by a broker, the IAB. And this appears in tokenized mentions of the critical network data of your company. It could be credentials, it could even be mentions of your employees that would be targeted for social engineering. So on these forums, the threat actors are also discussing things like the common vulnerabilities like the CSV, and find ways to exploit and come up with techniques to exploit them.

They also come up with techniques to break your antivirus, evasion campaigns and so on. All of these are ways in which they’re trying to poke holes into your network, and either they do it directly or through these brokers, and then we kind of capture that as the dwell time, right? So dwell times, once they are in the network, they are gonna start poking things around. And then there is advanced operations that could take days, or it could even be done in a matter of hours. And these threat actors use the traditional Mitre attack techniques. And then once they’re in the network, they’re laterally moving and they’re elevating their privileges one step at a time, and they get more and more access into your network, and they exfiltrate more and more valuable data. So the key thing is persistence.

The level of persistence and hiding they do is very phenomenal. I mean, it’s like, it’s beyond professional. They cover their tracks, they know what they’re doing, and once the data has been removed or stolen from your network, the devices are encrypted. Then they go into the payment cycle where they’re starting to get the extortion payments that they’re demanding. So as part of that lifecycle, what we see is the announcements then go on Tor or whatever data source. They’re advertising the fact that a company has been breached, and then the data is stolen and all of the subsequent PR and all the other challenges to the business. So even though the data is not shared immediately as a data leak, it’s typically repackaged, shared and curated by the threat actors because they want to find the takers – how important is that data breach for that business?

And then they notify the suppliers, the vendors, possibly customers, any contractors, and they keep capturing more and more of the attention that this company they have targeted, they’ve been successful in targeting and exfiltrating, and now they’re looking for ransom, which means the temperature of the company that was a victim keeps going up, that they better pay the ransom amount, otherwise this keeps getting published to their customers and their partners, and it just keeps getting worse, right? So it is kind of like the threat signal always starts with somebody that has gotten access to your network, and then they’re raising their privileges, they’re grabbing the data, they’re publishing that, and then they’re collecting ransom. So given that lifecycle, and there was quite a lot of words there, but the bottom line is these attacks are on the increase.

They are on the increase globally, not just the US and UK but most of the Western regions where we can track them. A lot of the world is being subject to this, and there is also a need for a critical understanding of what are the motivators of these criminals and why are they doing what they’re doing? So understanding such type of risks is not just a nice to have, it’s a must have for any organization, large or small, and be prepared for these type of potential threats. So the takeaway here is be sure that you have a risk mitigation strategy. Look at some of these networks and protocols for risk modeling and truly understand what you and your company could be subject to as part of ransomware and the sheer fact that cyberattacks are on the increase.

DarkSonar API

Now, having said all that, what we’ve been busy in DarkOwl is building a product called DarkSonar. DarkSonar is to address some of the challenges that we have seen from our perspective. Essentially DarkSonar, we like to think of it as a signal. The signal is to inform threat modeling, third party risk assessment. It applies for cyber insurance, anything to potentially predict the likelihood of attacks. In other words, DarkSonar is a cyber risk rating. It is based on an algorithm that measures an organization’s credential exposure, primarily email password exposure over time. So it’s not just a one shot snapshot, we’re monitoring the health of your business and the credentials over time. And because emails are primarily leaked and sold in the darknet, they constitute a major vector for cyber and ransomware attacks. And we measure such exposure on an ongoing basis with DarkSonar.

This enables the organization’s customers and third party risk management folks to get an awareness and understanding of what your weaknesses are, what are your soft spots are, and you could be proactive in taking these mitigation steps rather than find out that it’s too late and you’re being subject to a ransomware attack. So it would be a mitigation step to prevent data theft, to prevent loss to your revenue, to your profits, loss of reputation, because at the time of ransomware, usually it is too late. So what we did is, as part of building the DarkSonar, we did an analysis of over 250 companies, well known companies to lesser known ones that suffered these cyberattacks. And we saw that in 65% to 75% of the cases, when we saw an elevated rating, it was having a direct correlation to a few months after an elevated rating.

There was an attack, and I repeat it is in 65% to 75% of the cases, we see a direct correlation that elevated risk rating equals elevated chances of an attack happening. So that was pretty powerful. And here is a little bit more breakdown of the data, the type of data that DarkSonar uses – so credentials, as I said, is emails and passwords, aka combos. We do see, not surprisingly, there’s quite a bit of plaintext passwords that we see in our collection efforts in our database. So that’s the big part of the pie that you see along with there are hashed passwords, and there are some where we get the email, but we don’t get a password, right?

So the way the DarkSonar model is built is, it’s primarily based on credentials. But we have a waterfall approach in the way we have designed the model. So first up is there is weightage given based on presence of emails of your company, meaning email of the domain that is in question. So they are unique plaintext passwords or hashed passwords, or just the sheer number of emails that we see. So that is weighted. The other thing that we also weight is the time and the time series. Did we get a breach recently which contributed to these emails appearing, or was it happening six months ago or nine months ago? So the older the data is, the less it is weighed in our algorithm. And we also consider duplication.

Duplication is kind of a vast topic. I technically call it correlation, but essentially is the data leak being reposted with the exact same details, in which case it’s a duplicate, or is this being reposted with additional data? Some of it is similar, meaning they’re correlated to the previous leak, but a lot of it is new information. But one way or the other, the sheer fact that threat actors are reposting, your company or your organization’s leaks over and over again is cause for concern. So our model accommodates the fact that there are these are things that are weighted both based on time as well as the number of times it gets posted, the duplication ratio, and then the baseline metrics we provide is based on the overall volume. So our API through which DarkSonar is available will give you data for the past 24 months, and it gives a relative risk rating for the organization in terms of the distance to the mean.

It’s like the bell curve that’s displayed here, you would start with zero, which is right in the middle. If it is in the negative, that means it is good. Meaning there is not that much exposure. If it is on the positive, anything that is greater than or equal to one, it means there is a cause for concern. So, one more time, back to what I just mentioned. Our results show that elevated exposure, meaning if DarkSonar were to say that the exposure is greater than one, an elevated exposure and the sustained elevated exposure over the last four months is a direct indicator that there could be a possibility of an attack in 74% of the cases. So that for us was very powerful. Any questions on this so far?

Kathy: Does DarkSonar distinguish what the username/password combos are used for?

Ramesh: The short answer is we do not distinguish at a per user username password basis, but we do collect the aggregate of all the usernames that are being exposed, specifically the emails, but not the username per se. We’re mostly focused on email and passwords.

Full statistics and chart can be found here.

One thing is DarkSonar is a good indicator of risk. I do want to highlight some of the threat factors here and what should be applied in which scenario. So if you’re looking for phishing emails, for example, and that there is quite a lot of phishing attacks, then DarkSonar would be a really good tool for us to assess. Same thing with third party risk management, third party supply chain, DarkSonar would fit pretty well, any weak or compromised credentials.

If you have any compromised credentials, then that would be directly visible in DarkSonar. However, there are things like brute force attacks, unpatched vulnerabilities, cross-site scripting, man in the middle attacks, right? These are not exactly things that are involving emails and passwords all the time, but our platform, which is the Vision UI platform, as well as the API endpoints and the entities that I talked about, these would all help in understanding such type of threat factors like the brute force or the unpatched vulnerabilities, the cross-site scripting, the man in the middle, DNS poisoning and so on. So think of it as using the right tool for the right job. It depends on what threat vectors you’re interested in. Some of these threat vectors DarkSonar would apply, and other threat vectors, you might be better off using our Vision UI or our Search API or entity lookups and so on.

Okay, so now comes kind of the interesting part. So all that theoretical risk model, what does that mean for companies? So I have some use cases and companies as examples to kind of walk you guys through.

So here is the famous Colonial Pipeline incident that happened in April of 2021.

So Colonial Pipeline is one of the largest fuel pipeline and its breach literally had created shortages for oil and gas up and down the East coast. And this was a result of compromised passwords.

The, the interesting thing is we saw an elevated level of DarkSonar. As you could see, it was kind of hovering in the negative zone, which is good. And then in September of 2020, we’re starting to see the increase and the elevated risk, and then it became 0.5 and then decline to one. Anything above one, like I mentioned, it definitely has a clear indicator of risk, and that is where in from our data, we saw that a month prior to the attack, which is back in April of 2021, we were seeing that elevated risk. And then in May, the attack happened, right? So DarkSonar was able to detect based on these credentials, which are easy to do, the account takeover and instigate that attack. So that was on the Colonial Pipeline case. The next one is General Motors.

General Motors, same thing. We see a three month window where there was an elevated signal to the time that the attack was announced. So again, part of the challenge is when big companies, and you know, big outlets have this challenge, it becomes a media issue. Many of the companies do not report it. They try to pay up the ransom or negotiate with the criminals, whatever they’re doing on the backend. It may or may not be in the news, but we capture what we gathered from General Motors from the time that they had announced, which is April of 2022. When we go back in time and look at March and February and January, there was a clear elevated risk. So our DarkSonar model detected an abnormal increase in the plaintext and hashed credentials, literally months leading up to the attack. The next one is Fujifilm.

Same type of thing where their servers were infected with ransomware and nobody would ever know when the exact ransomware was launched and what exactly happened. But according to the bot ransomware, it came through a phishing attack. And clearly the takeaway is we detected an increased email exposure prior to the actual attack happening.

The last one that I would say is back to the question, that was asked earlier about smaller companies – you’re still very much vulnerable for these type of attacks. And in the City of Tulsa’s use-case, we saw a five month attack window of elevated risk as it relates to DarkSonar. The signal was elevated for five months prior to the attack. So the reasons were really the group installed ransomware in late April, the program began to operate the city firewall and other security protocols were kicked into the city’s technology department. They took their time, but the bottom line is this was months in planning by the criminals. And we see that elevated risk as far as DarkSonar literally five months prior to the attack.

Kathy: Can you answer, what is the likelihood of a breach if the signal goes above one?

Ramesh: Anything above one, there is an increased exposure. An increased exposure would correlate to increased risk. An increased risk would correlate to, there’s much more chance of a breach. So I would say anything over one, companies need to be really, really careful. Pay attention, take the proactive steps, rotate your passwords, put in the multi-factor authentication on your servers, whatever you are doing. As for security operations and proactive things y you should, you should be careful, right? Does that mean anything below one is fine and dandy – I would say look at DarkSonar as another way and another tool in your tool chest. And if it is over one, that means the temperature is going up, right? And if the temperature goes up and it keeps going up and up and up, bad things happen. So that’s the best response I would give. Is anything over one, you better watch out.

Okay. As I mentioned, here is a little bit of technical detail on how the docs owner API is represented.

We give you the results based on a company domain and we give it to you a JSON format. And like I mentioned, we present that data for the last 24 months, and we give you the rating as well as the baseline and the signal we will indicate if it is low or elevated or high, right? And to Kathy’s earlier point, anything about one would be elevated. So if you are in the low category, that’s good, you have good security best practices. If you’re one or above, it’s time for you to pay close attention to what can you and your company do to mitigate these type of risks.

So again, it’s predominantly available via API, you can hit individual domains or you could hit multiple domains at the same time. It’s up to you. And then the results, like I mentioned, is it’s really, we did the internal analysis for the 237 publicly disclosed attacks between the last couple of years, 2021 and 2022. We see the accuracy is very strong. We were actually surprised it was this strong a correlation between the risk level and the attacks. So all attacks was 74%, ransomware was 75%, breaches was 74%. So it’s tracking pretty closely to a very high percentage accuracy for the elevated risk versus the attack. And then we also went to some of our customers. We went to some of our prospects and we call it the beta clients. And we did a pretty extensive evals on the attacks. And we see that there is a pretty strong correlation there as well.