[Webinar Transcription] Exploring Emerging Trends in Cybersecurity

October 31, 2023

Or, watch on YouTube

As the digital landscape continues to evolve, so do the threats that target it. Staying ahead of cyber adversaries requires a deep understanding of the latest trends and innovations in the cybersecurity space.

In this webinar, DarkOwl CEO, Mark Turnage and Socialgist CRO, Justin Wyman explore a variety of critical topics shaping the cybersecurity landscape:

  • Key VC Raises in Cybersecurity: Capturing Industry Attention
  • Understanding the Major Players: Who’s Raising the Stakes
  • Harnessing Security Solutions: How Organizations Protect Their Assets
  • Addressing the Talent Gap: Scaling with Data Aggregators and Services
  • Pioneering the Use of AI: How do LLMs and AI Come into Play

For those that would rather read the presentation, we have transcribed it below.

NOTE: Some content has been edited for length and clarity.


Kathy: Thank you for joining us for today’s webinar exploring emerging trends in cybersecurity. Before we get our topics, begin our topics today, I’d like to turn it over briefly to Mark and Justin to give a brief introduction of themselves and their companies.

Justin: Hi, guys. Nice to meet you. Wyman, Socialgist is the name of my company. I’m the Chief Revenue Officer. We are a provider of open source intelligence. We’ve been doing so for the last 22 years, and I’m excited to be here.

Mark: Hi, I’m Mark Turnage. I’m the CEO and Co-Founder of DarkOwl. We are a company that specializes in the darknet, and specifically in extracting data from the darknet and providing it to our clients and working with partners like Socialgist to provide a broad view of open source intelligence, including that of the darknet.

Kathy: Great. Thank you both. Prior to diving into our topics today, Justin and Mark wanted to take a moment to comment on the Israeli and Hamas conflict happening presently.

Mark: I’m happy to comment. You know, when the conflict broke out on October the 7th, we immediately started looking at content in DarkOwl’s database that was relevant to the conflict, either pro-Israeli, pro-Palestinian, pro-Hamas, and we pretty rapidly triangulated on about 400 Telegram channels that are actively covering the conflict. And we’ve been monitoring those channels throughout, directly ourselves and generating some content which is available on our website, and also supplying that to our clients. And it gives them a different perspective than what you see on the front page of many of the newspapers. I will comment, we published a blog very early in the conflict that noticed that amongst the most prominent Pro-Hamas Telegram channels, they went quiet for several weeks before the attack. Unusually quiet. We don’t have an explanation other than they were distracted, they were planning, they were getting ready, or they had been told to go offline. But we did detect that in the lead up to the attack, there was considerably less activity on those Telegram channels than was normally the case.

Justin: I would say when you see such a horrible thing, it’s really hard to process, especially because in the space that Mark and I occupy, Israel is a big component of it. Technology companies and cybersecurity are founded in Israel all the time. Some of the leaders in the space. So it gave an extra personal feel, if that’s even possible. When you see these types of things, when you know the people that are directly impacted by it at a different level. And then I thought it was it was comforting to see that we could in some way help with our information, help the helpers, essentially. And Mark, I got to say, I thought the Dark Owl content was fantastic. To help show examples of how OSINT intelligence can help prepare for these types of things and deal with them frankly.

Kathy: Thank you both. Now we will begin with our first topic.

Key Raises in Cybersecurity: Capturing Industry Attention

Justin: So let me talk at a high level. What is happening? If you look at VC and cybersecurity over the last couple of years, it’s declining, which normally I think would be a bad thing if you didn’t realize it was declining from a peak bubble that happened during the pandemic. So you can say things are down 30% from last year, which is down another 30% from last year. It really, honestly, to me just seems to be returned back to normal. You see a lot of companies having some very specific raises, we’ll get into and you’ll see some combinations, you’ll see some coverage. But I think that the cybersecurity industry should feel that there’s been a correction that was due because you’re in a bubble. But now we are in a place where things are normally operating. The space is growing and investment is happening as well.

Mark: Yeah, I’ll just echo Justin. The investment into the cyberspace, go back say three years was just red hot. It was at levels that I didn’t think were sustainable. And oftentimes at evaluations that I didn’t think were justified. What has happened as the economy has gone through a fair amount of turmoil over the last year and a half is that those valuations have reset, and the level of investment is what I would normally expect in a pretty healthy sector that is still growing. Overall funding is down. I think it’s down 30% year on year. Valuations are down. The interesting thing is that companies that are still growing and companies that are profitable are still getting healthy inbound investment. Just yesterday, by the way, Censys announced a $50 million dollars raise, a small company out of Israel raised $4 million. I mean the raises come in regularly. They’re not at the valuations that we saw, say 2 or 3 years ago, but they are still happening. And they are particularly happening with very healthy companies.

The other trend, by the way that I’ll mention is any time you have an economic reset, which is what we’re experiencing right now, it forces consolidation in the market. You know, scale matters, size matters, sophistication matters. Go-to-market strategies and the ability to reach your market matters. So whereas before a small startup could have raised successive rounds of value, of money, of capital at ever increasing valuations against, you know, maybe skinny performance – those days are gone and they’re likely to be an acquisition candidate for for another company. And we’re seeing this – large companies are pretty active in the M&A market right now as a result.

Kathy: Based on that, a question has come in. What changes do you foresee over the next coming year?

Justin: Let me start with one of the public markets because that leads things. So in the public markets, you’ll see a lot of leading cybersecurity companies up double digits this year, more than the S&P 500. CrowdStrike is a good example. They’re up 70% year to date. As an example, Tesla is only up 80%. Apple’s only up 36%. So that’s not market forces. That’s industry forces of the problem with cybersecurity is growing so rapidly. The things I think you’ll see over the next year would be companies that have a growth plan, getting more funding and moving into new markets. I saw that already with OSINT Combine. There’s a company with a very good Australian presence going to the North American market. Full disclosure, they’re friends of my company and DarkOwl – so maybe we’re a bit biased there.

You’ll see some people getting acquired by PE firms, which is an idea of, again, operational excellence that might be a different component than things, say, in a bubble where instead of doing a PE acquisition, you would raise a bunch of money and see if you could sell and market your way out of it. The other thing I’ve noticed that I think will come is more legitimacy and standardization. Frost and Sullivan has created industry coverage for the first time on a lot of these companies. You’ll see certification tracks coming out of industry organizations like Osmosis. So I see it as a big step forward in the maturity of this space. There’s always startups, there’s always guys in the middle, and there’s always the big guys, and you want to have enough of them to create an ecosystem where you can ultimately meet the consumer need.

Mark: I couldn’t agree more. The way I would have described the cyber security industry two years ago was an awkward teenager. And it’s moving to young adulthood. It’s maturing. It’s growing up. It’s actually starting to understand what its own limitations are and what it can and cannot do. And I would just echo Justin and say, over the next year, we’re going to continue to see consolidation – more and more mergers, more acquisitions. It has always amazed me, just as an aside, that the largest cybersecurity companies in the world still only measure their revenues in single digit billion dollars. Those are the largest. And then it falls off pretty quickly from there. And given the size and importance of the problem, this is an industry that is ripe for what you just identified, Justin, which is growing up, consolidating, becoming more professional, working against known certifications and known standards. And by the way, known regulations because the regulators have arrived.

Justin: Mark, that McKinsey report we’re referencing before about just how breaches are supposed to go up 300% from 2015 to 2025 also noted that to your point about revenue, that the vendors in the space right now make up a 10th of what they think the overall revenue is going to be in the next ten years. So yeah, teenager growing up is a great analogy, meaning there’s just so much. There’s some stability being built in, but there’s still so much more to grow up.

Understanding the Major Players: Who’s Raising the Stakes

Mark: Well, I think in the world of threat intelligence broadly, there are a couple of very large players – Recorded Future comes to mind, Flashpoint comes to mind, Intel 471. There are a bunch of these players. Interestingly enough, every single one of those has been acquired over the last 3, 4 or 5 years by large private equity firms that have, as a strategy, explicitly what Justin was talking about, grow these companies up, make them larger, make them professionalize their operations, give them global scale and global reach. And then below that you’ve got a whole range of companies and these are small- to mid-size. Some of them are just start-ups who are looking at problems from a different angle. And there has been a lot of activity, both in terms of fundraising into those companies as well as acquisition. I mean, one that comes to mind is Maltego. Maltego was acquired by a private equity firm at the beginning of this year, and that’s a well known, well established platform that is used across the industry by a number of different companies and users. And in my view, that was a really smart purchase by the private equity firm. What else is going on Justin that you’re seeing?

Justin: A company I recently became familiar with at a conference was Fivecast. They raised 20 million. They were an Australian based company looking to really expand their sales and marketing into North America. They feel their perception, not mine or based on conversations, that they feel they have their product completeness to the point where it’s time to go see if they can compete against the bigger guys in the space. Now Cobwebs, another huge player in the space, just joined Chainlink. Those are other things I’m seeing.

Another one we were talking about, Mark, is Palo Alto Networks buying Dig this morning as a sign of just a major player adding in a feature capability. So, you know, this is following the the classic playbook – where you watch Oracle and Salesforce go after each other and then add on competing bolts. Again, another idea that you have a very well established market that you can operate. If you have operational excellence, you can really succeed.

Mark: Another example of that, by the way, is Proofpoint yesterday announced the purchase of Tessian and we’ll come on to it. Tessian is an AI provider that will significantly enhance Proofpoint’s products. And so you’re starting to see that happen at a pace that I have long predicted. But really I think this economic climate has accelerated.

Harnessing Security Solutions: How Organizations Protect Their Assets

Justin: I’ll start as I always do, with a little bit of data. Fraud is still massive. The biggest issue that every organization is dealing with – it’s coming from social media, it’s coming from internally. I talked a little about this McKinsey report, but again, I’ll say it again because it’s such a massive number. They think that breaches damage is going to increase 300% by 2025. The other one that I looked at was a survey of mid-sized companies suggests that threat volumes will almost double from 2021 to 2022. So that’s 100% growth in one year.

What they’re doing to protect their assets – my concern is with their employees. So I’d love to hear your thought on this, Mark.

Mark: Just a small data point from DarkOwl – we track where visitors to our site go and what pages they dwell on. The most common feature across our website is our fraud webpage and content on fraud. That speaks to the nature of the problem.

I’ll just say two things. One is we are all excited as an industry about AI. We’re excited about new tools, about new capabilities that exist. So are the threat actors. They’re using all of the same tools, all of the same capabilities to actually scale and professionalize their own operations. But, you know, going back to your point, Justin, the biggest threat to many companies is their own employees continues to be their own employees, whether that’s actual outright fraud or just mistakes that employees make that open up the company to potential potential attack and fraudulent attacks.

Justin: I believe that was the logic behind the Tessian acquisition is just the amount of people that have exposed their companies by literally emailing the wrong person. That seems to be a problem that should be quickly solved through some proper technology application.

Mark: I mean, I’m amazed. I’m actually amazed. Look, I mean, CEOs are are susceptible to this as well. And in fact, I mean, go to any OSINT training seminar and they’ll tell you the most vulnerable people or the easiest to attack are the C-suite, because they’re the ones who are the sloppiest or the least attentive to to security. That continues to be the case, but it permeates the entire organization.

Justin: The other thing I’ve heard is that key figures, usually execs, because there’s so much information, that they’re much more easy to manipulate. Voice manipulation takes a lot of samples of data. So the bigger the sample, the easier it is to manipulate the voice is the other thing I would talk about. And then the last one I noticed was people just kind of really trying to do the best they can to understand their supply chains. If employees are people accidentally sending information out. Supply chains are people sending information in, and these are business partners that you rely on your suppliers. So it’s very easy. Those are very weak points in a system to kind of create havoc if you’re not prepared.

Mark: There’s absolutely no question. The pandemic taught us that supply chains matter and supply chain vulnerability is mission critical. And to to Kathy’s question of how organizations protect their assets, it’s not only protecting your own assets, but protecting those critical assets of your vendors who are critical to the provision of your product or your services as an organization, which is why you’re starting to see these third party and vendor risk management companies come into their own in terms of their level of maturity, because especially very large, complex organizations need to pay attention to their supply chains.

Addressing the Talent Gap: Scaling with Data Aggregators and Services

Mark: The interesting thing about the talent gap is that the cybersecurity industry for years has complained about lack of talent. I think the statistic I continually hear is something like half a million unfilled cybersecurity jobs worldwide. And that number has held pretty steady for the last number of years. We’re in an environment, though, where many of the companies in our sector are actually laying people off. So how do you square those two contradictory statistics? Well, one way to square them is exactly what Justin said earlier, which is many of the companies that are laying people off were hiring at a clip that was unsustainable just as recently as 2 or 3 years ago. So you’re coming back to a sort of a more normal track. My sense is that there is still plenty of demand in the marketplace for people who have cybersecurity experience, whether it’s developers or product people or otherwise. But yes, there is a gap and I think AI is going to help fill that gap. What do you think about that, Justin?

Justin: I absolutely do. Let’s talk about the two things like data aggregators and services. Start with services because Mark and I have a data aggregation stake in this fight. But on the services component, when I work in the space, what is interesting to me is the people come from all different backgrounds military, private, etcetera. There’s no “you don’t go to school to become a cybersecurity expert.” So that’s a very big problem. But it’s a problem that is being solved, I think. When we were all at OsmosisCon, which is a association of these professionals, they’re creating certifications. They’ve created a conference so people can come and share tips and tricks. And that’s just one of many. So I think it’ll get easier and easier to bring people into the space and give them the certifications that show them that they’re qualified, because right now it really is due to the nature of the sensitivity of the issues and how people come. It’s like, who do you know that you can trust? Which makes sense in the beginning. But over time, you have to figure out how to scale your business. So I see a lot of services being created to help with that.

Then on the data aggregation side. As a data provider technology provider in this space, it’s amazing to me how big the problem is, right? These people are searching for needles in haystacks and the haystacks are growing, and so the only way you can solve them is through aggregation. And that’s basically at any point in the value chain. So if you’re creating a piece of software that allows analysts to hunt for threat actors, well, you’re probably going to use data from many different sources because the haystack is too big for you to do it yourself. Then if you’re actually looking and searching and doing the analysis on top of the data, these tools will allow you to search more efficiently. If you go back to Mark’s Telegram example about things going silent before an attack, as these technologies get better, you know you won’t have to go, “Huh? Why are these silent?” These things will go, hey, there’s an interesting activity here. The volume of these things has really dropped off. Why? And that’s a way that people will be able to not only look in the haystack more broadly, but faster, have things suggested to them. So I think ultimately the space will be fine. Again, I can’t stress this enough, we are coming off a bubble, and that generally means people aren’t behaving how they should behave. And so to correct that, you have to lay some people off. But now that we’ve had this baseline, people go back to building their businesses most based off of the value they provide in the market. And as we’ve shown, the value is only growing, meaning the threats are only increasing dramatically.

Kathy: Based on that, we’ve had a question come in: We have seen a lot of layoffs in the space recently. And can you address how this does affect the talent gap?

Justin: Positive half glass full spin would be – when you have layoffs in an industry where it’s growing, it’s because those people are in a place where they weren’t effective. They weren’t doing the things that needed to be done to keep the business on its goals. So when you take an experienced person and you separate them from a business that no longer needs them in a growing space, they should be deployed in a better space where they are more impactful. Right. This is the efficiency of markets happening. So I think these gaps will take the people that were places where they weren’t as useful and put them in places where they will be much more useful and create a world where they’ll be, again, more coverage.

Mark: Not to disregard the dislocation that necessarily occurs when that when that happens, if you’re the individual who’s affected, it can be quite difficult. But I agree with Justin that on aggregate we’re not seeing employment in the cyberspace decline. It still continues to increase.

Pioneering the Use of AI: How do LLMs and AI Come into Play

Mark: The big issue that both Justin and I have discussed in the past is anytime you bring an end to a problem, it needs a data set to sit on, to learn, to learn that problem in order to be effective. And so what becomes the most critical in that is the data we aggregate – darknet data. Socialgist aggregates open source data across a variety of different platforms. Those data sets become extremely valuable and extremely important in the application of an LLM to address or learn about a specific problem. And you know, in the case of DarkOwl, I can speak to that, our data set has been aggregated over 5 or 6 years. That’s not something that you can just recreate overnight. If you’re a new company coming into this space or somebody looking to utilize AI, the same I’m sure is true for Socialgist. So it’s a very interesting insight into the power of the underlying data that that any organization can has in terms of addressing the problem via an LLM.

Justin: And I totally agree with everything Mark just said. I think the other thing to think about is, how much easier it is to get things out of the data value, out of the data with LLMs, and how in general, the biggest thing you’re going to see in the software world, the biggest constraint is going to be software engineering capacity. Every company in the world wishes they had more software engineers because it’s hard to do things like connect a data set into an analytics platform. It’s a very technical work. These engineers now are doing work 40% faster, so it’ll be easier to make progress and solve problems when you put these types of applications together. What that should mean is that you should have in the long run, and again, marginal like dislocation is hard and things need to change and we have to cross the chasm and all these sayings, but what we’re really talking about is in the long run, things should get cheaper with technology and things should also get better. So the data sets that we ship to our clients that are working very hard to get incredible data out, get incredible insights out of it, should be able to get insights out of it faster and better and cheaper because they need less engineers. And then the tools to analyze these data sets should only get more powerful as well. I really see there will be an area where, you know, there’s different segments in our space, right? There’s the people that are at these big companies, and they have all the budgets in the world, and they have the fanciest tools, and there’s people below that, and there’s people literally using their cell phones to track people doing medical research. Those people should get increasingly better tools that will make them much more effective. So we’re talking about the capability of people with less budget getting much more effective, which I think really creates a much better world.

With the caveat that the other guys have it too. So there’s always a push and pull, but I see a lot of positive headwinds in the in the long run with AI.

Mark: I mean look, you know it’s going to increase, as Justin said, productivity per worker significantly. And the comment that I heard recently in a conference was, you know, AI will be tremendously dislocating of many types of employees and many types of groups, but the world’s going to divide itself into to two camps. Those people who know how to use AI to make themselves more productive and those who don’t. And that’s the digital divide that we’re actually hurtling towards. I’m deeply optimistic, personally, about what I can do across multiple different fields, but starting with our own field in cybersecurity – I’m very optimistic about it.

Kathy: We’ve had another question come in and an attendee is interested to know “Will DarkOwl and its peers sell their data sets to companies?”

Mark: Good question. We’ve been approached by a couple of companies, and we’ve done our own early work on putting an LLM onto our own dataset. I suppose I should put on my businessman’s hat and say it depends on the price. Yes, it depends on the price. But it’s not something that we’re going to do loosely or without a lot of thought. Because once that data is out there under somebody else’s LLM, obviously the data is available to whoever has access to that platform.

Justin: It depends, I think is a good answer. I think the thing to understand about perhaps my company, Mark’s company, is like, you know, our mission is to extract information from the world’s online conversations and if you can help us with that mission, because we’re very serious about it for the reasons we’ve discussed throughout this whole thing, we’re seriously going to talk about it. Now, there’s sometimes choices that make decisions. There’s sometimes choices that make that not the case. And there’s always a lot of nuance. But at a high level, if you help us with our mission and the business makes sense, then that would seem something that should happen. But also, Mark, you touched on a really interesting point of, you know, I do think data companies like ourselves are also going to explore training with our own LLMs too. to have the full picture. So I think the key is as long as LLMs capability is used on these data sets to make the world a better place, we’re for it. The machinations, I don’t know. There could be a world where two data providers do one together, etcetera, but the technology should make the data more useful, and that is our goal.

Mark: I will point out we’re in discussion right now with our first client who wants to put in on a subset of our data. It’s exciting.


Interested in learning more? Contact DarkOwl and Socialgist!

[Webinar Transcription] Why Social Media and Darknet Data go Hand-in-Hand for Robust Cyber Investigations

July 18, 2023

Or, watch on YouTube

In today’s world, the internet is an integral part of everyone’s personal life and even more so of every organization. Over the past several years, social media platforms have come to play a big part of an organization’s strategy and digital footprint as people connect, share information, and express themselves. In addition, the darknet and darknet adjacent platforms have grown in popularity – characterized by anonymity and illicit activities.

In this webinar, DarkOwl CEO, Mark Turnage and Socialgist CRO, Justin Wyman explore how the two interconnect and dive into the topics of:

  • Data collection and enhanced insights
  • Online identities and connections
  • Social engineering and phishing attacks
  • Reputational risk
  • Ethical concerns and legal challenges

For those that would rather read the presentation, we have transcribed it below.

NOTE: Some content has been edited for length and clarity.


Kathy: I would like now to introduce Justin Wyman, the CRO for Socialgist, and Mark Turnage, the CEO for DarkOwl. I’m going to turn it over to them to do some introductions and introduce their companies and then we’ll get started. Justin.

Justin: Thank you, Kathy. I really appreciate you putting this together. And thank you, Mark, for joining me in this webinar.

I feel like our two companies are kind of different sides of the same coin in the sense that we both scour the internet looking for online conversation. Socialgist really specializes in what I would call public conversation, people talking on blogs, message boards, forums and social networks about everything under the sun, including brands, political issues, etcetera. We’ve been doing this since 2001. Our goal is to take all the information on the left, package it in that blue box in the middle, and then distribute it to analytics platforms on the right.

We call this DOS or data as a service. Our core values provide high quality global datasets of the world’s online conversations. The key strengths are important for this webinar. It’s very broad, right? 30 plus languages. We provide a lot of context. That means history. And then we really focus on high quality, low spam data collection. A lot of this is looking for a needle in the haystack, and if you don’t have accurate data, then you’ll get a lot of false needles.

This is just a sample of our data sources. The things to understand are there are many different parts of the internet to potentially mine for insights, blogs. Journaling news is where you watch things spread from social media to online media. Videos like YouTube are obviously important forums or threaded conversations or where you see really hobbyist conversations. And then there’s review sites and social networks reviews being people trying to fish in this example, looking for selling competitive products in social networks, being a parlor, true social, those types of things.

Kathy: Mark, would you like to introduce DarkOwl?

Thank you. And it’s a delight to be here. Thanks for hosting, Kathy, and thanks to Justin. We’ve been looking forward to this webinar. DarkOwl, as Justin said, we’re two sides of the same coin. In fact, the presentation that Justin gave, if you just substituted darknet data for all the data sources that he and Socialgist collect, you would get to DarkOwl. We have been collecting data now for well over a decade. We supply that data to our customers. And we, and just for the for the sake of clarity, we only specialize in darknet and related deep web and surface web sites that repost data from the darknet. And we supply that data through our Vision UI or through a range of APIs and data feeds.

This gives you a sense of what we’re talking about. The bottom of that slide is the traditional definition of a darknet, by which I mean our traditional definition is it usually requires a specialized browser to get to. And once you are in those darknets, your user identity is obfuscated and oftentimes the traffic is encrypted. So the beginnings of the darknet traditionally trace back to the Tor network. As you can see, a range of other darknets have arisen for a variety of different reasons. For example, the third one in called ZeroNet is very popular in China. It’s a blockchain based darknet so that the conversations that occur on ZeroNet are actually distributed around a blockchain. And in order to collect data from ZeroNet, you have to actually continually crawl the entirety of the blockchain to recreate a single conversation. And perhaps unsurprisingly, darknets are popular among the criminal groups because of user obfuscation. And with the rise of cryptocurrency, a relatively anonymous currency, it’s the perfect place to do crimes, the deep web and the surface web we also collect from, but we don’t collect from generally from social media and from the sites that Socialgist collects from, which makes us ideal partners. We collect from authenticated websites and the deep web and then some high-risk surface sites, direct messaging platforms, Discord, Telegram, IRC are new platforms for us where we collect data and increasingly a lot of criminal activity is moving to these direct messaging platforms. And I think the topic we want to discuss here today is how does how does the data that DarkOwl collects, how does it fit in in a cyber investigation and in an analytical context, how does it fit with what Socialgist is doing?

Just very briefly, this gives you a sense of the volume of data that is coming out of the darknet that we collect on a daily, weekly, monthly basis. And you can see some of the types of data that we that we collect as well.

Kathy: Thank you for the introductions to both of your our companies. Today we would like to start off with the first talking point of data collection.

Data Collection

Socialgist specializes in collecting and aggregating social media data while DarkOwl focuses on collecting dark web data. Can you both talk to how these two are connected?

Justin: I will start. I think what’s important to understand is that in an increasingly interconnected world, you have what’s called a butterfly effect, which is where small things can snowball very quickly. And if you see the sources that Mark presented on his slide versus mine, you can see a very interconnected world. Now, the thing that Mark and I have spoken a lot about in our partnership together is how often things that are damaging to brands or cyber investigations start in the dark. That’s where they get organized. But you cannot tell if it’s going to have an impact, especially when the intention of the criminals is to battle perception or brand awareness until it bubbles up into the public net. So it’s smart to look at. Have a wide net across dark and public to see what issues are emerging in the darknet and then going into public and going traditional news. Would you agree with that, Mark?

Mark: Absolutely would agree with that. In fact, we find that threat actors regularly use social media to bubble up threats, to bubble up data that they’ve stolen, to bubble up information to the surface web. We also find, by the way, that in terms of identifying threat actors, most threat actors are very, very active on social media. Whether they do that personally or in their professional, quote unquote, capacity as criminals. And we find that it’s very easy to pivot back and forth between the two in trying to identify who they are. And oftentimes we are able to identify them by virtue of their use of social media and the commonality of what they’re doing in social media with what they’re doing in the darknet. And I have some examples we can talk about later on.

Data Insights

What kind of data can come from social media that helps investigators or threat intelligence teams? And what about from the darknet?

Justin: Any person in crisis communications or PR will kind of have two principles. It’s how fast do you get the insight? And how accurate is the insight? What social media does is take that accuracy component and really helps you understand what’s happening. Or accuracy might be another word for validity. So when you see issues that are very important to you bubbling up in social media, then you know that it has momentum. That snowball is building that butterfly effect. So when I think about how the darknet and social web thing work together, it’s about when it pops up into social media or the public web, that is a very big sign of validity or accuracy. So that’s how you can use that to justify what threats are real or not. Because as somebody that’s in cyber investigations, you’ll have a list of 10, 20, 100 issues and you’re constantly trying to see which ones are real or not. And social media data gives you that validity or accuracy. Okay. This is something we need to pay attention to, especially in information warfare.

Mark: The range of data that is available in the darknet that is of interest to analysts and investigators is very broad. The darknet is a primary repository of threat data. It can be data that’s been hacked or stolen from organizations. It can be vulnerabilities that are being bought and sold or discussed in the darknet as a way to get entry into organizations. It can be a wide range of PII that’s available on the darknet for executives and companies. To Justin’s point, there are disinformation specialists who offer services in the darknet. And so I equate the darknet to the sort of 2 or 3 city blocks in every town where all the crimes occur, and we see ourselves as a primary policeman for those types of activities that occur in the darknet. And obviously, the darknet is growing. It’s a growing phenomenon. That chart I showed earlier shows that what started out as the Tor network is now a number of distributed networks. We, as an example, extract data from 25 to 30,000 darknet sites a day into our platform. But to Justin’s point, when you start to see data bubble up into social media or into the surface web from the darknet or from actors who are very active in the darknet, you know that something has happened. You know they’re bubbling it up for a reason. Usually it’s to draw attention to the fact that they have committed an act or extracted data from an organization or are in the middle of a ransomware attack. And you can easily see that when they when it when it bubbles up to the social media level.

Justin: I thought an important point you brought up on your slides was the ZeroNet Chinese aspect of this. We’ve watched us together, as you know, these 2 or 3 blocks as a great analogy, but those blocks are growing. They’re getting more organized and they’re getting more effective. And ZeroNet in China is a great example of how we watch them organize in the dark web, go up into Chinese forums, then go to more of the US public web. And so the question is, at what point in time are you going to be aware of that? Do you want to be aware of it by the time it hits the public web in the US, that’s probably not the speed you want. If you’re a crisis communications person, you probably want to understand that threat in the ZeroNet so you can prepare for it long in advance. You want to understand that threat as early in that kill chain as possible. And that’s the reason why our two platforms work so very well together.

Mark: And by the way, Justin, the example you cited with respect to ZeroNet also applies to the use of Telegram in the Ukraine Russia conflict and the various spiders that have arisen from the primary use of telegram by threat actors on both the Russian side and the Ukrainian side in spilling and leaking data and attacking each other. It then spreads through a broader social media environment and it has changed, frankly, the landscape of how we think about threats and how we pivot, how we see pivoting by threat actors between social media and the darknet. It is amazing to think about. The lag between traditional news and what we know, what’s happening online when it comes to the Ukraine war, the surprise when certain things happen are not nearly surprising to you and I, because we’ve been watching it for a while. We don’t you know, we obviously can’t predict the future, but we can anticipate it better by using that kill chain, as you described.

There’s no question about that. And it is interesting to me. I think if we were executives in traditional social and traditional media companies how to incorporate the speed with which news travels, particularly in social media, would be a real challenge. I know for, for example, that I go to social media when I hear something is late breaking or newly breaking. I go to social media as a first instance. It beats all the mainstream media sources in terms of speed, there’s no question. And, you know, I’m not unusual in relying on that. I think, you know, certainly the younger generation relies on that almost to the exclusion of any other sources.

Justin: Traditional media no longer breaks news. It’s supposed to analyze news and it always struggles when it tries to do the other thing.

Kathy: We’ve had a question come in and someone would like to know, how do you know when a company is being targeted on the dark web?

Mark: That’s a great question. My first part of that answer is oftentimes they’re named in the darknet. We are attacking XYZ company or we have a back door into XYZ company, and here’s some data we’ve always already exfiltrated. So shockingly, the first thing is look in our platform and see what companies are being named as targets. Secondly, threat actors oftentimes will post IP data of targets that they’re targeting. And if you know your IP range, you can see that you’re being actively targeted. But the most common way to know it is threat. Actors will oftentimes extract data out of a company, post it in the surface web, on surface web sites or in social media and say, we have attacked XYZ company and here’s proof of that, and they will put out some embarrassing documents and they will simultaneously, ransomware operators will simultaneously be discussing talking to the company directly and saying, we have a lot more of this data and we’re going to leak it unless you pay us a ransom. And so, you know, this is a case where seeing what’s happening in the darknet and seeing what’s happening in the surface net go hand in hand. And as Justin said earlier, you don’t want to be on the receiving end of that. You don’t want to see your company’s most confidential data already posted or in whole or in part on social on surface web sites. At that point, you’re way behind. Your response is way behind where it should be. So, you know, it’s pretty easy to see what companies are being targeted in the darknet.

Justin: To build on Mark’s point, what’s interesting about information warfare is for it to be useful, you have to at some point make it public. You have to usually increase the value of the data by saying you have the data in some sort of public way. Now, maybe that starts in the darknet or maybe it starts in the public web. But that’s one advantage I guess the good guys have is when somebody has information on you, it’s only valuable when it’s being used publicly. So eventually they will reveal themselves.

Online Identities and Connections

Using social media and darknet data can help can help paint a picture of a cybercriminal or group. How can these data sets and tools. How can you use these data sets and tools in tandem?

Mark: I’ll give you an example. And I’ve referenced this earlier. A few years ago, one of our clients was being subjected to online disinformation campaigns in Latin America that they thought might originate in Russia. And it was actually causing physical attacks on their facilities in Latin America. They asked us to look at that in the darknet and see what we could find out. And this was a threat actor who was who was actually very active on social media in making threats against our client, but also was very active in the darknet. So we started in the darknet and we were able to trace certain activity and certain identities in the darknet, and we pivoted back to social media. We noticed that in the darknet he was using a specific username that was quite unusual, and we pivoted back to social media and started to see if anyone else was using that username in social media. And we did, we found that there was a user using that username on some fairly obscure social media sites.

We then pivoted to those social media sites and as is the case with many social media sites, we were able to identify both an IP address, located in Siberia of all places, and secondly we were able to locate contact details. We then pivoted back to the darknet and said, is this email address that has been identified on this social media site in use anywhere else in the darknet? And we found that that social media site tied directly to one of the darknet accounts that he was using to launch these disinformation attacks on our client. And we pivoted back and forth and back and forth, and we actually finally came up with, believe it or not, a social media post where the actor had not only posted his picture, but we believed in the end that he was actually acting at the behest of the Russian government. Now, that’s a perfect example where identities are in both the darknet and in social media. And to be honest, he was a bit sloppy in doing so. But that’s a hallmark of many criminals is that they can be sloppy and pivoting between. We would not have been able to do that analysis simply using darknet data. We had to pivot to social media and back several times in order to get to the conclusion that this was a Russian threat actor. It was probably acting at the behest of the Russian government in targeting our client.

Justin: I think what’s interesting about this is their job is not that different from most jobs, meaning if you’re going to have an ongoing concern where you’re trying to achieve objectives, then you need to establish an identity that is known in many worlds, right? Just like I’m on LinkedIn, I’m the same person on Facebook, I’m the same person on Instagram. So while they’re a little more opaque than we would be, obviously you still have to be identifiable across these various mediums and that gives a real opportunity for forensic analysis to follow things along that kill chain.

Social Engineering and Phishing Attacks

How does social engineering differ in social media and on the darknet?

Mark: It depends on what the social engineering is being used for. Phishing attacks are usually emails targeting specific individuals or groups of individuals with a view towards attempting to get them to open a data and corrupt their computer and then get access to their network or to the data that’s on their computer. Social engineering refers to broadly identifying those individuals or those targets ahead of time so that those attacks, those phishing attacks can be much more sophisticated. And I’ll give an example. I’ve been subject to social engineering and phishing attacks and a sophisticated attack, an unsophisticated attack. Is somebody sending me an email and saying, hey, you know, click on this article, it’s of interest. It would be of interest to you. A sophisticated attack appears to come from my CFO and says Mark, attached is a file which I need you to urgently look at and call me now.

Now, to get to that latter email, they have done some research on Mark Turnage. They have to know who my CFO is. They have to then build a template that looks as if it’s coming from my CFO. All of that occurs. All of that data is available in the darknet. My email address is available on my darknet. Biographical information about Mark Turnage is available in the darknet. And for most executives, by the way, it’s also available in the surface net. You can go to the DarkOwl website and see who our management team is. It’s very common for companies to post that data. And so pivoting back and forth between the darknet and social media allows the targeting that we are talking about, targeting of executives, targeting of individuals in organizations and in companies to enable criminals to do what they do.

Justin: The thing that scares me, well, Mark, I’m sure you’ve seen this too, is like how little information you need to do social engineering these days. It’s literally like five seconds of audio and you can clone my voice, basically. And Mark and I were talking before this phone call how we have the first, I think, political campaign ever creating somebody else’s voice today for ads, having somebody literally say what they don’t want to say and publishing that on television. So, I think we’re going to live in a world where social engineering and social media is going to be very personalized. To Mark’s point, because we’re all online, we all have identities, and it’s only going to get easier to trick people with more and more realistic content.

Mark: And to use the example that Justin gave, and I think Justin posted it in social media this morning. When you have deep fakes and you can imitate somebody’s voice or somebody’s a video of somebody really well, in an almost undetectable way. The opportunities for phishing attacks grow exponentially because imagine that that example where I get an email from my CFO saying, Mark, I need you to open this file. Imagine that instead of that being an email, it’s a voicemail. It’s or it’s a voicemail attached to an email that sounds exactly like my CFO. The range of the range of potential abuse of that technology is remarkable. I was just amazed, Justin, that the first use of it was a political presidential campaign. That’s the part that was a surprise to me. Not really phishing. It was just politics.

Justin: When we thought we couldn’t go lower, we go a little bit lower.

Reputation Risk

According to a recent report by Deloitte, 87% of executives rate reputational risks as more important than other strategic initiatives. What are your thoughts on that?

Mark: I think if I had to read behind that statistic, I would say I would guess that the reason most executives are worried about reputational risk versus other strategic initiatives is that they don’t control reputational risk to a large degree. Once an attack, say, a misinformation or disinformation attack is mounted on a company and recovering from an attack, a disinformation attack is inherently more difficult than almost anything else. So to Justin’s earlier point, you want to stay ahead of any disinformation attacks. You want to have a plan in place on how to react to them if they do arise. But if you can get early warning signals from social media, from chat rooms, from forums that people are targeting your company or your organization, and it gives you the chance to stay on the front foot as opposed to be on the on the back foot. I mean, am I right about that, Justin?

Justin: I believe so. What’s interesting about that statistic when I read that was in this business, I still remain very optimistic. People are understanding the risks and how they impact their business. I mean, that’s a very impressive number it generates from the C-suite. I believe most of that responsibility was put on the CEO or CFO of the C-suite, meaning they understand that this is a thing they can’t control. The other thing that was embedded in that study that I thought was really important was consumer perception, which reputational risk is kind of like the bigger version of consumer perception. But when it comes to the world of phishing and social engineering, people are really understanding that this is a problem, probably because they’ve seen many of their peers be burned at this point in time and they’re trying to figure out what to do. The big step now is now that we understand, the problem is how do you execute on it? You know, when those people raise their hands, how do Mark and I help them get systems in place that allows them to be protected?

Ethical Concerns and Legal Challenges

What challenges do you both face?

Mark: We at DarkOwl face a set of ethical challenges every day in terms of how do we collect data from the darknet in an ethical manner and make it available to legitimate clients and while respecting the privacy of people whose data has been posted to the darknet. So as an example, we don’t participate in darknet sites where purchase of data is necessary in order to participate because we don’t want to fund the criminal ecosystem. So there are clearly darknet sites that we will not collect data from. What we’re trying to do is return stolen data to its rightful owners or alert them to the threats that are arising from the darknet. So there’s a natural inherent balancing act that we have between privacy concerns, legitimate privacy concerns on one hand and the need to be continuously monitor this environment from which from which many threats arise.

Justin: In our world, we think a lot about the town square and public conversation and how important that is. And I think that when things are in the public square, our biggest ethical concern is not actually on our side. It’s on the people that are providing the public square. So we have major social networks that are creating these environments to have misinformation spread. A lot of other information is spreading as well. But misinformation is also spreading. And I think the thing we’re seeing, the thing that’s concerning to me and my company is that. These large social networks seem to be, as an attempt to save money, get profitable, abdicating their responsibility to moderate this town square. You can’t sell gasoline to a bunch of people and then be upset when everything is on fire. So that is the big concern I’m seeing, is that moderation is going down, which is causing for a rise of disinformation because they’re filling the vacuum that previously wasn’t there.

Kathy: What are the key technological or macro developments in the space to be aware of?

Mark: I mean, you know, everybody is talking about AI, and rightly so, to be honest. If there’s anybody on this webinar who hasn’t been on ChatGPT or any of the look a likes to ChatGPT, I would I would highly encourage you to do so. AI is moving at a very rapid pace and I think critically it will allow, Justin spoke in his introduction earlier, about the noise to signal ratio and the noisiness of data. Both our companies collect so much data that parsing through that noise to get to your particular signal is oftentimes quite challenging, even with the tools that both our companies provide. AI I think will enable investigators and companies to get to that signal much faster and to monitor in a much more comprehensive way. But with all technologies, it’s also used by the criminals. So we were talking about we were talking about deepfakes, but AI can be used in a criminal context as well. So, you know, it’s going to be an interesting challenge going forward to see both how AI is used to protect companies and how AI is used to attack organizations as well.

Justin: I was reading earlier this morning. So they did a study – they think there’s 220 websites, news websites that are just all AI generated at this point in time. So it’s up from like 73 months ago.

It’s like tools go both ways, right? You can create bad content and you can identify bad content. But if we learn the lessons from the previous versions of AI, which was recommendation engines, where the social networks keep generating more and more or surfacing more and more clickable content, which is usually conspiracy based or negative. Well, soon they’re not recommending the content. In that example, they needed a library of content or people creating content, but they’re going to be able to do that on their own in real time and test and then go, oh, this vein is working. Keep going deeper and deeper. That is a massive macro trend that I think is really going to change how we think about information and maybe in a weird way create a rise of journalism again, because we’re gonna need some validation because we can’t trust what’s in our feeds. And then the last one would be the one I just mentioned previously is as this rise of content is happening, social networks seem to be taking a step back from moderation, which again, I think is going to embolden people with ill intent.

Mark: No, I think that’s you know, I think that’s very clear, by the way. Another potential use of AI on the criminal side is if I were going to mount a disinformation campaign on a company or an organization, it can do so using generative AI could very easily generate an extremely professional sounding set of facts that are misinformation or disinformation and can be used in an offensive capability, and you can generate that almost instantly. So to your point earlier, where companies have at the C-suite level have to be cognizant of the risks they’re facing, that’s a massive risk because instead of responding to a disinformation specialist who’s putting out a rumor that your company did X, Y, Z or was involved in X, Y, Z, criminal act or bad act, you could be facing, you know, what looks like a legitimate article with legitimate sounding facts that’s been generated by AI. And then you’re up against a much steeper cliff in terms of responding. So what is interesting is most of these people are opportunistic. They’re taking a misstep and they’re amplifying it. But soon they’re going to be able to or probably today they’re going to be able to create a perceived misstep and amplify that. So you will be under attack from things that you had no connection to. But that won’t change how the consumer perceives you unless you’re very on top of that.


Interested in learning more? Contact DarkOwl and Socialgist!

[Speaking Session Transcription] What is the Darknet and how is it used in Cybercriminal Investigations?

July 12, 2023

Or, watch on YouTube

Have you ever heard of The Onion Router (TOR) ? Have you ever ventured onto the dark web, maybe a forum or a marketplace? Or have you heard of Open-Source Intelligence (OSINT)? Or have you ever been curious to learn more about what it is like to work in cybersecurity?

The American University of Cairo welcomed Richard Hancock from DarkOwl, an experienced cybercrime investigator, on the history and evolution of the darknet, how it is typically accessed, and how the darknet can be used in threat intelligence and cybercrime investigations.

NOTE: Some content has been edited for length and clarity.


Dr. Sherif Aly: Gives me a pleasure to introduce Richard Hancock today, who works for DarkOwl. Richard has a quite a bit of extensive experience in digital forensics and mining the dark web, if I can say. And it’s a good opportunity to hand it over to you to better introduce yourself and what you do.

Richard Hancock: Absolutely. Thanks a lot, Sherif. Thanks for having me and appreciate you guys taking time out of your day to listen to me speak about the darknet and all the cool things I see on there. So, I work for a company called DarkOwl. What we do is we have a user interface, a searchable user interface, that we give to clients that want to search on the darknet in a safe way.

Going into a little bit about my background before we go into what the darknet is and how to use it for cybercriminal investigations. So a little bit about my background. I have over 7 years experience as an open source intelligence investigator. I spent 4 years living in Amman, Jordan and Abu Dhabi as well. So some of the topics that I’ve focused on would be Arabic linguistics, counterterrorism darknet intelligence, social engineering, and cybercrime.

One of the things I focus on right now, my current job title is Darknet Intelligence Analyst and Sales Engineering Team Lead. It’s a really long title, but kind of my everyday. What my everyday looks like is, I start out my day getting onto various darknet forums in marketplaces and I direct our collections team to collect from the most high value content – usually digital fraud goods, counterfeit items that are things that would be of interest to our clients. So spending a lot of time in the darknet, and then also getting on calls and speaking to people to try to get them to pay for our platform. And I also wanted to share some of my other hobbies outside of this work because it is pretty serious work; you have to make sure that you have fun outside of work. After I lived in the Middle East for several years I returned back to Colorado, where I went to college, and I’m really big into backcountry skiing as well as DJing underground parties and house music.

What is the Darknet?

The surface net is what you guys would be most most familiar with; this would be any websites that are indexed by search engines like Google, Bing, etc. The deepnet – that’s just a layer further, it’s still the same sites that you’re accessing through those same search engines. However, you need some sort of credentials, username and password to get on to these sites. It could be Netflix. It could be social media. It could also be some criminal hacking forums that are accessible through the surface net.

The part of the Internet that we really focus on is here in the darknet. So in order to get on the darknet, it’s still technically the same Internet. But you need special software in order to access this hidden layer of the Internet, which is used for anonymous communication, selling drugs, or selling counterfeit items. This is the part of the Internet that we really focus on at DarkOwl.

You primarily access the darknet, using all of these software right here. The one that is most popular would be Tor, also known as the onion router. i2p is also popular and same as Zeronet.

The deep net, as I’ve mentioned, is still the same Internet but it’s accessible through search engines like Google and Bing and it can represent social media websites, Netflix, as well as some underground criminal forums that are not darknet specific. That would be like noel.to which is a hacking forum.

However, something that’s really increasing the last several years would be the rise of direct messaging platforms. So criminals are obviously going to be living on the darknet. They’ll be living on the deep web.

But how are they communicating with each other? Are they just using these marketplaces and forums to talk to each other? Not necessarily. An app that we’re seeing is really on the rise right now is Telegram. And that’s primarily because it’s really really easy to use. The cybercriminal ecosystem on Telegram is absolutely massive these days. Whether it’s right wing extremism, Islamic extremism activities, Russia Ukraine and the invasion, misinformation, and people selling Netflix accounts, etc.

History of the Darknet

Let’s talk a little bit about the history of the darknet. When was it created? The Tor browser was created by the Naval Intelligence Unit in the CIA in the United States, back in 2002. It was originally used as a way for agents to communicate with each other in the field, so primarily in places like Iran or Russia. It was just for military intelligence and communication. Since then it has evolved a little bit. It then evolved for agents to use the Tor browser to communicate with their family members, and then the next step was the Tor board of directors allowing public use of the Tor browser for free speech, for activism, for journalism, and then obviously cybercriminal ecosystems quickly grew on here.

So, going into this a little bit further, Bitcoin was created in 2009, and that’s what really facilitated the emergence of the marketplaces and forums, because it allowed people to buy things and make transactions. And in an anonymous way.

The first really big marketplace was the Silk Road. If you guys are familiar with this, you might know the guy, the founder, Ross Ulbricht. There’s a lot of good movies on Netflix, or documentaries that you could probably find on YouTube about this instance. If you’ve not heard of Ross Ulbricht and Silk Road, highly encourage you to check out that story. It’s quite fascinating. He ended up getting arrested by law enforcement in 2016, which marked the shutdown of the Silk Road. And I actually know some of the people who were involved in that investigation, in the arrest of that individual. As the darknet has continued, we’ve seen an increase of law enforcement presence in the rise of something called honeypots. So that’s when Russ Ulbricht, the Silk Road founder was arrested. At that point, that is when we really saw an increasing presence of law enforcement on darknet marketplaces, forums, etc. It really started with a lot of American-centric law enforcement presence but quickly expanded to other countries. And I will tell you from personal experience, one of the most savvy NATO countries in terms of darknet investigations would definitely be the German Government. The German Government is very skilled with darknet cybercrime.

2020 marked the twentieth year that the darknet has been around. The future of the darknet is really going to be interesting, because we will always see things like Tor. People will probably stick around. But as I mentioned, we’re really seeing an increasing use of chat applications which are not part of the darknet. But let’s say you’re a ransomware actor – you’re definitely going to be using Telegram just like you would those forums and marketplaces, or in the [.] onion sites where you start, where you actually are hosting corporate leaks, databases, and things like that.

Content in the Darknet

There’s a lot of different things on the darknet. Some things that are really popular in the media about the darknet would be drugs or assassins for hire, and while those things definitely exist on there, it’s not very actionable, especially for the kind of clients that we help in the kind of investigations that I am doing. The primary content that we’re seeing is hacking related. So whether that’s somebody that’s developed an exploit for a specific tool, somebody’s leaked source code for a particular company, or maybe somebody’s sharing leaked databases that contain usernames and passwords associated with like admin credentials for a company.

You know, there’s a lot of different things you can see on there: counterfeit items, passports, pilot certificates, cryptocurrency, fraud, credit card fraud is super widespread. And then, as well, as you know, drugs, weapons, there is quite a bit of child exploitation, child pornography material on the darknet as well. Unfortunately.

So pointing out some more additional examples and some of the things that we we are able to collect when we’re crawling from the darknet:

So when we’re crawling information from the darknet, we’re not scraping pictures like this [see image above]. We’re just scraping the raw text. In this specific example, we’re seeing somebody who’s hosted this information on a [.] onion site. I’m not sure how serious this threat was, but they were claiming to be targeting Donald Trump and Mike Pence for an assassination, and they actually included a QR code with a Bitcoin wallet address, and we were able to track that wallet. This is the kind of information that investigators use within our platform and our data to pull on strings and investigate individuals further, because if you’re able to identify a Bitcoin wallet with an individual on the darknet you can search upon that Bitcoin wallet and see where else they might be using it, maybe on Telegram, a marketplace or a forum. As I mentioned, there’s a lot of counterfeit documents being sold on the darknet. During Covid we saw a lot of Covid scams, tons of counterfeit, fake covid documents, vaccinations cards, as well as we see passports, drivers licenses, certificates and other things as well.

I did mention that there is extremism presence in the darknet. When ISIS was starting in 2014, they actually did have quite a big presence on an onion site. However, today we’re not seeing a very big presence of Islamic terrorists, Islamic extremist groups on the darknet itself. However, we do see quite a bit on Telegram. So this specific shot is from a group known as Jerusalem Electronic Army, which is loosely affiliated with the some Hamas cybergroups. And this is issuing out a target for a water sanitation facility in Israel. And these kinds of attacks, cyberactors targeting industrial control systems for critical infrastructure, is definitely something that’s on the rise. We’ve seen that in Russia, Ukraine. We’ve seen it within the United States, and I can tell you from a Federal government level within the United States, we’re putting a lot a lot of money and effort into building coalitions between agencies to monitor these types of things. Sometimes here at DarkOwl, we actually get agents who ask us specific questions about threats to critical infrastructure. So it’s something that’s on the minds of a lot of people these days. As I also mentioned, drugs are really big on the darknet, going back all the way to the beginning of the Silk Road. That’s what it was primarily used for. I would say, again, it’s probably not the most popular part of the darknet these days. Like I said, it’s going to be that hacking information – basically selling data on individuals and corporations.

This specific screenshot is showing AlphaBay Market, which is a really popular market that had temporarily gone offline after a law enforcement seizure, and then did come back online in 2021. This is something that we’ve seen quite a bit in the last 2 years. I know recently 2 marketplaces that have been shut down: Genesis as well as Monopoly market.

Something that a lot of people in my industry are very skeptical of is when a marketplace is offline by law enforcement seizure, whether it’s Interpol or the United Nations, Drug Enforcement, or whatever it is, if that marketplace or forum returns, at a certain point we pretty much consider that to be co-opted by law enforcement. So probably the admin of that site has been arrested, and maybe they’re using that admin for their skills and things like that. But they’re continuing the existence of that market or forum for the primary purpose of collecting information on individuals and surveillance.

I also mentioned credit card fraud, which is really widespread on the darknet. There’s just huge databases out there that people can easily pay for, that include, credit card numbers, bin numbers, as well as the personal identity, the PII, associated with the individuals bank account information. So that’s really widespread in the darknet as well as people who are selling methodologies to target specific banks. Maybe it’s check fraud, wire fraud, all different types of fraud. It’s really widespread not just to sell access to somebody’s credit card information, but actually to sell access to information, how to commit fraud against a bank or a credit card company.

Right here is an example of telecommunications fraud.

This specific example looks like spoof calling in India. This is absolutely widespread. Any company that has a large mobile application user base, eo whether that’s Coinbase, Netflix and those kind of companies are going to be targeted for fraud the most on the darknet. It’s actually, it’s pretty funny. And a lot of the investigations that we’re going through, from a government level, people are always asking about sophisticated nation state actors. But I’ll tell you, the people that I interact with the most on the darknet are really eager, like 15 to 17 year olds that are trying to become hackers. And for a long time people weren’t taking these individuals serious because they’re like, how serious can you take a teenager? Well, I can tell you that most of the fraud of those companies I just mentioned, UberEats and Netflix, etc – that type of fraud is usually perpetrated by teenagers, and it’s quite often these days when their parents aren’t home, a 15 year old, hanging out with their buddies Friday night, rather than you know, maybe 10 years ago, trying to take money from their parents purse, they’ll actually try to steal somebody’s Pizza Hut account on Telegram and get free pizza for the night. So kind of funny, the world that we’re living in today.

So different types of cryptocurrency used on the darknet.

If you want to purchase something in the darknet, be it a legal or illegal item, cryptocurrency is how you purchase that item anonymously. These 6 cryptocurrencies that are most used are: Bitcoin, Monero, ethereum, Zcash, Dash, and litecoin. There are others, for sure, and you will actually see on one of the emerging dark parts of the darknet called Loki – they’ve actually created their own cryptocurrency within their network, which is pretty sweet. Cryptocurrency is the primary vehicle for illegal transactions on the darknet, and as I mentioned, monitoring cryptocurrency and wallet address activity is a really good way to monitor cybercriminal activity. And when we’re dealing with law enforcement, this is one of the primary vectors in the primary information that they’re searching within our platform.

How do we get to the Darknet?

I had mentioned Tor, the onion router. This is this the primary way people get to the darknet and as I mentioned, it was created all the way back in 2002 – I’m sure it looked a little bit different. When you do get on the darknet, you can enter addresses above in the search bar, or you can search for DuckGo. But the thing that you guys need to understand about the darknet is this is a community and you can only find information if you become an active member of the community. So what I’m trying to say is, if you want to search in that search bar, show me the top 10 criminal marketplaces – you’re just not going to get anywhere. If you’re a new threat actor, you’ll start on one site and that’s called Dread. Dread is the reddit equivalent of the darknet. It’s a great place for young hackers to start their journey and to find links to different marketplaces and forums and basically to interact with users who might be vendors selling illegal items on those forums.

Dread is kind of the the starting point if you will. But you need to know what the URL is for, that there might be a way you can use some open source, Google dorking technique on Google to find some links for that. But it’s really a need to know. And yeah, as as you find more and more links, you get deeper and deeper into these communities.

There are other ways to get to the darknet. It is really popular, this actually re-surged in popularity, since the Russian invasion of Ukraine. So there’s a huge, heavy Russian language, cyberactor presence on this site. It’s a lot more difficult to set up than Tor. If you want to set up the Tor browser, you really just need to have pretty basic understanding of setting up virtual machines, manually configuring proxies and downloading the Tor browser, and using burner numbers and things like that. But I2P is a bit more technical in terms of setting up the server. And it’s something that I’m actually trying to learn more this year, because it’s a it’s a part of the darknet that’s been growing recently, especially with the Russian cyber threat actor community.

There are other ways to get there. There’s a lot of different ways to access the darknet. The one I primarily use is gonna always be Tor but also ZeroNet and FreeNet. What you need to know is the darknets evolving and changing constantly. I keep mentioning Loki, and that’s because it’s quite interesting, because they have their own cryptocurrency known as on oxen.

How is Darknet Data used in Cyber Investigations?

So darknet intelligence is just a one component of open source intelligence. Open source intelligence – there’s social media intelligence, there’s private intelligence. There’s a lot of different kinds of intelligence data feeds that investigators use to conduct investigations. Darknet data is really useful to add into the full spectrum of sources that you’re using. So you can make informed decisions to strategic decisions, right? So if somebody’s looking at somebody’s username on a clearnet, maybe on a social media website, maybe they’re using a similar username on a darknet forum or some other tools.

If you guys are interested in open source search techniques, Michael Basil’s book right here. This guy is awesome. He is really, really, really, knowledgeable and he’s got the most extensive book for all the different types of open source intelligence, searching techniques. If you guys are interested in this stuff, highly encourage you to check him out.

And here’s a quick example of basically what I was explaining; searching a username on Google and then eventually leading out to darknet forums. So in this specific example, we found somebody had asked us about an individual who goes by the name of Ninja Shopper. So we first search that on Google and find a YouTube page. And then we were able to find actually a Discord server where this individual has a presence as well as another alias. We found this guy and his sunglasses over here, and his long beard – looks like a pretty typical sitting behind the computer threat actor. We were then able to find a Github account, which gave us more information, which eventually led us to this male avatar for unique sunglasses, and then searching for this username, these 3 usernames, I should say we were able to find this individual’s presence on darknet forums like RaidForums. Some personal information was leaked on RaidForums. So this is just showing you that these are the kinds of investigations that we’re doing all day. Some of the other illegal activity you’ll see out there is cyber espionage, threats to public officials, child abusive materials, wildlife trafficking, domestic extremism, drug trafficking, threat against critical infrastructure, credit card fraud, telecommunications fraud, counterfeit documents, malware, and a lot more.


Interested in learning how DarkOwl can help with your darknet investigations? Contact us.

[Webinar Transcription] Track Your Relative Risk on the Darknet

May 16, 2023

Or, watch on YouTube

With cyberattacks increasingly on the rise, organizations need better intelligence to safeguard themselves, employees and customers from incidents such as data breaches and ransomware attacks. This rise in illicit cyber activity only increases the need to protect against and determine the likelihood of these attacks.

Cue DarkSonar – DarkOwl’s latest product that serves as a relative risk rating that considers the nature, extent and severity of credential leakage on the darknet to provide a company with a signal that acts as a measurement for a company’s exposure.

In this webinar, attendees:

  • Reviewed the latest stats around the growth of cyberattacks
  • Learned why modeling risk is essential for all organizations of any size
  • Learned how DarkSonar can inform threat modeling, third party risk management, and cyber insurance
  • Saw first hand how DarkSonar can potentially predict the likelihood of cyberattacks

For those that would rather read the presentation, we have transcribed it below.

NOTE: Some content has been edited for length and clarity.


Kathy: I’d like to thank everyone for joining today’s webinar, Tracking Relative Cyber Risk on the Darknet. My name is Kathy. I will be your host. If you have any issues with hearing the audio or seeing the slides during the presentation, please feel free to ping me privately in the Zoom chat function or email me directly. Now I’d like to turn it over to today’s speaker, Ramesh, our Chief Technology Officer here at DarkOwl to introduce himself and to begin.

Ramesh: Great, thank you, Kathy. Good morning, good afternoon, good evening, everybody, wherever you are. So today I want to go over some very exciting innovation that we’re doing at DarkOwl as it relates to risk modeling. The topic for today is track your relative risk on the darknet. We’ll go over that in the next 35 to 40 minutes. Just a little bit about myself: I am the CTO at DarkOwl. I have over 25 years of software engineering and technology background, worked in a lot of firms as it relates to data risk mitigation, risk modeling, big data, real-time communications, and so on. So today’s agenda we’re gonna cover what is the darknet or the dark web, and where does dark, we’ll share some metrics and statistics about the cyberattacks and the growth of them over the last several years, what is risk modeling and why is it essential for your company and every organization that you partner with.

We have launched a new product, DarkSonar, which is a very interesting way to notify you about threat vectors and threat modeling, third party risk management, and if you are in the cybersecurity insurance business, this would be a very important topic as to how to quantify risk. And last, we will see some of the case studies and some of the firsthand insights that we have been gathering at DarkOwl on how DarkSonar could improve the likelihood of the prediction aspects as it relates to cyberattacks.

Darkweb 101

Okay, so without further ado, let me get started with what is the darknet. There are a lot of different terms that people throw around – some use darknet or dark web.

Essentially, these are the ones that you see in the bottom. So we go bottom to top. The darknet is a group of anonymized networks. They require proxies, and p2p type networks. They are specifically chosen by threat actors to hide their activity and they’re making a concerted effort to be a part of these networks. So that’s where you see the traditional ones, which are the onion or tor browser, I2P, ZeroNet. And there is a whole host of newer networks that get added every now and then. So that is truly the traditional darknet. But then what we have seen in the recent years is there is also a lot of activity in the deep web and the surface web. So, deep web is defined as anything that is behind an authentication wall, meaning anything that is behind a user ID and a password.

So that is where you would have things such as social media, your banking applications, but more importantly, as it relates to the darknet, a lot of the threat actors use the deep web for criminal forums and marketplaces, which we will talk more in detail, as well as any surface web links that you see that are only available once you have membership or credentials to access them. On the right hand side, you see a lot of chat platforms, for example, Discord and Telegram are very much in the news these days because that is where quite a lot of activity as it relates to darknet happens, whether it is exposing breaches or it is critical conversations, marketplaces, and so on. So chat platforms have grown in priority overall. Then the last, but not the least is what we all know as the surface web, which is everything that is indexed by search engines such as Google and Bing.

For the surface web, our focus at DarkOwl is more the high risk surface web, which is not just any webpage out there, but specifically, websites, domains, platforms that people use to collaborate, such as paste sites where you paste images or file sharing sites, discussion boards, GitHub is a big part of it. So the way it is color coded here is the ones that are in, um, the oranges and reds – they are all part of our current collection capability. The ones that are in deeper gray or black are the ones that we are having plans to go after and collect data. So that’s the picture you see on the right hand side. It truly is like an iceberg. We look at the world as what you see on the surface web – it is a very small sliver of the actual data and more interesting data and criminal activity happens on the deep web and the darknet.

Kathy: We have a question that has come in – how big is the dark net?

Ramesh: Great question. There’s no easy answer, but I would say that, a good one fourth of the web, about 25% is behind some form of user ID/password, and a subset of that would be darknet. So it’s really hard to quantify how much data there is in the darknet, but what I can tell you is as far as DarkOwl, we have over two 50 terabytes of data, which is specifically going after the deep and the dark web. So it’s quite extensive, but it is really hard to quantify as a percentage of the overall web.

DarkOwl by the Numbers

The next slide is little bit more about the metrics and numbers that we collect.

So at DarkOwl we have quite a lot of data that we’re actively collecting and we are essentially a data company, which is why we have several million documents. We have forums and marketplaces that are out there, and then we also have the Tor, I2P, ZeroNet, Telegram channels, and whatnot. So, given that, what I want to specifically draw your attention to is the amount of data that we have in terms of growth is clearly on the chat platforms side which which are pretty active as they contribute quite a lot of what we call entity data, which is the stuff that you see in the bottom: email addresses, IP addresses, credit card numbers, crypto wallets and crypto addresses and so on.

On any given day, there is between 1 to 3 million documents that we collect in a 25 hour period, and it’s a combination of crawling various sites and platforms and so on, as well as processing leaks. And leaks continue to keep increasing exponentially ever since the Ukraine conflict. So the bottom line here is the data is a whole variety of data. It is very disparate, and it all has to be collected from multiple places. We normalize it into a common data structure, and then we make it available. As far as the delivery channels, which is what you see on the right hand side, you could use our UI, which we call Vision UI. There’s a whole host of API endpoints that we make available for our customers.

You can search in the data, you can pull out entities and the recent docs, you can consume DarkSonar via API, same thing with ransomware. And if you have a customer or if you have a use case that would want our data, then we’re happy to license the data via data fees. So there’s a variety of mechanisms for you to consume all of this data that we’re collecting, curating, and making available to our customers. Okay.

Specifically this slide talks about what is in our data. So here are a few examples. So the high level way to look at it is the top left versus the bottom right, wherein the top left is the traditional things that you go after the clearnet, or even the Tor and the Onion browser.

Versus the bottom right is there where there is more and more need for us to have personas to do the authenticated access into the deep web services. So the things that you see up there, they’re just various categories of data that we have. So we go after crypto and pay sites, and darknet classifieds and blogs and ransomware gangs and so on. The authenticated ones include marketplaces, carding, going to chat rooms, ransomware gangs, social media, any discussion forms. I do wanna make sure that there’s a clear understanding – we do not touch any pornography content, or we call it CSA, which is child sexually explicit and adult material. That is not our focus. We do not want to be collecting images that are highly objectionable and criminal in nature.

Everything else, in terms of all the topics that I mentioned, we’re actively collecting. So some of the stats that you see up there, we have over 232 ransomware domains. We actively monitor 400, almost 500 marketplaces. Believe it or not, we not only have English, we have 51 other languages that we see in the darknet with Russian and Mandarin being at the top not surprisingly. So that’s kind of the diversity of data and how dispersed the networks are.

Cybercrime is Booming

Moving on, we all know that all of this data collection is there for a reason – because crime and cyber crime continues to keep booming and growing exponentially.

I’m not gonna read every one of these data points, but I think we all can agree that post Covid, the attack vector because of people working from home, the home networks are not as robust as corporate networks. So that just significantly increases the attack surface. The Russian Ukraine conflict has exploded, not just the Russian and the Ukraine side of the war and the data leaks that are out there between both parties, but it is truly a third party risk issue where every company who is dealing with a vendor who is in that part of the world is impacted one way or the other. And so we at DarkOwl keep seeing that this continues to grow, and customers and companies out there are struggling with the amount of alerts that they’re being subjected to their SOC teams and the XDR platforms and so on.

DarkOwl provides a more asymmetric and a unique insight that you don’t get from your traditional corporate security processes and procedures. So again, data is growing, crime is increasing. And we also see that ransomware gangs are becoming very sophisticated. They offer customer service and that’s why the term ransomware-as-a-service is in the vernacular these days, because it is a truly massive problem that we’re all being subjected to.

Kathy: We have had a couple of questions come in. How do you know when a company is being targeted on the dark web?

Ramesh: It’s a good question. There’s multiple things going on in the dark web. So one of the ways that a company can pay attention is, look into the darknet. For example, if you’re using our product or our Vision UI, you can set up an alert, which is basically a way to monitor your company domain and subdomains. And anytime there is any activity about your company, be it a conversation that is happening in a forum, or be it a marketplace where something is being sold, either your company credentials or your AWS keys or what have you, it’s always a good idea to set up these monitors so that you can pay attention to what’s going on. The other is, obviously we’re going to cover the DarkSonar, which is a numeric objective way to see what your risk tolerance levels are over time. You may be thinking you have really good security policies and practices, but it is super important for you to also look at products such as DarkSonar so that you know that you are either at or below the baseline of security and compliance that you should be at.

Kathy: Don’t threat actors only target larger companies?

Ramesh: You know, conventional wisdom is that threat actors would go after bigger companies that are much bigger in revenue, they have a bigger wallet and whatnot. However, you’d be surprised, there is a lot of targeting that happens with small businesses, with smaller educational institutions, from counties to hospitals to you name it, because the threat actor or the criminal is looking at two angles. One is, how much money can I make? And the other is, how little effort do I need to put? So a lot of the companies, the larger ones have gotten pretty sophisticated. So there needs to be a level of sophistication for the criminal to organize themselves and attack, versus there’s lots of much easier, smaller targets to go after. So I’d say the answer is, it really is all of the above. They go after the large ones. They also go after the small ones.

Risk Modeling

Okay. So let’s move on to what is risk modeling. Now, there’s lots of frameworks as it relates to risk modeling.

We’ve all heard of the NIST, which is the largest one in terms of a governing body that defines risk models, but there’s also other modeling tools available, such as ISO, CIS, ISACA, OWASP and so on. Depending on your company, depending on your needs, it would be good for you guys to pick your risk modeling strategy and a framework and then out of that framework, you also need to really pay attention to who are the stakeholders. Like, how do you want to make sure that between your SOC, your data protection folks, cyber governance, CISOs, if you’re in the cyber underwriting space, insurance brokers, underwriters, if you’re a startup, let’s say you might have VCs, investors, M&A things going on if you’re national security or a public government organization, the policy makers, any military operational decision makers.

It all depends on the type of stakeholders that you need to keep in mind as you build your risk modeling practice, right? So all of these type of assessments are, at end of the day, they’re defined by NIST, and these are to identify, estimate and prioritize the risk associated with your organization. So I’d highly encourage folks to take a look at these standards because they all try to achieve the same thing, which is be holistic, have a 360 view of your risk, rather than just pull in a hodgepodge of tools, to figure out what’s going on at any given point of time. So that’s kind of the risk modeling and the people that need to be be involved. And why does that matter? Because ultimately, when we talk about the darknet as it relates to ransomware just in and of itself is getting more and more sophisticated.

Ransomware

As I mentioned, there is a whole piece of the industry called ransomware as a service.

It starts with the threat signal, and you see the data flow associated with that. There is quite a bit of a lifecycle that that is involved when it comes to ransomware, and we’ve been watching the various ransomware groups and what we have seen is prior to ever executing a ransomware attack, the reconnaissance occurs either by members of the ransomware group or by a broker, the IAB. And this appears in tokenized mentions of the critical network data of your company. It could be credentials, it could even be mentions of your employees that would be targeted for social engineering. So on these forums, the threat actors are also discussing things like the common vulnerabilities like the CSV, and find ways to exploit and come up with techniques to exploit them.

They also come up with techniques to break your antivirus, evasion campaigns and so on. All of these are ways in which they’re trying to poke holes into your network, and either they do it directly or through these brokers, and then we kind of capture that as the dwell time, right? So dwell times, once they are in the network, they are gonna start poking things around. And then there is advanced operations that could take days, or it could even be done in a matter of hours. And these threat actors use the traditional Mitre attack techniques. And then once they’re in the network, they’re laterally moving and they’re elevating their privileges one step at a time, and they get more and more access into your network, and they exfiltrate more and more valuable data. So the key thing is persistence.

The level of persistence and hiding they do is very phenomenal. I mean, it’s like, it’s beyond professional. They cover their tracks, they know what they’re doing, and once the data has been removed or stolen from your network, the devices are encrypted. Then they go into the payment cycle where they’re starting to get the extortion payments that they’re demanding. So as part of that lifecycle, what we see is the announcements then go on Tor or whatever data source. They’re advertising the fact that a company has been breached, and then the data is stolen and all of the subsequent PR and all the other challenges to the business. So even though the data is not shared immediately as a data leak, it’s typically repackaged, shared and curated by the threat actors because they want to find the takers – how important is that data breach for that business?

And then they notify the suppliers, the vendors, possibly customers, any contractors, and they keep capturing more and more of the attention that this company they have targeted, they’ve been successful in targeting and exfiltrating, and now they’re looking for ransom, which means the temperature of the company that was a victim keeps going up, that they better pay the ransom amount, otherwise this keeps getting published to their customers and their partners, and it just keeps getting worse, right? So it is kind of like the threat signal always starts with somebody that has gotten access to your network, and then they’re raising their privileges, they’re grabbing the data, they’re publishing that, and then they’re collecting ransom. So given that lifecycle, and there was quite a lot of words there, but the bottom line is these attacks are on the increase.

They are on the increase globally, not just the US and UK but most of the Western regions where we can track them. A lot of the world is being subject to this, and there is also a need for a critical understanding of what are the motivators of these criminals and why are they doing what they’re doing? So understanding such type of risks is not just a nice to have, it’s a must have for any organization, large or small, and be prepared for these type of potential threats. So the takeaway here is be sure that you have a risk mitigation strategy. Look at some of these networks and protocols for risk modeling and truly understand what you and your company could be subject to as part of ransomware and the sheer fact that cyberattacks are on the increase.

DarkSonar API

Now, having said all that, what we’ve been busy in DarkOwl is building a product called DarkSonar. DarkSonar is to address some of the challenges that we have seen from our perspective. Essentially DarkSonar, we like to think of it as a signal. The signal is to inform threat modeling, third party risk assessment. It applies for cyber insurance, anything to potentially predict the likelihood of attacks. In other words, DarkSonar is a cyber risk rating. It is based on an algorithm that measures an organization’s credential exposure, primarily email password exposure over time. So it’s not just a one shot snapshot, we’re monitoring the health of your business and the credentials over time. And because emails are primarily leaked and sold in the darknet, they constitute a major vector for cyber and ransomware attacks. And we measure such exposure on an ongoing basis with DarkSonar.

This enables the organization’s customers and third party risk management folks to get an awareness and understanding of what your weaknesses are, what are your soft spots are, and you could be proactive in taking these mitigation steps rather than find out that it’s too late and you’re being subject to a ransomware attack. So it would be a mitigation step to prevent data theft, to prevent loss to your revenue, to your profits, loss of reputation, because at the time of ransomware, usually it is too late. So what we did is, as part of building the DarkSonar, we did an analysis of over 250 companies, well known companies to lesser known ones that suffered these cyberattacks. And we saw that in 65% to 75% of the cases, when we saw an elevated rating, it was having a direct correlation to a few months after an elevated rating.

There was an attack, and I repeat it is in 65% to 75% of the cases, we see a direct correlation that elevated risk rating equals elevated chances of an attack happening. So that was pretty powerful. And here is a little bit more breakdown of the data, the type of data that DarkSonar uses – so credentials, as I said, is emails and passwords, aka combos. We do see, not surprisingly, there’s quite a bit of plaintext passwords that we see in our collection efforts in our database. So that’s the big part of the pie that you see along with there are hashed passwords, and there are some where we get the email, but we don’t get a password, right?

So the way the DarkSonar model is built is, it’s primarily based on credentials. But we have a waterfall approach in the way we have designed the model. So first up is there is weightage given based on presence of emails of your company, meaning email of the domain that is in question. So they are unique plaintext passwords or hashed passwords, or just the sheer number of emails that we see. So that is weighted. The other thing that we also weight is the time and the time series. Did we get a breach recently which contributed to these emails appearing, or was it happening six months ago or nine months ago? So the older the data is, the less it is weighed in our algorithm. And we also consider duplication.

Duplication is kind of a vast topic. I technically call it correlation, but essentially is the data leak being reposted with the exact same details, in which case it’s a duplicate, or is this being reposted with additional data? Some of it is similar, meaning they’re correlated to the previous leak, but a lot of it is new information. But one way or the other, the sheer fact that threat actors are reposting, your company or your organization’s leaks over and over again is cause for concern. So our model accommodates the fact that there are these are things that are weighted both based on time as well as the number of times it gets posted, the duplication ratio, and then the baseline metrics we provide is based on the overall volume. So our API through which DarkSonar is available will give you data for the past 24 months, and it gives a relative risk rating for the organization in terms of the distance to the mean.

It’s like the bell curve that’s displayed here, you would start with zero, which is right in the middle. If it is in the negative, that means it is good. Meaning there is not that much exposure. If it is on the positive, anything that is greater than or equal to one, it means there is a cause for concern. So, one more time, back to what I just mentioned. Our results show that elevated exposure, meaning if DarkSonar were to say that the exposure is greater than one, an elevated exposure and the sustained elevated exposure over the last four months is a direct indicator that there could be a possibility of an attack in 74% of the cases. So that for us was very powerful. Any questions on this so far?

Kathy: Does DarkSonar distinguish what the username/password combos are used for?

Ramesh: The short answer is we do not distinguish at a per user username password basis, but we do collect the aggregate of all the usernames that are being exposed, specifically the emails, but not the username per se. We’re mostly focused on email and passwords.

Full statistics and chart can be found here.

One thing is DarkSonar is a good indicator of risk. I do want to highlight some of the threat factors here and what should be applied in which scenario. So if you’re looking for phishing emails, for example, and that there is quite a lot of phishing attacks, then DarkSonar would be a really good tool for us to assess. Same thing with third party risk management, third party supply chain, DarkSonar would fit pretty well, any weak or compromised credentials.

If you have any compromised credentials, then that would be directly visible in DarkSonar. However, there are things like brute force attacks, unpatched vulnerabilities, cross-site scripting, man in the middle attacks, right? These are not exactly things that are involving emails and passwords all the time, but our platform, which is the Vision UI platform, as well as the API endpoints and the entities that I talked about, these would all help in understanding such type of threat factors like the brute force or the unpatched vulnerabilities, the cross-site scripting, the man in the middle, DNS poisoning and so on. So think of it as using the right tool for the right job. It depends on what threat vectors you’re interested in. Some of these threat vectors DarkSonar would apply, and other threat vectors, you might be better off using our Vision UI or our Search API or entity lookups and so on.

Okay, so now comes kind of the interesting part. So all that theoretical risk model, what does that mean for companies? So I have some use cases and companies as examples to kind of walk you guys through.

So here is the famous Colonial Pipeline incident that happened in April of 2021.

So Colonial Pipeline is one of the largest fuel pipeline and its breach literally had created shortages for oil and gas up and down the East coast. And this was a result of compromised passwords.

The, the interesting thing is we saw an elevated level of DarkSonar. As you could see, it was kind of hovering in the negative zone, which is good. And then in September of 2020, we’re starting to see the increase and the elevated risk, and then it became 0.5 and then decline to one. Anything above one, like I mentioned, it definitely has a clear indicator of risk, and that is where in from our data, we saw that a month prior to the attack, which is back in April of 2021, we were seeing that elevated risk. And then in May, the attack happened, right? So DarkSonar was able to detect based on these credentials, which are easy to do, the account takeover and instigate that attack. So that was on the Colonial Pipeline case. The next one is General Motors.

General Motors, same thing. We see a three month window where there was an elevated signal to the time that the attack was announced. So again, part of the challenge is when big companies, and you know, big outlets have this challenge, it becomes a media issue. Many of the companies do not report it. They try to pay up the ransom or negotiate with the criminals, whatever they’re doing on the backend. It may or may not be in the news, but we capture what we gathered from General Motors from the time that they had announced, which is April of 2022. When we go back in time and look at March and February and January, there was a clear elevated risk. So our DarkSonar model detected an abnormal increase in the plaintext and hashed credentials, literally months leading up to the attack. The next one is Fujifilm.

Same type of thing where their servers were infected with ransomware and nobody would ever know when the exact ransomware was launched and what exactly happened. But according to the bot ransomware, it came through a phishing attack. And clearly the takeaway is we detected an increased email exposure prior to the actual attack happening.

The last one that I would say is back to the question, that was asked earlier about smaller companies – you’re still very much vulnerable for these type of attacks. And in the City of Tulsa’s use-case, we saw a five month attack window of elevated risk as it relates to DarkSonar. The signal was elevated for five months prior to the attack. So the reasons were really the group installed ransomware in late April, the program began to operate the city firewall and other security protocols were kicked into the city’s technology department. They took their time, but the bottom line is this was months in planning by the criminals. And we see that elevated risk as far as DarkSonar literally five months prior to the attack.

Kathy: Can you answer, what is the likelihood of a breach if the signal goes above one?

Ramesh: Anything above one, there is an increased exposure. An increased exposure would correlate to increased risk. An increased risk would correlate to, there’s much more chance of a breach. So I would say anything over one, companies need to be really, really careful. Pay attention, take the proactive steps, rotate your passwords, put in the multi-factor authentication on your servers, whatever you are doing. As for security operations and proactive things y you should, you should be careful, right? Does that mean anything below one is fine and dandy – I would say look at DarkSonar as another way and another tool in your tool chest. And if it is over one, that means the temperature is going up, right? And if the temperature goes up and it keeps going up and up and up, bad things happen. So that’s the best response I would give. Is anything over one, you better watch out.

Okay. As I mentioned, here is a little bit of technical detail on how the docs owner API is represented.

We give you the results based on a company domain and we give it to you a JSON format. And like I mentioned, we present that data for the last 24 months, and we give you the rating as well as the baseline and the signal we will indicate if it is low or elevated or high, right? And to Kathy’s earlier point, anything about one would be elevated. So if you are in the low category, that’s good, you have good security best practices. If you’re one or above, it’s time for you to pay close attention to what can you and your company do to mitigate these type of risks.

So again, it’s predominantly available via API, you can hit individual domains or you could hit multiple domains at the same time. It’s up to you. And then the results, like I mentioned, is it’s really, we did the internal analysis for the 237 publicly disclosed attacks between the last couple of years, 2021 and 2022. We see the accuracy is very strong. We were actually surprised it was this strong a correlation between the risk level and the attacks. So all attacks was 74%, ransomware was 75%, breaches was 74%. So it’s tracking pretty closely to a very high percentage accuracy for the elevated risk versus the attack. And then we also went to some of our customers. We went to some of our prospects and we call it the beta clients. And we did a pretty extensive evals on the attacks. And we see that there is a pretty strong correlation there as well.


Interested in learning how DarkSonar can help alert for potential threats to your organization? Contact us.

[Webinar Transcription] The State of Secrets Sprawl 2023 Revealed

April 05, 2023

Or, watch on YouTube

In 2022, GitGuardian scanned a staggering 1.027 billion GitHub commits! How many secrets do you think they found?

This webinar details the findings of The State of Secrets Sprawl from GitGuardian, the most extensive analysis of secrets exposed in GitHub and beyond! Speakers Mackenzie Jackson, Security Advocate at GitGuardian, Eric Fourrier, Co-founder and CEO of GitGuardian, Mark Turnage, Co-founder and CEO of DarkOwl, and Philippe Caturegli, Chief Hacking Officer of Netragard, dive into the leaks in public GitHub repos, trends such as Infrastructure-as-Code, AI/ChatGPT mentions, and even investigate how leaked secrets move from GitHub to be sold on the deep and dark web.

Check out the recording or transcription to see the most significant trends observed in 2022, what to make of them for the future of developer security and get some practical tips on effectively managing and protecting your secrets.

For those that would rather read the presentation, we have transcribed it below.

NOTE: Some content has been edited for length and clarity.


Mackenzie: Hello everyone. I’m very excited to be with you all today. Today is all about our State of Secret Sprawl report. I’m going to present some high level findings that we have in the report. Then I’m excited to say that our CEO is here with us today. He’s going to be joining us and he’s going to answer some questions rounding with the facts and how we found what we found in the report. Then we’ve got the CEO of DarkOwl, another fantastic company. We’re going to be talking about secrets on the dark web and other areas of the dark web. You may notice that DarkOwl participated in our report if you’ve read it this year – so we’ve got some more facts than just from GitGuardian. And then finally, we have a hacker with us to give us the hacking perspective. We have Philippe, who is from Netragard, and Philippe is the Chief Hacking Officer. Netragard is a company that does lots of services, but one of their services is pen testing. So Philippe gets paid to hack into systems and he’s gonna tell us how he finds and uses secrets to hack into everything.

Report Findings

Let’s get straight into it. What are secrets? What are we talking about?

So, secrets are digital authentication credentials. It’s a fancy word for saying things like API keys, other credential peers, like your database credentials or your unit username and password, security certificates. There’s a bunch more, but these are kind of the crux of what we’re talking about. These are what we use in software to be able to authenticate ourselves, to be able to ingest data, to decrypt data, to be able to access different systems. So these are our crown jewels. What we’re talking about today is how these leak out from our control into the public and into our other infrastructure.

So what did we find in our report? So the State of Secrets Sprawl is a report that we have been doing since 2021 and it outlines essentially what GitGuardian has found throughout the previous year of scanning for secrets.

One of the main areas GitGuardian looks for secrets is on public GitHub repositories. GitHub is a pretty massive platform. There’s millions and millions of developers on GitHub and billions of code, billions of lines of code and billions of commits that get added every single year into this huge data of source code. We scan all of it every single year to actually uncover how much sensitive information is being leaked on GitHub. And we also have some statistics about other areas, but we’ll start off with what we find in GitHub.

So last year we scanned over 1 billion commits throughout the entire year of 2022. So a commit is a contribution of code to a public repository in GitHub. That’s what we’re classing as a commit. If you’re not familiar with the terminology, you can think of it like uploading code. This happened a billion times last year in 2022. So that’s a huge amount of developers. There’s 94 million developers on GitHub and 85 million new repositories. So the numbers on GitHub are pretty astounding. HCL Hashicorp Configuration Language is the fastest growing language on GitHub. This is interesting because this is an infrastructure as code language. So this is actually kind of bringing about infrastructure as code brings about new types of secrets.

We have released our report, so some of you may have already seen this, but we found 10 million secrets in public GitHub last year. This is an absolutely huge number. So what we’re talking about those API keys, so it’s credentials, we found 10 million of them in public. So it’s a pretty astounding number. And we’re going to break down exactly what we found in this report. But essentially what you need to know is that this increased by about 67%, and this is pretty alarming because the increase in volume rose by about 20% last year. So the volume went up by 20%, but we still found much more than 20% extra secrets this year. Last year we found 6 million. Now the only area that may explain some of it is that we expanded our detection, but not nearly enough by this amount. So it shows that the problem is really growing, which is quite alarming.

So this is the kind of evolution that we found.

And there’s a couple of things on that really stand out for me. The number one thing for me is the “1 in 10” that you see at the right. What does that mean? So there was 13 million unique authors that committed code last year. 13 million developers pushed code publicly last year. So if you’re wondering why this is so far off the 94 million at GitHub claim, that’s because not all users push code actively and then push code publicly. They may be pushing code privately, but we are just talking about public contributions. Public commits 1 in 10 lead to secret, 1 out of 10 developers that push code publicly lead to secrets. To me, this is the most alarming statistic this can finally put to bed – that it’s not just junior developers doing this. And there’s other evidence that we have around that. So this shows that it’s a really big problem and something that’s going to happen to a lot of us.

We also can see that about five and a half commits out of a thousand exposed at least one secret. And so this is the biggest oranges to oranges comparison that we had to last year. And it showed that number increased by 50%. So the total number increased by 60%, but an oranges to oranges basis, it’s increased by about 50%. So pretty alarming statistics.

Now this is a slide here. What countries leaked the most secrets?

This slide to me, doesn’t show what it appears to show – this slide doesn’t really show that India is the worst country for leaking secrets. This slide shows that probably India, China and the US have the biggest populations and they’ve got strong developer bases. So I think we can take this with a grain of salt. We can see that this is actually in line with what we’ll see with large engineering populations. If you’re wondering why China isn’t number one based on just that, that’ll be the largest population; there’s GitHub alternatives in China that are commonly used. So that’s probably explains that. So this more shows the frequency of use.

What type of secret leaks the most? The largest leaker is data storage keys. Next on the list is cloud provider keys, then messaging systems, and then private keys.

So we have specific detectors and generic detectors. So a specific detector is like for a cloud provider would be like AWS GCP. And then we also have detectors that catch what’s left over, which we know that this is a secret, but we don’t know what exactly it is for. This will be like a username and password. So we don’t know what system this username and password actually gives access to, we’re confident that it’s real, but we don’t have the additional information from that. And so we have different types of generic detectors. So generic password is a number one generic interview string, but we also have different types like usernames and passwords coming in at 2.8%. So pretty big jumps in there.

What name of file would commonly leak secrets? The biggest leaker that we have is env files. This is the most sensitive file and this is one that can be prevented easily with a .getignore file. We shouldn’t be letting env files in our repositories. And if we are looking at unique detectors, the number one detector that we find is the Google API keys. Next to that we have RSA private keys, generic private keys, cloud keys, Postgres, SQL and then we also have GitHub access tokens.

Secrets with Eric Fourier, CEO and Co-Founder, GitGuardian

Mackenzie: Eric, I’m gonna dive straight into some questions here. One of the things that I know a lot of people are interested in is how did get GitGuardian start and why did you start scanning public GitHub for secrets and other areas? How did this all kind of come about?

Eric: Yes, it’s a great question. I’m a former engineer and data scientist. So my background is more data science and data engineering. You can see it in the report – we share a lot of data analyzed, tons of data. As a data scientist that was used to work a lot with teams of data centers, and we use a lot of the cloud and the cloud’s keys to connect to the service, to be able to manipulate data and provide statistics on it. And actually in my time there, I was like, uh, really? I really saw the problem of credential leaks for example, AWS Secrets in Jupyter Notebook just to connect to your pipelines. And I was like seeing a lot of credential leaks and we said that essentially this is definitely an issue and could we train some algorithm and try some models to resolve this issue at scale? GitHub was actually a fantastic database of source code where I could train this model to detect secrets. And it started just like, I would say as a simple side project to see what we could find on GitHub. And it started by just analyzing the full realtime flow of commits on GitHub, starting with a few detectors with AWS and Twilio at the time and the first model built to find 300, 400 secrets a day.

Now you can see it’s way more, it’s more like 4,000 – 5,000 a day. After that everything went really quickly, we released pro-bono alerting. So this idea of, at each time we were able to find the key on GitHub, we send an alert to the developer saying, you basically click the secret here on public git. And after like, we received really good feedback from the community, created a free application for the developers and after monetize with product for enterprise. We continue with this product-led growth approach and trying to help the developer to provide secure code and start it this way and continuing on this path.

Mackenzie: Let’s talk about the report a little bit more. What has led to this increase in secrets leaking? We’re seeing it every year, and it’s not by like a marginal amount where it could be some small factors. Do you have any ideas or insights into why we are seeing this problem persist and keep going?

Eric: Yeah, it’s a combination of multiple elements. First, a few that’s highlighted in the report is there are more and more developers on GitHub. So we are scanning more and more commits. We have analyzed and scanned 20% more commits this year than last year. But as you said, it’s not just increase of the commits we are scanning – it cannot explain the number of sequences we’re finding. So on the other side, we’re also improving our sequence detection engine, meaning we are adding new detectors or detecting new types of sequences, but also improving our existing detectors. So our ability to detect sequences and trying to keep the precision with high meaning, not detecting too many false positives. So we always try to, especially on public data, keep a precision rate of 70%. I would say it’s really important in all security products to not flood the security team with too many false positives, because after, they just don’t look at the alerts anymore.

I will say, the third point is, even with that, the problem is not going away. You can find multiple ways to explain it; more and more developers on the market that don’t know Git, so need to learn. You can see in the report that there is no obvious correlation between seniority and the amount of sequence leak, but still, it can be the growth of the developers and the growth of junior developers can also explain why secrets are still leaking. I would say the issue is definitely not solved on the public side and on the internal side.

Mackenzie: Being completely honest, I expected the number to remain the same. I was even slightly expecting it to go down this year because there are some initiatives and the problem we’ve kind of become a bit more aware of it. So I was quite surprised to see that actually, we took another big jump up. So for you, was there anything that stood out in the report when it all got compiled that was surprising to you? You’ve been scanning public GitHub for longer than any of us, so does anything surprise you at this point? Or was there something in the report this year that was, that still continues to surprise you seven years on?

Eric: I really like the fun facts. I’m always amazed by the correlation of the number of secrets we find, the number of secrets leaked and the popularity of API vendors and providers that we had. We have this really different statistic with open API key to connect program fit to chatGPT that went from, we were like finding maybe 100 a week in early 2022, and now we are finding more than 3,000 secrets a week. So it jumped 30 times more than one year ago. I think it’s just past like Google API key, and you can see it, it’s really correlated with a trend of open AI right now with developers and in the tech in general.

The other thing, the number of leaks actually that are correlated to the user of a secret. So I think it’s amazing to see that in a lot of the past leaks; Okta, I should look at Okta, Slack and all the ones that are in the report, it’s at some point, it’s secret is not the starting point of the attack, but at some point a hacker is able to find a secret to leverage the attack and do lateral movement.

I’m also amazed by some vendors that when they have those code leaks, just try to minimize the incident. And for us as scanning the open source code, we definitely know that if a company is leaking its source code, they will also leak secrets, and so they will leak PII. So just declaring that leaking source codes is not bad because it’s not confidential information is a little bit, I will say maybe naive. It just shows that we have still a lot of education to do.

Mackenzie: That segues me into my final question for you. You talked a little bit about education this year. This is a two part question. What can we actually do about leaked secrets from an organization, so what can an organization do? And two, what can we as a community at whole, what can we as developers and community, what can we do to try and keep this problem, well prevent it from getting worse and potentially maybe one year, the number of secrets going down even?

Eric: So I think now we, especially if it’s publicly leaked on Git, when you reach the certain size of developers, what was a probability of leaking secrets becomes, I will say more of a certitude. So it means you will leak secrets and you are just waiting for it to happen. So you definitely need to put some mitigation in place, and especially after, I will say internal, when you look at more secrets in internal repos, we find way more secrets in internal repos than public repo. It’s a big challenge for our companies- the remediation, so how we are able to detect all the secrets, but after how you remove it from source code. I think in the industry and technology point of view, it’s really interesting. There is some stuff happening right now with, especially for passwords, trying to replace passwords pass keys. These initiatives are great, but they would take years even dozens of years probably. And it’s more like design for password identification than API keys and machine to machine identification. So, I will say API keys and Sequel 12, you have the rise of sequence managers, like Vault are trying to push protectionary measures such as dynamic key rotation, which are great.

To generate these dynamic tokens, you need long lift tokens that, we all know that developers hate regenerating token and generating this short lift token, and they prefer to use long lift token and install them once in their environment. So there are solutions, really promising on this side with them on the technology side. So on the detection side, I think a lot of work has been done. Looking at some API providers, they have been like doing some rework to prefix the key so it’s easier for a detection company like us or the vendors to detect them. It can be actually a little bit also controversial because it means that it’s also easier for attacker to detect them.

So maybe it could be interesting to work on some other way to do it. Maybe like signatures that are only known by different teams and stuff like that. I think there is definitely some innovation, but still can improve. I think we can do better. I think the big focus now, you have a lot detection is becoming more and more performant on the type. And it’s really, I will say a big, big target or goal for companies is prevention and remediation. So how I make sure that there is no more secrets entering in my code base and that the historic secrets get removed and I achieved this zero secrets in code. And yet it is a big challenge here is how to do mitigation at scale.

So you can use shift left and pre-commit for developers, educating developers, and really try, especially for large companies to remediate at scale. And it’s a big challenge, and we have seen it with our customers and it’s definitely something we are pushing for. It’s really this maturity for us as a vendor has shifted from being able to detect, and it’s always really important to be able to detect secrets. But now it’s really how we can remove all these secrets from the code base and make sure that we have no more secrets in code.

Mackenzie: It’s interesting you’re talking about the problem shifting from it being difficult to detect them and now it’s difficult to remediate them.

Eric, I am going to thank you so much for joining us now.

So we are gonna move on to our next speaker now. We have Mark Turnage, who is the CEO of DarkOwl.

Eric: Thank you. Very nice to be here. Thanks for having me, Mackenzie.

Role of the Darknet in Secrets with Mark Turnage, CEO and Co-Founder, DarkOwl

Mackenzie: Mark, can you tell me a little bit about DarkOwl as an organization and how you fit into this discussion today?

Mark: DarkOwl’s about five years old. We are a company that extracts data at scale from the darknets, and I use darknets as a plural. I’ll come on to that. The reason we do that is, we’ve accumulated what’s probably the world’s largest archive of darknet data that’s commercially available now. Why is it important for someone to do that? It’s important because many of the secrets that we’re discussing in this report and that we are discussing here today are available for sale or for trade, or oftentimes just for free in the darknet. And any organization trying to assess risk and trying to assess where their risk lies, has to have eyes on the darknet, across the darknet to be able to see where their exposure might be. And we provide that for our clients. Our clients include many of the world’s largest cybersecurity companies, as well as governments who are monitoring the darknet for criminal activity.

As you can see on this slide, we provide that data through a number of different means to our clients.

Mackenzie: You said darkwebs, plural. What is the dark web and how has it evolved to perhaps what I might have thought about it ten, five years ago?

Mark: That’s a very good question, because different people refer to dark web or darknet, as very different things. Traditionally, the darknet, originally referred to the Tor network and was originally, ironically, set up by the US government as a secure communication platform. But the key defining feature of any darknet, including Tor, which survives to this day, is the obfuscation of user identities, but the ability to continue to communicate in spite of the fact that a message or an email or a communication can be intercepted by somebody sitting in the middle, but still cannot tell who the users are. So, obfuscation of identities makes it an ideal environment for criminals to operate in.

This slide right here is actually a very good representation of that. When we talk about the darknet, we’re talking about the bottom of the slide. I mentioned Tor, I2p, ZeroNet. There are a range of other darknets that have grown up and these are places they generally require a proprietary browser, which is easily available to get access to. And these are places where people can go and congregate and discuss among themselves and trade data and sell product and sell goods, where the user identity is obfuscated.

The reason why your question is a good one is that people oftentimes confuse the darknet with the deep web, or even some high risk surface websites or messaging platforms. So right directly above it on this slide, you see the deep web, there are a range of criminal forums, marketplaces that exist in the deep web. We watch those as well. Everything in red on this slide we collect data from. And then there are high risk surface sites, particularly pay sites where data is posted from the darknet. Increasingly, and this is a real significant trend in our business, increasingly hackers, activists, malicious actors are turning to direct messaging platforms, peer-to-peer networks. The most active of those right now is Telegram. And so we collect data from those sites as well. Going back to the original comment, having eyes on the data that is in these environments is critical for any organization to understand their exposure.

Mackenzie: Putting this in context with the report that we released, how do these secrets and other credentials and areas end up on the dark web? And if a credential was in public GitHub, for example, is it possible that that will end up in the dark web for sale, for free?

Mark: Yes, absolutely. And so the first question is how do secrets make their way to the dark web? We estimate that well over 90% of the dark web today is now used by malicious actors. So activists, ransomware operators, oftentimes nation states or actors acting on behalf of nation states in the darknet sharing secrets, trading secrets, selling secrets, and it is the core marketplace for this type of activity that goes on. In the GitGuardian report, you see this very clearly. When you look at the range of statistics, many of those API keys, many of those credentials, certs, IP addresses, known vulnerabilities, code is put into the darknet, either for sale or oftentimes you will see actors simply share their secrets or share a portion of their secrets for free in order to effectively gain a reputation or gain points on a site to then be able to sell data at subsequent point. So there’s an enormous amount of data that is available in the darknet. If somebody just goes in there and sees it, the challenge without a platform like ours is to search the darknet more comprehensively and say, I’m looking for a specific API key, or a type of API key. And I want to see where these are appearing. Without a platform like ours, you don’t have the ability to do that.

At the bottom you see some of the statistics that exist in our database. We’ve taken in the last 24 hours, 8.4 million documents out of the darknet, at latest count as of yesterday, we have 9.4 billion credentials. 5 billion of those have passwords associated with them in our database. And obviously, you know it is stunning actually how much data is both shared and then re-shared in the darknet, so we have access to that. It is staggering the scale of what’s going on here, I’ll just pause and say one more thing, which is the darknet and the use of the darknet by these actors is growing more darknets are being set up, more data is being shared. This goes to your point, Mackenzie with Eric, how do you actually, how do you mitigate this? We’re seeing more and more data, not less and less data, available.

Mackenzie: I’ll ask you one more before I bring on our next guest. Are you seeing any other trends in the dark web that we should all be kind of aware of or should know about?

Mark: Well, I mentioned one which is the increasing shift to peer-to-peer, messaging platforms like Telegram, discord and so on. Another trend that’s been very interesting over the last 24 months is the impact of the Ukraine War on the darknet. Criminal groups on the darknet split apart as a result of the Ukraine war and spill each other’s secrets into the darknet. So ransomeware gangs in particular have split apart, some backing Russia, some backing Ukraine, and shared each other’s secrets. And what is really shocking is to be able to see their inner workings of how these criminal gangs operate. And so all of that is available as well on the darknet. But it affects what we’re talking about here today because many of the ways that code repositories are publicly available secrets are then exploited, they’re exploited by these very gangs. And you will see discussions about this vulnerability. Here’s a set of a AWS keys that we can use to get access to certain types of networks. You can see those discussions in realtime unfold on the darknet.

Mackenzie: Well, it’s very alarming. Mark, thanks so much for being here.

Secrets in the Hand of a Hacker with Philippe Caturegli, Chief Hacking Officer, Netragard

Mackenzie: So we have another guest here; we’ve brought on a hacker, Philippe.

Alright, first question; could you explain a little bit about what you do and what’s Netragard do and what you do as the Chief Hacking Officer at Netragard?

Philippe: So, we hack our customers. We get paid to hack our customers. Basically penetration testing is attack simulation. So, we’ll use the same tools and techniques and procedures as attackers or black attackers would use. The only difference is that at the end we write a report that we publish to our customers instead of selling money or publishing the information that we see on the internet.

Mackenzie: But from all that, you basically operate the same way a hacker would.

Philippe: Exactly the same, same method, technique, exactly the same way. And it goes both ways. So we simulate some of attacks that we see in the wild from attackers, but we’ll also try to come with novel techniques, or attacks that are then being mirrored by the bad guys.

Mackenzie: So with that in mind, how is it that hackers actually see secrets? We talked about them being on the dark web, we’re talking about them being in public. How do you discover them? How do you use them?

Philippe: I like to say that the internet never forgets. So everything that gets published on the internet, whether it’s voluntarily or not, a hacker will find it and try to exploit it. So it’s just, it’s not a matter of if, it’s a matter of when, it’s just a matter of time when it’s going to be discovered by an attacker as long as it published. So a few years back we used to have a few script and monitoring some of the dark web and trying to find some secrets and using googledocs to find those secrets. But nowadays there’s companies like GitGuardian or DarkOwl, that does a way better job than us at finding those secrets. So typically we actually use those platforms to find the secrets. The goal of the pen test is not necessarily to just find the secret, but it’s to show what we can do with the secrets or go beyond identifying the secrets, but it’s to exploit it and go beyond that.

Mackenzie: This is something that a lot of people have questions about too – Is that okay if I leak a Slack credential, for example, is it really a threat?

Philippe: Absolutely. Slack is one of my favorite keys to be leaked because it’s plenty of information that are not necessarily public, but that we get access to, by just having one API key. I’ll give you one example. In one of the tests, we actually found an API key for a Slack user. Used this key to actually start monitoring everything that was happening on the Slack for these customers, all the channels. There’s a nice API for Slack that’s called “realtime messaging”, so you can actually get all the messages in real time. Then we just sat there for like a week waiting for developers to share secrets or secrets to be shared. We didn’t stop here. We didn’t wait for a week. We were doing some other tests and attacks. I remember at some point we managed to compromise one employee his workstation. The IT security team discovered that we compromised this workstation through some alerts, and they started to do the investigation. And the way they did the investigation was ping the guy on Slack and say, “Hey, can you join this WebEx and share your screen so we can look at your computer?,” because the user was remote. Of course we had access to Slack, so we joined the WebEx meeting, and we sat for six hours looking at our customers, doing the investigation, like the incident investigation and trying to find what we’ve, compromised. So yeah, I start stopping at getting the keys to go all the way there and try to identify all the possible improvement that our customers could do to prevent it. Secrets are going to happen, but what can you do to lower the impact if it’s going to to happen? Can you detect it quickly and even if it’s leaked and it’s being exploited, how far can the attacker go, can they compromise everything from there, can they get access to more secrets or can they be stopped?

Mackenzie: Could you give us some examples of some attacks that you’ve done where you’ve actually, used secrets and how you’ve used these in real life attacking exploits?

Philippe: A few examples. This one, it’s pretty common: store on a web server, hoping that nobody would find it, or for some reason they share it. This was just reports, so it’s just a matter of finding it by browsing to this slash report, we could find all these documents. What was interesting in this document is that it was actually a configuration file. So that was a telecom company, and that was a configuration follow of their customer’s routers, including passwords and keys and all that. You can ask the question, is this a problem of misconfiguring the server or the developer or whoever? I came up with the ID to share the reports in a public website with secrets. I would argue that’s not even the configuration of this web server. The name of this file could have been found. It’s just that it was easier that the data listing was enabled, and we could find the files. But otherwise it’s just a matter of time to just try to enumerate and, and get those files. So anything that is on the internet, on a server that is exposed to the internet, should be considered public, whether it’s hidden somewhere or not, an attacker will find it at some point.

Then we can find things like this, these are my favorite, plenty of tools used by developers. This one again, was trying to hide it into a secret folder somewhere on the website. These are tools that are used by developers to try to debug their software or programs. These are my favorite cuz there’s not even a security. It’s just like we’re able to send queries straight into the database. We don’t even have to exploit any vulnerability. They give us access to all their internal tools.

The typical SSH key that we find on servers directly. One of the main differences between the tools like GitGuardian and publishing keys on GitHub and the work that you guys do is, it gets detected pretty quickly so it gets burned. I say burn, it’s like somebody’s going to exploit it within minutes of being published on GitHub. I don’t know if you have some statistics on that and how quickly the key goes from being published to being exploited. From our perspective as pen testers, it’s not as useful as it used to be because there’s now, there’s so many attackers or criminals monitoring this and exploiting it within minutes. The difference between us doing a pen test and the bad guys is that the pen test is in the point in time. So we have to be really lucky that a developer is going to publish a key or secret during the time of the pen test. But once the pen test is done, the attackers won’t stop. I mean, they are scanning the internet all day long and looking for things to exploit. The other difference between the pen test and the bad guys is the pen test is targeted to our customer, whereas the bad guys or the attackers, most of these attacks are opportunistic. So whatever secret they’re gonna find, they’re gonna go after the company that leaks the secrets, whether they purchase a pen test or not. That doesn’t matter to them. So that’s the main difference.

So when we can find secrets like this that are not published in public repository, they have a much longer lifespan and they can stay on the server for years without anybody noticing it. A few years back, there was AWS keys that would be leaked even on GitHub we could use, now within seconds they get disabled by AWS, which is good thing, but the reason they did this is because attacks were exploding. So anything that we can find that it is not publicly available, that’s why the dark web and the things that DarkOwl has, is also useful. Things that are still on the internet, but nobody really knows about it – it’s a lot more valuable because the lifespan of the value of this information is much greater.

So for this one, that was pretty easy. SSH key, just give us access to the kingdom. It’s just a misconfiguration and it turns out that they actually configure the web server through the directory and then we could get access to the SSH key. So from there, that’s my favorite kind of of misconfiguration, cuz there’s almost nothing to exploit. I mean, the exploit is just trying to find this misconfiguration or we have the key, we don’t even have to find like a crazy zero-day exploit of inability, use the key and we get in the server and then from there try to move on to other targets.

Mackenzie: That’s super interesting. We’re getting close to running out of time, so I am going to invite everyone back onto the stage and run through some questions.

Questions and Answers

Mackenzie: I’m assuming this one here is for GitGuardian and Eric, “Can different platforms be covered by your tooling, like GitLab, JIRA, Notion, slack, et cetera?”

Eric: That’s a great question. So we have a, actually, we have a CLI that is able to scan other platforms like dock images, S3 buckets. But like native integration, we still have all the VCS or GitLab, Azure, DevOps, and GI buckets. So because what we find in our analysis is most of the secrets are leaked on VCS right now, I think that the tough part is we need the remediation and if you succeed to remove all the secrets that would be great. But definitely there are sequences leaking in other platforms and it’s definitely a problem to tackle.

Mackenzie: The next question is for Mark. This one’s actually referring to dark web currencies. This question was asked when you were talking about the fact that they leak it for credit, for social credits – is there kind of level to this? So the question exactly is “would it mean that you would have to leak information to gain reputation, to gain access to, higher levels of secrets?”

Mark: That’s absolutely correct. I compressed my comment into a very short period of time, when I say credit, I mean reputation on a platform and oftentimes users have to share information in order to get to another level to get access to even more rarer types of data. So that’s absolutely right. When I talk about social credit or credit, I’m talking really about reputational credit.

Mackenzie: I guess this one here is for everyone. “What is your opinion on encrypted secrets? Does it produce a lot of positives by secret scanning tools? Does this make it more difficult to find from the dark web or exploit? Or can we uncover this, encrypt these encryptions when we encrypt credentials?”

Mark: I want to make sure we’re talking about the same type of encryption. Oftentimes credentials that are put in the darknet will have hashed passwords associated with them. And we can see that and we can detect that as a hash password, and we can actually classify that as a hash password. Clearly it’s gonna generate a lot of false positives by scanning tools. I think that’s just part and parcel of it. We’re getting better at understanding what those are and how, and categorizing that set of data. But I’ll defer to Eric as to whether or not they have an effect in terms of your scanning tools and do they result in false positives?

Eric: Yeah, so it’s, yeah, it’s a great question. So I will say for us, hashed credential is not considered as a secret. Now if it’s a non-encrypted credential, like for example, a certificate or SSH key, if the encryption is weak and it’s breakable, I definitely think it’s a leak. I don’t think it’s so common in term of frequency. So you find way more like actually private key and encrypted key exposed on GitHub. I will say it’s definitely a subject. Definitely we need to filter by, if we take all the unencrypted credentials, there will be way too many false positives. If we segment to those that that are weak encryption and that are currently, and that we can break with algorithm, it’s definitely doable.

Mackenzie: I have a question here for Philippe; “How much do secret volts actually keep hackings from accessing secrets? You know, how would you try and hack a company that uses as a secrets manager?”

Philippe: It’s actually pretty efficient, but with the caveat that because it contains so many secrets, it becomes the first or the primary target. So we had some examples where we had some customers using Vault. Sadly they were not using it the right way. So the root key was in the environment viable. So as long as we managed to compromise one of the server and get route access, then we had the key. And then from there, the impact is even worse because now we are not like stuck to just this one server, but we have access to all the keys from all the vault, and that was the root key. So not only did we have access to one vault, but we have access to all the vaults of all the customers. So there are pros and cons as long as it’s properly implemented and used, it’s very efficient. The problem is sometimes it’s not well understood and just having this key environment viable, gives you the key to the kingdom and it’s like the primary target for an attacker to go after this vault. So it’s good if it’s well used.

And to go back, just to the previous question about the encryption, from an attackers perspective, it depends how the encryption is used. Quite often we see that the secrets are encrypted, but the key, the encryption key is sold with the secret. So it’s like pretty much useless. Let’s just make it out to detect, but it doesn’t bring any value because an attacker will have access to the system and be able to decrypt those keys.

Mackenzie: What’s the difference between a data loss prevention tool, to sequence detection; and can it be compared?

Eric: I will say that GitGaurdian solves a part of the data loss prevention world. So we are really focused on sequence on public data. Our main focus is really more code security, so trying to improve the overall generation of code starting with SQL detection. But DLP is a way bigger world. I think Philip has spoken about it is really about finding open server in the wild, servers that will contain sensitive documents. You have also the dark web and the deep web as Mark mentioned. I will say it’s trying to solve the problem of, what’s my digital footprint on the public and deep internet and how can an attacker use it? And I will say a sequence on GitHub is a portion of that, that’s actually really effective and that you should consider, but it’s a fraction of the space.

Mackenzie: Do we see any correlation between cloud provide usage and leaked secrets?

Eric: There is definitely a correlation between the number of secrets leaked and the popularity of cloud providers. It’s just sometimes you can have outliers. So if somebody decides to try to publish 1 million keys, you have some people that have funny behavior on public details that can actually create some abnormality in the statistic. But yeah, usually it’s really correlated. So you see AWS first after Azure, as after GCP.

Mackenzie: Mark, do you guys see insights like this for what tools are kind of becoming most popular and I guess we can extend us beyond cloud providers into and what you’re seeing in leaks?

Mark: There is a direct correlation between volumetric usage and the amount of data that gets leaked because they’re bigger targets. So if you have a bigger target and they’re being hit by more people and more data is being extracted, or more leaks are being extracted and that data makes its way, obviously the size of the particular target makes a difference in terms of the correlation in what we see. The exception to that is the occasional random, popular, small site that gets attacked.

Mackenzie: Thank you very much. Thank you.


About GitGuardian:

GitGuardian is helping organizations secure the modern way of building software and foster collaboration between developers, cloud operations and security teams.

We are the developer wingman at every step of the development life cycle and we enable security teams with automated vulnerability detection and remediation. We strive to develop a true collaborative code security platform.

Learn more here: https://www.gitguardian.com/

About DarkOwl:
DarkOwl uses machine learning to automatically, continuously, and anonymously collect, index and rank darknet, deep web, and high-risk surface net data that allows for simplicity in searching. Our platform collects and stores data in near realtime, allowing darknet sites that frequently change location and availability, be queried in a safe and secure manner without having to access the darknet itself. DarkOwl offers a variety of options to access their data. 

To get in touch with DarkOwl, contact us here.

[On Demand Webinar] Top Trends and Predictions in Open Source Intelligence

March 16, 2023

In 2023, OSINT will continue to quickly evolve as investigators across a myriad of industries seek to disrupt crime, fraud, and threats. To help OSINT practitioners understand what to expect for 2023 and beyond, two respected leaders in the industry will share their predictions about what’s on the horizon for open-source intelligence.

In this webinar, originally held March 14, Rob Douglas, Co-Founder & CEO of Skopenow, and Mark Turnage, Co-Founder & CEO of DarkOwl, will share their insights on emerging threats and the latest OSINT tools and techniques to detect and prevent them.

Get Transcription

[Webinar Transcription] What Role Does Darknet Data Play in API Security?

November 10, 2022

Or, watch on YouTube

Mark Turnage, CEO and Co-Founder of DarkOwl, and Anusha Iyer, CTO and Co-Founder of Corsha, discuss how API Security professionals can benefit from darknet data in forming a more comprehensive understanding of malicious threat actor (TA) tactics, techniques, and procedures (TTPs) and providing effective detailed security recommendations, remediations, and product solutions. API Security related topics, like “API hacking”, “stolen API tokens”, and “API MITM attacks” are regularly discussed in detail in darknet forums, tokens sold and traded in underground digital marketplaces, and API exploitation code shared amongst threat actors.

For those that would rather read the presentation, we have transcribed it below.

NOTE: Some content has been edited for length and clarity.


Kathy: Hi, everybody. Thank you for joining today’s webinar. 

Before we begin, I want to take a moment and introduce our speakers. Anusha Iyer, President, CTO, and Co-Founder of Corsha, and Mark Turnage, CEO and Co-Founder of DarkOwl, both of whom have many years of experience working in the cybersecurity industry. Anusha is a Carnegie Mellon alum. She started in the Washington, DC area at the Naval Research Lab. At NRL her focus was on reverse engineering and tactical edge networking. She started Corsha with a friend a few years ago and is passionate about helping organizations get API security right, and making security accessible, easy to adopt, and even self-assuring. Mark is a graduate of Yale Law School, Oxford University, and the University of Colorado, Boulder. He serves on numerous corporate and nonprofit boards, and is a private investor in technology, software, and manufacturing startup companies. He is also a senior advisor to the Colorado Impact Fund and a technology advisor to the Blackstone Entrepreneurs Network. And now I’d like to turn it over to Anusha to begin our webinar.  

Anusha: Thank you Kathy and thanks to everyone for joining. We have an exciting agenda today. We’re going to look at API security and specifically API credentials and what an API security related incident looks like. We’ll tell you a little bit about Corsha as well as DarkOwl. We’ll go into why API security is so critical, some mechanisms to combat some of the threats and the attacks that we’re seeing, how the darknet can provide insights on this problem. Then [I’ll] turn it over to Mark to talk about DarkOwl and what is the darknet, how DarkOwl can deliver darknet data and give you more insights and analytics into where information is showing up on the darknet. And particularly with respect to APIs, what are threat actors saying about APIs on the darknet? And then we look forward to your questions and final thoughts.

DarkOwl and Corsha actually met a few months ago at Black Hat and had an interesting conversation around the proliferation of API credentials and how they are increasingly being used to gain unauthorized access to systems and services.

Increasingly we are seeing these types of data showing up on the dark web and being leveraged to execute breaches against organizations, like Toyota. Recently Toyota was notified of a breach where they had an API access key for T connect system. That’s part of their connectivity app to give things like wireless access and so forth to vehicles, and apparently, they had inadvertently checked in a hard-coded API secret into a repo about five years ago. It’s been available for five years in a public repo. And then they just released that over 2,900 records were exposed since then, giving access to customer names, customer information, and so forth. This is one example of what the threat landscape looks like and what the implication can be of API credentials getting into the wrong hands.  

Similarly, recently FTX and 3Commas revealed that an API exploit was used to actually make illegitimate transactions, to FTX transactions. And this was done using API keys that were obtained from essentially users and phishing attacks actually accessing other systems. Right, so 3Commas, the platform came out and said that the API keys were obtained from outside of the platform, but certainly still pose the risk of being able to then be used off-environment, unauthorized, to then make financial transactions. These trades were basically from keys that were gained from phishing and browser information stealers. 

Kathy: We’ve had a questions come in on these first couple of slides. Someone would like to know, is the fact that APIs are being targeted – is that a relatively new phenomenon?  

Anusha: That’s a great question. It is an increasingly leveraged phenomenon. I wouldn’t say that it is new necessarily, but it is increasingly leveraged. Because APIs tend to be an underserved element with respect to cybersecurity postures of most enterprises. Increasingly organizations are relying on APIs. As they look towards digitally transforming application ecosystems into microservices, APIs end up forming the backbone of communication and application ecosystems. And further, more and more organizations are moving towards cloud, moving towards ephemeral scale, and that just creates a proliferation of environments where these credentials are potentially obtainable. 

Mark: And that’s echoed by what we’re seeing in the darknet where discussions around API exploits, API keys, stealing API keys, and selling them is a relatively new phenomenon in the darknet over the last couple of years. We’re seeing the same thing from the criminals’ perspective that Anusha is observing in real life.  

Anusha: Absolutely. I would come at it from the perspective that we see the movement of organizations using more APIs, but you’re absolutely right from an exploit perspective. It is fairly new. And it makes a lot of sense, they tend to be large types of information. With the automation it’s easy to lose track of what’s legitimate and what’s not. Great question.  

Another one, this one was actually a 2018 leak where it was the USPS API endpoint. And in this instance, it was more of an authorization vulnerability where if someone has a USPS account they could actually change search parameters and do a much more expansive search and essentially get records for an entire data set without being limited to exactly what they should be seeing. It’s both on the authentication side but also on the authorization side in terms of how these credentials are provisioned, leveraged, and so forth.  

With that, let me hop into Corsha and tell you about our story and why we’re going after this problem space. Both myself and my co-founder come out of the DoD intelligence world. We’re focused on: how do we stop these breaches? How do we prevent unauthorized access to sensitive systems and services? And [we] decided to start course at the intersection of machine identity and API security. A lot of our early customers are out of the Department of Defense and we are working closely with Gartner to define this category and to define the space, if you will. What we’re finding increasingly is that API authentication, authorization, and security in general substantially lags behind all of the resources, effort, and human capital, put into human identity and access management. Now we need to think of these machines as entities, and as the same first-class type citizens as humans because they are accessing systems and services at a far greater rate and at a far greater impact even than just humans logging into accounts. So we started CORSHA and we’re very focused on how we can help with this API credential and API identity problem. I’m probably telling a lot of folks that are online something that they already know, which is that today API secrets are just glorified system passwords. They are largely static, often shared, rarely rotated, and don’t have a lot of good hygiene around them. They get leaked, sprayed, and sprawled across tons of environments. Mark, I’m sure you’re probably seeing this on the other end in terms of where they’re coming from, whether it’s CIC/D pipelines, whether it’s things like logs, deployment or cloud platforms, or even team collaboration sites. We already saw an instance with Toyota of GitHub. But I would venture to say that most organizations, just for the ease of sharing, probably inadvertently have leaked API keys, even internally, across systems. Because today the model of authentication is largely static, they’re ripe targets for adversaries.  

Kathy: A question based off this slide: can’t secret managers like Vault or KeePass prevent these attacks from happening?  

Anusha: It’s a great question. To some degree. They provide a secure mechanism to store the keys internally. But, oftentimes these APIs live in hybrid environments even in the control of hybrid parties. You may have an API that you expose to a partner, a vendor, or a customer. You would then have to rely on them properly leveraging a vault or a password manager or maintaining good hygiene around secrets to access your systems and services. So that’s part of the challenge here, is that vaults and password managers tend to be very environment or entity-controlled specific. 

Because we’re using these static, essentially bearer model credentials, for authentication and even authorization, they are almost acting as proxies for machine identity. And the challenge is that they’re not very strong proxies because they are static and they’re difficult to maintain hygiene around. Whether it’s a key, or a token –like an O-auth token, a JSON web token, or even a PKI certificate –because they essentially prescribed that bearer model of authentication where “as long as I hold it, I can leverage it, it doesn’t matter where I’m coming from,” they turn into ripe targets for adversaries. I’ll stop here and say that when we talk about a machine, what are we really talking about? In our terminology we like to think of it from the zero-trust approach to it where it’s a non-person entity. Anything where you’re trying to access a system or service and there isn’t a human identity to back that access is where the API authentication approach breaks down a little bit. Whether that’s a Kubernetes pod, a docker container, VMs, even physical IoT devices –those tend to all be areas where static credentials end up getting leveraged in some way, shape, or form. Increasingly we’re seeing that these are the new attack sector vector that is increasingly in vogue. 

To give you a very quick overview of what we are doing at Corsha, what we’ve done is we’ve come up with an API security platform where we can pull some lessons learned from the human identity and access management space. And we’ve come up with a way to not only do dynamic machine identity for API clients, but then leverage that to do fully automated MFA for machines. Think of a second dynamic factor where you can make sure that API calls are going with one time use MFA credentials. This gives you a lot of those nice benefits that we’ve seen on the human side with MFA where now you can pin access to only trusted machines. Even if a key inadvertently gets checked into a public GitHub repo, if MFA is enforced as a secondary factor, you’re okay there. That’s the idea: to elevate these API clients as first-class citizens, regardless of what their form factor in a way that is easy to adopt, easy to integrate, no code change, so that it doesn’t place burden on DevSecOps teams and make their day to day easier. So that they’re not having to worry about things like credential rotation as part of their workflows. 

Just very high level, the essence of what we are trying to provide is security, visibility, control, even the ability on a fine-grained level to do things like start and stop access for a client. That’s a little bit of a difference with, say, this approach and say a vault. Because if you give an API key to a third party, you don’t necessarily have control over their vault. But with machine-driven or an identity-first approach to it, you can say, okay, from a control plane I’m going to dynamically start and stop API access for this collection of machines. And in that way have the expectation of access matching your threat surface. That’s a quick overview of CORSHA and the product and the problem space. I would love to turn it over to Mark and hear more about DarkOwl and what you’re seeing on the Darknet.  

Mark: Thank you. The darknet is an interesting place and DarkOwl was set up specifically to allow organizations to monitor the darknet for threats to their core missions. As you can see in the lower right hand corner, our clients include many of the world’s largest cybersecurity companies who effectively use our platform and use our data to monitor on behalf of their clients. We also work, as does Corsha, with various agencies in the US Government. What we do is we go into the darknet at scale and we extract data at scale from tens of thousands of darknet sites on a daily basis. We index that data, we store that data, and we make that data available to our clients and make it searchable to our clients.  

The question I get is what really is the darknet or the dark web? The two terms are conflated.  

We all spend most of our time in the surface web. What you can search for off of your Google browser is effectively the search web. It represents a relatively small percentage of the data that is available by the internet, in spite of the fact that if I search for any term I’m going to find thousands, if not tens of thousands of results on my Google browser. It’s actually a relatively small percentage of the data that’s out there. Most of the data is fire-walled and it’s in what we call the deep web. My bank account information is available to me because it’s authenticated, I have the credentials, but it’s not available to Anusha and vice versa. By volume, most of the data that’s available via the internet is actually in the deep web. We specialize in the darknet, which is below the deep web. The darknet is dark for two reasons. It’s dark because you can’t get there from your Google browser. It usually requires a specialized browser or specialized access. What it does is it obfuscates user identity. Oftentimes the traffic is itself encrypted. And because of that, it is the perfect environment for criminals to operate in. Anusha and I can conduct a transaction, we can have a conversation, we can conduct a criminal transaction, buy or sell exploits with each other, drugs – there are any number of other things that we can do. A law enforcement agency could be sitting in the middle of that and see the transaction go through and see the discussion and never understand who I am and who Anusha is. And if you add in cryptocurrency on top of that, we could pay each other in an anonymous fashion. As a result, the darknet has become a haven for criminal elements.  

At the bottom of that page, you’ll see Tor, I2P, Zeronet. Everything in red is data that we at DarkOwl collect from. We also collect from certain deep websites and some surface websites which enrich our darknet data. Increasingly, especially with the Ukraine-Russia war, these direct messaging platforms, such as Telegram and IRC are becoming destination points for criminals to operate in and we collect data from those as well.  

Kathy: Mark, before you move on, an attendee would like to know, how big is the darknet?  

Mark: I wish I had an answer to that question. We don’t know how big it is. We do know that Tor was the original darknet. It is now one of many darknets. The Tor project actually publishes data on users, number of users, numbers of connections to the Tor network, and number of sites.  

Year on year, it continues to grow significantly. There are a number of sites like I2P, Zeronet, Freenet, and these other new sites that have grown. We don’t know how large it is. We have been told that DarkOwl has the largest commercially available archive of darknet data that’s available. I couldn’t prove that to you because I don’t know what the denominator is. But we know that the darknet is growing in terms of both customer usage and transactions that take place.  

Very briefly, this is the kind of data that we collect. The data that most people are familiar with is at the bottom of this slide. We hold somewhere around 9 billion email addresses that we’ve collected over the years, 1.8 billion IP addresses. Those are oftentimes IP addresses or networks that are being targeted. A range of credit cards, crypto addresses, and so on. It’s a big database that we have, and it’s updated continuously and has been since we stood the company up five years ago. Then we make our data available by a number of APIs as well as a user interface for the analyst community as well. But to give you a sense, just in the last 24 hours we’ve indexed and put into our database 1.3 million documents. That gives you a sense of the scale of the type of documents that we’re dealing with.  

More relevant to this conversation, though, is the next slide, which is, what are we seeing in the darknet that is relevant to the issue around API security? And the answer is a lot. We’re seeing that threat actors in the darknet are discussing stolen API secrets, keys, they’re trading the session tokens, and they’re openly discussed in these closed communities. This is a hot topic for the criminal elements in these communities. There are man in the middle attacks, there are injection methods being discussed and actually traded. Anusha and I would get into one of these forums, we’re both criminal actors, and we would discuss how I mounted a successful attack using this method. And she’ll say, can I buy that method from you or can I borrow it? Let me try it on a target that I’m thinking about. We see that ongoing. JWT authentication bypass methods are oftentimes discussed in detail. That’s been a real wake up call for me personally, seeing how creative criminals are being in these methods that they’re developing. Tools are shared.  

Interestingly enough, and not particularly relevant, but the DDoS services are sold. API DDoS services are sold for cheap. One of the things we’re seeing broadly in the darknet across all sorts of threat actors is the migration of threat actors to actually selling out their services and renting them out on a monthly basis. This is just an example. We’ve seen Kubernetes targeted especially. It’s a distributed environment, so there are some vulnerabilities that the threat actors are using. Then hacking courses on and on and on.  

These are some screenshots of some of the discussions that we have seen in the darknet. In the upper left you’ll see this discussion around leaking API keys. In the middle of the slide, you’ll see Russian threat actors describing API keys as well as the secret keys and making the secret keys available. I think those were stolen. In the lower right. I love this. You know, we figure out a way to withdraw funds using API keys without access to the account itself and on and on and on. If you get onto our platform and search for any of these terms, you’re going to find quite a lot of discussion among the threat actors and the criminal gangs around this. And you’ll see data brokers actually selling keys. Selling actual access to networks. The conclusion is that Darknet is rife with discussion around the very threats that Corsha is targeting and that was set up to respond to. Anytime you see this kind of activity, any time you kind of see this discussion going on in the darknet, you know you’re on to something. So your customers made some smart choices here in new shows.  

Anusha: We appreciate that, Mark. I will say it is very interesting to see all of the discussion and the activity around Kubernetes. I think that might be even a fun double click into another session to do, because it is turning into a foundational layer of most organizations transforming their application ecosystems. It would be fantastic if we could get ahead of that.  

Mark: I’d actually like to talk to the founders back at Google. It was right around 2014, if I’m correct, about Kubernetes, and ask them whether they ever had a conversation around security right at the outset. Because most people, most developers won’t. And it’s not a criticism. It’s that just most developers won’t do it. They’re thinking about how to build a scalable environment for whatever their mission is. They’re not thinking about how five or six or seven or eight years down the road, somebody’s going to be trying to attack that environment.  

Kathy: We’ve actually had a few questions come in. One of our attendees would like to know: what specifically can be done from a security perspective to prevent an API attack?  

Anusha: Some of it is obviously having good hygiene around primary credentials. Having policies in place for things like rotation. Certainly using a platform like Corsha as a layered defense so that you have a way to uniquely identify and control each API client. Is a very sound approach to a lot of this activity that we are seeing on the darknet. Other things like making sure that API access is least privileged, so having notions of authorization in there. Just like when you have a given user, you give them roles, and not all users have access to the same information and services on the system, APIs and API clients have to be dealt with the same way. And having ways to revoke secrets and revoke access are very important. It’s about drawing a lot of those parallels that we have with human identity and access management but into the world of APIs. 

Kathy: Thank you. We also have a question of: how can security or engineering teams get better visibility into how their API secrets are being used? 

Mark: One way to do that is to use a platform like DarkOwl’s platform to actually monitor the environment on an active basis. Oftentimes, you will see threat actors discussing targets by name or by IP range or by other things. Look in the upper left hand side of this slide, right there  is a discussion around a very specific key from a very specific 

My point is, any time you’re thinking about security more broadly, there are a number of hygiene elements that have to go into place. One of those hygiene elements is monitoring this environment where criminals are actually plotting attacks in a wide variety of different contexts, not only in the API environment. We see active threats, active exploits under way. We see targets being identified and threat actors saying, all right, that’s great, I’m going to hit them. You have to have some eyes on that environment.  

Kathy: Thank you. We did have one more question come in, and that is, what should a team do today if an API secret is compromised?  

Anusha: That is where having a good platform for observability in place is really important because you want to know where that API secret could have been leveraged right and have the ability to quickly revoke and rotate it. It’s both understanding impact of the leakage or the stolen credential and then mitigation strategy of how to revoke it, how to rotate it, with obviously a little downtime as possible. I think for observability using a platform like DarkOwl is really helpful because you can see the extent to which it may have been leaked or compromised as well.  

Mark: Thank you, Anusha. It’s a pleasure doing this. Let’s do another one in the future once we find more threats.

Anusha: Absolutely. That would be fun. Thanks so much for the time. And thanks to everyone for listening in.   


About Corsha:
Corsha is on a mission to simplify API security and allow enterprises, developers, and DevSecOps teams to embrace modernization, complex deployments, and hybrid environments with confidence. Our core technology is dual use, designed for widespread adoption, and easy to configure and deploy to both commercial and government customers. Corsha has a strong engineering team with deep expertise in distributed ledgers, cryptography, security principles, orchestration technologies, and software design. Contact Corsha.

About DarkOwl:
DarkOwl uses machine learning to automatically, continuously, and anonymously collect, index and rank darknet, deep web, and high-risk surface net data that allows for simplicity in searching. Our platform collects and stores data in near realtime, allowing darknet sites that frequently change location and availability, be queried in a safe and secure manner without having to access the darknet itself. DarkOwl offers a variety of options to access their data. 

To get in touch with DarkOwl, contact us here.

[Webinar Transcription] Countering Illegal Trade on Darknet Marketplaces

November 08, 2022

Or watch on YouTube.

David Alley of DarkOwl FZE and Ivan Kravstov of Social Links dive into the topic of harnessing OSINT to expose illegal trade on the darknet. They outline the black-market landscape of the darknet and showcase a range of methods for fighting illegal trade and approach the topic of darknet marketplaces from different angles. In this webinar, they cover:

  • The nature of the dark web and how it is accessed by users
  • The functional make-up of darknet marketplaces
  • User deanonymization methods
  • Advanced darknet data extraction and analysis techniques

Attendees learn how to break through the perceived anonymity of the dark web and crypto transactions to identify criminal actors and track illegal trade and illicit activity.

For those that would rather read the presentation, we have transcribed it below.

NOTE: Some content has been edited for length and clarity


Ivan: Greetings everyone, today we will be hosting a joint webinar with David Alley of DarkOwl FZE and the topic will be countering illegal trade on darknet marketplaces or more broadly dark web research in general. 

David could you tell us a bit about DarkOwl?  

David Alley: Absolutely. It’s really great to be here and thank you to everyone for joining from all around the world. I know that we always fight the various time zones to get everyone here, so a special thanks to the Social Links team for hosting this webinar. They’ve been super helpful in getting this excellent presentation together for us.  

A little bit about DarkOwl – we are American company, and our headquarters is in Denver, Colorado also known as the Mile High City. We originally started off as a cybersecurity company with a focus on penetration testing. And at that time we would do research on the darknet to see if we could find credentials to help with our pentesting work. We were really successful at that, we had a very high rate of penetrations for the pentests. We said, “why don’t we change this and actually go into being just a pure darknet company only?” That was really the birth of DarkOwl. Since then we’ve had a lot of great team members with us at DarkOwl and we’ve built a very good collection capability for us to go onto the darknet and pull out that data that is really difficult to get to.  

We have a great collections team that does all of this hard work and makes it much easier for our partners like Social Links to do the next part. Which is, that once they’ve looked at that data, to make sense of it and decide what does is it mean? And how do we use it? And how do we fight crime that is emanating from the darknet? 

We have a couple of claims to fame. The one we use the most is that we have the largest commercially available darknet data lake in the world. And that’s just because we have been doing it for longer than everyone else. We’ve had some very special team members over the years that have had a very unique access and understanding of the Tor Network. At one point we actually had the co-founder of Tor on our team and so it’s a really unique company. We are highly niche and highly skilled and that’s why great companies like Social Links and ours like to work together because we are complimentary. We work a lot with OSINT analysts as well, but we also provide APIs and Datafeeds for partners and that’s how we work with Social Links. I think you’re going to be pretty amazed at what the team has to show you today. I’m always impressed with what they’re able to come up with; they have a superior team. Leveraging great data from DarkOwl with great analysts from Social Links you’ll always be happy with the results. I’ll turn it back over to you Ivan. 

Ivan:  Thank you very much for the introduction David. A bit about us: the company was founded in 2015 we have 80 + employees at the moment with HQ in the US and EU offices in the Netherlands and the R&D office in Riga, Latvia. What we do is provide software for data-driven investigations. You can see that we have a good rating on Gartner Peer Insights and that we have received a number of 
industrial awards in the past years. 

Here we have a very brief slide about the average pricing of various goods on the dark web. Ranging from stolen credit cards to out of the box ransomware Trojans.  

A concept that I’m sure everybody is familiar with is that there is a division into what is known as the clear web or the surface web, something which is indexed by conventional search engines, then there is the deep web which can include many different things that are not [indexed by conventional search engines] and that it takes a bit more effort to find and then there is a space commonly known as the dark web which include the Tor Network but also additional ones such as I2P, Freenet, and Zeronet.  

The general principal of Tor browser network is that the traffic goes from the user through several nodes and then reaches a specific server at the end. The current total Tor network bandwidth is 400 gigabytes per second.

One of the technologies that is also utilized quite often within the platforms of communication is PGP encryption. The basic concept being that the user sends an encrypted message that can only be accessed and read with the use of a private key held by the recipient.  

Now here we can see the boost of darknet marketplaces revenue from 2011 with the first precedent being the Silk Road up to 2020 [revenue] which is quite substantial.  

The products and services available on those marketplaces range from drugs to tutorials, forgery, various kinds of illicit services, malware hosting, and fraud. The majority of those being drugs. 

The general principle of how a marketplace works is that a buyer exchanges currency for any kind of specific cryptocurrency accepted by the marketplace. Which is predominantly Bitcoin at this moment but there is a shift towards alternative ones such as Monero or Z cash. The buyer then transfers the 
Bitcoin into markets account and makes a purchase. The crypto is  held in the market’s ESCROW account until the order is finalized with the market taking a commission. After the finalization of the deal the vendor is paid. Then the vendor may move the Bitcoin from the market account and potentially exchange it. 

Here we see an infographic of types of entities receiving Bitcoin from dark web sources which can be KYC and for exchanges enforcing KYC or exchanges more liberal with their KYC processes. Those can also be mixing services and other entity types.  

David, if you could tell us about DarkOwl’s differentiation?  

David: Absolutely. As we’ve seen here we’re talking a lot about the crypto piece. And I want to talk about how DarkOwl differentiates itself and helps you with this. It is because we are able to go into these markets that we’re talking about today and were able to pull that data out for you. A lot of the Blockchain tools that you’ll be familiar with will allow you to see various wallets as they’re being tumbled and where they’ve been mixed or how they’re being exploited. But what they have difficulty doing is tying wallets to a very specific illegal activity. And that’s one of the main things that makes us different for these types of investigations. We are continuously out there crawling these darknet sites and these markets that we are in. Someone asked a question: how do we differ from our competitors? It’s just a real question of scale and scope. Many of them are in about 400 sites and we’re collecting from over 95,000 sites and about another 20,000 to 30,000 mirrors every day. It’s this massive amount of unmatched darknet content discovery that we’ve got and inside of that content is where all of these cryptocurrency wallets are which can be tied to illegal activity. You want to buy your MDMA in London? Here you go – use this bitcoin wallet or this Monero wallet.

I second the comments that we’re seeing a shift from Bitcoin into some of the other coins out there. We’ll even pick up coins in our collection that are not even on the chain yet. They’re brand-new wallets that are being used. We’re seeing that shift away from the traditional way of using the same wallet over and over to now criminals will create a new wallet put it up on the site for their drugs or their CSAM material or whatever it is they’re trying sell and have the payments into the air before the Blockchain tools can even detect them. You’ll see coins get recycled and because of our unique archival capability it goes back to almost 9 year’s worth of data. You can also do those deep investigations into darknet transactions that happened years ago. All of that together gives you the content that makes investigations very strong and that combined with the ability to do leak analysis as you can see from our Social Links partners is a very powerful tool. To give you an idea of what we actually have in the collection, it’s about the numbers. 

It is a lot of Tor. Tor is the largest of the darknets. We also have a very large collection of from I2P and from ZeroNet. Those are the three major darknets that we collect on. And there’s some very technical reasons behind that. We also are having a lot of success picking up cryptocurrency transactions off of Telegram channels. As we know Telegram is very popular with a lot of different hacking groups and black hat hacking groups. It’s easier to use than a darknet channel. We see that a lot of hackers are also gamers, and they use Discord for communications. We see some in paste as well. What should really be focused on [in this slide] is the lower right-hand corner. That’s 347 million cryptocurrency wallets pulled out of our darknet collection. It’s a pretty big number, and every time I see a cryptocurrency wallet on a darknet site it’s always doing something bad. I’d say it’s a 99.9% probability that if you’re using Social Links and you pull out a cryptocurrency wallet from the darknet data, you’ve already done one of the hardest steps which is identifying some form of suspicious activity. I’ll turn it back over to our Social Links partners to take you through the rest of the demo.  

Ivan: It may make sense to note that with Telegram and Discord channels there is indeed substantial overlap. Much more substantial obviously then with the traditional mainstream social media platforms. Telegram and Discord aren’t really called social media, but they have a significant social networking element. Telegram especially in the past few years. It is about cybercrime groups but also apart from that it could just be local, regional, or even macro-regional drug vendors. It could be people engaged with child grooming, especially on Discord, or extremist groups as we previously covered in one of our webinars with a German expert on extremism research. Now we will go into the actual examples that we have. 

First we should dedicate a few minutes to talk about the method of dark web research. In this case that would mean focused on researching an individual. It makes sense to use all of this in conjunction.  

From the username we can get the specific platform within this interface where the vendor or forum member is present. That can also give us insights into their stated or observed affiliations. Those are the payment methods, the posts and threads and the products. From the posts and threads you can examine the topics discussed in the details which can also tell you more about what exactly they are doing, what kind of merchandise they are dealing in, what kind of categories, and if they have a specific focus. As well as the speech patterns of the idioms and idiosyncrasies used by the individual and the shipping locations. And of course, the products also tell us more about the proper categories and sometimes product cards can contain contact details within them. Objects within this schema such as the speech patterns, the stated shipping locations of the products, the affiliations, and the specific platform can point us to assumptions about a certain region or macro-region.  

For example, there is a higher probability of a vendor or a forum member on an Eastern European marketplace to be from somewhere in Eastern Europe. Payment methods can be different as well as various types of e-money, but here we’ll focus more on cryptocurrency addresses. A transaction derived from an address can tell us about the interactions it has with other addresses for groups of those. And it can tell us about the services that they are using such as mixers or exchanges. A mixing service may also have theoretically some kind of interactions in some kind of partnership program for a specific marketplace. They can also be mentioned in various reports or forums. All of those can possibly lead us to digital breadcrumbs, and that in conjunction with the assessment of the presence of the user in other forums and marketplaces and the way their personality may be reflected in their online behavior and the kinds of merchandise that they are dealing in and the kind of payment methods that they’re using is all part of an attempt to create a digital profile of an individual.  

Now here we will start with the first example where we will go from an alias. We will run our first transform search for users under this alias. Here we can see some details in the properties, one of those being the side name Tochka Market. “Tochka” is a Russian word standing for point or place. We search for the products related to this vendor and we also extract their PGP open key which is quite often used by vendors. Next, we will use the products and extract the locations they are to be shipped to and from. 

We can see here that those are mostly recreational drugs shipped to the United States. From a PGP open key it is sometimes possible for us to go to the email address. Not in a hundred percent of cases, which can also be said about some of the other methods that we will be applying here. Here we see a Gmail and from that we can further try to see if there are any social media profiles and any accounts connected to that email address. There is also the possibility to get reviews if it’s a Gmail account. We can see that there are accounts within Facebook, Firefox, Gravatar, Pinterest, Samsung, and Twitter connected to the email and we see several profiles within Gravatar, LinkedIn, and Skype from which we can extract additional details. In reviews we also see a cannabis dispensary seemingly located in the United States and a bar in Cameroon which matches with the location that we see here within the LinkedIn account [redacted account name] connected to the Gmail address. There is also a post promoting the sale of marijuana on a surface web source stated by the account holder to be safe and secure. Now here we can use some of the Maltego functionality to go into more data about that specific domain. The WHOIS data gives us the name of [redacted name] as a registrant and the company name [redacted company name]. [Redacted names] are both something that we have seen within the social media footprint derived from the email address. Now of course an analyst won’t be as lucky as in this instance in 100% of cases, but this is real data related to a real individual. It is possible because people do tend to make mistakes.  

Now we will go through another alias. This [alias] gives us 4 accounts with the same username and it’s something that vendors to do to maintain a commercial reputation with the customer base. Now we can ask for specific platforms. We can see the Dread forum, the Hub forum, the Apollon market, and the Wall Street Market. Now we also see a single PGP key used by three out of four of those accounts and we will further ask for the posts and products. We can see that there is a certain focus on Europe. In this instance the goods are more likely shipped from Europe to locations worldwide. The principles of working with the posts are  similar to the way a user of Social Links Pro or a SOC tool in general can work with social graphs. The graphs of social interactions within the digital space. From each of those we go into the thread. From the thread we can go to the other posts within it, and the other users that have been participating in those conversations.  

This is just at stage of gathering data and an analyst working on a real case will of course face the necessity to analyze this communication in depth. That’s why there’s a capability here to download the content within those posts and save the text content as a text archive. Now here we see a Proton mail account- [redacted email address] so they seem to be more conscious about their digital footprint and security, but potentially we can try to search for this alias in the social media platforms available. Here we’ll try with an Eastern European platform because [redacted name] [the alias] is obviously a reference to the famous assault rifle. Here we got an account with just the cat as a picture under the name [redacted name] and while it’s not something that we will state and something that we will accuse this person of, it could be a coincidence or it could not be a coincidence. The account is not very informative, is closed, and has a profile picture of a cat. So here we are less lucky than in the first example. In some instances it’s even more obscure. Here we see an individual with the alias [redacted name] focusing on the European Union. They have two email addresses and a statement in the product description that there is a possibility to contact the vendor on Discord. We see that there is a Discord account connected to their Proton mail address, and also a Skype account which states the location as Germany. This is all on the level of analyzing people and individuals or small groups of people, because several individuals can be behind one username.  

This can also be done on a macro level. We can take several capital cities or countries within a certain macro region such as Asia-pacific or Latin America and run a search into the full spectrum of dark web sources available to us to see which products are shipped to and from those locations. Here we see that some countries have more activity within the spectrum of available sources, some countries have less, and we can potentially look for vendors that are focused on two or three specific countries at once. We can also see which marketplaces are more active within a given region. Here and in Latin America Tochka market is quite active. Additionally the Apollon and Nightmare markets and then several other ones have much less activity.  

Now of course it makes sense to talk a bit about the cryptocurrency aspect within dark web research. Several of those graphs are something that we’ve shared previously in some of our previous webinars. The methods can be split into two sets: passive intelligence and direct engagement. Passive intelligence may include open-source and social media intelligence, the traditional following the money approach, and the enrichment of the initial entered data that the analyst or potentially a victim of a crime may have. Direct engagement is something that implies using custom digital avatars for social engineering and also in the case of enterprises, or state organizations, offensive security procedures or threat intelligence. Some of those methods are more customary to certain kinds of professionals, analysts, and organizations than others but in the end as is the case with any kind of investigation it is all about connecting the dots, the seemingly not connected entities in a broad sense that word. 

Here is a small reflection of the situation within the Bitcoin ecosystem. There are a number of addresses here, some of those belonging to militant extremist groups such as the Palestinian Al-Qassam Brigades or Hay’at Tahrir al-Sham the fellowship operating in Syria. Some of those belong to dark web vendors such as Ross Ulbricht of the founder of Silk Road. Alexandre Cazes founder of Alphabay, or the administration of the Wall Street Market that exit scammed in 2019. Some of those were because of law enforcement, some of those were ransomware groups, and some of those were to legitimate exchanges. 

A way to perform this attribution to be 100% certain that a specific address belongs to a specific individual or a group is to run searches into the social media and dark web space and also into data that is provided by vendors such as DarkOwl And I must say that DarkOwl provides fascinating amounts of information of fascinating depth, and a number of these were done with the help of DarkOwl as well. Social Links is focused specifically on the Tor Network while DarkOwl, as David has mentioned, also pulls data from other sources such as I2P and Zeronet. Once you get this kind of entity you can further run the transform to get to the details and then examine the contents of those entities. The source of the networks and the date and time are also stated within the properties. 

Here we have another simple example of building a timeline with the timestamps from within the transactions related to a specific address and the timestamps of the mentions of that address on a dark web forum. 

All of this above is related to the situation around the exit scam performed by the Wall Street Market administration. You can see that all of the transactions and all of posts take place in the second half of April 2019.  

If we talk about profiling, there been there are a number of quite famous cases that have been solved by law enforcement and by analysts within those types of organizations related to de-anonymizing an owner or a senior administrator of a dark web marketplace. There is the famous Ross Ulbricht who was using the alias Dread Pirate Roberts and a clear web alias Altoid which was the key thing that led the American law enforcement towards then. We can gather the different data from the full spectrum of sources or potentially we could very carefully try to profile the individual based on the way they interact with the customers, the way they interact with vendors, the way they behave online within the platform. Or we can try to profile those people in retrospect to see what is common between the individuals who have been involved in such activities that have been uncovered historically. We can see that the portrait of the criminal has changed over time to this day in 2022. All of those –Mr. Ross Ulbricht, Mr. Gal Vallerius and Mr. Alexandre Cazes are educated individuals in different fields. For instance, Mr. Cazes has a degree in computer science. They tend to share certain views such as being Libertarian. Libertarianism was something very much associated with the motives of the founder of Silk Road, but similar motives can be speculated about other members of that community. In the case of Mr. Alexandre Cazes, the key input was an email address that was a source of messages to newcomers within the Alphabay Marketplace which was 10 times the size of Silk Road at its peak. The support emails were to new vendors and new members.  

Here we can try an example of enriching that identifier to build this graph from scratch. This can be done with the help of something called a machine within Maltego which can automate those queries under a specific logic.  

Here at this moment it gives us an IP address from a leaked database, it gives us an account on Gravatar –[redacted account name] an account on Skype, and a number of email addresses with similar passwords. And also a number of additional database records that contain the email in the string. The IP address is further resolved into a Canadian netblock and that is resolved to an autonomous system number. Now we can try to do the same with the second email that we have here. This is giving us two Skype accounts and two additional IP addresses. Of course, we can run a search into the data lake of DarkOwl. From which we will try to extract additional details. Here this gives us the family name, it gives us the name of another individual, and a number of IP addresses and phone numbers. The IP address issue may be just a minor technical problem on the side of Social Links with integrating this, but you get the point. This gathering and structuring process is something that is done in retrospect, so this person has already been uncovered, already been arrested, and already committed suicide while in jail. But I think it’s  obvious how beneficial industrial automated tools such as DarkOwl and Social Links can be in researching such individuals and investigating and doing criminal intelligence within those types of sources.  

With Oxymonster, the alias that belonged to Mr. Gal Vallerius an Israeli-French individual, the initial input point that investigators had was this vanity Bitcoin address for which they traced output, a number of outgoing transactions to a number of addresses all leading to an account on a peer-to-peer platform [redacted address][.]com under the username Vallerius. That is exactly what we were talking about when we said speech patterns and idioms and idiosyncrasies. The investigators further compared the speech of Mr. Gal Vallerius on Instagram and Twitter accounts that are no longer in existence but we do have a Foursquare profile here with that of the user Oxymonster and there was a certain match in the patterns. Now here we can extract additional things from the DarkOwl entities that we have as well.  

In another example with an email of Mr. Ross Ulbricht which was found from one of the posts on the Bitcointalk forum which was initially found a by matching the username Altoid with the first-ever mention of the Silk Road marketplace on [redacted address].org. We can also try to use those transforms to see what is connected to those identifiers.  

Here we go to what is more commonly associated with Social Links. Social Media intelligence is our strongest side so far even though we’ve diversified the sources that we have and the methods available for them in the standard procedure of mapping out the digital footprint of an individual. If we return to the initial logical schema of those processes it is a necessity not just to focus on the user account or on the group or on the marketplace within the Tor Network or any of the other darknets. The process of investigation and analysis will take the analyst, if they’re lucky of course, into other kinds of domains which may include conventional social media. 

There is another instance for a potential use of OSINT tools in a similar scenario, but it would make sense to use in the case of the Berlusconi Market and their administrator John Kohler Racino . The way that they were uncovered was something far more in line with the traditional work of law enforcement. They were eventually closed down as a result of the operation by the Italian Guardia de Finanza, but it was the result of operatives having ordered number of goods from the marketplace as part of an experiment and having noted that they all came from the same post station from within a small town in Italy. Here we see an example of what can potentially be found from the usernames and the accounts under the usernames that were operated by Mr. Lucino. There are two of them: one that had presence in the Dread forum and was involved in discussions around the Berlusconi Marketplace and another one on several marketplaces including Berlusconi, two of those sharing a single PGP open key with the pattern of the goods being shipped from Italy worldwide. There is some output from the Social Links identity search engine that also gives us a number of email addresses and IP addresses. Operations such as this can be advanced with the use of DarkOwl. 

That is all of my part so far with the functional demonstration of the capabilities.

Another topic which we haven’t really focused on today but which is quite relevant here is the usage of those kinds of tools and the exploration and the research by professionals in the field of corporate security. The cases that we’ve shown now –they’re somewhat more in the domain of law enforcement work and criminal intelligence analysts, but the monitoring of sources, aggregating leaked databases, data breaches, are also a topic relevant to the practice within the corporate sector.  

How we use those tools to detect human trafficking is a very good question and there is an organization that we have done a webinar with previously called the Anti-Human Trafficking Intelligence Initiative with very brilliant people working in that area. They have a solution of their own that works by a slightly different principal than Social Links and DarkOwl, but yes such solutions do exist and such practices do exist and they have been successful uncovering numerous instances of human trafficking and the distribution of CSAM.  

David: Absolutely. Ivan, I just want to jump in and congratulate you on a really excellent presentation. As far as the human trafficking pieces, we are seeing a growth in the kind of communications and coordination that happens on the darknet for human trafficking and even more broadly for the CSAM types of materials. I would like to talk about one of the other questions that has been brought up, and it talks about the companies that have been involved in ransomware incident response. The amount of chatter that we see happening on the darknet for the different ransomware gangs has increased exponentially over the last two years, and we’ve tried to focus on it for quite some time. We’ve really seen how well they have taken their software to market. You can see that ransomware as a service programs have been proliferating widely through  markets on the darknet. As far as identifying specific ransomware families, I think we have about 30 or 40 of them that we’ve already curated. Including what cipher they are using, when we first saw them appear on the darknet, and you can use it to gather some of the pricing data that you need.  

Ivan: Thank you for that David. One thing that is easy to see even from this simple graph which is just a reflection of the current state of affairs in the cryptocurrency industry and specifically in the Bitcoin ecosystem is that it is very Wild West-esque at the moment. [There is] the obvious pattern of large a number of interactions with people involved in terrorism and ransomware and the trades in illicit goods in the dark web space and human trafficking and CSAM as well, although those two categories are not reflected here. The people at the Anti-human Trafficking Intelligence Initiative know much more about that topic. Interacting with legitimate exchanges such as Binance, Gemini, and Coinbase.  

David: There’s a question from Andrew and it says: do DarkOwl and Social Links have the tech to crawl the deep and dark web? Almost all of our collection is technical-automated. There is a combination of techniques that you use to gain access, but then you cannot collect at scale just using human beings so it’s a combination of both. We use both for this kind of collection. Then there was one question about risk management targeted profiling and Customs control. Absolutely, specifically for the for the drugs portion…most of drug shipments that we see happening on the darknet are international transactions. The largest shipper of drugs worldwide is the United States Postal Service because it takes a federal warrant to get into a box being shipped. We see some law enforcement agencies do controlled buys. They use these tools to identify who the vendors are, how do you enter and interact with them, and it’s about the speed – how do you get ahead of this and then do controlled buys. When it comes into your country you will figure out which one of your Customs agents is taking bribes from people to let those packages in. It’s both useful for looking at criminal activity and also from an internal counter-intelligence perspective. 

Ivan: Thank you David and thank you for visiting we are always glad to see you here.

David:  Andrew we don’t leave you hanging out there I see your question, you’ve asked how they might go seize the ransomware payments. I don’t have any direct knowledge of how that happened, but most of these payments have to go through some form of exchange to move the money around and they likely had access to one of those exchanges that could tell them. Because remember there are some exchanges that are working with and cooperating with law enforcement and international law enforcement agencies and if they get a valid warrant from a law enforcement agency to block the transaction, they can do that. Just like it would work in the international Swift system for blocking bank transactions through the Federal Reserve Bank of New York. I would imagine that probably something like that is how it was done.  

Ivan: Yes, I actually think there was an Eastern European mixing service there.  

This is it on our part for today thank you everybody very much for participating and we hope that you will contact us to talk with us further about how our solutions can be implemented into your business processes. We will be very glad to see you and will be expecting you on our further webinars that are to come. David thank you for co-hosting.  


About Social Links

Corsha is on a mission to simplify API security and allow enterprises, developers, and DevSecOps teams to embrace modernization, complex deployments, and hybrid environments with confidence. Our core technology is dual use, designed for widespread adoption, and easy to configure and deploy to both commercial and government customers. Corsha has a strong engineering team with deep expertise in distributed ledgers, cryptography, security principles, orchestration technologies, and software design.

Contact Social Links.

About DarkOwl

DarkOwl uses machine learning to collect automatically, continuously, and anonymously, index and rank darknet, deep web, and high-risk surface net data that allows for simplicity in searching. Our platform collects and stores data in near real-time, allowing darknet sites that frequently change location and availability, be queried in a safe and secure manner without having to access the darknet itself. DarkOwl is unique not only in the depth and breadth of its darknet data, but also in the relevance and searchability of its data, its investigation tools, and its passionate customer service. Our passion, our focus, and our expertise is the darknet.


Interested in how darknet data applies to your use case? Contact us.

[Webinar Transcription] Cowbell x DarkOwl: Into the Dark with a Flashlight

October 14, 2022

Or, Watch on YouTube

DarkOwl’s Chief Business Officer, Alison Halland, the Director of Strategic Alliances at Cowbell, Jessica Newman, and Cowbell’s Director of Risk Engineering, Manu Singh, sit down and discuss the building blocks of the darknet and organizational risk, what darknet data exposure means for small to medium sized businesses, how Cowbell uses DarkOwl’s darknet data to generate a dark intelligence scores for each of their policyholders. They also dive into the value-add of a Cowbell policy to their policyholders provided by Cowbell’s free reports from their risk engineering team utilizing DarkOwl’s darknet data to assess and mitigate cyber risk to businesses.

For those that would rather read the presentation, we have transcribed it below.

NOTE: Some content has been edited for length and clarity


Jessica: I want to welcome you all. We are really excited to have you on behalf of Cowbell and DarkOwl. We are here today to talk about the dark web, which is hopefully an interesting and fun topic of conversation. From what we hear, everyone typically perks up when the dark web is mentioned. It’s a topic that gets a lot of questions and a lot of interest. Our intention today is to arm you with enough information about the dark web and about how Cowbell uses the dark web to feel comfortable talking about the dark web with your customers.

Quickly I will introduce myself, my name is Jessica. It’s great meeting you and being with you today. I run point on our cybersecurity partnerships here at Cowbell… I’m going to let our panelists introduce themselves. We’re really excited to offer their expertise to you all today. Alison do you want to start by introducing yourself?

Alison: Absolutely. I’m Alison Halland with DarkOwl, and we’re based in Denver Colorado. I’ve been with the company for over 6 years and we are Cowbell’s darknet partner. We provide our darknet data to Cowbell and it’s been a great partnership. I’m excited to be here talking to all of you and hopefully you can walk away with a little more information about what the darknet is and how it can be helpful as you talk to your clients looking at getting policies.

Manu: Thanks Alison, thanks Jessica. Glad to be here today. I’m Manu Singh, and I’m the Director of Risk Engineering here at Cowbell. My team assists our policy holders through our continuous risk assessment process. That includes understanding our Cowbell cyber platform, our Cowbell factors, understanding how our AI and machine learning scans are used to develop insights and recommendations, as well as some of the data that we add off the dark web thanks to the assistance of DarkOwl. My team— ultimately our goal is to reduce the frequency and severity of data breaches and cyber incidents for our policy holders. We certainly do that by generating our dark web data reports, and again that is with the assistance of the data that DarkOwl is providing for us.

Jessica: Awesome. I want to kick things off. We’re going to have this be as interactive as possible so please feel free to ask questions, utilize that chat, if questions come up we want to make sure that we’re answering them. I want to kick things off with a question for those of you who are joining us today: I want to understand if the dark web is a topic is of interest to your customers today. Is this something that comes up a lot or do you see [the dark web] as an angle that you can use when you’re selling cyber insurance? There’s a question up there now: How confident are you at discussing the dark web with customers? Do you feel very confident, neutral, or that this is totally new to me? I’ll give you a second to place your vote…. there are some results showing that people are pretty neutral. This is actually really good news. My hope was that you don’t feel very confident in discussing dark web and that you’ll leave today feeling much more so. That gives us a really good place to start from.

Alison I’m going to kick it over to you if you can give our audience a quick overview of what is the dark web. I think having that basic understanding of what it is and what happens there will help us understand how we can then talk to customers about it.

Alison: Excellent. So as Jessica said let’s step back and define the darknet so that you all are all operating under the same kind of information. At this point in time it’s a buzz word that we’ve all heard, especially if you’re paying attention to media or newspapers. It usually comes along with an image of someone in a black hooded sweatshirt with all sorts of code in the background. I want to unravel that a little bit and talk through exactly what the darknet is.  

We at DarkOwl consider the surface web to be anything that’s indexed by a search engine. Think about when you open up Google, you put in a search term, you hit enter. All of those results are by definition the surface web. They are indexed and you can click on them. And interestingly despite hundreds of thousands of results coming up on Google that only represents about 5% of the internet. Hence the iceberg analogy this is the tiny piece that is above the surface of the water.

The deep web makes up essentially the other 95%. The deep web is nothing scary or dangerous. In fact, I guarantee that everyone on this call was on the deep web today or yesterday. The deep web is content that sits behind a username and a password –or content that is not indexed by a search engine. What I mean by that is if I go into Google, I might not be able to pull up the deeds on all the houses in Denver within the search engine. But if I go to denver.gov I can find that information. Or, for instance, I logged into my bank website this morning and I paid my water bill. That is deep web content – I can get there with a username and password but all of you can’t access my bank account. That is where the majority of the internet resides. There’s so much that sits behind usernames and passwords.

The darknet, where DarkOwl specializes, sits below both of those. It is an undefined and hard to quantify space, but, in comparison to the deep web and the surface web it is much, much smaller in volume. The reason it’s important and significant is that by definition the darknet allows you to remain anonymous. That is the darknets defining feature. If you want to nerd out over it, the darknet was actually developed by the US Naval Research Laboratory in the 90s to allow folks serving to remain anonymous. As we all know, what does anonymity bring? It brings an opportunity to do things without being found. Hence the illegal activity that happens on the darknet. But I want to be very clear that the darknet in itself is not a bad thing. It is not illegal to go onto the darknet. You do need to download special software. Some of you may have been on Tor.

The key takeaways here are: for the darknet you have to download special software, its kind of a pain get onto it, however, it is not illegal, and anyone is able to access it, and the defining feature is that you are able to remain anonymous when interacting on the darknet. And the best way to visualize this –and you all will know what generation I was born in by my analogy- when you used to watch the Price is Right you’d have that show with the little ship that would go down through all the slots. That’s essentially what’s happening on the darknet with your IP address. The ability to track someone back to an IP address is almost impossible on the darknet whereas if I go to Cowbell.com, Cowbell has an awesome marketing team, they most likely know where I came from, what my IP address was, what pages I looked at, and how long I stayed there. Those same tracking metrics do not exist on the darknet.

Why do we care about the darknet? Why is it something that we are all on the phone to talk about? The takeaways here are usage on the dark web. I go interchangeably between dark web and darknet – we use them interchangeably at DarkOwl. But there was an 80% increase in usage over the last 3 years. Millions of users are connecting through the Tor browser, which is the best-known darknet out there. This is a very lively and active community of folks. It may not be as big in quantity compared to the deep web or the surface web, but there’s a lot of activity going on there. That’s why we are all focused on it, and that’s why we DarkOwl are in business.

Obviously all of you are in the insurance space –so why is it important to understand what DarkOwl does and what darknets exist out there? The kind of stuff you are going to see on the darknet is exposed credentials, you are going to see IP addresses, you are going to see people buying and selling social security numbers, people trading gift cards, people posting ransomware, and people selling services to conduct malicious activity against organizations. You name it and it is being transacted on the darknet.

As a company, whether you are tiny or humongous, you need to understand what that looks like for your own organization. Jessica and Manu and I think a lot about: how can this data be helpful for our respective clients? And the best way to think of it is as an exposure vector. Most people on the darknet are there because they are doing something illegal and taking advantage of the ability to remain anonymous. If you as an organization have content on the darknet, whether it is emails or trade secrets or anything – that is a concern. That is why we’re all on this call today. We’re going to get into how Cowbell uses that information and what you can all leverage to help inform your clients why it is important for them to understand their darknet presence.

Jessica: You can see here the quantity of data that DarkOwl has, DarkOwl being Cowbells partner for dark web intelligence. What we want to impress upon our customers is that with Cowbell, it’s not just a cyber insurance policy that you are getting. You’re getting all the intelligence that Cowbell has from our partners as well. And DarkOwl is the crème de la crème of dark web intelligence. This is a value-add that’s above and beyond what other cyber insurance carriers might offer. And that’s a huge piece of information to keep in mind when talking to customers. The dark web is not just one place, it’s several different places. Many of you may have heard of platforms like Telegram or Discord. These are encrypted chat spaces that DarkOwl collects information from as well.

Alison, quick question for you: most people think to themselves, I am a small business, my information is probably not on there, and if it is on there what can someone really do with it? Can you speak to the amount of exposure you see for small businesses? Let’s say a bad actor has access to an email address, what could they even do with it?

Alison: Right. We get that question quite a lot. The answer is contrary to what most folks think. The vast majority of attacks, whether it’s ransomware attacks or cyber incidents, are targeted at small and medium businesses. Part of that is because that’s an easier feeding ground. A lot of those small to medium businesses don’t have the tech staff or the budget to have cybersecurity tools in place. Yes, you read in the front of the Wall Street Journal that a Fortune 500 company experienced a huge breach. But the ones you don’t necessarily hear about are all the small and medium businesses that are getting targeted day-in and day-out. The risks can be higher for those small and medium businesses… An IBM is going to be able to weather that storm whereas a small or medium business – they could not in a position to deal with a huge ransomware attack. The first question you ask Jessica – it absolutely is important for anyone whether you are a business of two people or two billion.

And the second question is what can they do with an email address? Quite frankly they can do a lot. They can find their way into that organization. A lot of content we see on the darknet will have passwords associated with it. Think about a hacker that has stolen information. And a small to medium business’ employee uses that same password for their Spotify account that they do internally for work. Because of password re-usage, that hacker can access the internal systems of the small to medium business and take down information. The business could be vulnerable to social engineering. We see a lot of executives targeted at small to medium-sized businesses. There are many vectors present on the darknet that threat actors could use to get into the organization from a technological standpoint or to social engineer their way in.

Jessica: That’s the perfect segway over to Manu which is the “so what?” What does Cowbell do with this data? How do we understand at what level of risk a company faces when it comes to dark web exposure? Manu, if you wouldn’t mind, give us an understanding of how Cowbell uses this data and what is available to customers above and beyond what they see in their dashboard? In their Cowbell portal on the platform.

Manu: Absolutely. The way we look at it, DarkOwl’s data is directly aggravated from forums off the dark web. This is valuable data to Cowbell since we’ve created a dark intelligence score for each one of our policyholders in the form of a Cowbell factor. This score helps us determine what the level of risk is associated with the organization’s exposure on the dark web. If we determine that there is organizational data exposed on the dark web, we’re able to quickly identify the number of documents exposed, and then we notify those Cowbell policyholders to potentially take action through our own platform. Now how does that really affect Cowbell factors, and what we can do for our insurers?

Our dark intelligence telefactor is impacted directly by the number of exposed data points that have surfaced on the dark web. The more exposed documents we identify and the more credentials or passwords that are leaked are associated as a high-risk. The severity of those exposed data points is categorized by low, medium, high, or very high. For example, if we identify that there’s 50 documents exposed on the darkweb for a particular insurer and 25 of those documents were considered hack-worthy data, then we may categorize that as a medium-sized risk. If we go down to 20 exposed documents with only 5 that were considered hack-worthy data than that may be considered a low risk. Versus something where we might find 5,000 documents on a particular insurer and they have 250 documents that are considered hack-worthy data. That would be in either the high or very high category as well. At that point  we identify that risk for our insurer on our Cowbell cyber platform. From there they can go ahead and request additional details, such as what’s behind those documents and what’s actually been exposed. That’s what tends to happen with policy holders. They reach out to the risk engineering team and then from there we create a report for them.

Jessica: So if I want to understand what’s behind the score, you’re saying that I can reach out to the risk engineering team and receive a report. What does that report have in it? Can you show us an example?

Manu: Absolutely, I have one right here… this is a sample report. For full disclosure this is not any actual data on a policy holder or any actual dark web information on a policy holder. This is all make-believe. With DarkOwl’s data, we organize that data into a report that is consumable by IT professionals, by security professionals, and a report that makes sense for management teams and the C-suite as well. We want everyone to be able to look at this report and say: I get it, I see what the risks are, I see what the exposure is.

Our risk manager has done a great job of aggregating that data into a report that’s consumable by all. On the top you’ll see that summary of findings found through the help of DarkOwl’s platform. It will quickly summarize where this data was exposed. This report says the data was exposed in the MGM 2022 breach as well as the leading data source where all types of information was exposed. In this one it highlights the PII that may have been exposed such as date of birth, email addresses, names, actual physical addresses. This was happening for over 142 million records from the MGM breach in 2022. From there the report goes into some of the categories that we have found. The total number of exposed documents that have surfaced on the dark web for this particular insurer will be listed.

We have 555 exposed documents. From there it goes into how many of those are actually exposed credentials with passwords listed whether is plain text passwords, so that would be the actual password, versus something that’s hashed which would be more of a coded password and more difficult for a bad actor to take advantage of. It has listed 5 there. And then 5 is the total number of exposed passwords. This is passwords without a credential associated with it. This will also list out the most recent data that was listed, so you may find data that was listed in 2021, 2020, 2019. This data is as of this year so that makes it even more crucial for an organization to understand that this is a direct exposure, this is a recent breach, and this could be a recent password that an employee is using.

Down here we have a couple of charts. It will tell you the amount of passwords and some of the other data that is exposed such as email, names, phones, and physical address which is conveyed here for a policy holder.

Scrolling down we get into the recommendations that we want some of our policy holders to follow if they do have data exposure, such as what you can do next and how you can mitigate some of the exposure. We have listed some of the best practices and security controls policyholders can apply.

Everything from applying multi-factor authentication to those email accounts that may be exposed, to changing those passwords, creating robust password policies, requiring employees to have alphanumeric passwords, and passwords of at least 10 to 12 characters. That’s the standard right now. With special characters included. Training your employees to identify phishing attempts, having good email hygiene, and not clicking on links if you don’t know who the sender is are what we recommend to our policy holders to apply if they do have any exposure.

Jessica: I want to note that somebody in the chat asked: Is it hard to remove your information if it is on the dark web? Alison thanks for answering it [in the chat function]. In fact, it is impossible to erase information once it is on the dark web. There are two things to keep in mind here. Number one is this set of recommendations. If a customer is highly exposed, however, they are acting on some of these recommendations the exposure will go down with time. The more time that passes the lesser the importance of the exposure, such as if they are old passwords or passwords that are no longer in use or if there’s multi-factor authentication enabled. That’s going to disable a bad actor from using this information to do anything bad.

Manu: And the answer is yes. It is hard to remove that data. We can’t simply call the bad guys and say “hey look can you please delete my data off the dark web?,” they just won’t do it. Once it’s on the dark web it’s most likely on there for good. It’s going to be bought and sold. It’s going to be reposted on other forums for bad actors to buy, for actors to attempt to deploy phishing attempts against, to employ brute force attempts against, so it will always be on there. What the organization should do at that point is mitigate. Be proactive in your approach. Apply best practices. These are some of the recommendations that we initially want the insurer to take advantage of and quickly apply within their environment.

Going onto the next page this will be the actual raw data that we notice from DarkOwl’s aggregation. We’ll list out the email that was leaked and posted on the dark web. It will be the company email most likely. From there we’ll also post whether any password was leaked. In this case the password was leaked. The answer could be yes, no, or could be a hashed password. From there we want to give the policyholders the date that it was published on the dark web. We think that’s important because the more relevant data for the bad actor to take advantage of and to use to compromise your organization will be the most recent data that was posted. We tend to see that with credentials that they get from the current year – those passwords may still be current. Employees may still be using those passwords to login to those accounts. Threat actors take quick advantage of that. Then we’ll list out the data source as well just so the policy holder can understand: was I compromised, or was this from a third party where my data might have been sitting somewhere and the bad actors had access to it that way? If there are any of other types of information included –so email addresses and passwords for this one –you can see some of these emails have a lot more PII associated with them, such as email addresses, names, titles, their LinkedIn IDs, and where they’re living.

Alison: That context that Manu just went over is extremely valuable and I don’t want that to be lost in the details. There are other darknet providers who might be able to say yes, that company has exposure on the darknet. But then it’s end of sentence. And you don’t get the context. Being able to share with that client that these exact email addresses with these exact passwords were a part of this breach is so much more powerful. Think about the mitigation if you are a small-medium sized business and this report comes back and there’s three email addresses on it and three of those employees are no longer with the company and left 4 years ago. You’re not concerned. Or, you come back and this report has 5 email addresses listed on it and every one of those employees was attending a conference last week together – that’s going to be a very different mitigation strategy for that business than the former. The context and the fact that Cowbell can pass that on to you to pass onto the policy holder… is extremely important because it allows them to act on it versus “well there’s information out there, good luck.”

Jessica: Alison when you say valuable, I want to press upon that. A lot of DarkOwl’s customers are cybersecurity companies. They might charge thousands of dollars to a customer per year to provide this in-depth information. Manu, do our customers have to pay for this report?

Manu: No, this is a value-add for being a Cowbell policy holder. It’s one of the many value-adds that we bring to our policy holders, and it’s one of the most frequently requested value-adds that we provide. If you notice that you had an exposure on the dark web, within the same day or within 24 hours we can turn around a report and get it over to your risk managers and your security folks. There’s no added cost associated with utilizing this service.

Alison: I would leverage that highly. When we were prepping for the webinar, Jessica was asking me how I would position it if I was in all of your shoes. I think about comparing different car insurances. If you make that analogy over to cyber insurance, this is the equivalent of getting free oil changes and engine checks. This is a huge value-add especially for small-medium businesses that may not have an IT staff who can be looking on the darknet. I think it’s a freebee that they can take advantage of.

Jessica: Absolutely. Manu, can you answer for us: what are some of the most common questions or trends that you get from customers about the dark web? Is there a common misconception, myth, or concern that your team fields most often?

Manu: The number one question we get after an organization realizes that their data was on the dark web is: “have we suffered a data breach or a cyberattack?” In most cases that we’ve seen the answer is no, it’s just the circumstances – it tends to be that the compromise happened at a third party, and they were storing your data in some capacity and threat actors gained access to it and they posted it on the dark web. Sometimes a bad actor will even mention the data source that they aggregated the data from. There are some cases where it could be direct exposure to your organization, and this indicates a breach. But what we tend to see is that it’s most likely a third-party breach and your data has been posted on the darknet. I would say the next question that we receive often is: “how can I reduce my risk?” and “this data is out there, what do I do?” and “how do I make sure that I don’t get hacked, how do I make sure that I don’t become a target?” It goes back to being proactive to applying those recommendations that we spoke about. Between MFA, email security, training your employees, and having strong passwords –all of that is very important. Those are probably the top two questions we get from policy holders once they notice that they have had some exposure.

Jessica: Manu this question (from the chat) is going to come to you. Do we also use this data as we underwrite and determine premium rates for prospective customers, and if so, is there a way to get a sample of some exposure for clients in advance as we help them consider the value of a Cowbell policy?

Manu: It is factored into the underwriting process if there is exposure on the dark web, however, we do give policyholders a chance to let us know what they are doing to be proactive to reduce their risk. Once underwriting understands [what they are doing to reduce risk] we get comfortable enough with the risk to move forward in the underwriting process.

Alison: Can they share it with their prospective clients?

Manu: Yes. We can certainly provide that data to prospective clients as well.

Jessica: So a broker could reach out in advance and understand what the exposure is so that they can guide that client potentially into a Cowbell policy or elsewhere?

Manu: Yes. As far as sending the actual data over we wouldn’t do that. We would just let them know if there is exposure and the amount of documents we’ve noticed as hack worthy data on the dark web. Then the actual data that is exposed would be shared with the policyholder or the potential client.

Jessica: So the dark web report that you shared is a post-buying experience for the policy holder.  Any final comments? Alison and Manu thank you so much for being here. Do you have any closing thoughts for the audience?

Alison: We’re here and as you can tell we are doing a ton of work in the background and by “we” I mean Cowbell and DarkOwl to try and make this a much more robust policy than some other folks out there so don’t be afraid to come to us, ask questions, and if you have any personal interest in learning more about the darknet, there’s a lot on our website at DarkOwl. We’re just here to help.

Manu: Thanks Alison and I would say that if there’s exposure on the dark web and if you don’t know what to do –come to us, ask, go on the platform and see if there is any indication. And if there is exposure as us to generate a report for you. Again, it’s a value-add for our policy holders so certainly take advantage of it. This helps in several ways. It will help reduce the organization’s chance of suffering a cyber incident related to that exposed data. It also helps underwriters better understand your security posture, and then they can more accurately rate your organization as a safer risk, and that includes during the renewal process as well.

The organization can show Cowbell that they have been proactive, that they have reached out, that they have mitigated against some of these exposures, and they can show us that they are in a strong place for a renewal. Take advantage of the value-add from DarkOwl and Cowbell; it only helps reduce your risk and make your organization a stronger cybersecurity organization.

Jessica: Thank you. Hopefully we’ve given you some things to think about today that you can turn around today, tomorrow, the next, and directly relay to your customers as to why Cowbell… is different in the market than other carriers. We’re using data sources that are absolutely the best in class to help define risk and rate risk. Beyond that we have Manu and his team who are here to help you, guide you, and provide extra information and context throughout the entire lifecycle of a policy.


Cowbell is the leading provider of cyber insurance for small and medium-sized enterprises (SMEs) and the pioneer of Adaptive Cyber Insurance. Cowbell delivers standalone cyber coverage tailored to the unique needs of each business. Our innovative approach relies on AI for continuous risk assessment and continuous underwriting while delivering policyholders a closed-loop approach to risk management with risk prevention, risk mitigation, incident preparedness and response services. To learn more, visit: https://cowbell.insure/

DarkOwl uses machine learning to collect automatically, continuously, and anonymously, index and rank darknet, deep web, and high-risk surface net data that allows for simplicity in searching. Our platform collects and stores data in near real-time, allowing darknet sites that frequently change location and availability, be queried in a safe and secure manner without having to access the darknet itself. DarkOwl is unique not only in the depth and breadth of its darknet data, but also in the relevance and searchability of its data, its investigation tools, and its passionate customer service. Our passion, our focus, and our expertise is the darknet.

Interested in how darknet data applies to your use case? Contact us.

[Presentation Slides] Industrial Control Systems & Operational Technology Threats on the Darknet

October 7, 2022

Industrial control systems (ICS) and their adjacent operational technologies (OT) govern most everything societies rely on in the modern age. Manufacturing facilities, water treatment plants, mass transportation, electrical grids, gas, and oil refineries… all include some degree of ICS/OT incorporated in their industrial processes. Cyberattacks against these are on the rise and the challenge to protect industrial control systems persists. Recent research from DarkOwl analysts specifically identifies an alarming number of threats on the darknet and deep web that could effectively target and compromise critical infrastructure.

DarkOwl is not the only one taking note of these trends and associated challenges. Hybrid CoE, the European Centre of Excellence for Countering Hybrid Threats, has published a Working Paper entitled Defending critical infrastructure: The challenge of securing industrial control systems, diving into the topic of cyber threats affecting industrial control systems, the downstream affects and what can be done from a policy perspective.

Last week, DarkOwl CEO Mark Turnage participated in a webinar, “Defending Critical Infrastructure: The Challenge of Securing Industrial Control Systems” hosted by Hybrid CoE, with speakers from The United States Army College, the Internal Society of Automation (ISA), and the National Institute for Strategic Studies, Ukraine.

The panelists discussed DarkOwl’s recent research in detail, covering topics such as cyber incidents affecting the vulnerabilities of industrial operations, recent examples from Russia’s war against Ukraine, specific OCS/OT threats on the darknet, and potential ways to develop more effective policies.

The slides that Mark Turnage shared during the webinar can be found here:


Curious to learn more about how darknet data can tailor your threat intelligence or provide insight into the threats your face? Contact us.

Copyright © 2024 DarkOwl, LLC All rights reserved.
Privacy Policy
DarkOwl is a Denver-based company that provides the world’s largest index of darknet content and the tools to efficiently find leaked or otherwise compromised sensitive data. We shorten the timeframe to detection of compromised data on the darknet, empowering organizations to swiftly detect security gaps and mitigate damage prior to misuse of their data.