How to Apply Neural Network-Like Techniques to Asset Profiling in Automated Penetration Testing

by | May 7, 2020 | AI in Automated Pen Test

The asset profile is the starting point for planning an attack chain, and many decisions made along the chain are based on its results. Asset profiling is therefore about understanding the target thoroughly. As the old saying goes, “We need to know our enemy thoroughly in order to win the battle.”

Conventional asset recognition techniques generally rely on rule matching or pattern matching, but some attribute sets cannot be identified with rules. For example, an HP printer’s website may be displayed in Arabic or Greek, which we do not understand. Yet when we see the website, we still know it belongs to a printer. Why?

Because its layout, style, and structure can serve as attributes, and together these attributes tell us that it is a printer’s website. Such attributes are difficult to describe with traditional rule-based expressions or regular expressions, but they are well suited to AI-based asset profiling. This is the first point.
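To make the idea of structure as an attribute concrete, here is a minimal sketch in Python. It is not RidgeBot's method: it swaps the neural network for a much simpler stand-in (tag-frequency vectors compared by cosine similarity), and the three sample pages and all function names are hypothetical. It only illustrates that two pages with the same layout can look alike even when their text is in different languages.

```python
from collections import Counter
from html.parser import HTMLParser
import math


class TagCounter(HTMLParser):
    """Counts opening tags and ignores all text content."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()

    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1


def structure_vector(html):
    """Return a tag-frequency profile of a page (independent of its language)."""
    parser = TagCounter()
    parser.feed(html)
    return parser.tags


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical pages: same layout in English and Greek, plus an unrelated page.
printer_en = ("<html><body><table><tr><td>Printer</td><td>80%</td></tr></table>"
              "<form><input><input></form></body></html>")
printer_el = ("<html><body><table><tr><td>Εκτυπωτής</td><td>80%</td></tr></table>"
              "<form><input><input></form></body></html>")
blog_page = ("<html><body><h1>News</h1><p>a</p><p>b</p><p>c</p>"
             "<a href='#'>more</a></body></html>")

same_layout = cosine_similarity(structure_vector(printer_en),
                                structure_vector(printer_el))
different_layout = cosine_similarity(structure_vector(printer_en),
                                     structure_vector(blog_page))
print(same_layout, different_layout)
```

The two printer pages score far higher against each other than against the unrelated page, although no rule ever looked at the text. A real profiler would use much richer features (styles, DOM depth, rendered appearance) and a learned model rather than a fixed similarity measure.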

The second point is that in an actual attack, asset identification is often performed many times, because the target’s features may be deliberately removed or even forged. An approach based only on simple rule matching is easily defeated. So instead of traditional matching, RidgeBot runs asset profiling using a neural network approach.

The last point is that an asset’s features are not expressed only by its string characteristics or visual attributes; the asset also has relationships with many network entities. We have raised asset recognition to another level, which we call “asset profiling”. What does that mean? We build knowledge graphs and then use that knowledge to mine the relationships between assets and other entities, organizations, and even people. Many assets in a network are actually defined by such relationships.

For example, suppose that as a hacker I have gotten into a network with many servers, and I find that one server receives a high volume of inbound traffic, that the other servers are all connected to it, and that it also sends a large quantity of data downstream. I may initially assume that this server provides some kind of data storage service, and I will no doubt try to attack it first.
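The reasoning in this example can be sketched as a small graph computation. The following Python is only an illustration under assumptions: the flow records, host names, and the ranking rule (most distinct peers first, then most inbound bytes) are all hypothetical, and a real knowledge graph would carry far more kinds of relationships.

```python
from collections import defaultdict

# Hypothetical flow records: (source, destination, bytes transferred).
flows = [
    ("web-1", "srv-x", 900_000),
    ("web-2", "srv-x", 750_000),
    ("app-1", "srv-x", 1_200_000),
    ("srv-x", "backup-1", 2_500_000),
    ("web-1", "app-1", 40_000),
    ("web-2", "app-1", 35_000),
]


def rank_hubs(flow_records):
    """Rank hosts by number of distinct peers, then by inbound byte volume."""
    inbound = defaultdict(int)
    outbound = defaultdict(int)
    peers = defaultdict(set)
    for src, dst, size in flow_records:
        outbound[src] += size
        inbound[dst] += size
        peers[src].add(dst)
        peers[dst].add(src)
    hosts = sorted(peers, key=lambda h: (len(peers[h]), inbound[h]), reverse=True)
    # Each entry: (host, distinct peers, inbound bytes, outbound bytes).
    return [(h, len(peers[h]), inbound[h], outbound[h]) for h in hosts]


ranking = rank_hubs(flows)
top_host = ranking[0][0]
print(ranking[0])
```

With this sample data the hypothetical host `srv-x` ranks first: every other server talks to it, it receives the most data, and it pushes a large volume downstream. That is the same pattern that suggests a data storage role in the example above.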

As another example, suppose I find an entity that I cannot recognize by other methods. All I know is that on weekdays from 9 am to 6 pm it has a relationship with one entity, and on weekends it has a relationship with other entities. I may infer that this entity is a mobile phone or a notebook: during weekday working hours it links to the office network, and on weekends to a home network. Constructing such relationships helps to identify assets. For example, if I have broken into an office network and can recognize which laptop is the CEO’s or HR’s, I will definitely prioritize it, as the information obtained from those computers tends to be more valuable.
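The weekday and weekend pattern can also be written down as a simple heuristic. This sketch is an assumption-laden illustration, not the product's logic: the observation records, network names, and the `looks_portable` rule are hypothetical, and the 9-to-18 working window simply mirrors the hours mentioned above.

```python
from collections import Counter
from datetime import datetime

# Hypothetical observations: (timestamp, network the entity was linked to).
observations = [
    (datetime(2020, 5, 4, 10, 0), "office-lan"),   # Monday
    (datetime(2020, 5, 5, 14, 30), "office-lan"),  # Tuesday
    (datetime(2020, 5, 6, 9, 15), "office-lan"),   # Wednesday
    (datetime(2020, 5, 9, 11, 0), "home-isp"),     # Saturday
    (datetime(2020, 5, 10, 16, 45), "home-isp"),   # Sunday
]


def dominant_network(obs):
    """Return the most frequently seen network, or None if there is no data."""
    counts = Counter(network for _, network in obs)
    return counts.most_common(1)[0][0] if counts else None


def looks_portable(obs, work_start=9, work_end=18):
    """Guess whether an entity moves between networks like a phone or laptop.

    True when the dominant network during weekday working hours differs
    from the dominant network on weekends.
    """
    work = [o for o in obs
            if o[0].weekday() < 5 and work_start <= o[0].hour < work_end]
    weekend = [o for o in obs if o[0].weekday() >= 5]
    work_net = dominant_network(work)
    weekend_net = dominant_network(weekend)
    return (work_net is not None and weekend_net is not None
            and work_net != weekend_net)


is_portable = looks_portable(observations)
print(is_portable)
```

An entity that stays on the same network all week would not be flagged, while one that switches between an office network and a home network would be. Telling the CEO's laptop from any other portable device would need additional relationships, such as which accounts or services the device talks to.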