
Data Driven Network Security with Machine Learning

August 20, 2019 | Written by: Efficient IP


Today's threats are sophisticated: malware is often deployed long before it is activated and relies on complex command and control mechanisms, and theft of data and personal information is a strong motivation for attackers. In this context, DNS traffic plays a vital role in enterprise network security because it reveals the intent of most traffic, whether legitimate or not. DNS-based countermeasures can filter a large share of malicious traffic, but DNS security requires a more evolved way of detecting that traffic, ideally fast enough to neutralize the malware before it acts. This is where dedicated algorithms and data analysis based on machine learning can help enhance network security.

Recursive DNS servers are well placed to see traffic intent

DNS servers, especially recursive ones located near the users, see the traffic intent not only of network devices and computers but also of IoT and industrial devices. Every device connected to an IP network or to the Internet has to exchange traffic with central sites and resources in order to upgrade, push data, fetch content, and communicate with other devices. All of these exchanges start with DNS resolving a domain name to an IP address.

We all know that malicious content can be hidden behind domain names (in URLs), but it can also be concealed within content. Malware, viruses and most other computer threats try to resolve a specific domain name whenever they need to exchange data or receive commands from a central point. Using a static IP address is not optimal for them, as it can be detected by anti-virus signatures. Furthermore, a static IP address offers no way to ensure service continuity if the server is taken down or the IP address is filtered. Using an FQDN (fully qualified domain name) is therefore the standard solution for most malware communication, since it is flexible enough for the malware to keep functioning. DNS is used for malicious traffic such as data exfiltration or contact with a C&C (command and control) service, and the domain names involved can look similar to those used by legitimate services. Keeping up with more advanced threats calls for more advanced options, such as machine learning applied to network security.

Machine learning brings predictive security – data is key

New techniques are required to rapidly detect that a domain name is associated with bad behavior. This is where machine learning and network security come into play. Reputation filtering, which can be enabled in a DNS firewall, is a good option for long-term protection, since discovered domains become known to reputation filtering providers within days or months. In the meantime, we also need to protect our assets, our data and our privacy. A purely statistical approach is not enough in cases such as detecting command and control based on a DGA (Domain Generation Algorithm), so it's important to introduce new ways of analyzing data, in particular machine learning (ML) algorithms for network security, clustering and neural networks.
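To make the limits of a purely statistical approach more concrete, here is a minimal sketch of the kind of lexical features (length, character entropy, digit ratio) that are commonly computed on a domain name. It is an illustration only, not EfficientIP's method: the function names are hypothetical, and taking the second-to-last label as the registered name is a simplifying assumption that breaks on suffixes such as co.uk. On their own these values cannot reliably separate DGA names from legitimate ones, but they could serve as inputs to a trained model.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def domain_features(fqdn: str) -> dict:
    """Compute simple lexical features for a domain name (hypothetical helper).

    Assumption: the label just before the last one is the registered name.
    A real system would use a public suffix list instead.
    """
    labels = fqdn.rstrip(".").lower().split(".")
    label = labels[-2] if len(labels) >= 2 else labels[0]
    digits = sum(ch.isdigit() for ch in label)
    return {
        "length": len(label),
        "entropy": shannon_entropy(label),
        "digit_ratio": digits / len(label) if label else 0.0,
    }


if __name__ == "__main__":
    # Made-up example names, for illustration only.
    for name in ("google.com", "xkq7f9zt2bvw.com"):
        print(name, domain_features(name))
```

A random-looking label tends to score higher on entropy than a dictionary-like one, but short or word-based DGA names can score low, which is one reason such features are usually combined and fed to a classifier rather than thresholded individually.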

To put these advanced algorithms to work, data is key. Data can be gathered at each enterprise site, but it can also be shared between multiple sites, since some patterns are not specific to a single site. Working with ISPs and hosting providers is particularly valuable because they see a large amount of traffic and usage trends. This data must, however, be stored with an appropriate level of anonymization, since it contains a lot of personal information to which regulations can apply. Once you have the data, you need to learn from it: separate the good traffic from the bad, train the machine learning models, apply heuristics, aggregate domains through clustering, and use all the possibilities offered by artificial intelligence (AI) research. These are all key steps towards predictive enterprise network security.
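The anonymization requirement mentioned above can be sketched as follows. This is one possible approach under stated assumptions, not a description of any particular product: client IP addresses are replaced by a keyed hash (HMAC-SHA256) before a DNS log record is stored. The record layout (`client`, `qname`) and the function names are hypothetical. Strictly speaking this is pseudonymization rather than full anonymization, since anyone holding the key can recompute the mapping, so whether it is sufficient depends on the applicable regulation.

```python
import hashlib
import hmac


def pseudonymize_client(client_ip: str, secret_key: bytes) -> str:
    """Replace a client IP with a stable keyed-hash token (truncated HMAC-SHA256)."""
    digest = hmac.new(secret_key, client_ip.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def anonymize_record(record: dict, secret_key: bytes) -> dict:
    """Return a copy of a DNS log record with the client field pseudonymized.

    Assumption: records are dicts with at least a "client" key.
    """
    cleaned = dict(record)
    cleaned["client"] = pseudonymize_client(record["client"], secret_key)
    return cleaned


if __name__ == "__main__":
    key = b"replace-with-a-secret-key"  # hypothetical key, keep it out of the logs
    record = {"client": "192.0.2.10", "qname": "example.com", "qtype": "A"}
    print(anonymize_record(record, key))
```

Because the same client always maps to the same token, per-client patterns can still be learned from the stored data without keeping the original address.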

In addition to historical data, you need the live DNS query flow from clients, so it's important to sit as close as possible to the devices, on the recursive DNS servers. Being near the clients makes it possible to follow each client and its chain of domain resolution requests, and to trigger on specific suspicious patterns. The performance of the recursive DNS server may be impacted if too much processing is done at this level, but this approach is preferable to pushing the DNS traffic back to a cloud service for CPU-intensive analysis.
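As a rough illustration of triggering on a per-client pattern, the sketch below keeps a sliding window of failed resolutions for each client and flags a client once the count crosses a threshold. The choice of signal is an assumption made for this example: a burst of NXDOMAIN responses is often associated with malware probing DGA-generated names, but the article does not say which patterns are actually used. The class name, window size and threshold are hypothetical.

```python
from collections import defaultdict, deque


class NxdomainMonitor:
    """Track NXDOMAIN responses per client over a sliding time window."""

    def __init__(self, window_seconds: float = 60.0, threshold: int = 20):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._events = defaultdict(deque)  # client -> timestamps of NXDOMAINs

    def observe(self, client: str, timestamp: float, rcode: str) -> bool:
        """Record one DNS response; return True if the client looks suspicious."""
        events = self._events[client]
        if rcode == "NXDOMAIN":
            events.append(timestamp)
        # Drop events that have fallen out of the window.
        while events and timestamp - events[0] > self.window_seconds:
            events.popleft()
        return len(events) >= self.threshold


if __name__ == "__main__":
    monitor = NxdomainMonitor(window_seconds=10, threshold=3)
    for t in range(4):
        flagged = monitor.observe("client-a", float(t), "NXDOMAIN")
        print(t, flagged)
```

Keeping only a short queue of timestamps per client keeps the per-query cost low, which matters given the concern above about processing load on the recursive server.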

Add zero-day malicious domains to threat intelligence

The research we are carrying out in the EfficientIP R&D lab is very promising. We are able to detect new domains used in some typical attacks well in advance, and thus protect the devices that use DNS. We can also detect typical horizontally spreading malware that is waiting for a DGA domain to be enabled, and prevent it from spreading. Thanks to the DNS security engine DNS Guardian, analysis is performed on the fly, enabling detection of brand new "zero-day malicious domains", which can then be added to threat intelligence lists. DNS machine learning is a powerful option for enterprise network security strategies.
