By Markus Happe, November 2019
Many cyber attacks follow the same logic: After being successfully placed in an organization’s IT network, most malware types need further guidance from the outside (i.e. their controller). They rely on commands to know where to go next or what data to exfiltrate.
Such a communication channel is quickly established: the attacker decides on a domain where attacker and malware will meet after the infection and encodes this domain in the malware. On the downside, once such communication is detected, the attack is easily contained by blocking that particular domain, which prevents any further communication between the malware and the attacker. In our drone analogy, this would be the moment you rush, broken remote in hand, to the place where you last saw your drone, with the difference that a cyber attacker cannot easily do the same.
To avoid a complete communication breakdown, attackers nowadays often resort to more sophisticated methods such as domain generation algorithms (DGAs). With this method, attackers do not encode a specific domain name in the malware but an algorithm that generates pseudo-random domain names. Such an attack unfolds in the following three steps (corresponding to the figure below):
The attacker places the malware in the victim’s system (infection)
The attacker registers one or several algorithmically generated domains (AGDs). The malware starts generating domain names with the same algorithm and tries to resolve each of them to an IP address. If the malware’s AGD exists, i.e. has been registered by the attacker, it can be used to connect to the corresponding C&C server. If the malware’s AGD does not exist, the malware tries another one.
Once a communication channel is established, it can be used to control the malware and exfiltrate or encrypt data.
In case the victim detects the attack and blocks the corresponding domain/IP, the malware starts generating new AGDs until it finds one that has been registered by the attacker, but not yet blocked by the victim.
This dynamic method complicates the containment of an attack significantly and renders common blacklists largely useless. By the time one IP/domain is listed on a blacklist, the attacker has long switched to another one.
Detection mechanisms must be as dynamic as the attack itself. Together with the Zurich University of Applied Sciences (ZHAW), we researched the characteristics and detection methods of DGA attacks. One research aspect I find fascinating is that the main advantage of such attacks, i.e. the generation of random domains, is also their Achilles heel when it comes to their detection. AGDs often exhibit certain features which differentiate them from “normal” domains. We call these features detection features:
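The generate-and-try loop from the steps above can be illustrated with a deliberately simple toy sketch. It is not the algorithm of any real malware family: the seed, the date-based input, the SHA-256 mapping and the names `generate_domains` and `find_rendezvous` are all assumptions made for illustration, and the lookup is simulated so that no network access takes place.

```python
import hashlib
from datetime import date
from typing import Callable, Iterable, Iterator, Optional


def generate_domains(seed: str, day: date, count: int, tld: str = ".org") -> Iterator[str]:
    """Yield `count` pseudo-random domain names derived from a shared seed and a date.

    Toy scheme (an assumption, not a real DGA): hash seed, date and index,
    then map the first 12 hex digits of the digest to the letters a-p.
    """
    for index in range(count):
        material = f"{seed}-{day.isoformat()}-{index}".encode("utf-8")
        digest = hashlib.sha256(material).hexdigest()
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:12])
        yield label + tld


def find_rendezvous(
    candidates: Iterable[str],
    is_registered: Callable[[str], bool],
    blocked: frozenset = frozenset(),
) -> Optional[str]:
    """Return the first candidate that is registered and not blocked, else None.

    `is_registered` stands in for a DNS lookup; here it is simulated.
    """
    for domain in candidates:
        if domain in blocked:
            continue  # the victim has already blocked this domain
        if is_registered(domain):
            return domain
    return None


if __name__ == "__main__":
    day = date(2019, 11, 1)
    candidates = list(generate_domains("demo-seed", day, 5))
    # Simulate an attacker who registered two of the generated names.
    registered = {candidates[2], candidates[4]}
    first = find_rendezvous(candidates, registered.__contains__)
    print("first meeting point:", first)
    # After the victim blocks the first domain, the loop moves on to the next one.
    fallback = find_rendezvous(candidates, registered.__contains__, blocked=frozenset({first}))
    print("after blocking:", fallback)
```

Because both sides derive the same list from the same inputs, blocking a single domain only moves the meeting point further down the list. The same determinism can help defenders: once an algorithm and its seed have been reverse-engineered, upcoming domains can be computed in advance.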
Lexical features: AGDs are often assembled in a random alphanumeric layout (e.g. 742jlr7-58wf4.org) whereas “normal” domains are based on meaningful text (e.g. exeon.ch). However, hackers already address this weakness by increasingly using random but dictionary-based domain names (e.g. go-29-bank.com).
Temporal features: In the process of finding the domain registered by the attacker, the malware typically contacts several non-existent domains. A large number of such failed requests (NXDOMAIN responses) can point towards the use of DGAs.
External features: Features of the domain itself could suggest maliciousness. AGDs, for example, often exhibit missing or incomplete data on their owner (missing WHOIS features) or have more IP addresses mapped to them than usual domains.
Dynamic detection tools, such as ExeonTrace, use machine learning and big data analytics to detect those features and expose AGDs. Contrary to common blacklists, such tools can closely follow the attacker, even if the attacker uses new domains. The picture below shows several domains that all exhibit an unusual name pattern and failed connections and were thus flagged by ExeonTrace.
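To make the lexical features more tangible, here is a minimal sketch that computes a few simple statistics for a domain name. The selection of features (length, character entropy, digit ratio, vowel ratio) and the function names are assumptions for illustration, not the feature set used in our research or in any product.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy in bits; 0.0 for an empty string."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def lexical_features(domain: str) -> dict:
    """Compute simple lexical statistics for the part before the last dot.

    Crude assumption: everything before the last dot is the name,
    so multi-part suffixes such as .co.uk are not handled.
    """
    name = domain.lower().rsplit(".", 1)[0]
    letters = [c for c in name if c.isalpha()]
    digits = sum(c.isdigit() for c in name)
    vowels = sum(c in "aeiou" for c in letters)
    return {
        "length": len(name),
        "entropy": shannon_entropy(name),
        "digit_ratio": digits / len(name) if name else 0.0,
        "vowel_ratio": vowels / len(letters) if letters else 0.0,
    }


if __name__ == "__main__":
    for domain in ("742jlr7-58wf4.org", "exeon.ch", "go-29-bank.com"):
        print(domain, lexical_features(domain))
```

On their own, such statistics are weak signals: entropy grows with name length, and dictionary-based AGDs are built to look unremarkable, which is why lexical features are usually combined with temporal and external ones.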
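The failed-lookup signal can be sketched with a simple rule over DNS log records. This is not how ExeonTrace works internally; the record format `(client, domain, response code)`, the function name `suspicious_clients` and both thresholds are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict
from typing import Iterable, List, Tuple

# Assumed log format: (client address, queried domain, DNS response code)
DnsRecord = Tuple[str, str, str]


def suspicious_clients(
    records: Iterable[DnsRecord],
    min_failed_domains: int = 10,
    min_failure_ratio: float = 0.5,
) -> List[str]:
    """Return clients with many distinct failed lookups and a high failure ratio.

    Distinct domains are counted so that one mistyped name that is
    retried many times does not look like a DGA.
    """
    totals = Counter()          # all queries per client
    failures = Counter()        # NXDOMAIN responses per client
    failed_domains = defaultdict(set)  # distinct non-existent domains per client
    for client, domain, rcode in records:
        totals[client] += 1
        if rcode == "NXDOMAIN":
            failures[client] += 1
            failed_domains[client].add(domain.lower())
    return sorted(
        client
        for client in totals
        if len(failed_domains[client]) >= min_failed_domains
        and failures[client] / totals[client] >= min_failure_ratio
    )


if __name__ == "__main__":
    log = [("10.0.0.5", f"x{i}.org", "NXDOMAIN") for i in range(12)]
    log += [("10.0.0.5", "exeon.ch", "NOERROR")] * 2
    log += [("10.0.0.7", "exeon.ch", "NOERROR")] * 20
    print(suspicious_clients(log))
```

A fixed threshold like this produces false positives (for example, misconfigured software that queries many stale names), so in practice it would only be one input among several.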
Would you like to learn more about DGAs? Then read the additional articles listed below or contact us.
C&C Server: The command and control (C&C) server is a computer controlled by a cybercriminal which is used to send commands to systems compromised by malware and receive stolen data from a target network.
DGA: Domain generation algorithms are commonly used by malware to generate a large number of pseudo-random domain names which are used as meeting points between the malware and its controller.
DNS: The Domain Name System is the system that translates Internet domain names into IP addresses.
Find additional articles on the topic here:
The DGA of DirCrypt
Interested in seeing what DGA code looks like? This article vividly describes the reverse engineering of the DGA used by the inactive ransomware DirCrypt.
Conficker and its legacy: An overview of the Conficker worm.
One of the early adopters of the DGA technique was the infamous Conficker worm. Travel back in time with this article.
Battle of the machines
Listen to a guest lecture I gave at the University of Zurich on the arms race between automated attack and defense mechanisms. [Lecture is in German]