Knock, Knock, Log: Threat Analysis, Detection & Mitigation of Covert Channels in Syslog Using Port Scans as Cover

In this paper, Kevin Lamshöft describes how researchers performed a threat analysis for a covert Command and Control (C2) channel using port scans as cover and syslog as carrier for data infiltration.

Session Chair: So, Kevin is presenting Knock, Knock, Log: Threat Analysis, Detection & Mitigation of Covert Channels in Syslog Using Port Scans as Cover. This paper was written by Kevin Lamshöft, Tom Neubert, Jonas Hielscher, Claus Vielhauer, and Jana Dittmann. So I’ll hand over to you now, Kevin.

Kevin: All right, so thank you for the nice introduction. I’m Kevin Lamshöft, I’m from the Otto-von-Guericke University. This is basically a joint work together with Brandenburg University. So I’m more on the offensive side, and the guys from Brandenburg University are more on the defensive side.

And Jonas Hielscher, actually, he was working with me at Otto-von-Guericke University, but now he’s doing his PhD in Bochum, so we unfortunately lost him, but I hope he’s listening. It was very nice work. I’m a PhD student too, and I’m very happy to talk today a little bit about covert channels.

So information hiding, especially in research, has gained a lot of traction over the last years, especially in network covert channels. And we see this as well in newer malware generations.



So for example, here we have Daxin; Bvp47, which is by the Equation Group, which is associated with the NSA; Lazarus, which is one of the advanced persistent threats; and Tiny Turla, for example, is a small backdoor by the Turla Group, which was apparently used during the withdrawal of the Western states from Afghanistan, so it’s really geopolitical.

And in part, it’s very targeted attacks and the aim basically is to be stealthy, be undetected as long as possible in the networks and do exfiltration or infiltration.

So this is the one hand. And on the other hand, Syslog is one of the go-to standards basically for logging system events and it’s also a network protocol. And I will talk today about why it is so interesting for covert channels. And then there’s another trend we see, which is basically trying to hide in seemingly harmless attacks.

And there was, for example, an incident in an Indian nuclear power plant where they thought, “All right, this is malware which is just a crypto miner,” while actually it was really a targeted attack on gaining insights and exfiltrating sensitive information.

Basically, we took both of these together. We looked at Syslog, and we stumbled upon the fact that we can use port scans to trigger Syslog entries and hide within these port scans.

So why is Syslog so interesting from our perspective as information hiders? So as I said, alongside SNMP, actually, it’s one of the go-to standards for logging system events. And not only that, it’s also a network protocol, so if you have multiple hosts, for example, the logs are often aggregated and sent to a central server.

So they’re going over the network and this is normal Ethernet, IP, TCP, UDP stuff. And usually it’s unencrypted. There are some ways to encrypt it, but it’s mainly unencrypted and therefore you can do some basic network covert channel stuff like hiding in header fields, in payload fields, modifying stuff.

And the other thing is logs are actually basically plain text, right? So you can do interesting things with text steganography, which is like capitalization of letters, uppercase, lowercase, stuff like that. And the other interesting thing is that logs are basically there to log events, right? So from the standpoint of an information hider, you can trigger events in such systems to produce the logs that you want.
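To make the capitalization idea concrete, here is a minimal sketch, not the method from the paper. It assumes a hypothetical scheme in which each ASCII letter of a log message carries one bit (uppercase for 1, lowercase for 0) and the receiver knows how many bits to read. The function names are made up for illustration.

```python
def embed_bits_in_case(cover: str, bits: str) -> str:
    """Hide a bit string in the letter case of a cover text (illustrative only)."""
    out = []
    pending = list(bits)
    for ch in cover:
        if pending and ch.isascii() and ch.isalpha():
            # Uppercase encodes 1, lowercase encodes 0 (assumed convention).
            bit = pending.pop(0)
            out.append(ch.upper() if bit == "1" else ch.lower())
        else:
            out.append(ch)
    if pending:
        raise ValueError("cover text has too few letters for the payload")
    return "".join(out)


def extract_bits_from_case(stego: str, n_bits: int) -> str:
    """Read back the first n_bits bits from the letter case."""
    bits = []
    for ch in stego:
        if len(bits) == n_bits:
            break
        if ch.isascii() and ch.isalpha():
            bits.append("1" if ch.isupper() else "0")
    return "".join(bits)


stego = embed_bits_in_case("connection refused from host", "10110")
recovered = extract_bits_from_case(stego, 5)
```

Normalizing log text to a single case on the way to the log server would remove this particular channel, which is the kind of traffic normalization mentioned later in the talk.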

For example, you can have a certain event logged at a specific point in time and use this to encode messages. And the other thing that makes this very interesting is that Syslog can be retrieved at several locations, I will talk about that soon, and also at different points in time. So basically we have a locational and a temporal aspect. And the temporal aspect is kind of the persistence.

So the interesting thing is that on the host, for example, and we heard this today actually, there was a question about having so much data. If you log everything, you have a lot of data, and therefore you have log rotation, and it can range from hours to days to weeks. It of course depends on how much verbosity you have in the logs.

And then, if you aggregate those logs on central servers, then they might last for months, or even years, or in factories, for example, or nuclear power plants, it’s even the lifetime basically of the factory, for example.

And the other thing is network traffic. If you embed something into Syslog messages and the network traffic is not recorded, then of course it’s basically gone when the network packet is not captured.

And the other aspect that’s coming with this is the locational aspect. So if you embed something into the Syslogs, you can actually retrieve this hidden information at different locations, like on the host systems, on the servers, in the network traffic. And as we heard in the keynote today, of course you have nowadays security information and event management systems, SIEMs, where logs are also sent to.

So this makes it very interesting. And this leads me to the basic threat scenario we used in this paper. And this is actually based kind of on the Tiny Turla backdoor, which is really just a little backdoor to keep the door open and maybe load some further malware components.

And here on top we have Alice, which is the adversary running a server on the internet basically. And just like in the standard cyber kill chain, we have some malware infection using spear phishing, watering hole attacks, whatever you like basically.

And we have an exploitation, then in many cases we have lateral movement, especially in those advanced persistent threats. They generally infect multiple systems and even use them as a proxy. And Tiny Turla, for example, would now start a reverse channel to Alice, like, just an HTTPS call every five seconds asking, “Hey, is there anything I should do?”

But this is actually more likely to be found of course, and we wanted to infiltrate information into this network and basically do this in a very stealthy way. And we came across port scans, for example, using Nmap and the idea is that Alice is performing port scans, which is like knocking on the firewall, seeing which ports are open.

And the idea is basically to hide within the sequence of the ports that are logged, but I will show you this in the next slide. And the basic idea is that the firewall logs the ports. So we have the logs on the firewall, and then they’re aggregated on Syslog servers and even the SIEM systems, from where Bob, which is the actual infected, compromised system in the network, would retrieve the logs.
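The receiving side of this scenario can be sketched as follows. This is an illustration under assumptions, not the retrieval code from the paper: it assumes iptables-style firewall log lines with `SRC=` and `DPT=` fields, that the order of the log lines reflects the order of the probes, and that Bob knows Alice's scanning address. The sample lines and addresses are made up and simplified.

```python
import re

# Destination port field as it appears in iptables-style LOG lines.
DPT_PATTERN = re.compile(r"\bDPT=(\d{1,5})\b")


def ports_from_log(lines, scanner_ip):
    """Return the destination ports probed by scanner_ip, in log order."""
    ports = []
    for line in lines:
        # Trailing space avoids matching 203.0.113.5 inside 203.0.113.50.
        if f"SRC={scanner_ip} " not in line:
            continue
        match = DPT_PATTERN.search(line)
        if match:
            ports.append(int(match.group(1)))
    return ports


# Hypothetical, simplified log lines (documentation address ranges).
SAMPLE_LINES = [
    "kernel: FW-DROP IN=eth0 SRC=203.0.113.5 DST=192.0.2.10 PROTO=TCP SPT=40000 DPT=443",
    "kernel: FW-DROP IN=eth0 SRC=198.51.100.7 DST=192.0.2.10 PROTO=TCP SPT=51000 DPT=25",
    "kernel: FW-DROP IN=eth0 SRC=203.0.113.5 DST=192.0.2.10 PROTO=TCP SPT=40001 DPT=22",
    "kernel: FW-DROP IN=eth0 SRC=203.0.113.5 DST=192.0.2.10 PROTO=TCP SPT=40002 DPT=8080",
]

logged_ports = ports_from_log(SAMPLE_LINES, "203.0.113.5")
```

The same port sequence could be read from the firewall host, the Syslog server, the SIEM, or captured traffic, which is the locational aspect described earlier.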

So this is the basic scenario. We extended this a little bit and also included a kind of reverse channel. So now we have a full command and control channel. So here it’s a DNS covert channel, but it can be basically anything. There are so many published right now. For example, we also have an NTP covert channel and other time synchronization techniques or covert channel techniques to exfiltrate information.

What makes this very interesting here is that it’s quite common in covert channels to use basically symmetric encryption using shared secrets. But in this work we wanted to do it a bit differently.

So we used fully asymmetric encryption and key derivation functions basically to derive what is called a stego key, which is then used to drive the embedding algorithm, so basically how you hide and where specifically you hide in the port scans, but I will show you that in the next slide.

And we also stumbled upon another covert channel, which is actually pretty nice. We figured out that you can store your public keys on GitHub. So if you have a user account, you can store your keys at github.com/yourusername.keys.

And yeah, this is pretty nice because if you want to do key exchanges, for example, you can do this in quite a stealthy way, because you only need an HTTP call to github.com, and that’s very inconspicuous on many networks. And this is actually a pretty nice one, nothing too fancy though, but it’s quite nice. And this is how the embedding then works.

So if we have Alice and she has a hidden message she wants to send to Bob, so this can be malware components, for example, so other parts of the malware, then it’s encrypted using the asymmetric keys, of course. And then we get the ciphertext, and the encryption is actually quite important to us because it increases of course the entropy, so we have a high-entropy bitstream, which we use to encode the message.

So we use a pretty basic encoding here. We really are just using an odd/even scheme in the port number. So basically an even port number encodes a 0 and an odd one a 1. So that’s pretty easy, but the funny thing is actually that with the key derivation function, we derive this stego key.

And from that we derive a seed, and the seed is then used on both sides to seed a random number generator. And basically with that, we select the ports, so for example, in Nmap scans, the first 1024 ports. But the first 24 are always the same and the rest basically is selected fully randomly.

So this is why it’s so interesting to embed there because it’s a random sequence. And therefore the detection of course is quite difficult. And what we do here is we do not use the whole sequence, we use a subset. So basically we can achieve a dynamic bandwidth, a dynamic steganographic bandwidth.

So this is basically the main part and this makes steganalysis of course very difficult, because when you try to attack this covert channel you don’t know actually where the covert message is in the port scan. So even if you know there’s hidden information in the port scan, it’s very hard to find it.
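The odd/even encoding with a keyed subset of positions can be sketched like this. It is a simplified illustration under stated assumptions, not the authors' implementation: the payload bits are assumed to be already encrypted, the seed is derived here with plain SHA-256 instead of a real key derivation function, and `PORT_RANGE`, `SCAN_LEN`, and all function names are hypothetical.

```python
import hashlib
import random

PORT_RANGE = range(1, 1025)  # assumed set of candidate ports
SCAN_LEN = 256               # assumed number of probes in one scan


def carrier_positions(stego_key: bytes, n_bits: int) -> list[int]:
    """Positions in the scan that carry payload bits, derived from the shared key."""
    seed = int.from_bytes(hashlib.sha256(stego_key).digest(), "big")
    rng = random.Random(seed)  # same seed on both sides gives the same positions
    return sorted(rng.sample(range(SCAN_LEN), n_bits))


def embed(bits: str, stego_key: bytes) -> list[int]:
    """Build a port sequence whose parity at the carrier positions spells out bits."""
    carriers = dict(zip(carrier_positions(stego_key, len(bits)), bits))
    cover_rng = random.Random()  # unkeyed randomness for the non-carrier probes
    pools = {
        0: [p for p in PORT_RANGE if p % 2 == 0],
        1: [p for p in PORT_RANGE if p % 2 == 1],
    }
    for pool in pools.values():
        cover_rng.shuffle(pool)
    scan = []
    for pos in range(SCAN_LEN):
        # Even port encodes 0, odd port encodes 1; other positions get random parity.
        parity = int(carriers[pos]) if pos in carriers else cover_rng.randrange(2)
        scan.append(pools[parity].pop())  # each port is used at most once
    return scan


def extract(scan: list[int], stego_key: bytes, n_bits: int) -> str:
    """Read the parity of the ports at the carrier positions."""
    return "".join(str(scan[pos] % 2) for pos in carrier_positions(stego_key, n_bits))


payload = "1011001110001111"
scan = embed(payload, b"shared stego key")
recovered = extract(scan, b"shared stego key", len(payload))
```

Changing `n_bits` changes how many probes carry data, which corresponds to the dynamic bandwidth mentioned above. Someone who sees the scan without the key sees a shuffled list of distinct ports and does not know which positions matter.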

And then, of course, the port scan is performed against the firewall, and this results in the logs. The logs are then distributed on the network, and the malware would now try to retrieve them.

So there are several options, basically. You can do man-in-the-middle kind of stuff, you can compromise the Syslog servers, so there are many, many scenarios. And as I said, if you do it with Nmap, you can come up with basically one kilobit, so 1000 bits, and you can of course increase it by using sub-carriers.

So those are TCP port scans. And so you can use common TCP covert channels, as well. So we took a look at several published covert channels and checked whether they would work in open-source firewalls. Some would work, some would not. This would at least help increase bandwidth, but this is of course then also easier to detect. So it’s always like a cat and mouse game, as well.

So of course we have some indicators of compromise. So the first stage, the common compromise of the system along the kill chain, would of course result in several IOCs. Yeah, so this is basically like the normal malware stuff.

So the interesting thing is then, what IOCs do we have in the covert channels? And it’s not so much. There are of course the port scans, but port scans are quite common, so that’s not that conspicuous. The retrieval, however, would result in some artifacts, which might be detectable.

So the task of my colleague Tom was then basically to detect this covert channel directly in the port scans. We usually do this with handcrafted features and classic machine learning, but this was not working properly. And so he basically came up with the idea, “Okay, maybe we can use deep convolutional neural networks?”

And he came up with an approach using Inception and transfer learning. And the challenge was basically that DCNNs are mostly used for images, right? So how can we represent the port scan as an image? And yeah, he came up with a good idea. So basically the port number is just used as a pixel.

And he came up with basically a 10-bit pixel, because it’s 1000 port numbers, and this results in a gray-value image. And this is a 16-bit PNG file. You have to rescale it a bit to use the DCNN. And then he tried to find basically patterns in this noise, to see if the embedding algorithm, which is inserting the hidden information, would result in certain patterns.
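The port-to-pixel idea can be sketched as follows. The talk does not give the image layout, so the details here are assumptions: ports 1 to 1024 are treated as 10-bit values, stretched to the 16-bit gray range, and arranged in rows of 32 pixels. The sketch writes a 16-bit binary PGM with the standard library as a stand-in for the PNG file mentioned in the talk.

```python
import struct

WIDTH = 32  # assumed row length; 1024 ports give a 32 x 32 image


def ports_to_pixels(ports):
    """Map each port (1..1024) to a 16-bit gray value and arrange them in rows."""
    if len(ports) % WIDTH != 0:
        raise ValueError("number of ports must be a multiple of WIDTH")
    values = []
    for port in ports:
        if not 1 <= port <= 1024:
            raise ValueError(f"port {port} is outside the assumed 10-bit range")
        # Stretch the 10-bit value (0..1023) to the 16-bit range (0..65535).
        values.append((port - 1) * 65535 // 1023)
    return [values[i:i + WIDTH] for i in range(0, len(values), WIDTH)]


def pixels_to_pgm(rows):
    """Serialize the rows as a binary PGM (P5) with 16-bit big-endian samples."""
    header = f"P5\n{WIDTH} {len(rows)}\n65535\n".encode("ascii")
    body = b"".join(struct.pack(f">{WIDTH}H", *row) for row in rows)
    return header + body


rows = ports_to_pixels(list(range(1, 1025)))
image_bytes = pixels_to_pgm(rows)
```

A scan in sorted order gives a smooth gradient, while a shuffled scan looks like noise. The classifier's job is to find structure in that noise that the embedding leaves behind.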

So we did several training sets with different bandwidth. So we have basically no steganography in it, then 256-bit, 512 and 768. And the training data was looking very good, but on the test data, I will get to this actually, so here this is the test data and this is the training data.

So on the test data, we see that at low bandwidth it was basically not very good, but we see at least an increase when we increase the bandwidth, as well. So this is kind of a general concept which we see, which is called steganographic cost. So basically the more you alter the carrier, the easier you make detection. So there are of course still ways to improve.

So what else can we do? So of course one way would be to modify the logging behavior. So basically not logging every single port, but aggregating. So basically, just log that there was a port scan done on port numbers 1-1000 or something like that.
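A minimal sketch of such aggregated logging is shown below. The summary format and function names are made up for illustration and do not correspond to any particular firewall. The point is that the summary is built from the sorted set of probed ports, so the order of the probes, which carries the hidden bits in the scheme described earlier, is not written to the log.

```python
def summarize_ports(ports):
    """Collapse the probed ports into sorted, inclusive (start, end) ranges."""
    ranges = []
    for port in sorted(set(ports)):
        if ranges and port == ranges[-1][1] + 1:
            ranges[-1][1] = port  # extend the current range
        else:
            ranges.append([port, port])
    return [(start, end) for start, end in ranges]


def summary_line(src, ports):
    """One log line per scan instead of one line per probed port."""
    parts = [
        str(start) if start == end else f"{start}-{end}"
        for start, end in summarize_ports(ports)
    ]
    return f"port scan from {src}: {len(set(ports))} ports probed ({', '.join(parts)})"


line = summary_line("203.0.113.5", [443, 80, 81])
```

The set of probed ports is still visible in this summary, so a channel that encodes data in which ports are scanned, rather than in their order, would need a coarser summary.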

And the thing with covert channels in general, or information hiding, is basically to reduce redundancy in the networks and cut down entropy where it’s possible. For example, use only lowercase where you can, do traffic normalization, and basically have better ways to detect modifications to those things. And since these channels work by triggering events, you can also do event-based detection on such networks.

So to sum it up, Syslog I think is very interesting for covert channels and has many opportunities for hiding information in it. Such event-based covert channels are actually quite interesting, and I think there will be more in the future. And hiding in harmless-looking attacks is really a thing, so keep that in mind, especially you incident responders.

And I really recommend active measures like traffic normalizers to make this detection actually easier. And of course, we tried novel approaches and we will see if this will be continued in the future. So thank you very much for listening.

Session Chair: Thank you very much, Kevin. Have we any questions? Yes.

Audience Member 1: I wondered, since it’s very difficult to detect this kind of channel in the log files themselves, did you check how well you can detect the access to the log files, which is now more frequent? Is that easy to detect?

Kevin: Yeah. This is definitely a way to detect such channels, because the malware has to retrieve it somehow. But the problem is that often those Syslog servers are just common Linux servers and are often compromised with the same malware that was used before, basically during the lateral movement. But yes, unusual access to logs is definitely an indicator of compromise, I think.

Session Chair: Okay. Have we any other questions? Kevin, in terms of the firewall to log the events, how was that configured in terms of port accessibility?

Kevin: Yeah. This really makes a difference of course which ports are blocked or not. So, in our lab we just had them in a default state and default logging behavior. But basically, what you would do is to do prior reconnaissance basically to see which ports are open so that you can filter them out. Yeah, that’s absolutely true. There’s a bit of a limitation in that.

Session Chair: Okay. Thank you very much, Kevin. Yes.

Audience Member 2: I was thinking about this really interesting problem about detection based on looking at the ports that are being used. And I was wondering if you considered using any sort of frequency analysis approaches to determining whether what you’re looking at is a random distribution versus one that’s actually encoding data?

Also, are there any heuristics you could potentially pick up on such as, you know, Nmap not repeating scans at the same port, but perhaps an encoding of information might do that?

Kevin: Yeah, yeah. Very interesting. So the thing is we actually used Nmap and modified the source code, so we have basically built an API so that we can give it a list of ports it should use, because by default it’s really using random values from the operating system. So basically, this would be a way maybe, yeah.
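The two heuristics raised in the question can be sketched as simple checks on a logged port sequence. This is an illustration, not an evaluation from the paper. Because the payload in the talk is encrypted before embedding, odd and even ports should be roughly balanced either way, so the parity test is shown mainly to make the idea concrete.

```python
from collections import Counter


def parity_chi_square(ports):
    """Chi-square statistic for odd vs. even port counts against a 50/50 split."""
    n = len(ports)
    if n == 0:
        raise ValueError("no ports to analyse")
    odd = sum(p % 2 for p in ports)
    even = n - odd
    expected = n / 2
    return ((odd - expected) ** 2 + (even - expected) ** 2) / expected


def repeated_ports(ports):
    """Ports that were probed more than once within the same scan."""
    return sorted(p for p, count in Counter(ports).items() if count > 1)


balanced = parity_chi_square([1, 2, 3, 4])
skewed = parity_chi_square([1, 3, 5, 7])
repeats = repeated_ports([80, 443, 80, 22, 22])
```

With one degree of freedom, a statistic above about 3.84 would reject the 50/50 hypothesis at the 5% level. Repeated ports only become an indicator if the embedding reuses ports, which an encoder that draws from a pool of unused ports would avoid.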

Audience Member 2: Interesting. Thank you.

Session Chair: Okay, any other questions? Okay, thank you very much, Kevin.

Kevin: Thank you.
