How Cloudflare Protects Your Server

Cloudflare is an internet security and performance company. The company provides reverse proxy and CDN services to a large number of enterprises around the world. Cloudflare has an extensive network consisting of millions of clients. The company is impartial towards content, meaning that it serves businesses in different niches. Cloudflare has, thus, acquired a lot of knowledge on server performance and security.

The internet is full of malicious people who try to take down others’ websites, often using malware to carry out the attacks. Forchan, for example, has withstood as many as 350 denial of service (DoS) attacks in a single day, with an average of between 40 and 100 attacks per day.

Protecting your site from a DoS attack is not easy. During such an attack, the attacker tries to overwhelm a web server with junk, which normally consists of raw internet packets. The problem is that the target’s resources are limited. If the attack is large enough, the target is quickly overwhelmed by the junk and is left with inadequate resources to handle traffic from real visitors. The other problem is that perpetrating an attack is a lot easier and cheaper than scaling the victim’s infrastructure.

An attack can target several layers of the stack. The end application is the easiest to overrun: a Python DDoS script sending a few thousand requests per second is enough to overwhelm a vulnerable website. The Linux network stack does not scale well for particular types of packets, because their processing is confined to a single core. As a result, certain packet types will overwhelm Linux at around 300,000 packets per second.

Attackers take advantage of the limited scalability of the victim’s resources. For example, the firewall layer and iptables can handle about two million packets per second before being overwhelmed. A 10-gigabit network card can handle only up to about ten million packets per second. In other words, networking infrastructure has limited resources and can handle at most tens of millions of packets per second, above which problems arise.
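To see where a ceiling of this order comes from, the short calculation below works out the theoretical maximum packet rate of an Ethernet link for minimum-size frames. It is only a back-of-the-envelope sketch: the function name is made up for illustration, and real hardware and drivers usually fall short of this bound, which is why practical figures such as the one quoted above are lower.

```python
# Standard Ethernet framing overhead per packet on the wire.
MIN_FRAME_BYTES = 64        # smallest valid Ethernet frame
PREAMBLE_BYTES = 8          # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12   # mandatory idle time between frames


def max_packets_per_second(link_bits_per_second, frame_bytes=MIN_FRAME_BYTES):
    """Theoretical packet rate of a link carrying frames of a fixed size."""
    wire_bytes = frame_bytes + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES
    return link_bits_per_second / (wire_bytes * 8)


pps_10g = max_packets_per_second(10e9)
print(f"10 Gbit/s, 64-byte frames: {pps_10g / 1e6:.2f} million packets per second")
```

The smaller the packets, the more of them fit on the wire, which is why floods of tiny packets are the hardest case for the receiving machine.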

How Cloudflare Knows That an Attack Is Underway

So, how does Cloudflare know if your server is under attack? The answer is detection tools. Cloudflare uses sFlow for real-time analytics. sFlow is supported by a wide variety of network hardware, and it enables switches to sample the packets that flow through them. For example, a switch may sample one in every thousand or two thousand packets. It then sends the sampled contents to a central aggregation point. Cloudflare uses this technique to gather valuable information about all its data centers, such as the top IP addresses and TCP flags. If your switches don’t support sFlow, there is a software implementation that you can use.
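The idea of sampling and aggregation can be sketched in a few lines. The example below is not sFlow itself and does not parse sFlow datagrams; it only models a switch that keeps roughly one packet in N and an aggregator that scales the sampled counts back up to estimate the top source addresses. The function names and the packet representation (a dictionary with a "src" key) are assumptions made for illustration.

```python
import random
from collections import Counter


def sample_packets(packets, rate, rng):
    """Keep roughly one in `rate` packets, as a sampling switch would."""
    return [p for p in packets if rng.randrange(rate) == 0]


def estimate_top_sources(samples, rate, n=3):
    """Estimate the real packet count per source by scaling sampled counts."""
    counts = Counter(p["src"] for p in samples)
    return [(src, seen * rate) for src, seen in counts.most_common(n)]


rng = random.Random(42)
packets = [{"src": "198.51.100.7"}] * 90_000 + [{"src": "203.0.113.5"}] * 10_000
samples = sample_packets(packets, rate=1000, rng=rng)
print(estimate_top_sources(samples, rate=1000))
```

Because only a small fraction of packets is inspected, the estimates are approximate, but heavy hitters stand out clearly even at low sampling rates.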

How Cloudflare Protects Your Server against Network Congestion

When an attack is large enough to cause network congestion, your traffic is adversely affected. Most networks respond by dropping random packets. A common way to deal with network congestion is to use BGP null routing. This is a special arrangement between your router and your ISP’s router, in which your router asks the ISP’s router to discard packets destined for particular IP addresses. This restores the rest of your service, because the attacked IP addresses are effectively removed from the internet.

A better approach, which Cloudflare uses, is to spread your services across several IP addresses. If Cloudflare identifies an attack on an individual IP address, it moves the services to another IP address and then null-routes the previous one. Cloudflare’s DNS infrastructure is robust and can quickly propagate DNS changes. Moreover, Cloudflare usually reduces its DNS TTL values ahead of time. The attack may follow the service to the new IP address, but the chance of that is small.

How Cloudflare Protects Your Server against High Volume Packet Floods

Let us assume that there is no network congestion and that your network still has some remaining capacity. In this case, you should let the traffic flow. There is no reason to block or null-route traffic as long as there is no congestion.

By allowing traffic to flow, you can use server monitoring tools to create a time-series chart showing the number of packets per second reaching your server. If there is an attack, you will observe a sharp increase in packets per second, which will render your DNS server dysfunctional after a while. You need to ask yourself where those packets come from. Remember that legitimate traffic does not realistically reach 800,000 packets per second on the DNS layer. If you see such numbers, the traffic is from an illegitimate source.
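A sharp increase of this kind is easy to flag automatically. The sketch below compares each sample in a packets-per-second series with the average of the samples just before it. The window size, the multiplication factor, and the sample data are arbitrary values chosen for illustration, not figures from Cloudflare.

```python
def find_spikes(pps_series, window=5, factor=4.0):
    """Return indexes whose value exceeds `factor` times the mean of the
    previous `window` samples."""
    spikes = []
    for i in range(window, len(pps_series)):
        baseline = sum(pps_series[i - window:i]) / window
        if baseline > 0 and pps_series[i] > factor * baseline:
            spikes.append(i)
    return spikes


# Hypothetical per-second packet counts: normal traffic, then a flood.
series = [20_000, 22_000, 19_000, 21_000, 20_000, 23_000, 800_000, 790_000]
print(find_spikes(series))
```

A moving baseline like this adapts to normal daily variation in traffic, while a fixed threshold would need to be retuned as the site grows.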

If you experience a high volume packet flood, the ideal action is to drop the traffic because there is no reason to handle invalid packets. In fact, trying to handle such traffic is a waste of CPU resources because the system will try to parse it. You will also end up littering the internet with responses to random IP addresses.

According to Cloudflare, only one in ten thousand packets is valid during such a flood. So, how would you identify whether a packet is valid or not? There are several ways, the simplest being to look at the packet length. For example, let us assume that you find that the DoS packets are 50 bytes long. You can then drop all the packets that are exactly 50 bytes long. This is not a precise technique because some legitimate packets are likely to be 50 bytes long as well. For this reason, Cloudflare inspects the payload, which limits the false positives.

You can achieve better results if you process the packets on the firewall layer. In the Linux firewall, there are several ways of examining the payload in order to classify packets based on their content and not just their length. This ensures that Cloudflare only drops the illegitimate packets.

BPF bytecode originally came from tcpdump, meaning that you can use tcpdump expressions to match, for example, UDP packets or packets going to a specific port. Tcpdump expressions are a convenient way of crafting BPF. However, they have several limitations, such as the lack of variables.

Cloudflare has open-sourced tools that help in generating BPF bytecode that is a bit more complicated than what tcpdump expressions can produce. This helps you to withstand DoS attacks. You can use such tools to create BPF expressions that match DNS patterns or DNS requests going to particular subdomains. Your BPF expressions can also be case-insensitive and even match invalid DNS packets. These tools are used by most DNS providers around the world and will keep your DNS server running by enabling you to drop all the invalid packets.
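The matching logic that such a filter expresses can be illustrated in ordinary Python. The sketch below is not BPF bytecode and is not taken from Cloudflare's tools; it simply shows what "match requests for a particular domain, case-insensitively, and treat malformed packets as invalid" means at the payload level. The function names and the test domain are hypothetical.

```python
def parse_qname(payload):
    """Return the lower-cased query name of a DNS message, or None if the
    payload is too short or the name is malformed."""
    if len(payload) < 12:          # the DNS header is 12 bytes long
        return None
    labels = []
    pos = 12
    while True:
        if pos >= len(payload):
            return None            # ran off the end before the root label
        length = payload[pos]
        pos += 1
        if length == 0:
            break                  # root label terminates the name
        if length > 63 or pos + length > len(payload):
            return None            # oversized or truncated label
        label = payload[pos:pos + length]
        labels.append(label.decode("ascii", "replace").lower())
        pos += length
    return ".".join(labels)


def should_drop(payload, attacked_domain):
    """Drop malformed packets and queries for the attacked domain."""
    name = parse_qname(payload)
    if name is None:
        return True
    return name == attacked_domain or name.endswith("." + attacked_domain)


def build_query(name):
    """Build a minimal DNS query for an A record (helper for the demo)."""
    header = b"\x12\x34" + b"\x01\x00" + b"\x00\x01" + b"\x00\x00" * 3
    qname = b"".join(
        bytes([len(part)]) + part.encode("ascii") for part in name.split(".")
    ) + b"\x00"
    return header + qname + b"\x00\x01\x00\x01"


print(should_drop(build_query("WwW.Example.COM"), "example.com"))
print(should_drop(build_query("other.org"), "example.com"))
```

Doing the same test in BPF at the firewall layer avoids handing the packet to the DNS server at all, which is where the CPU savings come from.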

How Cloudflare Protects Your Server against Interrupt Storms

An interrupt storm occurs when your server receives at least two million packets per second. This will overwhelm your Linux machine because it will commit all its resources to handling all the incoming traffic. It will only focus on network processing and not have adequate resources to run any other applications. There is a way around interrupt storms, but it is important first to understand how network cards work.

Modern network interface cards (NICs) can split traffic across several CPUs. They work through an abstraction known as the “receive queue,” and each CPU has only one receive queue. Each CPU does the typical processing, which involves receiving the packets and passing them through the firewall and the network stack before they finally reach the application. This is a lot of work, and the general purpose operating systems in common use today do not do it efficiently.

The first and most commonly used method is known as kernel bypass, where the kernel is skipped entirely and you receive all the packets directly in userspace. This helps in enhancing performance. However, this technique has a disadvantage: by skipping the kernel altogether, you are forced to do all the processing yourself. Moreover, the kernel bypass technique only allows one application to use the network card, and you cannot afford to dedicate the network card to a single kernel bypass application.

The alternative is to use a technique known as partial kernel bypass, which is also known as a bifurcated driver. The fundamental idea is to keep most of the receive queues going to the kernel. One receive queue is then dedicated to fast processing in userspace. This gives you the benefit of a healthy kernel and efficiently functioning applications because you offload the fast processing to userspace. With this idea in mind, Cloudflare has created several patches for netmap, which is an open source framework supported on FreeBSD and Linux.

Cloudflare uses several techniques to avoid interrupt storms. It has a userspace offload application that sits idle, listening to the iptables statistics of the Linux kernel. Whenever a denial of service attack is detected and found likely to cause interrupt storms, the application starts the userspace offload and takes ownership of the IP address that is under attack. It then runs the BPF filters very efficiently, close to the hardware and without any kernel interrupts.
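The control loop described above can be modelled as follows. This is a simplified sketch and not Cloudflare's offload application: it does not read iptables counters or touch the network card. It only shows the decision logic, given cumulative packet counters per destination IP address. The class name, the threshold, and the addresses are hypothetical.

```python
class OffloadController:
    """Decide which destination IPs to hand over to a userspace fast path."""

    def __init__(self, threshold_pps):
        self.threshold_pps = threshold_pps
        self.previous = {}      # last seen cumulative packet count per IP
        self.offloaded = set()  # IPs currently owned by the fast path

    def observe(self, counters, interval_seconds):
        """Take cumulative packet counts per IP and update the offload set."""
        for ip, total in counters.items():
            # On the first observation there is no baseline, so the rate is 0.
            delta = total - self.previous.get(ip, total)
            if delta / interval_seconds > self.threshold_pps:
                self.offloaded.add(ip)
        self.previous = dict(counters)
        return self.offloaded


controller = OffloadController(threshold_pps=1_000_000)
controller.observe({"192.0.2.1": 0, "192.0.2.2": 0}, interval_seconds=1.0)
print(controller.observe({"192.0.2.1": 3_000_000, "192.0.2.2": 50_000}, 1.0))
```

Only the attacked address is moved to the fast path, so traffic for every other address keeps flowing through the normal kernel stack.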

This is what allows Cloudflare to scale BPF filtering and prevent interrupt storms altogether. This works very well, and Cloudflare has been able to handle about three million packets per second using a single CPU. On a typical day, Cloudflare drops around 75 million packets. Even though these attacks are spread across different machines, each with several CPUs, a lot of work is involved.

However, BPF is not very useful against some attacks. For example, assume that you are experiencing an ACK flood. What is an ACK flood? It is a flood of packets from unknown source IPs that have random acknowledgment numbers and a single bit set in the TCP flags field (the ACK bit). The problem is that it is hard to distinguish between valid and invalid packets, and any one of those packets could be valid. To make things worse, all the ACKs are serialized against one bigger structure in the kernel, meaning that the performance is usually poor.

The good news is that it is relatively easy to work around this problem in Linux using the stateful firewall known as conntrack. For this purpose, conntrack only works if you disable the tcp_loose setting, which is usually enabled by default. This allows you to drop around two million packets per second. It is useful against TCP attacks such as ACK, FIN, RST, and Xmas floods. It does not, however, work well against SYN floods. SYN floods are difficult to deal with because, if you enable conntrack and experience a SYN flood, conntrack will worsen your performance by trying to create state for every new SYN packet.
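The behaviour of a strict stateful firewall can be illustrated with a toy model. The class below is a deliberately simplified assumption, not the conntrack implementation: it ignores sequence numbers, timeouts, and connection teardown. It only shows why stray ACK packets are cheap to reject and why every SYN costs a new table entry.

```python
class StrictTracker:
    """Toy stateful filter: non-SYN packets must belong to a known flow."""

    def __init__(self):
        self.flows = set()

    def accept(self, src, sport, dst, dport, flags):
        key = (src, sport, dst, dport)
        if flags == {"SYN"}:
            # Every SYN allocates state, which is what hurts in a SYN flood.
            self.flows.add(key)
            return True
        # ACK, FIN, RST, or Xmas packets without a known flow are dropped.
        return key in self.flows


tracker = StrictTracker()
print(tracker.accept("198.51.100.7", 40000, "192.0.2.1", 443, {"ACK"}))
print(tracker.accept("198.51.100.7", 40000, "192.0.2.1", 443, {"SYN"}))
print(tracker.accept("198.51.100.7", 40000, "192.0.2.1", 443, {"ACK"}))
```

With loose tracking, the first ACK would have been allowed to create a flow on its own, which is why the setting has to be disabled for this defence to work.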

This is simply how SYN handling works in Linux. It relies on two data structures: one known as the listen backlog and the other as the SYN backlog. The SYN backlog is responsible for handling incoming SYN packets, for which the SYN-ACK packet is then sent out. If you are under a SYN flood attack, this data structure fills up, and new SYN packets are dropped.

Cloudflare handles SYN floods using SYN cookies. SYN cookies work by sending the SYN-ACK without storing any state. The drawback is that, with no state stored, the information carried in the original SYN packet, such as its TCP options, is lost. To retain some of that information, you can enable TCP timestamps, which allows values such as the window scaling factor to be preserved.
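The principle behind SYN cookies can be shown with a short sketch. This is not the cookie format that Linux uses; it is a minimal illustration, assuming an HMAC over the connection details and a coarse time counter, of how a server can validate the final ACK of a handshake without having stored anything when the SYN arrived. The secret, the function names, and the addresses are hypothetical.

```python
import hashlib
import hmac
import struct

SECRET = b"hypothetical-server-secret"  # assumption: a per-server random key


def make_cookie(src, sport, dst, dport, client_isn, counter):
    """Derive the server's initial sequence number from the connection
    details instead of storing them."""
    message = f"{src}:{sport}>{dst}:{dport}|{client_isn}|{counter}".encode()
    digest = hmac.new(SECRET, message, hashlib.sha256).digest()
    return struct.unpack("!I", digest[:4])[0]


def check_ack(src, sport, dst, dport, client_isn, ack_number, counter):
    """Validate the final ACK: it must acknowledge cookie + 1 for the
    current or the previous time slot."""
    for slot in (counter, counter - 1):
        expected = (make_cookie(src, sport, dst, dport, client_isn, slot) + 1) % 2**32
        if expected == ack_number:
            return True
    return False


cookie = make_cookie("198.51.100.7", 40000, "192.0.2.1", 443, 1000, counter=5)
ack = (cookie + 1) % 2**32
print(check_ack("198.51.100.7", 40000, "192.0.2.1", 443, 1000, ack, counter=5))
```

A spoofed source never sees the SYN-ACK, so it cannot produce a matching acknowledgment number, and the server has spent no memory on it.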

However, even with all these enabled, Linux will still not be able to handle more than 300,000 packets per second because all the SYN cookies are still serialized against one data structure. Thankfully, there is a way to deal with this. The idea is to remove the LISTEN lock, which involves massive refactoring of the SYN queue. This was made possible by a patch submitted by Eric Dumazet from Google in October 2015.

How Cloudflare Protects Your Server against Botnet Connections

What happens if you are the victim of a botnet attack? A proper botnet runs on real machines, each with a real TCP/IP implementation, a real IP address, and a real network stack. This means that you will not see a very high volume of incoming packets, which is the difference between a botnet attack and a layer 3 high-volume packet flood. Even though bots don’t send a lot of packets, there are symptoms that help to identify them. One symptom is a concurrent connection count that continually increases. Another symptom is that most sockets will be in an “orphaned” state. Sockets in the “time wait” state indicate churn, meaning that the connections come and go quickly.

Botnet attacks can quickly overwhelm the end application. The good news is that since these are real bots with real IP addresses and real TCP implementations, each bot will try to inflict the maximum amount of damage. This means that Cloudflare can recognize botnets through the identification of the top IP addresses and limit traffic based on IP reputation.

When Cloudflare has identified an incoming attack, the first thing it does is to enable conntrack with connlimit. This ensures that a single IP address will not be allowed to make too many concurrent connections to Cloudflare’s servers. Second, Cloudflare enables hashlimit, which limits the rate of SYN packets per IP. This ensures that the damage caused by a single IP is minimal. Third, Cloudflare uses ipset, which allows it to blacklist and whitelist IP addresses manually. Ipset also supports timeouts and subnets, and hashlimit can be used to feed the blacklist automatically.
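The three mechanisms can be combined in a small model. The class below is an illustrative sketch rather than the iptables modules themselves: a per-IP cap on concurrent connections stands in for connlimit, a per-IP token bucket for hashlimit, and a blacklist with expiry for an ipset with timeouts. All names and limits are hypothetical.

```python
from collections import defaultdict


class PerIpLimiter:
    def __init__(self, max_concurrent, syn_rate, syn_burst, ban_seconds):
        self.max_concurrent = max_concurrent  # concurrent connections per IP
        self.syn_rate = syn_rate              # tokens refilled per second
        self.syn_burst = syn_burst            # bucket capacity
        self.ban_seconds = ban_seconds        # how long a blacklist entry lasts
        self.open_connections = defaultdict(int)
        self.buckets = {}                     # ip -> (tokens, last update time)
        self.banned_until = {}                # ip -> expiry time

    def allow_syn(self, ip, now):
        if self.banned_until.get(ip, 0.0) > now:
            return False                      # still blacklisted
        if self.open_connections[ip] >= self.max_concurrent:
            return False                      # too many concurrent connections
        tokens, last = self.buckets.get(ip, (self.syn_burst, now))
        tokens = min(self.syn_burst, tokens + (now - last) * self.syn_rate)
        if tokens < 1.0:
            # Rate exceeded: blacklist the address for a while.
            self.buckets[ip] = (tokens, now)
            self.banned_until[ip] = now + self.ban_seconds
            return False
        self.buckets[ip] = (tokens - 1.0, now)
        self.open_connections[ip] += 1
        return True

    def close(self, ip):
        if self.open_connections[ip] > 0:
            self.open_connections[ip] -= 1


limiter = PerIpLimiter(max_concurrent=2, syn_rate=1.0, syn_burst=5.0, ban_seconds=60.0)
print([limiter.allow_syn("203.0.113.9", now=0.0) for _ in range(3)])
```

The expiry on the blacklist matters because bot addresses are often ordinary infected machines, which may belong to legitimate visitors once the attack is over.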

Cloudflare also disables HTTP keep-alives. The idea is that a single bot will try to do as much damage as possible by executing the same query over and over again on the same TCP connection. By disabling HTTP keep-alives, Cloudflare limits the activities of the bots.

How Cloudflare Protects Your Server against Very Large Botnets

The techniques for dealing with small botnets are not very useful when it comes to handling massive botnet attacks. For starters, large botnets have an enormous range of IP addresses. Even if each of the IP addresses is allowed to connect only once every second, these bots can do a lot of damage. Cloudflare blacklists the IPs of huge botnets by inspecting their payload and cutting off the connections that belong to botnets before they reach the application layer. This means that the TCP connections will be established, but they will be forced to wait until they time out. This neutralizes the attack traffic and shields the application from it. Cloudflare recently published a blog article about a massive attack, which explains all the techniques that were used.