[Crypto][CTF]hashbrown - Length Extension Attack

Crypto 2024. 11. 13. 01:19

Description

I made ~~fresh hashbrowns~~ fresh hash function.

- hashbrown.py

import os

# pip install pycryptodome
from Crypto.Cipher import AES

flag = "bctf{????????????????}"
secret = os.urandom(16)

my_message = "\n".join(
    [
        "Grate the raw potatoes with a cheese grater, place them into a bowl and cover completely with water. Let sit for 10 minutes.",
        "Drain the grated potatoes well; if this is not done thoroughly the potatoes will steam instead of fry.",
        "Mix in chopped onions by hand.",
        "Mix the egg OR flour into the hash brown mixture evenly. This will allow the hash browns to stay together when frying.",
        "Place a large frying pan on medium-high heat and add enough oil to provide a thin coating over the entire bottom of the pan.",
        "When the oil has come up to temperature apply a large handful of potatoes to the pan and reshape into a patty that is about 1/4-1/2 inch (6-12 mm) thick. The thinner the patty, the crispier the hash browns will be throughout.",
        "Flip when they are crisp and brown on the cooking side. They should also stick together nicely before they are flipped. This should take about 5-8 minutes.",
        "The hash browns are done when the new side is brown and crispy. This should take another 3-5 minutes.",
    ]
).encode()


def aes(block: bytes, key: bytes) -> bytes:
    assert len(block) == len(key) == 16
    return AES.new(key, AES.MODE_ECB).encrypt(block)


def pad(data):
    padding_length = 16 - len(data) % 16
    return data + b"_" * padding_length


def hash(data: bytes):
    data = pad(data)
    state = bytes.fromhex("f7c51cbd3ca7fe29277ff750e762eb19")

    for i in range(0, len(data), 16):
        block = data[i : i + 16]
        state = aes(block, state)

    return state


def sign(message, secret):
    return hash(secret + message)


def main():
    print("Recipe for hashbrowns:")
    print(my_message)
    print("Hashbrowns recipe as hex:")
    print(my_message.hex())
    print("Signature:")
    print(sign(my_message, secret).hex())
    print()

    print("Give me recipe for french fry? (as hex)")
    your_message = bytes.fromhex(input("> "))
    print("Give me your signiature?")
    your_signiature = bytes.fromhex(input("> "))
    print()

    print("Your recipe:")
    print(your_message)
    print("Your signiature:")
    print(your_signiature.hex())
    print()

    if b"french fry" not in your_message:
        print("That is not a recipe for french fry!!")
    elif your_signiature != sign(your_message, secret):
        print("That is not a valid signiature!!")
    else:
        print("Thank you very much. Here is your flag:")
        print(flag)


if __name__ == "__main__":
    main()

올해 Buckeye CTF에서 출제된 문제이다.
주어진 문자열(hashbrown recipe)에 대한 hash값이 주어지고, 사용자는 'french fry'가 들어가는 문자열과 그에 대한 hash값을 입력해야 flag가 출력되는 문제이다.
처음에는 AES ECB 모드에 대한 문제인 줄 알았는데, 그것이 아니라 주어진 hash function 안의 round 함수가 이 문제에서는 AES ECB 모드임을 알 수 있다. 또한, 임의의 16바이트 문자열과 message를 함께 hash function의 digest로 만들어낸다.
이와 비슷한 문제가 드림핵에도 나와 있다. 드림핵 문제를 먼저 풀어보면서 취약한 hash function에서 사용할 수 있는 공격인 Length Extenstion Attack(길이 확장 공격)에 대해 알아보자.
https://dreamhack.io/wargame/challenges/1125

Textbook-HMAC

MD5 해시함수를 이용한 HMAC의 취약점을 찾아 인증에 성공해보세요! Exploit Tech: Length extension attack에서 함께 실습하는 문제입니다.

dreamhack.io

Description

- MD5 해시함수를 이용한 HMAC의 취약점을 찾아 인증에 성공해보세요!

- chall.py

import hashlib
import os

K = os.urandom(500)
flag = open("flag.txt", "r").read()

def HMAC(M):
    return hashlib.md5(K + M).digest()

m1 = b"Dreamhack"
h1 = HMAC(m1)

print(f"HMAC(\"Dreamhack\") = {bytes.hex(h1)}")

m2 = bytes.fromhex(input("Your message: "))
h2 = bytes.fromhex(input("Your hash: "))

if m2 != m1 and h2 == HMAC(m2):
    print(flag)

이 문제도 마찬가지로 임의의 랜덤 문자열과 'Dreamhack'이 합쳐진 문자열에 대한 hash값을 이용하여 사용자가 임의의 문자열과 그것에 해당되는 hash값을 알아내면 flag가 출력되는 문제이다. 여기서는 사용자에게 특정 문자열을 요구하진 않지만, 해결방법은 동일하다.
여기서는 hash function으로 MD5 가 쓰였다. MD5가 어떻게 이루어졌는지 알아보면서 머클-담고르 구조(Merkle–Damgård 구조)에 대해 학습해보면서 어떻게 길이 확장 공격이 가능한지 알아보자.
hash function이 무엇인지, 어떤 종류가 있고 어디에 쓰이는지에 대한 설명은 추후 시간이 되면 정리해보겠다. 지금은 hash function이 무언가 input을 넣으면 어떤 함수(round 함수)에 의해 output이 나오는 함수로 단순히 이해해보자.(이 output을 digest, checksum이라고 부른다)

머클 담고르 구조

머클-담고르(Merkle–Damgård) 구조

머클 담고르 구조는 압축 함수(compression function)을 기본으로 한다. 압축 함수는 어떤 크기의 두 입력을 받아서 한쪽 입력의 크기에 해당하는 하나의 출력을 생성한다. 쉽게 말해, 몇 개의 데이터를 받고 더 적은 데이터를 반환한다.

머클 담고르 구조는 이러한 압축 함수를 반복적으로 호출하여 메시지를 해시하는 알고리즘이다.(Ralph Merkle과 Ivan Damgård가 독립적으로 발명)
위의 [그림 1.]을 보면, 머클 담고르 구조에서 필요한 2가지 특징이 있다.
1. 초기화 벡터(Initialization Vector)이 필요하다. 첫 번째 압축 함수에 두 입력 중 메시지가 아닌 다른 입력이 반드시 필요하기 때문이다.
2. 패딩(Padding)이 필요하다. 이것은 머클 담고르 구조는 메시지를 블록 단위로 만들어 반복적인 압축 함수의 입력으로 들어가기 때문에 발생한다. 메시지의 크기가 블록 단위의 배수가 아닐 경우, 마지막 블록의 남은 부분만큼은 패딩으로 채워진다.

hash function에는 여러 가지 종류가 있는데, 그 중 MD5, SHA-2 등은 이 머클 담고르 구조를 이용한 해시 함수이다.
머클 담고르 구조는 압축 함수 자체가 있는 한 충돌 저항성(collision resistance)을 가진다는 증명이 이미 이루어졌다. 따라서 임의 길이 입력을 가지는 해시 함수의 보안은 고정 크기 압축 함수의 보안과 같고, 설계 및 분석이 더 용이해진다. 여기에 머클 담고르 구조의 독창성이 있다.
그러나, MD5와 SHA-2는 비밀을 해시하기에 적합하지 않다. 그 이유가 바로 머클 담고르 구조의 단점인 길이 확장 공격에 대한 취약성 때문이다. 그렇다면 길이 확장 공격은 무엇일까?

길이 확장 공격

길이 확장 공격(Length Extension Attack)

해시 함수를 이용하여 사용자의 비밀번호를 안전하게 지켜보려고 시도해보자. 해시 함수 알고리즘은 이미 공개되어있는 상태라 가정한다. 즉, 어떤 문자열이 있으면 공격자는 그 문자열의 해시값을 쉽게 구할 수 있다.
만약 비밀번호를 A, 해시값을 H라 하자. 그러면 H=hash(A)라고 표기할 수 있을 것이다. (패딩을 고려하면 H=hash(A || P) ; P는 Padding)
그러나 이렇게만 해두고 안전하다 생각하면 오산이다. 머클 담고르 해시 함수의 출력 크기는 고정되어있기 때문에, 제한된 크기 안에서 공격자가 hash(A)와 일치하는 문자열을 찾아낸다면 그것이 곧 A일 것이기 때문이다.
- 이것이 가능한 이유는 머클 담고르 구조의 충돌 저항성 때문이다. 충돌 저항성이란 아무도 동일한 출력을 만드는 두 개의 다른 입력을 생성할 수 없도록 보장하는 보안 속성을 말한다.
그래서 사용자는 임의의 랜덤값(여기서는 cookie라 하자)과 A를 합친 H'=hash(cookie || A)를 만든다. (여기서 || 는 두 값의 연결을 나타낸다) 이렇게 하면 공격자가 임의의 랜덤값을 모르기 때문에 일치하는 문자열을 찾아도 그것이 A가 아니기 때문에 사용자는 안심을 할 수도 있다.
그러나 공격자는 A를 찾아내는 것보다 더 치명적인 공격을 할 수 있으니, 그것이 바로 길이 확장 공격이다.

예를 들어, 어떤 은행 사이트를 이용하는 사용자가 H'=hash(cookie || A)를 이용하여 사용자를 인증한다고 치자. 이때 패딩을 고려하면, 정확히 H'는 hash(cookie || A || P)인 셈이다.
여기서 공격자가 패딩 부분이 입력의 일부인 것처럼 가장하여 머클 담고르 계산을 계속한다면 어떻게 될까?
예를 들어, 은행에서 100만원을 인출할 수 있는 문자열 B가 있다고 하고, hash(cookie || A || P)값을 패킷을 가로채서 알아냈다고 하자. 그러면 공격자는 cookie || A || P 문자열을 알아낼 수 있다.
이때 공격자가 cookie || A || P || B || P' (P'는 B에 대한 Padding)을 해시 함수의 입력값으로 넣으면, 머클 담고르 구조 해시 함수는 작업이 완료되지 않은 것처럼 주어진 digest ([그림 2.]에서의 Hash(M1 || M2), 이 예시에서는 hash(cookie || A || P)이다)에서 해시를 계속할 수 있고, 이것(hash(cookie || A || P || B || P'))은 프로토콜을 깨뜨리면서, 공격자가 cookie와 비밀번호를 알지 않고도 100만원을 인출해낼 수 있게 된다!

이제 공격에 대한 개념을 명확히 알았다. 드림핵 문제의 솔브 코드는 이미 사이트에 공개되어있기 때문에, 간단한 code로 길이 확장 공격의 과정을 살펴보자

from pwn import *

## 드림핵 문제에서의 K = cookie, m1 = password 로 이해하면 편하다. ##

m1 = b"Dreamhack"
h1 = bytes.fromhex(io.recvline()[:-1].decode()) 

msg1 = bytes(500) + m1  #이 문제에서 K의 내용은 신경쓰지 않아도 되므로 그냥 500바이트를 m1 앞에 넣는다.
padding1 = pad(msg1)[len(msg1):]

msg2 = pad(msg1)  
padding2 = pad(msg2)[len(msg2):]

blocks = split_block(padding2)  #공격자가 추가로 넣는 padding. 여기에 공격자가 Bad code를 넣을 수 있다.

state = digest2state(h1)  #앞서 사용자가 hash한 값. 
for block in blocks:
    state = process(state, block)  #추가로 머클 담고르 해시 프로세스를 진행한다.
    
h2 = state2digest(state)  #공격자가 exploit할 해시 값

io.sendlineafter(b": ", (m1 + padding1).hex().encode())  
io.sendlineafter(b": ", h2.hex().encode())   #이때 사실상 공격자가 돈을 인출해내는 것

다시 hashbrown 문제로 오자. 주어진 문제에서 secret + message를 AES ECB 모드를 이용하여 해시 함수의 digest로 주어진다. 그리고 AES ECB 모드는 머클 담고르 구조를 가지고 있으므로 길이 확장 공격이 가능하다!
문제에서 hash(secret || message || padding)이 주어져있고, 우리는 'french fry'라는 문자열을 필수로 넣어야 하므로, 'french fry' || padding2 을 머클 담고르의 다음 압축 함수의 입력으로 넣어주면 hash(secret || message || padding || 'french fry' || padding2) 을 얻어낼 수 있다! 솔브 코드와 그 주석을 자세히 참고하라.

from pwn import *
from Crypto.Cipher import AES

hashbrown_message = "\n".join(
    [
        "Grate the raw potatoes with a cheese grater, place them into a bowl and cover completely with water. Let sit for 10 minutes.",
        "Drain the grated potatoes well; if this is not done thoroughly the potatoes will steam instead of fry.",
        "Mix in chopped onions by hand.",
        "Mix the egg OR flour into the hash brown mixture evenly. This will allow the hash browns to stay together when frying.",
        "Place a large frying pan on medium-high heat and add enough oil to provide a thin coating over the entire bottom of the pan.",
        "When the oil has come up to temperature apply a large handful of potatoes to the pan and reshape into a patty that is about 1/4-1/2 inch (6-12 mm) thick. The thinner the patty, the crispier the hash browns will be throughout.",
        "Flip when they are crisp and brown on the cooking side. They should also stick together nicely before they are flipped. This should take about 5-8 minutes.",
        "The hash browns are done when the new side is brown and crispy. This should take another 3-5 minutes.",
    ]
).encode()

def aes(block: bytes, key: bytes) -> bytes:
    assert len(block) == len(key) == 16
    return AES.new(key, AES.MODE_ECB).encrypt(block)


def pad(data):
    padding_length = 16 - len(data) % 16
    return data + b"_" * padding_length


def hash(data: bytes):
    data = pad(data)
    state = bytes.fromhex("f7c51cbd3ca7fe29277ff750e762eb19")

    for i in range(0, len(data), 16):
        block = data[i : i + 16]
        state = aes(block, state)

    return state


# nc challs.pwnoh.io 13419
io = process(["python3", "hashbrown.py"])

io.recvline() # Recipe for hashbrowns:
io.recvline() # (recipe)
io.recvline() # Hashbrowns recipe as hex:
io.recvline() # (recipe as hex)
io.recvline() # Signature:

hashbrown_signature = bytes.fromhex(io.recvline().decode('utf-8')) 
#hashbrown_signature : 기존의 이용자의 해시값 (hash(secret || A || padding))

block = pad(b'french fry')
#block : 공격자가 넣을 입력 ('french fry' || padding2)

fry_signature = aes(block, hashbrown_signature).hex()
#fry_signature : 공격자가 넣을 해시값 (hash(secret || A || padding || 'french fry' || padding2)
#여기서 해시 함수는 그냥 AES이므로 기존 hashbrown.py에 있던 aes 함수를 이용한다.

fry_message = (pad(hashbrown_message) + b'french fry').hex()
#fry_message : 공격자가 넣을 입력 (A || padding || 'french fry')

io.recvuntil(b'> ')
io.sendline(bytes(fry_message, 'utf-8'))

io.recvuntil(b'> ')
io.sendline(bytes(fry_signature, 'utf-8'))

io.recvuntil(b'flag:\n')
print(io.recvall().decode('utf-8'))

io.close()
#bctf{e7ym0l0gy_f4c7_7h3_w0rd_hash_c0m35_fr0m_7h3_fr3nch_hacher_wh1ch_m34n5_t0_h4ck_0r_ch0p}

reference

- <Real-World Cryptography>, David Wong

'Crypto' 카테고리의 다른 글

[Crypto][theory]Private Set Intersection (0)	2025.02.20
[Crypto][lecture]1. 동형 암호(homomorphic encryption)란? (0)	2025.01.08
[Crypto][CTF]Bigger is Better - RSA Wiener Attack (1)	2024.10.29
[Crypto][CTF]YYYES - AES CBC Attack (0)	2024.10.13

ABOUT ME

jhk_security jhk_security

Description

Description

머클-담고르(Merkle–Damgård) 구조

길이 확장 공격(Length Extension Attack)

reference

'Crypto' 카테고리의 다른 글

티스토리툴바

ABOUT ME

Description

Description

머클-담고르(Merkle–Damgård) 구조

길이 확장 공격(Length Extension Attack)

reference

'Crypto' 카테고리의 다른 글

관련글 관련글 더보기

티스토리툴바