
Analyzing gRPC PCAP Captures via HTTP/2 Stream Identification and Protobuf Parameter Extraction

binary_exploitation Difficulty 1–5 30 min certifiable

Theory

Why This Matters

gRPC is a widely used inter-service communication protocol for cloud-native microservices, adopted by Google, Netflix, and many organisations running Kubernetes-based workloads. Security teams investigating API abuse, data exfiltration through internal services, or lateral movement in containerised environments increasingly encounter gRPC traffic in captures. Unlike REST/JSON, gRPC is binary and multiplexed: raw packet bytes are opaque unless you understand both the HTTP/2 framing and the protobuf encoding, and field names cannot be recovered without the schema. Analysts who can dissect gRPC captures, identify method names, and decode protobuf payloads gain visibility into microservice communication that would otherwise be invisible to forensic analysis.

Core Concept

gRPC is a remote procedure call framework that runs over HTTP/2 and serialises messages with Protocol Buffers (protobuf). Every gRPC request is an HTTP/2 stream carrying:

  • A HEADERS frame with :method: POST, :path: /package.ServiceName/MethodName (the package prefix is absent when the service declares no package), and content-type: application/grpc (or application/grpc+proto).
  • One or more DATA frames carrying the gRPC message envelope: a 5-byte header (1 compression flag byte + 4-byte big-endian message length) followed by the protobuf-encoded message body.
  • A HEADERS frame (trailers) with grpc-status on completion.
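
As a rough illustration of the envelope described above, the sketch below walks a buffer of DATA payload bytes and yields each length-prefixed message. The function name iter_grpc_messages and the sample bytes are made up for this example, and it assumes the payloads for one direction of one stream have already been concatenated in capture order.

```python
import struct
from typing import Iterator, Tuple


def iter_grpc_messages(buf: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (compressed_flag, body) for each complete length-prefixed message."""
    offset = 0
    while offset + 5 <= len(buf):
        # 1-byte compression flag followed by a 4-byte big-endian length
        flag, length = struct.unpack_from(">BI", buf, offset)
        end = offset + 5 + length
        if end > len(buf):
            break  # the message continues in a later DATA frame; stop here
        yield flag, buf[offset + 5:end]
        offset = end


# Hypothetical sample: two uncompressed messages back to back
sample = bytes.fromhex("0000000003" "089601" "0000000002" "0801")
messages = list(iter_grpc_messages(sample))
print(messages)
```

Iterating rather than slicing once matters for streaming RPCs, where several messages may share one DATA frame or one message may be split across frames.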

Unary RPC: one message in each direction (one request, one response), which for small messages usually means one DATA frame per direction. Client streaming: multiple client messages, one server response. Server streaming: one client message, multiple server messages. Bidirectional streaming: multiple messages in both directions. Every RPC, streaming or not, occupies a single HTTP/2 stream; concurrent RPCs on the same connection are distinguished by their stream IDs (odd = client-initiated).
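
The classification above can be sketched as a small helper that takes per-direction message counts for one stream. The function name classify_rpc is hypothetical, and counts alone are only a heuristic: a streaming method that happens to send a single message is indistinguishable from a unary call without the .proto definition.

```python
def classify_rpc(client_messages: int, server_messages: int) -> str:
    """Guess the RPC pattern from message counts observed on one HTTP/2 stream."""
    client_streams = client_messages > 1
    server_streams = server_messages > 1
    if client_streams and server_streams:
        return "bidirectional streaming"
    if client_streams:
        return "client streaming"
    if server_streams:
        return "server streaming"
    # One (or zero) message each way: consistent with unary, but not proof of it
    return "unary"


for counts in [(1, 1), (4, 1), (1, 9), (3, 3)]:
    print(counts, "->", classify_rpc(*counts))
```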

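An RST_STREAM frame on a stream indicates early termination, such as an error or cancellation. GOAWAY signals that the connection is shutting down: no new streams are accepted, although streams already in progress may be allowed to finish.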
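
To make the frame types concrete, here is a minimal sketch that parses the fixed 9-byte HTTP/2 frame header (24-bit length, 8-bit type, 8-bit flags, 1 reserved bit plus a 31-bit stream ID). The function name parse_frame_header and the sample bytes are illustrative; the input is assumed to be cleartext or already-decrypted HTTP/2 bytes, not raw TLS records.

```python
FRAME_TYPES = {
    0x0: "DATA", 0x1: "HEADERS", 0x2: "PRIORITY", 0x3: "RST_STREAM",
    0x4: "SETTINGS", 0x5: "PUSH_PROMISE", 0x6: "PING", 0x7: "GOAWAY",
    0x8: "WINDOW_UPDATE", 0x9: "CONTINUATION",
}


def parse_frame_header(header: bytes) -> dict:
    """Parse the 9-byte HTTP/2 frame header into its fields."""
    if len(header) < 9:
        raise ValueError("an HTTP/2 frame header is 9 bytes long")
    frame_type = header[3]
    # Mask off the reserved high bit to get the 31-bit stream identifier
    stream_id = int.from_bytes(header[5:9], "big") & 0x7FFFFFFF
    return {
        "length": int.from_bytes(header[0:3], "big"),
        "type": FRAME_TYPES.get(frame_type, f"UNKNOWN(0x{frame_type:02x})"),
        "flags": header[4],
        "stream_id": stream_id,
        "client_initiated": stream_id % 2 == 1,
    }


# Hypothetical sample: RST_STREAM (type 0x3) with a 4-byte payload on stream 3
print(parse_frame_header(bytes.fromhex("000004030000000003")))
```

Stream 0 is the connection itself (SETTINGS, PING, GOAWAY), so the client_initiated flag is only meaningful for non-zero stream IDs.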

Protobuf decoding without a .proto schema is possible using protoc --decode_raw or the blackboxprotobuf Python library, which infers field numbers and wire types from the binary encoding.
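
Schema-free decoding works because every field starts with a tag that packs the field number and wire type into one varint. The short sketch below shows that split on two example tag bytes; split_tag is a name invented for this illustration.

```python
# Wire types 3 and 4 are the deprecated group markers and are left out here
WIRE_TYPES = {0: "varint", 1: "64-bit", 2: "length-delimited", 5: "32-bit"}


def split_tag(tag: int) -> tuple:
    """Split a decoded tag value into (field_number, wire_type_name)."""
    return tag >> 3, WIRE_TYPES.get(tag & 0x07, "unknown/deprecated")


print(split_tag(0x08))  # (1, 'varint')
print(split_tag(0x12))  # (2, 'length-delimited')

# The two-byte varint 96 01: low 7 bits of each byte, least significant group first
print((0x96 & 0x7F) | (0x01 << 7))  # 150
```

The field number is all a raw decoder can report; mapping field 1 to a name such as a username requires the .proto file or inference from the values.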

Technical Deep-Dive

# Display all HTTP/2 traffic (gRPC runs over HTTP/2)
tshark -r capture.pcap -Y "http2" \
  -T fields -e frame.number -e frame.time_relative \
  -e ip.src -e ip.dst -e http2.streamid \
  -e http2.type -e http2.headers.path \
  -e http2.headers.content_type \
  -E header=y -E separator=","

# Filter specifically for gRPC streams (content-type: application/grpc)
# Single quotes outside so the inner double quotes reach the display filter intact
tshark -r capture.pcap \
  -Y 'http2.headers.content_type contains "application/grpc"' \
  -T fields -e frame.number -e http2.streamid \
  -e http2.headers.path -e http2.headers.authority

# Extract gRPC method names from :path header
tshark -r capture.pcap \
  -Y "http2.headers.path" \
  -T fields -e http2.headers.path \
  | grep -v "^$" | sort -u

# Extract raw DATA frame payload bytes for a specific stream
tshark -r capture.pcap \
  -Y "http2.type == 0 and http2.streamid == 1" \
  -T fields -e http2.data.data \
  | head -5
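
#!/usr/bin/env python3
"""
Decode gRPC DATA frame payloads (protobuf) from tshark hex output.
Strip the 5-byte gRPC framing header before protobuf decode.
"""
import struct
import sys
import zlib


def strip_grpc_header(hex_data: str) -> bytes:
    """Remove 5-byte gRPC message envelope: [compressed_flag(1)] [length(4)]."""
    raw = bytes.fromhex(hex_data.replace(":", ""))
    if len(raw) < 5:
        return b""
    compressed, length = struct.unpack(">BI", raw[:5])
    body = raw[5:5 + length]
    if compressed:
        # The codec is named in the grpc-encoding header. MAX_WBITS | 32 lets
        # zlib auto-detect gzip and zlib wrappers; other codecs are not handled.
        try:
            return zlib.decompress(body, wbits=zlib.MAX_WBITS | 32)
        except zlib.error:
            return b""
    return body


def read_varint(data: bytes, i: int) -> tuple:
    """Read one base-128 varint at offset i; return (value, next_offset)."""
    value, shift = 0, 0
    while True:
        b = data[i]  # IndexError on truncated input is caught by the caller
        i += 1
        value |= (b & 0x7F) << shift
        shift += 7
        if not (b & 0x80):
            return value, i


def decode_protobuf_raw(data: bytes) -> dict:
    """
    Minimal raw protobuf decoder (wire types only, no schema needed).
    Returns {field_number: value} for varint, fixed-width, and
    length-delimited fields. A repeated field keeps only its last value.
    """
    result = {}
    i = 0
    try:
        while i < len(data):
            tag, i = read_varint(data, i)
            field_num, wire_type = tag >> 3, tag & 0x07
            if wire_type == 0:    # varint
                result[field_num], i = read_varint(data, i)
            elif wire_type == 1:  # fixed 64-bit, little-endian
                result[field_num] = struct.unpack_from("<Q", data, i)[0]
                i += 8
            elif wire_type == 5:  # fixed 32-bit, little-endian
                result[field_num] = struct.unpack_from("<I", data, i)[0]
                i += 4
            elif wire_type == 2:  # length-delimited
                length, i = read_varint(data, i)
                payload = data[i:i + length]
                i += length
                try:
                    result[field_num] = payload.decode("utf-8")
                except UnicodeDecodeError:
                    result[field_num] = payload.hex()
            else:
                break  # deprecated group wire types (3, 4): stop
    except (IndexError, struct.error):
        pass  # truncated input: return whatever was decoded so far
    return result


# Example: process hex DATA from tshark output, one line per packet.
# A message split across several DATA frames must be reassembled first.
if __name__ == "__main__":
    for line in sys.stdin:
        # tshark separates multiple field occurrences in one packet with commas
        for chunk in line.strip().split(","):
            if not chunk:
                continue
            try:
                body = strip_grpc_header(chunk)
            except ValueError:
                continue  # not valid hex
            if body:
                print(decode_protobuf_raw(body))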

# Use protoc to decode raw protobuf without schema
# First extract the body bytes (strip 5-byte gRPC header):
python3 -c "
import struct, sys
raw = bytes.fromhex(sys.stdin.read().strip().replace(':', ''))
compressed, length = struct.unpack('>BI', raw[:5])
sys.stdout.buffer.write(raw[5:5+length])
" <<< "HEX_FROM_TSHARK" > message.bin

protoc --decode_raw < message.bin

# Install blackboxprotobuf for schema inference
# pip install blackboxprotobuf
python3 -c "
import blackboxprotobuf
data = open('message.bin', 'rb').read()
msg, typedef = blackboxprotobuf.decode_message(data)
print(msg)
"

Analytical Methodology

  1. Open the PCAP and apply Wireshark filter http2 to confirm HTTP/2 traffic is present. Check the :authority header to identify the target service hostname or IP.
  2. Filter on http2.headers.content_type contains "grpc" to isolate gRPC streams from other HTTP/2 traffic (e.g., HTTP/2-based REST APIs). Note the stream IDs involved.
  3. For each gRPC stream, read the :path header to identify the RPC method being called. Format is /package.ServiceName/MethodName. The method name reveals what operation the client is requesting.
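  4. Examine the sequence of DATA frames on each stream to classify the RPC pattern: a single gRPC message in each direction indicates unary RPC; multiple client messages indicate client streaming; multiple server messages indicate server streaming; multiple messages in both directions indicate bidirectional streaming. Count length-prefixed messages rather than DATA frames, since a large message can span several frames.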
  5. Identify any RST_STREAM frames on gRPC streams. These indicate premature termination — possible error, rate-limit rejection, or attacker probing failed RPCs.
  6. Extract DATA frame payloads. Strip the 5-byte gRPC framing header. Pass the remaining bytes to protoc --decode_raw or blackboxprotobuf for schema-agnostic field extraction.
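  7. If a .proto file is available (from the challenge resources or recovered from the binary), pass it to protoc together with the message type for fully labelled output: protoc --decode=package.MessageType service.proto < message.bin, where service.proto stands for the recovered file. protoc reads the encoded message from standard input, so the binary must be redirected in rather than given as an argument.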
  8. Correlate decoded protobuf field values (usernames, resource IDs, command strings) with other evidence in the capture. Document: stream ID, method name, request field values, response field values, and grpc-status code.
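
The documentation step can be sketched as a small record type that splits the :path value and names the grpc-status code. StreamFinding, split_path, and the example path /auth.v1.UserService/GetUser are hypothetical names chosen for illustration; the status names follow the standard gRPC status code list.

```python
from dataclasses import dataclass
from typing import Optional

GRPC_STATUS = {
    0: "OK", 1: "CANCELLED", 2: "UNKNOWN", 3: "INVALID_ARGUMENT",
    4: "DEADLINE_EXCEEDED", 5: "NOT_FOUND", 6: "ALREADY_EXISTS",
    7: "PERMISSION_DENIED", 8: "RESOURCE_EXHAUSTED", 9: "FAILED_PRECONDITION",
    10: "ABORTED", 11: "OUT_OF_RANGE", 12: "UNIMPLEMENTED", 13: "INTERNAL",
    14: "UNAVAILABLE", 15: "DATA_LOSS", 16: "UNAUTHENTICATED",
}


def split_path(path: str) -> tuple:
    """Split '/package.Service/Method' into (package, service, method)."""
    full_service, _, method = path.lstrip("/").partition("/")
    # The package is everything before the last dot; it may be empty
    package, _, service = full_service.rpartition(".")
    return package, service, method


@dataclass
class StreamFinding:
    stream_id: int
    path: str
    grpc_status: Optional[int] = None  # None when no trailers were captured

    def summary(self) -> str:
        package, service, method = split_path(self.path)
        if self.grpc_status is None:
            status = "no trailers seen"
        else:
            status = GRPC_STATUS.get(self.grpc_status, f"unknown({self.grpc_status})")
        return f"stream {self.stream_id}: {service}.{method} (package {package or '-'}) -> {status}"


print(StreamFinding(1, "/auth.v1.UserService/GetUser", 7).summary())
```

Decoded request and response fields would be attached to the same record in practice; they are omitted here to keep the sketch short.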

Common Analytical Errors

  • Applying HTTP/1.1 analysis methods to gRPC: gRPC does not use HTTP/1.1 chunked encoding or request-response pairing by TCP connection. Each HTTP/2 stream ID is an independent RPC; multiple RPCs share a single TCP connection. Filter by stream ID, not by connection.
  • Forgetting the 5-byte gRPC envelope: Passing the raw DATA payload directly to a protobuf decoder without stripping the 5-byte gRPC message header produces parse errors or garbage output. Always strip bytes 0–4 before protobuf decode.
  • Missing TLS decryption requirement: Production gRPC is almost always TLS-encrypted (port 443 with h2 ALPN). If the capture contains only encrypted traffic, you must supply an SSLKEYLOGFILE (see card .pcap-tls-session-keys.v1) before any HTTP/2 or gRPC dissection is possible.
  • Confusing client and server stream IDs: HTTP/2 uses odd stream IDs for client-initiated streams and even for server-initiated (rare in gRPC). When mapping requests to responses, always track by stream ID, not by IP direction alone.
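
To avoid the connection-level pairing mistake described in the first item above, frames can be keyed by the pair (TCP stream, HTTP/2 stream ID), since stream IDs restart on every connection. The sketch below assumes CSV text exported from tshark with -E header=y and the fields frame.number, tcp.stream, and http2.streamid, and further assumes one HTTP/2 frame per packet; packets carrying several frames produce multi-valued fields that would need a different separator. The function name group_by_stream and the sample rows are illustrative.

```python
import csv
import io
from collections import defaultdict


def group_by_stream(csv_text: str) -> dict:
    """Map (tcp.stream, http2.streamid) -> list of frame numbers."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if not row["http2.streamid"]:
            continue  # packet without an HTTP/2 frame
        key = (int(row["tcp.stream"]), int(row["http2.streamid"]))
        groups[key].append(int(row["frame.number"]))
    return dict(groups)


# Hypothetical export: stream ID 1 appears on two different TCP connections
sample = """frame.number,tcp.stream,http2.streamid
10,0,1
11,0,3
12,1,1
13,0,1
"""
rpcs = group_by_stream(sample)
print(rpcs)
```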

NICE Framework Alignment

Code | Knowledge/Skill/Task Statement | How This Card Develops It
K0046 | Knowledge of intrusion detection systems and methodologies | gRPC traffic anomaly detection requires protocol-aware dissection beyond port-based IDS rules
K0093 | Knowledge of network protocols | HTTP/2 framing, stream multiplexing, and protobuf encoding are modern protocol knowledge requirements
K0221 | Knowledge of OSI model and network layers | gRPC spans layers 4 (TCP), 6 (TLS), and 7 (HTTP/2, protobuf); multi-layer analysis is essential
S0046 | Skill in performing packet-level analysis | Dissecting binary HTTP/2 frames and decoding protobuf messages from raw PCAP data
T0023 | Collect intrusion artifacts for use in forensic analysis | Decoded gRPC request/response fields provide structured forensic artifacts from microservice communications

Further Reading

  • gRPC over HTTP/2 specification: github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md
  • Wireshark Wiki: HTTP/2 dissector and protobuf decode configuration
  • blackboxprotobuf: schema-free protobuf analysis library (github.com/nccgroup/blackboxprotobuf)
  • Google Protocol Buffers encoding guide: protobuf.dev/programming-guides/encoding/
  • SANS: "Analysing gRPC Traffic in Wireshark" (blog post)
