Analyzing gRPC PCAP Captures via HTTP/2 Stream Identification and Protobuf Parameter Extraction
Theory
Why This Matters
gRPC is a widely used inter-service communication protocol for cloud-native microservices, adopted by Google, Netflix, and thousands of organisations running Kubernetes-based workloads. Security teams investigating API abuse, data exfiltration through internal services, or lateral movement in containerised environments increasingly encounter gRPC traffic in captures. Unlike REST/JSON, gRPC is binary and multiplexed: raw packet bytes are opaque unless you understand both the HTTP/2 framing and the protobuf encoding. Analysts who can dissect gRPC captures, identify method names, and decode protobuf payloads gain visibility into a class of microservice communication that would otherwise be opaque to forensic analysis.
Core Concept
gRPC is a remote procedure call framework that runs over HTTP/2 and serialises messages with Protocol Buffers (protobuf). Every gRPC request is an HTTP/2 stream carrying:
- A `HEADERS` frame with `:method: POST`, `:path: /ServiceName/MethodName`, and `content-type: application/grpc` (or `application/grpc+proto`).
- One or more `DATA` frames carrying the gRPC message envelope: a 5-byte header (1 compression flag byte + 4-byte big-endian message length) followed by the protobuf-encoded message body.
- A `HEADERS` frame (trailers) with `grpc-status` on completion.
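To make the 5-byte envelope concrete, here is a minimal sketch that builds and parses one length-prefixed gRPC message. The function names `wrap_grpc_message` and `unwrap_grpc_message` and the sample body are illustrative assumptions, not part of any gRPC library.

```python
import struct
from typing import Tuple


def wrap_grpc_message(body: bytes, compressed: bool = False) -> bytes:
    """Prefix a serialized message with the 5-byte gRPC envelope."""
    # ">BI" = 1 unsigned byte (compression flag) + 4-byte big-endian length
    return struct.pack(">BI", 1 if compressed else 0, len(body)) + body


def unwrap_grpc_message(frame: bytes) -> Tuple[bool, bytes]:
    """Split one length-prefixed gRPC message into (compressed_flag, body)."""
    if len(frame) < 5:
        raise ValueError("need at least 5 bytes for the gRPC envelope")
    flag, length = struct.unpack(">BI", frame[:5])
    body = frame[5:5 + length]
    if len(body) != length:
        raise ValueError("message body is shorter than the declared length")
    return bool(flag), body


# Hypothetical 7-byte protobuf body: field 1, length-delimited, "alice"
framed = wrap_grpc_message(b"\x0a\x05alice")
print(framed.hex())  # 00000000070a05616c696365
print(unwrap_grpc_message(framed))
```

The first byte (`00`) is the compression flag and the next four (`00000007`) are the body length, so the protobuf data starts at offset 5.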
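RPC patterns are distinguished by how many gRPC messages travel in each direction. Unary RPC: one request message, one response message. Client streaming: multiple client messages, one server response. Server streaming: one client message, multiple server messages. Bidirectional streaming: multiple messages in both directions. A small message usually occupies a single DATA frame, but a large message can span several, so count length-prefixed messages rather than frames. Each RPC runs on its own HTTP/2 stream, identified by its stream ID (client-initiated streams use odd IDs).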
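The classification above can be sketched as a small heuristic. The helper names are hypothetical, and the result is only a best guess from observed counts: a streaming RPC that happens to carry a single message in each direction is indistinguishable from a unary call without the service definition.

```python
def is_client_initiated(stream_id: int) -> bool:
    """HTTP/2 assigns odd stream IDs to streams opened by the client."""
    return stream_id % 2 == 1


def classify_rpc(client_messages: int, server_messages: int) -> str:
    """Guess the RPC pattern from per-direction message counts on one stream."""
    if client_messages <= 1 and server_messages <= 1:
        return "unary"
    if client_messages > 1 and server_messages <= 1:
        return "client streaming"
    if client_messages <= 1 and server_messages > 1:
        return "server streaming"
    return "bidirectional streaming"


print(classify_rpc(1, 1))   # unary
print(classify_rpc(1, 12))  # server streaming
```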
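An RST_STREAM frame on a stream indicates early termination, such as an error or cancellation. A GOAWAY frame signals that the connection is shutting down: the sender will accept no new streams, although streams already in progress may still complete.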
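Both RST_STREAM and GOAWAY carry a 32-bit error code defined by the HTTP/2 specification. A lookup table such as the following sketch (the dictionary and function names are illustrative) turns the numeric code from a capture into its registered name, which helps separate a client cancellation from a server-side refusal.

```python
# HTTP/2 error codes as registered in RFC 9113, section 7
HTTP2_ERROR_CODES = {
    0x0: "NO_ERROR",
    0x1: "PROTOCOL_ERROR",
    0x2: "INTERNAL_ERROR",
    0x3: "FLOW_CONTROL_ERROR",
    0x4: "SETTINGS_TIMEOUT",
    0x5: "STREAM_CLOSED",
    0x6: "FRAME_SIZE_ERROR",
    0x7: "REFUSED_STREAM",
    0x8: "CANCEL",
    0x9: "COMPRESSION_ERROR",
    0xA: "CONNECT_ERROR",
    0xB: "ENHANCE_YOUR_CALM",
    0xC: "INADEQUATE_SECURITY",
    0xD: "HTTP_1_1_REQUIRED",
}


def describe_error_code(code: int) -> str:
    """Return the registered name for an HTTP/2 error code."""
    return HTTP2_ERROR_CODES.get(code, f"UNKNOWN(0x{code:x})")


print(describe_error_code(8))   # CANCEL
print(describe_error_code(11))  # ENHANCE_YOUR_CALM
```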
Protobuf decoding without a .proto schema is possible using protoc --decode_raw or the blackboxprotobuf Python library, which infers field numbers and wire types from the binary encoding.
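Schema-free decoding works because every protobuf field begins with a varint tag that packs the field number and wire type together as `(field_number << 3) | wire_type`. The sketch below, with illustrative helper names, decodes the three-byte message `08 96 01` used as the introductory example in the protobuf encoding guide.

```python
from typing import Tuple

WIRE_TYPES = {
    0: "varint",
    1: "64-bit",
    2: "length-delimited",
    3: "start group",
    4: "end group",
    5: "32-bit",
}


def read_varint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode a base-128 varint at pos; return (value, next_position)."""
    value, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift  # low 7 bits carry data
        shift += 7
        if not byte & 0x80:              # high bit clear: last byte
            return value, pos


def split_tag(tag: int) -> Tuple[int, int]:
    """Split a decoded tag into (field_number, wire_type)."""
    return tag >> 3, tag & 0x07


message = bytes.fromhex("089601")
tag, pos = read_varint(message)
field_number, wire_type = split_tag(tag)
value, pos = read_varint(message, pos)
print(field_number, WIRE_TYPES[wire_type], value)  # 1 varint 150
```

Without a schema the decoder knows only that field 1 is a varint with value 150; whether it represents an integer, enum, or boolean has to be inferred from context.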
Technical Deep-Dive
# Display all HTTP/2 traffic (gRPC runs over HTTP/2)
tshark -r capture.pcap -Y "http2" \
  -T fields -e frame.number -e frame.time_relative \
  -e ip.src -e ip.dst -e http2.streamid \
  -e http2.type -e http2.headers.path \
  -e http2.headers.content_type \
  -E header=y -E separator=","
# Filter specifically for gRPC streams (content-type: application/grpc)
tshark -r capture.pcap \
  -Y 'http2.headers.content_type contains "application/grpc"' \
  -T fields -e frame.number -e http2.streamid \
  -e http2.headers.path -e http2.headers.authority
# Extract gRPC method names from :path header
tshark -r capture.pcap \
  -Y "http2.headers.path" \
  -T fields -e http2.headers.path \
  | grep -v '^$' | sort -u
# Extract raw DATA frame payload bytes for a specific stream
tshark -r capture.pcap \
  -Y "http2.type == 0 and http2.streamid == 1" \
  -T fields -e http2.data.data \
  | head -5
#!/usr/bin/env python3
"""
Decode gRPC DATA frame payloads (protobuf) from tshark hex output.
Strip the 5-byte gRPC framing header before protobuf decode.
Assumes each input line holds exactly one complete gRPC message.
"""
import struct
import sys
import zlib


def strip_grpc_header(hex_data: str) -> bytes:
    """Remove 5-byte gRPC message envelope: [compressed_flag(1)] [length(4)]."""
    raw = bytes.fromhex(hex_data.replace(":", ""))
    if len(raw) < 5:
        return b""
    compressed, length = struct.unpack(">BI", raw[:5])
    body = raw[5:5 + length]
    if compressed:
        # The grpc-encoding header names the codec. Only gzip and
        # zlib-wrapped deflate are handled here (wbits auto-detects both).
        return zlib.decompress(body, zlib.MAX_WBITS | 32)
    return body


def read_varint(data: bytes, i: int) -> tuple:
    """Read a base-128 varint at offset i; return (value, next_offset)."""
    value, shift = 0, 0
    while True:
        b = data[i]
        i += 1
        value |= (b & 0x7F) << shift
        shift += 7
        if not (b & 0x80):
            return value, i


def decode_protobuf_raw(data: bytes) -> dict:
    """
    Minimal raw protobuf decoder (wire types only, no schema needed).
    Returns {field_number: value}; a repeated field keeps only its last value.
    """
    result = {}
    i = 0
    try:
        while i < len(data):
            tag, i = read_varint(data, i)
            field_num, wire_type = tag >> 3, tag & 0x07
            if wire_type == 0:    # varint
                val, i = read_varint(data, i)
                result[field_num] = val
            elif wire_type == 1:  # fixed 64-bit, shown as unsigned little-endian
                result[field_num] = int.from_bytes(data[i:i + 8], "little")
                i += 8
            elif wire_type == 5:  # fixed 32-bit, shown as unsigned little-endian
                result[field_num] = int.from_bytes(data[i:i + 4], "little")
                i += 4
            elif wire_type == 2:  # length-delimited
                length, i = read_varint(data, i)
                payload = data[i:i + length]
                i += length
                try:
                    result[field_num] = payload.decode("utf-8")
                except UnicodeDecodeError:
                    result[field_num] = payload.hex()
            else:
                break  # groups (3, 4) or invalid wire type: stop
    except IndexError:
        pass  # truncated input: return what was decoded so far
    return result


if __name__ == "__main__":
    # Example: process hex DATA from tshark output, one payload per line
    for line in sys.stdin:
        line = line.strip()
        if not line:
            continue
        body = strip_grpc_header(line)
        if body:
            fields = decode_protobuf_raw(body)
            print(fields)
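Stripping the header from each DATA payload independently works only when every frame holds exactly one whole message. A large message can be split across several DATA frames, and several small messages can share one frame. The following sketch, using the hypothetical helper `iter_grpc_messages`, buffers the payloads from one direction of one stream in capture order and yields each complete message.

```python
import struct
from typing import Iterable, Iterator, Tuple


def iter_grpc_messages(data_payloads: Iterable[bytes]) -> Iterator[Tuple[int, bytes]]:
    """Yield (compressed_flag, body) for each complete gRPC message.

    data_payloads must be the DATA frame payloads of a single stream,
    in a single direction, in capture order.
    """
    buffer = bytearray()
    for payload in data_payloads:
        buffer += payload
        while len(buffer) >= 5:
            flag, length = struct.unpack(">BI", buffer[:5])
            if len(buffer) < 5 + length:
                break  # message continues in the next DATA frame
            yield flag, bytes(buffer[5:5 + length])
            del buffer[:5 + length]


# Two messages: the first is split mid-header, the second shares a frame
msg1 = struct.pack(">BI", 0, 3) + b"abc"
msg2 = struct.pack(">BI", 0, 2) + b"de"
frames = [msg1[:4], msg1[4:] + msg2]
print(list(iter_grpc_messages(frames)))  # [(0, b'abc'), (0, b'de')]
```

Any bytes left in the buffer at the end belong to a message that was cut off, which is itself worth noting when a stream ends with RST_STREAM.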
# Use protoc to decode raw protobuf without schema
# First extract the body bytes (strip 5-byte gRPC header):
python3 -c "
import struct, sys
raw = bytes.fromhex(sys.stdin.read().strip().replace(':', ''))
compressed, length = struct.unpack('>BI', raw[:5])
sys.stdout.buffer.write(raw[5:5+length])
" <<< "HEX_FROM_TSHARK" > message.bin
protoc --decode_raw < message.bin
# Install blackboxprotobuf for schema inference
# pip install blackboxprotobuf
python3 -c "
import blackboxprotobuf
data = open('message.bin','rb').read()
msg, typedef = blackboxprotobuf.decode_message(data)
print(msg)
"
Analytical Methodology
- Open the PCAP and apply the Wireshark display filter `http2` to confirm HTTP/2 traffic is present. Check the `:authority` header to identify the target service hostname or IP.
- Filter on `http2.headers.content_type contains "grpc"` to isolate gRPC streams from other HTTP/2 traffic (e.g., HTTP/2-based REST APIs). Note the stream IDs involved.
- For each gRPC stream, read the `:path` header to identify the RPC method being called. The format is `/package.ServiceName/MethodName`. The method name reveals what operation the client is requesting.
- Examine the DATA frames on each stream to classify the RPC pattern: a single message in each direction indicates unary RPC; multiple client messages indicate client streaming; multiple server messages indicate server streaming; multiple messages in both directions indicate bidirectional streaming.
- Identify any RST_STREAM frames on gRPC streams. These indicate premature termination: a possible error, a rate-limit rejection, or an attacker probing RPCs that fail.
- Extract DATA frame payloads. Strip the 5-byte gRPC framing header. Pass the remaining bytes to `protoc --decode_raw` or `blackboxprotobuf` for schema-agnostic field extraction.
- If a `.proto` file is available (from the challenge resources or recovered from the binary), pass it to `protoc` together with the message type, for example `protoc --decode=package.MessageType service.proto < message.bin` (where `service.proto` stands for the recovered file), for fully labelled output.
- Correlate decoded protobuf field values (usernames, resource IDs, command strings) with other evidence in the capture. Document: stream ID, method name, request field values, response field values, and grpc-status code.
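One way to keep the documentation step consistent is a small record per RPC. The sketch below is illustrative: the `RpcFinding` class, the `make_finding` helper, and the sample path `/inventory.Stock/GetItem` are assumptions made for the example, while the status names are the standard gRPC status codes.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Standard gRPC status codes carried in the grpc-status trailer
GRPC_STATUS_NAMES = {
    0: "OK", 1: "CANCELLED", 2: "UNKNOWN", 3: "INVALID_ARGUMENT",
    4: "DEADLINE_EXCEEDED", 5: "NOT_FOUND", 6: "ALREADY_EXISTS",
    7: "PERMISSION_DENIED", 8: "RESOURCE_EXHAUSTED",
    9: "FAILED_PRECONDITION", 10: "ABORTED", 11: "OUT_OF_RANGE",
    12: "UNIMPLEMENTED", 13: "INTERNAL", 14: "UNAVAILABLE",
    15: "DATA_LOSS", 16: "UNAUTHENTICATED",
}


def split_grpc_path(path: str) -> Tuple[str, str]:
    """Split '/package.Service/Method' into ('package.Service', 'Method')."""
    service, _, method = path.lstrip("/").partition("/")
    return service, method


@dataclass
class RpcFinding:
    stream_id: int
    service: str
    method: str
    grpc_status: Optional[int] = None  # None when no trailers were captured

    @property
    def status_name(self) -> str:
        if self.grpc_status is None:
            return "NO_TRAILERS"
        return GRPC_STATUS_NAMES.get(self.grpc_status, "UNRECOGNISED")


def make_finding(stream_id: int, path: str, grpc_status: Optional[str]) -> RpcFinding:
    """Build a record from the :path value and the raw grpc-status string."""
    service, method = split_grpc_path(path)
    status = int(grpc_status) if grpc_status not in (None, "") else None
    return RpcFinding(stream_id, service, method, status)


finding = make_finding(5, "/inventory.Stock/GetItem", "7")
print(finding.service, finding.method, finding.status_name)
```

Request and response field values from the protobuf decode can be attached to the same record as additional fields.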
Common Analytical Errors
- Applying HTTP/1.1 analysis methods to gRPC: gRPC does not use HTTP/1.1 chunked encoding or request-response pairing by TCP connection. Each HTTP/2 stream ID is an independent RPC; multiple RPCs share a single TCP connection. Filter by stream ID, not by connection.
- Forgetting the 5-byte gRPC envelope: Passing the raw DATA payload directly to a protobuf decoder without stripping the 5-byte gRPC message header produces parse errors or garbage output. Always strip bytes 0–4 before protobuf decode.
- Missing TLS decryption requirement: Production gRPC is almost always TLS-encrypted (port 443 with h2 ALPN). If the capture contains only encrypted traffic, you must supply an SSLKEYLOGFILE (see card `.pcap-tls-session-keys.v1`) before any HTTP/2 or gRPC dissection is possible.
- Confusing client and server stream IDs: HTTP/2 uses odd stream IDs for client-initiated streams and even for server-initiated (rare in gRPC). When mapping requests to responses, always track by stream ID, not by IP direction alone.
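Stream IDs are unique only within one connection, so a capture with several TCP connections will contain several unrelated "stream 1" RPCs. The sketch below groups frames by the pair (TCP stream index, HTTP/2 stream ID). It assumes tab-separated input produced with `-T fields -e tcp.stream -e http2.streamid -e http2.type`, and that when one packet carries several HTTP/2 frames the values arrive comma-joined in matching order; verify both assumptions against your own tshark output. The function name `group_frames` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_frames(lines: Iterable[str]) -> Dict[Tuple[int, int], List[int]]:
    """Map (tcp_stream, http2_stream_id) to the list of frame types seen."""
    streams = defaultdict(list)
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3 or not all(parts):
            continue  # skip rows with missing fields
        tcp_stream, stream_ids, frame_types = parts
        # One packet may carry several HTTP/2 frames (comma-joined values)
        for sid, ftype in zip(stream_ids.split(","), frame_types.split(",")):
            streams[(int(tcp_stream), int(sid))].append(int(ftype))
    return dict(streams)


# Hypothetical rows: frame type 1 = HEADERS, 0 = DATA
sample = ["0\t1\t1", "0\t1\t0", "1\t1\t1", "0\t1,3\t0,1"]
for key, types in sorted(group_frames(sample).items()):
    print(key, types)
```

Here stream 1 on connection 0 and stream 1 on connection 1 stay separate, which is what filtering on the stream ID alone would miss.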
NICE Framework Alignment
| Code | Knowledge/Skill/Task Statement | How This Card Develops It |
|---|---|---|
| K0046 | Knowledge of intrusion detection systems and methodologies | gRPC traffic anomaly detection requires protocol-aware dissection beyond port-based IDS rules |
| K0093 | Knowledge of network protocols | HTTP/2 framing, stream multiplexing, and protobuf encoding are modern protocol knowledge requirements |
| K0221 | Knowledge of OSI model and network layers | gRPC spans layers 4 (TCP), 6 (TLS), and 7 (HTTP/2, protobuf) — multi-layer analysis is essential |
| S0046 | Skill in performing packet-level analysis | Dissecting binary HTTP/2 frames and decoding protobuf messages from raw PCAP data |
| T0023 | Collect intrusion artifacts for use in forensic analysis | Decoded gRPC request/response fields provide structured forensic artifacts from microservice communications |
Further Reading
- gRPC over HTTP/2 specification: github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md
- Wireshark Wiki: HTTP/2 dissector and protobuf decode configuration
- blackboxprotobuf: schema-free protobuf analysis library (github.com/nccgroup/blackboxprotobuf)
- Google Protocol Buffers encoding guide: protobuf.dev/programming-guides/encoding/
- SANS: "Analysing gRPC Traffic in Wireshark" (blog post)