
Burp Suite WebSocket Pentesting: Stop the Chaos & Produce Evidence
The first time I “tested WebSockets,” I spent 47 minutes attacking the wrong connection—telemetry cosplay, not the feature that mattered. That’s when I built a Burp Suite WebSocket pentesting workflow (Repeater + History + Filtering) that stops the chaos and starts producing evidence.
If you’ve ever watched Repeater “do nothing,” chased messages that only work once, or drowned in ping/pong noise while the real action hides in plain sight, you’re not alone. Modern apps open a small galaxy of sockets—and your brain is expected to remember which one actually changes state. Keep guessing and you lose hours, miss the bug, and end up with a report you can’t defend.
The Practical Promise: Capture the upgrade handshake correctly, use Proxy → WebSockets history to build the “movie,” then use Repeater for one-variable tests—without breaking session context (cookies/tokens), order, or direction. I’m not theorizing. I’m documenting the exact mistakes I made—and the fixes that finally made results consistent.
What You Will Learn
- ✓ How to spot the real socket fast
- ✓ How to filter noise without deleting proof
- ✓ How to replay in-order when token rotation and join/subscribe state are in play
- ✓ How to write findings as “expected vs observed,” not vibes
Permission first: the 30-second scope check you’ll be glad you did
WebSockets are “just another transport” until they aren’t. One wrong assumption (like testing a customer chat socket outside scope) and you’re no longer a professional—you’re a problem.
- My mistake: treating “it’s in the app” as “it’s in-scope.”
- The fix: a 30-second permission pass before you capture anything.
- Confirm environment (dev/stage/prod) and allowed accounts—ideally using a dedicated safe hacking lab setup or an explicitly approved test tenant.
- Confirm allowed actions (read-only vs mutation, rate limits, fuzzing limits).
- Confirm logging expectations (what you must capture/redact).
Apply in 60 seconds: Write one sentence in your notes: “Socket X is in-scope because it powers feature Y under rule Z.”

What counts as “authorized” WebSocket testing in US pentest contracts
Most US pentest SOWs don’t say “WebSockets” explicitly. They say application functionality, API endpoints, authenticated workflows, and rate limits. Treat each socket as an endpoint with a blast radius. If you can’t point to the clause that allows message mutation or volume testing, default to the safest version: baseline capture plus minimal edits. Escalate only once that clause is confirmed, with the same proof-driven, scope-aware, documented mindset that separates penetration testing from vulnerability scanning.
Safe payload rules for real-time channels (keep it reversible)
WebSockets love repetition. That’s great for apps—and dangerous for tests. Your payload rules should be boring:
- Prefer read-only probes first (e.g., observe, replay, confirm error handling).
- When mutating, change one field at a time and keep it reversible.
- Avoid “spray and pray.” A real-time socket can amplify mistakes faster than HTTP.
Quick reality check—are you on prod?
If you’re not 100% sure, assume yes. I once ran a “gentle test” that was gentle in my head and not gentle in the client’s metrics. It’s a very specific feeling: your stomach drops, your cursor stops, and you suddenly become a person who loves checklists—especially the kind that live inside a repeatable note-taking system for pentesting.
- Do you have explicit permission to test authenticated features that use WebSockets? Yes/No
- Do you have a non-production environment or a safe test account? Yes/No
- Is there a stated cap on message volume or fuzzing intensity? Yes/No
- Do you have a plan to redact tokens and PII in screenshots/logs? Yes/No
Neutral next action: If any answer is “No,” pause and get written clarity before you mutate messages.
Handshake map: the upgrade request that quietly controls everything
The handshake is the moment WebSockets stop being “a concept” and become your target. The WebSocket protocol is standardized (RFC 6455), but every application layers its own identity and trust on top—cookies, tokens, subprotocols, and sometimes fragile assumptions.
Here’s the uncomfortable truth: if your replay fails, it’s often because you didn’t capture the handshake context that made the server say “yes” in the first place.
The minimum handshake artifacts to screenshot/log (cookies, tokens, subprotocol, Origin)
When you capture the upgrade request in Burp, don’t just admire it. Record what makes it authentic:
- URL + path (including query parameters that look “temporary”).
- Cookie header (session binding often lives here).
- Authorization-style tokens (headers or query params).
- Sec-WebSocket-Protocol (subprotocol) if present.
- Origin (cross-site risk lives here).
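The checklist above can be sketched as a small parser that takes a raw upgrade request (for example, one copied out of Burp as text) and keeps only the identity-bearing fields, redacting secrets on the way. The host, path, cookie name, and subprotocol in the sample are made up for illustration. Query parameters are returned unredacted here, so treat them as sensitive if the app puts tokens there.

```python
from urllib.parse import urlsplit, parse_qsl

# Header names worth recording from the upgrade request (lower-cased for matching).
IDENTITY_HEADERS = {"cookie", "authorization", "sec-websocket-protocol", "origin"}
# Headers whose values should not appear unredacted in screenshots or logs.
SECRET_HEADERS = {"cookie", "authorization"}


def redact(value: str, keep: int = 4) -> str:
    """Show that a secret is present without revealing it."""
    if len(value) <= keep:
        return "*" * len(value)
    return value[:keep] + "...[redacted, %d chars]" % len(value)


def handshake_artifacts(raw_request: str) -> dict:
    """Pull the identity-bearing fields out of a raw HTTP upgrade request."""
    lines = raw_request.replace("\r\n", "\n").split("\n")
    _method, target, _version = lines[0].split(" ", 2)
    parts = urlsplit(target)
    artifacts = {
        "path": parts.path,
        "query": dict(parse_qsl(parts.query, keep_blank_values=True)),
        "headers": {},
    }
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header block
        name, _, value = line.partition(":")
        key = name.strip().lower()
        if key in IDENTITY_HEADERS:
            value = value.strip()
            artifacts["headers"][key] = redact(value) if key in SECRET_HEADERS else value
    return artifacts


# Hypothetical request used only to exercise the parser.
SAMPLE_REQUEST = "\r\n".join([
    "GET /ws/chat?room=42&ts=1700000000 HTTP/1.1",
    "Host: app.example.test",
    "Upgrade: websocket",
    "Connection: Upgrade",
    "Origin: https://app.example.test",
    "Cookie: session=abcdef123456",
    "Sec-WebSocket-Protocol: chat.v1",
    "",
    "",
])

if __name__ == "__main__":
    print(handshake_artifacts(SAMPLE_REQUEST))
```

Pasting the returned dictionary into your notes gives you a record of what made the handshake authentic without leaking the session value.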
Which header decides whether the server trusts you? (and why)
If I had to pick one “quiet kingmaker,” it’s the piece that binds identity to context. Sometimes that’s the cookie. Sometimes it’s a bearer token. Sometimes it’s a subprotocol value the backend expects. The practical rule: when you see a handshake succeed once, treat its identity fields like a key—copy them carefully, and assume they can expire. If your environment is a Kali-based lab, this is also where a hardened baseline helps: stable auth flows, stable tooling, fewer self-inflicted failures (see Kali SSH hardening if your test box is a recurring jump host).
When handshake data goes stale (and why your replay suddenly “dies”)
Token rotation is the classic culprit. Another one: the server expects a short-lived nonce or a per-connection value. I learned this the hard way on a “simple” messaging feature—my first replay worked, the second failed, and I spent 22 minutes blaming Burp instead of my assumptions.
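One quick way to rule token expiry in or out, assuming the token happens to be a JWT (many are, but that is an assumption about your target), is to decode its payload and read the `exp` claim. This sketch does not verify the signature. `demo_token` is a hypothetical helper that builds an unsigned, JWT-shaped string for local experiments only.

```python
import base64
import json
import time


def jwt_expiry(token: str):
    """Return the 'exp' claim (Unix seconds) of a JWT, or None if absent.

    Only the payload is decoded; the signature is not checked.
    """
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore stripped padding
    claims = json.loads(base64.urlsafe_b64decode(padded))
    return claims.get("exp")


def is_stale(token: str, now=None) -> bool:
    """True if the token carries an 'exp' claim that has already passed."""
    exp = jwt_expiry(token)
    if exp is None:
        return False  # no expiry claim: staleness has to be checked another way
    return (time.time() if now is None else now) >= exp


def demo_token(claims: dict) -> str:
    """Build an unsigned, JWT-shaped string for local experiments only."""
    def b64(data: dict) -> str:
        raw = json.dumps(data).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return f"{b64({'alg': 'none'})}.{b64(claims)}."
```

If `is_stale` returns True for the token in your captured handshake, recapture before you spend time blaming the tool.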
WebSockets begin as HTTP and “upgrade” the connection. That means your auth story can be split: some identity is validated at handshake time, and some is validated per message. If your app binds a session to a specific connection or expects a rotating token, replaying the same payload out of context can fail even if the payload is “correct.”
Stream selection: find the real WebSocket (not telemetry cosplay)
Modern apps open more sockets than your brain wants to count. Analytics, notifications, chat, collaboration cursors, background sync—some of them are loud, some are quiet, and the dangerous ones are not always the obvious ones.
My favorite time-waster used to be “pick the first one.” My new favorite is “prove it’s the one.”
Triage cues: endpoint path, message cadence, payload shape, direction mix
- Endpoint path: descriptive paths often map to features; generic ones can hide multiplexed traffic.
- Cadence: heartbeats look rhythmic; real actions look bursty and human-timed.
- Payload shape: does it resemble your feature (IDs, room/channel names, action verbs)?
- Direction mix: action sockets show meaningful client→server requests and server→client responses.
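The cadence cue can be made a little less subjective. A rough sketch, assuming you have exported message timestamps (in seconds) for one connection: evenly spaced gaps suggest a heartbeat, while widely varying gaps suggest human-timed bursts. The 0.1 threshold is an arbitrary starting point, not a standard.

```python
from statistics import mean, pstdev


def cadence_profile(timestamps):
    """Summarize message timing for one connection.

    timestamps: message times in seconds, ascending.
    Returns (mean_gap, coefficient_of_variation), or None with too little data.
    A low coefficient means evenly spaced messages.
    """
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2:
        return None  # not enough data to judge
    avg = mean(gaps)
    if avg == 0:
        return (0.0, 0.0)
    return (avg, pstdev(gaps) / avg)


def looks_rhythmic(timestamps, max_cv=0.1):
    """True if the gaps between messages are nearly constant."""
    profile = cadence_profile(timestamps)
    return profile is not None and profile[1] <= max_cv
```

Treat the result as a hint for triage only. A rhythmic socket can still carry an occasional privileged message, so confirm against payload shape and direction mix.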
“One action, one capture”: isolating a single user gesture end-to-end
Do one thing in the UI. One click. One send. One “join room.” Then capture from handshake to response. If you do three actions, you’ll get three storylines braided together—and your replay will feel haunted. The same “one action” discipline is why a fast enumeration routine works: single hypotheses, clean signals, less chaos.
Why the noisiest socket is rarely the most dangerous
The loudest socket is often just “presence” or “keep-alive.” The one that changes money, permissions, or data tends to be quiet until you do the privileged action. It’s the difference between a drummer warming up and the conductor lifting the baton.
Prioritize:
- Sockets that trigger state changes (create, update, transfer, approve).
- Sockets with IDs you can swap (userId, accountId, roomId).
- Sockets tied to privileged UI actions.
Deprioritize:
- Pure heartbeat/presence streams.
- Analytics/telemetry sockets with no user-controlled actions.
- Background sync with no obvious authorization decisions.
Neutral next action: Pick one high-impact socket and capture a single action end-to-end.
History as truth: build a “movie” before you mutate anything
If WebSockets are a conversation, your job is to record it like a court reporter—before you start improvising lines. Burp’s WebSockets history is where you build the “movie”: what was sent, what was received, and in what order.
Where to live: Proxy → WebSockets history (and what to pin)
Park yourself in WebSockets history and get disciplined:
- Pick the right connection and stay with it.
- Note the handshake moment and the first authenticated message (if any).
- Identify “this is the action” messages by timing and payload fields.
Label the timeline so it’s reproducible by another tester
This is how you stop being “the person who saw it once” and become “the person who can prove it.” It’s also where your tooling muscle matters—good operators pair Burp evidence with clean traffic context (see traffic analysis with Wireshark when you need a second lens beyond Burp’s UI).
- T0: handshake succeeds
- T1: auth/subscribe/join message
- T2: privileged action message
- T3: server response or downstream effect
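The T0 to T3 labels can be applied mechanically once a capture is exported. This is a minimal sketch, assuming each message is a dict with a `direction` (`"out"` for client→server, `"in"` for server→client) and a JSON `payload` carrying a `type` field. Those field names and the context types are assumptions; substitute whatever your target uses.

```python
import json

# Hypothetical "type" values that establish context before the action.
CONTEXT_TYPES = {"auth", "subscribe", "join"}


def label_timeline(messages, action_type):
    """Attach T0..T3 labels to an ordered capture.

    T0 (handshake) is recorded separately because it is not a message.
    Labeling stops at the first server message after the action.
    """
    labels = [("T0", "handshake succeeds")]
    seen_action = False
    for index, msg in enumerate(messages):
        try:
            body = json.loads(msg["payload"])
        except json.JSONDecodeError:
            continue  # skip non-JSON frames
        kind = body.get("type") if isinstance(body, dict) else None
        outgoing = msg["direction"] == "out"
        if outgoing and kind in CONTEXT_TYPES and not seen_action:
            labels.append(("T1", f"#{index} {kind}"))
        elif outgoing and kind == action_type and not seen_action:
            labels.append(("T2", f"#{index} {kind}"))
            seen_action = True
        elif not outgoing and seen_action:
            labels.append(("T3", f"#{index} {kind}"))
            break
    return labels
```

The `#index` in each label points back at the position in the raw capture, which is what lets another tester find the same message.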
Let’s be honest—your memory lies. History doesn’t.
I’ve done the “I swear it worked five minutes ago” dance. History ends that dance. It also lets you spot patterns: correlation IDs, channel names, and the little “ack” messages that tell you the server accepted your request even when the UI pretends nothing happened. If you’re building your general skill stack around repeatability, you’ll feel the same payoff in labs like Kioptrix Level 2—sequence, evidence, and patience win more often than heroics.
In many real-time apps, the first meaningful message after handshake is a subscription/join that sets server-side context (room, topic, permissions). If you replay the “action” without replaying the join/subscribe first, you’re effectively speaking into a room you never entered.
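To see why the join matters, here is a toy model of per-connection server state. It is not any real framework, and the message types (`join`, `post`) are invented. It only illustrates that the same action message can succeed or fail depending on what the connection has already said.

```python
class ToyRoomServer:
    """A stand-in for server-side connection state; one instance per connection."""

    def __init__(self):
        self.joined = set()  # rooms this connection has entered

    def handle(self, message: dict) -> dict:
        kind, room = message.get("type"), message.get("room")
        if kind == "join":
            self.joined.add(room)
            return {"type": "joined", "room": room}
        if kind == "post":
            if room not in self.joined:
                # The payload is well-formed, but the context is missing.
                return {"type": "error", "reason": "not_joined"}
            return {"type": "ack", "room": room}
        return {"type": "error", "reason": "unknown_type"}
```

A fresh `ToyRoomServer()` plays the role of a reconnect: the joined set is empty again, so an action replayed on a new connection fails until the join is replayed first.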

Filtering without self-sabotage: remove noise, keep the bug
Filtering is where good testers become fast testers. It’s also where fast testers accidentally delete the one message that proves impact. So we do this like adults: we filter to see, not to forget.
Name the three noise classes: heartbeats, acks, background sync
- Heartbeats: ping/pong, keep-alive, “still here.”
- Acks: confirmations, sequence counters, receipt messages.
- Background sync: periodic refreshes that look “important” but aren’t your action.
The 3 clicks that reveal the real action (pin + direction + keyword filters)
Your goal is a stable view where a human action creates a visible burst of messages. Do this:
- Pin the connection you care about.
- Filter by direction (client→server when you’re hunting “what did I send?”).
- Keyword filter for action fields (e.g., action, type, room, id).
The “dual view” rule: raw feed vs signal-only feed (so you don’t hide evidence)
Keep one view raw. Keep one view filtered. When you find something, confirm it exists in raw history so you can export/screenshot it without “filter doubt.” This is the same mental model you use when scanning: raw output for proof, filtered output for speed (if you want a clean parallel, skim easy-to-miss Nmap flags—small filters change what you notice).
- Filter by direction to isolate the request you control.
- Filter by keywords that match the user action you performed.
- Always sanity-check the raw feed before you write the report.
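If you export a capture and work on it outside Burp, the dual-view rule can be sketched as a filter that never touches the raw list and tags every kept message with its raw index. The message shape (`direction`, `payload`) and the heartbeat markers are assumptions for illustration. Substring matching is crude (a marker like "ping" also matches "shipping"), so tune it to your target.

```python
HEARTBEAT_MARKERS = ("ping", "pong", "heartbeat")  # assumed noise keywords


def signal_view(raw, direction="out", keywords=("action", "type", "room", "id")):
    """Return a filtered copy of the capture; the raw list is never modified.

    Each kept entry carries its index in the raw feed so the evidence can be
    located again in the unfiltered history.
    """
    view = []
    for index, msg in enumerate(raw):
        if msg["direction"] != direction:
            continue
        text = msg["payload"].lower()
        if any(marker in text for marker in HEARTBEAT_MARKERS):
            continue  # drop heartbeat-style noise from the SIGNAL view only
        if any(word in text for word in keywords):
            view.append({"raw_index": index, **msg})
    return view
```

Before a message goes into the report, look it up with `raw[entry["raw_index"]]` and confirm that it matches what the filtered view showed.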
Apply in 60 seconds: Duplicate your view: one “RAW,” one “SIGNAL,” and never mix them.
Repeater lab setup: replay messages without breaking state
Repeater is your lab bench. It’s where you take one captured message and change one variable to see what the server actually enforces. If Proxy is your camera, Repeater is your controlled experiment.
The clean path: WebSockets history → Send to Repeater → Send → verify in history
The boring workflow is the winning workflow:
- Find the message in WebSockets history that represents the user action.
- Send it to Repeater.
- Send it unchanged once (baseline replay).
- Verify the result back in history (and/or in the UI) before you mutate.
Direction control (server vs client) and why it matters
Some messages are meant to be client→server requests. Others are server→client events. If you replay a server event as if you’re the client, you’ll get nonsense results and emotional damage. Label your message direction in your notes every time you copy it.
Repeater is a time machine, not a reset button.
My early mistake was assuming Repeater “starts fresh.” It doesn’t. The server remembers context—rooms, subscriptions, permissions, sequences. Your replay must respect that context, or you’ll test a fantasy.
Real-time protocols often include correlation IDs, message IDs, or sequence numbers. Even when they look optional, the server may use them for replay protection, ordering, or deduplication. When a replay “works once,” check whether a second replay fails due to dedupe logic rather than authorization.
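One way to separate "blocked by dedupe" from "blocked by authorization" is to replay the same message with only its ID regenerated. A minimal sketch, assuming JSON messages and a hypothetical `msgId` field; your target's field name and ID format will differ. Note that a fresh ID may cause the action to execute again, so only do this where repeating the action is in scope.

```python
import json
import uuid


def with_fresh_id(payload: str, id_field: str = "msgId") -> str:
    """Return the same JSON message with only its ID field regenerated."""
    message = json.loads(payload)
    if not isinstance(message, dict) or id_field not in message:
        return payload  # nothing to refresh; leave the message untouched
    message[id_field] = str(uuid.uuid4())
    return json.dumps(message)
```

If the unchanged replay fails and the fresh-ID replay succeeds, the earlier failure was likely deduplication rather than an access decision.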
I was testing a real-time “approve” action. The UI would sometimes approve, sometimes fail, sometimes do nothing—perfect chaos. I captured the message, sent it to Repeater, and… success. I changed one field and… failure. Great. Then I tried the same change again and… success. That’s when I started blaming the app, the network, and—briefly—modern civilization.
The actual problem was painfully simple: the original capture included a join/subscription message a few seconds before the action. I was replaying the action without replaying the context that made it valid. When I rebuilt the sequence (handshake → join → action), the results became consistent. And when results become consistent, they become reportable. My ego didn’t love it. My deliverables did.

Connection surgery: clone/reconnect and edit the handshake on purpose
This is the part most guides mention quickly and most testers learn slowly: connections drop. Tokens rotate. Backends expect a per-connection state. When that happens, “Send” turns into “Nothing,” and you start bargaining with your tools.
Don’t bargain. Operate.
When you must manipulate the handshake (drops, stale tokens, hidden paths)
- Your baseline replay suddenly fails after a few minutes.
- The app reconnects silently, and your captured connection is now stale.
- There’s a hidden parameter or subprotocol value that changes per session.
The Repeater wizard move: attach/clone/reconnect, then edit handshake details
The goal is not “edit everything.” The goal is “change the minimum required to re-establish a valid connection.” Start by comparing a fresh handshake capture against your stale one. Look for:
- rotating token values
- new cookie/session identifiers
- subprotocol changes
- query parameters that look like timestamps/nonces
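The stale-versus-fresh comparison can be done by eye, but a small diff makes the rotating fields obvious. This sketch assumes you have both upgrade requests as raw text. `Sec-WebSocket-Key` is ignored because RFC 6455 has the client generate a new random value for every connection, so it always differs.

```python
from urllib.parse import urlsplit, parse_qsl

# Fields that differ on every connection by design and carry no identity.
IGNORED_FIELDS = {"header:sec-websocket-key"}


def parse_handshake(raw_request: str) -> dict:
    """Flatten a raw upgrade request into comparable key/value pairs."""
    lines = raw_request.replace("\r\n", "\n").split("\n")
    target = lines[0].split(" ")[1]
    parts = urlsplit(target)
    fields = {"path": parts.path}
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        fields[f"query:{name}"] = value
    for line in lines[1:]:
        if not line.strip():
            break  # a blank line ends the header block
        name, _, value = line.partition(":")
        fields[f"header:{name.strip().lower()}"] = value.strip()
    return fields


def handshake_diff(stale: str, fresh: str) -> dict:
    """Map each differing field to its (stale, fresh) values."""
    old, new = parse_handshake(stale), parse_handshake(fresh)
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(old.keys() | new.keys())
        if key not in IGNORED_FIELDS and old.get(key) != new.get(key)
    }
```

Whatever shows up in the diff is your candidate list for "update rotating fields one at a time." The output contains raw secrets, so redact before it goes into notes or a report.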
What to change first (least invasive → most invasive)
- Refresh auth context (log out/in, recapture handshake).
- Rebuild the correct message sequence (join/subscribe before action).
- Update rotating fields one at a time.
- Only then consider deeper handshake manipulation tests (Origin/subprotocol paths).
7 brutal mistakes: what breaks real-world WebSocket tests (and fixes that stick)
These are not theoretical mistakes. These are the “I did it so you don’t have to” mistakes—the ones that turn a confident morning into a foggy afternoon.
Mistake #1: Replaying before you have a baseline movie → Fix: golden capture first
If you mutate before you understand the normal sequence, you won’t know whether your change caused the result or the app did something else. Capture first. Replay unchanged once. Then mutate.
Mistake #2: Testing the wrong socket → Fix: stream selection decision tree
The best payload in the world won’t work against the wrong connection. Prove the socket maps to the feature you’re testing using one action and timing correlation.
Mistake #3: Replaying out of order → Fix: sequence lock + one-variable edits
Many apps establish context in an early “join/subscribe/auth” message. If you skip it, your action fails for the wrong reason. Lock the sequence. Then change one variable at a time.
Mistake #4: Losing session context (rotating tokens / stale handshake) → Fix: reconnect playbook
When your replay “dies,” don’t thrash. Re-capture a fresh handshake and compare it to your stale one. Identify the rotating parts, update minimally, and re-test baseline. If your baseline environment is messy, fix the lab before you blame the target—your future self will thank you (start with Kali Linux lab infrastructure mastery and keep your workstation predictable).
Mistake #5: Filtering away the exploit evidence → Fix: dual view rule
Filters are for speed. Raw history is for proof. If you can’t find it in raw, you can’t defend it in a report.
Mistake #6: Ignoring directionality → Fix: label client→server vs server→client edits
I once tried to “replay” a server event as a client request. The server responded like a confused bouncer. Direction matters. Label it every time.
Mistake #7: Changing five fields at once → Fix: one mutation, one note, one expected delta
Changing multiple fields gives you multiple possible explanations. That’s not testing. That’s guessing with extra steps.
- Baseline replay first—always.
- Sequence dependencies are common, not rare.
- One-variable edits produce defensible conclusions.
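"One mutation, one note, one expected delta" can be enforced rather than remembered. A minimal sketch, assuming JSON object messages: given a baseline and a set of candidate values per field, it yields mutations that each differ from the baseline in one field, along with a note you can paste into your log. The field names in the usage example are hypothetical.

```python
import copy


def one_variable_mutations(baseline: dict, candidates: dict):
    """Yield (note, mutated_message) pairs, each changing a single field.

    candidates maps a field name to the list of replacement values to try.
    The baseline itself is never modified.
    """
    for field, values in candidates.items():
        for value in values:
            if baseline.get(field) == value:
                continue  # same value as baseline: not a mutation
            mutated = copy.deepcopy(baseline)
            mutated[field] = value
            note = f"{field}: {baseline.get(field)!r} -> {value!r}"
            yield note, mutated
```

For example, `one_variable_mutations({"action": "approve", "accountId": 1001}, {"accountId": [1002]})` yields a single message with only `accountId` changed, paired with the note `accountId: 1001 -> 1002`.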
Apply in 60 seconds: Add “baseline replay verified” as a checkbox before any mutation.
Exploit patterns that matter: message bugs vs handshake bugs vs cross-site bugs
Competitors love lists of “WebSocket vulnerabilities.” Useful, but incomplete. What you need is a map: which class of bug you’re testing, and which Burp lever finds it fastest.
Message tampering: injection/XSS-style payload paths (server and client impact)
Treat every message field as untrusted input until proven otherwise. Message bugs often look like:
- type confusion (string vs number)
- schema bypass (unexpected fields accepted)
- client-side rendering risk (unsafe HTML in messages)
Practical tip: start with harmless markers and observe how they propagate. You’re not trying to “break everything.” You’re trying to show a clear validation failure with visible impact. If you want a mental bridge from “HTTP injection patterns” to “message-layer injection patterns,” it helps to study a few concrete templates like NoSQL injection patterns—not because WebSockets equal NoSQL, but because the attacker mindset (shape, coercion, validation gaps) transfers.
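A harmless marker is easier to follow when it is unique. This sketch generates an inert marker containing characters an HTML encoder should escape, then checks later server→client payloads to see whether it came back raw or escaped. The `wsprobe-` prefix is arbitrary. The check is a plain substring match, so a server that re-encodes the marker differently (for example as `\u003c` inside JSON) will not be detected.

```python
import html
import uuid


def make_marker() -> str:
    """A unique, inert string that includes characters HTML encoders escape."""
    return f"wsprobe-{uuid.uuid4().hex[:8]}<b>"


def trace_marker(marker: str, incoming_payloads):
    """Report where the marker reappears and whether it came back escaped."""
    escaped = html.escape(marker)
    findings = []
    for index, payload in enumerate(incoming_payloads):
        if marker in payload:
            findings.append((index, "raw"))
        elif escaped in payload:
            findings.append((index, "escaped"))
    return findings
```

A "raw" hit is only a lead. The impact depends on how the client renders that field, so confirm in the browser before calling it a finding.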
Handshake design flaws: trust in headers, session context binding, custom headers
Some apps put too much trust in handshake metadata—assuming the browser means “safe,” assuming Origin will be validated, assuming the token always ties to the right user. When those assumptions fail, you get cross-context authorization problems that don’t show up in normal HTTP testing. This is where it pays to have broad web exploitation essentials instincts, then translate them into real-time flows.
Cross-site WebSocket hijacking: when CSRF meets the handshake
If a site can be tricked into establishing a WebSocket connection cross-site, and the server relies on ambient credentials (like cookies) without robust checks, you can end up with “CSRF, but real-time.” Not every app is vulnerable. Many are protected by design. But you should know where the edges are.
The handshake begins as an HTTP request, so concepts like Origin checking and cookie behavior matter. How an app validates the handshake (and whether it requires a per-request token beyond cookies) often determines whether cross-site initiation is possible.
Neutral next action: Pick the lowest tier that answers the question you’re paid to answer.

Fix guidance: what “secure WebSockets” actually means in tickets
Your report shouldn’t just say “WebSockets are insecure.” That’s like saying “roads are dangerous.” True, but unhelpful. Good tickets explain what to change, how to verify the fix, and what “done” looks like.
PortSwigger’s Burp documentation is a solid reference for how testing workflows work in practice, and OWASP’s testing guidance is a useful baseline mindset: validate trust boundaries, validate authorization decisions, validate input handling. Your job is to turn that into fixable, testable steps—then present them with the same clarity you’d use in a professional OSCP report template (clean repro, expected vs observed, minimal noise).
Non-negotiables: wss://, protect the handshake, treat data as untrusted
- Use TLS (wss://): protect traffic from interception in transit.
- Protect the handshake: don’t rely on ambient credentials alone for sensitive actions.
- Validate messages server-side: schema, types, and permission checks must live on the server.
Origin and SameSite realities (what to verify, what to recommend)
Origin checking is not a magic spell, but it’s a meaningful control when used correctly. Cookie behavior (including SameSite settings) can reduce cross-site risk in many designs, but you still need server-side validation for sensitive actions. In tickets, keep it practical: “Server must reject handshakes that fail X checks,” not “consider security.”
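For the ticket, "reject handshakes that fail X checks" might look like the following on the server side. This is a framework-agnostic sketch, not any particular library's API: `headers` is assumed to be a plain dict of handshake headers, and the allowlist entry is a placeholder. Rejecting a missing Origin is a policy choice made here for illustration, since non-browser clients may legitimately omit the header.

```python
from urllib.parse import urlsplit

ALLOWED_ORIGINS = {"https://app.example.test"}  # hypothetical allowlist


def origin_allowed(headers: dict) -> bool:
    """Exact-match check of the handshake Origin against an allowlist."""
    origin = headers.get("Origin")
    if origin is None:
        return False  # policy choice in this sketch: no Origin, no connection
    parts = urlsplit(origin)
    normalized = f"{parts.scheme}://{parts.netloc}".lower()
    # Exact comparison avoids prefix/suffix tricks such as
    # https://app.example.test.evil.example
    return normalized in ALLOWED_ORIGINS
```

The point of the exact match is that substring or prefix checks are a common source of bypasses. An Origin check complements, and does not replace, per-action authorization.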
“Don’t do this” #1: client-side filtering as access control (why auditors catch it)
If a client hides an “admin” button but the socket accepts the “admin action” message anyway, that’s not “security by UI.” That’s a vulnerability with a friendly face. Your ticket should name the missing server-side authorization check and provide a minimal repro. If you want a clean mental model for “what the app thinks it is” versus “what it actually is,” it helps to understand vulnerable web app structure—where trust boundaries leak in predictable places.
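The missing control in that scenario is small enough to sketch. The role names and message types below are invented. What matters is that the decision uses the role the server already holds for the session, never a role or flag supplied inside the message.

```python
# Hypothetical mapping of message type to the roles allowed to send it.
REQUIRED_ROLES = {"approve": {"admin"}, "comment": {"admin", "member"}}


def authorize(session_role: str, message: dict) -> bool:
    """Decide from server-held session state, not from fields in the message."""
    allowed = REQUIRED_ROLES.get(message.get("type"))
    # Unknown message types are denied by default.
    return allowed is not None and session_role in allowed
```

In a ticket, "done" then has a concrete test: a `member` session sending an `approve` message is rejected with no state change, even if the message claims `"role": "admin"`.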
“Don’t do this” #2: relying on headers as security decisions without validation
Headers are data. Data can be forged. If the server trusts a header to decide identity, role, or tenant without cryptographic binding and verification, your fix recommendation should be direct: remove that trust decision or bind it properly.
- State the trust assumption (“server trusts X”).
- Show the breach (replay/mutate one message).
- Define “fixed” (server rejects with clear error; no state change).
Apply in 60 seconds: Write “Expected vs Observed” in two bullet lines for every finding.
Next step: the 10-minute “golden capture” that upgrades every future test
If you do nothing else from this article, do this. A golden capture is a small artifact that makes your testing faster, your results repeatable, and your reporting painless.
Build your capture pack: handshake + 5 key messages + expected responses
Your capture pack should include:
- Handshake request/response essentials (URL, key headers, cookies/tokens)
- First context message (join/subscribe/auth if present)
- The action message you care about
- The server acknowledgment/response
- A visible downstream effect (UI change, state update, or explicit error)
Run one-variable mutations (authZ → IDOR-style IDs → validation → gentle rate probe)
The order matters. Start with authorization questions (can role A do role B’s action?). Then test ID swaps and types. Validation tests are cheap. Rate probes should be last—and only within explicit permission limits. When your mutation tests start drifting into “tool sprawl,” keep your kit tight with a curated baseline of essential Kali tools and a short list of pentesting tools you actually use.
What to screenshot/log so developers can replay it without you
- Message payload before and after your one-variable change
- Time ordering (what preceded what)
- Connection identifiers (room/channel/tenant IDs)
- Redacted auth values (show presence, not secrets)
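"Show presence, not secrets" can be applied to message payloads before they go into a report. A minimal sketch, assuming JSON payloads already parsed into Python objects; the set of secret key names is a guess and should be extended to match the target.

```python
SECRET_KEYS = {"token", "authorization", "cookie", "session", "password"}  # assumed names


def redact_payload(value):
    """Recursively replace secret values while keeping their keys visible."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if key.lower() in SECRET_KEYS else redact_payload(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact_payload(item) for item in value]
    return value  # scalars pass through unchanged
```

Because keys stay in place, a developer can still see that a token was sent and where, which is usually all the repro needs.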
FAQ
Q: Can Burp Suite intercept and modify WebSocket messages?
A: Yes. Burp Proxy can intercept and edit WebSocket messages in flight, and you can review traffic in Burp’s WebSockets history and send individual messages to Repeater for controlled replay and mutation. The key is capturing the correct connection and keeping session context stable.
Q: Why does WebSocket replay fail in Burp Repeater?
A: The usual causes are state and sequence: stale handshake/auth data (token rotation), replaying the action before replaying the join/subscribe context, or sending the wrong direction message. Re-capture a fresh handshake, rebuild the baseline sequence, then mutate one field.
Q: What headers matter most in a WebSocket handshake?
A: Practically, the headers that bind identity and expectations: cookies/tokens, Origin (for cross-site risk), and subprotocol values if present. Record what changes across sessions—those are often the fields your replay must preserve or refresh.
Q: How do I find the correct WebSocket connection when there are multiple?
A: Do one UI action at a time and watch for a burst of messages that correlate with your timing. Look for payload fields that match the feature (room IDs, action names) and a meaningful client→server request followed by a server response/ack.
Q: How do I safely test authorization (authZ) for WebSocket actions?
A: Start with two roles/accounts and a single privileged action. Capture the authorized message from the privileged role, then attempt a minimal mutation/replay from the low-priv context using the same action structure. Your proof is “same message shape, different identity, server still allows state change.”
Q: Do I need wss:// for security, or is it optional?
A: For anything sensitive, treat wss:// as non-optional. TLS protects traffic in transit, but it doesn’t replace server-side authorization and validation. Think of it as “necessary, not sufficient.”
Q: How do I filter ping/pong and heartbeat noise without missing the exploit message?
A: Use a dual-view approach: keep one raw feed and one signal-only feed. Filter by direction (client→server for actions), then by keywords tied to your UI action. Always confirm your key evidence exists in the raw feed before exporting screenshots.
Conclusion: your 15-minute pilot run
Remember the curiosity loop from the beginning—the “wrong connection” trap? Here’s the clean way out: you don’t win WebSocket testing by being clever. You win by being repeatable.
Your Burp Suite WebSocket pentesting workflow is now a simple triangle:
- Handshake = the map (what makes the server say yes): record identity fields (cookies/tokens), note subprotocol + Origin if present, and assume some values can expire.
- History = the truth (what actually happened): build the “movie” (T0→T3), filter for signal (keep the raw view), and identify sequence dependencies.
- Repeater = the lab (one-variable experiments): replay unchanged once, mutate one variable at a time, and verify impact back in history.
Here’s your 15-minute pilot run:
- Pick one in-scope feature that uses WebSockets.
- Capture handshake + one action as a baseline movie.
- Replay unchanged in Repeater (prove reproducibility).
- Change one field (an ID, a role hint, a type) and observe the server’s behavior.
If you want the cleanest outcome: write one ticket with one missing control and one proof. That’s the kind of work teams fix quickly—and the kind of work clients remember. And if you’re building this into a long-term skill engine, pair the workflow with a sustainable practice cadence like a 2-hour-a-day OSCP routine so “repeatable” becomes muscle memory, not a mood.
Last reviewed: 2025-12.