Lakshya Purohit | Software Developer & System Architect

Designing a Secure, Low-Latency Video Platform for Compliance and Verification Workflows using SFU architecture.

Imagine you're building a platform where users must verify their identity through live video.

A user opens their camera, an agent verifies their identity, documents are shown on camera, and the entire interaction is recorded for compliance.

This is common in systems like:

Digital KYC verification
Remote exam proctoring
Telemedicine consultations
Insurance claim verification
Secure onboarding for financial platforms

In such workflows, the video system must be:

Low latency
Highly secure
Reliable under load
Scalable for thousands of users

This is exactly where WebRTC + Mediasoup becomes a powerful combination.

In this article we'll explore:

How WebRTC enables real-time communication
Why SFU architecture is critical
How Mediasoup routes media efficiently
How to design a secure verification platform
How to deploy and host the system properly
Real code examples

Let’s start with the foundation.

Understanding WebRTC

WebRTC (Web Real-Time Communication) allows browsers and applications to exchange audio, video, and data streams in real time.

Unlike HTTP-based streaming such as HLS or DASH, which buffers several seconds of video, WebRTC targets sub-second latency.

Typical WebRTC architecture:

User Browser
     │
     │ Signaling (WebSocket / HTTP)
     ▼
Signaling Server
     │
     ▼
Other Participants

Important detail:

WebRTC does not define signaling, meaning developers must build the signaling server themselves using technologies like:

WebSockets
Socket.IO
REST APIs
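Whatever transport is chosen, the signaling server's core job is to pass offers, answers, and ICE candidates between peers. A minimal, transport-agnostic sketch of that relay is shown here; the class name, the message shape (`type`, `to`, `from`), and the `send` callbacks (which stand in for WebSocket connections) are all assumptions made for illustration.

```javascript
// In-memory signaling relay. Each peer registers a `send` callback,
// which in a real server would write to that peer's WebSocket.
class SignalingRelay {
  constructor() {
    this.peers = new Map(); // peerId -> send(message) callback
  }

  join(peerId, send) {
    this.peers.set(peerId, send);
  }

  leave(peerId) {
    this.peers.delete(peerId);
  }

  // Forward an offer, answer, or ICE candidate to the addressed peer.
  // Returns false when the message type is unknown or the target is absent.
  relay(fromPeerId, message) {
    const allowedTypes = ["offer", "answer", "candidate"];
    if (!allowedTypes.includes(message.type)) return false;

    const send = this.peers.get(message.to);
    if (!send) return false;

    send({ ...message, from: fromPeerId });
    return true;
  }
}
```

The relay never inspects the SDP or candidate payload; it only routes messages, which is why WebRTC can leave signaling undefined.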

However, WebRTC's peer-to-peer model does not scale well for large sessions.

Why Peer-to-Peer Breaks at Scale

In a pure peer-to-peer architecture:

Every participant sends video to every other participant.

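Example with 4 users, showing only User1's outgoing streams:

User1 → User2
User1 → User3
User1 → User4

Users 2, 3, and 4 each do the same, for 12 one-way streams in total.

The total number of streams grows quadratically: n participants need n * (n - 1) one-way streams, and each participant's upload grows linearly with the size of the call.

With 10 participants, each browser must upload 9 video streams and download 9 more.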

This quickly overwhelms networks and devices.

To solve this, modern video platforms use SFU architecture.

SFU (Selective Forwarding Unit)

An SFU receives streams from participants and forwards them to others without re-encoding.

Users
  │
  ▼
 SFU Server
  │
  ├── forwards video streams
  └── forwards audio streams

Advantages:

Very low latency
Minimal CPU usage
Scales to hundreds of participants
Allows selective routing of streams
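To make the difference from a full mesh concrete, the stream counts for both topologies can be computed directly. This is a small arithmetic sketch; the function names are made up for illustration, and it assumes every participant publishes exactly one video stream.

```javascript
// Stream counts for a full-mesh (peer-to-peer) call.
function meshStreams(participants) {
  return {
    uploadsPerClient: participants - 1,
    downloadsPerClient: participants - 1,
    totalStreams: participants * (participants - 1),
  };
}

// Stream counts when every client sends one stream to an SFU.
function sfuStreams(participants) {
  return {
    uploadsPerClient: 1, // the SFU fans the stream out
    downloadsPerClient: participants - 1,
    serverOutboundStreams: participants * (participants - 1),
  };
}

console.log("mesh, 10 users:", meshStreams(10));
console.log("sfu, 10 users:", sfuStreams(10));
```

The fan-out work does not disappear with an SFU; it moves from each user's uplink to the server, where bandwidth is cheaper and easier to provision.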

Popular SFU systems include:

Mediasoup
Janus
Jitsi
LiveKit

For custom infrastructure and deep control, Mediasoup is one of the most powerful options.

What is Mediasoup?

Mediasoup is a Node.js-based WebRTC SFU framework with a high-performance C++ media worker.

It acts as a media router that:

Receives WebRTC streams
Routes RTP packets
Manages producers and consumers
Handles bandwidth adaptation

Key advantages:

High performance
Fully customizable architecture
Designed for production-scale video systems
Well suited to verification workflows

High-Level Architecture

A typical verification platform using WebRTC + Mediasoup looks like this:

Browser Client
     │
     │ WebRTC
     ▼
Signaling Server (Node.js)
     │
     ▼
Mediasoup SFU Cluster
     │
     ▼
Recording + Storage Service
     │
     ▼
Database

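Each layer plays a role. Signaling messages pass through the Node.js server, while media packets travel directly between the browser and the Mediasoup nodes: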

Layer	Responsibility
Client	Capture camera & microphone
Signaling Server	Exchange WebRTC connection info
Mediasoup	Route media streams
Recording Service	Store verification sessions
Database	Metadata & logs

Creating a Mediasoup Worker

First install dependencies:

bash

npm install mediasoup socket.io

Create a worker that handles media processing.

javascript

const mediasoup = require("mediasoup");

async function createWorker() {
  const worker = await mediasoup.createWorker({
    rtcMinPort: 40000,
    rtcMaxPort: 49999
  });

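  // A worker crash closes every router on it, so exit and let the process manager restart.
  worker.on("died", () => {
    console.error("mediasoup worker died, exiting");
    process.exit(1);
  });

  console.log("Worker created, pid", worker.pid);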

  return worker;
}

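Workers run the core media engine. Each worker is a separate C++ process that uses a single CPU core, so production servers usually start one worker per core.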
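Because a Mediasoup worker is single-threaded, a server normally runs several of them and spreads new routers across the pool. One simple policy is round-robin selection, sketched below. The helper name `createWorkerPicker` is hypothetical, and the pool is assumed to be an array of workers already created with `mediasoup.createWorker()`.

```javascript
const os = require("os");

// Returns a function that hands out workers in round-robin order.
function createWorkerPicker(workers) {
  if (workers.length === 0) {
    throw new Error("worker pool is empty");
  }
  let next = 0;
  return function pickWorker() {
    const worker = workers[next];
    next = (next + 1) % workers.length;
    return worker;
  };
}

// One worker per logical core is a common starting point (an assumption, tune for your load).
const suggestedWorkerCount = os.cpus().length;
console.log("suggested worker count:", suggestedWorkerCount);
```

Each new verification session would call `pickWorker()` and create its router on the returned worker, so no single core carries all sessions.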

Creating a Router

Routers define supported codecs.

javascript

const mediaCodecs = [
  {
    kind: "video",
    mimeType: "video/VP8",
    clockRate: 90000
  },
  {
    kind: "audio",
    mimeType: "audio/opus",
    clockRate: 48000,
    channels: 2
  }
];

// Run inside an async function: top-level await is not available in CommonJS modules.
const router = await worker.createRouter({ mediaCodecs });

The router becomes the central media hub.

Creating WebRTC Transport

A transport is the connection between one client and the router. Each client typically needs one transport for sending and another for receiving.

javascript

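// Run inside an async function. Replace PUBLIC_IP with the server's public address.
// Newer mediasoup releases prefer `listenInfos` over `listenIps`.
const transport = await router.createWebRtcTransport({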
  listenIps: [{ ip: "0.0.0.0", announcedIp: "PUBLIC_IP" }],
  enableUdp: true,
  enableTcp: true,
  preferUdp: true
});

This transport will handle the incoming and outgoing RTP streams.

Producing a Video Stream

When a user sends video to the server:

javascript

const producer = await transport.produce({
  kind: "video",
  rtpParameters,
  appData: { peerId }
});

The SFU now knows that a new video stream exists.

Consuming a Stream

Other participants receive the stream.

javascript

// `transport` here is the receiving participant's transport, not the producer's.
const consumer = await transport.consume({
  producerId,
  rtpCapabilities,
  paused: false
});

Mediasoup forwards the RTP packets as they arrive, so no transcoding is needed on the server.
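Before consuming, it is worth checking that the receiving device can actually decode the producer's codec, and a common pattern is to create the consumer paused and resume it only once the client has set up its side. The sketch below wraps both steps. `router.canConsume()`, `transport.consume()`, and `consumer.resume()` are Mediasoup calls; the wrapper name `createPausedConsumer` is hypothetical.

```javascript
// Create a consumer only if the receiver's capabilities match the producer.
// Returns null when the receiver cannot decode the stream.
async function createPausedConsumer(router, recvTransport, producerId, rtpCapabilities) {
  if (!router.canConsume({ producerId, rtpCapabilities })) {
    return null;
  }

  // Start paused so no packets are sent before the client is ready for them.
  const consumer = await recvTransport.consume({
    producerId,
    rtpCapabilities,
    paused: true,
  });

  return consumer;
}
```

After the client confirms over signaling that its receiving side is set up, the server calls `await consumer.resume()` to start the flow of packets.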

Client-Side Camera Capture

Capture the camera and microphone in the browser:

javascript

const stream = await navigator.mediaDevices.getUserMedia({
  video: true,
  audio: true
});

Attach it to a video element:

javascript

const video = document.getElementById("video");

video.srcObject = stream;
video.play();

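The stream's tracks are then sent to Mediasoup through a producer on the client's send transport, for example with the mediasoup-client library.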

Security Considerations

For compliance workflows, security is critical.

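Transport Encryption

WebRTC encrypts all media in transit using DTLS-SRTP. With an SFU this protects each hop between a client and the server, but it is not end-to-end encryption: the SFU decrypts packets before forwarding them, which is also what makes server-side recording possible. Treat the SFU nodes as part of your trusted, audited infrastructure.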

Authentication

Always authenticate users before they join a session, for example by verifying a signed JWT during the signaling handshake.
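As an illustration of what that verification involves, here is a minimal HS256 check built only on Node's `crypto` module. It is a sketch for understanding the mechanism, assuming a shared secret and an `exp` claim in seconds; in production a maintained JWT library is the safer choice. The function names are made up for this example.

```javascript
const crypto = require("crypto");

// Sign a payload as an HS256 JWT (used here to produce tokens for the verifier).
function signHs256(payload, secret) {
  const header = Buffer.from(JSON.stringify({ alg: "HS256", typ: "JWT" })).toString("base64url");
  const body = Buffer.from(JSON.stringify(payload)).toString("base64url");
  const signature = crypto.createHmac("sha256", secret).update(`${header}.${body}`).digest("base64url");
  return `${header}.${body}.${signature}`;
}

// Return the payload if the signature is valid and the token is not expired, else null.
function verifyHs256(token, secret, nowSeconds = Math.floor(Date.now() / 1000)) {
  const parts = String(token).split(".");
  if (parts.length !== 3) return null;
  const [header, body, signature] = parts;

  // Compare signatures in constant time to avoid timing leaks.
  const expected = crypto.createHmac("sha256", secret).update(`${header}.${body}`).digest();
  const actual = Buffer.from(signature, "base64url");
  if (actual.length !== expected.length || !crypto.timingSafeEqual(actual, expected)) {
    return null;
  }

  let payload;
  try {
    const parsedHeader = JSON.parse(Buffer.from(header, "base64url").toString("utf8"));
    if (!parsedHeader || parsedHeader.alg !== "HS256") return null;
    payload = JSON.parse(Buffer.from(body, "base64url").toString("utf8"));
  } catch {
    return null;
  }

  // Reject tokens without an expiry or with one in the past.
  if (!payload || typeof payload.exp !== "number" || payload.exp <= nowSeconds) return null;
  return payload;
}
```

The signaling server would call `verifyHs256` on the token sent with the connection request and refuse the connection when it returns `null`.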

Session Authorization

Each verification session should have:

SessionID
UserID
AgentID
Expiry
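A join request can then be checked against that session record. The sketch below assumes a record with hypothetical field names `userId`, `agentId`, and `expiresAt` (milliseconds since the epoch), and a peer object carrying an `id` and a `role`.

```javascript
// Decide whether a peer may join a verification session.
function authorizeJoin(session, peer, nowMs = Date.now()) {
  if (!session) {
    return { ok: false, reason: "unknown session" };
  }
  if (nowMs >= session.expiresAt) {
    return { ok: false, reason: "session expired" };
  }
  // Only the user and the agent assigned to this session may join.
  if (peer.role === "user" && peer.id === session.userId) {
    return { ok: true };
  }
  if (peer.role === "agent" && peer.id === session.agentId) {
    return { ok: true };
  }
  return { ok: false, reason: "peer not assigned to this session" };
}
```

Running this check on the server before creating any transport keeps unassigned peers from ever reaching the media layer.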

Recording Integrity

Recordings should include:

timestamps
user identifiers
audit logs

This metadata makes each recording traceable and auditable, which compliance reviews typically require.
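One way to make an audit log tamper-evident is to chain its entries with hashes, so that changing any earlier entry invalidates everything after it. The following is a sketch of that idea using SHA-256 from Node's `crypto` module; the function names and the event fields are assumptions, not part of any particular compliance standard.

```javascript
const crypto = require("crypto");

const GENESIS_HASH = "0".repeat(64);

function hashEntry(entry) {
  return crypto.createHash("sha256").update(JSON.stringify(entry)).digest("hex");
}

// Return a new log with `event` appended and linked to the previous entry's hash.
function appendAuditEntry(log, event) {
  const prevHash = log.length > 0 ? log[log.length - 1].hash : GENESIS_HASH;
  const entry = { ...event, prevHash };
  return [...log, { ...entry, hash: hashEntry(entry) }];
}

// Recompute every hash and check every link; false means the log was altered.
function verifyAuditLog(log) {
  let prevHash = GENESIS_HASH;
  for (const { hash, ...entry } of log) {
    if (entry.prevHash !== prevHash) return false;
    if (hashEntry(entry) !== hash) return false;
    prevHash = hash;
  }
  return true;
}
```

Storing the latest hash alongside the recording (or in a separate system) lets an auditor confirm later that the log has not been edited.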

Hosting the Video Infrastructure Properly

Running WebRTC infrastructure requires careful server deployment.

A production setup usually includes:

Load Balancer
     │
     ▼
Signaling Servers (Node.js)
     │
     ▼
Mediasoup SFU Nodes
     │
     ▼
Storage + Database

Step 1: Use Dedicated Servers or High-Performance Cloud

WebRTC workloads are network intensive.

Recommended infrastructure:

AWS EC2 (c6a / c5n instances)
DigitalOcean
Hetzner dedicated servers
Google Cloud

Recommended server specs:

8–16 CPU cores
32GB RAM
High network bandwidth

Step 2: Configure UDP Ports

Mediasoup listens for media on the port range given to the worker, over UDP and, when enableTcp is set, over TCP as a fallback.

Example configuration:

javascript

const workerSettings = {
  rtcMinPort: 40000,
  rtcMaxPort: 49999
};

Ensure the firewall allows this range.

Example:

bash

ufw allow 40000:49999/udp
ufw allow 40000:49999/tcp   # only needed when enableTcp is true

Step 3: Use a Reverse Proxy

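Use NGINX or Traefik in front of the signaling servers (media traffic goes straight to the SFU ports and does not pass through the proxy) to handle: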

HTTPS termination
WebSocket forwarding
Load balancing

Example NGINX config:

nginx

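server {
    listen 443 ssl;
    server_name video.example.com;

    # Placeholder paths: point these at your own certificate and key.
    ssl_certificate     /etc/ssl/certs/video.example.com.pem;
    ssl_certificate_key /etc/ssl/private/video.example.com.key;

    location /socket.io/ {
        proxy_pass http://localhost:3000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }
}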

Step 4: Use TURN Servers

Some users are behind strict NATs or firewalls that block UDP.

TURN servers relay traffic when a direct connection to the SFU fails.

Popular TURN server:

coturn

Example config:

listening-port=3478
fingerprint
lt-cred-mech
realm=example.com
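Clients also need credentials for the TURN server. A common way to avoid shipping static passwords is coturn's shared-secret mode, where the application server issues short-lived credentials. The sketch below assumes coturn is additionally configured with `use-auth-secret` and a `static-auth-secret` (not shown in the config above); the function names and the host name are placeholders.

```javascript
const crypto = require("crypto");

// Time-limited TURN credentials: the username embeds the expiry time,
// and the password is an HMAC-SHA1 of the username under the shared secret.
function createTurnCredentials(userId, sharedSecret, ttlSeconds, nowSeconds = Math.floor(Date.now() / 1000)) {
  const username = `${nowSeconds + ttlSeconds}:${userId}`;
  const credential = crypto.createHmac("sha1", sharedSecret).update(username).digest("base64");
  return { username, credential };
}

// Build the iceServers array a browser passes to its WebRTC configuration.
function buildIceServers(turnHost, credentials) {
  return [
    {
      urls: [
        `turn:${turnHost}:3478?transport=udp`,
        `turn:${turnHost}:3478?transport=tcp`,
      ],
      username: credentials.username,
      credential: credentials.credential,
    },
  ];
}
```

The signaling server would generate these per session and send them to the client along with the transport parameters, so leaked credentials expire on their own.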

Step 5: Scale with Multiple SFU Nodes

As usage grows, deploy multiple Mediasoup nodes.

Load Balancer
   │
   ├── SFU Node 1
   ├── SFU Node 2
   └── SFU Node 3

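The signaling layer assigns each session to a node, ideally the nearest one with spare capacity. Both participants of a session must land on the same node unless the nodes are bridged.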
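Node assignment can be as simple as picking the healthy node with the lowest relative load. This sketch assumes each node reports hypothetical `healthy`, `activeSessions`, and `maxSessions` fields to the signaling layer; how those numbers are collected is outside its scope.

```javascript
// Pick the healthy SFU node with the lowest load ratio, or null if all are full.
function pickSfuNode(nodes) {
  const candidates = nodes.filter(
    (node) => node.healthy && node.activeSessions < node.maxSessions
  );
  if (candidates.length === 0) return null;

  return candidates.reduce((best, node) =>
    node.activeSessions / node.maxSessions < best.activeSessions / best.maxSessions
      ? node
      : best
  );
}
```

Returning `null` when every node is full gives the signaling server a clear point at which to queue the session or trigger scale-out.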

Step 6: Monitor Performance

Important metrics include:

packet loss
RTT
bitrate
jitter
CPU usage

Use monitoring tools:

Prometheus
Grafana
WebRTC Stats API
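Most of these metrics are derived from the difference between two stats snapshots. The sketch below turns two `inbound-rtp` reports, as returned by the WebRTC Stats API, into a bitrate, a loss percentage, and a jitter value. The field names (`timestamp`, `bytesReceived`, `packetsReceived`, `packetsLost`, `jitter`) come from the stats specification; the function name is an assumption.

```javascript
// Summarize the interval between two inbound-rtp stats snapshots.
function summarizeInboundStats(previous, current) {
  const seconds = (current.timestamp - previous.timestamp) / 1000; // timestamps are in ms
  if (seconds <= 0) {
    throw new Error("snapshots must be in chronological order");
  }

  const received = current.packetsReceived - previous.packetsReceived;
  const lost = current.packetsLost - previous.packetsLost;
  const expected = received + lost;

  return {
    bitrateKbps: ((current.bytesReceived - previous.bytesReceived) * 8) / seconds / 1000,
    packetLossPercent: expected > 0 ? (lost / expected) * 100 : 0,
    jitterMs: current.jitter * 1000, // the spec reports jitter in seconds
  };
}
```

Sampling once per second or so and exporting the results as gauges is usually enough to spot a degrading session on a dashboard.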

Optimizing for Low Latency

To maintain smooth real-time video:

Adaptive Bitrate

Automatically reduce quality for slower networks.

Simulcast

The sender encodes the same video at several qualities:

Low
Medium
High

Mediasoup forwards to each receiver the highest layer that receiver's estimated bandwidth can sustain.
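The layer choice can be illustrated with a small selection function. The encodings follow the standard `RTCRtpEncodingParameters` shape (`rid`, `scaleResolutionDownBy`, `maxBitrate`); the bitrates are illustrative assumptions, and `pickSpatialLayer` is a hypothetical helper, not Mediasoup's internal algorithm.

```javascript
// Three simulcast encodings, ordered from lowest to highest quality.
const simulcastEncodings = [
  { rid: "low", scaleResolutionDownBy: 4, maxBitrate: 150_000 },
  { rid: "medium", scaleResolutionDownBy: 2, maxBitrate: 500_000 },
  { rid: "high", scaleResolutionDownBy: 1, maxBitrate: 1_500_000 },
];

// Index of the highest layer whose maxBitrate fits the available bandwidth.
// Falls back to the lowest layer so the receiver always gets some video.
function pickSpatialLayer(encodings, availableBps) {
  let chosen = 0;
  encodings.forEach((encoding, index) => {
    if (encoding.maxBitrate <= availableBps) chosen = index;
  });
  return chosen;
}
```

When the application wants to override the automatic choice, for example to pin the agent's view to the highest quality, the chosen index can be passed to Mediasoup's `consumer.setPreferredLayers({ spatialLayer })`.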

Regional Deployment

Deploy SFU clusters globally:

India
Europe
US

Users connect to the closest server.
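"Closest" is best measured rather than guessed from geography. One approach is for the client to time a small request to each region and pick the fastest. The sketch below covers only the selection step and assumes the round-trip times (in milliseconds) have already been measured; the region keys are placeholders.

```javascript
// Pick the region with the lowest measured round-trip time.
// Regions that failed to respond (non-finite RTT) are ignored.
function pickClosestRegion(rttByRegion) {
  const reachable = Object.entries(rttByRegion).filter(([, rtt]) =>
    Number.isFinite(rtt)
  );
  if (reachable.length === 0) return null;

  reachable.sort((a, b) => a[1] - b[1]);
  return reachable[0][0];
}
```

For a two-party verification call, the user and the agent may measure different closest regions, so the platform has to decide whose latency to favor when assigning the session.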

Final Thoughts

Real-time video systems are far more complex than traditional web applications.

But when built correctly, they power critical systems such as:

Identity verification
Secure onboarding
Remote education
Telehealth platforms

Using WebRTC + Mediasoup, developers can build highly scalable, low-latency video infrastructure while maintaining full control over security and compliance.

A solid architecture typically includes:

WebRTC for real-time media
Mediasoup as an SFU router
Node.js signaling servers
TURN servers for connectivity
Secure authentication
Scalable cloud infrastructure

When these components work together, you get a production-grade video platform capable of handling thousands of concurrent verification sessions.

Most importantly, you get a system users can trust when it matters most.

Welcome to the world of real-time communication engineering. 🚀
