Stream Engine

Attendance from a camera, no device at the door. The Cams Stream Engine decodes an RTSP/ONVIF or file video stream, runs the same face pipeline as the Face Match Engine on every frame, tracks each person so they are punched once per appearance, and records attendance into the standard Cams pipeline.

Overview

Turns a CCTV, RTSP, or ONVIF video stream into face-recognition attendance: one punch per person per appearance, with video that never leaves your network.

The Stream Engine turns a live video stream (RTSP/ONVIF IP camera, HTTP, RTMP, or file) into biometric attendance. It decodes the stream, runs the shared face pipeline (SCRFD detect, 5-point align, ArcFace embed, 1:N identify over the warm gallery), and uses an IoU tracker so each person produces a single deduped punch per appearance. A multi-camera supervisor reads a configuration table and runs one worker per active camera with hot add, change, and remove, and a developer-portal form self-provisions cameras. Streams stay on the local network; only attendance events leave the edge box.
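
As a rough illustration of the last pipeline stage (1:N identify), the sketch below compares one probe embedding against a small in-memory gallery by cosine similarity and accepts the best match only if it clears a threshold. The function names, the dictionary gallery layout, the 3-dimensional toy vectors, and the 0.5 threshold are assumptions made for this example; the engine's actual gallery format and default threshold are not documented on this page.

```python
import math
from typing import Dict, List, Optional, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(
    probe: List[float],
    gallery: Dict[str, List[float]],  # hypothetical layout: user id -> embedding
    threshold: float = 0.5,           # assumed value, not the engine's default
) -> Tuple[Optional[str], float]:
    """Return (user_id, score) for the best match, or (None, score) if below threshold."""
    best_id: Optional[str] = None
    best_score = -1.0
    for user_id, embedding in gallery.items():
        score = cosine(probe, embedding)
        if score > best_score:
            best_id, best_score = user_id, score
    if best_score < threshold:
        return None, best_score
    return best_id, best_score


# Toy usage: real ArcFace embeddings are much higher-dimensional.
gallery = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
match, score = identify([0.9, 0.1, 0.0], gallery)
unknown, _ = identify([0.0, 0.0, 1.0], gallery)
```

Here `match` is `"alice"` and `unknown` is `None`, since the second probe is orthogonal to every gallery entry. A linear scan is enough for a small gallery; a larger one would likely want a vectorized or indexed search.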

ENGINE · RTSP · ONVIF · RTMP · HTTP · file · C++
Point it at any RTSP/ONVIF camera
# single camera: <port> <stream-url> <device-sn>
stream-engine 9030 rtsp://cam.local/stream1 CAM-GATE-01
# or supervisor mode (DB-driven, multi-camera)
stream-engine 9030
# face detect -> align -> ArcFace -> 1:N -> one punch / appearance

Capabilities

What it does

CCTV Attendance

Recognizes faces in a live RTSP/ONVIF stream and records attendance with no dedicated terminal at the door.

One Punch per Appearance

An IoU tracker assigns a track per face so each person is recorded once per appearance, not once per frame.
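
To make the idea concrete, here is a minimal sketch of a greedy IoU tracker. It assumes boxes in `(x1, y1, x2, y2)` form, a 0.3 overlap threshold, greedy matching, and tracks that are dropped as soon as they miss one frame. All of these are simplifications chosen for the example, not the engine's documented behavior.

```python
from typing import Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed box format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


class IouTracker:
    """Greedy tracker: each detection joins the best-overlapping free track."""

    def __init__(self, min_iou: float = 0.3) -> None:  # 0.3 is an assumed threshold
        self.min_iou = min_iou
        self.tracks: Dict[int, Box] = {}
        self._next_id = 1

    def update(self, detections: List[Box]) -> List[int]:
        """Return one track id per detection, in input order."""
        free = dict(self.tracks)       # tracks not yet claimed in this frame
        current: Dict[int, Box] = {}
        ids: List[int] = []
        for det in detections:
            best_id: Optional[int] = None
            best = self.min_iou
            for track_id, box in free.items():
                overlap = iou(det, box)
                if overlap >= best:
                    best_id, best = track_id, overlap
            if best_id is None:        # no overlap: a new appearance starts
                best_id = self._next_id
                self._next_id += 1
            else:
                del free[best_id]
            current[best_id] = det
            ids.append(best_id)
        self.tracks = current          # unmatched tracks are dropped (simplification)
        return ids


tracker = IouTracker()
frame1 = tracker.update([(0, 0, 10, 10)])
frame2 = tracker.update([(1, 1, 11, 11)])                    # same face, moved slightly
frame3 = tracker.update([(1, 1, 11, 11), (50, 50, 60, 60)])  # a second face enters
```

The same face keeps track id 1 across the three frames and the newcomer gets id 2, so a punch tied to the first identification on a track fires once per appearance rather than once per frame. A production tracker would usually tolerate a few missed frames before closing a track.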

Multi-Camera Fleet

A supervisor runs one worker per active camera and picks up added or removed cameras within seconds, without a restart.

Edge-Local Privacy

Runs on an on-prem or edge box near the cameras; the video stream never leaves the local network, only attendance events do.

Self-Provision

The CCTV portal form registers the camera, and the supervisor starts it within one poll interval.

Features

Everything included

  • Decodes RTSP/ONVIF IP-camera, HTTP, RTMP, or file video streams
  • Shared face pipeline: SCRFD detect, 5-point align, ArcFace embed, 1:N cosine over the warm gallery
  • IoU face tracker punches each person once per appearance, deduped by a configurable window (default 60s)
  • Multi-camera supervisor: one worker thread per active camera; the configuration table is polled for hot add, change, and remove
  • Per-camera threshold, minimum face size, and dedup overrides
  • Each camera is a registered device serial, attributing punches to a tenant and IN/OUT direction
  • Auto-reconnect for live streams with capped exponential backoff
  • Self-provisioning via the developer-portal CCTV onboarding form
  • Keeps video on the local network; only attendance events leave the edge box
  • GPU-ready for dense-crowd throughput; CPU is fine for classroom and low-fps use
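
The auto-reconnect item in the list above mentions capped exponential backoff. A minimal sketch of that retry schedule follows, assuming a 1 s base delay, doubling on each failure, and a 30 s cap. These numbers, and the `open_stream` callback, are hypothetical; the engine's real values and reconnect API are not stated here.

```python
import time
from typing import Callable, List


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Delay before retrying after failed attempt number `attempt` (0-based).

    Doubles from `base` and never exceeds `cap`. The exponent is clamped so
    very large attempt counts cannot overflow a float.
    """
    return min(cap, base * (2 ** min(attempt, 32)))


def reconnect(
    open_stream: Callable[[], bool],           # hypothetical: True once connected
    max_attempts: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Try to open the stream; return attempts used, or -1 if all attempts fail."""
    for attempt in range(max_attempts):
        if open_stream():
            return attempt + 1
        sleep(backoff_delay(attempt))
    return -1


# Usage with a fake camera that comes back on the third try and a fake sleep
# that records the delays instead of waiting.
results = iter([False, False, True])
slept: List[float] = []
attempts_used = reconnect(lambda: next(results), max_attempts=5, sleep=slept.append)
schedule = [backoff_delay(i) for i in range(7)]
```

With these assumed parameters the schedule is 1, 2, 4, 8, 16, 30, 30 seconds. The cap keeps a camera that has been offline for hours from waiting arbitrarily long once it returns. Adding random jitter is a common refinement when many cameras share one network.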

Build with Stream Engine

Grab a key, read the docs, and ship. Our team helps with your first integration.

FAQ

Common questions

What cameras does the Stream Engine support?

RTSP and ONVIF IP cameras, HTTP streams, RTMP, and video files. Bridges for bulb cameras and Android-as-camera setups are documented on the developer portal.

Does the video leave my network?

No. The engine runs on an edge or on-prem box near the cameras; the stream stays local and only the resulting attendance events are sent to the server.

How does it avoid punching the same person repeatedly?

An IoU tracker assigns a track id per face and emits one attendance event per appearance, with a configurable per-user dedup window (default 60s).
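
The per-user dedup window can be sketched as a small last-punch table. The class and method names below are illustrative, and measuring the window from the last emitted punch (rather than the last sighting) is an assumption; only the 60-second default comes from the description above.

```python
from typing import Dict


class PunchDeduper:
    """Suppress repeat punches for the same user inside a time window."""

    def __init__(self, window_s: float = 60.0) -> None:
        self.window_s = window_s
        self._last_punch: Dict[str, float] = {}  # user id -> time of last emitted punch

    def should_emit(self, user_id: str, now_s: float) -> bool:
        """Return True and record the punch unless one was emitted within the window."""
        last = self._last_punch.get(user_id)
        if last is not None and now_s - last < self.window_s:
            return False
        self._last_punch[user_id] = now_s
        return True


dedup = PunchDeduper()
events = [
    dedup.should_emit("u1", 0.0),    # first sighting: emit
    dedup.should_emit("u1", 30.0),   # same user 30 s later: suppressed
    dedup.should_emit("u2", 30.0),   # different user: emit
    dedup.should_emit("u1", 60.0),   # window elapsed: emit again
]
```

This complements the tracker: the tracker collapses frames into appearances, while the window catches a person whose track was lost and restarted, or who was seen by the same camera twice in quick succession.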

Can it handle multiple cameras?

Yes. A supervisor reads a camera configuration table and runs one worker per active camera, polling so cameras can be added, changed, or removed live.
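
One way to picture a poll cycle is as a diff between the cameras currently running and the cameras the configuration table says should be active. The sketch below assumes each camera is keyed by device serial and that its configuration can be compared as a single stream-URL string. The function name, the serials other than the page's own CAM-GATE-01 example, and the URLs are hypothetical.

```python
from typing import Dict, List, Tuple


def reconcile(
    running: Dict[str, str],  # device serial -> stream URL of the live worker
    desired: Dict[str, str],  # device serial -> stream URL from the config table
) -> Tuple[List[str], List[str], List[str]]:
    """Return (to_start, to_restart, to_stop) lists of device serials."""
    to_start = sorted(sn for sn in desired if sn not in running)
    to_stop = sorted(sn for sn in running if sn not in desired)
    to_restart = sorted(
        sn for sn in desired if sn in running and running[sn] != desired[sn]
    )
    return to_start, to_restart, to_stop


running = {
    "CAM-GATE-01": "rtsp://cam.local/stream1",
    "CAM-LOBBY-02": "rtsp://lobby.local/stream1",
}
desired = {
    "CAM-GATE-01": "rtsp://cam.local/stream2",   # URL changed: restart worker
    "CAM-DOCK-03": "rtsp://dock.local/stream1",  # new camera: start worker
}
to_start, to_restart, to_stop = reconcile(running, desired)
```

Running this diff on every poll means the table stays the single source of truth: the supervisor starts, restarts, or stops workers to match it, and an unchanged table yields three empty lists.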
