Streaming RTSP cameras on web: the easy way with FFMPEG and Websocket



Many industries are nowadays exploiting the benefits of automation by using intelligent cameras for the surveillance of workplaces or factories, to monitor the safety of employees or to improve visual quality inspection processes. Considering the wide variety of industrial applications, there are many different cameras, parameters and requirements to be considered when starting a project. Depending on the context, for example, low latency, security and persistence are requirements that must be addressed to ensure the solution is developed successfully.

For one industrial use case (streaming the cameras of a visual quality inspection system), the developers in the DX Solutions Development Centre encountered a challenge: how to stream RTSP cameras on the web easily. Their objective was to minimize the latency and data-processing time of the web stream, because applications such as surveillance or the monitoring of a production line depend on a continuous flow of video images.


The solution identified exploits FFMPEG and Websocket:

  1. The first step is converting RTSP to MPEG-1 through FFmpeg. As an open-source project, FFmpeg provides a set of tools and libraries for recording, converting and playing back audio and video in different formats, which is why it is so widely used for its flexibility and power.
  2. The second step is opening a WebSocket on a specific port and outputting the converted data stream through it.

WebSocket offers two key advantages:

  • Persistent Connection: When a WebSocket connection is established, it remains continuously open. There’s no need to repeatedly initiate new connections for data exchange.
  • Low Latency: WebSocket ensures minimal latency due to its persistent connection and full-duplex communication, where both the client and server can transmit data simultaneously without waiting. This results in significantly reduced latency compared to conventional polling-based methods.

In sharp contrast to traditional HTTP requests, where the client initiates a request and waits for a server response, WebSocket has transformed real-time communication. This technology enables seamless data exchange between both parties without the need to constantly re-establish the connection, allowing us to stream RTSP cameras on the web while meeting the requirements of our application.
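To make the contrast concrete, here is a minimal sketch of how a browser client would consume such a stream: the connection is opened once and binary video frames then arrive as messages, with no per-frame request. The host, port and function names are illustrative, not part of the original solution.

```javascript
// Build the WebSocket URL for a camera stream
// (host and port are example values; adjust to your own setup).
function makeStreamUrl(host, wsPort) {
  return `ws://${host}:${wsPort}`;
}

// Open one persistent connection; every incoming message is a chunk
// of the MPEG-1 stream, delivered as binary data.
function connectToStream(url, onFrame) {
  const socket = new WebSocket(url);
  socket.binaryType = 'arraybuffer'; // video frames are binary, not text
  socket.onmessage = (event) => onFrame(event.data);
  return socket;
}
```

With polling, each of those frames would instead cost a full request/response round trip; here the server simply pushes them as they are produced.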


This approach has been designed by the DX-Solutions Development Centre’s team working on Video Solutions Services. In particular, under the direction of the Technology Manager Matteo Tacconi, the team has been working to solve the challenge of using RTSP cameras, and they are now sharing some insights with us.

If you have any questions about our approach, read the full detailed description in the next section and feel free to contact our researchers today.


RTSP (Real Time Streaming Protocol) is a streaming protocol that controls the transmission of data: like a TV remote control, it can start, stop and pause playback. At the transport layer, however, another protocol, RTP (Real-time Transport Protocol), is used to transmit the stream in real time.

Playing RTSP (RTP) directly from the browser, without going through a support server that performs operations on the data stream, is possible, though not exactly straightforward. The biggest obstacle is that most browsers do not support the protocol natively, so external libraries and/or plug-ins are needed to make the data stream viewable. That’s why our method consists of the following steps:

  1. Converting RTSP to MPEG via FFmpeg. To install FFmpeg, simply go to the official site and download one of the latest auto-builds, extract the files from the archive and add the path of the executable to the PATH environment variable. Then open a terminal and type “ffmpeg” followed by the command options. For example, to convert a video file from one format to another, use the command “ffmpeg -i input.avi output.mp4”.
  2. Using a support server to receive the data stream directly from the cameras and wait for commands to start or stop the streaming of individual cameras. In our case, we wrote a simple REST API server, directly connected to the camera stream, which converts the data stream from the RTSP (RTP) format to the MPEG-1 Video format via FFmpeg, outputting frame after frame of the converted video via WebSocket. For our support server, we chose Node.js, a JavaScript runtime environment for creating server-side applications. In particular, we took advantage of the Fastify framework, which is light and fast and allows the creation and management of a REST API server in a very intuitive way. Both are well documented, so we will skip the initial setup of the application and concentrate on the streaming part. Starting from an initial configuration file, we obtained: the camera address, the WebSocket port (through which the converted video is transmitted) and an identification name for each camera.
  3. The stream object is a class with two basic methods: one for starting a new streaming process and one for terminating it. Streaming is handled by the node-rtsp-stream library, which starts a new process, secondary to that of the application, in which the video is converted. It does this via a system call that launches an instance of FFmpeg and creates a communication channel between it and our application. Specifically, the secondary process converts the data stream from RTSP to MPEG-1 Video and, by opening a WebSocket on the specified port, outputs the converted data stream through it.
  4. For the playback of converted video streams, a viable option is to use a video library such as JsmpegPlayer. This JavaScript library offers a specialised video player for decoding and displaying MPEG-1 video streams via WebSocket connections. In detail, a new JSMpeg.VideoElement object is created with the following parameters:
    – els.videoWrapper: The DOM element in which the video will be inserted. This is the div created in the render() method.
    – this.props.videoUrl: The URL of the MPEG video you wish to play.
    – this.props.options: Any additional options for configuring the video player.
    – this.props.overlayOptions: Any options for overlays on the video.
    – The last two parameters are not set in our example.
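The start/stop logic of the support server described in steps 2 and 3 can be sketched as follows. This is not the team’s exact implementation: the camera configuration values and function names are illustrative, and the Stream constructor is injected so the sketch stays self-contained (node-rtsp-stream exposes a Stream class taking `name`, `streamUrl` and `wsPort` options).

```javascript
// Per-camera configuration, as described above: address, WebSocket
// port and an identifying name (values are examples).
// A manager keeps track of the running secondary processes.
function createStreamManager(cameras, StreamCtor) {
  const active = new Map(); // camera name -> running Stream instance

  return {
    // Start streaming a camera: spawns the FFmpeg secondary process
    // and opens the WebSocket on the camera's configured port.
    start(name) {
      const cam = cameras[name];
      if (!cam) throw new Error(`unknown camera: ${name}`);
      if (!active.has(name)) {
        active.set(name, new StreamCtor({
          name: cam.name,
          streamUrl: cam.streamUrl, // rtsp:// address of the camera
          wsPort: cam.wsPort,       // port the converted video is served on
        }));
      }
      return cam.wsPort; // the client connects here to play the video
    },

    // Stop streaming: terminates the secondary process.
    stop(name) {
      const stream = active.get(name);
      if (stream) {
        stream.stop();
        active.delete(name);
      }
    },

    isActive(name) {
      return active.has(name);
    },
  };
}

// Wiring this into a Fastify route could then look like
// (assumed endpoint, for illustration only):
//   fastify.post('/streams/:name/start', async (req) => ({
//     wsPort: manager.start(req.params.name),
//   }));
```

Injecting the constructor also makes the start/stop logic easy to exercise without cameras attached, by passing a stub in place of node-rtsp-stream’s Stream.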
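On the browser side, the four parameters listed in step 4 come together roughly as below. The helper function and its defaults are assumptions added for illustration; the JSMpeg.VideoElement call and its argument order are as described above.

```javascript
// Assemble the JSMpeg.VideoElement arguments from the component's
// props (prop names follow the article; the empty-object defaults
// are an assumption for when no options are passed).
function buildPlayerArgs(els, props) {
  return [
    els.videoWrapper,           // DOM element the video is inserted into
    props.videoUrl,             // ws:// URL of the MPEG-1 stream
    props.options || {},        // extra player options (unset in our example)
    props.overlayOptions || {}, // overlay options (unset in our example)
  ];
}

// In the component, once render() has created the wrapper div:
//   this.player = new JSMpeg.VideoElement(...buildPlayerArgs(els, this.props));
```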

If you need to run the application in a Docker container (Node.js Alpine), you must add the “rtsp_transport” flag followed by “tcp” to the spawning of the secondary process in the “mpeg1muxer.js” file of the node-rtsp-stream library: while Windows takes it for granted, the container needs it to run the process.
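The change amounts to prepending the transport flag to the FFmpeg argument list before the process is spawned. The sketch below is illustrative, not the library’s actual code; the argument order follows standard FFmpeg usage, where input options must appear before -i.

```javascript
// Build the argument list for the secondary FFmpeg process.
// '-rtsp_transport tcp' must come before '-i' (it is an input option);
// the remaining flags mirror the RTSP-to-MPEG-1 conversion described above.
function buildFfmpegArgs(rtspUrl) {
  return [
    '-rtsp_transport', 'tcp', // force RTSP over TCP: needed in the Alpine container
    '-i', rtspUrl,            // the camera's RTSP address
    '-f', 'mpeg1video',       // MPEG-1 video output for JSMpeg playback
    '-',                      // write to stdout, piped on to the WebSocket
  ];
}

// The library then spawns the secondary process along the lines of:
//   child_process.spawn('ffmpeg', buildFfmpegArgs(streamUrl));
```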



An interesting functionality for a streaming application (especially for those dedicated to surveillance) is to allow the user to draw on-screen areas of interest or indicate critical points of transit.

To do this, playing the stream through the JsmpegPlayer component is not sufficient. To enable user interactions with the video stream, we must exploit the potential of the HTML5 <canvas> element.

We therefore created a parent component called StreamingComponent, which acts as a wrapper and combines video playback with user-interaction functionality. JsmpegPlayer, which we know by now, provides the video playback; within the wrapper we then added a new component, ImageWithLineDrawing, which lets us draw the thresholds.

This is done via a function within the ImageWithLineDrawing component: when the user clicks on the video stream, the coordinates within the canvas are captured and the threshold and arrow are drawn.

This dedicated area allows users to draw directly on the stream, outlining critical transition points. Through the points drawn within the canvas, users can create straight lines and arrows to indicate the direction of passage.
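The click handling inside a component like ImageWithLineDrawing can be sketched as below. The function names and the two-click interaction (first click sets the start of the threshold, second click completes it) are assumptions for illustration; the drawing calls are the standard canvas 2D API.

```javascript
// Collect pairs of clicks on the canvas overlay and draw the
// resulting threshold line with the canvas 2D context.
function createLineDrawer(ctx) {
  let start = null; // pending first click, if any

  return function handleClick(x, y) {
    if (!start) {
      start = { x, y }; // first click: remember the starting point
      return null;
    }
    const line = { from: start, to: { x, y } };
    start = null;

    // Draw the threshold segment on the <canvas> overlay.
    ctx.beginPath();
    ctx.moveTo(line.from.x, line.from.y);
    ctx.lineTo(line.to.x, line.to.y);
    ctx.stroke();

    return line; // the caller can use this to draw the direction arrow
  };
}
```

In the component, the handler would receive coordinates relative to the canvas, e.g. `event.clientX - canvas.getBoundingClientRect().left`, so the drawn line lines up with the video underneath.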