GStreamer camera fps performance using glx vs egl

Hi,

We have noticed lower FPS with the OV5645 camera when using egl (20fps) compared to glx (30fps).

This can be reproduced on the linaro alip images by using the following commands:
OpenGL (30fps):
GST_GL_PLATFORM=glx GST_GL_API=opengl gst-launch-1.0 -v v4l2src device=/dev/video3 ! video/x-raw,width=1280,height=720,format=NV12 ! fpsdisplaysink video-sink=glimagesink

GLES2 (20fps):
GST_GL_PLATFORM=egl GST_GL_API=opengl gst-launch-1.0 -v v4l2src device=/dev/video3 ! video/x-raw,width=1280,height=720,format=NV12 ! fpsdisplaysink video-sink=glimagesink

This is an issue for us, as when using wayland images (wayland only supports GLES2), we have performance degradation to 20fps.

Has this been noticed before, and if so, are there any hints as to where the problem lies (i.e. the MESA driver, or internally within gstreamer)?

Many thanks

A few things to clarify:
EGL: EGL is an interface between Khronos rendering APIs (such as OpenGL, OpenGL ES or OpenVG) and the underlying native platform windowing system. - Wikipedia

GLX: GLX (initialism for “Open GL Extension to the X Window System”) is an extension to the X Window System core protocol providing an interface between OpenGL and the X Window System - Wikipedia

EGL is NOT limited to OpenGL ES; it also works with regular OpenGL.
GLX, however, is limited to OpenGL.

(wayland only supports GLES2)

No, Wayland supports both GL and ES2 and more. It is however limited to EGL.
BUT, you CAN run GLX applications on wayland using XWayland. In debian just install it using sudo apt install xwayland

Now if you want to stick with native wayland, you can try the following (OpenGL3 is supported on the db410c, so try both opengl3 and opengl and use whichever works best):
GST_GL_WINDOW=wayland GST_GL_API=opengl3 gst-launch-1.0 -v v4l2src device=/dev/video3 ! video/x-raw,width=1280,height=720,format=NV12 ! fpsdisplaysink video-sink=glimagesink

Hi,

Many thanks for your reply - it definitely helps explain the differences between EGL/GLX and GL/ES2.

I have tried setting GST_GL_API to opengl3, but the performance hit is still there whenever we are using EGL.

Ideally we’d like to use EGL (and can’t understand why it’s slower, as the DMABUF path is only present in EGL for the dragonboard according to the thread "Does 'glimagesink' used Qualcomm MDP for layer compose?", post #6 by ndufresne). As soon as we use GLX, the performance goes up to 30fps.

Thanks

What's even more interesting: if I force v4l2src to use mmap instead of dmabuf, I get 30fps:

GST_GL_PLATFORM=egl gst-launch-1.0 -v v4l2src device=/dev/video3 io-mode=mmap ! video/x-raw,width=1280,height=720,format=NV12,framerate=30/1 ! fpsdisplaysink video-sink=glimagesink

On the linaro images, if I use egl with io-mode set to dmabuf, I get 20fps:
GST_GL_PLATFORM=egl gst-launch-1.0 -v v4l2src device=/dev/video3 io-mode=dmabuf ! video/x-raw,width=1280,height=720,format=NV12,framerate=30/1 ! fpsdisplaysink video-sink=glimagesink

If I use glx with io-mode set to dmabuf, I get 30fps:
GST_GL_PLATFORM=glx gst-launch-1.0 -v v4l2src device=/dev/video3 io-mode=dmabuf ! video/x-raw,width=1280,height=720,format=NV12,framerate=30/1 ! fpsdisplaysink video-sink=glimagesink

Does this mean that dmabuf is not working correctly on egl?

There are two things to consider with DMABuf/zero-copy. The first one is the cost of mapping in software. To avoid this one you can simply set text-overlay=0 on fpsdisplaysink and use the -v option of gst-launch-1.0 to still see the framerate.

The second issue with zero-copy is that buffers are kept alive longer. What generally happens is that, due to GL latency, they are kept long enough that v4l2src ends up starving, and frame drops then occur inside the v4l2 driver. A first improvement would be to add a thread between v4l2src and the GL pipeline (e.g. v4l2src ! queue ! glimagesink). I don’t know which version you are running, but in recent GStreamer versions we also increase the minimum number of allocated buffers in v4l2src by 1 in order to help mitigate this issue.
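
Putting both suggestions together, a minimal sketch of a test pipeline could look like the one below. It assumes the same device node, caps and EGL platform as the commands earlier in the thread. The DEVICE, CAPS and PIPELINE variable names are only there for illustration, and the pipeline is launched only if the camera and gst-launch-1.0 are actually present.

```shell
#!/bin/sh
# Sketch only: device node and caps are copied from the commands earlier in the thread.
DEVICE=/dev/video3
CAPS="video/x-raw,width=1280,height=720,format=NV12,framerate=30/1"

# queue puts the sink in its own streaming thread, so capture is not blocked
# while a frame is being rendered; text-overlay=false avoids mapping frames
# in software just to draw the fps text.
PIPELINE="v4l2src device=$DEVICE io-mode=dmabuf ! $CAPS ! queue ! fpsdisplaysink video-sink=glimagesink text-overlay=false"

echo "gst-launch-1.0 -v $PIPELINE"

# Only run when the camera and GStreamer are actually available.
if [ -e "$DEVICE" ] && command -v gst-launch-1.0 >/dev/null 2>&1; then
    GST_GL_PLATFORM=egl gst-launch-1.0 -v $PIPELINE
fi
```

With -v, the measured framerate still shows up in the fpsdisplaysink last-message output even though the overlay is disabled.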

Hi,

Thank you for your response - much appreciated.

I have removed the text-overlay from fpsdisplaysink just in case there is some software overlay happening (the fps is still 20fps).

Ah yes, adding another thread with a queue does seem to fix it, but we are a little worried about the latency added due to the queue.

We are using GStreamer version 1.16.0 (but our testing linaro images are using 1.14.4)

Is there another solution apart from a queue to stop the v4l2 driver dropping frames?

Many thanks

It’s a misconception that queues explicitly add latency. Queue elements in GStreamer absorb latency: they increase what we call the max latency (basically the capacity to keep buffers around to compensate for downstream latency). There is a small kernel scheduling latency, but if your system is not too loaded, it won’t be noticeable (probably less than 100 microseconds).
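
To put the numbers in perspective, and for anyone who still wants to cap how much the queue can hold, here is a rough sketch. It assumes the standard queue properties max-size-buffers, max-size-bytes and max-size-time (0 disables a limit). The 2-buffer limit is an arbitrary example value, not a recommendation, and the script only prints the pipeline rather than running it.

```shell
#!/bin/sh
# Sketch only: the 2-buffer limit is an arbitrary example value.
FPS=30
MAX_BUFFERS=2

# Frame period in milliseconds, for comparison with the ~100 microsecond
# scheduling cost mentioned above.
FRAME_MS=$(awk -v fps="$FPS" 'BEGIN { printf "%.1f", 1000 / fps }')
echo "frame period: ${FRAME_MS} ms"

# Bound the queue by buffer count only; 0 disables the byte and time limits.
QUEUE="queue max-size-buffers=$MAX_BUFFERS max-size-bytes=0 max-size-time=0"
echo "v4l2src device=/dev/video3 io-mode=dmabuf ! video/x-raw,width=1280,height=720,format=NV12,framerate=30/1 ! $QUEUE ! fpsdisplaysink video-sink=glimagesink"
```

At 30fps a frame lasts about 33.3 ms, so a scheduling cost of around 100 microseconds is well under 1% of a frame.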

For other solutions, you’ll have to go and experiment with changing GStreamer code. I suspect the GL pipeline is not requesting enough buffers; I don’t think it changes that number depending on whether it uses DMABuf import or copies the buffers to GL (the latter is the one that yields the better framerate for you). Another aspect is that glimagesinkelement (the sink element inside the glimagesink bin) is not fully asynchronous, so it increases the push-back time, which impacts v4l2src.

Something that happens sometimes is bad V4L2 HW timestamps. That would have to be checked in your kernel. In GStreamer we don’t support any HW that provides timestamps with more than 1 frame of latency; maybe that’s your issue? It’s a common issue with random UVC firmwares.
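
One way to eyeball the timestamps from the GStreamer side, as a rough sketch: dump the per-buffer pts with fakesink and look for gaps or jumps. This shows timestamps as v4l2src reports them, not the raw kernel values, so it is only a first check. The device node and caps are the ones used earlier in the thread, and the variable names are illustrative.

```shell
#!/bin/sh
# Sketch only: prints per-buffer timestamps so gaps or jumps can be spotted.
DEVICE=/dev/video3
CAPS="video/x-raw,width=1280,height=720,format=NV12,framerate=30/1"

# fakesink silent=false together with -v reports each buffer (including its
# pts) as a last-message line; num-buffers stops capture after 60 frames.
PIPELINE="v4l2src device=$DEVICE num-buffers=60 ! $CAPS ! fakesink silent=false"

echo "gst-launch-1.0 -v $PIPELINE"

# Only run when the camera and GStreamer are actually available.
if [ -e "$DEVICE" ] && command -v gst-launch-1.0 >/dev/null 2>&1; then
    gst-launch-1.0 -v $PIPELINE | grep "last-message"
fi
```

At 30fps, consecutive pts values would be expected to advance by roughly 33 ms; irregular steps would point towards the timestamp problem described above.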

Thank you for your response - really helpful!

We are using the linaro 4.14.96 kernel for the dragonboard, and are using the OV5645 over MIPI CSI, and using the Qualcomm CAMSS drivers. Is this what you mean by UVC firmware?

Thanks,