Qt Quick Direct3D 12 Adaptation
The Direct3D 12 adaptation for Windows 10, both in Win32 (windows platform plugin) and in UWP (winrt platform plugin), is shipped as a dynamically loaded plugin. This adaptation doesn't work on earlier Windows versions. Building this plugin is enabled automatically whenever the necessary D3D and DXGI development files are present. In practice, this currently means Visual Studio 2015 and newer.
The adaptation is available both in normal, OpenGL-enabled Qt builds and when Qt is configured with -no-opengl. However, it's never the default, meaning that the user or the application has to explicitly request it by setting the QT_QUICK_BACKEND environment variable to d3d12 or by calling QQuickWindow::setSceneGraphBackend().
This experimental adaptation is the first Qt Quick backend that focuses on a modern, lower-level graphics API in combination with a windowing system interface that's different from the traditional approaches used in combination with OpenGL.
This adaptation also allows better integration with Windows, as Direct3D is the primary vendor-supported solution. Consequently, there are fewer problems anticipated with drivers, operations like window resizes, and special events like graphics device loss caused by device resets or graphics driver updates.
Performance-wise, the general expectation is a somewhat lower CPU usage compared to OpenGL, due to lower driver overhead, and a higher GPU utilization with less idle time. The backend doesn't heavily utilize threads yet, which means there are opportunities for further improvements in the future, for example to further optimize image loading.
The D3D12 backend also introduces support for pre-compiled shaders. All the backend's own shaders (used by the built-in materials on which the Rectangle, Image, Text, and other QML types are built) are compiled to D3D shader bytecode when you compile Qt. Applications using ShaderEffect items can ship bytecode in regular files or via the Qt resource system, or use High Level Shading Language for DirectX (HLSL) source strings. Unlike with OpenGL, HLSL compilation is properly threaded, meaning shader compilation won't block the application and its user interface.
The plugin does not necessarily require hardware acceleration. You can also use WARP, the Direct3D software rasterizer. By default, the first adapter providing hardware acceleration is chosen. To override this and use another graphics adapter or to force the use of the software rasterizer, set the QT_D3D_ADAPTER_INDEX environment variable to the index of the adapter. The adapters discovered are printed at startup when QSG_INFO or the qt.scenegraph.general logging category is enabled.
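The selection rule described above can be sketched as follows. This is not the plugin's code. The Adapter struct and chooseAdapter() are hypothetical, and the fallback to the default rule when the requested index is out of range is an assumption made for the sake of the example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified description of one enumerated adapter.
struct Adapter {
    std::string name;
    bool hardware;   // false for a software rasterizer such as WARP
};

// Returns the index of the adapter to use, or -1 if none qualifies.
// requestedIndex < 0 means that no override was requested.
int chooseAdapter(const std::vector<Adapter> &adapters, int requestedIndex)
{
    // An explicit, valid index wins, even if it names a software adapter.
    if (requestedIndex >= 0
        && static_cast<std::size_t>(requestedIndex) < adapters.size())
        return requestedIndex;

    // Default rule from the text: first adapter with hardware acceleration.
    // (Falling back here for an invalid index is an assumption.)
    for (std::size_t i = 0; i < adapters.size(); ++i) {
        if (adapters[i].hardware)
            return static_cast<int>(i);
    }
    return -1;
}
```

With a list of { WARP, GPU 0, GPU 1 }, no override yields index 1, while an override of 0 forces the software rasterizer.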
If you encounter issues, always set the QT_D3D_DEBUG environment variable to 1 to get debug and warning messages printed on the debug output. QT_D3D_DEBUG enables the Direct3D debug layer.
Note: The debug layer shouldn't be enabled in production use, since it can significantly impact performance (CPU load) due to increased API overhead.
By default, the D3D12 adaptation uses a single-threaded render loop similar to OpenGL's windows render loop. A threaded variant is also available, which you can request by setting the QSG_RENDER_LOOP environment variable to threaded. However, due to conceptual limitations in DXGI, the windowing system interface, the threaded loop is prone to deadlocks when multiple QQuickWindow or QQuickView instances are shown. Consequently, for the time being, the default is the single-threaded loop. This means that with the D3D12 backend, applications are expected to move their work from the main (GUI) thread out to worker threads, instead of expecting Qt to keep the GUI thread responsive and suitable for heavy, blocking operations.
The scene graph renderer in the D3D12 adaptation currently doesn't perform any batching. This is less of an issue than with OpenGL, because state changes don't present any problems in the first place. The simpler renderer logic can also lead to lower CPU overhead in some cases. The trade-offs between the various approaches are currently under research.
The ShaderEffect QML type is fully functional with the D3D12 adaptation as well. However, the interpretation of the fragmentShader and vertexShader properties is different than with OpenGL.
With D3D12, these strings can be a URL for a local file, a URL for a file in the resource system, or an HLSL source string. Using a URL for a local file or a file in the resource system indicates that the file in question contains pre-compiled D3D shader bytecode generated by the fxc tool, or, alternatively, HLSL source code. The type of file is detected automatically. This means that the D3D12 backend supports all options from GraphicsInfo.shaderCompilationType and GraphicsInfo.shaderSourceType.
Unlike with OpenGL, files are always opened through a QFileSelector with the extra hlsl selector. This makes it easy to create ShaderEffect items that are functional across both backends: place the GLSL source code into shaders/effect.frag and the HLSL source code or, preferably, pre-compiled bytecode into shaders/+hlsl/effect.frag, then simply write fragmentShader: "qrc:shaders/effect.frag" in QML. For more details, see ShaderEffect.
Multisample Render Targets
The Direct3D 12 adaptation ignores the QSurfaceFormat set on the QQuickWindow or QQuickView, or set via QSurfaceFormat::setDefaultFormat(), with two exceptions: QSurfaceFormat::samples() and QSurfaceFormat::alphaBufferSize() are still taken into account. When the sample value is greater than 1, multisample offscreen render targets will be created with the specified sample count at the maximum supported quality level. The backend automatically performs resolving into the non-multisample swapchain buffers after each frame.
When the alpha channel is enabled either via QQuickWindow::setDefaultAlphaBuffer() or by setting alphaBufferSize to a non-zero value in the window's QSurfaceFormat or in the global format managed by QSurfaceFormat::setDefaultFormat(), the D3D12 backend will create a swapchain for composition and go through DirectComposition. This is necessary, because the mandatory flip model swapchain wouldn't support transparency otherwise.
Therefore, it's important not to unnecessarily request an alpha channel. When the alphaBufferSize is 0 or the default -1, all these extra steps can be avoided and the traditional window-based swapchain is sufficient.
On WinRT, this isn't relevant because the backend there always uses a composition swapchain which is associated with the ISwapChainPanel that backs QWindow on that platform.
Mipmap generation is supported and handled transparently to the applications via a built-in compute shader. However, at the moment, this feature is experimental and only supports power-of-two images. Textures of other sizes work too, but this involves a QImage-based scaling on the CPU first. Therefore, avoid enabling mipmapping for Non-Power-Of-Two (NPOT) images whenever possible.
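An application that wants to follow this advice could check image dimensions before enabling mipmapping. The following is a minimal sketch. The function names are hypothetical, and it assumes that "power-of-two image" means that both the width and the height are powers of two.

```cpp
#include <cstdint>

// True if v is a power of two (1, 2, 4, 8, ...). Zero is not.
bool isPowerOfTwo(std::uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

// Hypothetical helper: true when both dimensions are powers of two,
// that is, when no CPU-side scaling should be needed before mipmapping.
bool mipmapAvoidsCpuScaling(std::uint32_t width, std::uint32_t height)
{
    return isPowerOfTwo(width) && isPowerOfTwo(height);
}
```

For example, a 256x128 image passes the check, while a 300x128 image does not and would be scaled on the CPU first.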
When creating textures via C++ scene graph APIs like QQuickWindow::createTextureFromImage(), 32-bit formats won't involve any conversion; they map directly to the corresponding B8G8R8A8_UNORM format. Everything else will trigger a QImage-based format conversion on the CPU first.
Particles and some other OpenGL-dependent utilities, like QQuickFramebufferObject, are currently not supported.
As with the Software adaptation, text is always rendered using the native method. Distance field-based text rendering is currently not implemented.
The shader sources in the Qt Graphical Effects module have not been ported to any format other than the OpenGL 2.0 compatible one, meaning that the QML types provided by that module are currently not functional with the D3D12 backend.
Texture atlases are currently not in use.
The renderer may lack support for certain minor features, such as drawing points and lines with a width other than 1.
Custom Qt Quick items using custom scene graph nodes can be problematic because materials are inherently tied to the graphics API. Therefore, only items that use the utility rectangle and image nodes are functional across all adaptations.
QQuickWidget and its underlying OpenGL-based compositing architecture are not supported. If you need to mix with QWidget-based user interfaces, use QWidget::createWindowContainer() to embed the native window of the QQuickWindow or QQuickView.
Finally, rendering via QSGEngine and QSGAbstractRenderer is not feasible with the D3D12 adaptation at the moment.
To integrate custom Direct3D 12 rendering, use QSGRenderNode in combination with QSGRendererInterface. This approach doesn't rely on OpenGL contexts or API specifics like framebuffers, and allows exposing the graphics device and command buffer from the adaptation. It's not necessarily suitable for easy integration of all types of content, in particular true 3D, so it'll likely get complemented by an alternative to QQuickFramebufferObject in future releases.
To perform runtime decisions based on the adaptation, use QSGRendererInterface from C++ and GraphicsInfo from QML. They can also be used to check the level of shader support: shading language, compilation approach, and so on.
When creating custom items, use the new QSGRectangleNode and QSGImageNode classes. These replace the now deprecated QSGSimpleRectNode and QSGSimpleTextureNode. Unlike their predecessors, these new classes are interfaces, and implementations are created via the QQuickWindow::createRectangleNode() and QQuickWindow::createImageNode() factory functions.
The D3D12 adaptation can keep multiple frames in flight, similar to modern game engines. This is somewhat different from the traditional "render - swap - wait for vsync" model and allows for better GPU utilization at the expense of higher resource use. This means that the renderer will be a number of frames ahead of what is displayed on the screen.
For a discussion of flip model swap chains and the typical configuration parameters, refer to Sample Application for Direct3D 12 Flip Model Swap Chains.
Vertical synchronization is always enabled, meaning Present() is invoked with an interval of 1.
The configuration can be changed by setting the following environment variables:
|The number of swap chain buffers in range 2 - 4. The default value is 3.|
|The number of frames prepared without blocking in range 1 - 4. The default value is 2. Present() starts blocking after queuing 3 frames (regardless of |
|The frame latency in range 1 - 16. The default value is 0 (disabled). Changes the limit for Present() and triggers a wait for an available swap chain buffer when beginning each frame. For a detailed discussion, see the article linked above.|
Note: Currently, this behavior is experimental.
|The time the CPU should wait, a non-zero value, for the GPU to finish its work after each call to Present(). The default value is 0 (disabled). This behavior effectively kills all parallelism but makes the behavior resemble the traditional swap-blocks-for-vsync model, which can be useful in some special cases. However, this behavior is not the same as setting the frame count to 1 because that still avoids blocking after Present(), and may only block when starting to prepare the next frame (or may not block at all depending on the time gap between the frames).|