How client-side rendering works
This feature is not available yet, but we are already writing docs.
Track progress on GitHub and discuss in the #web-renderer channel on Discord.
The biggest challenge of client-side rendering is that the browser provides no way to capture its own viewport.
Only certain HTML elements and React components such as <canvas>, <svg>, <Video>, <Audio>, and <Img> can be captured.
Unlike in server-side rendering, where a pixel-perfect screenshot is made, in client-side rendering Remotion only takes the capturable elements and composites them onto a single canvas.
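This compositing step can be sketched as a loop that draws each captured element onto a single output canvas. The `Layer` shape and `compositeFrame` helper below are illustrative assumptions, not Remotion's actual internals:

```typescript
// A captured element: its pixels plus where it sat in the DOM.
// Hypothetical shape for illustration; not Remotion's real internals.
type Layer = {
  pixels: unknown; // e.g. an ImageBitmap of the captured element
  x: number; // placement in the composition, in pixels
  y: number;
  width: number;
  height: number;
};

// Minimal structural type so the sketch does not depend on DOM typings.
type Ctx2D = {
  drawImage: (img: unknown, x: number, y: number, w: number, h: number) => void;
};

// Draw every capturable element onto the single output canvas,
// back-to-front in DOM order so stacking is preserved.
function compositeFrame(ctx: Ctx2D, layers: Layer[]): void {
  for (const layer of layers) {
    ctx.drawImage(layer.pixels, layer.x, layer.y, layer.width, layer.height);
  }
}
```

Because elements are drawn in DOM order, later elements paint over earlier ones, approximating the browser's default stacking.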
Rendering process
Initialization
First, the component is mounted in the DOM in a place where it is not visible to the user.
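Hiding the mount point cannot use `display: none`, since the component must still be laid out and measurable; instead the container is kept out of view. A minimal sketch, where the exact style values are assumptions rather than Remotion's implementation:

```typescript
// Styles that keep the container laid out and paintable, but invisible.
// display: none would break layout and measurement, so the container is
// moved off-screen instead. These exact values are illustrative assumptions.
const OFFSCREEN_STYLE: Record<string, string> = {
  position: 'fixed',
  top: '0',
  left: '-99999px', // push far outside the viewport
  pointerEvents: 'none', // the user must never interact with it
  opacity: '0',
};

// Minimal structural type so the sketch does not require DOM typings.
type StylableElement = { style: Record<string, string> };

function hideMountPoint(el: StylableElement): void {
  for (const [key, value] of Object.entries(OFFSCREEN_STYLE)) {
    el.style[key] = value;
  }
}
```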
Frame capture process
For each frame that needs to be rendered, the renderer uses element.querySelectorAll() to find all elements that can be captured, including <canvas>, <svg>, <audio>, <video>, <img>, and other supported element types.
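The lookup can be sketched as a single combined selector query. The selector string and helper below are illustrative assumptions, not the renderer's exact code:

```typescript
// DOM tag names the renderer can capture pixels or audio from.
// The exact list is an assumption based on the supported element types.
const CAPTURABLE_SELECTOR = 'canvas, svg, video, audio, img';

// Minimal structural type standing in for a DOM element or document root.
type QueryRoot = { querySelectorAll: (sel: string) => ArrayLike<unknown> };

// Find every capturable element under the mounted composition.
function findCapturableElements(root: QueryRoot): unknown[] {
  return Array.from(root.querySelectorAll(CAPTURABLE_SELECTOR));
}
```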
For each capturable element, the renderer:
- Goes up the DOM tree and resets all `transform` CSS properties to `none`.
- Gets the bounding box using `.getBoundingClientRect()`, as well as the bounding boxes of the parent elements.
- Adds up the transforms and positions to determine the original placement of the element in the DOM.
- Captures the pixels of the element.
- Draws them to the canvas according to the calculated placement.
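The placement calculation in the steps above can be sketched for the simple case where ancestor transforms are plain translations. This is a deliberate simplification: real CSS transforms require full matrix composition.

```typescript
// A simplified stand-in for a measured bounding box.
type Rect = { x: number; y: number; width: number; height: number };

// A simplified ancestor transform: a pure 2D translation.
// Real CSS transforms (rotate, scale, skew) need matrix math; this
// sketch only illustrates how offsets accumulate up the tree.
type Translation = { dx: number; dy: number };

// Given the element's rect measured with all transforms reset to `none`,
// re-apply the ancestors' translations to recover where the element
// should be drawn on the output canvas.
function resolvePlacement(rect: Rect, ancestors: Translation[]): Rect {
  let { x, y } = rect;
  for (const t of ancestors) {
    x += t.dx;
    y += t.dy;
  }
  return { x, y, width: rect.width, height: rect.height };
}
```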
In addition, audio from mounted <Audio> and <Video> elements is captured, mixed, and added to the audio track of the output video.
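Mixing can be sketched as summing the sample buffers of all mounted sources with clamping. This is a simplification: real mixing also handles sample-rate conversion, channel layouts, and per-source volume.

```typescript
// Mix several mono sample buffers into one by summing samples, clamping
// to [-1, 1] so overlapping loud sources do not overflow.
// Simplified sketch: no resampling, channel mapping, or volume handling.
function mixAudio(sources: Float32Array[], length: number): Float32Array {
  const out = new Float32Array(length);
  for (const src of sources) {
    for (let i = 0; i < Math.min(src.length, length); i++) {
      out[i] += src[i];
    }
  }
  for (let i = 0; i < length; i++) {
    out[i] = Math.max(-1, Math.min(1, out[i]));
  }
  return out;
}
```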
Encoding
Mediabunny is used to encode the frames and processed audio into a video file.
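The encoding loop can be sketched against a minimal encoder interface. This is an illustration only: Mediabunny's actual API differs, and the `Encoder` type below is an assumption made for this sketch.

```typescript
// Minimal encoder interface for illustration; Mediabunny's real API
// is different. Timestamps are in microseconds, a common convention
// for video encoders.
type Encoder = {
  encode: (frame: unknown, timestampUs: number, keyFrame: boolean) => void;
  flush: () => Promise<void>;
};

// Timestamp of frame `index` at `fps` frames per second, in microseconds.
function frameTimestampUs(index: number, fps: number): number {
  return Math.round((index / fps) * 1_000_000);
}

// Feed every composited frame to the encoder, requesting a key frame
// at a fixed interval, then flush any buffered output.
async function encodeFrames(
  encoder: Encoder,
  frames: unknown[],
  fps: number,
  keyFrameInterval = 30,
): Promise<void> {
  frames.forEach((frame, i) => {
    encoder.encode(frame, frameTimestampUs(i, fps), i % keyFrameInterval === 0);
  });
  await encoder.flush();
}
```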
Context isolation
Renders happen in the same browser environment as your app. This means CSS and Tailwind variables will automatically work, but you run the risk of conflicts with the host page.
See Limitations for details on how to ensure your code works with client-side rendering.