2026 · NSS Background Remover · About 8 min read · Novus Stream Solutions
Web Workers and OffscreenCanvas: keeping the UI smooth during heavy AI work
Everything a web page does — every click, scroll, and animation — runs on one main thread by default, including any heavy work you put there. Run a model or a video render on that thread and the page freezes. Web Workers and OffscreenCanvas are how you keep the interface alive while the heavy work happens.
Overview
A web browser runs the code for a page on a single main thread, and that thread does almost everything: it responds to clicks, runs animations, lays out and paints the page, and executes your JavaScript. The crucial consequence is that the thread can only do one thing at a time. If you hand it a task that takes two seconds (running a machine-learning model over an image, encoding a frame of video, processing a large file), then for those two seconds it cannot respond to anything. Buttons do not depress, scrolling can stall, hover effects stop updating, and the whole page appears frozen. It is not actually broken; it is busy, and being busy on the main thread is indistinguishable from being broken as far as the user is concerned.
For an app whose entire premise is doing heavy work in the browser — removing backgrounds with a real model, rendering a beat-synced video, restoring a photo — this is the central engineering constraint. The capability that makes the app worth using is exactly the thing that would freeze it if you put it in the obvious place. Web Workers and OffscreenCanvas are the standard answer: they let you move heavy work onto separate threads so the main thread stays free to keep the interface alive. This article is about what those two technologies actually do, how data moves between threads without killing the benefit, and when the added complexity is worth it.
Why one blocked thread freezes everything
It helps to understand the mechanism rather than just the rule, because the mechanism tells you exactly what to move and what to leave. The browser maintains an event loop on the main thread: a queue of tasks — a click handler here, an animation frame there, your function call next — that it works through one at a time. Each task runs to completion before the next begins. A click that arrives while a long task is running does not interrupt it; it waits in the queue until the long task finishes. So a single function that takes two seconds does not slow the page down by a little, it stops the page for two seconds, because every other task — including the repaint that would show a spinner — is stuck behind it in the same queue.
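The run-to-completion rule is easy to see in a toy model. The sketch below is not the browser's real scheduler; `Task`, `Scheduled`, and `simulateQueue` are hypothetical names, and the durations are made up to match the two-second example above.

```typescript
// Toy model of a run-to-completion task queue (illustrative only).
interface Task {
  name: string;
  arrivesAtMs: number; // when the task is queued
  durationMs: number;  // how long it runs once started
}

interface Scheduled {
  name: string;
  startsAtMs: number;
  waitedMs: number;
}

function simulateQueue(tasks: Task[]): Scheduled[] {
  // Tasks are served in arrival order, one at a time.
  const ordered = [...tasks].sort((a, b) => a.arrivesAtMs - b.arrivesAtMs);
  let clockMs = 0;
  return ordered.map((task) => {
    // A task cannot start before it arrives, nor before the previous one ends.
    const startsAtMs = Math.max(clockMs, task.arrivesAtMs);
    clockMs = startsAtMs + task.durationMs;
    return { name: task.name, startsAtMs, waitedMs: startsAtMs - task.arrivesAtMs };
  });
}

const timeline = simulateQueue([
  { name: "run model", arrivesAtMs: 0, durationMs: 2000 },
  { name: "click handler", arrivesAtMs: 100, durationMs: 1 },
  { name: "paint spinner", arrivesAtMs: 100, durationMs: 4 },
]);
console.log(timeline);
```

In this model the click handler arrives at 100 ms but cannot start until 2000 ms, and the spinner paint waits even longer. Neither is slow in itself; both are simply queued behind the long task.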
This is why the usual half-measures do not help. Showing a loading spinner before you start the heavy work does not work if the work is on the main thread, because the browser never gets a chance to paint the spinner until the work is done — by which point you do not need it. Breaking the task into chunks and yielding between them helps a little but turns simple code into a state machine and still steals time from the thread that should be handling input. The only real fix is to get the heavy work off the main thread entirely, onto a thread whose being busy does not block the interface. That thread is a Web Worker.
What a Web Worker actually is
A Web Worker is a separate JavaScript thread with its own execution context, running in parallel with the main thread. You hand it a script, it runs independently, and crucially it does not share the main thread’s event loop — so when a worker spends two seconds running a model, the main thread is completely unaffected and keeps handling clicks and animations the whole time. The worker cannot touch the page directly: it has no access to the document, the DOM, or the elements on screen. That restriction sounds limiting but is the entire point — by being unable to touch the UI, the worker cannot block it, and the wall between them is what keeps the interface responsive.
Communication between the two happens by message passing. The main thread posts a message to the worker — “here is an image, run the model on it” — and the worker posts a message back when it is done — “here is the result”. Each side registers a handler for incoming messages and they talk asynchronously, never sharing variables directly. This message-passing model is what keeps the parallelism safe: there is no shared mutable state to corrupt, just messages crossing a boundary. The mental model is two people in separate rooms passing notes under the door, rather than two people reaching into the same drawer at once, and that separation is exactly why it works.
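As a rough sketch of what those notes under the door might look like, the message shapes can be written down as types and the worker's side as a plain function. Everything named here (`WorkerRequest`, `WorkerReply`, `handleRequest`, the `worker.ts` file) is an assumption for illustration, and the "model" is a stand-in that clears the alpha of any pixel whose red channel is above 200.

```typescript
// Hypothetical message shapes for a background-removal worker.
type WorkerRequest = { id: number; kind: "removeBackground"; pixels: Uint8ClampedArray };
type WorkerReply = { id: number; kind: "done"; pixels: Uint8ClampedArray };

// The worker's side of the conversation, written as a plain function so it
// can be exercised without a browser. Pixels are RGBA, four bytes each.
function handleRequest(req: WorkerRequest): WorkerReply {
  const result = new Uint8ClampedArray(req.pixels); // work on a copy
  for (let i = 0; i < result.length; i += 4) {
    if (result[i] > 200) result[i + 3] = 0; // stand-in for real inference
  }
  return { id: req.id, kind: "done", pixels: result };
}

// In a browser the wiring would look roughly like this:
//   // worker.ts (assumed file name)
//   self.onmessage = (e) => self.postMessage(handleRequest(e.data));
//   // main thread
//   const worker = new Worker(new URL("./worker.ts", import.meta.url), { type: "module" });
//   worker.onmessage = (e) => console.log("result for request", e.data.id);
//   worker.postMessage({ id: 1, kind: "removeBackground", pixels });

const reply = handleRequest({
  id: 1,
  kind: "removeBackground",
  pixels: new Uint8ClampedArray([255, 0, 0, 255, 10, 20, 30, 255]),
});
console.log(reply.id, Array.from(reply.pixels));
```

Keeping the handler a pure function of its message is a design choice rather than a requirement, but it means the heavy logic can be tested on its own and the thread boundary stays a thin layer of wiring.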
The data-transfer trap, and how to avoid it
There is a catch that turns the naive version of this into a performance trap. By default, when you post a message to a worker, the data is copied — serialized on one side and rebuilt on the other. For a small message that is nothing, but for a large image or a multi-megabyte buffer, copying it across the boundary can cost as much as the work you were trying to offload, and it briefly doubles the memory used. Offloading the model run only to spend the saved time copying the image back and forth is a real way to make things slower while feeling clever, and it is the mistake that sours teams on workers.
The fix is transferable objects. Instead of copying certain kinds of data — array buffers, image bitmaps, offscreen canvases — you transfer ownership of them to the worker. The underlying memory is handed over rather than duplicated: it becomes unusable on the sending side and available on the receiving side, with no copy and no doubling. This makes passing a large image to a worker nearly free, which is what makes the whole architecture pay off for image and video work. Knowing which data is transferable, and structuring your messages so the heavy payloads are transferred rather than copied, is most of the practical skill in using workers well.
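The difference between copying and transferring can be observed without a worker at all, because `structuredClone` applies the same structured-clone algorithm that `postMessage` uses. A minimal sketch, assuming a runtime that provides `structuredClone` (modern browsers, Node 17 or later); the 4 MiB size is arbitrary and stands in for decoded image pixels.

```typescript
// Two equally sized buffers standing in for decoded image pixels.
const copiedSource = new ArrayBuffer(4 * 1024 * 1024);
const transferredSource = new ArrayBuffer(4 * 1024 * 1024);

// Default behaviour: a second, independent copy is made.
const copy = structuredClone(copiedSource);

// With a transfer list, ownership moves and the source is detached.
const moved = structuredClone(transferredSource, { transfer: [transferredSource] });

console.log(copiedSource.byteLength, copy.byteLength);       // 4194304 4194304
console.log(transferredSource.byteLength, moved.byteLength); // 0 4194304

// The worker equivalent, assuming `worker` is a Worker and `buf` an ArrayBuffer:
//   worker.postMessage({ pixels: buf }, [buf]);
```

After the copy, both buffers exist and memory use has doubled. After the transfer, the source reports a length of zero: the bytes now belong to the receiver, which is why the sender must not touch the buffer again.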
OffscreenCanvas: letting a worker draw
Web Workers solve computation, but they create a second problem for anything visual. A worker cannot touch the DOM, and a canvas element is part of the DOM, so a worker cannot draw to the screen directly. For an app that renders video frames or composites images, that is a serious limitation — the rendering is precisely the heavy, frame-by-frame work you want off the main thread, but the canvas it draws into lives on the main thread the worker is forbidden from touching. Without a bridge, you are forced to do the rendering on the main thread after all, which puts you right back where you started.
OffscreenCanvas is that bridge. It lets you detach a canvas’s drawing surface from the DOM and transfer it to a worker, which can then draw to it directly, on its own thread, with the results appearing on screen without ever involving the main thread in the per-frame work. For a music visualizer rendering sixty frames a second or an editor compositing layers, this is what makes smooth playback possible while the rest of the interface — the controls, the timeline, the buttons — stays fully responsive on the main thread. The rendering and the interaction genuinely happen in parallel, which is the only way both can stay smooth at once, and it is why the export engine described in /product-blog/when-the-preview-matches-the-export can keep the preview fluid while it works.
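A minimal sketch of the hand-off follows. The browser call is `transferControlToOffscreen()` on the canvas element; `startRendering`, the `{ kind: "init" }` message, and the two small interfaces are hypothetical, written so the sketch type-checks outside a browser. In a browser, an `HTMLCanvasElement` and a `Worker` are the intended arguments.

```typescript
// Minimal structural shapes (assumptions) standing in for the browser types.
interface CanvasLike {
  transferControlToOffscreen(): object;
}
interface WorkerLike {
  postMessage(message: unknown, transfer: object[]): void;
}

// Detach the canvas's drawing surface and hand it to a worker.
function startRendering(canvas: CanvasLike, worker: WorkerLike): void {
  const offscreen = canvas.transferControlToOffscreen();
  // The surface is listed as a transferable so ownership moves to the worker.
  worker.postMessage({ kind: "init", canvas: offscreen }, [offscreen]);
}

// Worker side, roughly (browser only, not executed here):
//   self.onmessage = (e) => {
//     if (e.data.kind !== "init") return;
//     const ctx = e.data.canvas.getContext("2d");
//     const draw = () => {
//       ctx.fillRect(0, 0, 10, 10); // per-frame drawing goes here
//       requestAnimationFrame(draw);
//     };
//     requestAnimationFrame(draw);
//   };
```

Once the surface has been transferred, the main thread keeps the canvas element in the layout but no longer draws into it; all per-frame work happens in the worker.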
The honest cost: complexity
None of this is free, and the cost is complexity. Code split across threads is harder to write, harder to debug, and harder to reason about than code that runs in one place. Everything becomes asynchronous and message-based: you cannot just call a function and get a result, you post a message and wait for one, and errors that would be a simple thrown exception on one thread become messages you have to route and handle across the boundary. Debugging spans two contexts, and a bug caused by the order in which messages arrive is a genuinely harder thing to chase than a bug in straight-line code. This is real overhead, and it is why you should not reflexively put everything in a worker.
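One common way to tame both problems, errors and ordering, is to treat a failure as just another reply and to match replies to requests by id. The sketch below shows one possible shape under those assumptions; `JobReply`, `runJob`, `pending`, and `deliver` are hypothetical names, and callbacks are used instead of promises to keep it short.

```typescript
// Hypothetical reply shape: a failure is just another kind of message.
type JobReply<T> =
  | { id: number; ok: true; value: T }
  | { id: number; ok: false; error: string };

// Worker side: run a job and always produce a reply, even when it throws.
function runJob<T>(id: number, job: () => T): JobReply<T> {
  try {
    return { id, ok: true, value: job() };
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    return { id, ok: false, error };
  }
}

// Main-thread side: replies are matched by id, not by arrival order.
const pending = new Map<number, (reply: JobReply<number>) => void>();
function deliver(reply: JobReply<number>): void {
  const callback = pending.get(reply.id);
  pending.delete(reply.id);
  callback?.(reply);
}

const log: string[] = [];
pending.set(1, (r) => log.push(r.ok ? `1 ok ${r.value}` : `1 failed: ${r.error}`));
pending.set(2, (r) => log.push(r.ok ? `2 ok ${r.value}` : `2 failed: ${r.error}`));

// Replies arrive out of order, and job 1 threw inside the "worker".
deliver(runJob(2, () => 21 * 2));
deliver(runJob(1, () => { throw new Error("model failed to load"); }));
console.log(log);
```

This is still more machinery than a thrown exception on one thread, which is the point of the paragraph above: the boundary has a cost, and it is paid in code like this.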
The judgement is to move the heavy, isolatable work and leave everything else where it is simple. Model inference, video encoding, large image processing, parsing a big file — these are bounded tasks with a clear input and output, which makes them ideal worker candidates: the complexity of the boundary buys you a responsive UI during work that would otherwise freeze it. Light, frequent, UI-coupled logic should stay on the main thread, because moving it adds boundary complexity for no real gain. The same restraint applies here as with reliability work in /product-blog/reliability-hardening-honest-failures: add the machinery where it earns its keep, and resist it everywhere it does not.
What the user feels
The reward for all this is that the user never experiences the heavy work as a freeze. They start a background removal and can still scroll the page, read the help text, or queue up the next image while the model runs. They watch a visualizer preview play smoothly while they drag a slider. The work still takes as long as it takes — moving it to a worker does not make the model faster — but the interface stays alive throughout, and an app that stays responsive while it works feels dramatically more trustworthy than one that goes dead and makes the user wonder if it has crashed.
That perceived liveness is the whole return on the architecture. A frozen interface during a two-second task is not just unpleasant; it is ambiguous, because the user cannot tell a busy app from a broken one, and ambiguity is what makes people give up and reload. Keeping the main thread free turns a frightening freeze into a visibly-working wait, and a visibly-working wait is something users will happily sit through. The threading is invisible; what they notice is that the app never stops listening to them, which is exactly the impression a heavy in-browser tool needs to make.
Frequently asked questions
Quick answers to common questions about this topic.
When should I use a Web Worker instead of just running the code normally?
Use a worker for heavy, bounded tasks with a clear input and output — model inference, video encoding, large image processing, parsing a big file — that would otherwise block the main thread for more than a frame or two. Keep light, frequent, UI-coupled logic on the main thread, because moving it adds message-passing complexity for no real responsiveness gain.
Why does my app still freeze even though I added a loading spinner?
Because the spinner and the heavy work are on the same thread. The browser cannot paint the spinner until the current task finishes, so if the heavy work runs on the main thread the spinner only appears after the work is already done. The fix is to move the heavy work to a worker so the main thread is free to paint the spinner and respond to input while the work runs.
What are transferable objects and why do they matter?
By default, data sent to a worker is copied, which is expensive for large payloads like images and briefly doubles memory. Transferable objects (array buffers, image bitmaps, offscreen canvases) are instead transferred: ownership of the underlying memory moves to the worker with no copy, and the object becomes unusable on the sender side. Using transfer rather than copy for big payloads is what keeps the cost of crossing the boundary from cancelling out the benefit of offloading image and video work.
What is OffscreenCanvas for?
A worker cannot touch the DOM, and a normal canvas is part of the DOM, so a worker cannot draw to the screen directly. OffscreenCanvas lets you transfer a canvas’s drawing surface to a worker so it can render directly on its own thread — essential for smooth per-frame rendering, like a video export or a visualizer, while the main thread stays responsive to controls.
What is the main downside of using workers?
Complexity. Code split across threads is asynchronous and message-based, harder to debug, and spans two contexts, and errors must be routed across the boundary rather than simply thrown. That overhead is worth it for genuinely heavy, isolatable work, which is why you should move those tasks to a worker and leave simple, UI-coupled logic on the main thread.