Mouse lag when using the Immersive Translate plugin’s LLM translation on certain websites, plus some suggestions from the LLM

Also, the stuttering is only reproducible on certain websites. I had the LLM read through the plugin code; later I’ll ask an expert to take a look and see if there are any ideas for optimization.

From the perspective of the plugin’s architecture and implementation, the performance bottlenecks caused by LLM translation can be addressed along the following dimensions:

  1. Introduce an asynchronous processing architecture (Web Workers)
    Currently, a large amount of regex matching, HTML parsing, and translated-text comparison in the plugin may all be executed in the Content Script (main thread).
  • Optimization plan: migrate logic that does not touch the DOM directly (such as LLM streaming protocol parsing, text preprocessing, and multilingual comparison algorithms) into a Web Worker.
  • Effect: free up the main thread so the browser can keep responding to scrolling and clicks at close to 60fps while translation data is being processed.
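As a rough illustration of the kind of DOM-free logic that could move into a worker, here is a minimal sketch of a stream parser. The wire format (newline-delimited `data: <text>` lines ending with `data: [DONE]`), the names `parseStreamChunk`, `ParseState`, `ParseResult`, and the `worker.ts` file name are all assumptions for illustration, not taken from the plugin's code.

```typescript
// Assumed wire format (not confirmed for the plugin): newline-delimited
// "data: <text>" lines, terminated by "data: [DONE]".
interface ParseState {
  carry: string; // incomplete trailing line kept for the next chunk
}

interface ParseResult {
  texts: string[]; // payloads of the complete lines found in this chunk
  done: boolean;   // true once the terminator line has been seen
}

function parseStreamChunk(state: ParseState, chunk: string): ParseResult {
  const lines = (state.carry + chunk).split("\n");
  // The last piece has no newline after it yet, so it may be incomplete.
  state.carry = lines.pop() ?? "";
  const texts: string[] = [];
  let done = false;
  for (const raw of lines) {
    const line = raw.replace(/\r$/, ""); // tolerate CRLF line endings
    if (!line.startsWith("data:")) continue;
    const payload = line.slice(5).replace(/^ /, ""); // drop one leading space
    if (payload === "[DONE]") {
      done = true;
      break;
    }
    texts.push(payload);
  }
  return { texts, done };
}

// Worker-side wiring (hypothetical worker.ts), shown as a comment because it
// needs a worker global scope:
//   const state: ParseState = { carry: "" };
//   self.onmessage = (e) => self.postMessage(parseStreamChunk(state, e.data));
```

The content script would then only receive already-parsed `texts` via `postMessage` and do the DOM write itself, since workers cannot access the DOM.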
  2. Implement “render batching” and “time slicing” (Scheduling)
    With streaming output, the worst case is updating the DOM once per received character.
  • Optimization plan:
    • Buffering: set a 100ms–200ms buffer window, merge the chunks the LLM returns within it, and then trigger a single DOM update via requestAnimationFrame.
    • Time Slicing: if a large number of translation nodes must be inserted at once, use an approach similar to React Fiber: split the work into many small tasks and run them a few at a time when the browser is idle (requestIdleCallback).
  • Effect: turn bursts of reflows into evenly paced updates, significantly reducing peak CPU load.
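The buffering and time-slicing ideas above can be sketched as follows. This is only a sketch under assumptions: `createChunkBuffer`, `runInSlices`, `Schedule`, `Deadline`, and `ScheduleIdle` are hypothetical names, and the schedulers are passed in as parameters so the logic does not depend on browser globals.

```typescript
type Schedule = (cb: () => void) => void;

// Merges all chunks pushed before the scheduled callback runs into one apply().
function createChunkBuffer(apply: (text: string) => void, schedule: Schedule) {
  let pending = "";
  let scheduled = false;
  return {
    push(chunk: string): void {
      pending += chunk;
      if (scheduled) return; // a flush is already queued
      scheduled = true;
      schedule(() => {
        scheduled = false;
        const text = pending;
        pending = "";
        apply(text); // one DOM write for everything buffered so far
      });
    },
  };
}

// Same shape as the IdleDeadline object passed by requestIdleCallback.
interface Deadline {
  timeRemaining(): number;
}
type ScheduleIdle = (cb: (deadline: Deadline) => void) => void;

// Runs tasks in order, yielding back to the scheduler when the slice is used up.
function runInSlices(tasks: Array<() => void>, scheduleIdle: ScheduleIdle): void {
  let next = 0;
  const step = (deadline: Deadline): void => {
    // Always run at least one task per slice so the queue keeps draining.
    do {
      const task = tasks[next];
      next++;
      if (task) task();
    } while (next < tasks.length && deadline.timeRemaining() > 0);
    if (next < tasks.length) scheduleIdle(step);
  };
  if (tasks.length > 0) scheduleIdle(step);
}
```

In a content script the schedulers might be `cb => setTimeout(() => requestAnimationFrame(cb), 150)` for the buffer (150 is just a value inside the 100ms–200ms window mentioned above) and `cb => requestIdleCallback(cb)` for the slices.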
  3. Viewport-based “lazy rendering” (Intersection Observer)
    If the page is very long, full-page translation can cause thousands of DOM nodes to be updated in the background at the same time, even though most of them are off screen.
  • Optimization plan: use the IntersectionObserver API.
    • Only send translation requests and mount the translated DOM when the source node enters the viewport (or is about to enter).
    • For translation nodes far away from the viewport, consider temporarily removing or hiding them to reduce the browser’s layout work.
  • Effect: reduce computation from “whole-page scale” to “screen scale”.
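A minimal sketch of the viewport-gated approach, assuming a hypothetical helper `translateWhenVisible`. The observer is created through a factory parameter, and `EntryLike` / `ObserverLike` are illustrative interfaces covering only the parts of the IntersectionObserver API that the sketch uses.

```typescript
interface EntryLike<T> {
  target: T;
  isIntersecting: boolean;
}

interface ObserverLike<T> {
  observe(target: T): void;
  unobserve(target: T): void;
}

type ObserverFactory<T> = (
  callback: (entries: EntryLike<T>[]) => void
) => ObserverLike<T>;

// Calls translate(node) the first time each node is reported as intersecting.
function translateWhenVisible<T>(
  nodes: T[],
  translate: (node: T) => void,
  makeObserver: ObserverFactory<T>
): ObserverLike<T> {
  const observer = makeObserver((entries) => {
    for (const entry of entries) {
      if (!entry.isIntersecting) continue;
      observer.unobserve(entry.target); // each node is translated at most once
      translate(entry.target);
    }
  });
  for (const node of nodes) observer.observe(node);
  return observer;
}
```

In the browser the factory could be `cb => new IntersectionObserver(cb, { rootMargin: "200px 0px" })`, where the `rootMargin` value is an arbitrary example that starts translation slightly before a node scrolls into view.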
  4. Optimize the DOM mounting strategy (Containment)
    When translation nodes are inserted, any disruption of the original layout can trigger a reflow of the entire page.
  • Optimization plan:
    • CSS Containment: add contain: layout; or contain: content; to the translated container. This tells the browser that the container’s internal layout is independent of the rest of the page. The container’s own height can still change as text arrives, which is what the placeholder is for.
    • Fixed placeholders: before translation starts, estimate a placeholder height from the source text length, so the page does not repeatedly “shake” as the translated text grows during streaming.
  • Effect: limit the reflow scope and avoid “one change affecting everything”.
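The containment and placeholder steps could look roughly like this. The helper names `estimateHeightPx` and `prepareContainer`, the `StyleTarget` interface, and the default character width and line height are all assumptions; real values would have to come from the page's computed styles.

```typescript
// Very rough height estimate: characters per line from the container width,
// then lines times line height. The defaults are placeholder guesses.
function estimateHeightPx(
  charCount: number,
  widthPx: number,
  avgCharPx = 8,
  lineHeightPx = 20
): number {
  const charsPerLine = Math.max(1, Math.floor(widthPx / avgCharPx));
  const lines = Math.max(1, Math.ceil(charCount / charsPerLine));
  return lines * lineHeightPx;
}

// Only the part of an element this sketch needs; an HTMLElement fits this shape.
interface StyleTarget {
  style: { setProperty(name: string, value: string): void };
}

// Applies containment and reserves vertical space before streaming starts.
function prepareContainer(el: StyleTarget, charCount: number, widthPx: number): void {
  el.style.setProperty("contain", "content");
  el.style.setProperty("min-height", `${estimateHeightPx(charCount, widthPx)}px`);
}
```

One thing to keep in mind: `contain: content` includes paint containment, so anything that overflows the container is clipped. If the translated block relies on overflow, `contain: layout` may be the safer choice.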
  5. Reduce excessive triggering of MutationObserver
    Since the plugin itself modifies the DOM, its own MutationObserver listener fires again in response.
  • Optimization plan:
    • When the plugin inserts translation nodes, temporarily stop observing (a plain global flag has to account for mutation records being delivered asynchronously), or filter out plugin-generated nodes in the listener using a specific marker attribute. Note that isTrusted is a property of Event objects, not of DOM nodes, so node.isTrusted cannot serve as this filter.
    • Make the mutationChangeDelay threshold dynamic: when fast user scrolling is detected, automatically increase the delay or pause scanning.
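Filtering out the plugin's own nodes by a marker attribute might be sketched like this. The attribute name `data-translated-by-plugin` and the helpers `isOwnNode` / `foreignAddedNodes` are hypothetical; `NodeLike` and `ElementLike` list only the DOM members the sketch relies on.

```typescript
// Hypothetical marker the plugin would set on every container it inserts.
const MARK = "data-translated-by-plugin";

interface NodeLike {
  nodeType: number; // 1 = element, 3 = text
  parentElement: ElementLike | null;
}

interface ElementLike extends NodeLike {
  closest(selector: string): unknown; // null when no ancestor-or-self matches
}

// True if the node is, or sits inside, an element carrying the marker.
function isOwnNode(node: NodeLike): boolean {
  const el = node.nodeType === 1 ? (node as ElementLike) : node.parentElement;
  return el !== null && el.closest(`[${MARK}]`) !== null;
}

// Keeps only added nodes that did not come from the plugin itself.
function foreignAddedNodes(
  records: ArrayLike<{ addedNodes: ArrayLike<NodeLike> }>
): NodeLike[] {
  const result: NodeLike[] = [];
  for (const record of Array.from(records)) {
    for (const node of Array.from(record.addedNodes)) {
      if (!isOwnNode(node)) result.push(node);
    }
  }
  return result;
}
```

A MutationObserver callback could pass its records through `foreignAddedNodes` first and return early when the result is empty, so the plugin's own insertions never trigger another scan.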
  6. “Incremental updates” for streaming output rather than “full replacement”
  • Optimization plan: if the translation node is a container, streaming updates should only append text nodes, rather than assigning innerHTML = … and rebuilding the internal structure each time.
  • Effect: reduce garbage collection (GC) pressure and DOM tree parsing overhead.
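A sketch of the append-only update, assuming a hypothetical helper `createStreamingText`. It creates one text node on the first delta and afterwards only calls `appendData`, which is a standard method on DOM Text nodes. `TextLike` covers just that method.

```typescript
interface TextLike {
  appendData(data: string): void;
}

// Returns a function that appends each streamed delta to a single text node.
function createStreamingText<T extends TextLike>(
  container: { appendChild(node: T): unknown },
  createText: (data: string) => T
): (delta: string) => void {
  let node: T | null = null;
  return (delta: string): void => {
    if (delta === "") return; // nothing to write
    if (node === null) {
      // First delta: create the text node and mount it once.
      node = createText(delta);
      container.appendChild(node);
    } else {
      // Later deltas: grow the existing node, no re-parse of the container.
      node.appendData(delta);
    }
  };
}
```

In the browser this could be wired up as `createStreamingText(el, s => document.createTextNode(s))`. This only covers plain-text translations; if the streamed output contains markup, a different strategy is needed.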

Summary
The most effective optimization path is: Web Worker processes data → requestAnimationFrame buffered rendering → IntersectionObserver limits the scope → CSS contain isolates layout. This can largely decouple the “heavy load” of the LLM from the browser’s “smooth feel”.


When I enable LLM translation, the CPU load instantly jumps from 5 to 50.