
How Browser-Based PDF Processing Actually Works (WebAssembly Deep Dive)

When we say "files never leave your browser," skeptics ask: how is that possible? Compress, merge, OCR — these are heavy operations. The answer is WebAssembly, and it's not magic. Here's the technical deep dive.

PDFShed Team · May 7, 2026 · 8 min read

The pre-WebAssembly era

For 15 years, the answer to "how do I process a PDF in JavaScript" was: don't. JavaScript was too slow for the heavy parts (image transcoding, font subsetting, decryption). Server-side processing wasn't an architectural choice — it was the only choice.

What WebAssembly changed

WebAssembly (WASM) is a compact bytecode format that browsers compile and run at near-native speed. C and C++ libraries, including qpdf, the canonical command-line PDF tool, can be compiled to WASM via Emscripten. Once compiled, they run in the browser at roughly 70–90% of native speed, depending on the workload.
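
To make "bytecode the browser runs" concrete, here is a minimal sketch (not PDFShed's code): a hand-assembled 41-byte WASM module that exports a single `add` function, instantiated through the standard WebAssembly JavaScript API. An Emscripten build of a C or C++ library is the same kind of file, only far larger and shipped with generated glue code.

```typescript
// A hand-assembled WebAssembly module that exports add(a, b) -> a + b.
const ADD_MODULE_BYTES = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm" magic + version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // body: local.get 0, local.get 1, i32.add
]);

// Synchronous compilation is fine for a tiny module. Large modules are
// normally loaded with WebAssembly.instantiateStreaming instead, so that
// compilation overlaps the download.
const addModule = new WebAssembly.Module(ADD_MODULE_BYTES);
const addInstance = new WebAssembly.Instance(addModule);
const add = addInstance.exports.add as (a: number, b: number) => number;

console.log(add(2, 40)); // 42
```

The call into `add` looks like any other JavaScript function call; the engine has already compiled the bytecode to machine code by the time it runs.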

That changes the question from "can the browser do this?" to "do you want to wait two extra seconds?" When compressing a 50 MB PDF, the answer is usually yes.

PDFShed's actual stack

  • qpdf-wasm — encryption, decryption, linearization, password handling. The reference implementation for PDF security.
  • pdf-lib — pure JavaScript PDF manipulation. Page extraction, merging, form filling, basic text editing.
  • pdfjs-dist — Mozilla's PDF.js, the same engine Firefox uses. Rendering, text extraction.
  • tesseract.js — Tesseract OCR compiled to WASM. Handles 100+ languages.
  • LibreOffice WASM — for the heavy office document conversions (DOCX↔PDF, XLSX↔PDF). Yes, the entire LibreOffice runtime, in your browser. ~150 MB download cached after first load.

Each tool loads only the libraries it needs. Compress doesn't need OCR. Sign doesn't need LibreOffice. The page weight stays manageable.
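
One way to picture per-tool loading is a lookup table from tool to libraries, checked against what the browser has already cached. The sketch below is illustrative only: the tool names, the `TOOL_LIBRARIES` mapping, and `librariesToLoad` are hypothetical, not PDFShed's actual module layout.

```typescript
type Library =
  | "qpdf-wasm"
  | "pdf-lib"
  | "pdfjs-dist"
  | "tesseract.js"
  | "libreoffice-wasm";

type Tool = "compress" | "merge" | "ocr" | "docx-to-pdf";

// Assumed mapping for illustration; a real site may split things differently.
const TOOL_LIBRARIES: Record<Tool, Library[]> = {
  compress: ["qpdf-wasm"],
  merge: ["pdf-lib"],
  ocr: ["tesseract.js"],
  "docx-to-pdf": ["libreoffice-wasm"],
};

// Returns the libraries a tool still has to download, skipping cached ones.
function librariesToLoad(
  tool: Tool,
  alreadyCached: ReadonlySet<Library>,
): Library[] {
  return TOOL_LIBRARIES[tool].filter((lib) => !alreadyCached.has(lib));
}

// First visit to the compress tool: only qpdf-wasm is needed.
console.log(librariesToLoad("compress", new Set<Library>()));

// Later visit with qpdf-wasm cached: nothing left to download.
console.log(librariesToLoad("compress", new Set<Library>(["qpdf-wasm"])));
```

In a browser, each returned entry would typically be fetched with a dynamic `import()` or `WebAssembly.instantiateStreaming`, so a visitor to the compress tool never pays for the OCR or LibreOffice downloads.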

The "files never leave" claim — verified

The actual data flow when you compress a PDF on PDFShed:

1. Browser fetches the tool page (HTML, JS, CSS, ad scripts) over the network.

2. Browser fetches qpdf.wasm (~5 MB) over the network. Cached for future visits.

3. You drop a PDF onto the page. It's read into a JavaScript ArrayBuffer using FileReader.

4. The ArrayBuffer is passed to qpdf-wasm running in a Web Worker.

5. qpdf-wasm reads the bytes from WASM memory, processes them, and writes the result back as a new ArrayBuffer.

6. The browser builds a Blob URL and triggers a download.

At no point in steps 3–6 does any file content cross the network boundary. You can verify this by capturing traffic with tcpdump or by sniffing the Wi-Fi packets in a lab setup. (Yes, we've done this.)
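
The memory hand-off in steps 4 and 5 can be sketched as follows. This is a simplified illustration under stated assumptions: `processInWasmMemory` is a hypothetical name, and the processing step is a plain byte copy standing in for a real module's exported function, since qpdf-wasm's actual interface is not shown here.

```typescript
const WASM_PAGE_BYTES = 64 * 1024; // WebAssembly memory is sized in 64 KiB pages

// Copies input bytes into WASM linear memory and copies the result back out.
// The "processing" here is a placeholder: the bytes come back unchanged.
function processInWasmMemory(input: Uint8Array): Uint8Array {
  const pages = Math.max(1, Math.ceil(input.byteLength / WASM_PAGE_BYTES));
  const memory = new WebAssembly.Memory({ initial: pages });

  // View the module's linear memory as bytes and copy the file in.
  const heap = new Uint8Array(memory.buffer);
  heap.set(input, 0);

  // A real module's exported function would transform the bytes here.

  // slice() copies the result into a fresh buffer owned by JavaScript.
  return heap.slice(0, input.byteLength);
}

// Hypothetical file contents, used only to exercise the function.
const fakePdf = new TextEncoder().encode("%PDF-1.7 hypothetical file contents");
const result = processInWasmMemory(fakePdf);

console.log(result.byteLength === fakePdf.byteLength); // true
```

In a browser, the input would come from `FileReader` or `file.arrayBuffer()` and the output would be wrapped in a `Blob` and handed to `URL.createObjectURL` for download. Every one of those calls operates on memory inside the tab; none of them issues a network request.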

What about the ad scripts?

The ad scripts (Google AdSense) make their own network calls to Google's ad servers. They request ad creatives, report impressions, and read cookies for ad targeting. They cannot read your file: ad creatives render in cross-origin iframes, and the same-origin policy keeps code in those frames from reading the tool page's JavaScript variables or the file bytes held in its Web Worker.

This is the same boundary that prevents one ad from reading another. For a deeper look, see the Web Platform same-origin policy.
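
An origin is the combination of scheme, host, and port, and two pieces of content can only reach into each other when all three match. The sketch below mirrors that comparison using the standard `URL` API. The browser enforces the policy itself; `sameOrigin` is a hypothetical helper and the hostnames are placeholders.

```typescript
// True when both URLs share scheme, host, and port.
function sameOrigin(a: string, b: string): boolean {
  return new URL(a).origin === new URL(b).origin;
}

// Same scheme and host; :443 is the default HTTPS port, so the origins match.
console.log(sameOrigin("https://tools.example/compress", "https://tools.example:443/sign")); // true

// Different host: an ad frame served from elsewhere is a different origin.
console.log(sameOrigin("https://tools.example/compress", "https://ads.example/frame.html")); // false

// Different scheme: http and https never share an origin.
console.log(sameOrigin("https://tools.example/", "http://tools.example/")); // false
```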

Why your laptop fan spins up

Compressing a 100 MB PDF in WASM uses your CPU at near-native speed. A modern Apple M2 or AMD Ryzen handles it in 5–15 seconds while drawing roughly 30 W. The fans spin up because real work is happening.

This is a feature: it's the work that *would have happened on a remote server* now happening on your machine. Your battery pays the cost; your privacy keeps the upside.

Limits of the approach

  • RAM-bound: A 500 MB PDF needs ~1.5 GB of working RAM. A Chromebook with 4 GB total may struggle. Most laptops handle 1 GB+ files fine.
  • No GPU acceleration yet: OCR could be 10× faster on a GPU. WebGPU is rolling out; we'll adopt as it stabilizes.
  • No background processing: Close the tab, the work stops. Long-running batch operations (process 200 PDFs overnight) are awkward in a browser.
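
The RAM figure above works out to roughly three times the file size (1.5 GB for a 500 MB file). A back-of-the-envelope check, assuming that ratio holds across file sizes (a simplification, since real usage depends on what is inside the PDF), might look like this. `WORKING_SET_FACTOR`, `estimateWorkingRamMB`, and `likelyFits` are hypothetical names.

```typescript
// Assumption: working memory is about 3x the file size, taken from the
// single 500 MB -> ~1.5 GB data point. Real usage varies with content.
const WORKING_SET_FACTOR = 3;

function estimateWorkingRamMB(fileSizeMB: number): number {
  return fileSizeMB * WORKING_SET_FACTOR;
}

// Rough go/no-go check against the memory a tab can realistically use.
function likelyFits(fileSizeMB: number, availableRamMB: number): boolean {
  return estimateWorkingRamMB(fileSizeMB) <= availableRamMB;
}

console.log(estimateWorkingRamMB(500)); // 1500
console.log(likelyFits(500, 1024)); // false
console.log(likelyFits(100, 1024)); // true
```

A tool could run a check like this before starting and warn the user instead of letting the tab run out of memory. Chromium-based browsers expose an approximate device figure through `navigator.deviceMemory`, though it is coarse and not available everywhere.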

Try it

If you're curious, open PDFShed's compress tool, then open Chrome DevTools and watch the Network tab as you process a file. You will see page assets and ad requests, but no request that carries your file. That absence is the whole pitch.
