Building a PDF Compressor in the Browser: WebAssembly & Next.js

The ultimate guide to building a browser-side PDF compressor using WebAssembly, C++, and Next.js. Learn about Emscripten, Web Workers, memory management, and high-performance frontend engineering.

Sachin Sharma
Jan 7, 2026
7 min read

In the modern web era, privacy is no longer a luxury; it is a requirement. Yet every time we need to perform a "heavy" task like compressing a PDF, we are forced to upload our sensitive documents to a third-party server.

Think about that for a second. Your bank statements, legal contracts, or identity documents are traveling across the internet to a server you don't control, being processed by code you haven't seen, and potentially being stored in a temp folder indefinitely.

As developers, we can do better.

With the advent of WebAssembly (Wasm), the browser is no longer just a document viewer; it is a high-performance execution environment. We now have the power to run C, C++, and Rust code directly in the browser at near-native speeds.

In this deep-dive, I'm going to show you exactly how I built a production-grade PDF compressor that runs entirely in the browser. No server-side processing. No data leaving the user's machine. Just pure, raw performance.


πŸ—οΈ The Architectural Vision

Our goal is to build a tool that can take a 50MB PDF and crush it down to 5MB without the user noticing any lag. To achieve this, we need three core components:

  1. The Engine: A powerful C++ library for PDF manipulation (like Ghostscript or a specialized PDFium build).
  2. The Bridge: WebAssembly and Emscripten to compile that C++ code into something the browser understands.
  3. The Orchestrator: Next.js for the UI, state management, and Web Workers to ensure the heavy lifting doesn't freeze the main thread.

πŸ› οΈ Step 1: Choosing and Compiling the Engine

The browser's JavaScript engine (V8, JavaScriptCore) is excellent, but it isn't designed for the bit-level manipulation required by PDF compression. For that, we turn to the masters: C++.

For this project, we utilize a custom build of a PDF optimization engine. The compilation process using Emscripten looks like this:

The Compilation Command

We need to tell Emscripten which features we want enabled. Since we're dealing with large files, memory management is critical.

```bash
emcc -O3 \
  -s WASM=1 \
  -s ALLOW_MEMORY_GROWTH=1 \
  -s EXPORTED_FUNCTIONS="['_compress_pdf', '_malloc', '_free']" \
  -s EXPORTED_RUNTIME_METHODS="['ccall', 'cwrap', 'FS']" \
  -s MODULARIZE=1 \
  -s EXPORT_NAME="createPdfModule" \
  -o pdf_compressor.js
```

Why these flags?

  • -O3: Maximum optimization. We want the code to be as fast as possible.
  • ALLOW_MEMORY_GROWTH: PDF processing can be memory-intensive. This allows the Wasm heap to expand dynamically.
  • MODULARIZE: Wraps the output in a promise-based module, making it much easier to integrate with modern React/Next.js code.
  • FS: This is crucial. It gives us a virtual file system (MEMFS) within the browser. We "write" the PDF into this virtual memory, process it, and "read" it back out.
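Because of MODULARIZE, the generated glue file exports a factory function instead of polluting the global scope. A minimal sketch of consuming it (the `factory` parameter is a stand-in for the generated `createPdfModule`, which resolves only after the .wasm binary has been fetched and instantiated):

```typescript
// Sketch: consuming a MODULARIZE'd Emscripten factory.
// `factory` stands in for the generated `createPdfModule`.
async function loadEngine(factory: () => Promise<any>): Promise<any> {
  const module = await factory();
  // module.FS (the MEMFS virtual file system) and module.ccall are now usable
  return module;
}
```

In a real app you would call this once and cache the resulting promise, so two concurrent compressions don't instantiate the engine twice.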

🌉 Step 2: Bridging C++ and TypeScript

Once we have our .wasm and .js glue files, we need to talk to them. In Next.js, we create a wrapper service.

```typescript
// lib/wasm/pdf-service.ts
import createPdfModule from './pdf_compressor'; // Emscripten glue generated by the emcc command above

export class PdfWasmService {
  private module: any;

  async init() {
    this.module = await createPdfModule();
  }

  async compress(fileBuffer: Uint8Array): Promise<Uint8Array> {
    const filename = 'input.pdf';
    const outFilename = 'output.pdf';

    // 1. Write the file to the virtual FS (MEMFS)
    this.module.FS.writeFile(filename, fileBuffer);

    // 2. Call the C++ function.
    // ccall takes the unprefixed name: 'compress_pdf' for the '_compress_pdf' export.
    this.module.ccall(
      'compress_pdf',
      'number',
      ['string', 'string'],
      [filename, outFilename]
    );

    // 3. Read the compressed result back
    const result = this.module.FS.readFile(outFilename);

    // 4. Clean up the virtual FS to free memory
    this.module.FS.unlink(filename);
    this.module.FS.unlink(outFilename);

    return result;
  }
}
```

🚀 Step 3: Next.js & Web Worker Integration

Running this directly in your React component is a recipe for disaster. If the compression takes 5 seconds, your UI will be frozen for those 5 seconds. Users will think your app crashed.

We solve this with Web Workers.

The Worker Setup

Web Workers run in a separate background thread. They can't access the DOM, but they are perfect for our Wasm engine.

```typescript
// workers/pdf.worker.ts
import { PdfWasmService } from '../lib/wasm/pdf-service';

const pdfService = new PdfWasmService();

self.onmessage = async (event: MessageEvent) => {
  const { fileBuffer, type } = event.data;

  if (type === 'INIT') {
    await pdfService.init();
    self.postMessage({ type: 'READY' });
    return;
  }

  if (type === 'COMPRESS') {
    try {
      const output = await pdfService.compress(fileBuffer);
      // Transfer ownership of the result buffer instead of copying it
      self.postMessage({ type: 'SUCCESS', output }, [output.buffer]);
    } catch (error) {
      self.postMessage({ type: 'ERROR', error: (error as Error).message });
    }
  }
};
```

Performance Tip: Notice the [output.buffer] in the postMessage call. This is a Transferable Object. Instead of copying the data (which is slow for large PDFs), we move the ownership of the memory from the worker thread to the main thread. This is near-instant.
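You can observe this "move" semantics even without a worker: `structuredClone` accepts the same transfer list as `postMessage`, and after the transfer the source buffer is detached rather than copied. A small demonstration (Node ≥ 17 or any modern browser):

```typescript
// Demonstrating transferable semantics: the source buffer is detached, not copied.
const sc = (globalThis as any).structuredClone;

const original = new Uint8Array([1, 2, 3, 4]);
const cloned = sc({ output: original }, { transfer: [original.buffer] });

console.log(original.byteLength);     // 0: the source buffer was detached
console.log(cloned.output.byteLength); // 4: the data now lives in the clone
```

For a 50MB PDF this is the difference between an instant handoff and a 50MB memcpy on every message.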


🎨 Step 4: Building the Premium UI in Next.js

A high-performance engine deserves a high-performance UI. We use Framer Motion for smooth transitions and Tailwind CSS for a sleek, modern look.

The Multi-File Upload Logic

We want to support drag-and-drop and batch processing.

```tsx
// components/pdf/Compressor.tsx
'use client';

import { useState } from 'react';
// Dropzone and StatusBadge are this project's own UI components
import Dropzone from './Dropzone';
import StatusBadge from './StatusBadge';

export default function PdfCompressor() {
  const [files, setFiles] = useState<File[]>([]);
  const [status, setStatus] = useState<Record<string, 'pending' | 'processing' | 'done'>>({});

  const processFiles = async () => {
    const worker = new Worker(new URL('../../workers/pdf.worker.ts', import.meta.url));

    for (const file of files) {
      setStatus(prev => ({ ...prev, [file.name]: 'processing' }));
      const buffer = await file.arrayBuffer();
      worker.postMessage({ type: 'COMPRESS', fileBuffer: new Uint8Array(buffer) });
      // Wait for response...
    }
  };

  return (
    <div className="max-w-4xl mx-auto p-12">
      <h1 className="text-4xl font-black mb-8">Crush Your PDFs. Privately.</h1>

      {/* Dropzone component with Framer Motion animations */}
      <Dropzone onFilesAdded={setFiles} />

      <div className="mt-8 space-y-4">
        {files.map(file => (
          <div key={file.name} className="flex justify-between items-center p-4 bg-secondary/20 rounded-xl">
            <span>{file.name}</span>
            <StatusBadge status={status[file.name]} />
          </div>
        ))}
      </div>
    </div>
  );
}
```

📈 Optimization: Memory & Multithreading

When dealing with 100MB+ files, the default Wasm heap might not be enough. We need two additional techniques:

1. Handling SharedArrayBuffer

If you want to use multithreading (pthreads) in Wasm, you need SharedArrayBuffer. However, for security reasons (Spectre/Meltdown), browsers require specific headers to enable this:

```javascript
// next.config.js
module.exports = {
  async headers() {
    return [
      {
        source: '/(.*)',
        headers: [
          { key: 'Cross-Origin-Opener-Policy', value: 'same-origin' },
          { key: 'Cross-Origin-Embedder-Policy', value: 'require-corp' },
        ],
      },
    ];
  },
};
```

2. Garbage Collection in Wasm

Wasm does not have automatic garbage collection for memory allocated via malloc. Every time we pass a string or a buffer from JS to C++, we must manually free it.

```typescript
const ptr = this.module._malloc(size);
// ... use the pointer ...
this.module._free(ptr);
```

Failing to do this leaks Wasm heap memory, and the tab can crash after processing only a few large files. In my implementation, I use a custom AutoFree wrapper that tracks all allocations during a compression task and cleans them up automatically at the end.
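The AutoFree wrapper itself isn't shown in this post, but the idea can be sketched as follows. The class name and API here are my illustration of the pattern, not a library; `EmscriptenHeap` is a structural stand-in for the module's `_malloc`/`_free` exports:

```typescript
// Minimal sketch of an allocation tracker for Emscripten exports.
type EmscriptenHeap = {
  _malloc(size: number): number;
  _free(ptr: number): void;
};

class AutoFree {
  private ptrs: number[] = [];

  constructor(private heap: EmscriptenHeap) {}

  // Allocate and remember the pointer for later cleanup
  malloc(size: number): number {
    const ptr = this.heap._malloc(size);
    this.ptrs.push(ptr);
    return ptr;
  }

  // Free every tracked allocation, typically from a finally block
  releaseAll(): void {
    for (const ptr of this.ptrs) this.heap._free(ptr);
    this.ptrs = [];
  }
}
```

Wrapping each compression task in `try { ... } finally { scope.releaseAll(); }` guarantees the allocations are reclaimed even when the C++ call throws.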


πŸ” The Privacy Dividend

By moving the logic to the browser, we've eliminated the biggest cost of a modern SaaS: compute. Server-side PDF processing means paying for CPU time and bandwidth on every single upload. By leveraging the user's local hardware instead, we can offer this tool for free, with near-zero hosting cost.

More importantly, we've achieved Perfect Privacy. Even if my website is hacked, your documents are never at risk because they never touched my server. They lived and died in your browser's RAM.


🏁 Conclusion

Building this PDF compressor wasn't just about compression; it was about testing the limits of the modern web. We've proven that the browser is no longer a "thin client." It is a powerhouse capable of near-native performance.

The future of software is Decentralized Execution. Tools that used to be server-only (video editing, 3D rendering, document processing) are all migrating to Wasm.

If you are a developer, start looking at your C/C++ libraries. There is a whole world of high-performance code waiting to be unlocked in the browser.

Want to see the code?

I have open-sourced the core Wasm wrapper and the Next.js integration on my GitHub. Feel free to fork it, star it, and build your own private tools!


About the Author: Sachin Sharma is a Software Developer who bridges the gap between low-level performance and high-level UX. When he's not optimizing Wasm modules, he's building pixel-perfect interfaces in Next.js.
