Consider this part 2 of the Generating Word Documents series. While your customers might manually convert their downloaded Word reports to PDF, there’s a lot of power in automatically converting them to PDF.

For example, you probably don’t want to send invoices as Word documents, but you may still want to use Word’s strong templating features. Your customers might also want to batch-download PDF reports.

LibreOffice’s headless converter

Rendering a document involves a lot of separate parts, so we’re relying on LibreOffice: it is the most widely used free office suite, it is well supported and actively maintained, and it has a high level of standards compliance (which we want).

If you’re looking for a ready-to-use command to roll your own version, this one is for you (courtesy of this very detailed answer on Stack Overflow):

libreoffice --headless "-env:UserInstallation=file:///tmp/LibreOffice_Conversion_${USER}" --convert-to pdf:writer_pdf_Export --outdir ${HOME}/pdfexport input.docx

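The two most important parts are --outdir and the input file. The -env:UserInstallation flag gives the conversion its own profile directory, so it does not collide with another running LibreOffice instance. Sadly, we cannot have the PDF written to a stream and read it straight into a buffer in Node: the result is written as a file into the output directory, so we have to read it back and clean up afterwards.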
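
Spelled out for Node, the command boils down to an argument list plus a predictable output path. The following is a minimal sketch, not an API of LibreOffice or libreoffice-convert: the helper names buildConvertArgs and expectedPdfPath are made up for illustration, and the file URL handling assumes a POSIX-style absolute path.

```typescript
import * as path from 'node:path';
import { pathToFileURL } from 'node:url';

// Hypothetical helper: builds the argument list for the command shown above.
function buildConvertArgs(inputPath: string, outDir: string, profileDir: string): string[] {
    return [
        '--headless',
        // Separate profile directory, passed as a file:// URL.
        `-env:UserInstallation=${pathToFileURL(profileDir).href}`,
        '--convert-to', 'pdf:writer_pdf_Export',
        '--outdir', outDir,
        inputPath,
    ];
}

// Hypothetical helper: the result is named after the input file,
// so /data/invoice.docx ends up as <outDir>/invoice.pdf.
function expectedPdfPath(inputPath: string, outDir: string): string {
    return path.join(outDir, `${path.parse(inputPath).name}.pdf`);
}
```

Passing the arguments as an array (instead of one long string) means nothing has to be quoted, and paths containing spaces survive untouched.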

Universal function to use within Node

Taking heavy inspiration from libreoffice-convert, we can build our own function to use within our project.

To avoid managing temporary files by hand while still working with buffers, we’re using tmp-promise, a great package that takes care of creating and removing them for us.

import tmp from 'tmp-promise';
import path from 'node:path';
import { Buffer } from 'node:buffer';
import { readFile, writeFile } from 'node:fs/promises';
import { execFile } from 'node:child_process';

async function libreOfficePath(): Promise<string> {
    // todo: find libreoffice binary, see https://github.com/elwerene/libreoffice-convert/blob/master/index.js
    switch (process.platform) {
        case 'darwin': return '/Applications/LibreOffice.app/Contents/MacOS/soffice';
        case 'linux': return '/usr/bin/libreoffice';
        default:
            // Fail early instead of returning null, which does not match the return type.
            throw new Error(`Unsupported platform: ${process.platform}`);
    }
}

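function promiseExecFile(file: string, args: string[]): Promise<string> {
    return new Promise((resolve, reject) => {
        execFile(file, args, (error, stdout) => {
            if (error) {
                // Stop here, so we never fall through to resolve() after a failure.
                reject(error);
                return;
            }
            // LibreOffice can print warnings to stderr even when the conversion
            // succeeds, so stderr output alone is not treated as a failure.
            resolve(stdout);
        });
    });
}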

/**
 * Converts a Word document using LibreOffice's Writer to PDF.
 * @param source Buffer containing the Word document.
 * @returns Buffer containing the PDF.
 */
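async function convertWordToPdf(source: Buffer): Promise<Buffer> {
    const tmpDir = await tmp.dir({ unsafeCleanup: true });
    const userInstallationDir = await tmp.dir({ unsafeCleanup: true });
    const sourcePath = path.join(tmpDir.path, 'source');

    try {
        await writeFile(sourcePath, source);
        const binary = await libreOfficePath();
        // execFile does not use a shell, so every argument is its own array entry (no quotes needed).
        const args = [
            '--headless',
            // The temporary path is absolute, so this yields file:///tmp/...
            `-env:UserInstallation=file://${userInstallationDir.path}`,
            '--convert-to', 'pdf:writer_pdf_Export',
            '--outdir', tmpDir.path,
            sourcePath,
        ];
        await promiseExecFile(binary, args);

        return await readFile(path.join(tmpDir.path, 'source.pdf'));
    } finally {
        // Remove both temporary directories, even if the conversion failed.
        await Promise.all([tmpDir.cleanup(), userInstallationDir.cleanup()]);
    }
}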

// just to try out our code
async function main() {
    const source = await readFile('test.docx');
    const pdf = await convertWordToPdf(source);
    await writeFile('out.pdf', pdf);
}

main()
    .then(() => console.log('done'))
    .catch((error) => console.error(error));
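
The libreOfficePath function above only knows one location per platform. A minimal sketch of a more forgiving lookup could try a list of candidate paths and pick the first one that exists. The helper names findFirstExisting and findLibreOffice are hypothetical, and the candidate paths are assumptions (typical install locations, neither exhaustive nor verified for every distribution), so adjust them to your own systems.

```typescript
import { existsSync } from 'node:fs';

// Assumed install locations; extend or replace them for your environment.
const CANDIDATES: Record<string, string[]> = {
    darwin: ['/Applications/LibreOffice.app/Contents/MacOS/soffice'],
    linux: ['/usr/bin/libreoffice', '/usr/bin/soffice', '/snap/bin/libreoffice'],
};

// Returns the first path that exists on disk, or null if none does.
function findFirstExisting(paths: string[]): string | null {
    return paths.find((candidate) => existsSync(candidate)) ?? null;
}

// Resolves the binary for the given platform and fails loudly otherwise.
function findLibreOffice(platform: string = process.platform): string {
    const binary = findFirstExisting(CANDIDATES[platform] ?? []);
    if (binary === null) {
        throw new Error(`Could not find a LibreOffice binary for platform "${platform}"`);
    }
    return binary;
}
```

Throwing when nothing is found surfaces a missing installation right away, instead of failing later with a confusing error from execFile.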

On portability

In your local environment, LibreOffice can use every font installed on your system. Inside Docker, however, those fonts are missing, so it falls back to default fonts.

To include your own fonts, you can copy them to your Docker container. I’ve put mine inside the repository under the /fonts folder and added RUN cp -r ./fonts /usr/share/fonts to my Dockerfile. As long as they’re OTF or TTF, they work just fine :) Just be sure to include every font weight your documents use.
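
To check that the container actually sees the copied fonts, you can ask fontconfig what it knows about. The sketch below is a rough check under assumptions: it expects the fc-list tool (part of fontconfig) to be available in the image and `fc-list : family` to print one line per font with comma-separated family names. The helper names parseFontFamilies, missingFonts and installedFontFamilies are hypothetical.

```typescript
import { execFileSync } from 'node:child_process';

// Turns the output of `fc-list : family` into a sorted list of unique family names.
function parseFontFamilies(output: string): string[] {
    const families = new Set<string>();
    for (const line of output.split('\n')) {
        for (const name of line.split(',')) {
            if (name.trim() !== '') families.add(name.trim());
        }
    }
    return Array.from(families).sort();
}

// Returns the required families that are not installed (case-insensitive).
function missingFonts(required: string[], installed: string[]): string[] {
    const known = new Set(installed.map((name) => name.toLowerCase()));
    return required.filter((name) => !known.has(name.toLowerCase()));
}

// Asks fontconfig for the installed families; throws if fc-list is not available.
function installedFontFamilies(): string[] {
    const output = execFileSync('fc-list', [':', 'family'], { encoding: 'utf8' });
    return parseFontFamilies(output);
}
```

Running something like missingFonts(['Inter'], installedFontFamilies()) when the app starts makes a forgotten font show up in the logs rather than in a customer’s PDF.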

As an example, you can use this Dockerfile to run the script we’ve just built. It also installs the LibreOffice packages, which the conversion needs.

FROM node:18-bullseye

RUN apt-get update && apt-get -y -q install \
    libreoffice \
    libreoffice-writer \
    libreoffice-core \
    libreoffice-common

WORKDIR /usr/src/app

COPY package*.json ./

RUN npm install

COPY . .

RUN cp -r ./fonts /usr/share/fonts

RUN npm run build

CMD ["node", "dist/index.js"]

Now, let’s look at the result produced by our nifty container (left is without fonts, right is with them):

Word Export

As you can see, it also supports a lot of the features you may use within Word, so your project’s export features have just become the coolest thing on the block.

Conclusion

Without a lot of effort, we’ve made a portable (that is, Dockerized) Word-to-PDF exporter. You could plug it into your web app, for example. So, LibreOffice is really cool, especially because it allows us to do fancy things like this ;)

One thing to note, though, is that the conversion is fairly slow. If somebody wants to batch-export their beautiful reports, it is a good idea to run the conversion separately, as a background task.
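
One way to keep slow conversions from piling up is to push them through a small in-process limiter that only runs a few at a time. This is a minimal sketch under assumptions: createLimiter is a hypothetical helper, not part of any package used here, and a real deployment may be better served by a separate worker process or a proper job queue.

```typescript
type Task<T> = () => Promise<T>;

// Creates a limiter that runs at most `concurrency` tasks at the same time.
function createLimiter(concurrency: number) {
    let active = 0;
    const waiting: Array<() => void> = [];

    // Called when a task finishes: hand the slot to the next waiting task,
    // or free it when nobody is waiting.
    const release = () => {
        const resume = waiting.shift();
        if (resume) {
            resume();
        } else {
            active--;
        }
    };

    return async function run<T>(task: Task<T>): Promise<T> {
        if (active >= concurrency) {
            // Park until a finished task hands over its slot.
            await new Promise<void>((resolve) => waiting.push(resolve));
        } else {
            active++;
        }
        try {
            return await task();
        } finally {
            release();
        }
    };
}
```

With const limit = createLimiter(2), a batch export could then call limit(() => convertWordToPdf(source)) for every report, and no more than two LibreOffice processes would run at once.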