How to convert .docx files to .pdf with C#

It's very common in a business setting to have .docx files that need to be converted into .pdf files. For example, there might be .docx templates that need to be sent to customers, and you would prefer to send a .pdf instead of a .docx file. Fortunately, there is an easy way to do that and automate the process using C#!

Prerequisites

The only requirement is a portable installation of LibreOffice. You can find the portable installations here. The portable installation gives you standalone .exe files that we will use as a dependency in our C# code. The LibreOfficeWriterPortable.exe file is what converts the .docx file into a .pdf file. Install the portable version of LibreOffice and make note of where LibreOfficeWriterPortable.exe is saved.

Installing the LibreOffice portable version
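
Because everything that follows depends on the exact location of LibreOfficeWriterPortable.exe, it can help to validate that path once, up front, instead of discovering a typo when the process fails to start. The following is only a minimal sketch and is not part of the original code: the LibreOfficeLocator class and the LIBREOFFICE_WRITER_PATH environment variable are hypothetical names chosen for illustration.

```csharp
#nullable enable
using System;
using System.IO;

// Hypothetical helper (not from the original article): resolves and validates
// the path to LibreOfficeWriterPortable.exe before any conversion is attempted.
public static class LibreOfficeLocator
{
    // Assumed environment variable name; use whatever fits your deployment.
    public const string PathVariable = "LIBREOFFICE_WRITER_PATH";

    public static string ResolveWriterPath(string? explicitPath = null)
    {
        // Prefer an explicitly supplied path, then fall back to the environment variable.
        string? candidate = explicitPath ?? Environment.GetEnvironmentVariable(PathVariable);

        if (string.IsNullOrWhiteSpace(candidate))
            throw new InvalidOperationException(
                $"No LibreOffice path was supplied and {PathVariable} is not set.");

        // Normalize to an absolute path and fail early if the executable is missing.
        string fullPath = Path.GetFullPath(candidate);
        if (!File.Exists(fullPath))
            throw new FileNotFoundException("LibreOffice executable not found.", fullPath);

        return fullPath;
    }
}
```

The returned string could then be passed to the ProcessStartInfo constructor in place of a hard-coded path.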

The code

To convert .docx files to .pdf files programmatically, we need to use the LibreOffice CLI. As with any CLI, we start the executable by its path and then pass any required or optional flags as arguments. In C#, this is how we can do that.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Threading;

namespace ConvertDOCXToPDF
{
    internal class Program
    {
        static void Main(string[] args)
        {
            // Create LibreOfficeWriter CLI process
            var commandArgs = new List<string>
            {
                "--convert-to", //a flag that will be followed by the file type we want to convert to
                "pdf:writer_pdf_Export", // the [output file type]:[OutputFilterName] we are requesting the output to be; more details are here (https://help.libreoffice.org/latest/en-US/text/shared/guide/convertfilters.html)
                "C:\\Users\\zachary\\Downloads\\Letter.docx", // input file
                "--norestore", // disables restart and file recovery after a system crash
                "--headless", // allows using the application without user interface
                "--outdir", // a flag that will be followed by the output directory where we want our new pdf file to be created
                "C:\\Users\\zachary\\Downloads" // output directory
            };

            // The path to LibreOfficeWriterPortable.exe
            ProcessStartInfo processStartInfo = new ProcessStartInfo("C:\\Users\\zachary\\Downloads\\LibreOfficePortablePrevious\\LibreOfficeWriterPortable.exe"); 
            foreach (string arg in commandArgs)
                processStartInfo.ArgumentList.Add(arg);

            Process process = new Process
            {
                StartInfo = processStartInfo
            };

            // Only 1 instance of LibreOfficeWriter can be running at a given time
            Process[] existingProcesses = Process.GetProcessesByName("soffice");
            while (existingProcesses.Length > 0)
            {
                Thread.Sleep(1000);
                existingProcesses = Process.GetProcessesByName("soffice");
            }

            // Start the process
            process.Start();
            process.WaitForExit();

            // Check for failed exit code.
            if (process.ExitCode != 0)
                throw new Exception("Failed to convert file");
            else
            {
                int totalChecks = 10;
                int currentCheck = 1;

                string originalFileName = Path.GetFileNameWithoutExtension(commandArgs[2]);
                string newFilePath = Path.Combine(commandArgs[6], $"{originalFileName}.pdf");

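                while (currentCheck <= totalChecks)
                {
                    if (File.Exists(newFilePath))
                    {
                        // File conversion was successful
                        break;
                    }

                    Thread.Sleep(500); // LibreOffice doesn't immediately create PDF output once the command is run
                    currentCheck++; // count this attempt so the loop cannot run forever
                }

                // All checks were used up and the PDF never appeared
                if (currentCheck > totalChecks)
                    throw new FileNotFoundException("Converted PDF was not created", newFilePath);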
            }
        }
    }
}

Converting a .docx file to .pdf using C# and LibreOffice

You'll notice that at the end of the code, we sleep the thread in a loop while waiting for the file to appear on the filesystem. This is a behavior I noticed when using the LibreOffice CLI: LibreOffice did not create the .pdf file until a few moments after the process had exited. This code is an improvement/bugfix over the library I was originally using at https://github.com/smartinmedia/Net-Core-DocX-HTML-To-PDF-Converter.
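
The wait-for-the-file idea can also be pulled out into a small reusable helper. This is a rough sketch rather than the code from the repository: the ConversionOutput class and both method names are hypothetical, and it assumes LibreOffice names the output after the input file with a .pdf extension, as the code above does.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading;

// Hypothetical helper (not part of the original code or of LibreOffice).
public static class ConversionOutput
{
    // Derives the path where the PDF is expected to appear: the input file's
    // base name with a .pdf extension, inside the output directory.
    public static string GetExpectedPdfPath(string inputFile, string outputDirectory)
    {
        string baseName = Path.GetFileNameWithoutExtension(inputFile);
        return Path.Combine(outputDirectory, baseName + ".pdf");
    }

    // Polls until the file exists or the timeout elapses.
    // Returns true if the file was found, false if the wait timed out.
    public static bool WaitForFile(string path, TimeSpan timeout, TimeSpan pollInterval)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        while (true)
        {
            if (File.Exists(path))
                return true;

            if (stopwatch.Elapsed >= timeout)
                return false;

            Thread.Sleep(pollInterval);
        }
    }
}
```

For example, calling ConversionOutput.WaitForFile(newFilePath, TimeSpan.FromSeconds(5), TimeSpan.FromMilliseconds(500)) would roughly correspond to ten checks spaced 500 ms apart, and the boolean result lets the caller decide how to report a missing file.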

Converting the .docx file into a .pdf file

GitHub

All of this code can be found on GitHub.