Friday, November 23, 2012

Saving PowerPoint slides to PDF with PowerShell

Today, I’m having a bit of a catch-up day. One task I’ve been meaning to do for a while is to convert some PowerPoint slides to PDF for distribution and printing.

The basic task of converting a presentation to PDF is simple in PowerPoint. First you start PowerPoint and open the presentation. Then you save it as PDF using the Save-As dialog (and close PowerPoint).

The old saying, ‘if you have to do something more than once, write a script’, kicked in, and so I did (write a script!). In theory, it should be relatively trivial to write a script that does just that. And once you know the complex PowerPoint object model, so it was! The full script is over on my scripts blog at: http://pshscripts.blogspot.co.uk/2012/11/convert-pptxtopdfps1.html.

It turns out that writing the script was a bit of a walk on the dark side, as in working with COM and the Office COM objects. While what I wanted to do was simple, I needed to do it the way that PowerPoint's COM object wanted, which was a bit different to working with some other API sets.

Working with Office, in particular the Office object model, from PowerShell is a bit painful. The PowerPoint object model is huge. When you instantiate the PowerPoint.Application object, you get a new object that is extremely rich and deep. You get loads of methods and properties. Many of these properties are themselves objects with methods and properties (that can also be objects), and so on. Since they are COM, some of the normal goodness of .NET reflection (i.e. Get-Member) is missing. Using a good search engine and applying a little ingenuity is the key!

In terms of getting this to work, the first issue I hit was the need to add some assemblies into PowerShell, like this:

Add-Type -AssemblyName office -ErrorAction SilentlyContinue
Add-Type -AssemblyName microsoft.office.interop.powerpoint `
         -ErrorAction SilentlyContinue

Next, you open PowerPoint by instantiating the PowerPoint.Application COM object and making it visible, which is relatively easy:

$ppt = New-Object -ComObject PowerPoint.Application
$ppt.Visible = [Microsoft.Office.Core.MsoTriState]::msoTrue

An important point is that you can’t use $True here to make PowerPoint visible. The ‘true’ you have to pass to PowerPoint is based on the enum, not on the normal .NET Boolean true. Finding these enums’ full names for PowerShell requires some search engine fu, as the full class name is missing from the MSDN documentation.
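
To see what the enum actually contains, a small sketch like the following lists each member of MsoTriState with its underlying integer value. It assumes Office is installed so that the office assembly loads; msoTrue should show up as -1, not 1.

```powershell
# Load the Office core assembly that defines MsoTriState
# (assumes Office and its interop assemblies are installed)
Add-Type -AssemblyName office

# Print every member of the enum alongside its integer value
[Enum]::GetValues([Microsoft.Office.Core.MsoTriState]) | ForEach-Object {
  '{0} = {1}' -f $_, [int]$_
}
```

The same approach works for other Office enums once you know their full type name.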

Once you have PowerPoint open, you need to open the presentation and just save it as PDF. This is also easy, although you have to use the right enum (the string ‘pdf’ is not adequate):

$pres = $ppt.Presentations.Open($ifile)
$opt  = [Microsoft.Office.Interop.PowerPoint.PpSaveAsFileType]::ppSaveAsPDF
$pres.SaveAs($ofile,$opt)
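
One thing to watch, as far as I can tell: PowerPoint runs as its own process, so a relative path may be resolved against PowerPoint's working folder and not PowerShell's current location. A hedged sketch for turning both paths into full paths first (the names $fullIn and $fullOut are mine, purely for illustration, and $ppt and $opt are assumed to be set up as above):

```powershell
# Assumption: $ifile may hold a relative path such as '.\deck.pptx'
# Resolve-Path requires the input file to exist
$fullIn = (Resolve-Path -Path $ifile).ProviderPath

# The output file does not exist yet, so expand it without checking for existence
$fullOut = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($ofile)

# Open and save using the fully qualified paths
$pres = $ppt.Presentations.Open($fullIn)
$pres.SaveAs($fullOut, $opt)
```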

In addition to the code here, you also need to clean up and close PowerPoint. You can see how to do that in the script on the PowerShell scripts blog.
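
As a rough sketch of that kind of cleanup (not necessarily what the posted script does), assuming $pres and $ppt are the objects created earlier, something along these lines should close things down and let the PowerPoint process exit:

```powershell
# Close the presentation, then ask PowerPoint itself to exit
$pres.Close()
$ppt.Quit()

# Release the COM references held by PowerShell so POWERPNT.EXE can end
[void][System.Runtime.InteropServices.Marshal]::ReleaseComObject($pres)
[void][System.Runtime.InteropServices.Marshal]::ReleaseComObject($ppt)
Remove-Variable -Name pres, ppt

# Nudge the garbage collector to finalize any remaining COM wrappers
[GC]::Collect()
[GC]::WaitForPendingFinalizers()
```

In a real function it is probably worth putting this in a finally block so PowerPoint is closed even if the save fails.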

To test it, I ran this code (which is part of the script posted on the scripts blog!):

$ipath = "E:\SkyDrive\PowerShell V3 Geek Week\"

Foreach ($ifile in $(ls $ipath -Filter "*.pptx")) {
  # Build name of output file: same folder and base name, .pdf extension
  $pathname = $ifile.DirectoryName
  $file     = $ifile.BaseName
  $ofile    = Join-Path $pathname ($file + ".pdf")

  # Convert _this_ file to PDF, passing the full path of the input file
  Convert-PptxToPDF -ifile $ifile.FullName -OFile $ofile
}
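
File names that contain more than one dot are worth a quick check when building the output name. A minimal sketch, using a hypothetical helper called Get-PdfPath (my name, not part of the posted script) and a made-up file path, that swaps only the final extension:

```powershell
# Hypothetical helper: derive the PDF path from a presentation path
# by replacing only the last extension
function Get-PdfPath {
  param([string]$Path)
  [System.IO.Path]::ChangeExtension($Path, '.pdf')
}

# Example with an illustrative path that has several dots in the name
Get-PdfPath 'E:\Decks\Intro.to.PowerShell.pptx'
# returns E:\Decks\Intro.to.PowerShell.pdf
```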

So all things considered, it was relatively easy to create a script to do this conversion. And as this sort of thing is something I do all too often, this script will save me time in the future!
