PDF Viewer
This article discusses how to create a .NET PDF Viewer control that is not dependent on Acrobat
software being installed.
Fundamental Concepts
Viewing a PDF document involves the following basic steps:
1. Get a page count of the PDF document that needs to be viewed to define your page
number boundaries (iTextSharp or PDFLibNET)
2. Convert the PDF document (specific page on demand) to a raster image format
(GhostScript API or PDFLibNET)
3. (Deprecated) Extract only the current frame to be viewed from the raster image
(FreeImage.Net)
4. Convert the current frame to be viewed into a System.Drawing.Image
5. Display the current frame in a PictureBox control
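Step 1 exists so that navigation never requests a page outside the document. A minimal sketch of that boundary check follows; the module and function names are hypothetical and not part of the project. The page count is assumed to come from whichever library is in use (with iTextSharp, for example, from the NumberOfPages property of a PdfReader).

```vbnet
Imports System

Module PageBoundsSketch
    ' Hypothetical helper: keeps a requested 1-based page number inside
    ' the range 1..pageCount reported by the PDF library.
    Public Function ClampPageNumber(ByVal requested As Integer, _
            ByVal pageCount As Integer) As Integer
        If pageCount < 1 Then
            Throw New ArgumentOutOfRangeException("pageCount")
        End If
        Return Math.Max(1, Math.Min(requested, pageCount))
    End Function
End Module
```

A viewer could pass the result to its rendering routine, so that "next page" on the last page simply re-displays the last page.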
Several utility classes, some written for this project and some adapted from other authors,
expose the functionality needed from the various helper libraries.
GhostScriptLib.vb (contains methods to convert PDF to TIFF for Viewing and Printing)
AFPDFLibUtil.vb (contains methods to convert PDF to System.Drawing.Image for Viewing and
Printing as well as methods to create a Bookmark TreeView)
iTextSharpUtil.vb (contains methods for getting PDF page count, converting images to
searchable PDF and for extracting PDF bookmarks into TreeNodes)
PrinterUtil.vb (contains methods for sending images to printers)
ImageUtil.vb (contains methods for image manipulation such as resize, rotation,
conversion, etc.)
TesseractOCR.vb (contains methods for Optical Character Recognition from images)
PDFViewer.vb (contains the Viewer user control)
I was tempted to move every function over to PDFLibNet (XPDF), which is faster, but after a lot
of testing I decided to use both Ghostscript and PDFLibNet. Ghostscript handles printing and
"PDF to image" conversion, and serves as a secondary renderer when XPDF cannot handle a
document. PDFLibNet handles quick PDF-to-screen rendering, searching, and bookmarks.
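The primary/secondary renderer arrangement can be sketched generically. This is only an illustration under stated assumptions: the module and function names are hypothetical, and it assumes the primary renderer signals failure by throwing an exception or returning Nothing, which may not match how the project detects an XPDF incompatibility.

```vbnet
Imports System

Module RenderFallbackSketch
    ' Hypothetical helper: tries the primary renderer first and uses the
    ' secondary renderer if the primary throws or returns Nothing.
    Public Function RenderWithFallback(Of T As Class)( _
            ByVal primary As Func(Of T), _
            ByVal secondary As Func(Of T)) As T
        Try
            Dim result As T = primary.Invoke()
            If result IsNot Nothing Then
                Return result
            End If
        Catch ex As Exception
            ' Assumption: any exception from the primary renderer means
            ' "could not render", so control falls through to the secondary.
        End Try
        Return secondary.Invoke()
    End Function
End Module
```

In a viewer, the two delegates would wrap the XPDF-based and Ghostscript-based rendering calls respectively.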
FreeImage.dll
FreeImageNET.dll
gsdll32.dll
itextsharp.dll
PDFLibNET.dll
tessnet2_32.dll
PDFView.dll
Due to file size restrictions, I could not include the Ghostscript 8.64 DLL (gsdll32.dll) in the
source code. Please download the Win32 Ghostscript 8.64 package from sourceforge.net and
place the file "gsdll32.dll" into the \PDFView\lib directory where the other DLLs already exist.
' Get the page count of the PDF document if you want to
' conditionally set properties of the PDFViewer control
' Dim PageCount As Integer = PDFViewer.PageCount(PDFFileName)
' PDFViewer displays the file as soon as the FileName property is set
' File can be a PDF or a TIFF
PDFViewer.FileName = OpenFileDialog1.FileName
Me.Controls.Add(PDFViewer)
The essential part of this solution is extracting the current frame to be viewed from a multi-frame
(or single frame) image. At first I used System.Drawing to implement it, but I found this to be
slower than C++ solutions that use DIBs (Device Independent Bitmaps) to perform graphic
conversions.
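For reference, frame extraction with System.Drawing alone can look like the sketch below. It is not the project's ImageUtil code; the module and function names are hypothetical, and it assumes the frames are stored as pages of a multi-page TIFF.

```vbnet
Imports System
Imports System.Drawing
Imports System.Drawing.Imaging

Module TiffFrameSketch
    ' Hypothetical helper: returns a copy of one frame (0-based index)
    ' from a single- or multi-frame TIFF file.
    Public Function GetFrame(ByVal path As String, _
            ByVal frameIndex As Integer) As Image
        Using source As Image = Image.FromFile(path)
            Dim frameCount As Integer = source.GetFrameCount(FrameDimension.Page)
            If frameIndex < 0 OrElse frameIndex >= frameCount Then
                Throw New ArgumentOutOfRangeException("frameIndex")
            End If
            ' Make the requested page the active frame, then copy it so the
            ' result no longer depends on the open file.
            source.SelectActiveFrame(FrameDimension.Page, frameIndex)
            Return New Bitmap(source)
        End Using
    End Function
End Module
```

Copying the frame into a new Bitmap lets the source file be closed immediately, at the cost of one extra in-memory copy per page viewed.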
I then tried FreeImage with a .NET wrapper, which gave a small speed boost.
FreeImage also has many image conversion functions that may come in handy if you
want to extend this into an editor.
I ended up implementing PDFLibNet, which gave a substantial speed boost because the number of
file I/O operations was reduced. Another streamlined routine for extracting one page from a
PDF was added to the Ghostscript utility class as well.
AFPDFLibUtil.vb
GhostScriptLib.vb
The page is loaded from the PDF file and converted to a System.Drawing.Image object.
The PictureBox is then updated with the image.
Private Function ShowImageFromFile(ByVal sFileName As String, _
        ByVal iFrameNumber As Integer, ByRef oPictureBox As PictureBox, _
        Optional ByVal XPDFDPI As Integer = 0) As Image
    oPictureBox.Invalidate()
    If mUseXPDF Then 'Use AFPDFLib (XPDF)
        If ImageUtil.IsPDF(sFileName) Then
            If XPDFDPI > 0 Then
                AFPDFLibUtil.DrawImageFromPDF(mPDFDoc, iFrameNumber + 1, _
                    oPictureBox, XPDFDPI)
            Else
                AFPDFLibUtil.DrawImageFromPDF(mPDFDoc, iFrameNumber + 1, _
                    oPictureBox)
            End If
        End If
    Else 'Use Ghostscript if PDF or use System.Drawing if TIFF
        If ImageUtil.IsPDF(sFileName) Then 'convert one frame to a tiff for viewing
            oPictureBox.Image = _
                ConvertPDF.PDFConvert.GetPageFromPDF(sFileName, iFrameNumber + 1)
        ElseIf ImageUtil.IsTiff(sFileName) Then
            oPictureBox.Image = ImageUtil.GetFrameFromTiff(sFileName, _
                iFrameNumber)
        End If
    End If
    oPictureBox.Update()
    Return oPictureBox.Image
End Function
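The optional XPDFDPI argument sets the resolution at which the page is rendered. One way to choose it is to compute the resolution that makes the page just fit the viewer window. The sketch below illustrates that arithmetic only; the names are hypothetical, and it assumes the page size is known in PDF points (1/72 inch) and the viewer size in pixels.

```vbnet
Imports System

Module FitDpiSketch
    ' Hypothetical helper: returns the largest whole-number DPI at which
    ' a page of the given size (in points) still fits inside the viewer.
    Public Function DpiToFit(ByVal pageWidthPts As Double, _
            ByVal pageHeightPts As Double, _
            ByVal viewerWidthPx As Integer, _
            ByVal viewerHeightPx As Integer) As Integer
        If pageWidthPts <= 0 OrElse pageHeightPts <= 0 Then
            Throw New ArgumentOutOfRangeException("pageWidthPts")
        End If
        ' pixels = inches * DPI and inches = points / 72,
        ' so DPI = pixels * 72 / points for each axis.
        Dim dpiX As Double = viewerWidthPx * 72.0 / pageWidthPts
        Dim dpiY As Double = viewerHeightPx * 72.0 / pageHeightPts
        ' The tighter axis decides; round down so the page never overflows.
        Return CInt(Math.Floor(Math.Min(dpiX, dpiY)))
    End Function
End Module
```

For a US Letter page (612 x 792 points) in an 850 x 1100 pixel viewer, both axes give 100, so the page would be rendered at 100 DPI.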
Points of Interest
This project was made possible by various open source libraries that others were kind enough
to distribute freely. I would like to thank all of the Ghostscript, FreeImage.NET, iTextSharp,
TessNet, and AFPDFLib (PDFLibNet) developers for their efforts.
History
19th June, 2009: 1.0 Initial release
22nd June, 2009: Updated source code to correctly scale printed pages to the Printable
Page Area of the printer that is selected
7th July, 2009: Updated source code to use AFPDFLib(XPDF) or Ghostscript for PDF
rendering
15th July, 2009: Updated source code to use PDFLibNet(XPDF ver 3.02pl3) and added
search/export options
22nd July, 2009: Added "Image to PDF" import, password prompt for encrypted PDF
files, fallback rendering to Ghostscript if XPDF fails, latest version of PDFLibNet with
various bug fixes applied, and LZW compression for "PDF to TIFF" export
20th August, 2009: Major changes:
o Added the ability to convert images into a searchable PDF (OCR is English only
for now)
o Added the ability to export a PDF to an HTML Image Viewer
o Pages are only rendered at the DPI needed to fill the Viewer window (good speed
increase)
o Rotated page settings are kept while viewing the document
o Added the ability to convert images into an encrypted PDF
o Changed bookmark tree generation to use recursion
o Multiple bug fixes (see SVN log on the repository)
5th October, 2009:
o Fixed an incorrect configuration error with PDFLibNet.dll
o Removed dependencies on FreeImage