Merging Files with Visual Basic

Below is an example of how multiple PDF documents, each containing a different number of pages, can be merged into one file, with all the first pages together, followed by all the second pages, and so on. The example relies on the iTextSharp library.

First, the folder path is set and checked to confirm that it exists, followed by a second check that the folder contains files. Each file is then opened in turn to find its page count, which is stored in a sorted dictionary keyed by file name; because the dictionary is sorted, files are merged in file-name order. Files without a ‘.pdf’ extension, or with a ‘~’ in their path, are ignored, and any PDF that cannot be opened is reported and skipped. While doing this, the largest page count found in any single file is recorded.

The merge only goes ahead if at least two PDFs were found. For each page number from one up to that maximum, the dictionary is walked and the page is copied from every file that has at least that many pages, which is what allows PDFs of different lengths to be merged. Finally, the combined document is saved as combined.pdf in the same folder and a confirmation message states how many files were merged.
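
The page ordering described above can be tried out without any PDFs at all. The following is a minimal sketch that uses only plain collections; the module name PageOrderSketch, the function MergeOrder and the file names are hypothetical and are there purely for illustration.

```vbnet
Imports System
Imports System.Collections.Generic

Module PageOrderSketch

    ' Returns "file:page" labels in merged order:
    ' every page 1 first, then every page 2, and so on.
    Function MergeOrder(pageCounts As SortedDictionary(Of String, Integer)) As List(Of String)

        ' Find the largest page count across all files.
        Dim maxPages As Integer = 0
        For Each entry As KeyValuePair(Of String, Integer) In pageCounts
            maxPages = Math.Max(maxPages, entry.Value)
        Next

        Dim order As New List(Of String)
        For pageIndex As Integer = 1 To maxPages
            For Each entry As KeyValuePair(Of String, Integer) In pageCounts
                ' Skip files that are too short to have this page.
                If pageIndex <= entry.Value Then
                    order.Add(entry.Key & ":" & pageIndex.ToString())
                End If
            Next
        Next

        Return order

    End Function

    Sub Main()

        ' Hypothetical file names and page counts, for illustration only.
        Dim pageCounts As New SortedDictionary(Of String, Integer)
        pageCounts.Add("a.pdf", 3)
        pageCounts.Add("b.pdf", 1)
        pageCounts.Add("c.pdf", 2)

        ' Prints: a.pdf:1, b.pdf:1, c.pdf:1, a.pdf:2, c.pdf:2, a.pdf:3
        Console.WriteLine(String.Join(", ", MergeOrder(pageCounts)))

    End Sub

End Module
```

The full listing applies the same two nested loops, copying a real PDF page where this sketch only records a label.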

Imports System
Imports System.Collections.Generic
Imports System.IO
Imports iTextSharp.text
Imports iTextSharp.text.pdf

Public Class MergingFiles

    Public Shared Sub Main()

        ' File path.
        Dim filePath As String = "C:\Demo"

        ' File extension.
        Dim fileExtension As String

        ' Check if the file path exists.
        If Directory.Exists(filePath) Then

            ' Return the names of the files at the specified path.
            Dim dirFiles As String() = Directory.GetFiles(filePath)

            ' Check if there are any files at the path.
            If dirFiles.Length = 0 Then

                ' Message stating there are no files to merge.
                Console.WriteLine("There are no files to merge.")

            Else

                ' Source PDF.
                Dim pdfReader As PdfReader

                ' Maximum number of pages.
                Dim maxPages As Integer = 0

                ' Files to process with number of pages.
                Dim filesToProcess As New SortedDictionary(Of String, Integer)

                ' Process the files at the path.
                For Each dirFile As String In dirFiles

                    ' Extract the file extension from the name.
                    fileExtension = Path.GetExtension(dirFile)

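                    ' Check if the file is a PDF file (ignoring case), skipping any
                    ' path that contains "~" and the output of a previous run.
                    If String.Equals(fileExtension, ".pdf", StringComparison.OrdinalIgnoreCase) _
                        AndAlso Not dirFile.Contains("~") _
                        AndAlso Not String.Equals(Path.GetFileName(dirFile), "combined.pdf",
                                                  StringComparison.OrdinalIgnoreCase) Then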

                        Try

                            ' Assign the current PDF.
                            pdfReader = New PdfReader(dirFile)

                            ' Assign the number of pages to the maximum if greater
                            ' than current value.
                            If pdfReader.NumberOfPages > maxPages Then

                                maxPages = pdfReader.NumberOfPages

                            End If

                            ' Add the file information to the sorted dictionary.
                            filesToProcess.Add(dirFile, pdfReader.NumberOfPages)

                            ' Close the PDF.
                            pdfReader.Close()

                        Catch ex As Exception

                            ' Message confirming the file could not be merged.
                            Console.WriteLine("The file ""{0}"" cannot be merged: {1}",
                                    dirFile, ex.Message)

                        End Try

                    End If

                Next

                ' If there are at least two PDFs to merge, process them.
                If maxPages > 0 AndAlso filesToProcess.Count > 1 Then

                    ' Memory output stream.
                    Using output As New MemoryStream()

                        ' Create and open new document.
                        Dim document As New Document()
                        Dim writer As New PdfSmartCopy(document, output)
                        document.Open()

                        ' Extracted page.
                        Dim page As PdfImportedPage

                        Try

                            ' Process PDF files up to the maximum number of pages.
                            For pageIndex As Integer = 1 To maxPages

                                ' Add the desired page from each PDF to the new PDF.
                                For Each entry As KeyValuePair(Of String, Integer) In filesToProcess

                                    ' Check if current file has the desired page to merge.
                                    If pageIndex <= entry.Value Then

                                        ' Assign the current PDF to a reader object.
                                        pdfReader = New PdfReader(entry.Key)

                                        ' Extract the desired page.
                                        page = writer.GetImportedPage(pdfReader, pageIndex)

                                        ' Add the extracted page to the combined PDF.
                                        writer.AddPage(page)

                                        ' Close the reader now that its page has been copied.
                                        pdfReader.Close()

                                    End If

                                Next

                            Next

                            ' Close the document and save the new combined PDF.
                            document.Close()
                            File.WriteAllBytes(Path.Combine(filePath, "combined.pdf"), output.ToArray())

                            ' Feedback that file merge has been successful.
                            Console.WriteLine("{0} files merged successfully.",
                                filesToProcess.Count.ToString())

                        Catch ex As Exception

                            ' Display a message stating the merge was unsuccessful and why.
                            Console.WriteLine("The file merge was unsuccessful: {0}", ex.Message)

                        End Try

                    End Using

                Else

                    ' Display a message stating there are too few PDFs to merge.
                    Console.WriteLine("There are fewer than two PDF files to merge.")

                End If

            End If

        Else

            ' Display a message stating file path does not exist.
            Console.WriteLine("File path does not exist.")

        End If

        ' Force console window to stay open until a key is pressed.
        Console.ReadKey()

    End Sub

End Class