PDF Modules

The folder Engines->Pdf in the Standard subset contains Modules that perform specific tasks for the PDF Engine 3.0:

PDF 1:1 Compare

The Module PDF 1:1 Compare allows you to compare two PDF files.

The PDF Engine 3.0 performs a visual comparison between the two files.

  • If the similarity between the two files is above a certain percentage, the TestCase passes.

  • If the similarity between the two files is below the expected value, the TestCase fails.

The Module has the following ModuleAttributes:

ModuleAttribute

Description

Reference File

Full path to the first PDF file, including the file name and extension.

Reference File Password

Password to open the first PDF file.

Target File(s)

Specify which file(s) you want to compare to the Reference File.

To compare the contents to a single target file, choose one of these options:

  • Use the file selector next to the TestStepValue field and select the file.

  • Enter the complete path to the PDF file. For example, C:\MyPDFs\file1.pdf.

To compare the contents to multiple target files, choose one of these options:

  • Use the file selector next to the TestStepValue field and select the files.

  • Enter the path to a folder which contains all target files. For example, C:\MyPDFs.

  • Enter the complete path to each target file, separated by semicolons (;). For example, C:\MyPDFs\file1.pdf;C:\MyPDF\file2.

Target File Password(s)

Password(s) for decrypting the Target File(s). To specify multiple passwords, use a semicolon (;) as a separator.

Moreover, note the following:

  • The number of passwords must match the number of PDF files you specify as Target File(s).

  • If you specify folder paths as Target File(s), the same password(s) apply to all PDF files within each specified folder.

Accuracy [%]

Specify the minimum similarity in percent between the two files.

Comparison Type

Specifies the type of comparison that you want to perform:

  • Full Text Content: Compares the entire text content of the files at once, regardless of the text position in a page. Note that only well-formed text is used for the comparison, meaning OCR text recognition is not supported.

  • Page by Page Text: Compares the entire text content of the files, where each page is compared against its corresponding page in the other file(s).

  • Page by Page Image (default): Compares the visual content of the files, where each page is compared against its corresponding page in the other file(s).

Additionally, you can tell Tosca to ignore whitespace. To do so, use the setting Ignore whitespace in text-only comparison.

Excluded Pages

Optionally, specify the pages that you want to exclude from the comparison.

To specify a page range, use a hyphen (-). To specify multiple pages or page ranges, use a semicolon (;).

Excluded Areas

Optionally, specify areas in your PDF files that you want to exclude from the comparison. The following ModuleAttributes apply for each excluded area:

  • Area to Exclude > Dimensions contains the dimension coordinates of the excluded area. For information on how to get these dimensions, see "Exclude areas from the comparison".

  • Area to Exclude > Page(s) specifies the pages of the document on which the excluded area appears. To specify a page range, use a hyphen (-). To specify multiple pages or page ranges, use a semicolon (;).

Excluded Text

Optionally, specify patterns of text that you wish to exclude from the comparison. You can use regular expressions to specify unique patterns, if needed. We recommend you use Regular Expressions 101 to verify your regular expressions.

The following ModuleAttributes apply for each excluded pattern:

In this example, you compare the text contents of the file ReferencePDF.pdf with the files Target_A.pdf and Target_B.pdf.

You expect that the files should be at least 90 percent similar.

You exclude page 2 and pages 5 to 8 from the comparison.

Compare two PDF files

The comparison result shows that the files don't have the minimum similarity that you have specified.

Consequently, your TestCase fails.

Failed PDF comparison
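Following the Excluded Pages syntax described above, the exclusion in this example (page 2 and pages 5 to 8) corresponds to the value 2;5-8. The Python sketch below only illustrates how such a value expands into individual page numbers. It is not Tosca's implementation, and the function name parse_page_spec is hypothetical.

```python
def parse_page_spec(spec: str) -> set[int]:
    """Expand a value such as "2;5-8" into a set of page numbers.

    Assumption: entries are separated by semicolons and ranges use a
    hyphen, as described for the Excluded Pages ModuleAttribute.
    """
    pages: set[int] = set()
    for part in spec.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate empty entries, e.g. a trailing semicolon
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid page range: {part!r}")
            pages.update(range(start, end + 1))  # ranges are inclusive
        else:
            pages.add(int(part))
    return pages


excluded = parse_page_spec("2;5-8")
print(sorted(excluded))  # [2, 5, 6, 7, 8]
```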

In this example, you only compare the text between two PDF files, expecting 100% similarity. Moreover, you exclude the following text patterns from the comparison:

  • The sentences Good morning, and Good afternoon,.

  • The sentence Your assigned category is:, followed by any letter between A and Z.

  • The sentence Your assigned token is:, followed by any combination of letters and numbers, followed by a line break with carriage return.

  • The word Tricentis.
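The patterns listed above could be written as regular expressions along the following lines. This is a sketch in Python's re syntax for illustration only: the source does not state which regular expression flavor Tosca uses, the spacing after each colon is an assumption, and the sample texts and the helper strip_excluded are hypothetical.

```python
import re

# One regular expression per excluded pattern from the example.
# Assumption: a colon may be followed by optional spaces or tabs.
EXCLUDED_PATTERNS = [
    r"Good (?:morning|afternoon),",
    r"Your assigned category is:[ \t]*[A-Z]",
    r"Your assigned token is:[ \t]*[A-Za-z0-9]+\r\n",
    r"\bTricentis\b",
]


def strip_excluded(text: str) -> str:
    """Remove every match of the excluded patterns from the text."""
    for pattern in EXCLUDED_PATTERNS:
        text = re.sub(pattern, "", text)
    return text


# Hypothetical page texts that differ only in the excluded parts.
reference = (
    "Good morning, Ada.\r\n"
    "Your assigned category is: B\r\n"
    "Your assigned token is: 9XK2\r\n"
    "Sent by Tricentis."
)
target = (
    "Good afternoon, Ada.\r\n"
    "Your assigned category is: Q\r\n"
    "Your assigned token is: ab77Z\r\n"
    "Sent by Tricentis."
)

print(strip_excluded(reference) == strip_excluded(target))  # True
```

Once the excluded patterns are removed, the two sample texts are identical, which is the situation in which a comparison that expects 100% similarity can still pass.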

Check PDF for Broken Links

The Module Check PDF for Broken Links lets you check for any links in a PDF file that return client error responses (400-499) or server error responses (500-599). This helps you verify whether a document contains broken links.

The Module has the following ModuleAttributes:

ModuleAttribute

Description

PDF File

Complete path to the PDF file you want to verify.

Example: C:\MyReports\YearlyReport.pdf.

Ignore Errors

Error code you want to ignore in your test.

For example, enter 403 to ignore all links with a "Forbidden" response.

This example TestCase investigates the PDF file located at C:\TestPDF.pdf for broken links. It checks for all error types except 403 errors.

Example - Checking for error responses

Once it finds the window, it verifies all links within this web page for error responses:

Link verification example in Tricentis Tosca
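The rule described in this section, where a link counts as broken if it returns a 400-499 or 500-599 response unless that code is ignored, can be sketched as follows. This is an illustrative Python sketch, not Tosca's implementation. The function is_broken, the URLs, and the status codes are hypothetical, and no network requests are made.

```python
from typing import Optional


def is_broken(status_code: int, ignored_code: Optional[int] = None) -> bool:
    """Return True for client (400-499) or server (500-599) error
    responses, unless the code equals the ignored error code."""
    if status_code == ignored_code:
        return False
    return 400 <= status_code <= 599


# Hypothetical link check results: URL -> HTTP status code.
responses = {
    "https://example.com/home": 200,
    "https://example.com/private": 403,
    "https://example.com/missing": 404,
    "https://example.com/down": 503,
}

# As in the example TestCase, 403 responses are ignored.
broken = [url for url, code in responses.items() if is_broken(code, ignored_code=403)]
print(broken)  # ['https://example.com/missing', 'https://example.com/down']
```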

Barcode Reader

The Module Barcode Reader lets you verify or buffer the values of the following barcodes and QR codes in PDF files:

  • For barcodes: Code 39, Code 93, Code 128, EAN-8, EAN-13, UPC-A, UPC-E, ITF, Industrial 2 of 5, Inverted 2 of 5, IATA, Add 2, Add 5, Matrix 2 of 5, Datalogic 2 of 5, Codabar, BCD Matrix.

  • For QR codes: QR, Micro QR, Data Matrix, PDF417, Aztec.

The Module has the following ModuleAttributes:

ModuleAttribute

Description

PDF File

Absolute path to the PDF file.

Barcode - Type

Specify the type of barcode you want to scan: 

  • All: Check for any supported barcode. This option requires additional processing and may impact performance.

  • Barcode: Check for barcodes only.

  • QR: Check for QR codes only.

Barcode - Page

Optionally, specify which pages you want to check for barcodes or QR codes. If you leave this blank, Tricentis Tosca checks all pages.

Barcode - Index

Optionally, specify the index of the barcode or QR code, to narrow down your matches.

Enter one of these values:

  • A number to indicate the <n>th barcode or QR code in your file. You can enter any positive integer.

  • first to specify the first barcode or QR code in your file.

  • last to specify the last barcode or QR code in your file.

Barcode - Value

Specify the action you want to perform:

In this example, you verify the third QR code in a PDF document called MyPDFCodes.pdf, which is located on C: drive. The expected value is 2FH8VB6PL.

Verification example of a QR code
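To make the Barcode - Index values concrete, the Python sketch below shows how a number, first, or last could select one entry from the codes found in a file. It is an illustration under assumptions, not Tosca's implementation: the function pick_by_index is hypothetical, and all decoded values except the expected value 2FH8VB6PL are invented.

```python
def pick_by_index(values: list[str], index: str) -> str:
    """Select one decoded value by a number (counting from 1),
    "first", or "last"."""
    if not values:
        raise LookupError("no barcodes or QR codes found")
    key = index.strip()
    if key == "first":
        return values[0]
    if key == "last":
        return values[-1]
    position = int(key)  # raises ValueError for non-numeric input
    if not 1 <= position <= len(values):
        raise IndexError(f"index {position} is out of range")
    return values[position - 1]


# Hypothetical decoded QR code values, in the order they were found.
qr_codes = ["A1B2C3", "ZX81Q", "2FH8VB6PL"]

print(pick_by_index(qr_codes, "3") == "2FH8VB6PL")  # True
```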

Extract Links

The Module Extract Links lets you extract and buffer links inside any PDF into an array. You can then use your buffered links in later TestSteps. Tosca automatically finds all the links, so you don't need to scan the PDF.

The Module has the following ModuleAttributes:

ModuleAttribute

Description

PDF File

Enter the full path of the PDF, including the file name and extension.

Indexes

Optionally, specify the index position of the links you want in the buffer. Index numbers start at 1. For instance, 1-3; 5 buffers the first, second, third, and fifth link.

To buffer all links, leave this field blank.

Links

Enter a name for the buffer that contains the links from the PDF.

In this example, you extract a link from a PDF and buffer it for use in later TestSteps.

  • You extract the link from a PDF file called Invoice_08008.pdf located at C:\Documents\Invoices\2024.

  • You want to buffer all links, so you leave the Indexes field blank.

  • You name your buffer Order Summary Link and select the ActionMode Buffer so that you can store this specific link for future TestSteps.

Extract a link from a PDF
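The Indexes behavior described above, where 1-3; 5 selects the first, second, third, and fifth link and a blank value selects all links, can be sketched in Python as follows. This is only an illustration of the selection rule, not Tosca's implementation. The function select_links and the URLs are hypothetical.

```python
def select_links(links: list[str], indexes: str) -> list[str]:
    """Return the links selected by a value such as "1-3; 5".

    Index numbers start at 1. A blank value selects all links.
    """
    if not indexes.strip():
        return list(links)
    selected: list[str] = []
    for part in indexes.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate empty entries
        if "-" in part:
            first_text, last_text = part.split("-", 1)
            first, last = int(first_text), int(last_text)
        else:
            first = last = int(part)
        for position in range(first, last + 1):
            if not 1 <= position <= len(links):
                raise IndexError(f"index {position} is out of range")
            selected.append(links[position - 1])
    return selected


# Hypothetical links found in a PDF, in document order.
links = [f"https://example.com/page{n}" for n in range(1, 6)]

print(select_links(links, "1-3; 5"))
```

With the five sample links, the value 1-3; 5 returns the links for page1, page2, page3, and page5, and a blank value returns all five.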