Though the digital realm has overtaken much of society, paper documents or “paper trails” remain crucial for certain market segments (for instance, legal professionals often produce paper documents that span hundreds, if not thousands, of pages).
To help organize and identify these documents, you can use different indexing methods, including Bates numbering. In this blog post, we’ll show you how to use the DevExpress PDF Document API to update sequential page numbering across PDF files, and we’ll review a couple of content-related enhancements we introduced in our most recent major release (v21.2).
For this example, we’ll use the DevExpress.com Privacy Policy with customized Bates numbering: digital reference points that uniquely identify and label each page. These reference points contain page information (the page number and the total number of pages).
When the document changes (for example, when another document is appended), page information also changes. We need to find the initial identifiers, remove them, and add new identifiers with current information throughout the document.
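To make the renumbering concrete, the sketch below formats identifiers for a document before and after an append. The format string matches the one used later in this post; the BatesIdentifier class, its Format method, and the page counts (3 and 5) are hypothetical and chosen only for illustration.

```csharp
using System;

public static class BatesIdentifier
{
    // Builds an identifier such as "Case 0001, 00002 Page 2 of 3".
    // The "Case 0001" prefix mirrors the sample document used in this post.
    public static string Format(int pageNumber, int pageCount)
    {
        if (pageNumber < 1 || pageNumber > pageCount)
            throw new ArgumentOutOfRangeException(nameof(pageNumber));
        return string.Format("Case 0001, {0:00000} Page {0} of {1}", pageNumber, pageCount);
    }

    public static void Main()
    {
        // Hypothetical: the first page of a 3-page document before the append...
        Console.WriteLine(Format(1, 3)); // Case 0001, 00001 Page 1 of 3
        // ...and the same page after 2 more pages are appended.
        Console.WriteLine(Format(1, 5)); // Case 0001, 00001 Page 1 of 5
        // A page that came from the appended document.
        Console.WriteLine(Format(4, 5)); // Case 0001, 00004 Page 4 of 5
    }
}
```

Because the total page count is part of every identifier, appending even one page makes the identifiers on all existing pages stale, which is why the old strings are removed before new ones are drawn.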
Our PDF Document API (v21.2) includes Facade API enhancements: PdfPageFacade.ClearContent method overloads allow you to clear content from one or multiple page areas and specify which content type to retain.
Call the PdfDocumentProcessor.FindText method to find the numbering string and obtain its location. Pass this location to the PdfPageFacade.ClearContent method to remove the initial identifiers. Since identifiers have the same location on each page, we can use the retrieved location for the entire document.
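As a rough illustration of retaining specific content types, the sketch below clears only text from a page area. It assumes that a ClearContent(PdfRectangle, PdfClearContentOptions) overload exists and that PdfClearContentOptions exposes ClearText, ClearGraphics, ClearImages, and ClearAnnotations flags that default to true; please verify both against the v21.2 API reference. The file names and the rectangle are hypothetical.

```csharp
using DevExpress.Pdf;

public static class ClearContentSketch
{
    public static void Main()
    {
        using (var processor = new PdfDocumentProcessor())
        {
            // Hypothetical input file
            processor.LoadDocument("Documents/Document1.pdf");

            // Assumption: a flag set to false keeps that content type in the area;
            // ClearText is left at its assumed default (true)
            var options = new PdfClearContentOptions
            {
                ClearImages = false,
                ClearGraphics = false,
                ClearAnnotations = false
            };

            // Hypothetical page area: left, bottom, right, top
            var area = new PdfRectangle(0, 0, 300, 100);

            // Remove text from the area on the first page; other content stays
            processor.DocumentFacade.Pages[0].ClearContent(area, options);

            // Hypothetical output file
            processor.SaveDocument("Documents/Document1.cleared.pdf");
        }
    }
}
```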
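// Namespaces used by the snippets in this post
using System.Collections.Generic;
using System.Drawing;
using DevExpress.Pdf;

using (var pdfDocumentProcessor = new PdfDocumentProcessor())
{
    // Load a document
    pdfDocumentProcessor.LoadDocument("Documents/Document1.pdf");
    // Remove numbering from the initial file
    RemoveNumbering(pdfDocumentProcessor);
    //…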
private static void RemoveNumbering(PdfDocumentProcessor pdfDocumentProcessor)
{
    int pageCount = pdfDocumentProcessor.Document.Pages.Count;
    // The identifier string as it appears on the first page
    string text =
        string.Format("Case 0001, 00001 Page 1 of {0}", pageCount);
    // Find the identifier string
    PdfTextSearchResults numeration = pdfDocumentProcessor.FindText(text);
    // If the string is not found, there is nothing to clear
    if (numeration.Status != PdfTextSearchStatus.Found)
    {
        return;
    }
    // Obtain the rectangle that bounds the found string
    PdfRectangle textRectangle = numeration.Rectangles[0].BoundingRectangle;
    // The identifier is located at the same position on each page
    // Clear the obtained rectangle on each page
    foreach (PdfPageFacade pageFacade in pdfDocumentProcessor.DocumentFacade.Pages)
    {
        pageFacade.ClearContent(textRectangle);
    }
}
At this point, we can append another document and add new numbering to all pages. We will use the PDF Graphics API to draw a new text string in the top right corner of each page:
    //…
    // Append another document
    pdfDocumentProcessor.AppendDocument("Documents/Document2.pdf");
    using (SolidBrush textBrush = new SolidBrush(Color.FromArgb(255, Color.SlateGray)))
    {
        // Create new numbering with current page information
        AddGraphics(pdfDocumentProcessor, textBrush);
    }
    // Save the result
    pdfDocumentProcessor.SaveDocument("Documents/DevExpress Privacy Policy.pdf");
}
static void AddGraphics(PdfDocumentProcessor processor, SolidBrush textBrush)
{
    float DrawingDpi = 72f;
    IList<PdfPage> pages = processor.Document.Pages;
    for (int i = 0; i < pages.Count; i++)
    {
        PdfPage page = pages[i];
        int pageCount = pages.Count;
        int number = page.GetPageIndex() + 1;
        // Specify identifier format
        string text =
            string.Format("Case 0001, {0:00000} Page {0} of {1}", number, pageCount);
        using (PdfGraphics graphics = processor.CreateGraphics())
        {
            // Obtain page size
            var cropBox = page.CropBox;
            SizeF pageSize = new SizeF((float)cropBox.Width, (float)cropBox.Height);
            using (Font font = new Font("Segoe UI", 14, FontStyle.Regular))
            {
                // Define identifier size
                SizeF textSize =
                    graphics.MeasureString(text, font, PdfStringFormat.GenericDefault, DrawingDpi, DrawingDpi);
                // Calculate a point to insert numbering
                PointF topRight = new PointF(pageSize.Width - textSize.Width, textSize.Height);
                // Draw a numbering string
                graphics.DrawString(text, font, textBrush, topRight);
                // Add the resulting graphics to page foreground
                graphics.AddToPageForeground(page, DrawingDpi, DrawingDpi);
            }
        }
    }
}
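The placement arithmetic in AddGraphics can be checked in isolation. The sketch below repeats the same calculation with plain numbers, assuming drawing coordinates start at the top left corner of the page and that, at 72 DPI, one drawing unit corresponds to one point. The TopRightAnchor helper, the 612-unit page width (a US Letter page at 72 DPI), and the 250 x 20 text size are hypothetical values for illustration, not measurements from the sample document.

```csharp
using System;

public static class NumberingLayout
{
    // Mirrors the calculation in AddGraphics: the string's left edge sits
    // textWidth units from the right page edge, and its top edge sits
    // textHeight units below the top of the page.
    public static (float X, float Y) TopRightAnchor(float pageWidth, float textWidth, float textHeight)
    {
        return (pageWidth - textWidth, textHeight);
    }

    public static void Main()
    {
        // Hypothetical sizes: 612-unit-wide page, 250 x 20 identifier string
        var anchor = TopRightAnchor(612f, 250f, 20f);
        Console.WriteLine($"{anchor.X}, {anchor.Y}"); // 362, 20
    }
}
```

With these numbers the string ends flush with the right page edge. Subtracting an extra margin from X would move it inward if the identifier should not touch the edge.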
Each page of the resulting document now includes an updated identifier with its current page number and the new total page count.
The Facade API is also available in our WinForms PDF Viewer and WPF PDF Viewer controls. Call the PdfViewerExtensions.GetDocumentFacade method to retrieve the DocumentFacade object. Add the DevExpress.Docs.v21.2.dll assembly to your project to use extension methods.
IMPORTANT NOTE: Please remember that you must own an Office File API or Universal Subscription to use this assembly/capability in production code. If you currently own a WinForms Subscription/WPF Subscription and if you’d like to use this new capability in your WinForms/WPF app, please contact our Client Services team (clientservices@devexpress.com). We’ll be happy to help you upgrade your subscription.
Try It Now
To explore the PDF Clear Page Content feature in greater detail, be sure to check out our new Office File API demo module.
This demo allows you to remove content from the selected page area. Use checkboxes in the Options section to specify the content to retain in the selected area.
Your Feedback Matters
We’d love to know what you think of our PDF Facade API within our Office-inspired/PDF product line. Please share your thoughts in the comment section below or submit a support ticket via the DevExpress Support Center at your convenience.