While working on how Documenity displays data in the main view, I ran into an interesting problem. A lot of the documents I deal with are very large, so the application tended to stall when adding files to the interface. Although this doesn’t cause any functional problems, it does make the program feel clunky and annoying for the end user, so I decided to look into the issue more closely.

Even with very long file lists, the document collection was being updated quickly enough to avoid interface freezing, so the problem had to be with how the view was generating data and rendering it to the screen.

The big performance bottleneck in the process was calculating the word count, image count and similar statistics. The IDocument interface defines these values as properties, so the challenge was to design a property getter that returned quickly and allowed a long-running process to gradually update the value as it parsed through the document’s pages.

Here is the pattern I settled on:

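using System.Numerics;
using System.Threading.Tasks;

public sealed class PdfDocument : IDocument
{
    // Tracks whether the background count has been started or finished.
    private enum CountStatus
    {
        Uncounted,
        Counting,
        Counted
    }

    // Backing fields. document, CountImages and OnPropertyChanged are
    // defined elsewhere in the class and omitted from this excerpt.
    private CountStatus imagesCounted = CountStatus.Uncounted;
    private BigInteger imageCount;

    public BigInteger ImageCount
    {
        get
        {
            if (imagesCounted == CountStatus.Uncounted)
            {
                // Mark the count as started so it is only launched once.
                imagesCounted = CountStatus.Counting;
                Task.Factory.StartNew(() =>
                {
                    ImageCount = 0;
                    foreach (var page in document)
                    {
                        // Accumulate the total; every assignment goes
                        // through the setter, which notifies listeners.
                        ImageCount += CountImages(page);
                    }
                    imagesCounted = CountStatus.Counted;
                });
            }
            // Return immediately with whatever has been counted so far.
            return imageCount;
        }
        private set
        {
            imageCount = value;
            OnPropertyChanged("ImageCount");
        }
    }
}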

In this code, I use the Task Parallel Library from .NET 4 to launch a background job that does the actual counting. The CountStatus enumerated type prevents the process from running more than once and the property setter handles notification of the changes.
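One caveat worth noting: checking the status and then setting it are two separate steps, so if the getter were ever called from two threads at the same moment, both could see Uncounted and start a count. When the getter is only read from the UI thread this should not come up. If it does matter, a minimal sketch of an atomic guard using Interlocked.CompareExchange might look like the following. LazyImageCounter, its integer state constants, the int[] page data, and the Completion and Launches members are all hypothetical stand-ins for illustration, not Documenity code.

```csharp
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a document class. The "pages" are just an
// array holding the number of images on each page.
public sealed class LazyImageCounter
{
    private const int Uncounted = 0;
    private const int Counting = 1;
    private const int Counted = 2;

    private readonly int[] imagesPerPage;
    private readonly TaskCompletionSource<long> done =
        new TaskCompletionSource<long>();
    private int status = Uncounted;
    private long imageCount;
    private int launches;

    public LazyImageCounter(int[] imagesPerPage)
    {
        this.imagesPerPage = imagesPerPage;
    }

    // Completes with the final total once every page has been counted.
    public Task<long> Completion { get { return done.Task; } }

    // How many background jobs were started; should never exceed one.
    public int Launches { get { return Volatile.Read(ref launches); } }

    public long ImageCount
    {
        get
        {
            // Atomically switch Uncounted -> Counting. Only the caller that
            // observes the old value Uncounted starts the background job.
            if (Interlocked.CompareExchange(ref status, Counting, Uncounted) == Uncounted)
            {
                Interlocked.Increment(ref launches);
                Task.Factory.StartNew(() =>
                {
                    foreach (int images in imagesPerPage)
                    {
                        Interlocked.Add(ref imageCount, images);
                    }
                    Volatile.Write(ref status, Counted);
                    done.SetResult(Interlocked.Read(ref imageCount));
                });
            }
            // Return straight away with the running total.
            return Interlocked.Read(ref imageCount);
        }
    }
}
```

With this guard, reading ImageCount from many threads at once (for example inside a Parallel.For loop) should still start only one background job, and Completion.Result gives the final total once counting finishes. The trade-off is that the state lives in a plain int rather than a readable enum field, because Interlocked.CompareExchange does not work on enum fields directly.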

Because the setter raises a change notification on every assignment, the document collection can emit update events as the pages are iterated over, so the numbers in the user interface update dynamically without interrupting the user. The program is a lot less frustrating to use as a result.
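To illustrate the notification side, here is a minimal sketch of how a property setter can feed the standard INotifyPropertyChanged event while a background task counts page by page. NotifyingDocument, CountImagesAsync and the int[] page data are hypothetical names made up for this example, and the OnPropertyChanged body is an assumption about what such a helper typically does, since the original excerpt does not show it.

```csharp
using System.ComponentModel;
using System.Threading.Tasks;

// Hypothetical minimal document; only the notification plumbing is shown.
public sealed class NotifyingDocument : INotifyPropertyChanged
{
    private readonly int[] imagesPerPage;
    private int imageCount;

    public event PropertyChangedEventHandler PropertyChanged;

    public NotifyingDocument(int[] imagesPerPage)
    {
        this.imagesPerPage = imagesPerPage;
    }

    public int ImageCount
    {
        get { return imageCount; }
        private set
        {
            imageCount = value;
            OnPropertyChanged("ImageCount");
        }
    }

    // Counts on a background thread, notifying listeners after each page.
    public Task CountImagesAsync()
    {
        return Task.Factory.StartNew(() =>
        {
            ImageCount = 0;
            foreach (int images in imagesPerPage)
            {
                ImageCount += images;
            }
        });
    }

    private void OnPropertyChanged(string propertyName)
    {
        // Copy the delegate so a concurrent unsubscribe cannot null it
        // between the check and the call.
        PropertyChangedEventHandler handler = PropertyChanged;
        if (handler != null)
        {
            handler(this, new PropertyChangedEventArgs(propertyName));
        }
    }
}
```

A listener that subscribes to PropertyChanged and records ImageCount on each event would see the running totals 0, 2, 3, 7 for pages containing 2, 1 and 4 images. Note that in this sketch the handlers run on the background thread, so a view that touches UI controls directly would need to marshal back to the UI thread unless its binding framework already does so.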

Documenity still has some way to go before the user interface is finished. There are now a few too many buttons in the GUI, which makes the software a little clunky to use, and one or two basic design decisions still need revisiting. All in good time.