INTERMEDIATE   Ask the VB Pro



Build Strings Faster

by Karl E. Peterson

Q. Improve Page Output
I have to build hundreds of static HTML pages regularly, pulling the content from a database. Each page contains hundreds, if not thousands, of individual string elements. I know that disk operations are expensive in time spent, so I've been concatenating the page contents into one long string, then writing it out all at once. Still, this operation takes far longer than I'd like; how can I speed it up?

About This Column
Ask the VB Pro provides you with free advice on programming obstacles, techniques, and ideas. Read more answers from our crack VB pros on the Web at www.devx.com/gethelp. You can submit your questions, tips, or ideas on the site, or access a comprehensive database of previously answered questions.

A. String concatenations can be time-consuming, so it's always wise to look for alternatives when you need to build long strings. Consider what's happening when you concatenate two strings. VB needs to calculate the length of the resulting string and allocate space for its content, then copy the content of each of the source strings to this new string. If you're assigning the result to one of the original source strings—a common practice—VB needs to swap the pointers to the original and new strings, then deallocate the original string. These steps add up to a lot of memory operations.

Here's a far better approach: When you recognize the need to build a long string, allocate a buffer that holds the entire content, then insert the individual elements into their proper places. This foresight saves several memory allocations, memory copies, and possibly a deallocation for each added element. You can accomplish these steps using standard VB functions, but don't use this typical concatenation method:

Dim s As String
s = "String "
s = s & "Builder "
s = s & "Test"
Debug.Print s

Instead, allocate a buffer, and stuff it using the Mid statement:

Dim s As String
s = Space$(50)
Mid$(s, 1) = "String"
Mid$(s, 8) = "Builder"
Mid$(s, 16) = "Test"
Debug.Print Trim$(s)

This approach means more work for the programmer. You must have a fairly good idea of how large your string might grow. You must track where the end of your content lies within the buffer, so you know where to insert the next piece. You must also be sure each impending insertion won't overrun the buffer; otherwise, the Mid statement silently truncates whatever doesn't fit, or raises a runtime error if the start position lies beyond the end of the buffer. These sorts of requirements beg for class-module encapsulation.
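The bookkeeping described above could be sketched along these lines. This is only an illustration for a form or standard module: the names m_Buf, m_Pos, AppendText, and BuiltText are hypothetical, and the 1,024-character starting size and doubling growth policy are assumptions rather than recommendations from the column.

```vb
' Hypothetical helpers; names and sizes are illustrative assumptions.
Private m_Buf As String   ' preallocated buffer
Private m_Pos As Long     ' number of characters used so far

Private Sub AppendText(ByVal Text As String)
   Dim n As Long
   n = Len(Text)
   If n = 0 Then Exit Sub

   ' Allocate the buffer on first use.
   If Len(m_Buf) = 0 Then m_Buf = Space$(1024)

   ' Grow (by doubling) until the new text fits.
   Do While m_Pos + n > Len(m_Buf)
      m_Buf = m_Buf & Space$(Len(m_Buf))
   Loop

   ' Overwrite the next n characters and advance the marker.
   Mid$(m_Buf, m_Pos + 1, n) = Text
   m_Pos = m_Pos + n
End Sub

Private Function BuiltText() As String
   ' Return only the used portion. Unlike Trim$, this keeps
   ' any leading or trailing spaces that were appended.
   BuiltText = Left$(m_Buf, m_Pos)
End Function
```

Growing the buffer still costs a concatenation, but it happens only occasionally instead of once per appended element.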

Microsoft's next-generation platform, .NET, includes a native StringBuilder class that provides a good model on which you can base a classic VB class. I wrote a class called CStringBuilder, which is fun because it pushes the limits of string operations in VB, running roughly 15x (in the IDE) to 100x (compiled) faster than native concatenation (see Listing 1).

CStringBuilder uses a byte array as the memory buffer, with two bytes reserved for each character of the built string. One property not offered by the .NET class is BufferBump, which sets and returns the number of bytes by which the buffer is resized as needed. The class defaults to using a 2K buffer, resized in increments of 2K, but you might want larger initial values if you plan to build really large strings.

One good approach is to call the EnsureCapacity method after instantiating the class, passing a value larger than you anticipate you'll need. When you do this, you create an initial buffer that's a multiple of BufferBump in capacity and will later hold your final string. EnsureCapacity calls Capacity, the method that resizes the internal buffer, only if the requested value is larger than the current buffer size. You can use CStringBuilder's character-based Length property if you prefer counting characters instead of bytes.

The Append method starts by storing the number of bytes in the passed string—a value used several times within the routine. To avoid memory overruns, Append next calls EnsureCapacity to make sure the buffer will accept the additional contents. Once Append guarantees the buffer size, a call to CopyMemory slings the passed string into the buffer at the location marked by the end-of-buffer pointer. This pointer increments by the number of bytes copied, so the class always knows how much data the buffer contains. Finally, Append returns a self-reference to the caller.

CStringBuilder clients can retrieve the built string by calling the class's ToString method. ToString uses the Space function to allocate enough characters to its return value to hold the built string stored in the internal buffer. A call to CopyMemory then fills ToString's return value with the built string's bytes. Using the CStringBuilder class is straightforward:

Dim s As New CStringBuilder
s.Append "String "
s.Append "Builder "
s.Append "Test"
Debug.Print s.ToString
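Listing 1 isn't reproduced here, so the following is only a simplified sketch of how the Append and ToString mechanics described above might fit together in a class module. It is not the published CStringBuilder: the member names m_Buf, m_Used, and m_Bump are hypothetical, EnsureCapacity is assumed to take a byte count, the resize is folded into EnsureCapacity instead of a separate Capacity member, and BufferBump is omitted.

```vb
' CStringBuilder.cls: simplified sketch, not the published listing.
Option Explicit

Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
   (Destination As Any, Source As Any, ByVal Length As Long)

Private m_Buf() As Byte   ' buffer; two bytes per character
Private m_Used As Long    ' bytes of content currently stored
Private m_Bump As Long    ' growth increment, in bytes

Private Sub Class_Initialize()
   m_Bump = 2048
   ReDim m_Buf(0 To m_Bump - 1)
End Sub

Public Sub EnsureCapacity(ByVal Bytes As Long)
   Dim n As Long
   If Bytes > UBound(m_Buf) + 1 Then
      ' Round up to a whole multiple of the bump size.
      n = ((Bytes + m_Bump - 1) \ m_Bump) * m_Bump
      ReDim Preserve m_Buf(0 To n - 1)
   End If
End Sub

Public Function Append(ByRef Text As String) As CStringBuilder
   Dim cb As Long
   cb = LenB(Text)   ' byte count, reused below
   If cb > 0 Then
      EnsureCapacity m_Used + cb
      ' Copy the string's characters to the end of the content.
      CopyMemory m_Buf(m_Used), ByVal StrPtr(Text), cb
      m_Used = m_Used + cb
   End If
   Set Append = Me   ' self-reference allows chained calls
End Function

Public Function ToString() As String
   Dim s As String
   If m_Used > 0 Then
      s = Space$(m_Used \ 2)
      CopyMemory ByVal StrPtr(s), m_Buf(0), m_Used
   End If
   ToString = s
End Function

Public Property Get Length() As Long
   Length = m_Used \ 2   ' characters, not bytes
End Property
```

Because Append returns a self-reference, calls can also be chained, as in s.Append("String ").Append("Builder ").Append("Test"). The sketch copies through a local string in ToString for clarity, at the cost of one extra string assignment.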

You can find the code for the full CStringBuilder class on VBPJ's Web site. Insert and Remove operations are much trickier than Append; you must use a double buffer to achieve the best performance. Each call to one of these methods rebuilds the string within a scratch buffer, then exchanges the pointers to the two buffers to avoid the penalty for performing overlapped memory copies within a single buffer.

Q. Sink Notifications From Arrays of Objects, Part II
In August 2000, a reader wrote:

I've written some classes that raise events back to their clients. I'd like to use arrays of these classes, but VB won't allow that together with WithEvents:

Dim WithEvents MyClass() As Class1

I need to find a strategy for listening to events from many objects. Any ideas?

A. My proposed solution then called for the parent to implement a custom interface and for the client to call this interface directly, as needed. The drawback of that approach is the circular reference it creates, which requires the parent to tear down each client explicitly. Grant Wickman wrote to propose a more elegant solution. I liked it, and he graciously offered permission for me to publish it.

This solution revolves around a single intermediate notification object, which the parent declares using WithEvents, creates at startup, and passes to each object in the array. The code within the notification object is drop-dead simple:

Option Explicit

Public Event Notify(Source As Object)

Public Sub CallNotify(Source As Object)
   RaiseEvent Notify(Source)
End Sub

Assume you need to create an array of many class objects and need to sink events from each. As part of each object's initialization, pass the reference to the common notification object. You'll also want a means to uniquely identify each object. I suggest adding a Tag property to your MyObject class:

Private WithEvents objNotify As CNotify
Private obj(0 To 5) As MyObject

Private Sub Form_Load()
   Dim i As Long
   ' Create array of callback objects.
   Set objNotify = New CNotify
   For i = 0 To 5
      Set obj(i) = New MyObject
      Set obj(i).Notify = objNotify
      obj(i).Tag = i
   Next i
End Sub

To notify its parent, the event-raising object needs only to call the CallNotify method of the CNotify object it was handed, passing a reference to itself. Because the parent receives that reference as the event's Source argument, it's free to examine the properties of the event-raising object to ascertain the purpose of the call. You might need to force the notification, for example when the class needs to let its parent know its TestCallback method has been invoked (see Listing 2). In this case, the TestCallback routine tests to make sure a notification object has been assigned, then invokes its CallNotify method. The parent object sinks this event and responds appropriately:

Private Sub objNotify_Notify(Source As Object)

   ' Use the Tag property as you would
   ' a control array Index parameter.

   MsgBox "Object(" & Source.Tag & _
      ") calling back!"
End Sub
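Listing 2 isn't reproduced here, so the event-raising side might look something like the following sketch. Only the member names Notify, Tag, and TestCallback come from the code and text above; the private variable names and the choice of Long for Tag are assumptions made for illustration.

```vb
' MyObject.cls: hypothetical sketch, not the published Listing 2.
Option Explicit

Private m_Notify As CNotify   ' shared notification object
Private m_Tag As Long         ' identifies this instance

Public Property Set Notify(ByVal NewNotify As CNotify)
   Set m_Notify = NewNotify
End Property

Public Property Get Tag() As Long
   Tag = m_Tag
End Property

Public Property Let Tag(ByVal NewTag As Long)
   m_Tag = NewTag
End Property

Public Sub TestCallback()
   ' Notify only if a notification object was assigned.
   If Not m_Notify Is Nothing Then
      m_Notify.CallNotify Me
   End If
End Sub
```

In this arrangement each MyObject holds a reference to the shared CNotify object, while CNotify holds no references of its own, so no reference cycle forms between the parent and the objects in its array.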

The scheme's advantage is that it creates no circular references, which in turn means the parent has no need to perform an explicit tear-down when it's through with the array. The overhead is slightly higher than with a strictly interface-based approach, but, for non-time-critical applications, the robustness more than makes up for that.




Karl E. Peterson is a GIS analyst with a regional transportation-planning agency and serves as a member of the Visual Basic Programmer's Journal Technical Review and Editorial Advisory Boards. Online, he's a Microsoft MVP and a section leader on several VBPJ forums. Find more of Karl's VB samples at www.mvps.org/vb.

 

Updated samples based on this article:
ObjArrays
StrBldr


• Ask the VB Pro, "Listen to Objects With Interface," by Karl E. Peterson [VBPJ August 2000]
• StringBuilder Class, .NET Framework Class Library