EXPERT  Black Belt Programming


Modify a Variable's Pointer

Minimizing memory requirements is only one benefit of knowing how to change a variable's pointer.

by Bill McCarthy

    Visual C++ programmers are pretty familiar with using pointers. Unfortunately, you have to do a bit more work to use pointers in Visual Basic, but the benefits remain—minimizing memory requirements and allowing simultaneous views of the same data. You can also map different types of intrinsic variables to the same block of data—for example, an array of Longs to an array of Large Integer structures, or an array of Bytes to a variable length string—and have the changes in one variable be reflected immediately in the other, because they both point to the same block of data.

What you need:
Visual Basic 5.0 or 6.0 (Professional or Enterprise Edition for the sample code)

To modify a variable's pointer, you have to read and write to the memory address where that variable's pointer is stored, and, most importantly, you have to get that address to start with. For most of VB's intrinsic variable types, the memory address you retrieve is the address of the data, not a pointer to the data. The four obvious exceptions are variants, objects, strings, and arrays. A variant can contain either a pointer or the data itself, so working with variants is, as usual, full of pitfalls. You can already point one object at another by using a simple assignment. That leaves two types of variables that show some promise: strings and arrays.

Visual Basic uses COM Automation for variable-length strings and for arrays. VB's strings are Basic strings (BSTRs), and arrays are SafeArrays. You can find out more about BSTRs and SafeArrays in the Platform SDK (see Resources).

 
Figure 1. VarPtr Returns Address of Pointer.

A VB string variable is four bytes long and holds a pointer to the string data (see Figure 1). The four bytes preceding the string data contain the string length in bytes. The string is Unicode, so each character occupies two bytes. At the end of the string are another two bytes, containing zeros, that are there more for compatibility with null-terminated C strings than anything else. VB does not seem to use this null terminator internally, but rather relies on the four bytes preceding the string to determine its length.
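The length prefix described above can be checked directly. The following is a minimal sketch for a standard module, assuming the usual CopyMemory declaration (an alias for RtlMoveMemory in kernel32); the names StringLengthPrefix and DemoLengthPrefix are made up for this illustration and are not part of the sample code.

```vb
Option Explicit

Private Declare Sub CopyMemory Lib "kernel32" _
   Alias "RtlMoveMemory" (Destination As Any, _
   Source As Any, ByVal Length As Long)

' Returns the byte count stored in the four bytes
' immediately before the string data.
' (Hypothetical helper, for illustration only.)
Public Function StringLengthPrefix(s As String) As Long
   Dim pData As Long
   Dim cb As Long
   pData = StrPtr(s)
   ' A null string (vbNullString) has no data block to read
   If pData <> 0 Then
      CopyMemory cb, ByVal pData - 4, 4&
   End If
   StringLengthPrefix = cb
End Function

Public Sub DemoLengthPrefix()
   Dim MyString As String
   MyString = "Hello"
   ' Five Unicode characters at two bytes each
   Debug.Print StringLengthPrefix(MyString)   ' 10
   Debug.Print LenB(MyString)                 ' 10
End Sub
```

The sketch only reads the prefix. Writing to it is possible too, but the original value then has to be restored before the string goes out of scope.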

The main requirement for pointing a string to another block of memory is that the four bytes immediately preceding the new pointer address contain the length of the string. If you create a file for use as a memory-mapped file, you need to enter the string length before each string. You can also modify the string length, without having to create a new string, by changing those preceding four bytes. But if you don't restore the original length, VB doesn't clean up after itself properly when the string goes out of scope; instead, you have to wait for your application to finish and then Windows tries to clean up.

You need to use the CopyMemory API and the VarPtr function to change the pointer of a string. The StrPtr function returns the pointer to the string. This pointer is stored at the VarPtr address, so you can read from and write to the pointer to the string:

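' CopyMemory is the usual alias for RtlMoveMemory
Private Declare Sub CopyMemory Lib "kernel32" _
   Alias "RtlMoveMemory" (Destination As Any, _
   Source As Any, ByVal Length As Long)

' get the string pointer
lngStrPtr = StrPtr(MyString)
' or like this
CopyMemory lngStrPtr, ByVal VarPtr(MyString), 4&
' change the string pointer
CopyMemory ByVal VarPtr(MyString), newStrPtr, 4&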

However, working with string pointers presents several major problems. You can't lock the string, so you have to code carefully to ensure you don't release the pointer. VB creates a new string and releases the old string every time you modify a string's length, or append another string to it. As soon as VB does this, you lose your pointer. So, with the exception of the Mid statement (not the Mid function), it's safe only to read from your string, not to write to it.

 
Figure 2. VarPtrArray Returns Address of Array.

But you can do a few nice tricks with arrays of strings. The array's data is a series of Longs, each holding the pointer that StrPtr would return for that element. This means you can sort an array of strings by shuffling the order of the pointers, without the overhead usually involved in moving the strings themselves. VB has to allocate a new string, copy the data there, then release the old string every time you assign a new string to a string variable. By the time you finish sorting an array of strings by moving the strings around, chances are your strings will be scattered around in memory instead of sitting in nice, neat, consecutive blocks. Among the strings will probably be little blocks of memory you can't assign to any of the strings; in other words, memory fragmentation. Sample2 and Sample5 in the sample code demonstrate using pointers to sort an array of strings (download the sample code for this article).
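The basic move behind such a sort is a swap of two four-byte pointers. The sketch below shows only that step, not the sort in Sample2 or Sample5; SwapStringPointers and DemoSwap are hypothetical names, and the CopyMemory declaration is the usual RtlMoveMemory alias.

```vb
Option Explicit

Private Declare Sub CopyMemory Lib "kernel32" _
   Alias "RtlMoveMemory" (Destination As Any, _
   Source As Any, ByVal Length As Long)

' Exchanges two elements of a string array by swapping
' their four-byte pointers; no string data is copied.
' (Hypothetical helper, for illustration only.)
Public Sub SwapStringPointers(arr() As String, _
      ByVal i As Long, ByVal j As Long)
   Dim tmp As Long
   CopyMemory tmp, ByVal VarPtr(arr(i)), 4&
   CopyMemory ByVal VarPtr(arr(i)), _
      ByVal VarPtr(arr(j)), 4&
   CopyMemory ByVal VarPtr(arr(j)), tmp, 4&
End Sub

Public Sub DemoSwap()
   Dim names(0 To 1) As String
   names(0) = "pear"
   names(1) = "apple"
   SwapStringPointers names, 0, 1
   Debug.Print names(0) & " " & names(1)   ' apple pear
End Sub
```

Because both pointers still belong to elements of the same array after the swap, VB releases each string exactly once when the array is destroyed.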

Using SafeArrays
Arrays in VB are OLE SafeArrays. The memory address of an array variable in VB is a pointer to a pointer to a SafeArray structure (ppSA) (see Figure 2). The pointer to the SafeArray structure (pSA) resides at the memory address specified by the ppSA value, and the SafeArray structure in turn contains a pointer to the data (pvData). You can change either the ppSA or the pSA values to point your array to another array. Alternatively, you can change the value stored in the pvData member of the SafeArray structure to point to another block of data. You can also change the other members of the SafeArray structure and the members of the SafeArrayBounds structures, but before you do any of that, you need to get the pointer to the SafeArray (pSA).

The VarPtr function can give you the ppSA, but it won't take an array as an argument. To get around this restriction, you need to declare the function this way:

Private Declare Function VarPtrArray _
   Lib "msvbvm60.dll" Alias "VarPtr" _
   (Var() As Any) As Long

For VB5, you need to replace the msvbvm60.dll with msvbvm50.dll. Also, the VarPtrArray function doesn't work with a variant array or an array of strings.
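As a rough sketch of how the declaration is used, the following reads pSA from the address VarPtrArray returns and then reads cDims, the first two bytes of the structure. GetSafeArrayPtr and DemoSafeArrayPtr are hypothetical names, and the sketch assumes an array of Longs.

```vb
Option Explicit

Private Declare Sub CopyMemory Lib "kernel32" _
   Alias "RtlMoveMemory" (Destination As Any, _
   Source As Any, ByVal Length As Long)

' For VB5, use msvbvm50.dll instead
Private Declare Function VarPtrArray _
   Lib "msvbvm60.dll" Alias "VarPtr" _
   (Var() As Any) As Long

' Returns pSA for an array of Longs, or 0 if a dynamic
' array has not been dimensioned yet.
' (Hypothetical helper, for illustration only.)
Public Function GetSafeArrayPtr(arr() As Long) As Long
   Dim ppSA As Long
   Dim pSA As Long
   ppSA = VarPtrArray(arr)          ' address of the variable
   CopyMemory pSA, ByVal ppSA, 4&   ' pointer stored there
   GetSafeArrayPtr = pSA
End Function

Public Sub DemoSafeArrayPtr()
   Dim x() As Long
   Dim cDims As Integer
   Debug.Print GetSafeArrayPtr(x)   ' 0 before ReDim
   ReDim x(0 To 9)
   ' cDims is the first member of the SafeArray structure
   CopyMemory cDims, ByVal GetSafeArrayPtr(x), 2&
   Debug.Print cDims                ' 1
End Sub
```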

An alternative method: Pass the array to a function as a variant. This method allows you to work with all array types, including a variant array or an array of strings. At an offset of eight bytes from the variant's base address (VarPtr) is the ppSA if it's a normal array, or the pSA if it's a variant array. The AttachToArray method of the clsSAFEARRAY in the sample code checks the VarType of the Variant and obtains the pSA accordingly (see Listing A; see the sidebar, "About clsSAFEARRAY"). The AttachToArray method can also accept a Long value that is the ppSA. To avoid problems associated with trying to pass an array of UDTs to the class module, pass the ppSA value returned from the VarPtrArray function instead.

About clsSAFEARRAY
In the class clsSAFEARRAY, you can see that no CopyMemory functions are used in the Property Let and Property Get statements. This is because the variables used inside the class have been pointed to the array's SafeArray structure. Therefore, changing their values changes the attached array's properties.

The SafeArray structure is 16 bytes long and is followed by a series of SafeArrayBounds structures, one for each dimension:

Type SAFEARRAY
   ' Count of dimensions
   cDims As Integer
   ' Flags
   fFeatures As Integer
   ' Size of an element
   cbElements As Long
   ' Number of locks
   cLocks As Long
   ' Pointer to data
   pvData As Long
End Type

Type SAFEARRAYBOUND
   cElements As Long
   lLbound As Long
End Type

cDims holds the number of dimensions. You can decrease the number of dimensions safely, but increasing the number of dimensions requires more SafeArrayBounds structures to follow the SafeArray structure. Although you might be able to write to the extra eight bytes required for each additional SafeArrayBounds structure, and hence effect the change, you can't be sure that something else isn't using that memory. So, it's probably better to dimension the array with the maximum number of dimensions you'll need to start with. For example, you might want to change the number of dimensions when you are working with matrices or with grids (see Listing B).

The fFeatures member of the SafeArray structure describes the attributes of the array, such as whether it's allocated statically or whether it's of fixed size. The sample SafeArray class contains more descriptions of the features flags. You can try changing these flags, but probably the only difference you'll see is in the error messages.

cbElements specifies the size of each element in the array. You can use this member to overlap elements or to provide padding between elements. However, whether it has any effect depends on the data type and on the compile options you set when building your application.

cLocks specifies the number of locks on the array. When the value is greater than zero and you try to ReDim or Erase the array, a runtime error occurs. In my sample SafeArray class, the lock value is incremented by one when the class is attached to an array to ensure that the pointer stays the same. But be warned: If you have your application compiled to native code, and your array goes out of scope while the cLocks value is one or greater, your application will hang.

The pvData member of the SafeArray structure is the pointer to the actual data block. This member lets you point your array at another block of data, such as another array's data or a memory-mapped file. You could use this pointer to have two variables point to the same block of data. Sample1 of the sample code demonstrates a simple example of having two arrays pointing to the same data.
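A minimal sketch of two arrays sharing one data block follows. It is not the code from Sample1: it uses CopyMemory directly instead of clsSAFEARRAY, the names DemoSharedData and PVDATA_OFFSET are made up for the illustration, and it relies on pvData sitting 12 bytes into the 16-byte structure. Note that the original pvData is put back before either array goes out of scope.

```vb
Option Explicit

Private Declare Sub CopyMemory Lib "kernel32" _
   Alias "RtlMoveMemory" (Destination As Any, _
   Source As Any, ByVal Length As Long)

' For VB5, use msvbvm50.dll instead
Private Declare Function VarPtrArray _
   Lib "msvbvm60.dll" Alias "VarPtr" _
   (Var() As Any) As Long

' cDims (2) + fFeatures (2) + cbElements (4) + cLocks (4)
Private Const PVDATA_OFFSET As Long = 12

' Hypothetical demo: returns the value read through b
' after writing through a.
Public Function DemoSharedData() As Long
   Dim a() As Long, b() As Long
   Dim pSAa As Long, pSAb As Long
   Dim pvA As Long, pvOriginalB As Long

   ReDim a(0 To 3)
   ReDim b(0 To 3)

   CopyMemory pSAa, ByVal VarPtrArray(a), 4&
   CopyMemory pSAb, ByVal VarPtrArray(b), 4&
   CopyMemory pvA, ByVal pSAa + PVDATA_OFFSET, 4&
   ' Remember where b's own data lives
   CopyMemory pvOriginalB, ByVal pSAb + PVDATA_OFFSET, 4&

   ' Point b at a's data
   CopyMemory ByVal pSAb + PVDATA_OFFSET, pvA, 4&

   a(2) = 42
   DemoSharedData = b(2)   ' 42, read from a's data block

   ' Restore b's pointer so VB can clean up both arrays
   CopyMemory ByVal pSAb + PVDATA_OFFSET, pvOriginalB, 4&
End Function
```

Both arrays have the same bounds and element type here, so b never indexes past the end of a's data.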

Following the pvData member is the series of SafeArrayBounds structures, which specify the lower bound (lLbound) and count of elements (cElements) for each dimension. In VB, the UBound function returns cElements plus lLbound minus one (cElements + lLbound - 1) for the given dimension. That's why a Variant array (or ParamArray) with no elements returns minus one for its UBound: zero elements with a lower bound of zero gives 0 + 0 - 1.
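The UBound relationship can be checked by reading the first bounds structure, which starts right after the 16-byte SafeArray structure. This sketch handles a one-dimensional array of Longs only; UBoundFromDescriptor is a hypothetical name, and raising error 9 for an undimensioned array is simply a choice made for the illustration.

```vb
Option Explicit

Private Declare Sub CopyMemory Lib "kernel32" _
   Alias "RtlMoveMemory" (Destination As Any, _
   Source As Any, ByVal Length As Long)

' For VB5, use msvbvm50.dll instead
Private Declare Function VarPtrArray _
   Lib "msvbvm60.dll" Alias "VarPtr" _
   (Var() As Any) As Long

Private Type SAFEARRAYBOUND
   cElements As Long
   lLbound As Long
End Type

' Computes the upper bound of a one-dimensional array
' of Longs from its bounds structure.
' (Hypothetical helper, for illustration only.)
Public Function UBoundFromDescriptor(arr() As Long) As Long
   Dim pSA As Long
   Dim bound As SAFEARRAYBOUND
   CopyMemory pSA, ByVal VarPtrArray(arr), 4&
   ' Assumption: treat an undimensioned array as
   ' "Subscript out of range"
   If pSA = 0 Then Err.Raise 9
   ' The first bounds structure follows the 16-byte header
   CopyMemory bound, ByVal pSA + 16, 8&
   UBoundFromDescriptor = _
      bound.cElements + bound.lLbound - 1
End Function

Public Sub DemoBounds()
   Dim x() As Long
   ReDim x(3 To 7)
   ' cElements = 5, lLbound = 3
   Debug.Print UBoundFromDescriptor(x)   ' 7
   Debug.Print UBound(x)                 ' 7
End Sub
```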

Employing Compiler Options
You get different results when modifying the SafeArray structure depending on the compiler options you set. If you compile to p-code, or turn on the "Remove Array Bounds Checks" option in the advanced optimizations settings, then you can happily increase the number of elements in any of the dimensions of a fixed array. If you don't compile with either of those two settings, then you get a runtime error 10 if you try to increase the number of elements in a fixed array.

Quiz
If you have an array of Longs, X(i), and a Long called Y, when does this code return False?
Y = X(i)
If Hex$(Y) <> Hex$(X(i)) Then
   ' return False
End If

You're probably thinking that's no big deal. After all, compiling to native code and turning off the array bounds checking gives you faster code in most cases. But what happens when you try to change the padding between elements? Strange things, that's what, and here's the answer to the quiz (see the sidebar, "Quiz").

In Sample3 of the accompanying sample code, the array of Longs has its element size member changed to 2, instead of 4:

clsSAX.AttachToArray X
clsSAX.SizeofElements = 2&

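With the original data set at (0, 1, 2, 3, 4), the values for the array after changing the size of the elements should be (0, &H10000, 1, &H20000, 2) if that change works. Each odd-numbered element straddles two of the original Longs: it takes the two high bytes of one value (zeros) as its low word and the two low bytes of the next value as its high word.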
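The expected values can be worked out without touching cbElements at all, by reading four bytes at two-byte steps through the same data. The sketch below does only that arithmetic check; ReadWithStride and DemoStride are hypothetical names and are not part of Sample3.

```vb
Option Explicit

Private Declare Sub CopyMemory Lib "kernel32" _
   Alias "RtlMoveMemory" (Destination As Any, _
   Source As Any, ByVal Length As Long)

' Reads a Long starting index * stride bytes into the
' array's data. The caller must keep the read inside
' the data block.
' (Hypothetical helper, for illustration only.)
Public Function ReadWithStride(x() As Long, _
      ByVal index As Long, ByVal stride As Long) As Long
   Dim v As Long
   CopyMemory v, _
      ByVal VarPtr(x(LBound(x))) + index * stride, 4&
   ReadWithStride = v
End Function

Public Sub DemoStride()
   Dim x(0 To 4) As Long
   Dim i As Long
   For i = 0 To 4
      x(i) = i
   Next i
   ' 20 bytes of data; the last read covers bytes 8 to 11
   For i = 0 To 4
      Debug.Print Hex$(ReadWithStride(x, i, 2))
   Next i
   ' Prints 0, 10000, 1, 20000, 2 on successive lines
End Sub
```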

While running in the IDE, which probably behaves the same as p-code, the cbElements setting is used only when the array item is passed to a function; otherwise, it's ignored.

The sample code's output looks like this:

x(i) → (0, 1, 2, 3, 4)
Hex$(x(i)) → (0, 10000, 1, 20000, 2)

And sure enough, when you compile the code to p-code, the results displayed in a message box are the same. Optimizing for Fast Code means that the cbElements member is ignored; optimizing for small code, or using no optimization, makes VB read the cbElements member and offset the data accordingly, as long as the "Remove Array Bounds Checks" advanced option is disabled (see Table 1).

The techniques outlined here also give you a way to point an array or string directly to memory-mapped files. Previously, Karl E. Peterson showed how to use memory-mapped files (see Resources). When you call the MapViewOfFile API function, the return value is the base address of the mapped view. To point an array at a memory-mapped file, set the pvData member to the mapped view address. You can avoid the overhead of having to copy the data to a local variable, and you can use pointers to share data between applications efficiently.

Sample4 of the accompanying code demonstrates using pointers with memory-mapped files. In the sample, a named memory-mapped file is used to share data between multiple instances of the sample application, and an array of Longs has its pvData modified to point to the memory-mapped file. Because each instance points to the same block of memory, changes initiated in one instance change the data for all instances.
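The following is a rough sketch of the idea, not the code from Sample4. It creates a small named mapping backed by the paging file, points an array of Longs at the view, writes through the array, and restores the original pvData before unmapping. The mapping name "DemoSharedLongs", the function name DemoMappedArray, and the constant PVDATA_OFFSET are assumptions made for this illustration.

```vb
Option Explicit

Private Declare Sub CopyMemory Lib "kernel32" _
   Alias "RtlMoveMemory" (Destination As Any, _
   Source As Any, ByVal Length As Long)
' For VB5, use msvbvm50.dll instead
Private Declare Function VarPtrArray _
   Lib "msvbvm60.dll" Alias "VarPtr" _
   (Var() As Any) As Long
Private Declare Function CreateFileMapping _
   Lib "kernel32" Alias "CreateFileMappingA" _
   (ByVal hFile As Long, ByVal lpAttributes As Long, _
   ByVal flProtect As Long, _
   ByVal dwMaxSizeHigh As Long, _
   ByVal dwMaxSizeLow As Long, _
   ByVal lpName As String) As Long
Private Declare Function MapViewOfFile Lib "kernel32" _
   (ByVal hMap As Long, ByVal dwAccess As Long, _
   ByVal dwOffsetHigh As Long, _
   ByVal dwOffsetLow As Long, _
   ByVal dwBytes As Long) As Long
Private Declare Function UnmapViewOfFile Lib "kernel32" _
   (ByVal lpBaseAddress As Long) As Long
Private Declare Function CloseHandle Lib "kernel32" _
   (ByVal hObject As Long) As Long

Private Const PAGE_READWRITE As Long = &H4
Private Const FILE_MAP_WRITE As Long = &H2
Private Const PVDATA_OFFSET As Long = 12

' Hypothetical demo: returns the first Long found in the
' mapped view after writing to it through the array.
Public Function DemoMappedArray() As Long
   Dim x() As Long
   Dim hMap As Long, pView As Long
   Dim pSA As Long, pvOriginal As Long
   Dim v As Long

   ReDim x(0 To 3)   ' 16 bytes, same size as the mapping
   ' hFile = -1 backs the mapping with the paging file
   hMap = CreateFileMapping(-1, 0, PAGE_READWRITE, _
      0, 16, "DemoSharedLongs")
   If hMap = 0 Then Exit Function

   pView = MapViewOfFile(hMap, FILE_MAP_WRITE, 0, 0, 16)
   If pView <> 0 Then
      CopyMemory pSA, ByVal VarPtrArray(x), 4&
      CopyMemory pvOriginal, _
         ByVal pSA + PVDATA_OFFSET, 4&
      ' Point the array at the mapped view
      CopyMemory ByVal pSA + PVDATA_OFFSET, pView, 4&

      x(0) = 1234   ' lands directly in the mapped view
      CopyMemory v, ByVal pView, 4&

      ' Restore the pointer before VB cleans up x
      CopyMemory ByVal pSA + PVDATA_OFFSET, _
         pvOriginal, 4&
      UnmapViewOfFile pView
   End If
   CloseHandle hMap
   DemoMappedArray = v
End Function
```

A second instance that opens a mapping with the same name and points its own array at the view would see the same four Longs. Synchronizing access between instances is left out of the sketch.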

Memory Leaks and Cleaning Up
VB tries to clean up after itself whenever a variable goes out of scope. If the variable's pointer points to another block of memory, VB tries to release that block of memory and leaves the original block of memory that belonged to the variable stranded until your application closes. When your application closes, Windows tries to recover that memory. The safest way to handle cleanup is to ensure you reset your variable's pointer back to its original pointer, and let VB clean up after itself properly. If you are using the clsSAFEARRAY from the sample code, make sure you call the ReleaseArray procedure before your array goes out of scope.

You can't use pointers in VB as easily as you can in VC++. Setting up and using pointers in VB entails some initial overhead, and you must clean up properly. Used properly, pointers in VB allow you to share data more efficiently between applications, as demonstrated in Sample4. Pointers in VB also let you manipulate data more efficiently. Run Sample5 and compare the times for performing a traditional quicksort in VB with the times for one that uses pointers. The larger your data set, the greater the relative efficiency of using pointers.

The clsSAFEARRAY from the sample code makes it easier to work with an array's pointers and clean up afterwards. You are welcome to use the class in your applications.



Bill McCarthy, a Microsoft VB MVP, lives in Victoria, Australia. Take a look at his VB Web site at www.TotalEnviro.com/PlatformVB. Reach him at Bill@TotalEnviro.com.

 
  Get the original code for this article here.

Updated samples based on this article:
MapFile

 
Resources
• Microsoft's Platform SDK, Component Services, COM Automation section
• "Move Data With Memory-Mapped Files" by Karl E. Peterson [Ask the VB Pro, VBPJ October 1999]