Borland Online And The Cobb Group Present:


May, 1994 - Vol. 1 No. 5

Under the hood - Examining the find () function

The BI_ISVectorImp class template implements the find() function with a binary search. Unfortunately, if you use the find() function to search for an item that isn't in the array but would be the first item if it were in the array, the search algorithm can begin using invalid array indices.

To see what causes this, let's first look at a program that illustrates the problem. Then, we'll trace through the find() function in order to see how the index problem develops.

Spotting the problem

This problem is difficult to track down because it occurs only when the size of the array disturbs the search boundary values. An example of this problem appears in the code shown in Figure A.


Figure A - By using the find( ) function in the BI_ISVectorImp class template, you may unintentionally reference addresses outside the array's boundaries.
#include <vectimp.h>

int main()
{
  int* intPtr1 = new int(1);
  int* intPtr2 = new int(2);
  int* intPtr3 = new int(3);

  BI_ISVectorImp<int> newArray(1,1);
  newArray.add(intPtr2);
  newArray.add(intPtr3);

  int index = newArray.find(intPtr1);
  return 0;
}

In this code sample, we first create the integer pointers intPtr1, intPtr2, and intPtr3 and initialize them with integers that have values of 1, 2, and 3, respectively. Then, we create a sorted array of integer pointers called newArray from Borland's BI_ISVectorImp class template. After we create the array instance, we add pointers intPtr2 and intPtr3 to the array.

Finally, we call the find() function of the newArray array with the pointer intPtr1 in order to search for a pointer to an integer whose value is 1. Figure B shows the body of the find() function from Borland's VECTIMP.H file.


Figure B - The VECTIMP.H file implements the find( ) function for the BI_ISVectorImp class template.

template &lt;class T&gt; unsigned
    BI_ISVectorImp&lt;T&gt;::find(void _FAR *t) const
{
  unsigned lower = 0;
  unsigned upper = count_ - 1;
  if( count_ != 0)
    {
    while(lower &lt; upper)
      {
      unsigned middle = (lower+upper)/2;
      if(*(const T _FAR *)(data[middle]) ==
                        *(const T _FAR *)t)
        return middle;
      if(*(const T _FAR *)(data[middle]) &lt;
                        *(const T _FAR *)t)
        lower = middle+1;
      else
        upper = middle-1;
      }
    }
  if(lower == upper &amp;&amp; 
          *(const T _FAR*)(data[lower]) == 
                        *(const T _FAR*)t)
    return lower;
  else
    return UINT_MAX;
} 

The sorted details

When we call the find() function in the example from Figure A, two pointers are already in the array­­intPtr2 and intPtr3. Therefore, the value of count_ is 2, which sets the value of upper to 1.

In the first pass through the while loop, the loop assigns a value of 0 to middle. Since neither of the pointers currently in the array points to an integer whose value is 1, the else clause sets upper equal to middle minus 1.

If middle were always a nonzero value, the loop would eventually reduce the value of upper to 0, forcing the loop to end. However, if the loop ever sets middle equal to 0, the problem will surface.

Subtracting 1 from an unsigned integer value of 0 will cause that value to become 65,535 (in a process known as rolling under). The next pass through the while loop sets middle equal to 32,767 (65,535/2).

Now when the comparison statements use middle as an index into the internal array data, the return values will likely come from outside the array's memory locations (unless there are more than 32,767 pointers in the array). In DOS, this may show up as erratic program behavior. Under Windows, this will probably cause a General Protection Fault.

Conclusion

Because the find() function uses a binary search to locate a matching entry, it will use different values for the upper and lower search limits based on its current size. Unfortunately, this means that an error in the limit checking algorithm can be very hard to locate.

Return to the Borland C++ Developer's Journal index

Subscribe to the Borland C++ Developer's Journal


Copyright (c) 1996 The Cobb Group, a division of Ziff-Davis Publishing Company. All rights reserved. Reproduction in whole or in part in any form or medium without express written permission of Ziff-Davis Publishing Company is prohibited. The Cobb Group and The Cobb Group logo are trademarks of Ziff-Davis Publishing Company.