Borland Online And The Cobb Group Present:


April, 1994 - Vol. 1 No. 4

Source Code Management - Searching for text in multiple source files

If you work on large software projects, you've probably struggled to find the specific file that contains a class declaration or implements a particular function. The search feature in a typical text editor is fine for searching within a single open file, but it usually falls short when the text you're looking for could be in any one of a number of files.

Borland C++ provides you with the GREP.COM utility to help you search for text in one or more files. GREP is an acronym for Global Regular Expression Print. Borland's version of the GREP utility is similar to a utility that's available on many UNIX workstations.

GREP can search for a simple text string or for a complex combination of characters. In this article, we'll give you a brief introduction to the GREP utility. First, let's look at the basic syntax for GREP and some of its command-line options.

Get a GREP

To use the GREP utility from any directory, you'll want to make sure the current PATH environment variable contains the path to either the \BORLANDC\BIN directory (for Borland C++ 3.1) or the \BC4\BIN directory (for Borland C++ 4.0). You use this path because the installation programs place GREP.COM in the compiler's \BIN directory.

The most basic way to use GREP is to search for a specific word in a single file. For example, to search for the word class in the file VCIRC.CPP, you would enter

grep class vcirc.cpp

If you enter this command in the \BORLANDC\EXAMPLES directory, GREP will produce the output shown in Figure A.


Figure A - The GREP utility can search a file for a specific text string and print each line that contains that text.

File VCIRC.CPP:
// A Circle class derived from Point
#include "vpoint.h"     // Location and Point class declarations
class Circle : public Point { // derived from class Point
                        // and ultimately from class Location
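
The basic search shown in Figure A can be approximated in a few lines of C++. This is only a minimal sketch of the idea (read the text line by line and keep the lines that contain the search string), not Borland's implementation. The function name matchingLines is hypothetical, and the sketch compares text case-sensitively.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Return every line of `text` that contains `needle` as a plain substring.
// matchingLines is a hypothetical helper name, not part of GREP.
std::vector<std::string> matchingLines(const std::string& text,
                                       const std::string& needle) {
    std::vector<std::string> hits;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        // A line qualifies as soon as the search text appears anywhere in it.
        if (line.find(needle) != std::string::npos) {
            hits.push_back(line);
        }
    }
    return hits;
}
```

Calling matchingLines with the contents of a source file and the word class returns the same kind of list that GREP prints under the file name.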

GREP's output lists each line that contains the search text or expression. However, you may not always want to see the content of these matching lines. Instead, you may want to know only the number of lines that contain the search expression. To change the format of GREP's output, you can specify one or more command-line options.

Command-line options

GREP, like most command-line utilities, accepts a number of command-line options. These options affect the format of the output, determine whether GREP will search subdirectories, and control how GREP will use the search expression you provide.

For example, if you want to see the number of lines that contain an expression, add -c after the word grep in the command. Table A briefly describes the command-line options that affect GREP's output formatting.

Table A - You can change the format of the GREP utility's output by specifying one or more of the Output Formatting Options.
Output Formatting Options
-c Print only a count of matching lines
-l Print only the names of files that contain matching lines
-n Print the line number at the beginning of each matching line
-o Print the filename at the beginning of each matching line
-z Print the filename and count of matching lines for all files (even those with no matching lines)
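
To make the effect of the formatting options concrete, here is a rough C++ sketch of how -c, -n, and -o reshape the same set of matching lines. The names Hit, findHits, and formatHits are hypothetical, and the separators used in the output (a colon and a space) are an assumption chosen for illustration; they are not GREP's exact output format.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One matching line together with its 1-based line number.
struct Hit {
    int lineNumber;
    std::string text;
};

// Collect the lines of `text` that contain the plain substring `needle`.
std::vector<Hit> findHits(const std::string& text, const std::string& needle) {
    std::vector<Hit> hits;
    std::istringstream in(text);
    std::string line;
    int number = 0;
    while (std::getline(in, line)) {
        ++number;
        if (line.find(needle) != std::string::npos) {
            hits.push_back({number, line});
        }
    }
    return hits;
}

// Shape the hits roughly the way -c, -n, or -o would.
// `option` is 'c', 'n', or 'o'; any other value prints the bare lines.
std::string formatHits(const std::vector<Hit>& hits,
                       const std::string& fileName, char option) {
    std::ostringstream out;
    if (option == 'c') {            // -c: only a count of matching lines
        out << hits.size() << '\n';
        return out.str();
    }
    for (const Hit& hit : hits) {
        if (option == 'n') out << hit.lineNumber << ": ";  // -n: line number first
        if (option == 'o') out << fileName << ": ";        // -o: file name first
        out << hit.text << '\n';
    }
    return out.str();
}
```

The point of the sketch is that the options change only how the matches are reported; the set of matching lines stays the same.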

If you want to search all the subdirectories of the current directory, add -d after the word grep in the command. If you use the -d option, GREP will search all the subdirectories first, and will then look in the current directory.

Finally, you can tell GREP to interpret the search expression in different ways. Table B describes the options that affect GREP's interpretation of the search expression.

Table B - You can change the way GREP treats the search text by using the Search Expression Options.
Search Expression Options
-i Ignore differences between uppercase and lowercase letters
-r Treat the text as a regular expression
-v Print the lines that don't match the search expression
-w Match only when the search expression appears as a whole word
-w[] Change the set of characters that count as part of a word, and then match only whole words
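
The following C++ sketch shows one way the -i, -v, and -w options could change the decision GREP makes for each line. It is an illustration under stated assumptions, not Borland's code: the names SearchOptions, toLower, isWordChar, and lineSelected are hypothetical, and the sketch assumes that -w means whole-word matching with letters, digits, and the underscore counting as word characters.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Switches mirroring -i, -v, and -w (hypothetical names).
struct SearchOptions {
    bool ignoreCase = false;  // -i
    bool invert = false;      // -v
    bool wholeWord = false;   // -w
};

// Lowercase an ASCII string so that comparisons can ignore case.
std::string toLower(std::string s) {
    for (char& c : s) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    return s;
}

// Assumed word characters: letters, digits, and underscore.
bool isWordChar(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Decide whether a line would be printed for the plain text `needle`.
bool lineSelected(std::string line, std::string needle,
                  const SearchOptions& opt) {
    if (opt.ignoreCase) {
        line = toLower(line);
        needle = toLower(needle);
    }
    bool found = false;
    std::size_t pos = line.find(needle);
    while (pos != std::string::npos && !found) {
        // With -w, the characters on both sides must not be word characters.
        std::size_t end = pos + needle.size();
        bool leftOk = pos == 0 || !isWordChar(line[pos - 1]);
        bool rightOk = end >= line.size() || !isWordChar(line[end]);
        found = !opt.wholeWord || (leftOk && rightOk);
        if (!found) pos = line.find(needle, pos + 1);
    }
    // With -v, the lines that do NOT match are the ones selected.
    return opt.invert ? !found : found;
}
```

With wholeWord set, a line containing only subclass is rejected for the search text class, while a line containing class as a separate word is accepted.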

The -r option tells GREP to treat the search expression as a regular expression instead of treating it as a simple text string. You'll use a regular expression to specify a pattern of acceptable characters.

Changing your (regular) expression

If you're familiar with using wildcard characters in DOS commands, you'll feel comfortable with the special symbols you'll use in a regular expression. In a GREP command line, these regular-expression symbols allow you to customize the conditions that constitute a match.

You can use different regular-expression symbols to specify the location of matching search expressions within each line or to specify optional matching characters at a given position in the expression. Table C summarizes the symbols that GREP recognizes in a regular expression.

Table C - You can place a regular-expression symbol in the search expression to customize the matching conditions.
Regular-expression Symbols
^ Match only if the search expression appears at the beginning of a line
$ Match only if the search expression appears at the end of a line
. Match any character at this position
* Match zero or more occurrences of the preceding character at this position
+ Match one or more occurrences of the preceding character at this position
[] Match any one of the characters enclosed in the brackets at this position
[^] Match any character except those enclosed in the brackets at this position
\ Treat the following character literally at this position, even if it's a regular-expression symbol

The first two symbols (^ and $) specify a position in each line where GREP will look for matching search expressions. The remaining symbols tell GREP how to determine if a match exists.
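
If you want to experiment with these symbols away from the DOS prompt, the standard C++ regular-expression library accepts all of them with the usual regular-expression meanings. The sketch below uses std::regex as a stand-in for GREP's own matcher, which is an assumption made for illustration; the helper name lineMatches is hypothetical. Keep in mind that std::regex's default grammar also recognizes symbols that don't appear in Table C, so only patterns restricted to the Table C symbols are comparable.

```cpp
#include <regex>
#include <string>

// True if `pattern` matches anywhere in `line`.
// std::regex (default ECMAScript grammar) stands in for GREP's matcher;
// lineMatches is a hypothetical helper name.
bool lineMatches(const std::string& line, const std::string& pattern) {
    return std::regex_search(line, std::regex(pattern));
}
```

For example, lineMatches("class Circle", "^class") is true because the text starts the line, while lineMatches("// a class", "^class") is false. Likewise, the pattern TWindow* matches the text TWindo (zero occurrences of w), but TWindow+ does not. In a C++ string literal, remember to double the backslash: the pattern a\*b is written "a\\*b".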

You can now build a complex search by invoking the GREP utility with a regular expression and one or more of the command-line options. Let's work through a simple example to see how you might use GREP to scan your source files.

Calling the GREP line

At a DOS prompt, enter PATH [Return] to confirm that the \BORLANDC\BIN directory (or \BC4\BIN if you're using Borland C++ 4.0) is part of the current PATH environment variable for DOS. You'll see something like

PATH=C:\DOS;C:\BORLANDC\BIN;

As long as \BORLANDC\BIN or \BC4\BIN appears somewhere in the PATH, you'll be able to use the GREP utility. (If you chose something other than the default directory names during installation, look for the path to the BCC.EXE command-line compiler. The installation program stores GREP.COM and BCC.EXE in the same directory.) For the remainder of the article, we'll assume you're using Borland C++ 3.1. Keep in mind that the techniques work for version 4.0 as well.
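
The same check can be done in code. The sketch below splits a PATH value at its semicolons and reports whether any entry ends with a given directory. The names upper and pathMentions are hypothetical, and the sketch assumes entries have no trailing backslash. It compares without regard to case, since DOS paths aren't case-sensitive.

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Uppercase an ASCII string so that comparisons ignore case.
std::string upper(std::string s) {
    for (char& c : s) {
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return s;
}

// True if one of the semicolon-separated entries in `pathValue`
// ends with `dir` (for example, "\\BORLANDC\\BIN").
bool pathMentions(const std::string& pathValue, const std::string& dir) {
    const std::string wanted = upper(dir);
    std::istringstream entries(upper(pathValue));
    std::string entry;
    while (std::getline(entries, entry, ';')) {
        if (entry.size() >= wanted.size() &&
            entry.compare(entry.size() - wanted.size(),
                          wanted.size(), wanted) == 0) {
            return true;
        }
    }
    return false;
}
```

In a real program you could pass the result of std::getenv("PATH") (declared in <cstdlib>) as the first argument, after checking that it isn't a null pointer.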

Now, let's create a search specification that demonstrates some of the GREP command-line options and regular-expression symbols. Since the ObjectWindows Library (OWL) example files provide a large set to work with, let's look through all the OWL header (.H) files for classes that directly and publicly derive from only the TWindow class.

First, change the current directory to the main OWL directory by entering

cd\borlandc\owl

Making the OWL directory the current directory limits the scope of our search to the OWL directory and its subdirectories.

Now, we'll create the GREP command to start the search. Enter the following command exactly as it appears:

grep -d -r "public+ TWindow* [^,]" *.h

When GREP begins processing the command, the following output will appear:

File EXAMPLES\GDIDEMO\DEMOBASE.H:
class TBaseDemoWindow : public TWindow {
File EXAMPL31\OLESRVR\OLESRVR.H:
class TWindowServer : public TWindow {
File EXAMPL31\OLECLNT\OLECLNT.H:
class TOleDocWindow : public TWindow {
File INCLUDE\CONTROL.H:
class _EXPORT TControl : public TWindow {

As you can see, GREP found four files that contain lines matching the search criteria. In other words, four header files in the OWL subdirectories declare classes that derive directly and publicly from the TWindow class alone.
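
To see why this search expression selects only these declarations, you can try it against a few sample lines. In the sketch below, std::regex stands in for GREP's matcher (an assumption for illustration), and the function name derivedOnlyFromTWindow is hypothetical. The expression reads as follows: public+ matches the word public, TWindow* followed by a space matches TWindow only when a space comes next (so a longer name that merely begins with TWindow is rejected), and [^,] rejects a comma, which would introduce a second base class.

```cpp
#include <regex>
#include <string>
#include <vector>

// Return the lines selected by the search expression used in the article.
// std::regex stands in for GREP's matcher; the function name is hypothetical.
std::vector<std::string> derivedOnlyFromTWindow(
        const std::vector<std::string>& lines) {
    static const std::regex pattern("public+ TWindow* [^,]");
    std::vector<std::string> hits;
    for (const std::string& line : lines) {
        if (std::regex_search(line, pattern)) {
            hits.push_back(line);
        }
    }
    return hits;
}
```

Given the line class TBaseDemoWindow : public TWindow { together with made-up declarations that list a second base class after TWindow, or that use private instead of public, the function returns only the first line.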

GREP in the IDE

In addition to the command-line version of GREP, Borland also supplies a special message conversion utility that allows you to call GREP from within the 3.1 Integrated Development Environment (IDE) for DOS or the 4.0 IDE for Windows. In a future issue, we'll show you how this utility can help you examine the results of a GREP search.

Conclusion

GREP is a powerful utility you can use to quickly locate names, classes, or specific constructs in your code. In this article, we introduced the DOS command-line version of GREP.



Copyright (c) 1996 The Cobb Group, a division of Ziff-Davis Publishing Company. All rights reserved. Reproduction in whole or in part in any form or medium without express written permission of Ziff-Davis Publishing Company is prohibited. The Cobb Group and The Cobb Group logo are trademarks of Ziff-Davis Publishing Company.