- Turbo Pascal for DOS Tutorial
- by Glenn Grotzinger
- Part 10 -- binary files; units, overlays, and include files.
- All parts copyright 1995-6 (c) by Glenn Grotzinger.
-
- There was no prior practice problem, so let's get started...
-
- Typed binary files
- ==================
- We know that files can be of type text. We can also declare them as "file
- of <datatype>", which lets us read and write binary data to disk. Keep in
- mind that with a typed binary file, you can only read and write the one
- type the file was declared with. In the example below, the file can hold
- only integers. The type may be anything that we have covered up to this
- point. Reading and writing a typed binary file works the same way as
- with a text file, except that we cannot use readln and writeln (those
- are for text files only).
-
- program integers2disk;
- { writing integers 1 thru 10 to a disk data file, then reading 'em back }
- var
-   datafile: file of integer;
-   i: integer;
- begin
-   assign(datafile, 'INTEGERS.DAT');
-   rewrite(datafile);
-   for i := 1 to 10 do
-     write(datafile, i);
-   close(datafile); { done with write }
-   reset(datafile); { now let's start reading }
-   while not eof(datafile) do { same concept as with text files }
-     begin
-       read(datafile, i);
-       writeln(i);
-     end;
-   close(datafile);
- end.
-
- When you run it, the numbers 1 through 10 come up. Look for the file
- named INTEGERS.DAT, and load it into a text editor. You will notice
- that the file is essentially garbage to the human eye. That is how the
- computer stores integers. In part 11, I will explain how many different
- kinds of variables are stored, and introduce a few new types we can
- define. We can use records, integers, characters, strings,
- whatever...with a typed file, as long as we stick to the specific type
- we give the file in the var line.
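-
- To show that a typed file is not limited to integers, here is a minimal
- sketch of a "file of" a record type. The record partrec, its fields, and
- the file name PARTS.DAT are made up for illustration only.

```pascal
program records2disk;
{ sketch: a typed file whose element type is a record }
type
  partrec = record              { hypothetical record, for illustration }
    name: string[12];
    qty: integer;
    price: real;
  end;
var
  datafile: file of partrec;
  part: partrec;
begin
  assign(datafile, 'PARTS.DAT');  { hypothetical file name }
  rewrite(datafile);
  part.name := 'Keyboard';
  part.qty := 3;
  part.price := 19.95;
  write(datafile, part);          { one write stores the whole record }
  part.name := 'Mouse';
  part.qty := 7;
  part.price := 9.5;
  write(datafile, part);
  close(datafile);

  reset(datafile);
  while not eof(datafile) do
    begin
      read(datafile, part);       { one read brings back a whole record }
      writeln(part.name, ' ', part.qty, ' ', part.price:0:2);
    end;
  close(datafile);
end.
```

- Each read or write moves exactly one record, so the file can never hold
- anything but partrec values.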
-
- Untyped Binary Files
- ====================
- We can also open a binary file as an untyped, unstructured (essentially)
- file. There we simply use the declaration "file". (This is not specific
- to version 7; earlier versions of Turbo Pascal have untyped files too.)
- Anyway, in addition to this, we have to learn a few new commands in
- order to use untyped files.
-
- BLOCKREAD(filevar, varlocation, count, totalread);
-
- filevar is the untyped file variable.
- varlocation is the variable (buffer) that the data is read into.
- count is how many records to read. Count times the record size must
-   not be bigger than varlocation.
- totalread is how many records were actually read. (optional)
-
- BLOCKWRITE(filevar, varlocation, count, totalwritten);
-
- filevar is the untyped file variable.
- varlocation is the variable (buffer) that the data is written from.
- count is how many records to write. (required)
- totalwritten is how many records were actually written. (optional)
-
- SizeOf(varlocation)
-
- Function that gives total size of a variable in bytes.
-
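- Maximum readable by one BlockRead call: 64KB.
-
- Reset and Rewrite take a record size parameter (in bytes) when we deal
- with an untyped file. If we leave it out, the record size is 128 bytes.
- With a record size of 1, the counts above are simply byte counts.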
-
- Probably the best way to make things clearer is to give an example.
- This program does the same thing as the last one, but with an untyped
- file. See the differences in processing...
-
- program int2untypedfile;
- uses crt; { needed for clrscr }
-
- var
-   datafile: file;
-   i: integer;
-   numread: word;
-
- begin
-   clrscr;
-   assign(datafile, 'INTEGERS.DAT');
-   rewrite(datafile, 1); { record size 1, so counts are in bytes }
-   for i := 1 to 10 do
-     blockwrite(datafile, i, sizeof(i));
-   close(datafile);
-   reset(datafile, 1);
-   blockread(datafile, i, sizeof(i), numread);
-   while numread <> 0 do
-     begin
-       writeln(i);
-       blockread(datafile, i, sizeof(i), numread);
-     end;
-   close(datafile);
- end.
-
- This program performs essentially the same function as the first example
- program, but with an untyped file. Blockread and blockwrite are used
- in a very limited manner here. It's *VERY GOOD* for you to experiment
- with their use! As for the comparison that replaces EOF, blockread
- returns how many records it actually read. A count of 0 means we hit
- the end of the file, so we use that as an equivalent.
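-
- As one experiment along those lines, here is a minimal sketch that moves
- all ten integers with a single blockwrite and a single blockread. It
- assumes a record size of sizeof(integer), so the counts are in integers
- rather than bytes. The program and variable names are my own.

```pascal
program blockdemo;
{ sketch: move ten integers with one blockwrite and one blockread }
var
  datafile: file;
  nums: array[1..10] of integer;
  i: integer;
  numread, numwritten: word;
begin
  for i := 1 to 10 do
    nums[i] := i;
  assign(datafile, 'INTEGERS.DAT');
  rewrite(datafile, sizeof(integer));          { one record = one integer }
  blockwrite(datafile, nums, 10, numwritten);  { count is in records }
  close(datafile);

  for i := 1 to 10 do
    nums[i] := 0;                              { wipe the array first }
  reset(datafile, sizeof(integer));
  blockread(datafile, nums, 10, numread);      { ask for ten records }
  close(datafile);

  writeln('records written: ', numwritten, ', records read: ', numread);
  for i := 1 to numread do
    write(nums[i], ' ');
  writeln;
end.
```

- Try changing the record size or the count and watch how numread and
- numwritten change with them.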
-
- The 2 missing DOS file functions
- ================================
- We now have the tools to perform the 2 DOS file functions that you
- probably noticed were missing from part 8: copying files and moving files.
-
- Copying a file is, essentially, repeated blockreads and blockwrites until
- all of the input file is read and all of the output file is written. We can
- do it with either typed or untyped files. An untyped file example may
- be found on page 14 of the Turbo Pascal 7.0 Programmer's Reference.
- For those who do not have this reference...Snippet of my own...untested...
-
- { both files are opened with a record size of 1, so counts are bytes }
- repeat
-   blockread(infile, rarray, sizeof(rarray), bytesr);
-   blockwrite(outfile, rarray, bytesr, bytesw);
- until (bytesr = 0) or (bytesw <> bytesr);
-
- Moving files is a copy of an input file to a new location, followed by
- erasure of the input file.
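-
- Put together, a copy followed by a move might look like the sketch below.
- The file names SOURCE.TXT and DEST.TXT and the 2048-byte buffer are
- arbitrary choices of mine, and the program writes its own small input
- file first so that it can be run as it stands.

```pascal
program copymove;
{ sketch: copy a file with blockread/blockwrite, then erase the source }
var
  infile, outfile: file;
  t: text;
  buffer: array[1..2048] of byte;   { buffer size is an arbitrary choice }
  bytesr, bytesw: word;
begin
  { make a small input file so the sketch runs on its own }
  assign(t, 'SOURCE.TXT');
  rewrite(t);
  writeln(t, 'Some sample data to copy.');
  close(t);

  assign(infile, 'SOURCE.TXT');
  assign(outfile, 'DEST.TXT');
  reset(infile, 1);                 { record size 1: counts are bytes }
  rewrite(outfile, 1);
  repeat
    blockread(infile, buffer, sizeof(buffer), bytesr);
    blockwrite(outfile, buffer, bytesr, bytesw);
  until (bytesr = 0) or (bytesw <> bytesr);
  close(infile);
  close(outfile);

  if bytesw = bytesr then
    begin
      erase(infile);                { copy + erase of the source = move }
      writeln('SOURCE.TXT moved to DEST.TXT');
    end
  else
    writeln('Write failed (disk full?); source left in place.');
end.
```

- Leave out the erase and you have a plain copy. Note that erase needs the
- file to be closed first.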
-
- Units
- =====
- A unit is what you probably see on your drive in the TP/units directory.
- Compiled units are TPU files. They are accessed via a USES clause at the
- start of a program. CRT, DOS, and WinDos are some of the provided units we
- have already encountered. Nothing is stopping us from writing our own,
- though. The actual coding of procedures/functions that we place into units
- is no different. The format of the unit, though, is something we need to
- think about. An example is the best thing for that. This is a simple
- implementation of a unit, to give you some idea of a skeleton to place
- procedures and functions into.
-
- unit myunit;
-
- interface
- { all global const, type, and var declarations go here }
-
- { procedure and function headers are listed here }
-
- procedure writehello(str: string);
-
- implementation
- { actual procedural code goes here, as it would in a regular program }
-
- procedure writehello(str: string); { must be same as above }
- begin
-   writeln(str);
- end;
-
- { any code we want to run as initialization for the unit goes in an
-   optional begin..end block right here, in place of the plain end. }
- end.
-
- The unit up above is compilable to disk/memory, but not runnable on its
- own. Essentially, it is a library of procedures/functions that we may use
- in other programs. Let's look at an example of how to use one of our own
- units.
-
- program tmyunit;
- uses myunit; { must match the name in the unit's header }
- var
-   str: string;
- begin
-   str := 'Hello! Courtesy of myunit!';
-   writehello(str);
- end.
-
- Though this program/unit combination is trivial, it does illustrate
- exactly how to incorporate your own unit with MANY functions into your
- programming, whether because your project gets too big, or to reuse
- some of your frequently used procedures in other programs.
-
- Overlays
- ========
- This will describe how to use TP's overlay facility. It must be used with
- units. My thought is that you should wait until a project is large
- enough to dictate the use of overlays. We can use 'em on projects of any
- size, but on smaller projects the overlay manager takes up more memory
- than overlaying saves, so there is no advantage in doing this habitually.
- We will use the overlay facility with the unit/program set above for
- example purposes.
-
- ONLY CODE IN UNITS HAS AN OPPORTUNITY TO BE OVERLAID! System, CRT, Graph,
- and Overlay (if I remember right) are non-overlayable.
-
- {$O+} is a compiler directive for UNITS only which designates a unit as
- OK to overlay. {$O-} is the default, which says it's not OK to overlay
- a unit. Overlaid code also has to use far calls, so it is common to put
- {$F+} next to {$O+} in the unit and {$F+} at the top of the program.
-
- To get to the overlay manager, we must use the overlay unit, and it must
- be named in the uses clause before any unit we want to overlay.
-
- After the uses clause, we need to use the {$O <unit name>} compiler
- directive to specify each unit that we want to compile as an overlay.
-
- WARNING: It is good to try your conversion to overlays on a copy of
- your source code. If you alter a program with overlays in mind and it
- doesn't work (it's known to happen -- a procedure works ONLY when it's
- not overlaid...), you won't have to go through the work of altering it
- back.
-
- NOTE: You must compile to disk, then run when you work with overlays.
-
- Results come back in the OvrResult variable. Here's a list...
-
- 0 Success
- -1 Overlay manager error.
- -2 Overlay file not found.
- -3 Not enough memory for overlay buffer.
- -4 Overlay I/O error.
- -5 No EMS driver installed.
- -6 Not enough EMS memory.
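-
- If you want those codes shown to the user as text, one way is a small
- lookup function like the sketch below. The function name ovrmessage is
- my own invention, not part of the Overlay unit. In a real program you
- would pass it OvrResult right after ovrinit or ovrinitems. Here it is
- only fed the codes from the list so it can run without an overlay.

```pascal
program ovrmsgdemo;
{ sketch: turn an OvrResult code into readable text }

function ovrmessage(code: integer): string;   { hypothetical helper }
begin
  case code of
     0: ovrmessage := 'Success';
    -1: ovrmessage := 'Overlay manager error';
    -2: ovrmessage := 'Overlay file not found';
    -3: ovrmessage := 'Not enough memory for overlay buffer';
    -4: ovrmessage := 'Overlay I/O error';
    -5: ovrmessage := 'No EMS driver installed';
    -6: ovrmessage := 'Not enough EMS memory';
  else
    ovrmessage := 'Unknown overlay result';
  end;
end;

var
  code: integer;
begin
  { walk through the documented codes, 0 down to -6 }
  for code := 0 downto -6 do
    writeln(code:3, '  ', ovrmessage(code));
end.
```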
-
- As for examples, let's look at the unit set up to overlay. As we can
- see, the only real difference is the {$O+} compiler directive at the
- top (adding it is a good policy to make)...
-
- {$O+}
- unit myunit;
-
- interface
- { all global const, type, and var declarations go here }
-
- { procedure and function headers are listed here }
-
- procedure writehello(str: string);
-
- implementation
- { actual procedural code goes here, as it would in a regular program }
-
- procedure writehello(str: string); { must be same as above }
- begin
-   writeln(str);
- end;
-
- { any code we want to run as initialization for the unit goes in an
-   optional begin..end block right here, in place of the plain end. }
- end.
-
- Now let's look at the program itself. Its error reporting from the
- overlay manager isn't great. It stops the program if the overlay won't
- load, but if the EMS step fails it only prints a message and carries on.
-
- program tmyunit;
- uses overlay, myunit; { overlay must come before any overlaid unit }
-
- {$O MYUNIT} { include myunit in the overlay }
- var
-   str: string;
- begin
-   ovrinit('TMYUNIT.OVR'); { final overlay file name/init for program. }
-   if OvrResult <> 0 then
-     begin
-       writeln('There is a problem');
-       halt(1);
-     end
-   else
-     write('Overlay installed ');
-   ovrinitems; {init overlay for EMS. Usable after ovrinit}
-   if OvrResult <> 0 then
-     writeln('There was a problem putting the overlay in EMS')
-   else
-     writeln('in EMS memory.');
-   str := 'Hello! Courtesy of myunit!';
-   writeln;
-   writehello(str);
- end.
-
- EXE Overlays
- ============
- Here's how to set up EXE overlays. The DOS copy command features the /B
- switch. For example, to take the EXE built from the program above and
- attach the overlay to the end of it (be sure you run any exe
- packers/encryptors before you do this!), use the following:
-
- COPY /B TMYUNIT.EXE+TMYUNIT.OVR
-
- The source for the program needs one change, made before the final
- compile: the ovrinit line should read TMYUNIT.EXE instead of TMYUNIT.OVR.
- You should be able to handle doing this and understanding what is going on.
-
- Include Files
- =============
- Use the {$I <filename>} compiler directive at the position where the
- include file is to be placed. An include file is code kept in another
- file, which the compiler treats as "part of the program" at the position
- of the {$I ..} compiler directive.
-
- Copy function
- =============
- You can use the copy function to get a portion of a string. It takes the
- string, the starting position, and how many characters to take. For
- example...
-
- str := copy('TurboPascal', 5, 3);
- writeln(str); { writes oPa }
-
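- Copy is also handy for cutting fixed-column fields out of a line of
- text. The sketch below pairs it with the standard val procedure to test
- whether a field is numeric. The sample line and the helper name
- checkfield are made up for illustration.

```pascal
program copyval;
{ sketch: cut fixed-column fields out of a line with copy, test with val }
var
  line: string;

procedure checkfield(field: string);   { hypothetical helper }
var
  number, code: integer;
begin
  val(field, number, code);   { code = 0 means the whole field converted }
  if code = 0 then
    writeln('"', field, '" is numeric, value ', number)
  else
    writeln('"', field, '" is not numeric; first bad character at ', code);
end;

begin
  line := 'WIDGET 0042 A7X9';        { made-up sample line }
  checkfield(copy(line, 8, 4));      { columns 8-11:  0042 }
  checkfield(copy(line, 13, 4));     { columns 13-16: A7X9 }
end.
```

- The first field converts to 42 with code 0. The second one stops at its
- first character, so code comes back as 1.
-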
- Programming Practice for Part #10
- =================================
- We opened ourselves a business selling computer equipment in 1993.
- Since we have occupied ourselves with working on computers, and not on
- bookkeeping (we wanted to save the funds instead of hiring someone), and
- would rather not use the cash registers, we have done everything on paper
- over the last two years. It's the beginning of 1996, and keeping accurate
- records of sales progression, as well as records of our customers, has
- become almost impossible, since our records are a closet-full of paper.
- So, we have finally decided to get things onto a computer.
-
- To do the typing, we have temporarily hired interns from a nearby business
- college. Unfortunately, with our limited funds, we could not draw in
- people who had sufficient typing skill and accuracy, but we took what
- we could get. We now have things typed in as text files with 80 columns
- a line. Unfortunately, the interns' attention to detail has been as bad
- as their typing skill, and nothing makes sense in their work.
-
- Our purpose is to salvage the money we spent hiring these interns: we
- want to locate the badly entered records, while writing the good records
- to a binary data file by the name of COMPHVN.DAT. For the bad records, on
- EACH AND EVERY error we encounter, we should write a message to a text
- file named ERRORS.LOG, with the first 20 characters of the problem line
- and a description of what is wrong with the data set for that particular
- error, so we may go back through and make the interns redo what they did
- wrong.
-
- The data format for the output file COMPHVN.DAT is as follows. In the
- interest of efficiency, we shall write this program using COMPHVN.DAT
- as an untyped file. As the person posing this problem, I realize that
- some of the data types in this record will not be recognizable at this
- point, but with the variable description, you will know how to handle
- them, and in part 11, you will see what they are exactly. In creating
- a binary file, we should always be concerned with using as little
- space as possible. Uses of the variables will be explained later. To
- save typing on your part, I am asking that you cut and paste this record
- description out of this text and save it as a text file named
- COMPHVN.INC, which may be used as an include file in our compilation.
-
- comphvndata = record
- datacode: string[7];
- acct_classification: char;
- phone_area: integer; {area+prefix+exchange = phone number}
- phone_prefix: integer;
- phone_exchange: integer;
- work_area: integer;
- work_prefix: integer;
- work_exchange: integer;
- other_area: integer;
- other_prefix: integer;
- other_exchange: integer;
- cnct1_lname: string[16];
- cnct1_fname: string[11];
- cnct1_minit: char;
- cnct1_pobox: integer;
- cnct1_sname: string[8];
- cnct1_stype: string[4];
- cnct1_apt: integer;
- cnct1_city: string[10];
- cnct1_state: string[2];
- cnct1_zip: longint;
- cnct1_birthm: byte;
- cnct1_birthd: byte;
- cnct1_birthy: integer;
- accept_check: boolean;
- accept_credt: boolean;
- balnce_credt: real;
- total_sold: real;
- cnct1_emp_code: string[4];
- total_sales: integer;
- emp_name: string[10];
- emp_stnum: integer;
- emp_sttype: string[4];
- emp_city: string[10];
- emp_state: string[2];
- emp_zip: longint;
- emp_area: integer;
- emp_prefix: integer;
- emp_exchange: integer;
- emp_yrs: byte;
- compu: boolean;
- compu_type: string[9];
- compu_mon: char;
- compu_cdr: boolean;
- compu_cdt: char;
- compu_mem: byte;
- minor: boolean;
- end;
-
- The format for our INPUT file, which will be named INDATA.TXT, will be as
- follows (80 characters). Since we had 15 interns doing the typing at once
- we also had them merge their work. They were careless, and may have not
- accomplished it properly. There will be three lines for each customer
- that we have encountered.
-
- Line 1                                Line 2
- --------------------------------------------------------------------
- datacode             columns 1-7      datacode         columns 1-7
- acct_classification  column 8         accept_check     column 8
- sequence number      column 9         sequence number  column 9
- phone_area           columns 10-12    cnct1_stype      columns 10-13
- phone_prefix         columns 13-15    cnct1_apt        columns 14-17
- phone_exchange       columns 16-19    cnct1_city       columns 18-27
- work_area            columns 20-22    cnct1_state      columns 28-29
- work_prefix          columns 23-25    cnct1_zip        columns 30-38
- work_exchange        columns 26-29    cnct1_birthm     columns 39-40
- other_area           columns 30-32    cnct1_birthd     columns 41-42
- other_prefix         columns 33-35    cnct1_birthy     columns 43-46
- other_exchange       columns 36-39    balnce_credt     columns 47-55
- cnct1_lname          columns 40-55    total_sold       columns 56-63
- cnct1_fname          columns 56-66    cnct1_emp_code   columns 64-67
- cnct1_minit          column 67        total_sales      columns 68-70
- cnct1_pobox          columns 68-72    emp_name         columns 71-80
- cnct1_sname          columns 73-80
-
- Line 3
- --------------------------------------------------------------------
- datacode         columns 1-7
- accept_credt     column 8
- sequence number  column 9
- emp_stnum        columns 10-13
- emp_sttype       columns 14-17
- emp_city         columns 18-27
- emp_state        columns 28-29
- emp_zip          columns 30-38
- emp_area         columns 39-41
- emp_prefix       columns 42-44
- emp_exchange     columns 45-48
- emp_yrs          columns 49-50
- compu            column 51
- compu_type       columns 52-60
- compu_mon        column 61
- compu_cdr        column 62
- compu_cdt        column 63
- compu_mem        columns 64-65
- minor            column 66
- spaces           columns 67-80
-
- Now, a description as to what is defined as a correct set that we should
- write to COMPHVN.DAT.
-
- 1) Each 3 lines that are read are considered for errors. Check the sequence
- numbers. A successful read of 3 lines should say 1, 2 and 3, in that order.
- For example, if you have a read of 1,2,2, you should not write the group to
- the binary file, and should report a duplicate line #2 and a missing
- line #3. There will never be a circumstance where these sequence numbers
- are all the same... The cases covered in this paragraph are the only
- cases that would ever forestall processing of the error checks listed in
- points 2-14.
-
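- 2) Datacode on lines 1, 2 and 3 should MATCH exactly and be checked for the
- following: It has the format, for example, with my name of GROTZ*G (going
- by the sample data, the first 5 characters of cnct1_lname, a *, and the
- first character of cnct1_fname), and should be verified using the
- cnct1_names...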
-
- 3) phone_area, phone_prefix, phone_exchange, work_area, work_prefix,
- work_exchange, other_area, other_prefix, other_exchange, cnct1_pobox,
- emp_zip, emp_area, emp_prefix, emp_exchange, emp_yrs, cnct1_zip,
- cnct1_birthm, cnct1_birthd, cnct1_birthy, balnce_credt, total_sold,
- total_sales, compu_mem all should be checked to verify that they are
- numeric.
-
- 4) phone_prefix, work_prefix, other_prefix, emp_prefix all should not start
- with a 1 or a 0.
-
- 5) cnct1_birthy should be in this century 1900-1999.
-
- 6) acct_classification should be B,C,G,P, or O.
-
- 7) accept_check, accept_credt, compu, compu_cdr, and minor should be
- Y or N.
-
- 8) emp_yrs (employed how many years?) should be checked against
- cnct1_birthy for sanity (a person who was born in 1980 can't have worked
- 20 years).
-
- 9) If compu is N, then compu_type, compu_mon, compu_cdr, compu_cdt, and
- compu_mem should be either blank or 0 depending upon the type of field.
-
- 10) cnct1_emp_code should be GOVT, RET, STUD, or BUS. If this field is
- RET, then emp_* should either be blank or 0 depending on the type of field.
-
- 11) compu_mon should be S, V, E, C, H, or I.
-
- 12) compu_cdt should be 1, 2, 4, 6, or 8.
-
- 13) emp_sttype and cnct1_stype should be BLVD, LANE, ST, AVE, CT, LOOP,
- DR, CIRC, or RR.
-
- 14) minor should be Y if person listed in cnct1_?name is < 21 years old
- and N otherwise. Check to be sure that this field is correct in being
- Y or N.
-
- Format of ERRORS.LOG (also solution to the INDATA.TXT posted below)
- --------------------
-
- Error Report -- INDATA.TXT
- --------------------------
-
- First 20 characters of line  Problem
- ---------------------------  --------------------------
- GROA2*GN334 ST WAR           Datacode does not agree with name.
- GROT2*GP181612932918         Work-exchange is not numeric.
- GROT2*GP181612932918         phone-prefix started with a 0 or 1.
- GROT2*GT2ST 314 SED          accept-check is invalid.
- GROT3*GP181642932918         Duplicate line #1
- GROT3*GN234 ST WAR           Missing line #3
- GROT4*GI181642932918         Datacode does not agree with name.
- GROT4*GY2ST 314 SED          Datacode does not agree with name.
- GROT4*GN334 ST WAR           Datacode does not agree with name.
- GROT4*GY2ST 314 SED          cnct1-birthy is not in this century.
- GROT4*GI181642932918         acct-classification is invalid.
- GROT7*GN334 ST WAR           emp-zip is not numeric.
- GROT7*GN334 ST WAR           compu-cdr is invalid.
- GROT7*GN334 ST WAR           The emp-yrs doesn't make sense.
- GROT7*GN334 ST WAR           There were fields present when compu was N.
- GROT7*GN334 ST WAR           compu-mon is invalid.
- GROT7*GN334 ST WAR           compu-cdt is invalid.
- GROT8*GN334 ST WAR           empcodes are present when RET is true.
- GROT8*GN334 ST WAR           compu-mon is invalid.
- GROT0*GN334 STR WAR          compu-cdt is invalid.
- GROT0*GN334 STR WAR          emp-sttype is invalid.
-
-
-
- Remember to be as general as possible in your error messages. Use the
- example listed above as a guide. Your program cannot predict everything.
- Also, in the interest of finding out your programming skill, we ask that
- you code this program using the pascal overlay system with EMS load
- capability, with all error codes and status statements active and visible
- to the user, for at least one procedure or function. Also note that many
- of the separate integer fields are run together in the input file, so we
- cannot just plain read the input file.
-
- Here is a copy of the current input file, INDATA.TXT
- (keep in mind it's 80 characters per line, and the character
- positions MATTER)
- ----------------------------------------------------
- GROT1*GP1816429329181674700008163475753GROT1INGER GLENN K232 34th
- GROT1*GY2ST 314 SEDALIA MO64093 062519742.34 3245.23 STUD32 CMSU
- GROT1*GN334 ST WARRENSBURMO65337 81654341114 YHOMEBUILTVY18 N
- GROT2*GP18161293291816747000A8163475753GROT2INGER GLENN K232 34th
- GROT2*GT2ST 314 SEDALIA MO64093 062519742.34 3245.23 STUD32 CMSU
- GROA2*GN334 ST WARRENSBURMO65337 81654341114 YHOMEBUILTVY18 N
- GROT3*GP1816429329181674700008163475753GROT3INGER GLENN K232 34th
- GROT3*GY1ST 314 SEDALIA MO64093 062519742.34 3245.23 STUD32 CMSU
- GROT3*GN234 ST WARRENSBURMO65337 81654341114 YHOMEBUILTVY18 N
- GROT4*GI1816429329181674700008163475753BROT4INGER GLENN K2E2 34th
- GROT4*GY2ST 314 SEDALIA MO64093 062518742.34 3245.23 STUD32 CMSU
- GROT4*GN334 ST WARRENSBURMO65337 81654341114 YHOMEBUILTVY18 N
- GROT5*GP1816429329181674700008163475753GROT5INGER GLENN K232 34th
- GROT5*GY2ST 314 SEDALIA MO64093 062519742.34 3245.23 STUD32 CMSU
- GROT5*GN334 ST WARRENSBURMO65337 81654341114 YHOMEBUILTVY18 N
- GROT6*GP1816429329181674700008163475753GROT6INGER GLENN K232 34th
- GROT6*GY2ST 314 SEDALIA MO64093 062519742.34 3245.23 STUD32 CMSU
- GROT6*GN334 ST WARRENSBURMO65337 81654341114 YHOMEBUILTVY18 N
- GROT7*GP1816429329181674700008163475753GROT7INGER GLENN K232 34th
- GROT7*GY2ST 314 SEDALIA MO64093 062519742.34 3245.23 STUD32 CMSU
- GROT7*GN334 ST WARRENSBURMO65W37 816543411134NHOMEBUILT 00 N
- GROT8*GP1816429329181674700008163475753GROT8INGER GLENN K232 34th
- GROT8*GY2ST 314 SEDALIA MO64093 062519742.34 3245.23 RET 32 CMSU
- GROT8*GN334 ST WARRENSBURMO65337 81654341114 YHOMEBUILTZY18 N
- GROT9*GP1816429329181674700008163475753GROT9INGER GLENN K232 34th
- GROT9*GY2ST 314 SEDALIA MO64093 062519742.34 3245.23 STUD32 CMSU
- GROT9*GN334 ST WARRENSBURMO65337 81654341114 YHOMEBUILTVY18 N
- GROT0*GP1816429329181674700008163475753GROT0INGER GLENN K232 34th
- GROT0*GY2ST 314 SEDALIA MO64093 062519742.34 3245.23 STUD32 CMSU
- GROT0*GN334 STR WARRENSBURMO65337 81654341114 YHOMEBUILTVYA8 N
-
-
- Notes
- -----
- 1) You may use a for loop to read each set of 3 lines. I will not throw
- an error of omission of lines into the data file. There will always
- be multiples of 3 lines to work with.
-
- 2) The included data file in this text file includes errors from all 14
- points listed above. The data file I use for the contest will be different,
- but will as well cover all 14 points listed above...
-
- 3) Be sure to get good use of your debugger, as you will NEED it...Also, be
- sure to plan the program -- this is an easy one, yet it's complex because
- of the amount of planning it requires...plan well, it's easy. Don't plan
- well, it's a bugger...:>
-
- 4) ONE hint: remember string addressing, and use of the copy function.
-
- 5) Another hint. You can have what is referred to as "next sentence"
- IF THEN ELSE statements. It is very good in this program to be able to
- use them. (if condition then else statement) is, essentially, a do-nothing
- if the condition is true. I suggest it because the pascal operator NOT
- can surprise you: it binds tighter than the comparison operators, so
- a condition you want to negate has to go in parentheses, as in
- not (a = b). :<
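-
- The NOT behavior in hint 5 comes down to operator precedence. This small
- sketch, with made-up variables a and b, shows the parenthesized form
- next to the unparenthesized one, plus the "next sentence" form.

```pascal
program notdemo;
{ sketch: why a negated comparison needs parentheses }
var
  a, b: integer;
begin
  a := 5;
  b := 7;

  { compare first, then negate the boolean result }
  if not (a = b) then
    writeln('a and b differ');

  { without parentheses this parses as (not a) = b: a bitwise NOT of
    the integer a, followed by a comparison with b }
  if not a = b then
    writeln('this line is not printed')
  else
    writeln('(not a) = b is false, because not a is ', not a);

  { the "next sentence" form: do nothing when the condition is true }
  if a = b then
  else
    writeln('a and b differ (next-sentence form)');
end.
```

- Writing a <> b avoids the whole question when the test is a plain
- comparison.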
-
- Also, keep in mind that this is the part 10 practice, too, so be sure to
- at least attempt it!
-
- Next Time
- =========
- Interfacing with a common format; how data types are stored in memory and
- on disk. You may wish to obtain use of a hex viewer for this next part.
- Send comments to ggrotz@2sprint.net.
-
-
-