Text File | 1994-01-10 | 16.7 KB | 510 lines

NCSA HDF Vset
National Center for Supercomputing Applications
November 1990

Chapter 2  Vdatas

Chapter Overview
    Interlacing
    Working with Interlacing
    HDF Data Format
    Vdata Options
    Selecting Random-Access Reads
    Writing Fields of Mixed Types
    Specifying Read Fields
    Inquiring About a Vdata
    Using Searching Strategies
    Accessing Vdatas Simultaneously

Chapter Overview

This section points out some properties of vdatas that might not be obvious from earlier examples. It covers important concepts on which the HDF Vset is based, and shows how a vdata's properties and the various routines may be used to gain greater control over data organization and access.

Interlacing

Data for all the fields within a vdata is stored according to either a non-interlaced or a fully-interlaced scheme. The concept of interlacing is simple but important: it describes how data of one field is related to data of another in a storage area (whether in memory or in a vdata in the file). Note that interlacing applies only to fields.

In a non-interlaced scheme, all data values for one field are stored together contiguously. Think of all the data values for one field forming one data chunk, and all the data values for another field forming another chunk. Such an interlace scheme emphasizes keeping all data values for one field close together. Figure 2.1a shows a vdata with non-interlaced fields.

In a fully-interlaced scheme, the data values for one field are never located contiguously. Each data value of one field is immediately followed by one data value of another field, until a data value has been taken for all fields. In other words, data values from different fields are intertwined as aggregates, where each aggregate comprises exactly one data value from each of the fields. Such an interlace scheme emphasizes keeping related data values from different fields together. Figure 2.1b shows a vdata with fully-interlaced fields.
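The two storage orders described above can be sketched for the simple case of two float fields. This is only an illustration: the helper names layout_no_interlace and layout_full_interlace are hypothetical and are not part of the HDF Vset library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, NOT part of the HDF Vset library. They only
 * illustrate the two storage orders for two float fields, x and y.
 * In both cases `out` must have room for 2*n floats. */

/* Non-interlaced: all x values form one chunk, all y values another. */
void layout_no_interlace(const float *x, const float *y, size_t n, float *out)
{
    size_t i;
    assert(x && y && out);
    for (i = 0; i < n; i++) {
        out[i]     = x[i];   /* chunk holding field x */
        out[n + i] = y[i];   /* chunk holding field y */
    }
}

/* Fully interlaced: each x value is immediately followed by its y value. */
void layout_full_interlace(const float *x, const float *y, size_t n, float *out)
{
    size_t i;
    assert(x && y && out);
    for (i = 0; i < n; i++) {
        out[2 * i]     = x[i];   /* first member of aggregate i */
        out[2 * i + 1] = y[i];   /* second member of aggregate i */
    }
}
```

For x = {1, 2, 3} and y = {10, 20, 30}, the first routine produces 1 2 3 10 20 30 and the second produces 1 10 2 20 3 30.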
When talking about interlacing, a distinction is made between data in memory and data that resides in a vdata in a file. The layout of data in memory is its buffer interlace, whereas the layout of data within a vdata in the file is the vdata's file interlace.

Working with Interlacing

Data in the file may or may not have the same interlace as the data in memory. This feature allows applications to extract data interlaced in the manner most useful to them, independent of the file interlace. In the same way, data may be stored in the file with an interlace different from how it appears in memory.

The HDF Vset interface always requires that the buffer interlace be specified in any data transfer. However, it does not require that the file interlace be specified. By default, the file interlace is set to FULL_INTERLACE for every newly created vdata. Thus, fields in a vdata are fully-interlaced by default. You may change the interlace for any particular vdata to NO_INTERLACE with the routine VSsetinterlace. Vdatas within an HDF file may be set to different file interlaces if so desired. However, note that the file interlace of existing vdatas cannot be changed.

The buffer interlace must be specified when reading or writing data. Both the read and write routines, VSread and VSwrite, require the buffer interlace to be specified as the fourth argument. When reading data, the buffer interlace tells the read routine how the returned data should be interlaced in the read buffer. When writing data, the buffer interlace informs the write routine how data in the write buffer is interlaced.

Finally, note that when only one field is read or written, or when there is only one field in a vdata or in memory, interlacing is not relevant. In such cases, specifying FULL_INTERLACE or NO_INTERLACE yields identical results.

Figure 2.1  Interlacing Within Vdatas

HDF Data Format

Data stored in a vdata is in IEEE 32-bit floating-point format.
When data is read or written, the calling interface always converts from IEEE format to the correct data format and byte length for that particular machine, and vice versa. For instance, 4-byte floats created on a VMS machine are stored as 4-byte IEEE floats in the vset in the HDF file. Later, when a Cray (running UNICOS) reads the data, the 4-byte float value is converted to an 8-byte Cray float value.

The data stored within a vdata is always contiguous; i.e., the vdata does not contain any non-data spaces or holes. This is true for both fully-interlaced and non-interlaced data. Consequently, any call to VSread always returns data that is contiguous, and any call to VSwrite expects data to be contiguous.

Vdata Options

Selecting Random-Access Reads

The access routine VSseek is used together with VSread to effect a random-access read within a vdata. VSseek requires an argument that specifies the element location within that vdata. This value is an integer, with 0 for the first element position, 1 for the second element position, etc. Thus, the following code segment seeks to the 51st element in the vdata vs, and then reads data for four elements according to the fields specified by the last VSsetfields call:

    VSseek(vs, 50);
    VSread(vs, buf, 4, interlace);

Writing Fields of Mixed Types

The fields within one vdata need not all be of the same type. You may store fields of integer, float, and character types together within one vdata. Assume that the fields "F1", "F2", "I1" and "C1" represent data fields of float, float, integer, and character types of order 1, respectively. The code segment below first defines these fields, and then specifies that the data from buffer buf be written contiguously into the vdata vs according to the format "F1,I1,F2,C1".

    ...
    VSfdefine(vs, "F1", LOCAL_FLOATTYPE, 1);
    VSfdefine(vs, "I1", LOCAL_INTTYPE, 1);
    VSfdefine(vs, "F2", LOCAL_FLOATTYPE, 1);
    VSfdefine(vs, "C1", LOCAL_CHARTYPE, 1);
    VSsetfields(vs, "F1,I1,F2,C1");
    VSwrite(vs, buf, nvertices, interlace);

The data in buf must also be contiguous. There must be no padding, alignment bytes, or other non-data spaces.

Storing mixed field types within a vdata is very efficient, and is useful to applications that use structures. Such usage intuitively associates a structure in memory with a vdata in the file. However, in general, structures contain padding or alignment bytes (Figure 2.2) and as such may not be written directly into a vdata with a VSwrite call. The data from such a structure must first be packed into an array so that the array does not contain holes or non-data spaces. This packed array may then be written out into one vdata using VSwrite.

Figure 2.2  Structure Array Vs. Packed Array

The following code segment (Figure 2.3) illustrates packing data in the fields from a structure array ss into a contiguous array pp, and then writing the array pp into a vdata.

Figure 2.3  Packing Data

    #define NVERTICES 500

    main()
    {
        struct {
            float F1;
            int   I1;
            float F2;
            char  C1;
        } ss[NVERTICES];
        unsigned char *pp, *p;
        int i;
        ...
        /* allocate exactly the packed size, with no padding bytes */
        pp = (unsigned char*) malloc(NVERTICES *
                 (2*sizeof(float) + sizeof(char) + sizeof(int)));
        p = pp;
        /* pack one structure record per pass, in field order F1,I1,F2,C1 */
        for (i = 0; i < NVERTICES; i++) {
            movebytes(p, &ss[i].F1, sizeof(float)); p += sizeof(float);
            movebytes(p, &ss[i].I1, sizeof(int));   p += sizeof(int);
            movebytes(p, &ss[i].F2, sizeof(float)); p += sizeof(float);
            movebytes(p, &ss[i].C1, sizeof(char));  p += sizeof(char);
        }
        VSsetfields(vs, "F1,I1,F2,C1");
        VSwrite(vs, pp, NVERTICES, FULL_INTERLACE);
        ...
    } /* main */

    movebytes(dest, src, nbytes)
    unsigned char *dest, *src;
    int nbytes;
    {
        int i;
        for (i = 0; i < nbytes; i++)
            *dest++ = *src++;
    }

Note the following points:

1. The exact amount of memory is allocated (i.e., NVERTICES times the size of 2 floats, 1 char and 1 integer).
2. The routine movebytes is called for each field in the sequence desired (i.e., "F1,I1,F2,C1"). The movebytes routine may be replaced by any efficient byte-copy routine.
3. Data in one structure record is packed each time the code moves through the loop.
4. The contiguous data in array pp may now be stored by VSwrite.

Specifying Read Fields

The data read from a vdata depends only on the fields specified by the VSsetfields call. Thus, for reading, you can select one or more fields from the vdata. For example, assume that the vdata vs contains fields specified by "AFLOAT, BFLOAT, CCHAR, DINT, EINT, FCHAR" and is stored in that sequence. (The type of each field is suggested by the fieldname.) Each of the following VSsetfields calls is valid before a VSread call:

    VSsetfields(vs, "BFLOAT");
    VSsetfields(vs, "BFLOAT,AFLOAT");
    VSsetfields(vs, "EINT,AFLOAT");
    VSsetfields(vs, "CCHAR,DINT,FCHAR");
    VSsetfields(vs, "FCHAR, EINT, DINT, CCHAR, BFLOAT, AFLOAT");

The last call specifies that all fields be read in reverse sequence. Note that the VSread call always returns contiguous data for the fields specified. Furthermore, the data is returned in the sequence specified.

Several calls to VSsetfields and VSread may follow one another to read data from the same vdata. However, note that the read operations are sequential. Thus, in the following code segment, the first VSread returns 10 "AFLOAT" data values from the first 10 elements in the vdata, while the second VSread returns 10 "BFLOAT" data values from the second 10 elements (i.e., 11-20) in the vdata.

    VSsetfields(vs, "AFLOAT");
    VSread(vs, buf1, 10, interlace);
    VSsetfields(vs, "BFLOAT");
    VSread(vs, buf2, 10, interlace);

To actually read the first 10 "BFLOAT" data values, the access routine VSseek must be explicitly called to position the read pointer back to the first element position. The following code segment correctly reads the first 10 "AFLOAT" and "BFLOAT" values into two separate float arrays buf1 and buf2.
    VSsetfields(vs, "AFLOAT");
    VSread(vs, buf1, 10, interlace);
    VSseek(vs, 0);   /* seeks to first element */
    VSsetfields(vs, "BFLOAT");
    VSread(vs, buf2, 10, interlace);

Inquiring About a Vdata

VSinquire is the general inquiry routine for requesting information about the contents of a vdata. It returns the number of elements in the vdata; the interlace; a string containing the (comma-separated) names of the fields in the vdata; the byte size of an element in the vdata; and the name of the vdata itself, if any. This routine is useful when searching through the HDF file for a particular vdata by name or by fieldname.

Using Searching Strategies

Applications that allow users to specify which vgroup or vdata to access should contain general search strategies. The most general search strategy searches by the name of a vgroup or vdata. Names are the best means for the user to identify vgroups and vdatas, since names are readable and mnemonic compared to the integer identifiers (ids) of vgroups and vdatas.

Searching for a Vgroup by Name

Locating a vgroup by name is simple. It merely involves stepping through all vgroups in the file and inquiring for the name of each one. The search routine Vgetid sequences through the file and returns the vgids of the vgroups one at a time, or returns -1 when no more vgids are found. The code in Figure 2.4 illustrates the search for a vgroup named "transistor P".

Figure 2.4  Searching for a Vgroup

    char vgname[50];
    VGROUP *vg;
    int vgid;
    ...
    vgid = -1;
    found = 0;
    while ((vgid = Vgetid(f, vgid)) != -1) {
        vg = (VGROUP*) Vattach(f, vgid, "r");
        Vinquire(vg, &nentries, vgname);
        if (!strcmp(vgname, "transistor P")) {   /* names match */
            found = 1;
            break;
        }
        else
            Vdetach(vg);
    }
    if (!found) {
        printf("vgroup 'transistor P' not found");
        return;
    }
    ...

Searching for a Vdata by Name

Similar code is used for searching for a specific vdata by name.
The code in Figure 2.5 looks for the vdata named "transistor P voltages". Here, the routines VSinquire and VSgetid are used instead of Vinquire and Vgetid.

Figure 2.5  Searching for a Vdata

    char vsname[50];
    VDATA *vs;
    int vsid;
    ...
    vsid = -1;
    found = 0;
    while ((vsid = VSgetid(f, vsid)) != -1) {
        vs = (VDATA*) VSattach(f, vsid, "r");
        VSinquire(vs, &nvertices, &interlace, fields, &vsize, vsname);
        if (!strcmp(vsname, "transistor P voltages")) {   /* names match */
            found = 1;
            break;
        }
        else
            VSdetach(vs);
    }
    if (!found) {
        printf("vdata 'transistor P voltages' not found");
        return;
    }
    ...

Hierarchical Search for a Vdata by Name

A more systematic way of searching for a vdata takes into account the hierarchical nature of the vset. Thus, you would first go through all vgroups and examine each entry in the vgroup. Each entry is either a vgroup or a vdata identifier. If it is a vdata identifier, just inquire for its name. However, if it is a vgroup identifier, all the entries in that vgroup need to be searched in a similar manner.

The code in Figure 2.6 looks for the vdata named "pressure site#9". The search routine Vgetid is used to locate all the vgroups one by one, while the search routine Vgetnext is used to locate the identifiers of all the entities in a vgroup. Finally, the inquiry routines Visvg and Visvs each test whether an identifier in a vgroup is a vgroup or a vdata.

Figure 2.6  Hierarchical Searching

    VGROUP *vg;
    VDATA *vs;
    int vgid, vsid, id;
    ...
    found = 0;
    vgid = -1;
    while ((vgid = Vgetid(f, vgid)) != -1) {
        vg = (VGROUP*) Vattach(f, vgid, "r");
        id = -1;
        while ((id = Vgetnext(vg, id)) != -1) {
            if (Visvs(vg, id)) {             /* vdata id */
                vs = (VDATA*) VSattach(f, id, "r");
                VSinquire(vs, &nvertices, &interlace, fields, &vsize, vsname);
                if (!strcmp(vsname, "pressure site#9")) {
                    found = 1;
                    break;
                }
                else
                    VSdetach(vs);
            }
            else if (Visvg(vg, id)) {        /* vgroup id */
                ... /* perform similar search on vgroup vg */
            }
        }
        if (found) break;
        Vdetach(vg);
    }
    if (!found) {
        printf("vdata 'pressure site#9' not found");
        return;
    }
    ...
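The idea of searching nested vgroups "in a similar manner" is naturally expressed as recursion. The sketch below shows that idea only on a hypothetical in-memory tree: the node type and the find_vdata routine are illustrative assumptions, not the library's VGROUP/VDATA structures or any Vset routine, and the sketch assumes the hierarchy contains no cycles.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stand-in for a vset hierarchy. This is NOT
 * the HDF Vset API; it only models "a vgroup holds vgroups and vdatas". */
typedef struct node {
    const char   *name;
    int           is_vgroup;   /* 1 = vgroup, 0 = vdata */
    struct node **entries;     /* children; used only for vgroups */
    size_t        nentries;
} node;

/* Depth-first search for a vdata called `name` at or below `root`.
 * Returns the matching vdata node, or NULL if none is found.
 * Assumes the hierarchy is a tree (no vgroup contains itself). */
const node *find_vdata(const node *root, const char *name)
{
    size_t i;
    assert(root && name);
    if (!root->is_vgroup)      /* vdata: just compare its name */
        return strcmp(root->name, name) == 0 ? root : NULL;
    for (i = 0; i < root->nentries; i++) {   /* vgroup: search each entry */
        const node *hit = find_vdata(root->entries[i], name);
        if (hit)
            return hit;
    }
    return NULL;
}
```

A vgroup whose name happens to match is not returned, since only vdata nodes are compared against the requested name.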
Searching for a Vdata by Fieldname

Finally, to search for a vdata by specifying only the fieldnames, use the inquiry routine VSfexist. This routine takes a vdata pointer as an argument, and returns true if all the specified fields exist in that vdata. In the following code segment (Figure 2.7), all vdatas in the file are searched through until the first vdata to contain the fields "PZ, PX" is found.

Figure 2.7  Search Using Fieldnames

    VDATA *vs;
    int vsid;
    ...
    vsid = -1;
    found = 0;
    while ((vsid = VSgetid(f, vsid)) != -1) {
        vs = (VDATA*) VSattach(f, vsid, "r");
        if (VSfexist(vs, "PZ,PX")) {
            found = 1;
            break;
        }
        else
            VSdetach(vs);
    }
    if (!found) {
        printf("fields 'PZ,PX' not found in any vdata");
        return;
    }
    ...

Note that VSfexist returns true only if all the specified fields, in any order, exist in the vdata. For instance, specifying "PZ,PX" or "PX,PZ" in the call to VSfexist will return true for a vdata that contains the fields "PX,PY,PZ".

Accessing Vdatas Simultaneously

Several vdatas may be attached for simultaneous access. This is analogous to having several files opened at the same time.

Reading Data

Reading data from several vdatas at the same time is efficient and easy, as the code segment in Figure 2.8 shows.

Figure 2.8  Accessing Vdatas Simultaneously

    float xval, yval;

    v1 = (VDATA*) VSattach(f, vsid1, "r");
    v2 = (VDATA*) VSattach(f, vsid2, "r");
    VSsetfields(v1, "PX");
    VSsetfields(v2, "DENSITY");
    for (i = 0; i < 1000; i++) {
        VSread(v1, &xval, 1, FULL_INTERLACE);
        VSread(v2, &yval, 1, FULL_INTERLACE);
        plot_point(xval, yval);
    }
    ...
    VSdetach(v1);
    VSdetach(v2);

Note the following points about the above code:

1. Two vdatas v1 and v2 are opened simultaneously.
2. The VSread from v1 returns one "PX" value in xval, while the VSread from v2 returns one "DENSITY" value in yval.
3. A plotting routine, plot_point, processes each pair of data in xval and yval.
Writing Data

Writing data to several vdatas in the same file simultaneously is supported, but can be inefficient since each vgroup and each vdata is always maintained as a contiguous data block in the file.

As an example, assume that two vdatas, A and B, exist in an HDF file such that B immediately follows A in the file. When new data is to be written out (appended) to A, the file storage for A has to be expanded; however, this cannot be done since B immediately follows A. Thus, A must be copied to the end of the file, and only then can new data be appended to it. This procedure creates an area of useless space in the file where the original vdata A was located. As a result, file space becomes fragmented and the file grows unnecessarily large.

As a rule, when writing data out to several vdatas, it is more efficient to completely write out data for one vdata before proceeding to write data for the next vdata.
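The cost of alternating appends can be illustrated with a toy model of the rule just described: a vdata that is not the last block in the file must be copied to the end before it can grow, and its old location becomes dead space. The file_model type and the model_* routines below are hypothetical and greatly simplified; they are not how the HDF library is implemented and ignore all header and bookkeeping overhead.

```c
#include <assert.h>
#include <string.h>

#define MAX_VDATAS 8

/* Toy model of a file holding contiguous vdata blocks (an assumption
 * made for illustration only, not the actual HDF file layout). */
typedef struct {
    long offset[MAX_VDATAS];  /* start of each vdata's block */
    long size[MAX_VDATAS];    /* bytes currently in each block */
    long file_end;            /* first free byte at the end of the file */
    long wasted;              /* bytes abandoned by relocations */
    int  count;               /* number of vdatas created */
} file_model;

void model_init(file_model *m)
{
    assert(m);
    memset(m, 0, sizeof *m);
}

/* Create an empty vdata at the current end of file; returns its index. */
int model_create(file_model *m)
{
    assert(m && m->count < MAX_VDATAS);
    m->offset[m->count] = m->file_end;
    m->size[m->count] = 0;
    return m->count++;
}

/* Append nbytes to vdata v, relocating it first if it cannot grow in place. */
void model_append(file_model *m, int v, long nbytes)
{
    assert(m && v >= 0 && v < m->count && nbytes >= 0);
    if (m->offset[v] + m->size[v] != m->file_end) {
        /* Another block follows: copy this vdata to the end of the file. */
        m->wasted   += m->size[v];   /* old copy becomes useless space */
        m->offset[v] = m->file_end;
        m->file_end += m->size[v];
    }
    m->size[v]  += nbytes;
    m->file_end += nbytes;
}
```

In this model, appending 100 bytes to A, then B, then A, then B ends with a 600-byte file of which 200 bytes are dead space, whereas writing both appends to A before both appends to B ends with a 400-byte file and no dead space.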