Text File | 2001-04-17 | 42.6 KB | 1,189 lines |
-
- 7. Device Specific Hints and Features
-
-
-
- 7.1 Max Impact, High Impact, Solid Impact, MXI, SSI, SI
-
-
-
- 7.1.1 Pbuffers on Impact  The OpenGL Pbuffer extension is
- now supported on all Impact configurations: Max Impact,
- High Impact, and Solid Impact. These release notes describe
- the Impact Pbuffer implementation in detail. This will be
- of use primarily to developers of OpenGL applications that
- use pbuffers.
-
-
- 7.1.1.1 Pbuffer Extensions  Note that you will have to use
- the FBConfig extension to create pbuffers. Also, if you
- want to copy between pbuffers and windows, you will need to
- use the MakeCurrentRead extension. This release includes
- all three extensions: Pbuffers, MakeCurrentRead, and
- FBConfig.
-
-
- 7.1.1.2 Pbuffer Sizes  Each pbuffer allocated consumes a
- set of bitplanes with the same geometry as the entire
- screen. So, if you run the monitor at 1280x1024, allocating
- a single pbuffer will actually consume 1280x1024 pixels of
- offscreen framebuffer memory. It is impossible to create a
- pbuffer larger than the screen size. You can create a
- smaller one, but the remaining memory in that set of
- bitplanes will be wasted and cannot be allocated.
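The sizing rule above can be sketched as a small helper. This is only an illustration of the arithmetic, counted in pixels per set of bitplanes; the type and function names are hypothetical and are not part of any SGI API.

```c
#include <stdbool.h>

/* Hypothetical record describing what a pbuffer request costs. */
typedef struct {
    bool valid;     /* false if the request exceeds the screen size */
    long consumed;  /* pixels of offscreen memory actually reserved */
    long wasted;    /* reserved pixels the pbuffer does not use */
} PbufferFootprint;

PbufferFootprint pbuffer_footprint(int screen_w, int screen_h,
                                   int pbuf_w, int pbuf_h)
{
    PbufferFootprint f = { false, 0, 0 };

    /* A pbuffer larger than the screen cannot be created. */
    if (pbuf_w <= 0 || pbuf_h <= 0 || pbuf_w > screen_w || pbuf_h > screen_h)
        return f;

    f.valid = true;
    /* Every pbuffer reserves a full screen-sized set of bitplanes. */
    f.consumed = (long)screen_w * screen_h;
    f.wasted = f.consumed - (long)pbuf_w * pbuf_h;
    return f;
}
```

For example, a 512x512 pbuffer on a 1280x1024 screen still reserves all 1280x1024 pixels, so most of the reserved area goes unused.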
-
-
- In some Impact configurations, there are actually multiple
- sets of bitplanes available to pbuffers (pbuffer banks), so
- in some cases (e.g. Max Impact) it is possible to
- allocate more than one pbuffer at a time. See the section
- below called "Recommended pbuffer and window combinations."
-
-
- 7.1.1.3 Pbuffer formats  In this release, all X visuals
- (except overlays) support pbuffers. This includes
- double-buffered and stereo quad-buffered visuals, as well
- as visuals with depth and stencil planes.
-
-
- All appear to work fine, and allow copies from pbuffers to
- windows, and windows to pbuffers. Note that the context
- used to do this copy must be created with the same fbconfig
- as the window (not the pbuffer). See the "glXMakeCurrent
- compatibility workaround" section below for details.
-
-
- 7.1.1.4 glXMakeCurrent compatibility relaxation  In this
- release, glXMakeCurrent's strict compatibility requirements
- have been relaxed for pbuffers (but not for windows). Any
- context that meets certain very minimal requirements can be
- used to render into a pbuffer, or copy from a pbuffer. The
- requirement is simply that the pbuffer and context must
- have the same renderType; in other words, they must both be
- color index, or both RGBA. There is no requirement that the
- color depths of the context match those of the pbuffer, or
- anything of the sort.
-
-
- Note that windows still have strict similarity requirements.
- Contexts and windows bound together with glXMakeCurrent must
- have been created from the same visual.
-
-
- 7.1.1.5 glXMakeCurrent compatibility workaround  In order
- to copy between pbuffers and windows, you will need to use
- the glXMakeCurrentReadSGIX extension. This is all very
- straightforward if the window, pbuffer, and context you are
- using were all created with the same FBConfigID (using
- glXCreateContextWithConfigSGIX).
-
-
- If they weren't created from the same fbconfigid, things get
- more complicated. A problem arises since there is a bug in
- the X server which requires that windows (but not pbuffers)
- and contexts be created from the exact same visual (or
- fbconfig). (This does not follow the spec since the
- fbconfig spec specifies that windows should work with any
- "compatible" context if the context was created from an
- fbconfig.) You must make sure that the window and context
- were always created with the EXACT SAME fbconfig or visual
- id. So, to copy from a single buffered pbuffer to a double
- buffered window, you must:
-
-
- +o create a single-buffered (sb) pbuffer
-
- +o create a double-buffered (db) window
-
- +o create a double-buffered (db) context
-
-
- This will allow glXMakeCurrentReadSGIX to work correctly
- around this bug. Note that the above example would not work
- if the context were created single-buffered. This is due to
- the fact that makecurrent between the sb context and db
- window would fail since the window and context were not
- created with reference to the exact same visual (or
- fbconfig).
-
-
- Note that the db context in the example above may be used
- with makecurrent(pbuffer), makecurrent(window), and
- makecurrentread(window, pbuffer), or
- makecurrentread(pbuffer, window) even though the pbuffer is
- single buffered and the context double buffered. This is
- correct behavior according to the pbuffer spec. Window
- behavior should be corrected in a subsequent release so that
- window compatibility is properly tested against contexts.
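The binding rules described above can be modeled in a few lines. The sketch below is not GLX code: the record types and function names are hypothetical, and it only encodes the behavior stated in these notes (windows need the exact same fbconfig or visual as the context, pbuffers need only a matching render type).

```c
#include <stdbool.h>

typedef enum { RENDER_RGBA, RENDER_COLOR_INDEX } RenderType;
typedef enum { DRAWABLE_WINDOW, DRAWABLE_PBUFFER } DrawableKind;

/* Illustrative records only; real code would hold GLX handles. */
typedef struct {
    DrawableKind kind;
    int fbconfig_id;        /* fbconfig (or visual) used at creation */
    RenderType render_type;
} Drawable;

typedef struct {
    int fbconfig_id;
    RenderType render_type;
} Context;

/* Can this context be made current to this drawable in this release? */
bool can_bind(const Drawable *d, const Context *c)
{
    if (d->kind == DRAWABLE_WINDOW)
        return d->fbconfig_id == c->fbconfig_id;  /* strict for windows */
    return d->render_type == c->render_type;      /* relaxed for pbuffers */
}

/* A copy needs the context to be bindable to both drawables. */
bool can_copy(const Drawable *draw, const Drawable *read, const Context *c)
{
    return can_bind(draw, c) && can_bind(read, c);
}
```

With a single-buffered pbuffer and a double-buffered window, only a context created from the window's fbconfig passes both checks, which is the combination recommended in the list above.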
-
-
- See the section at the end about Impact-specific
- glXMakeCurrentRead Compatibility.
-
-
- 7.1.1.6 Pbuffer performance  The single most important
- thing you can do to ensure good pbuffer performance on
- Impact is to make sure that the windows that your
- applications are using are not using X visuals with Z
- bitplanes unless absolutely necessary. There are conditions
- outlined below under "pbuffer bank calculations" which cause
- zbuffers to have to swap into host memory when their bank
- usage conflicts with pbuffers. The simplest way to avoid
- this is to create all windows using visuals without
- zbuffers.
-
-
- The most expensive operation with pbuffers is actually
- glXMakeCurrentReadSGIX. Try to minimize your use of this
- routine in order to maximize performance. In pursuit of
- this goal, you should try to minimize the number of contexts
- you use in your application.
-
-
- 7.1.1.7 Pbuffer bank calculations  In order to calculate
- how many pbuffers you can have concurrently with your
- windows, use the following procedure. First, determine how
- many pbuffer banks are available in your system, using the
- table below called "Pbuffer bank availability".
-
-
- Now you need to determine how many banks your application
- requires. Use the table below called "Pbuffer bank usage"
- to look up how many banks each of the buffers that your
- application uses requires. Note that this is a global
- resource, so you must include in your calculations all
- applications running on the machine. If another
- application allocates a pbuffer, then there is one fewer
- pbuffer bank available for your application. Similarly, if
- any application uses a Z buffer, there will not be enough
- pbuffer banks to support a pbuffer at the same time.
-
-
- Pbuffers are capable of sharing the pbuffer banks with Z
- buffers, and the X server supports swapping the pbuffer bank
- when necessary so the bitplanes may be used for both
- purposes at once. This will incur a substantial performance
- penalty which may be prohibitive for some applications. In
- other cases where applications are willing to accept
- pbuffer/zbuffer swapping, you may allow a pbuffer to "share"
- bitplanes with a zbuffer in your calculations. The one
- exception to this sharing is that you cannot use
- glXMakeCurrentReadSGIX with both a window with a zbuffer and
- a pbuffer that resides in the same bitplanes as that
- zbuffer. In such a case, glXMakeCurrentReadSGIX will return
- GL_FALSE and fail.
-
-
- 7.1.1.8 Example pbuffer bank calculations  Note that
- "overlap" in the table below refers to zbuffer/pbuffer
- overlap. Such overlap is not allowed in a single call to
- glXMakeCurrentReadSGIX. Such overlap may incur swapping
- performance penalties.
-
-
- 7.1.1.8.1 no overlap
-
- +o 0 banks window
-
- +o 1 bank window + z/s buffer
-
- +o 1 bank window + 12(L) pbuffer
-
- +o 1 bank window + 8888 pbuffer
-
- +o 2 banks window + (2) 12(L) pbuffers
-
- +o 2 banks window + (2) 8888 pbuffers
-
- +o 2 banks window + 8888 pbuffer + z/s buffer
-
- +o 3 banks window + (2) 8888 pbuffers + z/s buffer
-
- +o 2 banks window + 12/12/12 pbuffer + z/s buffer
-
-
- 7.1.1.8.2 with overlap
-
- +o 0 banks window
-
- +o 1 bank window + z/s buffer
-
- +o 1 bank window + 12(L) pbuffer
-
- +o 1 bank window + 8888 pbuffer
-
- +o 2 banks window + (2) 12(L) pbuffers
-
- +o 2 banks window + (2) 8888 pbuffers
-
- +o 1 bank window + 8888 pbuffer + z/s buffer
-
- +o 2 banks window + (2) 8888 pbuffers + z/s buffer
-
- +o 2 banks window + 12/12/12 pbuffer + z/s buffer
-
-
- 7.1.1.9 Pbuffer bank availability  The framebuffer memory
- available for pbuffers in Max and High Impact systems is
- organized as follows. Note that on High Impact, the number
- of banks available for pbuffers depends on the timing table
- which is loaded when the X Server starts.
-
-
- 7.1.1.9.1 Max Impact
-
- +o normal timing tables: 2 pbuffer banks
-
- +o 1024x768 timing tables: 4 pbuffer banks
-
- +o 1600x1200 timing table: 1 pbuffer bank
-
- +o 1600x1200 32db: none
-
-
- 7.1.1.9.2 High Impact and Solid Impact
-
- +o normal timing tables: 1 pbuffer bank
-
- +o 1024x768 timing tables: 1 pbuffer bank
-
- +o 1024x768 pbuf: 2 pbuffer banks
-
- +o 32db timing tables: none
-
- +o 1600x1200 timing table: none
-
-
- 7.1.1.10 Pbuffer bank usage  OpenGL pbuffer bank usage:
-
-
- 7.1.1.10.1 Color buffers
-
- +o two banks db 8/8/8/8
-
- +o two banks db 12/12/12
-
- +o two banks stereo db (any resolution)
-
- +o one bank all other color resolutions
-
- 7.1.1.10.2 Ancillary buffers
-
- +o add one extra bank for visuals with Z and/or stencil
-
-
- N.B.: 12-12-12 color buffers (without depth) are prohibited
- from being allocated in the bitplanes normally used by the
- zbuffer (pbuffer bank 0). The zbuffer bank will be
- allocated last when you are allocating a series of pbuffers,
- so the simplest workaround is to make sure that you
- allocate any 12-12-12 pbuffers before your other pbuffers.
- This restriction will manifest itself as
- glXCreateGLXPbufferSGIX failing due to BadAlloc.
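The bank arithmetic from the usage and availability tables can be written down as a small calculation. This is a sketch of the "no overlap" case only. It assumes that all windows with a zbuffer share the single zbuffer bank, and the type and function names are made up for illustration.

```c
#include <stdbool.h>

typedef enum { COLOR_8888, COLOR_12_12_12, COLOR_OTHER } PbufferColor;

/* Hypothetical description of one pbuffer an application wants. */
typedef struct {
    PbufferColor color;
    bool double_buffered;
    bool stereo;            /* stereo db: two banks at any resolution */
    bool has_z_or_stencil;
} PbufferRequest;

/* Banks used by one pbuffer, per the "Pbuffer bank usage" table. */
int pbuffer_banks(const PbufferRequest *p)
{
    int banks = 1;  /* all other color resolutions */

    if (p->stereo)
        banks = 2;
    else if (p->double_buffered &&
             (p->color == COLOR_8888 || p->color == COLOR_12_12_12))
        banks = 2;

    if (p->has_z_or_stencil)
        banks += 1;  /* one extra bank for Z and/or stencil */
    return banks;
}

/* Do all requests fit without zbuffer/pbuffer overlap?
 * any_window_has_z: some window on the machine uses a zbuffer.
 * banks_available: value from the "Pbuffer bank availability" table. */
bool banks_fit(const PbufferRequest *reqs, int count,
               bool any_window_has_z, int banks_available)
{
    int needed = any_window_has_z ? 1 : 0;

    for (int i = 0; i < count; i++)
        needed += pbuffer_banks(&reqs[i]);
    return needed <= banks_available;
}
```

Remember that the banks are a global resource, so the request list should cover every application on the machine, not just your own.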
-
-
- 7.1.1.11 Impact-specific glXMakeCurrentRead Compatibility
-
- +o 1) Render types must match (color index/rgba).
-
- +o 2) Pbuf with depth & window with depth are
- incompatible.
-
- +o 3) Pbufs in bank 0 and window with z are incompatible.
- (Pbuffers will be put in bank 0 last.) (Pbuffers
- allocated earlier are not likely to be in bank 0.)
-
- +o 4) Color depths of drawables do NOT need to match.
-
- +o 5) DB/Stereo do NOT need to match.
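The five rules above reduce to three checks, since rules 4 and 5 only say what does not need to match. The following is a minimal sketch with hypothetical record types. In particular, the bank field assumes the application knows which bank a pbuffer landed in, which these notes only let you infer from allocation order.

```c
#include <stdbool.h>

typedef enum { RT_RGBA, RT_COLOR_INDEX } RenderKind;

/* Illustrative records; not GLX types. */
typedef struct {
    RenderKind render;
    bool has_depth;
} WindowInfo;

typedef struct {
    RenderKind render;
    bool has_depth;
    int bank;  /* pbuffer bank in use; bank 0 is the zbuffer bank */
} PbufferInfo;

/* May this window and pbuffer be used together in one
 * glXMakeCurrentRead call on Impact? */
bool make_current_read_compatible(const WindowInfo *w, const PbufferInfo *p)
{
    if (w->render != p->render)
        return false;                      /* rule 1 */
    if (w->has_depth && p->has_depth)
        return false;                      /* rule 2 */
    if (w->has_depth && p->bank == 0)
        return false;                      /* rule 3 */
    return true;  /* rules 4 and 5: color depth and db/stereo are free */
}
```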
-
- 7.1.2 Optimized Vertex Arrays  IRIX 6.5.1 introduces
- performance optimizations for back-face culled primitives
- drawn using OpenGL vertex arrays. Acceleration is currently
- available in direct-rendered OpenGL contexts for
- GL_TRIANGLE_STRIP primitives rendered using glDrawArrays
- with back-face culling (GL_CULL_FACE) enabled. To enable
- acceleration for an OpenGL context, the GLMGRARRAYOPT
- environment variable must be defined when the context is
- created and first made the current rendering context.
- Enabling this optimization may yield incorrect back-face
- culling results when combined with noncanonical usage of the
- projection and modelview matrices.
-
-
- 7.1.3 Yielding CPU Cycles to a Lower Priority Thread During
- DMA Operations  The MXI, SSI and SI OpenGL implementations
- utilize a DMA completion synchronization mechanism that will
- not explicitly yield CPU cycles to lower priority threads
- while a DMA operation is outstanding.
-
-
- IRIX 6.5.5 supports a new systune variable which configures
- the OpenGL implementation for these products (only) to
- provide that behavior. The systune variable is in the gfx
- group and is named gfxdmasleepthreshold. The default value
- of 0 configures the system for the pre-6.5.5 behavior.
- Nonzero values are interpreted as the transfer size, in
- bytes, of a DMA operation beyond which the CPU is explicitly
- yielded until the operation completes.
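The threshold rule can be modeled as a one-line predicate. This is only a description of the documented behavior, not the implementation. Reading "beyond" as strictly greater than the threshold is an assumption, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Does waiting on a DMA of this size explicitly yield the CPU to
 * lower priority threads, given the gfxdmasleepthreshold setting? */
bool dma_wait_yields_cpu(size_t transfer_bytes, size_t gfxdmasleepthreshold)
{
    if (gfxdmasleepthreshold == 0)
        return false;  /* default: pre-6.5.5 behavior, never yields */
    return transfer_bytes > gfxdmasleepthreshold;
}
```

With the threshold set to 80000, a 16KB glDrawPixels transfer would not yield, while a 1MB transfer would.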
-
-
- Enabling this property of the MXI, SSI and SI OpenGL
- implementations can have undesirable performance effects
- upon imaging operations if not tuned properly. At small
- transfer size threshold values, aggregate throughput
- degradations of up to 15 percent have been measured (for
- 16KB glDrawPixels transfers). Empirical testing on a
- two-processor 250MHz R10K MXI system has shown that setting
- the value to 80000 yields a marked improvement in regaining
- CPU cycles for lower priority threads, with about a 5
- percent degradation in pixel transfer rate. Individual
- application performance will vary; this suggested value
- should be used only as a guide.
-
-
- Note that the default behavior will yield to threads of
- equal or higher priority. Therefore, consider enabling this
- only if it is desirable to have a lower priority thread run
- during lengthy pixel movement operations.
-
-
- 7.2 InfiniteReality Performance Hints
-
- Set all texture parameters (especially the minification
- filter) before downloading a texture.
-
- Enable texturing before downloading any textures.
-
- For best performance and greatest control over texture
- memory allocation, use subtexture loads to manage texture
- memory.
-
- 7.3 InfiniteReality Specific Features
-
- 7.3.1 Digital Display Option (DDO) support  DDO is a new
- video option board for Onyx InfiniteReality systems.
-
- 7.3.2 High Quality Multisampled Anti-Aliased Points  The
- quality of multisampled anti-aliased points has been
- improved. To use these new improved points:
-
- glDisable(GL_POINT_SMOOTH);
-
- glEnable(GL_MULTISAMPLE_SGIS);
-
- To get the best looking points, some care must be taken to
- set the GL_POINT_FADE_THRESHOLD_SIZE correctly. For general
- use we suggest using a threshold of zero to disable the
- threshold size feature.
-
- glPointParameterfSGIS(GL_POINT_FADE_THRESHOLD_SIZE, 0.0);
-
- 7.3.3 Pipeline Instrumentation  The SGIX_instruments OpenGL
- extension defines a new mechanism for measuring the
- performance of the graphics pipeline. It can be used to
- determine when an application is limited by pixel fill,
- geometry processing load, etc. This is helpful for general
- performance tuning and for maintaining a guaranteed frame
- rate in simulation systems.
-
- 7.3.4 Forcing Completion of Rasterization  The
- SGIX_flush_raster OpenGL extension forces all rasterization
- operations to be completed before processing the next OpenGL
- command. Unlike glFinish(), it does not prevent the
- application from issuing new commands. This is used in
- conjunction with the SGIX_instruments extension to ensure
- that rasterization is complete before taking a measurement
- of the graphics pipeline.
-
- 7.3.5 Display-List Memory Management  One of the new
- hardware features of InfiniteReality is a display-list cache
- memory. Display lists may be transferred from this memory
- at roughly twice the maximum speed possible for display
- lists stored in main memory. The SGIX_list_priority OpenGL
- extension allows applications to manage the contents of the
- display list cache by setting residence priorities for
- display lists.
-
- Note that this interface is experimental and its behavior is
- subject to change. We are very interested in feedback
- concerning it.
-
- 7.3.6 Calligraphic Light Support  The
- SGIX_calligraphic_fragment OpenGL extension allows position
- and coverage information for light points to be transmitted
- to a combination calligraphic/raster display system. This
- is valuable for night-time flight simulation.
-
- 7.3.7 Large Color Tables  Luminance-format color tables may
- now have up to 32K entries. Non-luminance color tables may
- have up to 16K entries. (Performance is maximized when color
- tables have 4K or fewer entries, however.)
-
- 7.3.8 Virtual Clipmaps  Clipmaps are an extension of
- mipmaps, intended to handle texture mapping for extremely
- large textures (such as high-resolution satellite
- photographs of the entire Earth). The first release of
- clipmaps was limited to 15 levels (implying a maximum
- texture size of 32Kx32K). This has been changed to allow a
- much larger number of levels, provided that no more than 15
- adjacent levels are resident in texture memory at any one
- time.
-
- Even now though, there remains a limitation that over-
- subscribing texture memory with two 17-level clipmaps will
- fail. For now, you must not allocate more clipmaps than
- will fit into your physical texture memory.
-
- 7.3.9 Video Pan/Zoom  libXvc now supports the ability to
- pan over a framebuffer area larger than the display, as well
- as to zoom the display up or down without re-rendering.
-
- 7.3.10 Old-Style Stereo  The first InfiniteReality release
- included support only for stereo-in-a-window (``new-style''
- stereo). This release also supports old-style stereo, in
- which the screen is split into two parts and each part is
- scaled up by a factor of two.
-
- 7.3.11 Depth Textures  The SGIX_depth_texture extension
- defines the behavior of depth textures (analogous to color
- textures). Currently depth textures are used for real-time
- shadows.
-
- 7.3.12 Synchronized Buffer Swap  The SGIX_swap_barrier
- extension allows buffer swaps to be synchronized with an
- external event. Normally this is used to coordinate
- swapping among several machines, each of which is
- responsible for a portion of a video wall or other
- sophisticated multiple-display system.
-
- 7.3.13 Swap Groups  The SGIX_swap_group OpenGL extension
- provides the ability to synchronize the buffer swaps of a
- group of windows. For example, a double-buffered main
- window and its associated double-buffered overlay window can
- be placed in a swap group so that they will always be
- buffer-swapped together.
-
- 7.3.14 Display Lists  Display lists can now be transferred
- from main memory to the graphics pipeline by DMA. This is
- substantially faster than immediate mode, though not as fast
- as display lists in the display list cache memory.
-
- For now, applications are limited to 128M of DMA-able
- display lists. Beyond that, the code falls back to non-DMA
- display lists.
-
- 7.3.15 Packed Vertex Array Formats  Support for certain
- vertex array formats has been optimized with special
- microcode. See ``man glvertexpointerext'' for details.
-
- 7.3.16 Bitmaps (Text)  Drawing display-listed bitmaps
- (OpenGL-based text) is dramatically faster.
-
- 7.3.17 Histograms  Single-component histograms (a common
- case for luminance-only image-processing operations) and the
- glGetHistogramEXT() command have been tuned.
-
- 7.3.18 Small-Area Pixel Operations  The overhead for
- glCopyPixels() and glDrawPixels() operations has been
- reduced, allowing significantly better throughput for those
- operations when applied to small pixel arrays.
-
- 7.3.19 Mode Changes  Performance for a variety of
- mode-change operations has been improved.
-
- 7.3.20 Culling  Backface culling performance has been
- improved, though the penalty for using culling remains
- higher than it was on RealityEngine.
-
- 7.3.21 Evaluators  A number of optimizations have been
- applied to parametric polynomial surfaces (evaluators).
-
- 7.3.22 Convolution  Performance has been improved
- substantially, especially for separable convolutions and for
- luminance-only convolutions.
-
- 7.3.23 Texture Binds  Binding to a texture that is resident
- in texture memory is now over three times faster than it was
- in the first release.
-
- 7.3.24 Quadrilaterals  Decomposition of quadrilaterals into
- triangles has been changed, yielding 30%-50% better
- performance for geometry-limited quadrilateral strips.
-
-
- 7.4 Octane2 VPro Performance Hints
-
-
- Octane2 VPro host software support is optimized for the n32
- and n64 object file formats, and the mips4 instruction set.
- The old o32 object file format is supported, but cannot use
- the mips4 instruction set. Programs in this format will
- therefore miss out on optimizations in matrix arithmetic and
- in efficient submission of display lists to the graphics
- hardware.
-
- Users who run applications which make heavy use of the
- glAccum() function should configure their Octane2 VPro
- systems to take advantage of the hardware accumulation
- buffer. This buffer is not part of the default
- configuration as these systems are shipped. See the
- xsetmon(1) man page, and the "Unified Graphics Memory" topic
- in the Features section below, for details.
-
- Octane2 VPro provides optimized transfer paths for a
- selection of vertex array layouts. Use of these formats may
- substantially increase performance in glDrawArrays(),
- glDrawElements() and glDrawRangeElements(). This set of
- formats includes some in the set of enums provided for
- glInterleavedArrays() and some others as well. Using any
- means (either glInterleavedArrays() or explicit use of
- glVertexPointer() and related functions) for specifying the
- optimized formats will result in the optimized path being
- taken.
-
- The list of optimized formats follows. In this list,
- abbreviations are used which are akin to the enums provided
- for glInterleavedArrays(), even though some of these
- abbreviations are not actual legal enums. In the latter
- case, one would use glVertexPointer() and related functions
- to specify the array format. We hope the meaning is clear.
-
- +o Packed V3F
-
- +o Interleaved C3F_V3F
-
- +o Packed separate C3F_V3F
-
- +o Interleaved N3F_V3F
-
- +o Interleaved V3F_N3F
-
- +o Packed separate N3F_V3F
-
- +o Packed separate N3S_V3F
-
- +o Interleaved T2F_V3F
-
- +o Packed separate T2F_V3F
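As an illustration of the interleaved entries in the list above, the C structs below lay out one vertex record for three of the formats. The struct and constant names are hypothetical. The field order follows the standard GL_C3F_V3F, GL_N3F_V3F and GL_T2F_V3F layouts accepted by glInterleavedArrays(), and the sketch assumes the compiler adds no padding between float arrays, which holds on common ABIs.

```c
#include <stddef.h>

/* Interleaved C3F_V3F: color first, then position. */
typedef struct {
    float color[3];
    float position[3];
} VertexC3F_V3F;

/* Interleaved N3F_V3F: normal first, then position. */
typedef struct {
    float normal[3];
    float position[3];
} VertexN3F_V3F;

/* Interleaved T2F_V3F: texture coordinate first, then position. */
typedef struct {
    float texcoord[2];
    float position[3];
} VertexT2F_V3F;

/* Stride and offset values to use when describing the C3F_V3F array
 * explicitly with glColorPointer() and glVertexPointer() instead of
 * glInterleavedArrays(). */
enum {
    C3F_V3F_STRIDE = sizeof(VertexC3F_V3F),
    C3F_V3F_COLOR_OFFSET = offsetof(VertexC3F_V3F, color),
    C3F_V3F_POSITION_OFFSET = offsetof(VertexC3F_V3F, position)
};
```

Formats in the list that are not legal glInterleavedArrays() enums, such as V3F_N3F, would be described the same way, with the offsets taken from a struct that puts the position first.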
-
-
- Octane2 VPro provides a way to improve precision of
- parameter interpolation across primitives (especially those
- primitives which are large in screen space), using the
- SGIX_vertex_preclip extension. See the man pages for
- glIntro(3G), glEnable(3G) and glHint(3G) for details.
-
- When vertex preclipping is disabled
- (glDisable(GL_VERTEX_PRECLIP_SGIX)), primitive performance
- is maximized. When it is enabled
- (glEnable(GL_VERTEX_PRECLIP_SGIX)), there is a hint which
- controls how much work is done to detect primitives for
- which enhanced interpolation may be beneficial. This hint,
- glHint(GL_VERTEX_PRECLIP_HINT_SGIX, ...), may as usual be
- set to either GL_FASTEST, GL_NICEST or GL_DONT_CARE.
- (GL_DONT_CARE operates identically to GL_FASTEST.) Use of
- vertex preclipping results in a decrease in peak primitive
- rates, with GL_NICEST costing more than GL_FASTEST.
-
- The initial state of vertex preclipping is disabled; the
- initial state of the hint is GL_DONT_CARE. However, we have
- provided an environment variable which sets the initial
- values, so that an unmodified application may operate in any
- of these modes. (If the application uses the
- SGIX_vertex_preclip extension at runtime, the application's
- settings will override the initial settings provided by the
- environment variable.)
-
- This environment variable is called GL_VERTEX_PRECLIP. It
- can have three values: the string "DISABLED" disables
- vertex preclipping (this simply confirms the default case);
- the string "FASTEST" both enables vertex preclipping and
- sets the hint to GL_FASTEST; and the string "NICEST" both
- enables vertex preclipping and sets the hint to GL_NICEST.
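The mapping from the environment variable to the initial state can be summarized in code. This is a sketch of the documented mapping only, with hypothetical type and function names. Treating an unset or unrecognized value like the default is an assumption, since these notes define only the three strings.

```c
#include <stdbool.h>
#include <string.h>

typedef enum { HINT_DONT_CARE, HINT_FASTEST, HINT_NICEST } PreclipHint;

typedef struct {
    bool enabled;      /* initial GL_VERTEX_PRECLIP_SGIX enable state */
    PreclipHint hint;  /* initial GL_VERTEX_PRECLIP_HINT_SGIX value */
} PreclipState;

/* value: the GL_VERTEX_PRECLIP environment string, or NULL if unset. */
PreclipState preclip_initial_state(const char *value)
{
    PreclipState s = { false, HINT_DONT_CARE };  /* documented defaults */

    if (value == NULL)
        return s;
    if (strcmp(value, "FASTEST") == 0) {
        s.enabled = true;
        s.hint = HINT_FASTEST;
    } else if (strcmp(value, "NICEST") == 0) {
        s.enabled = true;
        s.hint = HINT_NICEST;
    }
    /* "DISABLED" keeps the defaults; so does anything unrecognized
     * (the latter is an assumption of this sketch). */
    return s;
}
```

A caller would pass in the result of getenv("GL_VERTEX_PRECLIP"). Settings made by the application at runtime still override whatever this initial state is.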
-
- Octane2 VPro systems represent the depth buffer in eye space
- rather than in window space (see the topic "Depth Buffer in
- Eye Space" in the Features section below for details). As a
- result, glReadPixels() and glDrawPixels() incur a cost to
- convert GL_DEPTH_COMPONENT data to and from the hardware
- format. Applications can use the GL_DEPTH_COMPONENT24_SGIX
- format parameter to cause depth buffer reads and writes to
- be done directly in the hardware format, thus avoiding this
- cost. See the man pages for glReadPixels() and
- glDrawPixels().
-
- Octane2 VPro systems are capable of asynchronous transfers
- of pixel data and texture images. This feature can provide
- dramatic performance gains for certain kinds of
- applications. One example is applications which engage in
- frequent, ongoing texture downloads, such as those which use
- dynamically changing textures. Another example is
- applications which can overlap substantial CPU work with
- glDrawPixels or glReadPixels operations, and can't easily
- structure themselves to be multithreaded.
-
- This feature is provided via the SGIX_async and
- SGIX_async_pixel extensions. For information on SGIX_async,
- see the man pages for glAsyncMarkerSGIX(3G),
- glPollAsyncSGIX(3G), glFinishAsyncSGIX(3G),
- glGenAsyncMarkersSGIX(3G), glIsAsyncMarkerSGIX(3G),
- glDeleteAsyncMarkersSGIX(3G), and glFinish(3G). For
- information on SGIX_async_pixel, see the man pages for
- glEnable(3G), glDrawPixels(3G), glReadPixels(3G),
- glTexImage1D(3G), glTexImage2D(3G), glTexImage3D(3G),
- glTexSubImage1D(3G), glTexSubImage2D(3G), and
- glTexSubImage3D(3G).
-
- The SGIX_async and SGIX_async_pixel extensions relax the
- normal OpenGL semantics of sequentiality. In particular,
- asynchronous transfers of pixel or texel data may happen
- out-of-order with respect to each other and with respect to
- any synchronous OpenGL commands that follow. Buffers for
- asynchronous glDrawPixels and glTexImage commands must not
- be modified before the transfers have finished. Likewise,
- buffers for asynchronous glReadPixels commands will not be
- valid until the transfers have finished.
-
- Applications must take care to enforce their own
- dependencies on asynchronously transferred data. To support
- this, the SGIX_async extension provides bookkeeping
- mechanisms and both blocking and non-blocking
- synchronization commands. Failure to enforce all
- dependencies may result in obscure, timing-related bugs, as
- well as bugs which remain latent until the application is
- run on higher-performance systems than may presently be
- available.
-
- As with all SGIX extensions, this feature may not be
- available on future products.
-
-
- 7.5 Octane2 VPro Specific Features
-
-
-
- 7.5.1 Unified Graphics Memory  The Octane2 VPro graphics
- architecture supplies a pool of memory out of which many
- graphics memory resources are allocated. (This pool of
- memory is, however, separate from system memory and not
- addressable in the system's virtual address space.) The
- resources present in this "unified graphics memory" are:
- the framebuffer (including the color, depth and stencil
- buffers, and optionally the accumulation buffer);
- framebuffer overlays; pbuffers; textures; and some system
- overhead.
-
- The amount of unified graphics memory present, in megabytes,
- can be found as the last field of the string returned by
- glGetString(GL_RENDERER). See the man page for
- glGetString() for details.
-
- The xsetmon(1) program (also accessible from the IRIX
- desktop at Toolchest->System->Display Properties) is used to
- statically allocate the framebuffer. The user may choose
- among a variety of framebuffer X and Y dimensions; a
- framebuffer depth of 8 or 16 bytes per pixel; and the optional
- presence of a hardware accumulation buffer. The unified
- graphics memory remaining after this static allocation (and
- after a small amount of system overhead) is available for
- dynamic allocation of pbuffers and textures.
-
- If the hardware accumulation buffer is selected, it provides
- 24 bits of precision per component. When no hardware
- accumulation buffer is present, OpenGL will allocate a host
- software accumulation buffer for use by the glAccum()
- function. This software accumulation buffer provides 16
- bits per component.
-
- The default configuration is a 1280x1024 framebuffer with 16
- bytes per pixel, and no hardware accumulation buffer.
-
- Depending on the amount of memory present in the graphics
- subsystem, some framebuffer configurations may not fit. A
- rough calculation of the amount of memory consumed by a
- framebuffer configuration may be done as follows. Count
- pixels simply by multiplying the X and Y dimensions of the
- framebuffer. For each pixel, allow 8 or 16 bytes according
- to the selection made in xsetmon(1). Add 2 bytes per pixel
- for the overlay/WID buffer. If the hardware accumulation
- buffer is selected, add 12 more bytes per pixel.
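The rough calculation above is easy to script. The function below is a sketch that follows the stated per-pixel costs exactly. Its name is hypothetical, and it does not account for the small system overhead mentioned earlier.

```c
#include <stdbool.h>

/* Rough framebuffer footprint in bytes, or -1 for invalid input.
 * bytes_per_pixel must be 8 or 16, as selected in xsetmon(1). */
long long framebuffer_bytes(int width, int height,
                            int bytes_per_pixel, bool hw_accum)
{
    long long per_pixel;

    if (width <= 0 || height <= 0)
        return -1;
    if (bytes_per_pixel != 8 && bytes_per_pixel != 16)
        return -1;

    per_pixel = bytes_per_pixel + 2;  /* plus overlay/WID buffer */
    if (hw_accum)
        per_pixel += 12;              /* hardware accumulation buffer */
    return (long long)width * height * per_pixel;
}
```

For the default 1280x1024 configuration with 16 bytes per pixel and no hardware accumulation buffer, this gives 1280 * 1024 * 18 = 23,592,960 bytes (22.5 MB). Selecting the accumulation buffer raises it to 39,321,600 bytes.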
-
- 7.5.2 Clipping as Scissoring  The Octane2 VPro architecture
- substitutes a function with scissoring semantics for fine-
- grained geometry clipping of the usual kind. This includes
- both view-frustum clipping and glClipPlane() clipping. In
- the latter case, fragments lying on the wrong side of the
- clip plane are efficiently killed in rasterization hardware.
-
- Therefore, the OpenGL clipping semantics with respect to the
- glPolygonMode(..., GL_LINES) function are modified so that
- new edges are not supplied when such polygons are clipped.
- As well, wide points and wide lines never slop over the
- bounds of the viewport. Although these are technical
- violations of the OpenGL specification, in practice most
- users consider the specified behaviors to be bugs, and their
- absence to be features.
-
-
- 7.5.3 Wide Lines  Octane2 VPro systems use a "french-cut"
- style of line endings for anti-aliased wide lines. This
- means that the ends of wide line segments are either
- vertical, if the line has an X-major slope on the screen, or
- horizontal if the line has a Y-major slope.
-
-
- 7.5.4 Decomposition of Quads  Unlike most other OpenGL
- implementations, Octane2 VPro systems decompose quads into
- triangles using the diagonal between the first and third
- vertices, rather than between the second and fourth
- vertices. This may produce different interpolations of
- parameters such as color, texture coordinates or Z
- coordinate at given interior points of the quad. This is
- most likely to be noticeable when the four values of a given
- parameter at the four vertices are not planar in the
- parameter space. One example would be a square with three
- vertices white and one vertex red. Another would be a quad
- whose geometry is grossly non-planar. In such cases, the
- application has underspecified the desired interpolation,
- and different OpenGL implementations are free to behave
- differently.
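The effect of the diagonal choice can be seen with a little arithmetic. The sketch below uses hypothetical names and simply models linear interpolation within each triangle. For a square, the center lies on both diagonals, so its interpolated value is the average of the two vertices on whichever diagonal was used for the split.

```c
typedef enum { DIAGONAL_FIRST_THIRD, DIAGONAL_SECOND_FOURTH } QuadDiagonal;

/* Write the vertex indices (0..3, in quad order) of the two
 * triangles produced by splitting along the given diagonal. */
void quad_to_triangles(QuadDiagonal d, int tri[2][3])
{
    if (d == DIAGONAL_FIRST_THIRD) {
        tri[0][0] = 0; tri[0][1] = 1; tri[0][2] = 2;
        tri[1][0] = 0; tri[1][1] = 2; tri[1][2] = 3;
    } else {
        tri[0][0] = 0; tri[0][1] = 1; tri[0][2] = 3;
        tri[1][0] = 1; tri[1][1] = 2; tri[1][2] = 3;
    }
}

/* Interpolated value of a per-vertex parameter at the midpoint of
 * the splitting diagonal (the center, for a square quad). */
double value_at_diagonal_midpoint(const double v[4], QuadDiagonal d)
{
    if (d == DIAGONAL_FIRST_THIRD)
        return 0.5 * (v[0] + v[2]);
    return 0.5 * (v[1] + v[3]);
}
```

Take the square from the text, with three white vertices and the fourth red, and look at the green channel {1, 1, 1, 0}. Splitting between the first and third vertices gives 1.0 at the center (white), while splitting between the second and fourth gives 0.5 (pink).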
-
-
- 7.5.5 Computation of Texture Level of Detail  In computing
- the lambda parameter controlling level-of-detail while
- sampling a texture, the Octane2 VPro hardware uses a
- diagonal distance between pixels rather than the usual
- rectangular distances in estimating the partial derivatives
- of s and t with respect to x and y. The contours where
- level-of-detail changes across the surface of the primitive
- may be of different shape as a result.
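-
- The exact hardware formula is not given in these notes.
- The sketch below shows one plausible reading of "diagonal
- distance": finite differences taken across the two
- diagonals of a 2x2 pixel block, each spanning sqrt(2)
- pixels. The struct, the function names, and the diagonal
- formula itself are assumptions made for illustration. To
- avoid square roots, both functions return the squared scale
- factor rho^2; lambda would be 0.5 * log2(rho^2).
-
```c
#include <assert.h>
#include <stddef.h>

/* Texel-space coordinates (s and t scaled by the texture size) sampled
 * at a 2x2 pixel block:
 *   index 0 = (x, y)     index 1 = (x+1, y)
 *   index 2 = (x, y+1)   index 3 = (x+1, y+1) */
typedef struct { double s[4]; double t[4]; } PixelQuad;

static double len2(double a, double b) { return a * a + b * b; }
static double max2(double a, double b) { return a > b ? a : b; }

/* Conventional estimate: differences along the x and y axes. */
double rho2_rectangular(const PixelQuad *q)
{
    assert(q != NULL);
    double along_x = len2(q->s[1] - q->s[0], q->t[1] - q->t[0]);
    double along_y = len2(q->s[2] - q->s[0], q->t[2] - q->t[0]);
    return max2(along_x, along_y);
}

/* ASSUMED diagonal variant: differences across the two diagonals.
 * Each diagonal spans sqrt(2) pixels, so squared lengths are halved. */
double rho2_diagonal(const PixelQuad *q)
{
    assert(q != NULL);
    double d1 = len2(q->s[3] - q->s[0], q->t[3] - q->t[0]) / 2.0;
    double d2 = len2(q->s[1] - q->s[2], q->t[1] - q->t[2]) / 2.0;
    return max2(d1, d2);
}
```
-
- For an anisotropic mapping such as s = 8x, t = 2y, the
- rectangular estimate gives rho^2 = 64 (lambda = 3) while
- this diagonal estimate gives rho^2 = 34 (lambda of about
- 2.54). For an isotropic mapping such as s = 4x, t = 4y,
- the two estimates agree.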
-
-
- 7.5.6 _V_i_s_u_a_l_s__w_i_t_h__M_u_l_t_i_p_l_e__B_u_f_f_e_r_s To provide maximum
- rasterization performance, Octane2 VPro systems provide
- duplicate depth buffers in double-buffered visuals which
- have depth. The depth buffer associated with the current
- draw buffer is used for depth tests. Therefore, the depth
- buffer should ideally be cleared at the same time as the
- color buffer. In any case, because each color buffer has
- its own depth buffer, the depth buffer contents may appear
- to change after a call to glXSwapBuffers() or glDrawBuffer().
-
-
- 7.5.7 _C_l_e_a_r_s On Octane2 VPro systems, dithering is not
- applied to screen clears.
-
-
- 7.5.8 _D_e_p_t_h__B_u_f_f_e_r__i_n__E_y_e__S_p_a_c_e The Octane2 VPro graphics
- system stores eye Z values in the depth buffer instead of
- the device Z values that many other graphics chips store.
- Specifically, it stores eye Z divided by eye W. Let us call
- this divided eye Z. In addition, the eye Z values are
- stored in the depth buffer in an internal floating point
- representation.
-
- Note: Eye coordinates are obtained by multiplying the object
- coordinates by the modelview matrix. Eye coordinates
- multiplied by the projection matrix yield clip coordinates.
- Clip coordinates divided by their W coordinates (clip W)
- result in device coordinates.
-
- The advantage of this approach is improved resolution of
- Z values in areas where objects are most often located, that
- is, in the first half of the viewing frustum (closer to the
- camera). First, divided eye Z values are always uniformly
- spaced in the view frustum, even in the case of perspective
- projection. Second, the use of floating point values
- increases the resolution at the beginning of the viewing
- frustum. Thus for objects closer to the camera (both in
- perspective and orthographic projection) the precision of
- depth tests is increased, compared to objects further away
- from the camera. The increase in resolution close to the
- camera is not as drastic as in the case of device Z values
- in perspective projection. For example, in the case of a
- frustum with near plane at 1 and far plane at 1001, the
- precision in the interval of eye Z values from -501 to -1001
- is 18 bits, in the interval (-251,-501) is 19 bits, in the
- interval (-126,-251) is 20 bits and so on (the values have a
- 6 bit mantissa). On the other hand, device Z values would
- have 20 bit precision around -16, 19 bit around -32, and 18
- bit around -64, assuming that the values are stored as
- integers or fixed point numbers.
-
- Thus eye Z values have about 4 bits better precision than
- device Z values in the first half of the Z range, except at
- the very beginning, where device Z values are more precise.
- The difference increases as one moves further away from the
- camera, because eye Z values have 18 bit precision in the
- whole second half of the frustum, while device Z values
- continue to lose precision with increasing distance from the
- camera.
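-
- To see how strongly device Z is concentrated near the
- camera, the following sketch evaluates the window-space
- depth that a standard perspective projection (as set up by
- glFrustum(), with the default depth range of 0 to 1) yields
- for a point at a given distance in front of the eye. It
- also evaluates a plain linear mapping of distance, used
- here only as a stand-in for the uniform spacing of divided
- eye Z; the actual floating point encoding used by the
- hardware is not modeled. Function names are hypothetical.
-
```c
#include <assert.h>

/* Window-space depth in [0, 1] for a point at positive distance d in
 * front of the eye, under a standard perspective projection:
 *   z = far * (d - near) / (d * (far - near))                      */
double device_z(double d, double near_plane, double far_plane)
{
    assert(near_plane > 0.0 && far_plane > near_plane);
    assert(d >= near_plane && d <= far_plane);
    return far_plane * (d - near_plane) / (d * (far_plane - near_plane));
}

/* The same distance mapped linearly onto [0, 1].
 * ASSUMPTION: a stand-in for uniformly spaced eye Z values only. */
double linear_z(double d, double near_plane, double far_plane)
{
    assert(far_plane > near_plane);
    assert(d >= near_plane && d <= far_plane);
    return (d - near_plane) / (far_plane - near_plane);
}
```
-
- With the near plane at 1 and the far plane at 1001,
- device_z() already exceeds 0.5 at a distance of 2, whereas
- linear_z() reaches 0.5 only at a distance of 501.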
-
- There are a few caveats the user should be aware of:
-
- The resolution of the depth buffer in orthographic
- projection is lower for objects in the second half of the
- view frustum (further away from the camera), compared to the
- same size depth buffer with integer values.
-
- If your application changes the projection matrix without
- clearing the depth buffer, the behavior of your program may
- differ from that on other systems, since depth values are
- distributed uniformly even in the case of perspective
- projection. For example, if you draw an instrument panel
- using orthographic projection and then the cockpit view
- using perspective projection, you may have to adjust the
- parameters of your projections (or, even better, use
- glDepthRange()).
-
- If you specify perspective projection it is very important
- to make the proper distinction between modelview and
- projection matrices, and to correctly specify the projection
- on the projection matrix. Most programmers who use lighting
- in their application are familiar with this condition, since
- lighting works in eye coordinates and it is necessary to
- specify the projection on the projection matrix stack to
- obtain correct eye coordinates.
-
-
- 7.5.9 _V_i_s_u_a_l_s__w_i_t_h__1_6_-_b_i_t__D_e_p_t_h On Octane2 VPro V12
- systems, there is a new visual type which provides a 16-bit
- depth buffer in combination with color buffers of 12 bits
- each for red, green, blue and alpha channels; it may be
- either single- or double-buffered. (This visual type does
- not provide a stencil buffer.)
-
- Because most applications that use a depth buffer need the
- full precision of 24 bits, we have arranged the visual
- ranking algorithms in glXChooseVisual() and
- glXChooseFBConfig() to prefer the visuals with 24-bit depth
- buffers, even though they have fewer bits of color
- precision. If an application directly requests 24 bits of
- depth, then of course only those visuals will be considered
- by these GLX functions. Conversely, an application may get
- one of the new visuals by asking for 12 bits each of red,
- green, blue and alpha, together with between 1 and 16 bits
- of depth.
-
-
- 7.5.10 _S_p_e_c_u_l_a_r__H_i_g_h_l_i_g_h_t_s The Octane2 VPro graphics
- system is the first system from SGI to implement the
- GL_SGIX_fragment_lighting extension. The implementation is
- optimized for providing improved lighting effects with no
- loss of performance at interactive frame rates. If only a
- fragment light is enabled, and if a single highlight
- occupies a large area of the screen (more than 200 or so
- pixels across), some slight Mach banding may be visible in
- the highlight. The effect is ameliorated by enabling other
- lights in addition to the fragment light, and is
- unnoticeable in the normal case where highlights are only a
- small part of the visible window area.
-
- For materials with very low shininess the extent of a
- specular highlight will also be slightly different from the
- value given by the formula in the OpenGL specification. The
- discrepancy diminishes as the shininess increases, and
- disappears once the shininess reaches the range typical of
- glossy surfaces, where specular reflection is most
- significant.
-