- array_services(5)                                        array_services(5)
-
-
-
- NAME
- array_services - overview of array services
-
- DESCRIPTION
- Along with the power and flexibility of clustered systems comes some
- additional complexity in the area of administering and managing the array
- as a whole. IRIX provides several services to help ease this situation.
- Some of these services revolve around the notion of an array session,
- which is a set of processes, perhaps running on different nodes in an
- array, that are conceptually related as a single "job". Additional
- services are provided by the array services daemon, which knows about the
- configuration of an array and is therefore able to provide functions for
- describing and administering it.
-
- ARRAY SESSIONS
- A principal use of an array system is to run jobs that are large enough
- to span two or more machines. Unfortunately, the mechanisms that are
- typically used to manage multiple related processes (for example: process
- groups, terminal sessions) are limited in scope to a single machine. As
- a result, mundane tasks such as killing a job or accounting for all of
- its resource usage can become very difficult when the job runs across
- several machines. Some means of correlating related processes on
- different machines is required. IRIX provides this function with the
- notion of an "array session".
-
- In formal terms, an array session is a set of processes all related to
- each other by a single unique identifier, the array session handle (ASH).
- A child process ordinarily inherits the ASH of its parent when it is
- created, thus becoming a member of its parent's array session. However,
- it is possible for a process to leave its parent's array session and
- start a new one. This would be done by programs such as login(1) or
- rshd(1M) so that logging in to the system will effectively start a new
- array session. This is also done by programs like cron(1M) and su(1M) so
- that work done on behalf of another user will be done in its own array
- session. When the last process with a given ASH exits, a session
- accounting record containing accumulated statistics for all of the
- processes that ran in the array session is written and the array session
- ceases to exist.
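
The membership rules above can be pictured with a small model. This is only an illustrative sketch in Python, not the IRIX kernel interface: the `Node` class, its method names, the starting process IDs, and the CPU figures are all invented for the example. It shows a child inheriting its parent's ASH, a process starting a new session, and the accounting record being written when the last member exits.

```python
import itertools


class Node:
    """Toy model of one machine's array-session bookkeeping (hypothetical)."""

    def __init__(self):
        self._ash_counter = itertools.count(1)    # local ASHs: unique, increasing
        self._pid_counter = itertools.count(100)  # arbitrary starting pid
        self.ash_of = {}       # pid -> ASH, live processes only
        self.usage = {}        # ASH -> accumulated CPU seconds
        self.accounting = []   # session accounting records written so far

    def spawn(self, parent=None, new_session=False):
        """Create a process; it inherits the parent's ASH unless it starts
        a new array session (as login or cron would)."""
        pid = next(self._pid_counter)
        if parent is None or new_session:
            ash = next(self._ash_counter)
            self.usage[ash] = 0.0
        else:
            ash = self.ash_of[parent]
        self.ash_of[pid] = ash
        return pid

    def exit(self, pid, cpu_seconds):
        """Process exits; when it was the last member of its session, the
        session's accounting record is written and the session disappears."""
        ash = self.ash_of.pop(pid)
        self.usage[ash] += cpu_seconds
        if ash not in self.ash_of.values():
            self.accounting.append((ash, self.usage.pop(ash)))


node = Node()
login = node.spawn()                                   # starts session 1
shell = node.spawn(parent=login)                       # inherits session 1
cron_job = node.spawn(parent=shell, new_session=True)  # starts session 2
```

In this model no record for session 1 appears until both `login` and `shell` have exited, which mirrors the "last process with a given ASH" rule.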
-
- The array session handle itself is a 64-bit value. By default, a unique,
- increasing value (similar to a process ID) is assigned to each new array
- session as its handle. This type of ASH is referred to as a local ASH:
- although it is guaranteed to be unique on the local machine, it may also
- be in use by a different session on another machine in the same array.
- Because of this, a local ASH is not appropriate for identifying multi-
- machine jobs. However, there is a second type of ASH known as a global
- ASH. These are assigned by the array services daemon (see below) and are
- supposed to be unique across the entire array. By arranging for the same
- global ASH to be associated with each process in a job, it is possible to
- treat the set of processes as a single entity, even though some of the
- processes may be running on different machines.
-
- The next trick is "arranging for the same global ASH to be associated
- with each process in a job". This involves several steps. First, each
- machine that is to run part of the job must start a new array session to
- contain the related processes. By default, this new array session will
- only have a local ASH, so it must "upgrade" its handle to a global ASH.
- If this is the first machine to run part of the job, it would need to
- obtain a new global ASH from the array services daemon (this is done with
- a single library call, asallocash(3X)). Additional machines that are
- called into service for the job would need to get a copy of the first
- machine's global ASH; presumably this information would be passed along
- at the same time as the rest of the information concerning the new job.
- Once an appropriate global ASH has been settled upon, it can then be
- assigned to the new array session, replacing the original local ASH. The
- process on each machine that started the new array session is now free to
- fork off any number of children to do the required work. These children
- will all have the same ASH and can therefore be correlated with each
- other for administrative tasks such as job control or accounting.
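
The sequence of steps just described can be sketched as a short simulation. Everything here is an assumption made for illustration: the `ArrayDaemon` and `Session` classes, the machine names, and in particular `GLOBAL_FLAG`, a made-up marker bit that stands in for whatever opaque encoding really distinguishes a global ASH from a local one. The real calls are asallocash(3X) and setash(2); this model does not reproduce their signatures.

```python
import itertools

GLOBAL_FLAG = 1 << 63   # hypothetical marker bit; the real encoding is opaque


class ArrayDaemon:
    """Stand-in for the array services daemon's global ASH allocator."""

    def __init__(self):
        self._counter = itertools.count(1)

    def alloc_global_ash(self):
        return GLOBAL_FLAG | next(self._counter)


def is_global(ash):
    """Model of the library test that tells global ASHs from local ones."""
    return bool(ash & GLOBAL_FLAG)


class Session:
    """A new array session on one machine; it starts with a local ASH."""

    def __init__(self, machine, local_ash):
        self.machine = machine
        self.ash = local_ash
        self.children = []

    def upgrade(self, global_ash):
        """Replace the local ASH with the job's global ASH."""
        if not is_global(global_ash):
            raise ValueError("expected a global ASH")
        self.ash = global_ash

    def fork(self, name):
        # A child inherits whatever ASH the session holds at fork time.
        self.children.append((name, self.ash))


daemon = ArrayDaemon()

# Local ASHs are unique only per machine, so both may happen to be 7.
first = Session("node1", local_ash=7)
second = Session("node2", local_ash=7)

job_ash = daemon.alloc_global_ash()  # first machine obtains the global ASH
first.upgrade(job_ash)
second.upgrade(job_ash)              # copy passed along with the job details

for session in (first, second):
    session.fork("worker-a")
    session.fork("worker-b")
```

Note the ordering: each session upgrades its handle before forking, so every worker on both machines carries the same global ASH.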
-
- ARRAY SERVICES DAEMON
- Although being able to correlate related processes on different machines
- in an array is necessary for the stated goal of administering an array in
- a reasonable way, it is not sufficient: something still needs to find all
- of those related processes and act upon them. That is the job of the
- array services daemon.
-
- Each machine in an array should have an array services daemon running on
- it. The array services daemon (arrayd) performs several different tasks:
-
- - It allocates global array session handles
-
- - It knows the current array configuration and can provide that
- information to other commands and programs
-
- - It can determine which processes belong to a particular array session
- and provide that information to other commands and programs
-
- - It can forward commands to all of the machines in an array
-
- Global Array Session Handles
- As mentioned earlier, a global array session handle is important for
- keeping track of jobs that run on several machines in an array. Because
- the array services daemon knows the configuration of an array, it is
- better suited to providing a unique global ASH than the IRIX kernel,
- which necessarily knows only about the local machine.
-
- When a program needs to allocate a global ASH it invokes a single library
- call, specifying (optionally) which array the ASH is to be allocated for.
- The library call, which is part of libarray (see below), takes care of
- the pragmatic issues of contacting and communicating with the local array
- services daemon. The resulting global ASH can then be passed to the
- setash(2) system call. Note that while anybody can allocate a global
- ASH, only a process with root privileges can actually change its ASH
- using setash.
-
- The global ASH itself is an "opaque" value: it does not necessarily have
- any specific information embedded in it, other than to distinguish it
- from a local ASH (a library function is provided to make this
- distinction). Nevertheless, the identity of the specific machine that
- creates a global ASH and the array for which it is intended may play some
- role in the generation of the ASH value itself. System administrators
- may specify particular values to be used for this purpose if desired.
-
- Array Configuration Database
- Each array services daemon has knowledge about one or more arrays and the
- machines that make up each of them. This information can be provided to
- other programs and commands with straightforward library calls in
- libarray. Ideally, this should make it unnecessary for other array-
- oriented programs to maintain their own separate array configuration
- data.
-
- An array services daemon obtains its configuration information from a
- configuration file located in its local filespace. Each daemon has its
- own configuration file which must be synchronized by the system
- administrator with configuration files on other machines in the array(s).
-
- Array Session Information
- To take advantage of array sessions, it is necessary to be able to
- enumerate the processes that are contained in a given array session. For
- certain applications (monitor programs for example) it may also be useful
- to enumerate ALL of the known array sessions. The array services daemon
- can obtain both types of information and provide it to other programs via
- libarray functions.
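
As a rough picture of the two queries mentioned above, the following Python sketch works on invented data. The `process_tables` dictionary, the machine names, the process IDs, and the ASH values are all hypothetical, and the two helper functions are not libarray calls; they only show what "enumerate the processes in a session" and "enumerate all sessions" mean once per-machine information has been collected.

```python
from collections import defaultdict

# Hypothetical per-machine process tables: machine -> {pid: ASH}
process_tables = {
    "node1": {201: 0xA1, 202: 0xA1, 203: 0x07},
    "node2": {117: 0xA1, 118: 0x09},
}


def pids_in_session(tables, ash):
    """Return {machine: [pids]} for every process carrying the given ASH."""
    return {
        machine: sorted(pid for pid, a in procs.items() if a == ash)
        for machine, procs in tables.items()
        if ash in procs.values()
    }


def all_sessions(tables):
    """Return every ASH seen on any machine, mapped to the machines using it."""
    seen = defaultdict(set)
    for machine, procs in tables.items():
        for a in procs.values():
            seen[a].add(machine)
    return dict(seen)
```

A monitor program would use something like the second query, while a job-control tool would use the first to find the processes it has to signal.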
-
- Command Forwarding
- Command forwarding pulls all of the other array services together: it
- allows a user on one machine to issue a single command and have it
- executed on all of the machines in an array, perhaps only affecting a
- particular array session. With the appropriate setup, this could be used
- for such tasks as killing a runaway job or shutting down an entire array.
-
- Users use a simple client program (array(1)) to specify the command they
- want to execute, any arguments it may require and the array they want to
- execute it on. Such an array command might look like this:
-
-      array -a DevArray killash 13543423
-
- This example says "execute the command 'killash 13543423' on the machines
- in the array 'DevArray'". The command "killash" is not necessarily an
- actual program on any machine in the array; instead it refers to an entry
- in each machine's array configuration file. The entry itself specifies
- which program to execute, which arguments should be passed to it, which
- user/group/project the command should be executed under, etc. This
- allows each machine in an array to handle a particular command
- differently, or not handle it at all.
-
- The "array" program itself is fairly basic: it simply passes the user's
- command to the local array services daemon. The local array services
- daemon forwards the command to each machine in the specified array, then
- gathers the results and passes them back to the "array" program, which
- presents them to the user.
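
The forwarding flow can be illustrated with a small model in Python. It is a sketch under stated assumptions, not the daemon's behavior or the arrayd.conf(4) syntax: the `COMMAND_TABLES` structure, the machine names, the program paths, and the `%ARGS` placeholder are all invented here. The point it shows is that the command name is looked up separately on each machine, so one machine may run a different program than another, or decline the command entirely.

```python
# Hypothetical per-machine command tables. Each maps an array command name
# to the argv template that machine would run; "%ARGS" (an invented
# placeholder) marks where the user's arguments are substituted.
COMMAND_TABLES = {
    "node1": {"killash": ["/usr/local/bin/killjob", "%ARGS"]},
    "node2": {"killash": ["/opt/site/killjob.sh", "--ash", "%ARGS"]},
    "node3": {},   # this machine does not handle killash at all
}


def expand(template, args):
    """Substitute the user's arguments into one machine's argv template."""
    argv = []
    for word in template:
        if word == "%ARGS":
            argv.extend(args)
        else:
            argv.append(word)
    return argv


def forward(array_machines, command, args):
    """Resolve the command on every machine and gather per-machine results.

    A real daemon would execute the resolved program; this model only
    records the argv it would run, or None where the command is unknown.
    """
    results = {}
    for machine in array_machines:
        template = COMMAND_TABLES[machine].get(command)
        if template is None:
            results[machine] = None
        else:
            results[machine] = expand(template, args)
    return results


results = forward(["node1", "node2", "node3"], "killash", ["13543423"])
```

Here `results` holds one entry per machine, which corresponds to the gathered results that the local daemon hands back to the client program.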
-
- THE ARRAY SERVICES LIBRARY
- In general, users should never have any direct interaction with the array
- services daemon. Instead, all interaction with the array services daemon
- is done through the array services library, libarray. libarray provides
- functions for dealing with global ASH's, describing the current array
- configuration, and executing array commands.
-
- There are a number of libarray functions, all of which are documented in
- section 3X of the man pages. Some of the libarray functions include:
-
- ASH Functions
-      asallocash        - Allocates a global ASH
-      asashisglobal     - Indicates whether an ASH is global or local
-      aslistashs_array  - Returns all global ASH's in specified array
-
- Configuration Functions
-      aslistarrays      - Returns info on all known arrays
-      aslistmachines    - Returns info on all machines in specified array
-
- Command Forwarding
-      ascommand         - Execute an array command
-
- SEE ALSO
- array(1), arrayd(1M), newsess(1), asallocash(3X), asashisglobal(3X),
- ascommand(3X), aslistarrays(3X), aslistashs_array(3X),
- aslistashs_server(3X), aslistmachines(3X), arrayd.conf(4),
- array_sessions(5).