Starfire Administration

Jeff Ruggeri

               The Enterprise line of Sun servers has proven, over time, to be an
               indomitable force in large-scale UNIX installations. Despite this fact,
               administration of the largest enterprise server still remains something of a
               mystery to many Solaris sys admins. The E10000, or Starfire, has brought
               many interesting new dimensions to the Solaris environment that
               dramatically enhance the existing flexibility and power of a Sun. The goal of
               this article is to explain precisely what makes this machine so different, and
               to demystify some of the concepts and eccentricities surrounding
               administration of what is arguably one of the most versatile machines
               available today.

The System

               The first difference between the Starfire and its cousins in the Enterprise line is its
               capacity. Physically, the Starfire is much larger than the rest of the line, and its
               curvy design is made to stand out in a data center full of boxy machines. Peeking
               under the hood proves that it doesn't just look big and fast. With support for up to
               64 CPUs overall, this machine can give nearly any vendor's largest workhorse a run
               for its money. However, the central concept that differentiates the Starfire from
               any other system is that it is capable of being partitioned into several logical
               machines, or domains, each of which can operate as a stand-alone Solaris box.
               Beyond that, system boards can be dynamically added to or removed from a
               running domain, allowing for previously unthinkable levels of flexibility in
               production environments. In addition to dynamic reconfiguration, other features
               such as a floating network console (netcon) and a system service processor (SSP)
               further enrich the machine. The SSP is the hub of operations for the Starfire,
               actually a separate machine, from which every aspect of the environment can be
               controlled. Netcon is the software equivalent of a terminal concentrator, allowing
               access to each domain's console from anywhere.

               Examining the physical layout of the system is the first key to understanding
               the environment's flexibility. The machine is split into 16 system boards, with
               8 on each side. Each system board is capable of holding 4 UltraSPARC II
               CPUs, 4 SBUS I/O cards, and 8 GB of memory. In addition to this, there are
               two Centerplane Support Boards (CSBs) that allow for netcon functionality
               without a network connection present on a domain. All of these boards are
               connected by an intelligent, high-bandwidth backplane that is capable of
               making point-to-point connections from any system board to any other,
               allowing for seamless SMP between arbitrary boards. Yet another notable
               feature is a required private, hub-based network that connects all of the
               domains, the CSBs, and the SSP. The usefulness of this will come to light
               shortly.
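
The per-board limits above translate directly into domain capacity. As a minimal sketch (the function name is hypothetical, and it assumes boards are numbered 0 through 15, which matches the board numbers used in the examples later in this article), the arithmetic looks like this:

```python
# Per-board and per-system limits quoted in the text above.
BOARDS_PER_SYSTEM = 16
CPUS_PER_BOARD = 4
SBUS_SLOTS_PER_BOARD = 4


def domain_capacity(boards):
    """Return the maximum CPU and SBus card counts for a set of board numbers."""
    unique = set(boards)
    # Assumption: boards are numbered 0..15.
    bad = [b for b in unique if not 0 <= b < BOARDS_PER_SYSTEM]
    if bad:
        raise ValueError(f"no such system board(s): {sorted(bad)}")
    return {
        "boards": len(unique),
        "cpus": len(unique) * CPUS_PER_BOARD,
        "sbus_slots": len(unique) * SBUS_SLOTS_PER_BOARD,
    }


full = domain_capacity(range(BOARDS_PER_SYSTEM))
five = domain_capacity([0, 1, 2, 3, 4])
print(full["cpus"], five["cpus"])  # prints: 64 20
```

A fully populated machine reaches the 64-CPU ceiling mentioned earlier, while a five-board domain tops out at 20 CPUs.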

               The theory behind the design is elegantly simple. Domains are essentially
               logical entities, existing primarily in the software of the SSP. The SSP itself
               groups boards into domains, allows access to consoles with netcon, can
               power on or off any component of the system, and also controls a virtual
               system "key" with the bringup command. Domains can be accessed even
               when their network connections are not present through the CSB
               connection, called the JTAG, which communicates over the backplane.
               Even the OpenBoot PROM for each domain is contained in software on the
               SSP! Of course, this is where the complexity comes in. The commands that
               are used for controlling the Starfire environment are essentially unique, as
               they are not currently found anywhere else in the enterprise line.
               Administration requires familiarity with the SSP commands, and more
               specifically the ssp user.

The SSP User

               The SUNWssp package, which is generally preloaded on the Ultra 5 that ships
               with the E10k, installs a user called ssp by default. This user controls the
               environment variables and scripts that are used to create, destroy, and modify
               domains. Upon logging in as this user, you will be greeted with the prompt:

Please enter SUNW_HOSTNAME:

               This is referring to the current working domain. The value of this variable is
               the domain to which any ssp commands issued will be applied, so it is
               important to ensure it is set correctly before you perform an action.
               (Fortunately, its value is displayed by default in the ssp user's command
               prompt.) To switch its value at any time, use the command:

domain_switch <domainname>

Keep in mind that all of the following commands take the value of this
variable as their target domain.

               Regarding domain naming conventions, note that using a domain's
               hostname as the domain name is generally not a good idea. The reasoning
               behind this is simple: each domain has both a private (SSP) and public
               Ethernet address. Giving the domain a name that differs from its actual
               hostname provides an easy way to ensure you're talking to the right IP
               address, which is invaluable when booting from the SSP or reconfiguring
               the domain. A good convention would be to name the domain something
               like hostname-dN, where N is a number, incremented for each domain.

To view information about configured domains, use the command:

domain_status

On a configured system, the output might look something like:

DOMAIN       TYPE                     PLATFORM   OS    SYSBDS
frobozz-d1   Ultra-Enterprise-10000   frood      2.7   0 1 2 3 4

               and so on, for each domain. Notice the platform name. This is essentially
               the name of your E10000, which is established at initial SSP setup,
               presumably to differentiate it from the scores of other E10000s littering your
               machine room.
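
If you want to work with this listing in a script, a small parser is enough. The sketch below is illustrative only: it assumes the real output has one header row followed by whitespace-separated fields with the board numbers trailing, exactly as in the sample above, and the Domain record and function name are hypothetical.

```python
from typing import NamedTuple


class Domain(NamedTuple):
    name: str
    type: str
    platform: str
    os: str
    boards: tuple


def parse_domain_status(text):
    """Parse domain_status-style output into Domain records.

    Assumes one header line, then whitespace-separated fields with the
    system board numbers trailing, as in the sample in the article.
    """
    domains = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 5:
            continue  # not a complete domain row
        name, dtype, platform, os_version = fields[:4]
        boards = tuple(int(b) for b in fields[4:])
        domains.append(Domain(name, dtype, platform, os_version, boards))
    return domains


sample = """\
DOMAIN TYPE PLATFORM OS SYSBDS
frobozz-d1 Ultra-Enterprise-10000 frood 2.7 0 1 2 3 4
"""
domains = parse_domain_status(sample)
print(domains[0].name, domains[0].platform, domains[0].boards)
```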

               Of course, for any of this to work, one must first configure domains. To do
               this, an EEPROM image will be required for each domain. When you first
               uncrate the machine, Sun will provide you with an image for each domain
               requested. If further images ever need to be added, you can obtain a hostid
               and key for use with the sys_id utility, which will generate the EEPROMs.
               Once the images are successfully installed, the next step is to power up the
               individual system boards being used in the domain. This is accomplished
               with the command power. Its simplest uses are:

power -on -sb 0 1
power -on -cb 0

               which power up system boards 0 and 1, and CSB 0, respectively. To
               reverse this, use the -off flag. Used with no arguments, the power
               command will display the power statistics of each board in the Starfire. The
               -all flag will apply a command to every board on the system, hence its
               inherent danger. Fortunately, the SSP software is smart enough to
               recognize when domains are running and deny power requests.

Once power is established, the domain_create command can be used to
initialize a domain. Its syntax is:

domain_create -d <domain name> -b <system boards> -o <os version> \
-p <platform>
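
When domains are created from a script, it helps to assemble the command line in one place. This is a rough sketch, not SSP tooling: it uses only the four flags shown above, and it assumes the board list is passed to -b as a single space-separated argument, which you should confirm against your SSP documentation. The function name is hypothetical.

```python
import shlex


def domain_create_argv(name, boards, os_version, platform):
    """Build an argument vector for domain_create from the flags shown above."""
    if not boards:
        raise ValueError("a domain needs at least one system board")
    # Assumption: -b takes the board numbers as one space-separated argument.
    board_list = " ".join(str(b) for b in sorted(set(boards)))
    return ["domain_create", "-d", name, "-b", board_list,
            "-o", os_version, "-p", platform]


argv = domain_create_argv("frobozz-d1", [0, 1, 2, 3, 4], "2.7", "frood")
print(shlex.join(argv))  # shell-quoted form, for review before running
```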

To remove a domain, the command is simply:

domain_remove -d <domain name>

               Fortunately, this command is essentially reversible, and recreating a domain
               with domain_create will restore it as if it had not been destroyed,
               provided the same system boards are used in the re-creation.

Bringup and Netcon

               Once the domain is created, how do you "turn the key" on the host, or
               access the console? The answer lies in the bringup and netcon
               commands. The bringup command will cycle a domain through a power-on
               self-test (POST), bring it up to an OpenBoot PROM prompt, and even boot
               the OS if it is installed. Before issuing this command, it is wise to determine
               the status of the domain with the check_host command. This will return a
               simple "Host is UP" or "Host is DOWN" response, which lets you know
               whether it is safe to bring the machine up.

The bringup command has a very specific set of functions that it executes,
in the following order:

                   1. bringup runs power to check that all of the system boards in the
                      domain are powered up. If not, it will abort with a message to this
                      effect.
                   2. Next, it runs check_host to determine the status of the domain. If the
                      domain is determined to be up, bringup will ask whether to
                      continue. This is useful if you are using the command to recover from a
                      hung host; however, it is recommended that you use bringup for this
                      purpose only in extreme situations.
                   3. The blacklist file, located at $SSPVAR/etc/<platform name>/blacklist,
                      is checked. This file allows components, from I/O units and CPUs
                      to entire system boards, to be individually excluded from the
                      domain at start time. This is a fairly useful feature, and the file
                      can be edited manually.
                   4. bringup runs hpost on the domain. hpost is a very valuable tool,
                      which can also be run interactively at any time on a domain that is
                      not up. It runs the domain through a series of tests, which can often
                      shake out hardware errors. By default, it runs at level 16. It can be
                      configured to run at up to level 127 (which executes extremely
                      detailed testing) by adding the line level N to the file ~ssp/.postrc.
                   5. Finally, bringup starts the obp_helper and the netcon_server,
                      which indicates that the domain is ready.
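
The order of these checks, and the two points at which bringup can stop early, can be sketched as a small function. This is a model of the sequence described above, not the real implementation; the function name, its parameters, and the step labels are all hypothetical.

```python
def bringup_plan(boards_powered, host_up, confirm, blacklisted=()):
    """Return the ordered steps a bringup would take, or where it stops.

    boards_powered: dict of board number -> bool (power state)
    host_up: result of the check_host step
    confirm: callable returning True to continue on a running domain
    blacklisted: components excluded at start time (hypothetical labels)
    """
    steps = ["power check"]
    off = sorted(b for b, on in boards_powered.items() if not on)
    if off:
        return steps + [f"abort: boards {off} not powered"]
    steps.append("check_host")
    if host_up and not confirm():
        return steps + ["abort: domain is up, operator declined"]
    steps.append(f"read blacklist ({len(blacklisted)} excluded)")
    steps.append("hpost")
    steps.append("start obp_helper and netcon_server")
    return steps


# A healthy, halted two-board domain runs all five steps.
ok = bringup_plan({0: True, 1: True}, host_up=False, confirm=lambda: True)
# An unpowered board stops the sequence at the first check.
stopped = bringup_plan({0: True, 1: False}, host_up=False, confirm=lambda: True)
print(len(ok), stopped[-1])
```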

               The only important arguments to bringup are -A on and -A off. This is the
               equivalent of the auto-boot? OpenBoot PROM parameter: "on" will
               boot the OS (if one is installed), and "off" will drop you at the OpenBoot PROM
               prompt.

               Some mention should be made of ways to interrupt a running domain.
               Assuming your domain is hung, and you can't seem to get it to come back
               for whatever reason, the Starfire offers several commands above and
               beyond the traditional Stop-A type interrupt. They are, in order of severity:

hostint -- This forces a panic on a domain.

hostreset -- The domain goes into a reset state, but you can then run
a bringup on it.

sys_reset -- This performs a hardware reset of all of the system
boards in a domain, and should be used only as a last resort.

The bringup command is, in fact, the most severe of all in a hang situation.
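
For a runbook or wrapper script, that ordering can be captured as a simple escalation ladder. The sketch below only encodes the order given above; the function name is hypothetical, and it does not run any SSP command.

```python
# Least to most severe, in the order given in the text above.
ESCALATION = ["hostint", "hostreset", "sys_reset", "bringup"]


def next_step(last_tried=None):
    """Return the next command to try on a hung domain, or None if exhausted."""
    if last_tried is None:
        return ESCALATION[0]
    position = ESCALATION.index(last_tried)  # ValueError on unknown names
    if position + 1 < len(ESCALATION):
        return ESCALATION[position + 1]
    return None


print(next_step(), next_step("hostint"), next_step("bringup"))
```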

               Once bringup exits (assuming success), the netcon command can be used
               to access the domain. If netcon is used on a domain that has not been
               brought up, the command will sit idle, waiting for a connection. As
               previously mentioned, the current value of $SUNW_HOSTNAME is the
               domain that is accessed, so multiple windows can access multiple
               domains simply by using domain_switch in each. Once
               the command is issued, a sys admin can interact with the system no matter
               what state or runlevel it is in. One caveat, however, is that when the system
               is not in multi-user mode, the connection can be extremely slow, as it is
               going over the JTAG. It does, however, grant the access needed. Once the
               system reaches runlevel 2, the cvcd daemon is started, which allows
               communication between the domain and the SSP over the private Ethernet
               network.

               Of course, since multiple users could theoretically have access to the ssp
               user at once, it follows that multiple users could try to netcon into the
               same domain at once. This could lead to some problems, but fortunately
               netcon implements a locking mechanism that only allows one user to have
               write access at a time. A user of netcon can be in unlocked write, locked
               write, or read only mode. Control commands for netcon begin with the tilde
               (~) as an escape character, and are as follows:

                   ~# -- Analogous to Stop-A on a normal system. This will halt your
                   system and bring it to the OpenBoot PROM. Use caution with this
                   command.

~? -- Shows the current status of all the open netcon sessions.

                   ~= -- Switch between the SSP private interface for the domain and the
                   control board JTAG interface. This feature only works in private mode,
                   when the cvcd is running on the host.

                   ~* -- Private mode. This sets Locked Write permission, closes any
                   open netcon sessions, and disallows access to netcon from any
                   other terminal. This is the same as the -f (force) flag to the netcon
                   command itself.

~& -- Locked Write mode. This is the same as opening a session with
the -l flag.

~@ -- Unlocked Write mode. Another user can easily revoke this. This is
the same as opening a session with the -g flag.

~^ -- Read-Only mode. Releases write permission and echoes any
other session with write permission to your terminal.

~. -- Release netcon. This will exit the netcon session and return
you to the command prompt.
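
The three write modes are easier to reason about as a small model. The sketch below is a toy, not netcon's actual logic: the Console class and mode names are hypothetical, and it assumes that an ordinary write request can displace an unlocked writer but not a locked one. Private mode (~*), which also closes other sessions, is not modeled.

```python
READ_ONLY = "read-only"
UNLOCKED_WRITE = "unlocked-write"
LOCKED_WRITE = "locked-write"


class Console:
    """Toy model of one domain console shared by several sessions."""

    def __init__(self):
        self.modes = {}  # session name -> current mode

    def open(self, session):
        self.modes[session] = READ_ONLY

    def _writer(self):
        # At most one session holds write access at any time.
        for session, mode in self.modes.items():
            if mode != READ_ONLY:
                return session
        return None

    def request_write(self, session, locked=False):
        holder = self._writer()
        if holder is not None and holder != session:
            if self.modes[holder] == LOCKED_WRITE:
                return False  # assumption: a locked writer is not displaced
            self.modes[holder] = READ_ONLY  # unlocked write is revoked
        self.modes[session] = LOCKED_WRITE if locked else UNLOCKED_WRITE
        return True

    def release(self, session):
        self.modes[session] = READ_ONLY


console = Console()
console.open("alice")
console.open("bob")
first = console.request_write("alice")                # unlocked write
second = console.request_write("bob", locked=True)    # revokes alice
third = console.request_write("alice")                # denied by bob's lock
print(first, second, third, console.modes)
```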

A sample output of netcon might look like:

               frobozz-ssp01:frood-d1% netcon
               trying to connect...
               connected.

               SUNW,Ultra-Enterprise-10000, using Network Console
               OpenBoot 3.2.4, 12288 MB memory installed, Serial #00000000.
               Ethernet address 0:0:00:00:00:00, Host ID: 00000000.

<#0> ok

               At this point, you should be in familiar territory. You can essentially treat the
               domain just as you would any other enterprise system. The only major
               difference once the domain is up comes with the dynamically reconfigurable
               properties of the Starfire.

Dynamic Reconfiguration

               The Starfire feature that departs most from the
               rest of the Enterprise line is the ability to change the capacity of a running
               system without interrupting any services. The practical applications for this
               feature are almost endless, and it is limited only by I/O configuration.
               System boards can be allocated from one domain to another, or even
               removed from a domain, powered off, and removed from the system for
               repair! There are two methods that can be used to accomplish the task of
               reconfiguration. The first method is to use the dr command, and the other
               (less reliable) method is to use the hostview GUI.

               A brief note about hostview: this tool can be used to perform several
               actions, including modifying the aforementioned blacklist file or opening
               netcon consoles. However, it has been my experience that dr should
               always be used for reconfiguration, as hostview seems unreliable when it
               comes to modifying a domain. Board attachments or detachments often do
               not work, for no visible reason. While the intent is not to malign this tool, as
               it is useful in its own right, it is not the best tool for this particular feature of
               the E10k.

               Issuing the command dr will start a shell-like environment and report on
               what boards are physically present. It will also report which boards are
               currently in use by $SUNW_HOSTNAME, as this is the domain that will be
               modified. Before entering dr, you may want to first use domain_status to
               see what boards are being used overall on the platform. The major actions
               that can be performed from within dr are the attachment or detachment of a
               system board. The commands used to achieve these functions are as
               follows:

To attach an unused system board to the current domain:

init_attach <sysbd> -- Prepare the named board for attachment.

complete_attach <sysbd> -- Attaches the board to the domain, after
running init_attach.

abort_attach <sysbd> -- Aborts the attach process after a failed
attach, or before complete_attach is run.

Detaching a system board:

drain <sysbd> -- Evacuates the memory on the named board.

complete_detach <sysbd> -- Detaches the board from the domain,
after running drain.

abort_detach <sysbd> -- Aborts the detach process after a failed
drain, or before complete_detach is run.

Other commands:

                   reconfig -- Run after a board attachment, this will run the Solaris
                   config sequence on the domain: drvconfig; devlinks; disks; ports;
                   tapes.

drshow <sysbd> <command> -- Shows the status of a running dr
command. The most important arguments are drain and io.
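
The ordering constraints among the attach and detach commands above can be sketched as a state table. The command names come from the list above; the state names and the run function are hypothetical, and the model simplifies failure handling to the two abort commands.

```python
# (current state, dr command) -> next state
TRANSITIONS = {
    ("idle", "init_attach"): "attach_prepared",
    ("attach_prepared", "complete_attach"): "attached",
    ("attach_prepared", "abort_attach"): "idle",
    ("attached", "drain"): "draining",
    ("draining", "complete_detach"): "idle",
    ("draining", "abort_detach"): "attached",
}


def run(commands, state="idle"):
    """Apply dr commands to one board in order and return its final state."""
    for command in commands:
        key = (state, command)
        if key not in TRANSITIONS:
            raise ValueError(f"{command} is not valid while the board is {state}")
        state = TRANSITIONS[key]
    return state


print(run(["init_attach", "complete_attach"]))
print(run(["drain", "abort_detach"], state="attached"))
```

Calling run(["complete_attach"]) on an idle board raises ValueError, which mirrors the rule that complete_attach only follows init_attach.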

               Now for the warnings. Although attachment is relatively straightforward, and
               can be done without incident using any free system board, use caution
               when detaching a board from a running domain. The first notable issue is
               that running drain on a board is not an instantaneous process, even
               though the command returns immediately. Before running a
               complete_detach, the board should be examined with the command
               drshow <sysbd> drain, which shows the status of the drain process.
               drain actually attempts to move physical memory pages off to memory on
               other system boards, and attempting to detach the board before this is
               complete can be catastrophic. Of course, if enough free memory isn't
               available elsewhere on the system, the drain may not work!
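
Because drain returns before the work is done, a script should poll until the board is empty rather than proceed immediately. The following is a minimal sketch of that wait loop. The pages_remaining callable is a hypothetical stand-in for however you extract progress from drshow <sysbd> drain, whose output format is not covered here.

```python
import time


def wait_for_drain(pages_remaining, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll until pages_remaining() reports 0.

    Returns True once the board is drained, or False if the timeout
    passes first, in which case complete_detach should not be run.
    """
    waited = 0.0
    while True:
        if pages_remaining() == 0:
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval


# Simulated progress readings in place of real drshow output.
readings = iter([300, 120, 0])
drained = wait_for_drain(lambda: next(readings), sleep=lambda seconds: None)
print(drained)  # prints: True
```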

               The second, more important, caveat is that a board should never be
               detached if it contains any I/O. While it is obvious that attempting to detach
               a board that contains the SCSI channel to your boot disk would be a bad
               thing, what is less obvious is that any I/O cards on a board may be held
               open by the kernel. This includes boards that you may not be using at that
               particular moment. Detaching a board that the system is not ready to
               release can lead to a panic! To be safe in these situations, use the
               command drshow <sysbd> io. This will tell you whether the kernel on
               the domain is using any I/O. In designing your domains for proper dr usage,
               the best idea is to institute a set of floater boards, which contain no I/O
               whatsoever -- only CPU and memory. These boards can easily be attached
               to or detached from any domain with few problems and make life much easier
               on an E10k that is constantly reconfigured. Also, it is a good idea to
               concentrate as much I/O as possible on the first board or two of a domain (in a
               multi-board domain), while still ensuring redundancy. Keeping as
               little I/O as possible on the last boards of a domain is incredibly useful if you plan on
               swapping system boards often.
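
When planning a layout, it can help to list which boards qualify as floaters. This is an illustrative sketch only: the layout mapping and card labels are made up for the example, and the function name is hypothetical.

```python
def classify_boards(io_cards_by_board):
    """Split a domain's boards into floaters (no I/O) and I/O boards."""
    floaters = sorted(b for b, cards in io_cards_by_board.items() if not cards)
    io_boards = sorted(b for b, cards in io_cards_by_board.items() if cards)
    return floaters, io_boards


# Hypothetical five-board domain: I/O concentrated, with redundancy,
# on boards 0 and 1; boards 2 through 4 carry only CPU and memory.
layout = {
    0: ["scsi", "ethernet"],
    1: ["scsi", "ethernet"],
    2: [],
    3: [],
    4: [],
}
floaters, io_boards = classify_boards(layout)
print(floaters, io_boards)  # prints: [2, 3, 4] [0, 1]
```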

Conclusion

               The Starfire presents several layers of complexity, which significantly
               expand upon the existing Sun architecture. The features presented in this
               article lend themselves to a coherent and, above all, reliable machine that
               takes the concept of uptime very seriously. The benefits of such a platform
               are plain to be seen. Although there are many more facets to the
               administration of a Starfire, I have attempted to provide you with a base
               arsenal of concepts and commands with which to approach this powerful
               environment.

               Jeff Ruggeri is a Solaris Systems Administrator at Aetna in Middletown,
               CT, where he is responsible for an environment comprising nearly 300
               mission-critical Sun Enterprise servers. He has been hacking UNIX in one
               form or another since he was approximately 12 years old.

               The text of "Starfire Administration" has been adapted from Jeff Ruggeri's
               contribution to the book Solaris Solutions for System Administrators by
               Sandra Henry-Stocker and Evan R. Marks, and is reprinted here with
               permission from Wiley Computer Publishing.