Jeff Ruggeri
The Enterprise line of Sun servers has proven, over time, to be an
indomitable force in large-scale UNIX installations. Even so,
administration of the largest Enterprise server remains something of a
mystery to many Solaris sysadmins. The E10000, or Starfire, has brought
many interesting new dimensions to the Solaris environment that
dramatically enhance the existing flexibility and power of a Sun. The goal
of this article is to explain precisely what makes this machine so
different, and to demystify some of the concepts and eccentricities
surrounding administration of what is arguably one of the most versatile
machines available today.
The System
The first difference between the Starfire and its cousins in the
Enterprise line is its capacity. Physically, the Starfire is much larger
than the rest of the line, and its curvy design is made to stand out in a
data center full of boxy machines. Peeking under the hood proves that it
doesn't just look big and fast. With support for up to 64 CPUs overall,
this machine can give nearly any vendor's largest workhorse a run for its
money. However, the central concept that differentiates the Starfire from
any other system is that it is capable of being partitioned into several
logical machines, or domains, each of which can operate as a stand-alone
Solaris box. Beyond that, system boards can be dynamically added to or
removed from a running domain, allowing for previously unthinkable levels
of flexibility in production environments. In addition to dynamic
reconfiguration, other features such as a floating network console
(netcon) and a system service processor (SSP) further enrich the machine.
The SSP, which is actually a separate machine, is the hub of operations
for the Starfire: every aspect of the environment can be controlled from
it. Netcon is the software equivalent of a terminal concentrator, allowing
access to each domain's console from anywhere.
Examining the physical layout of the system is the first key to
understanding the environment's flexibility. The machine is split into 16
system boards, with 8 on each side. Each system board is capable of
holding 4 UltraSPARC II CPUs, 4 SBus I/O cards, and 8 GB of memory. In
addition, there are two Centerplane Support Boards (CSBs) that allow for
netcon functionality without a network connection present on a domain. All
of these boards are connected by an intelligent, high-bandwidth backplane
that is capable of making point-to-point connections from any system board
to any other, allowing for seamless SMP between arbitrary boards. Yet
another notable feature is a required private, hub-based network that
connects all of the domains, the CSBs, and the SSP. The usefulness of this
will come to light shortly.
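The CPU figure quoted earlier follows directly from the board layout. As
a quick check, the arithmetic can be sketched in Python using the
per-board counts given above; the constant and function names here are
illustrative only.

```python
# Per-board limits as quoted in the text; the names are illustrative.
SYSTEM_BOARDS = 16
CPUS_PER_BOARD = 4
SBUS_CARDS_PER_BOARD = 4


def platform_totals(boards=SYSTEM_BOARDS):
    """Return the maximum CPU and SBus card counts for `boards` system boards."""
    return {
        "cpus": boards * CPUS_PER_BOARD,
        "sbus_cards": boards * SBUS_CARDS_PER_BOARD,
    }


if __name__ == "__main__":
    print(platform_totals())    # a fully populated chassis
    print(platform_totals(5))   # a five-board domain
```

A fully populated machine therefore tops out at the 64 CPUs mentioned
above, and the same function gives the ceiling for a domain of any size.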
The theory behind the design is elegantly simple. Domains are essentially
logical entities, existing primarily in the software of the SSP. The SSP
itself groups boards into domains, allows access to consoles with netcon,
can power on or off any component of the system, and also controls a
virtual system "key" with the bringup command. Even when a domain's
network connections are not present, the domain can be accessed through
the CSB connection, called the JTAG, which communicates over the
backplane. Even the OpenBoot PROM for each domain is contained in software
on the SSP! Of course, this is where the complexity comes in. The commands
that are used for controlling the Starfire environment are essentially
unique, as they are not currently found anywhere else in the Enterprise
line. Administration requires familiarity with the SSP commands, and more
specifically the ssp user.
The SSP User
The SUNWssp package, which is generally preloaded on the Ultra 5 that
comes with the E10k, installs a user called ssp by default. This user
controls the environment variables and scripts that are used to create,
destroy, and modify domains. Upon logging in as this user, you will be
greeted with the prompt:
Please enter SUNW_HOSTNAME:
This refers to the current working domain. The value of this variable is
the domain to which any ssp commands issued will be applied, so it is
important to ensure it is set correctly before you perform an action.
(Fortunately, its value is displayed by default in the ssp user's command
prompt.) To switch its value at any time, use the command:
domain_switch <domainname>
Keep in mind that all of the following commands use this variable as
their implicit argument.
Regarding domain naming conventions, note that using a domain's
hostname as the domain name is generally not a good idea. The reasoning
behind this is simple: each domain has both a private (SSP) and public
Ethernet address. Giving the domain a name that differs from its actual
hostname provides an easy way to ensure you're talking to the right IP
address, which is invaluable when booting from the SSP or reconfiguring
the domain. A good convention would be to name the domain something
like hostname-dN, where N is a number, incremented for each domain.
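As a small illustration of the hostname-dN convention, the sketch below
numbers a list of hostnames in order. The function name and the sample
hostnames are hypothetical; only the naming pattern comes from the text.

```python
def domain_names(hostnames):
    """Pair each hostname with a domain name of the form hostname-dN.

    N starts at 1 and is incremented for each domain, following the
    convention suggested in the text.
    """
    return [(host, f"{host}-d{n}") for n, host in enumerate(hostnames, start=1)]


if __name__ == "__main__":
    for host, domain in domain_names(["frobozz", "zork"]):
        print(host, domain)
```

Because the domain name never equals the hostname, a name typed at the
SSP prompt cannot be confused with the public address of the host.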
To view information about configured domains, use the command:
domain_status
On a configured system, the output might look something like:
DOMAIN       TYPE                     PLATFORM   OS    SYSBDS
frobozz-d1   Ultra-Enterprise-10000   frood      2.7   0 1 2 3 4
and so on, for each domain. Notice the platform name. This is essentially
the name of your E10000, which is established at initial SSP setup,
presumably to differentiate it from the scores of other E10000s littering
your
machine room.
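When several domains are configured, it can help to turn this listing
into structured data. The following is a minimal sketch that assumes each
domain occupies one whitespace-separated line with the five columns shown
in the sample (domain, type, platform, OS, and then the board numbers);
the real output may differ between SSP releases, and all names in the
code are illustrative.

```python
from typing import List, NamedTuple


class DomainRow(NamedTuple):
    domain: str
    machine_type: str
    platform: str
    os_version: str
    boards: List[int]


def parse_domain_status(text):
    """Parse domain_status-style text into DomainRow records.

    Assumes one domain per line and a header line starting with DOMAIN.
    Lines with fewer than five fields are skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] == "DOMAIN":
            continue
        boards = [int(b) for b in fields[4:]]
        rows.append(DomainRow(fields[0], fields[1], fields[2], fields[3], boards))
    return rows


SAMPLE = """\
DOMAIN TYPE PLATFORM OS SYSBDS
frobozz-d1 Ultra-Enterprise-10000 frood 2.7 0 1 2 3 4
"""

if __name__ == "__main__":
    for row in parse_domain_status(SAMPLE):
        print(row.domain, row.platform, row.boards)
```

With the rows parsed, a question such as "which boards are already
claimed on this platform?" becomes a simple loop over the records.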
Of course, for any of this to work, one must first configure domains. To
do this, an EEPROM image will be required for each domain. When you first
uncrate the machine, Sun will provide you with an image for each domain
requested. If further images ever need to be added, you can obtain a hostid
and key for use with the sys-id utility, which will generate the EEPROMs.
Once the images are successfully installed, the next step is to power up
the
individual system boards being used in the domain. This is accomplished
with the command power. Its simplest uses are:
power -on -sb 0 1
power -on -cb 0
which power up system boards 0 and 1, and CSB 0, respectively. To
reverse this, use the -off flag. Used with no arguments, the power
command will display the power statistics of each board in the Starfire.
The -all flag applies a command to every board on the system, which makes
it inherently dangerous. Fortunately, the SSP software is smart enough to
recognize when domains are running and deny power requests.
Once power is established, the domain_create command can be used to
initialize a domain. Its syntax is:
domain_create -d <domain name> -b <system boards> -o <os version> \
    -p <platform>
To remove a domain, the command is simply:
domain_remove -d <domain name>
Fortunately, this command is essentially reversible, and recreating a domain
with domain_create will restore it as if it had not been destroyed,
provided the same system boards are used in the re-creation.
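The power-up and creation steps can be assembled programmatically before
being run on the SSP. The sketch below only builds argument lists and
executes nothing. It assumes that board numbers are passed as separate
space-delimited arguments after -b, as they are after -sb in the power
examples; that detail and the helper names are assumptions, so check them
against your SSP documentation.

```python
def power_on_commands(system_boards, control_boards=()):
    """Build `power -on` argument lists for the given boards."""
    commands = []
    if system_boards:
        commands.append(["power", "-on", "-sb", *map(str, system_boards)])
    if control_boards:
        commands.append(["power", "-on", "-cb", *map(str, control_boards)])
    return commands


def domain_create_command(name, boards, os_version, platform):
    """Build a domain_create argument list (board list format is assumed)."""
    if not boards:
        raise ValueError("a domain needs at least one system board")
    return ["domain_create", "-d", name, "-b", *map(str, boards),
            "-o", os_version, "-p", platform]


if __name__ == "__main__":
    for argv in power_on_commands([0, 1], [0]):
        print(" ".join(argv))
    print(" ".join(domain_create_command("frobozz-d1", [0, 1], "2.7", "frood")))
```

Keeping the commands as lists makes it easy to review them, or log them,
before anything is actually issued as the ssp user.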
Bringup and Netcon
Once the domain is created, how do you "turn the key" on the host, or
access the console? The answer lies in the bringup and netcon
commands. The bringup command will cycle a domain through a power-on
self-test (POST), bring it up to an OpenBoot PROM prompt, and even boot
the OS if it is installed. Before issuing this command, it is wise to
determine the status of the domain with the check_host command. This will
return a simple "Host is UP" or "Host is DOWN" response, which lets you
know whether it is safe to bring the machine up.
The bringup command has a very specific set of functions that it executes,
in the following order:
bringup runs power to check that all of the system boards in the
domain are powered up. If not, it will abort with a message to this
effect.
Next, it runs check_host to determine the status of the domain. If the
domain is determined to be up, bringup will prompt you whether to
continue. This is useful if you are using the command to recover from a
hung host; however, it is recommended that you use bringup for this
purpose only in extreme situations.
The blacklist file, located in $SSPVAR/etc/<platform name>/blacklist,
is checked. This file allows components, from I/O units and CPUs to
entire system boards, to be individually excluded from the domain at
start time. This is a fairly useful feature, and the file can be edited
manually.
bringup runs hpost on the domain. hpost is a very valuable tool,
which can be run (on a domain that is not up) interactively at any time.
It runs the domain through a series of tests, which can often shake out
hardware errors. By default, it will run at level 16. It can be configured
to run up to level 127 (which executes extremely detailed testing) by
adding the line level N to the file ~ssp/.postrc.
Finally, bringup starts the obp_helper and the netcon_server,
which indicates that the domain is ready.
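The order of these checks can be sketched as a small model. This is not
the bringup implementation; it is a simplified illustration of the
sequence described above, with hypothetical parameter names, and it
ignores the possibility that hpost itself fails.

```python
def bringup_steps(board_power, host_is_up, continue_if_up=False,
                  blacklisted=(), post_level=16):
    """Return the log of a simulated bringup run.

    board_power: dict mapping board number -> True if powered on.
    continue_if_up: the operator's answer when the host is already up.
    """
    log = []
    unpowered = sorted(b for b, on in board_power.items() if not on)
    if unpowered:
        log.append(f"abort: boards not powered: {unpowered}")
        return log
    log.append("power: all domain boards are on")
    if host_is_up:
        if not continue_if_up:
            log.append("abort: host is up and the operator declined to continue")
            return log
        log.append("check_host: host is up, continuing at operator request")
    else:
        log.append("check_host: host is down")
    log.append(f"blacklist: excluding {sorted(blacklisted)}")
    log.append(f"hpost: running at level {post_level}")
    log.append("starting obp_helper and netcon_server")
    return log


if __name__ == "__main__":
    for line in bringup_steps({0: True, 1: True}, host_is_up=False):
        print(line)
```

The model makes the two early exits explicit: an unpowered board stops
the run immediately, and a host that is already up stops it unless the
operator chooses to continue.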
The only important arguments to bringup are -A on or -A off. This is the
equivalent of the auto-boot? OpenBoot PROM parameter: "on" will boot the
OS (if one is installed) and "off" will drop you at the OpenBoot PROM
prompt.
Some mention should be made of ways to interrupt a running domain.
Assuming your domain is hung, and you can't seem to get it to come back
for whatever reason, the Starfire offers several commands above and
beyond the traditional Stop-A type interrupt. They are, in order of
severity:
hostint -- This forces a panic on a domain.
hostreset -- The domain goes into a reset state, but you can then run
a bringup on it.
sys_reset -- This performs a hardware reset of all of the system
boards in a domain, and should be used only as a last resort.
The bringup command is, in fact, the most severe of all in a hang situation.
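The severity ordering suggests a simple escalation ladder: try the
mildest option first and move up only if the domain stays hung. The
sketch below merely encodes that order; the function name is
hypothetical and nothing is executed.

```python
# Least to most severe, in the order given in the text.
ESCALATION = ["hostint", "hostreset", "sys_reset", "bringup"]


def next_escalation(last_tried=None):
    """Return the next command to try, or None when the ladder is exhausted."""
    if last_tried is None:
        return ESCALATION[0]
    position = ESCALATION.index(last_tried)  # raises ValueError if unknown
    if position + 1 < len(ESCALATION):
        return ESCALATION[position + 1]
    return None


if __name__ == "__main__":
    step = next_escalation()
    while step is not None:
        print(step)
        step = next_escalation(step)
```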
Once bringup exits (assuming success), the netcon command can be used
to access the domain. If netcon is used on a domain that has not been
brought up, the command will sit idle, waiting for a connection. As
previously mentioned, the current value of $SUNW_HOSTNAME is the
domain that is accessed, so multiple windows can access multiple domains
simply by using domain_switch in each. Once the command is issued, a
sysadmin can interact with the system no matter what state or runlevel it
is in. One caveat, however, is that when the system is not in multi-user
mode, the connection can be extremely slow, as it is going over the JTAG.
It does, however, grant the access needed. Once the system reaches
runlevel 2, cvcd is started, which allows communication between the
domain and the SSP over the private Ethernet network.
Of course, since multiple users could theoretically have access to the
ssp user at once, it follows that multiple users could try to netcon into
the same domain at once. This could lead to some problems, but fortunately
netcon implements a locking mechanism that only allows one user to have
write access at a time. A user of netcon can be in unlocked write, locked
write, or read only mode. Control commands for netcon begin with the tilde
(~) as an escape character, and are as follows:
~# -- Analogous to Stop-A on a normal system. This will halt your
system and bring it to the OpenBoot PROM. Use caution with this
command.
~? -- Shows the current status of all the open netcon sessions.
~= -- Switch between the SSP private interface for the domain and the
control board JTAG interface. This feature only works in private mode,
when the cvcd is running on the host.
~* -- Private mode. This sets Locked Write permission, closes any
open netcon sessions, and disallows access to netcon from any
other terminal. This is the same as the -f (force) flag to the netcon
command itself.
~& -- Locked Write mode. This is the same as opening a session with
the -l flag.
~@ -- Unlocked Write mode. Another user can easily revoke this. This is
the same as opening a session with the -g flag.
~^ -- Read-Only mode. Releases write permission and echoes any
other session with write permission to your terminal.
~. -- Release netcon. This will exit the netcon session and return
you to the command prompt.
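The interplay between the write modes can be illustrated with a toy
model. This is a simplified sketch of the single-writer rule described
above, not netcon's actual implementation: it assumes that an unlocked
writer is demoted to read only when someone else requests write access,
that a locked writer cannot be displaced, and it leaves out private mode
entirely. All names are illustrative.

```python
from enum import Enum


class Mode(Enum):
    READ_ONLY = "read only"
    UNLOCKED_WRITE = "unlocked write"
    LOCKED_WRITE = "locked write"


class Console:
    """Toy model of one domain console shared by several sessions."""

    def __init__(self):
        self.modes = {}  # session name -> Mode

    def _writer(self):
        for name, mode in self.modes.items():
            if mode is not Mode.READ_ONLY:
                return name
        return None

    def open(self, name):
        self.modes[name] = Mode.READ_ONLY

    def request_write(self, name, locked):
        holder = self._writer()
        if holder is not None and holder != name:
            if self.modes[holder] is Mode.LOCKED_WRITE:
                return False                      # locked write cannot be taken
            self.modes[holder] = Mode.READ_ONLY   # unlocked write is revoked
        self.modes[name] = Mode.LOCKED_WRITE if locked else Mode.UNLOCKED_WRITE
        return True

    def release_write(self, name):
        self.modes[name] = Mode.READ_ONLY


if __name__ == "__main__":
    console = Console()
    for user in ("alice", "bob"):
        console.open(user)
    console.request_write("alice", locked=False)
    console.request_write("bob", locked=True)
    print(console.modes)
```

In this model at most one session ever holds write access, which is the
property the locking mechanism exists to guarantee.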
A sample output of netcon might look like:
frobozz-ssp01:frood-d1% netcon
trying to connect...
connected.
SUNW,Ultra-Enterprise-10000, using Network Console
OpenBoot 3.2.4, 12288 MB memory installed, Serial #00000000.
Ethernet address 0:0:00:00:00:00, Host ID: 00000000.
<#0> ok
At this point, you should be in familiar territory. You can essentially
treat the domain just as you would any other enterprise system. The only
major difference once the domain is up comes with the dynamically
reconfigurable properties of the Starfire.
Dynamic Reconfiguration
The feature of the Starfire that departs most from the rest of the
Enterprise line is the ability to change the capacity of a running system
without interrupting any services. The practical applications for this
feature are almost endless, and it is limited only by I/O configuration.
System boards can be reallocated from one domain to another, or even
removed from a domain, powered off, and removed from the system for
repair! There are two methods that can be used to accomplish the task of
reconfiguration. The first method is to use the dr command, and the other
(less reliable) method is to use the hostview GUI.
A brief note about hostview: this tool can be used to perform several
actions, including modifying the aforementioned blacklist file or opening
netcon consoles. However, it has been my experience that dr should
always be used for reconfiguration, as hostview seems unreliable when it
comes to modifying a domain. Board attachments or detachments often do
not work, for no visible reason. The intent is not to malign this tool,
which is useful in its own right, but it is not the best tool for this
particular feature of the E10k.
Issuing the command dr will start a shell-like environment and report on
what boards are physically present. It will also report which boards are
currently in use by $SUNW_HOSTNAME, as this is the domain that will be
modified. Before entering dr, you may want to first use domain_status to
see what boards are being used overall on the platform. The major actions
that can be performed from within dr are the attachment or detachment of
a system board. The commands used to achieve these functions are as
follows:
To attach an unused system board to the current domain:
init_attach <sysbd> -- Prepare the named board for attachment.
complete_attach <sysbd> -- Attaches the board to the domain, after
running init_attach.
abort_attach <sysbd> -- Aborts the attach process after a failed
attach, or before complete_attach is run.
Detaching a system board:
drain <sysbd> -- Evacuates the memory on the named board.
complete_detach <sysbd> -- Detaches the board from the domain,
after running drain.
abort_detach <sysbd> -- Aborts the detach process after a failed
drain, or before complete_detach is run.
Other commands:
reconfig -- Run after a board attachment, this will run the Solaris
config sequence on the domain: drvconfig; devlinks; disks; ports;
tapes.
drshow <sysbd> <command> -- Shows the status of a running dr
command. The most important arguments are drain and io.
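Put together, the commands above form two short sequences. The sketch
below just lists them in order for a given board number so they can be
reviewed before being typed into dr; the function names are
hypothetical, and the drshow step is included so the drain can be
inspected before the detach is completed.

```python
def attach_sequence(board):
    """dr commands to attach `board`, followed by the device reconfiguration."""
    return [f"init_attach {board}", f"complete_attach {board}", "reconfig"]


def detach_sequence(board):
    """dr commands to detach `board`, checking drain status before completing."""
    return [f"drain {board}", f"drshow {board} drain", f"complete_detach {board}"]


if __name__ == "__main__":
    print("\n".join(attach_sequence(5)))
    print("\n".join(detach_sequence(5)))
```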
Now for the warnings. Although attachment is relatively straightforward,
and can be done without incident using any free system board, use caution
when detaching a board from a running domain. The first notable issue is
that running drain on a board is not an instantaneous process, even
though the command returns immediately. Before running a
complete_detach, the board should be examined with the command
drshow <sysbd> drain, which shows the status of the drain process.
drain actually attempts to move physical memory pages off to memory on
other system boards, and attempting to detach the board before this is
complete can be catastrophic. Of course, if enough free memory isn't
available elsewhere on the system, the drain may not work!
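The memory constraint can be expressed as a rough feasibility check.
This is a deliberately simplified model, assuming that a drain succeeds
only if the memory in use on the departing board fits into the free
memory on the domain's remaining boards; the kernel's real behavior is
more involved, and the names and figures below are illustrative.

```python
def drain_can_fit(board, used_mb, free_mb):
    """Return True if `board`'s in-use memory fits on the other boards.

    used_mb and free_mb map board numbers to megabytes in use and free.
    """
    needed = used_mb[board]
    available = sum(mb for b, mb in free_mb.items() if b != board)
    return needed <= available


if __name__ == "__main__":
    used = {0: 3000, 1: 2500, 2: 1000}
    free = {0: 1096, 1: 1596, 2: 3096}
    print(drain_can_fit(2, used, free))
    print(drain_can_fit(0, used, free))
```

A check like this is only a first filter; drshow <sysbd> drain remains
the way to confirm that a drain has actually finished.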
The second, more important, caveat is that a board should never be
detached if it contains any I/O. While it is obvious that attempting to
detach a board that contains the SCSI channel to your boot disk would be
a bad thing, what is less obvious is that any I/O cards on a board may be
held open by the kernel. This includes cards that you may not be using at
that particular moment. Detaching a board that the system is not ready to
release can lead to a panic! To be safe in these situations, use the
command drshow <sysbd> io. This will tell you whether the kernel on
the domain is using any I/O. In designing your domains for proper dr
usage, the best idea is to institute a set of floater boards, which
contain no I/O whatsoever -- only CPU and memory. These boards can easily
be attached to or detached from any domain with few problems and make
life much easier on an E10k that is constantly reconfigured. Also, it is
a good idea to concentrate as much I/O as possible on the first board or
two of a multi-board domain, while still ensuring redundancy. Keeping as
little I/O as possible on the last boards of a domain is incredibly
useful if you plan on swapping system boards often.
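The floater idea is easy to capture in an inventory check. The sketch
below assumes you keep your own record of what is installed on each
board; the Board fields and the sample inventory are hypothetical, and
the only rule applied is the one stated above: a floater carries CPUs
and memory but no I/O cards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Board:
    number: int
    cpus: int
    memory_mb: int
    io_cards: int


def floaters(boards):
    """Return the numbers of boards that carry no I/O at all."""
    return [b.number for b in boards if b.io_cards == 0]


if __name__ == "__main__":
    inventory = [
        Board(0, cpus=4, memory_mb=4096, io_cards=4),
        Board(1, cpus=4, memory_mb=4096, io_cards=2),
        Board(2, cpus=4, memory_mb=4096, io_cards=0),
        Board(3, cpus=4, memory_mb=4096, io_cards=0),
    ]
    print(floaters(inventory))
```

Boards that pass this check are the natural candidates for moving
between domains; anything else still deserves a drshow <sysbd> io before
a detach is attempted.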
Conclusion
The Starfire presents several layers of complexity, which significantly
expand upon the existing Sun architecture. The features presented in this
article lend themselves to a coherent and, above all, reliable machine
that takes the concept of uptime very seriously. The benefits of such a
platform are plain to see. Although there are many more facets to the
administration of a Starfire, I have attempted to provide you with a base
arsenal of concepts and commands with which to approach this powerful
environment.
Jeff Ruggeri is a Solaris Systems Administrator at Aetna in Middletown,
CT, where he is responsible for an environment comprising nearly 300
mission-critical Sun Enterprise servers. He has been hacking UNIX in one
form or another since he was approximately 12 years old.
The text of "Starfire Administration" has been adapted from Jeff Ruggeri's
contribution to the book Solaris Solutions for System Administrators by
Sandra Henry-Stocker and Evan R. Marks, and is reprinted here with
permission from Wiley Computer Publishing.