Hardware Diagnostics for Sun™ Systems:
A Toolkit for System Administrators
Have you ever stared at the ok prompt on your Sun system, and
wondered how to continue? Or have you ever wondered why all the LEDs
on the system board sometimes appear to flash madly like a broken
street light?
Look no further -- read on to find the answer to these questions.
By using OpenBoot™ commands, the Power On Self Test (POST) program,
and the status LEDs on system boards, you can diagnose hardware-related
problems on Sun Microsystems™ server and desktop products. With these
low-level diagnostics, you can establish the state of the system and
attached devices. For example, you can determine whether a device is
recognized by the system and working properly, or obtain useful system
configuration information.
Use this table to locate subjects in this article:
OpenBoot PROM (OBP) Diagnostic Commands and Tools | Describes OBP commands you can use to display the system configuration, test devices attached to the system, monitor network connections, and more.
OBDiag | Shows how you can run tests and perform diagnostics on the main logic board and its interfaces, and on devices such as disk and tape drives.
Power On Self Test (POST) | Explains how POST initializes, configures, and tests the system, and shows you how to capture POST output and interpret the results using the LEDs on the system board and power supply.
System Board and Power Supply LED Status Tables | Provides reference information to help you interpret the meaning of LED status for system boards and power supplies installed on Ultra™ Enterprise Server products.
Solaris Operating Environment Diagnostic Commands | Lists useful OS commands you can use to display the system configuration, including failed Field Replaceable Units (FRU), hardware revision information, installed patches, and more.
OBP DIAGNOSTIC COMMANDS AND TOOLS
OBP is a powerful, low-level interface to the system and devices
attached to the system (OBP is also known as the ok prompt).
By entering simple OBP commands, you can learn system configuration
details such as the ethernet address, the CPU and bus speeds,
installed memory, and so on. Using OBP, you can also query and set
system parameter values such as the default boot device, run tests on
devices such as the network interface, and display the SCSI and SBUS
devices attached to the system.
The following table describes commands available in OpenBoot version
3.x. To use a command, type it at the OBP ok prompt and press Return.
banner | Displays the power on banner. The banner includes information such as CPU speed, OBP revision, total system memory, ethernet address and hostid.
devalias alias path | Defines a new device alias, where alias is the new alias name and path is the physical path of the device. If devalias is used without arguments, it displays all system device aliases.
.enet-addr | Displays the ethernet address.
led-off/led-on | Turns the system LED off or on.
nvalias name path | Creates a new alias for a device, where name is the name of the alias and path is the physical path of the device. Note - Run the reset-all or the nvstore command to save the new alias in non-volatile memory (NVRAM).
nvunalias name | Deletes a user-created alias (see nvalias), where name is the name of the alias. Note - Run the reset-all or nvstore command to save changes in NVRAM.
nvstore | Copies the contents of the temporary buffer to NVRAM and discards the contents of the temporary buffer.
power-off/power-on | Powers the system off or on.
printenv | Displays all parameters, settings, and values.
probe-fcal-all | Identifies Fiber Channel Arbitrated Loop (FCAL) devices on a system. (See note 1.)
probe-sbus | Identifies devices attached to all SBUS slots. Note - This command works only on systems with SBUS slots.
probe-scsi | Identifies devices attached to the onboard SCSI bus. (See note 1.)
probe-scsi-all | Identifies devices attached to all SCSI busses. (See note 1.)
set-default parameter | Resets the value of parameter to the default setting.
set-defaults | Resets the value of all parameters to the default settings. Tip - You can also press the Stop and N keys simultaneously during system power-up to reset the values to their defaults.
setenv parameter value | Sets parameter to specified value. Note - Run the reset-all command to save changes in NVRAM.
show-devs | Displays all the devices recognized by the system.
show-disks | Displays the physical device path for disk controllers.
show-displays | Displays the physical device path for frame buffers.
show-nets | Displays the physical device path for network interfaces.
show-post-results | If run after Power On Self Test (POST) is completed, this command displays the findings of POST in a readable format.
show-sbus | Displays devices attached to all SBUS slots. Similar to probe-sbus.
show-tapes | Displays the physical device path for tape controllers.
sifting string | Searches for OBP commands or methods that contain string. For example, the sifting probe command displays probe-scsi, probe-scsi-all, probe-sbus, and so on.
.speed | Displays CPU and bus speeds.
test device-specifier | Executes the selftest method for device-specifier. For example, the test net command tests the network connection.
test-all | Tests all devices that have a built-in test method.
.version | Displays OBP and POST version information.
watch-clock | Tests a clock function.
watch-net | Monitors the network connection for the primary interface.
watch-net-all | Monitors all the network connections.
words | Displays all OBP commands and methods.
Note 1 - On Ultra (sun4u) systems, set the auto-boot? variable to
false, or the probe-scsi, probe-scsi-all, and probe-fcal-all commands
will cause the system to hang. To set this variable, type
setenv auto-boot? false at the ok prompt, then type reset-all.
Remember to change the value back to true when testing is completed,
or the system will not boot automatically.
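The precaution in the footnote above is easy to forget at the console, so it can help to write the sequence down before a probe run. The following is a minimal sketch in Python that only builds the text to type at the ok prompt; it does not talk to any hardware. The helper name probe_session is hypothetical, and the commands and their order come from the footnote.

```python
# Probe commands that the footnote says need auto-boot? set to false
# on Ultra (sun4u) systems.
PROBE_COMMANDS = ("probe-scsi", "probe-scsi-all", "probe-fcal-all")


def probe_session(probe_command):
    """Return the lines to type at the ok prompt for one probe run.

    probe_session is a hypothetical helper for illustration. It builds
    text only and never touches a real system.
    """
    if probe_command not in PROBE_COMMANDS:
        raise ValueError(f"not a probe command from the footnote: {probe_command}")
    return [
        "setenv auto-boot? false",  # keep the system at the ok prompt
        "reset-all",                # reset, as the footnote instructs
        probe_command,              # run the probe itself
        "setenv auto-boot? true",   # restore automatic booting afterwards
    ]


if __name__ == "__main__":
    for line in probe_session("probe-scsi-all"):
        print("ok", line)
```

Keeping the restore step in the same list as the probe makes it harder to leave auto-boot? set to false by accident.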
OBDIAG
OBDiag enables you to interactively run tests and diagnostics at the
OBP level on these Sun systems:
- Sun Enterprise 420R Server
- Sun Enterprise 220R Server
- Sun Ultra Enterprise 450 Server
- Sun Ultra Enterprise 250 Server
- Sun Ultra 80
- Sun Ultra 60
- Sun Ultra 30
- Sun Ultra 10
- Sun Ultra 5
OBDiag displays its test results using the LEDs on the front system
panel and on the keyboard. Use the system board and power supply LED
status tables to interpret the results. OBDiag also displays diagnostic
and error messages on the system console. To learn more about OBDiag,
visit docs.sun.com.
OBDiag tests not only the main logic board, but also its interfaces:
- PCI
- SCSI
- Ethernet
- Serial
- Parallel
- Keyboard/mouse
- NVRAM
- Audio
- Video
How To Run OBDiag
To run OBDiag, type obdiag at the OpenBoot ok prompt.
You can also set up OBDiag to run automatically when the system is
powered on, using any of the following methods:
- Set the OBP diagnostics variable:
  ok setenv diag-switch? true
- Press the Stop and D keys simultaneously while you power on the
  system.
- On Ultra Enterprise servers, turn the key switch to the diagnostics
  position and power on the system.
POWER ON SELF TEST (POST)
POST is a program that resides in the firmware of each board in a
system, and it is used to initialize, configure, and test the system
boards. POST output is sent to serial port A (on an Ultra Enterprise
server, POST output is sent only to serial port A on the system and
clock board). The status LEDs of each system board on Ultra
Enterprise servers indicate the POST completion status. For example,
if a system board fails POST, the amber LED stays lit.
You can watch POST output in real time by attaching a terminal device
to serial port A. If none is available, you can use the OBP command
show-post-results to view the results after POST completes.
How To Run POST
1. Attach a terminal device to serial port A.
2. Set the OBP diagnostics variable:
   ok setenv diag-switch? true
3. Set the desired testing level.
   Two levels of POST are available, so you can run all of the tests or
   only some of them. Set the OBP variable diag-level to the desired
   level of testing (max or min), for example:
   ok setenv diag-level max
4. If you wish to boot from disk, set the OBP variable diag-device:
   ok setenv diag-device disk
   The system default for this variable is net.
5. Set the auto-boot? variable:
   ok setenv auto-boot? false
6. Save the changes:
   ok reset-all
7. Power cycle the system (turn it off, and then back on).
POST runs as the system powers on, and the output is displayed on the
device attached to serial port A. After POST is completed, you can
also run the OBP command show-post-results to view the results.
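The ok-prompt part of this procedure can be written out ahead of time as a short checklist. The Python sketch below builds that list of commands; it assumes nothing beyond the OBP variables and values named in the steps above. The function name post_setup_commands and its parameters are hypothetical, chosen only for illustration, and the code does not communicate with a system.

```python
def post_setup_commands(diag_level="max", boot_from_disk=False):
    """Build the ok-prompt commands from the POST procedure.

    post_setup_commands is a hypothetical helper. The OBP variables
    (diag-switch?, diag-level, diag-device, auto-boot?) come from the
    procedure in the text.
    """
    if diag_level not in ("max", "min"):
        raise ValueError("diag-level must be 'max' or 'min'")
    commands = [
        "setenv diag-switch? true",         # enable diagnostics
        f"setenv diag-level {diag_level}",  # choose the testing level
    ]
    if boot_from_disk:
        # Optional step: the text gives net as the system default.
        commands.append("setenv diag-device disk")
    commands.append("setenv auto-boot? false")  # stay at the ok prompt
    commands.append("reset-all")                # save the changes
    return commands


if __name__ == "__main__":
    for command in post_setup_commands(diag_level="min", boot_from_disk=True):
        print("ok", command)
```

Attaching the terminal and power cycling the system remain manual steps and are deliberately left out of the list.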
SYSTEM BOARD AND POWER SUPPLY LED STATUS TABLES
This section contains reference information to help you understand
the LED status on system boards and power supplies installed on
Ultra Enterprise Server products.
Ultra Enterprise Server Front Panel and Clock Board LED Status
Power LED | Service LED | Cycling LED | Condition
off | off | off | no power
off | on | off | failure mode
off | off | on | failure mode
off | on | on | failure mode
on | off | off | hung in POST/OBP or OS
on | off | on | hung in OS
on | on | off | hung in POST/OBP; hung in OS/failed component
on | on | on | hung in POST/OBP; hung in OS/failed component
on | off | flashing | OS running normally
on | on | flashing | OS running with failed component
on | flashing | off | slow flash = POST; fast flash = OBP
on | flashing | on | OS or OBP error
Notes:
LED Name | Location | Note
Power LED | Left | Should always be on. If all three LEDs are off, suspect a power problem. If this LED is in any state other than on and steady, it indicates a problem.
Service LED | Middle | This LED should be off in normal operation. If on, a component is in an error state and you should check individual board LEDs. A lit service LED does not imply there is an OS-related problem.
Cycling LED | Right | This LED should be flashing -- this is the normal state.
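For quick reference away from the printed table, the front panel and clock board states can be encoded as a lookup keyed on the three LEDs. This is a minimal Python sketch under that assumption; the names FRONT_PANEL_STATUS and front_panel_condition are hypothetical, and the entries simply restate the table rows.

```python
# Keys are (power, service, cycling) LED states; values restate the
# Condition column of the front panel and clock board table.
FRONT_PANEL_STATUS = {
    ("off", "off", "off"): "no power",
    ("off", "on", "off"): "failure mode",
    ("off", "off", "on"): "failure mode",
    ("off", "on", "on"): "failure mode",
    ("on", "off", "off"): "hung in POST/OBP or OS",
    ("on", "off", "on"): "hung in OS",
    ("on", "on", "off"): "hung in POST/OBP; hung in OS/failed component",
    ("on", "on", "on"): "hung in POST/OBP; hung in OS/failed component",
    ("on", "off", "flashing"): "OS running normally",
    ("on", "on", "flashing"): "OS running with failed component",
    ("on", "flashing", "off"): "slow flash = POST; fast flash = OBP",
    ("on", "flashing", "on"): "OS or OBP error",
}


def front_panel_condition(power, service, cycling):
    """Look up the condition for one observed LED combination."""
    key = tuple(state.strip().lower() for state in (power, service, cycling))
    return FRONT_PANEL_STATUS.get(key, "combination not listed in the table")


if __name__ == "__main__":
    print(front_panel_condition("on", "on", "flashing"))
```

Combinations that the table does not list return a plain message instead of a guess.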
Ultra Enterprise CPU/Memory, I/O, and Disk Board LED Status
Power LED | Service LED | Cycling LED | Condition
off | off | off | board has no power
off | on | off | low power mode - unpluggable
off | off | on | failure mode
off | on | on | failure mode
on | off | off | hung in POST/OBP or OS
on | off | on | hung in OS
on | on | off | hung in POST/OBP; hung in OS and failed component on board
on | on | on | hung in POST/OBP; hung in OS/failed component on board
on | off | flashing | OS running normally
on | on | flashing | OS running normally/failed component on board
on | flashing | off | slow flash = POST; fast flash = OBP
on | flashing | on | OS or OBP error
Notes:
Low Power Mode - If the status of the LEDs on the board is off-on-off,
the board is in low power mode. This occurs when the board is disabled
because it failed POST, or when the board was just inserted. Low power
mode is the only state in which you may unplug the board while the
system is running.
Disk Boards - The amber LED on disk boards installed in Ultra
Enterprise servers remains on when the server is running Solaris 2.6
5/98 or later. This is normal, and it indicates that the board is in
low power mode (the board can be removed from the system provided the
disks have been idled).
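Because off-on-off is the only pattern in which a board may be unplugged from a running system, it is worth checking explicitly before pulling anything. The sketch below, in Python, filters a set of observed board LED states down to the removable ones. The function name removable_boards and the slot labels are hypothetical; the rule itself is the low power mode note above.

```python
# (power, service, cycling) pattern that the notes call low power mode.
LOW_POWER_MODE = ("off", "on", "off")


def removable_boards(observed):
    """Return the boards whose LEDs show low power mode (off-on-off).

    observed maps a board label (hypothetical, for example "slot 3")
    to a (power, service, cycling) tuple read from the board LEDs.
    """
    return sorted(
        label
        for label, leds in observed.items()
        if tuple(state.strip().lower() for state in leds) == LOW_POWER_MODE
    )


if __name__ == "__main__":
    boards = {
        "slot 1": ("on", "off", "flashing"),  # OS running normally
        "slot 3": ("off", "on", "off"),       # low power mode
    }
    print(removable_boards(boards))
```

For disk boards, the note above still applies: the disks must be idled before the board is removed.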
Power Supply LED Status
LEDs are used on the power supply to report an error condition such
as power supply or fan failure. Power supplies are hot-pluggable,
but the Solaris Operating Environment halts the system if
insufficient power is detected. Generally, a system is configured
with a power supply for each system board.
Green LED | Yellow LED | Condition
off | off | No AC input or keyswitch is turned off
on | off | Normal operation
on | on | Fan failure or one or more voltages out of specification
off | on | One or more DC outputs failed, or voltages out of specification, or system in low power state
SOLARIS OPERATING ENVIRONMENT DIAGNOSTIC COMMANDS
The following table describes OS commands you can use to display
the system configuration, such as failed Field Replaceable Units
(FRU), hardware revision information, installed patches, and so
on.
/usr/platform/sun4u/sbin/prtdiag -v | Displays system configuration and diagnostic information, and lists any failed Field Replaceable Units (FRU).
/usr/bin/showrev [-p] | Displays revision information for the current hardware and software. When used with the -p option, displays installed patches.
/usr/sbin/prtconf | Displays system configuration information.
/usr/sbin/psrinfo -v | Displays CPU information, including clock speed.
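When you collect this information for a service call, running all four commands in one pass saves time. The following Python sketch does that under a few assumptions: a Python 3.7 or later interpreter is available on the host, the command paths are exactly those listed in the table, and any command that is not installed is skipped instead of treated as an error. The name collect_diagnostics is hypothetical.

```python
import os
import subprocess

# Command lines taken from the table of OS diagnostic commands.
DIAGNOSTIC_COMMANDS = (
    ("/usr/platform/sun4u/sbin/prtdiag", "-v"),
    ("/usr/bin/showrev", "-p"),
    ("/usr/sbin/prtconf",),
    ("/usr/sbin/psrinfo", "-v"),
)


def collect_diagnostics(commands=DIAGNOSTIC_COMMANDS):
    """Run each available command and return {command line: output}.

    A value of None means the command was not found on this machine.
    """
    report = {}
    for argv in commands:
        label = " ".join(argv)
        program = argv[0]
        if not (os.path.isfile(program) and os.access(program, os.X_OK)):
            report[label] = None
            continue
        # check=False: a non-zero exit status should not abort the run.
        result = subprocess.run(
            list(argv), capture_output=True, text=True, check=False
        )
        report[label] = result.stdout
    return report


if __name__ == "__main__":
    for label, output in collect_diagnostics().items():
        print("=====", label, "=====")
        print(output if output is not None else "(not available on this system)")
```

The output is returned unparsed because the report formats vary by platform and release.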
RELATED LINKS
Using Device Path Names to Identify System Devices: Eliminate the Guesswork | Establish the hardware configuration of your system using the OpenBoot™ device tree.