SunSolve Internal

Infodoc ID   Synopsis   Date
23476   Hardware Diagnostics for Sun Systems: A Toolkit for System Administrators   10 Aug 2000

Hardware Diagnostics for Sun™ Systems: A Toolkit for System Administrators

Have you ever stared at the ok prompt on your Sun system, and wondered how to continue? Or have you ever wondered why all the LEDs on the system board sometimes appear to flash madly like a broken street light?

Look no further -- read on to find the answer to these questions.

By using OpenBoot™ commands, the Power On Self Test (POST) program, and the status LEDs on system boards, you can diagnose hardware-related problems on Sun Microsystems™ server and desktop products. With these low-level diagnostics, you can establish the state of the system and attached devices. For example, you can determine whether a device is recognized by the system and working properly, or obtain useful system configuration information.

Use this table to locate subjects in this article:

Subject   Description
OpenBoot PROM (OBP) Diagnostic Commands and Tools   Describes OBP commands you can use to display the system configuration, test devices attached to the system, monitor network connections, and more.
OBDiag   Shows how you can run tests and perform diagnostics on the main logic board and its interfaces, and on devices such as disk and tape drives.
Power On Self Test (POST)   Explains how POST initializes, configures, and tests the system, and shows you how to capture POST output and interpret the results using the LEDs on the system board and power supply.
System Board and Power Supply LED Status Tables   Provides reference information to help you interpret the LED status of system boards and power supplies installed on Ultra™ Enterprise Server products.
Solaris Operating Environment Diagnostic Commands   Lists useful OS commands you can use to display the system configuration, including failed Field Replaceable Units (FRU), hardware revision information, installed patches, and more.

OBP DIAGNOSTIC COMMANDS AND TOOLS

OBP is a powerful, low-level interface to the system and devices attached to the system (OBP is also known as the ok prompt). By entering simple OBP commands, you can learn system configuration details such as the ethernet address, the CPU and bus speeds, installed memory, and so on. Using OBP, you can also query and set system parameter values such as the default boot device, run tests on devices such as the network interface, and display the SCSI and SBUS devices attached to the system.

The following table describes commands available in OpenBoot version 3.x. To use a command, simply type the command at the OBP ok prompt and press Return.

Command   Description
banner   Displays the power-on banner. The banner includes information such as CPU speed, OBP revision, total system memory, Ethernet address, and hostid.
devalias alias path   Defines a new device alias, where alias is the new alias name and path is the physical path of the device. If devalias is used without arguments, it displays all system device aliases.
.enet-addr   Displays the Ethernet address.
led-off/led-on   Turns the system LED off or on.
nvalias name path   Creates a new alias for a device, where name is the name of the alias and path is the physical path of the device. Note - Run the reset-all or the nvstore command to save the new alias in non-volatile memory (NVRAM).
nvunalias name   Deletes a user-created alias (see nvalias), where name is the name of the alias. Note - Run the reset-all or nvstore command to save changes in NVRAM.
nvstore   Copies the contents of the temporary buffer to NVRAM and discards the contents of the temporary buffer.
power-off/power-on   Powers the system off or on.
printenv   Displays all parameters, settings, and values.
probe-fcal-all   Identifies Fibre Channel Arbitrated Loop (FCAL) devices on a system. [1]
probe-sbus   Identifies devices attached to all SBUS slots. Note - This command works only on systems with SBUS slots.
probe-scsi   Identifies devices attached to the onboard SCSI bus. [1]
probe-scsi-all   Identifies devices attached to all SCSI busses. [1]
set-default parameter   Resets the value of parameter to the default setting.
set-defaults   Resets the value of all parameters to the default settings. Tip - You can also press the Stop and N keys simultaneously during system power-up to reset the values to their defaults.
setenv parameter value   Sets parameter to the specified value. Note - Run the reset-all command to save changes in NVRAM.
show-devs   Displays all the devices recognized by the system.
show-disks   Displays the physical device path for disk controllers.
show-displays   Displays the physical device path for frame buffers.
show-nets   Displays the physical device path for network interfaces.
show-post-results   If run after Power On Self Test (POST) is completed, this command displays the findings of POST in a readable format.
show-sbus   Displays devices attached to all SBUS slots. Similar to probe-sbus.
show-tapes   Displays the physical device path for tape controllers.
sifting string   Searches for OBP commands or methods that contain string. For example, the sifting probe command displays probe-scsi, probe-scsi-all, probe-sbus, and so on.
.speed   Displays CPU and bus speeds.
test device-specifier   Executes the selftest method for device-specifier. For example, the test net command tests the network connection.
test-all   Tests all devices that have a built-in test method.
.version   Displays OBP and POST version information.
watch-clock   Tests a clock function.
watch-net   Monitors the network connection for the primary interface.
watch-net-all   Monitors all the network connections.
words   Displays all OBP commands and methods.

[1] On Ultra (sun4u) systems, set the auto-boot? variable to false before running probe-scsi, probe-scsi-all, or probe-fcal-all; otherwise these commands will cause the system to hang. To set this variable, type setenv auto-boot? false at the ok prompt, then type reset-all. Remember to change the value back to true when testing is completed, or the system will not boot automatically.

OBDIAG

OBDiag enables you to interactively run tests and diagnostics at the OBP level on these Sun systems:

OBDiag displays its test results using the LEDs on the front system panel and on the keyboard. Use the system board and power supply LED status tables to interpret the results.

OBDiag also displays diagnostic and error messages on the system console. To learn more about OBDiag, visit docs.sun.com.

OBDiag tests not only the main logic board but also its interfaces:

How To Run OBDiag

To run OBDiag, simply type obdiag at the OpenBoot ok prompt.

You can also set up OBDiag to run automatically when the system is powered on using the following methods:

POWER ON SELF TEST (POST)

POST is a program that resides in the firmware of each board in a system; it initializes, configures, and tests the system boards. POST output is sent to serial port A (on an Ultra Enterprise server, POST output is sent only to serial port A on the system and clock board). The status LEDs of each system board on Ultra Enterprise servers indicate the POST completion status. For example, if a system board fails POST, the amber LED stays lit.

You can watch POST output in real time by attaching a terminal device to serial port A. If none is available, you can use the OBP command show-post-results to view the results after POST completes.

How To Run POST

  1. Attach a terminal device to serial port A.

  2. Set the OBP diagnostics variable:

    ok setenv diag-switch? true

  3. Set the desired testing level.
    Two different levels of POST can be run, and you can choose to run all tests or some of the tests. Set the OBP variable diag-level to the desired level of testing (max or min), for example:

    ok setenv diag-level max

  4. If you wish to boot from disk, set the OBP variable diag-device:

    ok setenv diag-device disk

    The system default for this variable is net.

  5. Set the auto-boot? variable:

    ok setenv auto-boot? false

  6. Save the changes:

    ok reset-all

  7. Power cycle the system (turn it off, and then back on).

    POST runs as the system powers on, and the output is displayed on the device attached to serial port A. After POST is completed, you can also run the OBP command show-post-results to view the results.
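
If the system is still running the Solaris Operating Environment, the variables from steps 2 through 5 can usually also be set from a root shell with the eeprom command rather than at the ok prompt. The following is only a sketch under that assumption: eeprom is not covered in this article, and the function name post_eeprom_cmds is hypothetical. The function prints the commands instead of running them, so they can be reviewed first.

```shell
#!/bin/sh
# Print (but do not run) eeprom commands that mirror steps 2-5 above.
# Assumption: /usr/sbin/eeprom is available and accepts 'name=value'.
# post_eeprom_cmds is a hypothetical helper name, not a Sun utility.
post_eeprom_cmds() {
    level=${1:-max}      # diag-level: max or min
    device=${2:-disk}    # diag-device: disk, or net (the system default)

    case "$level" in
        max|min) ;;
        *) echo "diag-level must be max or min" >&2; return 1 ;;
    esac

    # The single quotes keep the shell from treating ? as a wildcard.
    echo "eeprom 'diag-switch?=true'"
    echo "eeprom 'diag-level=$level'"
    echo "eeprom 'diag-device=$device'"
    echo "eeprom 'auto-boot?=false'"
}
```

For example, post_eeprom_cmds max disk prints four eeprom command lines. After running them as root, power cycle the system as in the last step above so that POST runs with the new settings.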

SYSTEM BOARD AND POWER SUPPLY LED STATUS TABLES

This section contains reference information to help you understand the LED status on system boards and power supplies installed on Ultra Enterprise Server products.

Ultra Enterprise Server Front Panel and Clock Board LED Status

Power LED   Service LED   Cycling LED   Condition
off         off           off           no power
off         on            off           failure mode
off         off           on            failure mode
off         on            on            failure mode
on          off           off           hung in POST/OBP or OS
on          off           on            hung in OS
on          on            off           hung in POST/OBP; hung in OS/failed component
on          on            on            hung in POST/OBP; hung in OS/failed component
on          off           flashing      OS running normally
on          on            flashing      OS running with failed component
on          flashing      off           slow flash = POST; fast flash = OBP
on          flashing      on            OS or OBP error

Notes:

LED Name      Location   Note
Power LED     Left       Should always be on. If all three LEDs are off, suspect a power problem. Any state other than on and steady indicates a problem.
Service LED   Middle     Should be off in normal operation. If on, a component is in an error state and you should check the individual board LEDs. A lit service LED does not imply there is an OS-related problem.
Cycling LED   Right      Should be flashing -- this is the normal state.
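
When working through the front panel table during fault isolation, it can help to encode it as a lookup. The following is a minimal sketch for a POSIX shell, not a Sun-supplied tool; the function name led_condition is hypothetical, and the strings it prints are taken from the table above.

```shell
#!/bin/sh
# Map front panel LED states (power, service, cycling) to the condition
# listed in the table above. Each argument is one of: off, on, flashing.
# led_condition is a hypothetical helper name.
led_condition() {
    case "$1/$2/$3" in
        off/off/off)
            echo "no power" ;;
        off/on/off|off/off/on|off/on/on)
            echo "failure mode" ;;
        on/off/off)
            echo "hung in POST/OBP or OS" ;;
        on/off/on)
            echo "hung in OS" ;;
        on/on/off|on/on/on)
            echo "hung in POST/OBP; hung in OS/failed component" ;;
        on/off/flashing)
            echo "OS running normally" ;;
        on/on/flashing)
            echo "OS running with failed component" ;;
        on/flashing/off)
            echo "slow flash = POST; fast flash = OBP" ;;
        on/flashing/on)
            echo "OS or OBP error" ;;
        *)
            echo "not listed in table" ;;
    esac
}
```

For example, led_condition on off flashing prints "OS running normally", the expected state of a healthy system.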

Ultra Enterprise CPU/Memory, I/O, and Disk Board LED Status

Power LED   Service LED   Cycling LED   Condition
off         off           off           board has no power
off         on            off           low power mode - unpluggable
off         off           on            failure mode
off         on            on            failure mode
on          off           off           hung in POST/OBP or OS
on          off           on            hung in OS
on          on            off           hung in POST/OBP; hung in OS and failed component on board
on          on            on            hung in POST/OBP; hung in OS/failed component on board
on          off           flashing      OS running normally
on          on            flashing      OS running normally/failed component on board
on          flashing      off           slow flash = POST; fast flash = OBP
on          flashing      on            OS or OBP error

Notes:

Low Power Mode - If the status of the LEDs on the board is off-on-off, the board is in low power mode. This occurs when the board is disabled because it failed POST, or if the board was just inserted. Low power mode is the only state in which you may unplug the board while the system is running.

Disk Boards - The amber LED on disk boards installed in Ultra Enterprise servers will remain on when the server is running Solaris 2.6 5/98 or above. This is normal, and it indicates the board is in low power mode (the board can be removed from the system provided the disks have been idled).
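
The low power mode rule in the notes above can be expressed as a simple check. This is a sketch only, for a POSIX shell; board_unpluggable is a hypothetical helper name, and it looks at the LED pattern alone. It does not verify that the disks on a disk board have been idled.

```shell
#!/bin/sh
# Succeed (exit status 0) only when a board's LEDs read off-on-off
# (power, service, cycling), the low power mode described in the notes.
# board_unpluggable is a hypothetical helper name.
board_unpluggable() {
    [ "$1" = off ] && [ "$2" = on ] && [ "$3" = off ]
}

# Example: a board that was just inserted and shows off-on-off.
if board_unpluggable off on off; then
    echo "low power mode: board may be unplugged"
fi
```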

Power Supply LED Status

LEDs are used on the power supply to report an error condition such as power supply or fan failure. Power supplies are hot-pluggable, but the Solaris Operating Environment halts the system if insufficient power is detected. Generally, a system is configured with a power supply for each system board.

Green LED   Yellow LED   Condition
off         off          No AC input or keyswitch is turned off
on          off          Normal operation
on          on           Fan failure or one or more voltages out of specification
off         on           One or more DC outputs failed, or voltages out of specification, or system in low power state

SOLARIS OPERATING ENVIRONMENT DIAGNOSTIC COMMANDS

The following table describes OS commands you can use to display the system configuration, such as failed Field Replaceable Units (FRU), hardware revision information, installed patches, and so on.

Command   Description
/usr/platform/sun4u/sbin/prtdiag -v   Displays system configuration and diagnostic information, and lists any failed Field Replaceable Units (FRU).
/usr/bin/showrev [-p]   Displays revision information for the current hardware and software. When used with the -p option, displays installed patches.
/usr/sbin/prtconf   Displays system configuration information.
/usr/sbin/psrinfo -v   Displays CPU information, including clock speed.
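
To capture the output of all four commands in one pass, for example before opening a service call, a small wrapper can help. The following is a minimal sketch for a POSIX shell; the function names run_if_present and collect_diag are hypothetical, and the command paths are the ones listed in the table above. A command that is not installed on a given platform is skipped rather than treated as an error.

```shell
#!/bin/sh
# Run one command with its arguments under a header line, or report
# that it was skipped if the executable is not present.
# run_if_present and collect_diag are hypothetical helper names.
run_if_present() {
    cmd=$1
    echo "=== $* ==="
    if [ -x "$cmd" ]; then
        "$@"
    else
        echo "skipped: $cmd not found"
    fi
}

# Run every diagnostic command from the table above, in order.
collect_diag() {
    run_if_present /usr/platform/sun4u/sbin/prtdiag -v
    run_if_present /usr/bin/showrev -p
    run_if_present /usr/sbin/prtconf
    run_if_present /usr/sbin/psrinfo -v
}
```

Calling collect_diag and redirecting its output to a file of your choice gives a single report with one labeled section per command.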

RELATED LINKS

Using Device Path Names to Identify System Devices: Eliminate the Guesswork
Establish the hardware configuration of your system using the OpenBoot™ device tree.

Applies To: Hardware, Tools/Diagnostics