SunSolve Internal

 

Infodoc ID   Synopsis   Date
130   SCSI Transport Errors and how to identify them   27 Jun 2000

Status Issued

Description

SCSI Transport Errors and how to identify them:

Note: This is only a quick reference on SCSI errors. SCSI errors can be, and often are, more involved than what can be covered here.

Four pieces of information help identify SCSI transport errors on a sun4u machine:

  1. The system type and the OS release it is running
  2. The latest /var/adm/messages
  3. The output of /usr/platform/sun4u/sbin/prtdiag -v
  4. The output of showrev -p
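
The four items above can be gathered in one pass. The following is a minimal sketch, not a supported tool: it assumes a Python interpreter is available on the system, and it assumes uname -a is an acceptable way to answer item 1. The names COMMANDS, run and collect are made up for illustration.

```python
import subprocess

# Commands taken from the list above. "uname -a" is an assumption:
# it is one common way to report the system type and OS release.
COMMANDS = {
    "system": ["uname", "-a"],
    "prtdiag": ["/usr/platform/sun4u/sbin/prtdiag", "-v"],
    "patches": ["showrev", "-p"],
}
MESSAGES_FILE = "/var/adm/messages"


def run(argv):
    """Run one command and return its combined output as text."""
    try:
        result = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        return result.stdout
    except OSError as exc:
        # The command is missing or not executable on this machine.
        return "could not run %s: %s\n" % (argv[0], exc)


def collect(runner=run, messages_path=MESSAGES_FILE):
    """Return a dict with the output of each command plus the messages file."""
    report = {name: runner(argv) for name, argv in COMMANDS.items()}
    try:
        with open(messages_path, errors="replace") as handle:
            report["messages"] = handle.read()
    except OSError as exc:
        report["messages"] = "could not read %s: %s\n" % (messages_path, exc)
    return report
```

The runner argument exists only so the collection step can be exercised without the Solaris commands being present.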

The /var/adm/messages file gives hints about what is going on.

If the scsi errors look like this:

Aug 3 14:02:57 asta unix: WARNING: /pci@1f,4000/scsi@3 (glm0):

Aug 3 14:02:57 asta unix: WARNING: /pci@1f,4000/scsi@3 (glm0):

Aug 3 14:02:57 asta unix: WARNING: /pci@1f,4000/scsi@3/sd@0,0 (sd0):

Aug 3 14:02:57 asta unix: SCSI transport failed: reason 'reset': retrying command                  

Then check the revision of the glm patch that is installed. If it is much lower than the latest revision, have the customer update the patch.
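
Comparing the installed patch revision against the latest one can be sketched as follows. This assumes showrev -p prints one line per patch beginning with "Patch: NNNNNN-RR"; the patch ID 999999 in the usage note is a placeholder, not the real glm patch number, and the function names are hypothetical.

```python
import re

# Assumed line shape: "Patch: 999999-03 Obsoletes: ... Packages: ..."
PATCH_RE = re.compile(r"^Patch:\s*(\d{6})-(\d{2})\b")


def installed_revisions(showrev_output):
    """Map each patch base ID to the highest installed revision."""
    revisions = {}
    for line in showrev_output.splitlines():
        match = PATCH_RE.match(line.strip())
        if match:
            base, rev = match.group(1), int(match.group(2))
            revisions[base] = max(rev, revisions.get(base, 0))
    return revisions


def revisions_behind(showrev_output, patch_base, latest_rev):
    """Return how far the installed patch lags, or None if it is absent."""
    installed = installed_revisions(showrev_output).get(patch_base)
    if installed is None:
        return None
    return max(latest_rev - installed, 0)
```

For example, revisions_behind(output, "999999", 10) would report the gap between the installed revision of placeholder patch 999999 and a latest revision of 10. Look up the real patch ID and latest revision for the OS release in question before using it.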

If the messages look like this:

Mar 24 06:58:10 aurora unix: warning: /pci@6,4000/scsi@4 (glm4):

Mar 24 06:58:10 aurora unix: SCSI bus DATA IN phase parity error

Mar 24 06:58:10 aurora unix: warning: ID[SUNWpd.glm.parity_check.6010]

Mar 24 06:58:10 aurora unix: warning: /pci@6,4000/scsi@4 (glm4):

Mar 24 06:58:10 aurora unix: Target 0 reducing sync. transfer rate

Mar 24 06:58:10 aurora unix: warning: ID[SUNWpd.glm.sync_wide_backoff.6014]

Mar 24 06:58:10 aurora unix: warning: /pci@6,4000/scsi@4/sd@0,0 (sd60):

Mar 24 06:58:10 aurora unix: SCSI transport failed: reason 'tran_err': retrying command                  

Check the termination, and check the cables for bent pins. SCSI bus DATA phase parity errors are usually caused by the cable and/or the termination. Also check for patches for glm and for disk firmware updates.
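
To find which controller to inspect, the parity messages can be tied back to the device named on the preceding warning line. This is a rough sketch that assumes the message layout shown in the excerpt above; the helper name parity_suspects is made up.

```python
import re

# Matches lines such as "... unix: warning: /pci@6,4000/scsi@4 (glm4):"
DEVICE_RE = re.compile(r"warning:\s*(/\S.*?)\s*\((\w+)\):\s*$", re.IGNORECASE)


def parity_suspects(lines):
    """Return {instance: device_path} for devices that logged a parity error."""
    suspects = {}
    last_device = None
    for line in lines:
        device = DEVICE_RE.search(line)
        if device:
            # Remember the most recent device; the detail line follows it.
            last_device = (device.group(2), device.group(1))
        elif "parity error" in line and last_device:
            suspects[last_device[0]] = last_device[1]
    return suspects
```

The cable and terminator to check are the ones on the bus behind each returned device path.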

If the errors look like this:

Jun 6 19:16:34 oerpsv01 unix: ID[SUNWssa.socal.link.5010] socal1: port 0:Fibre Channel is OFFLINE

Jun 6 19:16:34 oerpsv01 unix: ID[SUNWssa.socal.link.6010] socal1: port 0:Fibre Channel Loop is ONLINE

Jun 6 19:17:49 oerpsv01 unix: WARNING: /sbus@2,0/SUNW,socal@2,0/sf@0,0/ssd@w22000020370ed7ff,0 (ssd16):

Jun 6 19:17:49 oerpsv01 unix: SCSI transport failed: reason 'timeout':retrying command

Jun 6 19:19:14 oerpsv01 unix: WARNING: /sbus@2,0/SUNW,socal@2,0/sf@0,0/ssd@w22000020370edd68,0 (ssd12):

Jun 6 19:19:14 oerpsv01 unix: SCSI transport failed: reason 'timeout':retrying command

Jun 6 19:20:44 oerpsv01 unix: WARNING: /sbus@2,0/SUNW,socal@2,0/sf@0,0/ssd@w22000020370edd69,0 (ssd13):

Jun 6 19:20:44 oerpsv01 unix: SCSI transport failed: reason 'timeout':retrying command                  

These are typical error messages from an A5x00 array. You can see the Fibre Channel loop going offline and coming back online.

If the errors are on more than one disk, the first thing an engineer should check is the A5x00 patch matrix for the latest firmware for the array and the disks.

As for the hardware, the suspects are the GBICs, the fibre cables, the IB board on the array, or the FC-AL I/O board on the server side. Take the device path, for example /sbus@2,0/SUNW,socal@2,0/sf@0,0/ssd@w22000020370edd68,0, and enter it at http://spider.AUS/cgi-bin/device.info to find out which board is bad.
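
Whether the timeouts hit one disk or several can be checked by counting them per disk WWN. The sketch below assumes the ssd@w<WWN>,<lun> path form shown in the excerpt above and attributes each timeout to the most recent WARNING line; the function name is hypothetical.

```python
import re
from collections import Counter

# The WWN is the 16 hex digits after "ssd@w" in the device path.
WWN_RE = re.compile(r"ssd@w([0-9a-fA-F]{16}),")
TIMEOUT_RE = re.compile(r"SCSI transport failed: reason 'timeout'")


def timeouts_by_disk(lines):
    """Count 'timeout' transport failures per disk WWN."""
    counts = Counter()
    last_wwn = None
    for line in lines:
        wwn = WWN_RE.search(line)
        if wwn:
            last_wwn = wwn.group(1).lower()
        elif TIMEOUT_RE.search(line) and last_wwn:
            counts[last_wwn] += 1
    return counts
```

If len(timeouts_by_disk(lines)) is greater than 1, the errors span several disks, which is the case where the firmware check comes first.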

If you see errors like this:

Aug 6 07:49:59 bureau3 unix: WARNING: /pci@1f,0/pci@1/scsi@2/sd@2,0 (sd2):

Aug 6 07:49:59 bureau3 unix: Error for Command: write(10) ErrorLevel: Fatal

Aug 6 07:49:59 bureau3 unix: Requested Block: 6429986 ErrorBlock: 6429986

Aug 6 07:49:59 bureau3 unix: Vendor: SEAGATE Serial Number: NG031399

Aug 6 07:49:59 bureau3 unix: Sense Key: Not Ready

Aug 6 07:49:59 bureau3 unix: ASC: 0x4 (<vendor unique code 0x4>), ASCQ: 0x1,FRU: 0x2

Aug 6 07:49:59 bureau3 unix: WARNING: /pci@1f,0/pci@1/scsi@2/sd@2,0 (sd2):

Aug 6 07:49:59 bureau3 unix: Error for Command: write ErrorLevel: Fatal                  

*This tells you that the failing device is the disk at sd@2,0 (sd stands for SCSI disk), that is, target 2, LUN 0, on the onboard controller.
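
The device address and sense data in such a block can be pulled out mechanically. This is a minimal sketch that assumes the unit address has the form driver@target,lun with both numbers in hexadecimal, and that the Sense Key and ASC/ASCQ lines look like the excerpt above. The function names are made up.

```python
import re

UNIT_RE = re.compile(r"/(\w+)@([0-9a-fA-F]+),([0-9a-fA-F]+)$")
SENSE_RE = re.compile(r"Sense Key:\s*(.+?)\s*$")
ASC_RE = re.compile(r"ASC:\s*(0x[0-9a-fA-F]+).*?ASCQ:\s*(0x[0-9a-fA-F]+)")


def parse_unit(device_path):
    """Split the last path component into driver, target and LUN."""
    match = UNIT_RE.search(device_path.strip())
    if not match:
        return None  # e.g. WWN-style ssd@w... addresses are not handled
    return {
        "driver": match.group(1),
        "target": int(match.group(2), 16),
        "lun": int(match.group(3), 16),
    }


def parse_sense(lines):
    """Collect the sense key and ASC/ASCQ codes from an error block."""
    info = {}
    for line in lines:
        sense = SENSE_RE.search(line)
        if sense:
            info["sense_key"] = sense.group(1)
        asc = ASC_RE.search(line)
        if asc:
            info["asc"] = int(asc.group(1), 16)
            info["ascq"] = int(asc.group(2), 16)
    return info
```

Calling parse_unit on /pci@1f,0/pci@1/scsi@2/sd@2,0 gives driver sd, target 2, LUN 0.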

For A1000 and D1000 scsi errors:

Oct 7 16:30:00 uasympatico unix: WARNING: /sbus@1f,0/QLGC,isp@3,10000/sd@1,0(sd16):

Oct 7 16:30:00 uasympatico unix: SCSI transport failed: reason 'incomplete':retrying command

Oct 7 16:30:00 uasympatico unix: SCSI transport failed: reason 'incomplete':retrying command

Oct 7 16:30:01 uasympatico unix:

Oct 7 16:30:01 uasympatico unix:

Oct 7 16:30:11 uasympatico unix: WARNING: /sbus@1f,0/QLGC,isp@3,10000/sd@1,0(sd12):

Oct 7 16:30:11 uasympatico unix: WARNING: /sbus@1f,0/QLGC,isp@3,10000/sd@1,0(sd18):

Oct 7 16:30:11 uasympatico unix: SCSI transport failed: reason 'incomplete':retrying command

Oct 7 16:30:11 uasympatico unix: SCSI transport failed: reason 'incomplete':retrying command                  

*This tells you that the problem is on a QLogic differential SCSI card.

showrev -p tells you the revision of each installed patch. The system type tells you whether the interface is PCI, SCSI, or FC-AL. The OS release tells you which patches are available for that system. prtdiag gives you a breakdown of the hardware.
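
The interface type can also be read off the device path in a message. The sketch below only encodes the path patterns that appear in the examples in this document (socal, QLGC,isp and scsi@); the labels and the function name are illustrative, and other hardware will use different path components.

```python
def interface_hint(device_path):
    """Return (interface, first_checks) guessed from a device path."""
    if "socal@" in device_path:
        return ("fcal", "A5x00 patch matrix, GBICs, fibre cables, IB and I/O boards")
    if "QLGC,isp@" in device_path:
        return ("qlogic-differential-scsi", "QLogic card, A1000/D1000 cabling")
    if "/scsi@" in device_path:
        return ("scsi-glm", "glm patch level, cables, termination, disk firmware")
    return ("unknown", "collect prtdiag -v and showrev -p output")
```

This is a first-pass sort only; the prtdiag -v output remains the authority on what hardware is installed.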

Internal Summary

  • karenv@east.sun.cvom (Karen Vergakes x21036)
  • http://spider.AUS/cgi-bin/device.info

Submitter Karen Vergakes
Applies To Hardware, Hardware/Ultra Enterprise/Servers/Enterprise 6500, Hardware/Ultra Enterprise/Servers/Enterprise 6000, Hardware/Ultra Enterprise/Servers/Enterprise 5500, Hardware/Ultra Enterprise/Servers/Enterprise 5000, Hardware/Ultra Enterprise/Servers/Enterprise 4500, Hardware/Ultra Enterprise/Servers/Enterprise 4000, Hardware/Ultra Enterprise/Servers/Enterprise 3500, Hardware/Ultra Enterprise/Servers/Enterprise 3000, Hardware/Ultra Enterprise/Servers/Enterprise 450, Hardware/Ultra Enterprise/Servers/Enterprise 250, Hardware/Ultra Workstations/Ultra 80, Hardware/Ultra Workstations/Ultra 60, Hardware/Ultra Workstations/Ultra 30, Hardware/Ultra Workstations/Ultra 10, Hardware/Ultra Workstations/Ultra 5, Hardware/Ultra Workstations/Ultra 2, Hardware/Ultra Workstations/Ultra 1
Attachments (none)


Sun Proprietary/Confidential: Internal Use Only