| Title: | DIGITAL UNIX (FORMERLY KNOWN AS DEC OSF/1) | 
| Notice: | Welcome to the Digital UNIX Conference | 
| Moderator: | SMURF::DENHAM | 
| Created: | Thu Mar 16 1995 | 
| Last Modified: | Fri Jun 06 1997 | 
| Last Successful Update: | Fri Jun 06 1997 | 
| Number of topics: | 10068 | 
| Total number of notes: | 35879 | 
Hi,
    
    My customer has an Alpha 4100 running Digital UNIX 3.2G with an HSZ40.
    They are running SAP and Informix on raw devices. We are seeing
    intermittent CAM SCSI errors on the raw device, and we noticed a "lost
    cylinder" after zeroing the label on an RZ29B-VA, as shown below:
    
Here is some information about the problem.
The disk layout before the error:
-----------------------------------------------------
# /dev/rrzc33c:
type: SCSI
disk: HSZ40
label: 
flags:
bytes/sector: 512
sectors/track: 113
tracks/cylinder: 20
sectors/cylinder: 2260
cylinders: 3707
sectors/unit: 8378028
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0		# milliseconds
track-to-track seek: 0	# milliseconds
drivedata: 0 
8 partitions:
#        size   offset    fstype   [fsize bsize   cpg]
  a:       32        0    unused     1024  8192       	# (Cyl.    0 - 0*)
  b:        0        0    unused     1024  8192       	# (Cyl.    0 - -1)
  c:  8378028        0    unused     1024  8192       	# (Cyl.    0 - 3707*)
  d:        0        0    unused     1024  8192       	# (Cyl.    0 - -1)
  e:        0        0    unused     1024  8192       	# (Cyl.    0 - -1)
  f:        0        0    unused     1024  8192       	# (Cyl.    0 - -1)
  g:  4188998       32    unused     1024  8192       	# (Cyl.    0*- 1853*)
  h:  4188998  4189030    unused     1024  8192       	# (Cyl. 1853*- 3707*)
-----------------------------------------------------
SAP R3 then ran some work against the Informix RDBMS. After some time, the
Informix database crashed because a chunk went offline.
Informix uses partitions g and h as its raw devices.
uerf showed the following entry:
						  uerf version 4.2-011 (122)
********************************* ENTRY     2.
*********************************
----- EVENT INFORMATION -----
EVENT CLASS                             ERROR EVENT 
OS EVENT TYPE                  199.     CAM SCSI 
SEQUENCE NUMBER                 16.
OPERATING SYSTEM                        DEC OSF/1 
OCCURRED/LOGGED ON                      Thu May 29 16:43:51 1997
OCCURRED ON SYSTEM                      posmal1 
SYSTEM ID                 x00070016
SYSTYPE                   x00000000
PROCESSOR COUNT                  2.
PROCESSOR WHO LOGGED      x00000000
----- UNIT INFORMATION -----
CLASS                         x0000     DISK 
SUBSYSTEM                     x0000     DISK 
BUS #                         x0004
                              x0112     LUN x2
                                        TARGET x1
----- CAM STRING -----
ROUTINE NAME                            cdisk_complete 
----- CAM STRING -----
                                        Retries Exhausted 
----- CAM STRING -----
ERROR TYPE                              Hard Error Detected 
----- CAM STRING -----
DEVICE NAME                             DEC     HSZ4 
----- CAM STRING -----
                                        Active CCB at time of error 
----- CAM STRING -----
                                        CCB request completed with an error 
ERROR - os_std, os_type = 11, std_type = 10
----- ENT_CCB_SCSIIO -----
*MY ADDR                  x3FE29B28
CCB LENGTH                    x00C0
FUNC CODE            x01
CAM_STATUS                    x0084     CAM_REQ_CMP_ERR 
                                        AUTOSNS_VALID 
PATH ID              4.
TARGET ID            2.
TARGET LUN           2.
CAM FLAGS                 x00000442
                                        CAM_QUEUE_ENABLE 
                                        CAM_DIR_IN 
                                        CAM_SIM_QFRZDIS 
*PDRV_PTR                 x3FE29828
*NEXT_CCB                 x00000000
*REQ_MAP                  x3FE08400
VOID (*CAM_CBFCNP)()      x00526660
*DATA_PTR                 x400A5828
DXFER_LEN                 x00002000
*SENSE_PTR                x3FE29850
SENSE_LEN            xA0
CDB_LEN              x0A
SGLIST_CNT                    x0000
CAM_SCSI_STATUS               x0002     SCSI_STAT_CHECK_CONDITION 
SENSE_RESID          x8E
RESID                     x00002000
CAM_CDB_IO           x000000100000ACD47F000028
CAM_TIMEOUT               x0000003C
MSGB_LEN                      x0000
VU_FLAGS                      x4000
TAG_ACTION           x20
----- CAM STRING -----
                                        Error, exception, or abnormal 
                                         _condition 
----- CAM STRING -----
                                        ILLEGAL REQUEST - Illegal request or 
                                         _CDB parameter 
----- ENT_SENSE_DATA -----
ERROR CODE                    x0070     CODE x70
SEGMENT              x00
SENSE KEY                     x0005     ILLEGAL REQ 
INFO BYTE 3          x00
INFO BYTE 2          x00
INFO BYTE 1          x00
INFO BYTE 0          x00
ADDITION LEN         x0A
CMD SPECIFIC 3       x00
CMD SPECIFIC 2       x00
CMD SPECIFIC 1       x00
CMD SPECIFIC 0       x00
ASC                  x21
ASQ                  x00
FRU                  x00
SENSE SPECIFIC       x0200C0
ADDITIONAL SENSE    
0000:   02000000  00000000  00000000  00000000        *................*
0010:   00000000  00000000  00000000  00000000        *................*
0020:   00000000  00000000  00000000  00000000        *................*
0030:   00000000  00000000  00000000  00000000        *................*
0040:   00000000  00000000  00000000  00000000        *................*
0050:   00000000  00000000  00000000  00000000        *................*
0060:   00000000  00000000  00000000  00000000        *................*
0070:   00000000  00000000  00000000  00000000        *................*
0080:   00000000  00000000  00000000  00000000        *................*
0090:   7E250000  00005E3C  00000000  00000000        *..%~<^..........*
****************END Of uerf extraction****************************
After the error, the following steps were carried out:
1) disklabel -z /dev/rrzc33c
2) disklabel -wr /dev/rrzc33c hsz40
The total cylinder count is now one less: 3707 before the error,
  3706 after.
# /dev/rrzc33c:
type: SCSI
disk: HSZ40
label: 
flags: dynamic_geometry
bytes/sector: 512
sectors/track: 113
tracks/cylinder: 20
sectors/cylinder: 2260
cylinders: 3706
sectors/unit: 8377528
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0		# milliseconds
track-to-track seek: 0	# milliseconds
drivedata: 0 
8 partitions:
#        size   offset    fstype   [fsize bsize   cpg]
  a:   200000        0    unused     1024  8192       	# (Cyl.    0 - 88*)
  b:        0        0    unused     1024  8192       	# (Cyl.    0 - -1)
  c:  8377528        0    unused     1024  8192       	# (Cyl.    0 - 3706*)
  d:        0        0    unused     1024  8192       	# (Cyl.    0 - -1)
  e:        0        0    unused     1024  8192       	# (Cyl.    0 - -1)
  f:        0        0    unused     1024  8192       	# (Cyl.    0 - -1)
  g:  4088700   200004    unused     1024  8192       	# (Cyl.   88*- 1897*)
  h:  4088700  4288708    unused     1024  8192       	# (Cyl. 1897*- 3706*)
*************End of disklabel*********************************
What does the error mean, and why is one cylinder missing?
    I would appreciate any help.
    
    Thks and rgds.
    
    Ong
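
The "lost cylinder" can be checked against the two disklabels with a little arithmetic. The sketch below only uses figures from the labels above (sectors/cylinder 2260, sectors/unit 8378028 before and 8377528 after); the function name `geometry` is made up for illustration. It suggests the unit shrank by 500 sectors, not by a full 2260-sector cylinder, and the cylinder count drops only because the total crosses a cylinder boundary.

```python
# Figures copied from the two disklabels; nothing here talks to a real disk.
SECTORS_PER_CYL = 2260      # 113 sectors/track * 20 tracks/cylinder
BYTES_PER_SECTOR = 512

def geometry(sectors_per_unit, sectors_per_cyl=SECTORS_PER_CYL):
    """Return (whole cylinders, leftover sectors) for a unit of the given size."""
    return divmod(sectors_per_unit, sectors_per_cyl)

before = 8378028            # sectors/unit in the first label
after = 8377528             # sectors/unit after disklabel -z / -wr

print(geometry(before))     # (3707, 208)
print(geometry(after))      # (3706, 1968)

lost = before - after
print(lost, lost * BYTES_PER_SECTOR)   # 500 256000
```

So the difference is about 250 KB at the end of the unit, which is why the old partition h (offset 4189030 + size 4188998 = 8378028) no longer fits.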
       
| T.R | Title | User | Personal Name | Date | Lines | 
|---|---|---|---|---|---|
| 10025.1 | Haven't a clue why you're getting illegal request. | SSDEVO::ROLLOW | Dr. File System's Home for Wayward Inodes. | Tue Jun 03 1997 10:52 | 21 | 
| The size change is pretty simple. When you initialize a disk on an HS family controller, it uses some of the space at the end of the disk for its configuration information. The HSD and HSJ use the space to implement MSCP Forced Error, since that is expected. The HSZ may do something similar. If the disk is used in a RAID set, stripe set or mirror, the space also holds information about the other members of the array, etc. This is the reason the disk ceases to be reported as its native type and shows up as HSZ40 instead. There is an option on the "add disk" command that disables the allocation of this space and presents the disk exactly as it is: TRANSPORTABLE. You certainly lose the ability to make the disk a member of anything interesting with respect to the controller. I don't know what other features you lose. This is significant because firmware versions starting with V2.7Z have commands to turn single devices into single-member mirrors, then add a member to the mirror to copy the data and break the mirror when done (clone). Obviously this can also be used to promote single devices to mirrors. | |||||
| 10025.2 | | SMURF::KNIGHT | Fred Knight | Tue Jun 03 1997 16:39 | 21 |
| You also need to look at the HSZ40 blitzes that have been sent out. There is one describing conditions that can cause the HSZ40 to change the size of a device (all by itself). You may have hit that case. You could also ask in the HSZ40 notes file. I also find it interesting that the first disklabel is NOT flagged as dynamic_geometry, yet the second one is. That tells me that the disk was not hooked to an HSZ40 at the time it was first disklabeled! If it was disklabeled with the name HSZ40 (but connected as a single spindle), and the disk was then transferred to the HSZ40 and used (without any re-labeling), then this is EXACTLY what you will see. The label will have the older, larger size, yet the H/W will have the current, smaller size. As Alan pointed out, the HSZ40 reserves some of the media space for its own metadata. Fred | |||||
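
The ILLEGAL REQUEST in the uerf entry looks consistent with the stale-label explanation. In the SCSI standard, ASC x21 / ASCQ x00 means "logical block address out of range". The sketch below decodes the logged CDB to see where the failing read landed. It rests on one assumption not stated anywhere in the thread: that uerf prints CAM_CDB_IO with CDB byte 0 as the rightmost byte. That reading puts opcode x28 (READ(10)) first and yields a 16-block transfer, which matches DXFER_LEN x2000 (8192 bytes), so it seems plausible, but it is not confirmed. The function name `decode_read10` is hypothetical.

```python
def decode_read10(uerf_hex, cdb_len=10):
    """Decode a READ(10) CDB from the hex string uerf prints as CAM_CDB_IO.

    ASSUMPTION: uerf shows the CDB with byte 0 rightmost, so the byte
    string is reversed before it is interpreted.
    """
    cdb = bytes.fromhex(uerf_hex)[::-1][:cdb_len]
    if cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB: opcode 0x%02x" % cdb[0])
    lba = int.from_bytes(cdb[2:6], "big")      # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")   # bytes 7-8: transfer length
    return lba, blocks

lba, blocks = decode_read10("000000100000ACD47F000028")   # CDB_LEN x0A
print(lba, blocks)                 # 8377516 16

old_size = 8378028                 # sectors/unit in the first (stale) label
new_size = 8377528                 # sectors/unit after relabeling

end = lba + blocks                 # first block past the end of the read
print(end <= old_size)             # True: legal according to the old label
print(end - new_size)              # 4: blocks past the end of the real unit
```

Under that assumption, the read was aimed at the tail of the old partition h and ran 4 blocks past the capacity the HSZ40 actually presents, which would explain both the sense data and the Informix chunk going offline.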