| Title: | DIGITAL UNIX (FORMERLY KNOWN AS DEC OSF/1) |
| Notice: | Welcome to the Digital UNIX Conference |
| Moderator: | SMURF::DENHAM |
| Created: | Thu Mar 16 1995 |
| Last Modified: | Fri Jun 06 1997 |
| Last Successful Update: | Fri Jun 06 1997 |
| Number of topics: | 10068 |
| Total number of notes: | 35879 |
Hi,
My customer has an Alpha 4100 running Digital UNIX 3.2G with an HSZ40. They
are running SAP and Informix on raw devices. We are seeing
intermittent CAM SCSI errors on the raw device, and we noticed a "lost
cylinder" after zeroing the label on an RZ29B-VA, as shown below.
The disk layout before the error:
-----------------------------------------------------
# /dev/rrzc33c:
type: SCSI
disk: HSZ40
label:
flags:
bytes/sector: 512
sectors/track: 113
tracks/cylinder: 20
sectors/cylinder: 2260
cylinders: 3707
sectors/unit: 8378028
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0 # milliseconds
track-to-track seek: 0 # milliseconds
drivedata: 0
8 partitions:
# size offset fstype [fsize bsize cpg]
a: 32 0 unused 1024 8192 # (Cyl. 0 - 0*)
b: 0 0 unused 1024 8192 # (Cyl. 0 - -1)
c: 8378028 0 unused 1024 8192 # (Cyl. 0 - 3707*)
d: 0 0 unused 1024 8192 # (Cyl. 0 - -1)
e: 0 0 unused 1024 8192 # (Cyl. 0 - -1)
f: 0 0 unused 1024 8192 # (Cyl. 0 - -1)
g: 4188998 32 unused 1024 8192 # (Cyl. 0*- 1853*)
h: 4188998 4189030 unused 1024 8192 # (Cyl. 1853*- 3707*)
-----------------------------------------------------
SAP R/3, running on the Informix RDBMS, then did some work. After some time,
the Informix database crashed because a chunk went offline.
Informix uses partitions g and h as its raw devices.
uerf showed the following:
uerf version 4.2-011 (122)
********************************* ENTRY 2.
*********************************
----- EVENT INFORMATION -----
EVENT CLASS ERROR EVENT
OS EVENT TYPE 199. CAM SCSI
SEQUENCE NUMBER 16.
OPERATING SYSTEM DEC OSF/1
OCCURRED/LOGGED ON Thu May 29 16:43:51 1997
OCCURRED ON SYSTEM posmal1
SYSTEM ID x00070016
SYSTYPE x00000000
PROCESSOR COUNT 2.
PROCESSOR WHO LOGGED x00000000
----- UNIT INFORMATION -----
CLASS x0000 DISK
SUBSYSTEM x0000 DISK
BUS # x0004
x0112 LUN x2
TARGET x1
----- CAM STRING -----
ROUTINE NAME cdisk_complete
----- CAM STRING -----
Retries Exhausted
----- CAM STRING -----
ERROR TYPE Hard Error Detected
----- CAM STRING -----
DEVICE NAME DEC HSZ4
----- CAM STRING -----
Active CCB at time of error
----- CAM STRING -----
CCB request completed with an error
ERROR - os_std, os_type = 11, std_type = 10
----- ENT_CCB_SCSIIO -----
*MY ADDR x3FE29B28
CCB LENGTH x00C0
FUNC CODE x01
CAM_STATUS x0084 CAM_REQ_CMP_ERR
AUTOSNS_VALID
PATH ID 4.
TARGET ID 2.
TARGET LUN 2.
CAM FLAGS x00000442
CAM_QUEUE_ENABLE
CAM_DIR_IN
CAM_SIM_QFRZDIS
*PDRV_PTR x3FE29828
*NEXT_CCB x00000000
*REQ_MAP x3FE08400
VOID (*CAM_CBFCNP)() x00526660
*DATA_PTR x400A5828
DXFER_LEN x00002000
*SENSE_PTR x3FE29850
SENSE_LEN xA0
CDB_LEN x0A
SGLIST_CNT x0000
CAM_SCSI_STATUS x0002 SCSI_STAT_CHECK_CONDITION
SENSE_RESID x8E
RESID x00002000
CAM_CDB_IO x000000100000ACD47F000028
CAM_TIMEOUT x0000003C
MSGB_LEN x0000
VU_FLAGS x4000
TAG_ACTION x20
----- CAM STRING -----
Error, exception, or abnormal
_condition
----- CAM STRING -----
ILLEGAL REQUEST - Illegal request or
_CDB parameter
----- ENT_SENSE_DATA -----
ERROR CODE x0070 CODE x70
SEGMENT x00
SENSE KEY x0005 ILLEGAL REQ
INFO BYTE 3 x00
INFO BYTE 2 x00
INFO BYTE 1 x00
INFO BYTE 0 x00
ADDITION LEN x0A
CMD SPECIFIC 3 x00
CMD SPECIFIC 2 x00
CMD SPECIFIC 1 x00
CMD SPECIFIC 0 x00
ASC x21
ASQ x00
FRU x00
SENSE SPECIFIC x0200C0
ADDITIONAL SENSE
0000: 02000000 00000000 00000000 00000000 *................*
0010: 00000000 00000000 00000000 00000000 *................*
0020: 00000000 00000000 00000000 00000000 *................*
0030: 00000000 00000000 00000000 00000000 *................*
0040: 00000000 00000000 00000000 00000000 *................*
0050: 00000000 00000000 00000000 00000000 *................*
0060: 00000000 00000000 00000000 00000000 *................*
0070: 00000000 00000000 00000000 00000000 *................*
0080: 00000000 00000000 00000000 00000000 *................*
0090: 7E250000 00005E3C 00000000 00000000 *..%~<^..........*
****************END Of uerf extraction****************************
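The failing command can be decoded from the CCB fields above. The following is a minimal sketch, not a uerf feature. It assumes uerf prints CAM_CDB_IO as a single little-endian hex value, so the CDB bytes appear in reverse order. That assumption is supported by the result: the opcode comes out as 0x28 (READ(10)), and 16 blocks of 512 bytes match DXFER_LEN x00002000. The function name decode_read10 is made up for this illustration.

```python
# Assumption: uerf shows CAM_CDB_IO as one little-endian hex number, so the
# CDB bytes must be reversed to get the order in which they go on the bus.
CAM_CDB_IO = "000000100000ACD47F000028"   # from the uerf entry above
CDB_LEN = 0x0A                            # CDB_LEN field from the same entry
BYTES_PER_SECTOR = 512                    # from the disklabel

def decode_read10(cam_cdb_io_hex, cdb_len=CDB_LEN):
    """Return (opcode, lba, blocks) for a 10-byte READ/WRITE CDB."""
    raw = bytes.fromhex(cam_cdb_io_hex)[::-1]    # restore CDB byte order
    cdb = raw[:cdb_len]
    opcode = cdb[0]                              # 0x28 is READ(10)
    lba = int.from_bytes(cdb[2:6], "big")        # bytes 2..5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")     # bytes 7..8: transfer length
    return opcode, lba, blocks

opcode, lba, blocks = decode_read10(CAM_CDB_IO)
end = lba + blocks                               # first block past the request

print(f"opcode=0x{opcode:02X} lba={lba} blocks={blocks} "
      f"bytes={blocks * BYTES_PER_SECTOR}")
# Compare with the two sectors/unit values in the disklabels in this note.
for name, capacity in (("old label", 8378028), ("new label", 8377528)):
    verdict = "fits" if end <= capacity else f"runs {end - capacity} blocks past the end"
    print(f"{name}: capacity {capacity}, request {verdict}")
```

Under that assumption the request is a 16-block read starting at block 8377516. It fits inside the 8378028 sectors of the old label but ends 4 blocks beyond the 8377528 sectors the unit reports after relabeling. In the SCSI specification, sense key 5 with ASC x21, ASCQ x00 is "logical block address out of range", which would be consistent with this reading.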
After the error, the following steps were carried out:
1) disklabel -z /dev/rrzc33c
2) disklabel -wr /dev/rrzc33c hsz40
The cylinder count dropped by one: 3707 before the error,
3706 after.
# /dev/rrzc33c:
type: SCSI
disk: HSZ40
label:
flags: dynamic_geometry
bytes/sector: 512
sectors/track: 113
tracks/cylinder: 20
sectors/cylinder: 2260
cylinders: 3706
sectors/unit: 8377528
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0 # milliseconds
track-to-track seek: 0 # milliseconds
drivedata: 0
8 partitions:
# size offset fstype [fsize bsize cpg]
a: 200000 0 unused 1024 8192 # (Cyl. 0 - 88*)
b: 0 0 unused 1024 8192 # (Cyl. 0 - -1)
c: 8377528 0 unused 1024 8192 # (Cyl. 0 - 3706*)
d: 0 0 unused 1024 8192 # (Cyl. 0 - -1)
e: 0 0 unused 1024 8192 # (Cyl. 0 - -1)
f: 0 0 unused 1024 8192 # (Cyl. 0 - -1)
g: 4088700 200004 unused 1024 8192 # (Cyl. 88*- 1897*)
h: 4088700 4288708 unused 1024 8192 # (Cyl. 1897*- 3706*)
*************End of disklabel*********************************
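The arithmetic behind the cylinder counts in the two labels can be sketched as follows. It uses only the numbers printed by disklabel, and assumes the cylinder count is simply sectors/unit divided by sectors/cylinder, rounded down. The helper name geometry is made up for this illustration.

```python
BYTES_PER_SECTOR = 512
SECTORS_PER_TRACK = 113
TRACKS_PER_CYLINDER = 20
SECTORS_PER_CYLINDER = SECTORS_PER_TRACK * TRACKS_PER_CYLINDER   # 2260

def geometry(sectors_per_unit):
    """Return (whole cylinders, leftover sectors) for a unit of this size."""
    return divmod(sectors_per_unit, SECTORS_PER_CYLINDER)

before = geometry(8378028)      # sectors/unit in the first label
after = geometry(8377528)       # sectors/unit in the second label
lost_sectors = 8378028 - 8377528

print("before:", before)
print("after: ", after)
print("lost:", lost_sectors, "sectors =", lost_sectors * BYTES_PER_SECTOR, "bytes")
```

The unit shrank by 500 sectors (256000 bytes), which is less than one cylinder of 2260 sectors. The old size was only 208 sectors past a cylinder boundary, so losing 500 sectors drops the whole-cylinder count from 3707 to 3706.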
What does the error mean, and why is one cylinder missing?
I would appreciate any help.
Thanks and regards,
Ong
| T.R | Title | User | Personal Name | Date | Lines |
|---|---|---|---|---|---|
| 10025.1 | Haven't a clue why you're getting illegal request. | SSDEVO::ROLLOW | Dr. File System's Home for Wayward Inodes. | Tue Jun 03 1997 10:52 | 21 |
The size change is pretty simple. When you initialize a disk on an HS family controller, it uses some of the space at the end of the disk for its configuration information. The HSJ uses the space to implement MSCP Forced Error, since that is expected. The HSZ may do something similar. If the disk is used in a RAID set, stripe set or mirror, the space also holds information about the other members of the array, etc. This is the reason the disk ceases to be reported as its native type, but shows up as HSZ40.

There is an option on the "add disk" command to disable the allocation of this space and present the disk exactly as it is: TRANSPORTABLE. You certainly lose the ability to make the disk a member of anything interesting with respect to the controller. I don't know what other features you lose.

This is significant because firmware versions starting with V2.7Z have commands to change single devices into single-member mirrors, then add a member to the mirror to copy the data, and break the mirror when done (clone). Obviously this can also be used to promote single devices to mirrors.
| 10025.2 | | SMURF::KNIGHT | Fred Knight | Tue Jun 03 1997 16:39 | 21 |
You also need to look at the HSZ40 blitzes that have been sent out. There is one where some conditions can cause the HSZ40 to change the size of a device (all by itself). You may have hit that case. You could also ask in the HSZ40 notes file.

I also find it interesting that the first disklabel is NOT flagged as dynamic_geometry, yet the second one is. That tells me that the disk was not hooked to an HSZ40 at the time it was first disklabeled! If it was disklabeled with the name HSZ40 (but connected as a single spindle), and the disk was then transferred to the HSZ40 and used (without any re-labeling), then this is EXACTLY what you will see. The label will have the older, larger size, yet the H/W will have the current, smaller size. As Alan pointed out, the HSZ40 reserves some of the media space for its own metadata.

Fred
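The stale-label situation described in this reply can be checked against the numbers in the base note. This is a rough sketch only: it copies the partition sizes and offsets from the two labels and reports which partitions would extend past a given unit size. The dictionary names and the overruns helper are made up for this illustration.

```python
# Partition table entries as (size, offset) in sectors, copied from the
# two disklabel listings in the base note (empty partitions omitted).
OLD_LABEL = {
    "a": (32, 0),
    "c": (8378028, 0),
    "g": (4188998, 32),
    "h": (4188998, 4189030),
}
NEW_LABEL = {
    "a": (200000, 0),
    "c": (8377528, 0),
    "g": (4088700, 200004),
    "h": (4088700, 4288708),
}
CURRENT_CAPACITY = 8377528   # sectors/unit reported after relabeling

def overruns(label, capacity):
    """Map partition name -> number of sectors it extends past capacity."""
    return {
        name: offset + size - capacity
        for name, (size, offset) in label.items()
        if offset + size > capacity
    }

print("old label:", overruns(OLD_LABEL, CURRENT_CAPACITY))
print("new label:", overruns(NEW_LABEL, CURRENT_CAPACITY))
```

With the old label, partitions c and h each reach 500 sectors past what the unit now reports, so I/O near the end of the Informix chunk on partition h would address blocks the controller no longer presents. With the new label nothing overruns.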