| Title: | FDDI - The Next Generation |
| Moderator: | NETCAD::STEFANI |
| Created: | Thu Apr 27 1989 |
| Last Modified: | Thu Jun 05 1997 |
| Last Successful Update: | Fri Jun 06 1997 |
| Number of topics: | 2259 |
| Total number of notes: | 8590 |
Hello!
I have a problem similar to the one described in topic 1923.0. The
error messages are the same, but some status values are different, and
every time the problem occurs we get only 2 messages.
There are no control timeout messages after reboot, but there are
"FATAL ERROR DETECTED BY DATALINK" messages.
Immediately after reboot there are 2 errors on PEA0, and then once
or twice a week we get the same errors on the running system. In this
case all network connections are broken and are restarted after some
minutes.
System description: 2x OpenVMS in a cluster, AS2100 with DEFPA.
If this is the same problem as described in 1923.0, please
give me a pointer to the patch, because topic 1923 ends with a CLD
request and I could not find a patch.
ALPLAN02_062 includes sys$fwdriver for V6.1, and patch
ALPLAN03_062 is on hold (?).
Thanks!
Gabriel
Gabriel Balogh @ BRC
V M S SYSTEM ERROR REPORT COMPILED 13-FEB-1997 11:01:47
PAGE 1.
******************************* ENTRY 133. *******************************
ERROR SEQUENCE 1. LOGGED ON: CPU_TYPE 00000005
DATE/TIME 29-JAN-1997 17:46:20.50 SYS_TYPE 00000009
SYSTEM UPTIME: 0 DAYS 00:00:17
SCS NODE: BCPUPP OpenVMS AXP V6.2
HW_MODEL: 0000045F Hardware Model = 1119.
DEVICE ATTENTION AlphaServer 2100 5/250
NI-SCS SUB-SYSTEM, BCPUPP$PEA0:
FATAL ERROR DETECTED BY DATALINK
STATUS 8A2DF200
00001201
DATALINK UNIT 0001
DATALINK NAME 41574603
00000000
00000000
00000000
DATALINK NAME = FWA1:
REMOTE NODE 00000000
00000000
00000000
00000000
REMOTE ADDR 00000000
0000
LOCAL ADDR 000400AA
0401
ETHERNET ADDR = 0E-01-01-00-00-00
ERROR CNT 0001
1. ERROR OCCURRENCES THIS ENTRY
UCB$L_ERRCNT 00000001
1. ERRORS THIS UNIT
V M S SYSTEM ERROR REPORT COMPILED 13-FEB-1997 11:01:47
PAGE 2.
******************************* ENTRY 134. *******************************
ERROR SEQUENCE 2. LOGGED ON: CPU_TYPE 00000005
DATE/TIME 29-JAN-1997 17:46:21.36 SYS_TYPE 00000009
SYSTEM UPTIME: 0 DAYS 00:00:18
SCS NODE: BCPUPP OpenVMS AXP V6.2
HW_MODEL: 0000045F Hardware Model = 1119.
DEVICE ATTENTION AlphaServer 2100 5/250
NI-SCS SUB-SYSTEM, BCPUPP$PEA0:
FATAL ERROR DETECTED BY DATALINK
STATUS 00000400
00001200
DATALINK UNIT 0001
DATALINK NAME 41574603
00000000
00000000
00000000
DATALINK NAME = FWA1:
REMOTE NODE 00000000
00000000
00000000
00000000
REMOTE ADDR 00000000
0000
LOCAL ADDR 000400AA
0401
ETHERNET ADDR = 0E-01-01-00-00-00
ERROR CNT 0001
1. ERROR OCCURRENCES THIS ENTRY
UCB$L_ERRCNT 00000002
2. ERRORS THIS UNIT
V M S SYSTEM ERROR REPORT COMPILED 13-FEB-1997 11:01:47
PAGE 3.
******************************* ENTRY 146. *******************************
ERROR SEQUENCE 3114. LOGGED ON: CPU_TYPE 00000005
DATE/TIME 30-JAN-1997 13:37:29.84 SYS_TYPE 00000009
SYSTEM UPTIME: 0 DAYS 19:51:23
SCS NODE: BCPUPP OpenVMS AXP V6.2
HW_MODEL: 0000045F Hardware Model = 1119.
ERL$LOGMESSAGE AlphaServer 2100 5/250
NI-SCS SUB-SYSTEM, _BCPUPP$PEA0:
PORT HAS CLOSED VIRTUAL CIRCUIT
LOCAL STATION ADDRESS, FFFFFFFFFF00(X)
LOCAL SYSTEM ID, 000000000401(X)
REMOTE STATION ADDRESS, 0000000000DE(X)
REMOTE SYSTEM ID, 000000000402(X)
UCB$L_ERTCNT 00000032
50. RETRIES REMAINING
UCB$L_ERTMAX 00000032
50. RETRIES ALLOWABLE
UCB$L_ERRCNT 00000003
3. ERRORS THIS UNIT
PPD$B_PORT 00
REMOTE NODE # 0.
PPD$B_STATUS 00
PPD$B_OPC 00
UNKNOWN OPCODE
PPD$B_FLAGS 00
V M S SYSTEM ERROR REPORT COMPILED 13-FEB-1997 11:01:47
PAGE 4.
******************************* ENTRY 148. *******************************
ERROR SEQUENCE 3116. LOGGED ON: CPU_TYPE 00000005
DATE/TIME 30-JAN-1997 13:38:28.29 SYS_TYPE 00000009
SYSTEM UPTIME: 0 DAYS 19:52:21
SCS NODE: BCPUPP OpenVMS AXP V6.2
HW_MODEL: 0000045F Hardware Model = 1119.
DEVICE ATTENTION AlphaServer 2100 5/250
NI-SCS SUB-SYSTEM, BCPUPP$PEA0:
FATAL ERROR DETECTED BY DATALINK
STATUS 0000045C
00001201
DATALINK UNIT 0001
DATALINK NAME 41574603
00000000
00000000
00000000
DATALINK NAME = FWA1:
REMOTE NODE 00000000
00000000
00000000
00000000
REMOTE ADDR 00000000
0000
LOCAL ADDR 000400AA
0401
ETHERNET ADDR = 0E-01-01-00-00-00
ERROR CNT 0001
1. ERROR OCCURRENCES THIS ENTRY
UCB$L_ERRCNT 00000004
4. ERRORS THIS UNIT
V M S SYSTEM ERROR REPORT COMPILED 13-FEB-1997 11:01:47
PAGE 5.
******************************* ENTRY 149. *******************************
ERROR SEQUENCE 3117. LOGGED ON: CPU_TYPE 00000005
DATE/TIME 30-JAN-1997 13:38:32.30 SYS_TYPE 00000009
SYSTEM UPTIME: 0 DAYS 19:52:25
SCS NODE: BCPUPP OpenVMS AXP V6.2
HW_MODEL: 0000045F Hardware Model = 1119.
DEVICE ATTENTION AlphaServer 2100 5/250
NI-SCS SUB-SYSTEM, BCPUPP$PEA0:
FATAL ERROR DETECTED BY DATALINK
STATUS 00000400
00001200
DATALINK UNIT 0001
DATALINK NAME 41574603
00000000
00000000
00000000
DATALINK NAME = FWA1:
REMOTE NODE 00000000
00000000
00000000
00000000
REMOTE ADDR 00000000
0000
LOCAL ADDR 000400AA
0401
ETHERNET ADDR = 0E-01-01-00-00-00
ERROR CNT 0001
1. ERROR OCCURRENCES THIS ENTRY
UCB$L_ERRCNT 00000005
5. ERRORS THIS UNIT
ANAL ERRLOG.SYS/ERR/FULL/SINCE=15-JAN-1997
00:00:00.00/INCL=PEA0/OUT=V6.TXT/ENTR=(START:133,END:149)
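The raw hex fields in the entries above can be decoded by hand. The following is only a sketch, assuming the report prints each longword and word with its most significant byte leftmost (so the bytes must be reversed to get memory order), that DATALINK NAME is a counted ASCII string, and that LOCAL ADDR holds a DECnet Phase IV style address (AA-00-04-00 followed by a 16-bit value with a 6-bit area and a 10-bit node number). The function names are made up for illustration. The first decode agrees with the report's own "DATALINK NAME = FWA1:" line.

```python
def memory_order(hex_text):
    """Return the bytes of a dumped hex value in memory order.

    Assumption: the dump shows the most significant byte first, so the
    little-endian memory image is the reverse of what is printed.
    """
    return bytes.fromhex(hex_text)[::-1]


def decode_datalink_name(name_longword, unit_word):
    """Decode a counted ASCII device name plus unit number, e.g. FWA1:."""
    raw = memory_order(name_longword)
    count = raw[0]                          # first byte is the character count
    name = raw[1:1 + count].decode("ascii")
    unit = int(unit_word, 16)
    return f"{name}{unit}:"


def decode_local_addr(addr_longword, addr_word):
    """Decode the 6-byte station address and the DECnet area/node it implies."""
    raw = memory_order(addr_longword) + memory_order(addr_word)
    mac = "-".join(f"{b:02X}" for b in raw)
    decnet = int.from_bytes(raw[4:6], "little")
    area, node = decnet >> 10, decnet & 0x3FF
    return mac, (area, node)


device = decode_datalink_name("41574603", "0001")
address = decode_local_addr("000400AA", "0401")
print(device)
print(address)
```

Under these assumptions the LOCAL ADDR fields read as a station address in the AA-00-04-00 range with DECnet address 1.1, which is consistent with the LOCAL SYSTEM ID of 0401 shown in entry 146.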
| T.R | Title | User | Personal Name | Date | Lines |
|---|---|---|---|---|---|
| 2220.1 | | STAR::STOCKDALE | | Fri Feb 14 1997 06:40 | 6 |
What version of VMS?
Note that the error log entries are meaningless. Do a SHOW LAN/ERROR
in SDA to find the device error information.
- Dick
| 2220.2 | re .1 | BRADEC::BALOGH | Gabriel Balogh @BRC | Fri Feb 14 1997 09:31 | 18 |
Hello Dick!
1) The version of VMS is V6.2.
2) I received the SHOW LAN/ERROR output by fax, so I include here
only the part with errors:
   Fatal error count        2           Last error CSR       00000400
   Fatal error code     3-XmtTimeout    Last fatal error     14-feb 13:09:04
   .
   .
   Transmit timeouts        2           Last UUB time        14-feb 13:57:03
At this moment they got the messages described in .0
(entries 146, 148, 149).
Thanks!
Gabriel
| 2220.3 | | STAR::STOCKDALE | | Fri Feb 14 1997 14:03 | 11 |
Normally, transmit timeouts occur when the link goes unavailable and
there are outstanding transmits issued to the device. The driver times
them out by declaring a fatal error, which results in the error log
entries.
Most likely there is a ring problem and the link goes away for a while.
You might see this sort of error on multiple systems at the same time,
which would be a strong indication that the problem is not related to
the system and DEFPA itself.
I'd try swapping the cable used on the DEFPA and/or the port in the
concentrator.
- Dick
| 2220.4 | Re.: .3 | BRADEC::BALOGH | Gabriel Balogh @BRC | Mon Feb 17 1997 06:56 | 28 |
Hello Dick!
  ========================================\\
 //                                       ||
 ||                                       ||
 ||            MS900 Backplane            ||
 ||                                       ||
 ||         ----------       ----------   ||
 \\=========|DEF6X-MA| ======|DEFBA-MA|===//
            ----------       ----------
             | | |
             | | \____-> BCPUPP
             | \______-> BCPDWN
             \________-> CAESAR
BCPUPP and BCPDWN are in a cluster; the errors occur at different times.
On CAESAR no errors were found.
The UTP cables and ports on the DEF6X were swapped between the cluster
members, and the DEFPA was replaced on BCPUPP.
It is theoretically possible to swap ports between
CAESAR and one cluster member. I will try to do this today.
Thanks!
Gabriel
| 2220.5 | + .$4 | BRADEC::BALOGH | Gabriel Balogh @BRC | Mon Feb 17 1997 08:59 | 54 |
You are right, the problems appear on two nodes at the same time, BUT
on node BCPDWN there is only 1 message (PORT HAS CLOSED VIRTUAL CIRCUIT)
and on BCPUPP there are 3 messages (PORT and 2 datalink, see .0).
On BCPDWN, SHOW LAN/ERROR reported no errors! (At this moment; I can
also find the opposite case.)
SW versions are
MS900 4.1.1
900EF 1.5.2
900MX 3.2.3
Gabriel
P.S. Here is the report from the second cluster member.
There are no datalink errors!?
V M S SYSTEM ERROR REPORT COMPILED 17-FEB-1997 14:46:36
PAGE 1.
******************************* ENTRY 114. *******************************
ERROR SEQUENCE 6141. LOGGED ON: CPU_TYPE 00000005
DATE/TIME 30-JAN-1997 13:37:29.00 SYS_TYPE 00000009
SYSTEM UPTIME: 0 DAYS 19:51:22
SCS NODE: BCPDWN OpenVMS AXP V6.2
HW_MODEL: 0000045F Hardware Model = 1119.
ERL$LOGMESSAGE AlphaServer 2100 5/250
NI-SCS SUB-SYSTEM, _BCPDWN$PEA0:
PORT HAS CLOSED VIRTUAL CIRCUIT
LOCAL STATION ADDRESS, FFFFFFFFFF00(X)
LOCAL SYSTEM ID, 000000000402(X)
REMOTE STATION ADDRESS, 0000000000DE(X)
REMOTE SYSTEM ID, 000000000401(X)
UCB$L_ERTCNT 00000032
50. RETRIES REMAINING
UCB$L_ERTMAX 00000032
50. RETRIES ALLOWABLE
UCB$L_ERRCNT 00000003
3. ERRORS THIS UNIT
PPD$B_PORT 00
REMOTE NODE # 0.
PPD$B_STATUS 00
PPD$B_OPC 00
UNKNOWN OPCODE
PPD$B_FLAGS 00
ANA/ERR ERRLOG.SYS/INCL=PEA/SINCE=30-JAN-1997 00:00:00.00/BEFORE=31-JAN-1997
00:00:00.00/OUT=XXX.TXT
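Whether the entries on the two cluster members really coincide can be checked mechanically from the DATE/TIME fields. This is a minimal sketch, not a tool from this thread: the helper names and the 5-second window are assumptions, and it also assumes the two system clocks are close enough to be compared directly. The timestamps are the ones from entries 146, 148 and 149 on BCPUPP and entry 114 on BCPDWN.

```python
from datetime import datetime, timedelta

# Month abbreviations as they appear in VMS timestamps (kept locale independent).
MONTHS = {name: number + 1 for number, name in enumerate(
    "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split())}


def parse_vms_time(text):
    """Parse a timestamp such as '30-JAN-1997 13:37:29.84'."""
    date_part, time_part = text.split()
    day, month, year = date_part.split("-")
    clock = datetime.strptime(time_part, "%H:%M:%S.%f")
    return datetime(int(year), MONTHS[month.upper()], int(day),
                    clock.hour, clock.minute, clock.second, clock.microsecond)


def coincident(events_a, events_b, window=timedelta(seconds=5)):
    """Return every pair of events that lie within `window` of each other."""
    return [(a, b) for a in events_a for b in events_b if abs(a - b) <= window]


bcpupp = [parse_vms_time(t) for t in (
    "30-JAN-1997 13:37:29.84",   # entry 146, PORT HAS CLOSED VIRTUAL CIRCUIT
    "30-JAN-1997 13:38:28.29",   # entry 148, FATAL ERROR DETECTED BY DATALINK
    "30-JAN-1997 13:38:32.30",   # entry 149, FATAL ERROR DETECTED BY DATALINK
)]
bcpdwn = [parse_vms_time("30-JAN-1997 13:37:29.00")]   # entry 114

matches = coincident(bcpupp, bcpdwn)
for a, b in matches:
    print(a, b, abs(a - b))
```

With this window only the two virtual-circuit closures pair up; the two datalink errors on BCPUPP follow about a minute later and have no counterpart in the BCPDWN log.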
| 2220.6 | | STAR::STOCKDALE | | Mon Feb 17 1997 13:24 | 10 |
So it sounds like the problem is localized to the one system (where the
SHOW LAN/ERROR shows errors). I'd verify that the revisions of the
modules that you provided are the correct versions. And verify the DEFPA
firmware version, and if everything is up to rev, start replacing
hardware. Also, you could try the DEFPA in a different slot.
I'll send you the latest V6.2 remedial stream SYS$FWDRIVER.EXE just in
case, although there were no problems fixed that I know of in this area.
- Dick
| 2220.7 | re.: .6 | BRADEC::BALOGH | Gabriel Balogh @BRC | Tue Feb 18 1997 04:47 | 24 |
Hello Dick!
"So it sounds like the problem is localized to the one system..."
I cannot prove it now, but I think that the messages are logged in
the following order:
BCPUPP                              BCPDWN
PORT HAS CLOSED VIRTUAL CIRCUIT     PORT HAS CLOSED VIRTUAL CIRCUIT
FATAL ERROR DETECTED BY DATALINK    -
FATAL ERROR DETECTED BY DATALINK    -
LAN errors in SDA
================================================================================
I have found the above errors mirrored on the opposite machine, but I
could not find LAN errors in the errlog.sys files. This is the missing
information that would prove that the errors are symmetric.
Gabriel
| 2220.8 | +.7 | BRADEC::BALOGH | Gabriel Balogh @BRC | Tue Feb 18 1997 06:32 | 6 |
The FDDI ports on the concentrator were swapped between CAESAR and
BCPDWN yesterday. Now BCPDWN reports 2 LAN errors (3-XmtTimeout) and
the 3 described error log entries in errlog.sys.
=> It is symmetric and does not depend on the concentrator port.
Could it depend on the cluster?
Gabriel
| 2220.9 | | BRADEC::BALOGH | Gabriel Balogh @BRC | Tue Mar 04 1997 07:43 | 10 |
Hi!
We have replaced the DECconcentrator 900MX. The LEM count is increasing
on every port. What does it mean exactly?
On VMS, SDA> SHOW LAN/ERROR shows no new errors, but on one of them
the LAST UUB time has changed. What does LAST UUB TIME mean?
Thanks
Gabriel.
| 2220.10 | | STAR::STOCKDALE | | Tue Mar 04 1997 09:44 | 13 |
>>There are increasing LEM count on every port. What does it mean exactly?
LEM = Link Error Monitor. What counter are you seeing increment and
who is displaying the counter?
>>On VMS SDA> show lan /err => are no new errors, but on one of them
>>was changed LAST UUB time. What does it mean LAST UUB TIME ?
UUB is User Buffer Unavailable, which means an application did not keep
up with the incoming receives, so the driver discarded a received packet
for this user because the user had not supplied a buffer.
- Dick
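The UUB condition described above can be pictured with a toy model. This is an illustration only, not how the VMS LAN driver is implemented: the class, its method names, and the counter name are all invented for the sketch. The user posts receive buffers in advance; a packet that arrives while none is posted is discarded and counted.

```python
from collections import deque


class ToyReceiver:
    """Toy model of one user of a LAN device (hypothetical, for illustration)."""

    def __init__(self):
        self.free_buffers = deque()   # buffers the application has posted
        self.delivered = []           # (buffer, packet) pairs handed to the user
        self.uub_count = 0            # packets dropped for lack of a user buffer

    def post_buffer(self, name):
        """The application supplies a receive buffer ahead of time."""
        self.free_buffers.append(name)

    def packet_arrives(self, packet):
        """Deliver the packet if a buffer is waiting, otherwise drop and count it."""
        if not self.free_buffers:
            self.uub_count += 1
            return False
        self.delivered.append((self.free_buffers.popleft(), packet))
        return True


rx = ToyReceiver()
rx.post_buffer("buf0")
rx.post_buffer("buf1")
# Three packets arrive before the application posts any more buffers.
results = [rx.packet_arrives(p) for p in ("p1", "p2", "p3")]
print(results, rx.uub_count)
```

In this model the third packet is the one that is lost, which matches the description: the loss reflects the application falling behind, not an error on the ring.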
| 2220.11 | re .10 | BRADEC::BALOGH | Gabriel Balogh @BRC | Fri Mar 07 1997 07:50 | 8 |
LEM counters are increasing on every port connected via the front
inserts. LEMR values are also nonzero on 2 of them.
These values are from the DECconcentrator 900MX in the MS manager.
Thanks.
Gabriel
| 2220.12 | Some questions on the UTP ports | NPSS::KIRK | | Fri Mar 07 1997 08:32 | 11 |
What UTP cable lengths are used on the ports with the increasing
LEM counts? Can you measure the ring utilization rate?
We have been having some LEM problems with UTP FDDI connections.
Can you obtain the 54 Class numbers and serial numbers from the
UTP cards?
Dick Kirk
Network Product Support
| 2220.13 | re. .12 | BRADEC::BALOGH | Gabriel Balogh @BRC | Fri Mar 07 1997 10:13 | 14 |
Hi Dick!
The UTP cable length is less than 20 m (customer's guess).
UTP card 54 Class number: 54-22499-03
SN: TA62900004
Thanks
Gabriel
P.S. There are 2 more UTP cards. I can check the numbers for these
cards if you require.