| Title: | DEChub/HUBwatch/PROBEwatch CONFERENCE |
| Notice: | Firmware -2, Doc -3, Power -4, HW kits -5, firm load -6&7 |
| Moderator: | NETCAD::COLELLA DT |
| Created: | Wed Nov 13 1991 |
| Last Modified: | Fri Jun 06 1997 |
| Last Successful Update: | Fri Jun 06 1997 |
| Number of topics: | 4455 |
| Total number of notes: | 16761 |
--------------------------------------------------------------------------------
DANIEL JEYACHANDRAN <F/W V2.0 problem.> 05-MAR-1996 14:44
--------------------------------------------------------------------------------
HUBWATCH 4.1.1 & MAM module V4.0.2
DR900TM S/W V2.0.0, H/W V3, RO v04
^
|
|
HP OpenView reporting "RMON rising alarm exceeded 1" etc.
Hubwatch reports, "RPT900 Missing data LED14 PCOM LED program
LED15 No information." etc
I have not been able to find any docs on hubwatch messages.
I found the following entry in STARS database indicating a probable
bug in V2.0 S/W of DR900TM. Any fix for this problem yet?
Thanks in advance.
Daniel.
CSC, Sydney.
------------------------------------------------------------------------------
{Elev} problems with repeater after micro code upgrade getting traps
COPYRIGHT (c) 1988, 1993 by Digital Equipment Corporation.
ALL RIGHTS RESERVED. No distribution except as provided under contract.
PRODUCT or COMPONENT: DECrepeater 900TM
OP/SYS: OPENVMSVAX
VERSION INFORMATION:
Operating System Version(s): OSF1 V3.2
Layered Product/Component Version(s): {include all relevant version numbers}
Polycenter netview 3.1B
Hubwatch V3.1
SOURCE: Digital Equipment Corporation
SYMPTOM:
Customer has seen a problem with traps occurring since upgrading several
DECrepeater 900tm's from V1.1.0 to V2.0.0. The customer noted that some of
the upgraded repeaters are not having the problem. Customer is launching
hubwatch standalone and is seeing the traps attached below via polycenter
netview. His version of hubwatch does not have "alarms" in the applications
pull down window, so it appears that it cannot be set up to log the traps.
Customer is seeing the traps hundreds of times a day on the affected
repeaters. Some of the repeaters having the problem have hundreds of PCs
connected, and some have only a single device connected that monitors a
remote site's UPS.
The customer clicked on one of the affected modules and read the status
screen to me:
status: enabled
Health Text: 0 ports are not operational, 0 ports are auto partitioned, 11
media are not available.
Health Text Changes: 414
Partitioned Ports: 0
Media Unavailable Ports: 11
Transmit Collisions: 335000 (uptime of 35 days on a busy network per customer).
Customer can think of nothing that distinguishes the DECrepeater 900tm's that
are having the problem from those that are not.
Below is information received from the customer via FAX, which is everything
the customer has on the traps. There may be a few inaccuracies due to a few
words not being readable:
A RMON falling alarm repeater mau repeater information
repeater mau total media unavailable 0
fell below threshold 1; value = 21: (sample type = 2)
specific = 2
enterprise= rmon 1.3.6.1.2.1.16
A RMON rising alarm: repeater repeater information repeater health text
changes 0
exceeded threshold 1; value 40 (sample type =2; alarm index =4)
specific: 1
generic : 6
catagory: threshold events
enterprise: rmon 1.3.6.1.2.1.16
source: agent (A)
hostname: rep2.shost.ksc.com
severity: critical
RMON rising alarm
repeater extensions, repeater basic package, repeater repeater information 5.0
exceeded threshold 1; value 83 (sample type =2; alarm index = 5)
specific: 1
generic : 6
catagory: threshold events
enterprise: rmon 1.3.6.1.2.1.16
source: agent (A)
hostname: rep2.elrio.ksc.com
severity: critical
MTI.fddi.ksc.co N elrio2.elrio.ksc.com
reported different link address than obtained from mti.fddi.ksc.com by snmp
specific: 58982401 (hex: 3840001)
generic : 6
catagory: mode configuration events
enterprise: netview 1.3.6.1.4.1.2.6.3.1
source: network (N)
hostname: net1.fddi.ksc.com
severity: indeterminate
I have talked to Frank Levesque regarding the problem. Frank has agreed to
look into what the traps are indicating and whether or not polycenter
netview could be a factor.
Customer has been somewhat difficult to work with and does not understand
why we are asking questions regarding version and topology information. I
have explained to him that we cannot find information regarding what the
traps are telling us, so we cannot yet tell him how to resolve the
issue.
The customer wants information on what the traps mean and how to correct
them, I have reviewed RFC 1157 (SNMP) and did not see them in there.
DIGITAL RESPONSE:
This problem has been reported to Engineering.
WORKAROUND:
{workaround}
ANALYSIS:
{cause}
| T.R | Title | User | Personal Name | Date | Lines |
|---|---|---|---|---|---|
| 3323.1 | | NETCAD::GALLAGHER | | Tue Mar 05 1996 09:31 | 135 |
>Customer has seen a problem with traps occuring since upgrading several
>DECrepeater 900tm's from V1.1.0 to V2.0.0.
Why is this a problem? If customers don't want to get traps they should
not provide trap sinks (trap destination IP addresses).
Traps don't always report "bad" things. Sometimes they're just informational,
like the trap below:
>A RMON falling alarm repeater mau repeater information
>repeater mau total media unavailable 0
>fell below threshold 1; value = 21: (sample type = 2)
>specific = 2
>enterprise= rmon 1.3.6.1.2.1.16
Definition:
> 8: erptrMauTotalMediaUnavailable One or more media have become
> 1.3.6.1.4.1.36.2.18.11.5.1.1.5.1.1.0 available or unavailable.
This usually means that someone plugged in a cable, or removed a cable.
>A RMON rising alarm: repeater repeater information repeater health text
>changes 0
>
>exceeded threshold 1; value 40 (sample type =2; alarm index =4)
>specific: 1
>generic : 6
>catagory: threshold events
>enterprise: rmon 1.3.6.1.2.1.16
>source: agent (A)
Traps are also sent when changes to healthText occur. The alarmed object
is in the DEC Private "Extended Repeater MIB". Its definition is:
>erptrHealthTextChanges OBJECT-TYPE
> SYNTAX Counter
> ACCESS read-only
> STATUS mandatory
> DESCRIPTION
> "This counter increments each time the rptrHealthText object
> defined in RFC 1516 is modified."
> REFERENCE
> "Reference RFC 1516 repeater MIB"
> ::= { erptrRptrInfo 4 }
And the repeater MIB's rptrHealthText object is defined as:
> rptrHealthText OBJECT-TYPE
> SYNTAX DisplayString (SIZE (0..255))
> ACCESS read-only
> STATUS mandatory
> DESCRIPTION
> "The health text object is a text string that
> provides information relevant to the operational
> state of the repeater. Agents may use this string
> to provide detailed information on current
> failures, including how they were detected, and/or
> instructions for problem resolution. The contents
> are agent-specific."
> REFERENCE
> "Reference IEEE 802.3 Rptr Mgt, 19.2.3.2,
> aRepeaterHealthText."
> ::= { rptrRptrInfo 3 }
This basically means that rptrHealthText is used to report anything
deemed "interesting" by the repeater implementation. The trap is meant
to alert network managers to look at the health text.
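For anyone who wants to pull the numbers out of the health text after such a trap, here is a minimal sketch in Python. It assumes the string has the same wording as the status screen quoted in .0; since the MIB says the contents are agent-specific, other firmware versions may word it differently. The names `HEALTH_RE` and `parse_health_text` are made up for this illustration.

```python
import re

# Pattern based only on the health text quoted in .0; the wording is
# agent-specific, so treat this as an assumption rather than a fixed format.
HEALTH_RE = re.compile(
    r"(\d+)\s+ports are not operational,\s*"
    r"(\d+)\s+ports are auto partitioned,\s*"
    r"(\d+)\s+media are not available"
)


def parse_health_text(text):
    """Return the three counts as a dict, or None if the wording differs."""
    # Collapse line breaks and repeated spaces before matching.
    match = HEALTH_RE.search(" ".join(text.split()))
    if match is None:
        return None
    keys = ("not_operational", "auto_partitioned", "media_unavailable")
    return dict(zip(keys, (int(group) for group in match.groups())))
```

Comparing the counts from two successive reads shows which of the three figures changed when erptrHealthTextChanges incremented.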
>specific: 58982401 (hex: 3840001)
>generic : 6
>catagory: mode configuration events
>enterprise: netview 1.3.6.1.4.1.2.6.3.1
>source: network (N)
>hostname: net1.fddi.ksc.com
>severity: indeterminate
I'm not sure what this is, but it looks like it's coming from a host
rather than a repeater. Can you confirm this?
I've attached a list of the objects on repeaters which are alarmed.
>The customer wants information on what the traps mean and how to correct
>them, I have reviewed RFC 1157 (SNMP) and did not see them in there.
RFC 1757 describes RMON and contains definitions for the RMON rising and
falling event traps.
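As a rough illustration of the rising/falling behaviour RFC 1757 describes, here is a simplified sketch in Python. It is not the repeater firmware. The class name `DeltaAlarm` and its fields are made up here; it assumes delta sampling (sample type 2, as in the traps in .0), ignores counter wrap, and assumes only a rising event may fire first. `RISING` and `FALLING` match the specific-trap values 1 and 2 seen in the traps.

```python
from dataclasses import dataclass
from typing import Optional

RISING, FALLING = 1, 2  # specific-trap numbers for the RMON rising/falling alarms


@dataclass
class DeltaAlarm:
    """Simplified RMON-style alarm on the change of a counter between samples."""
    rising_threshold: int
    falling_threshold: int
    last_raw: Optional[int] = None   # previous counter reading
    last_event: int = FALLING        # assumption: only a rising event may fire first

    def sample(self, raw_counter: int) -> Optional[int]:
        """Feed one counter reading; return RISING, FALLING, or None."""
        if self.last_raw is None:
            # The first reading only establishes the baseline.
            self.last_raw = raw_counter
            return None
        delta = raw_counter - self.last_raw
        self.last_raw = raw_counter
        # A rising event is not repeated until a falling event has occurred,
        # and vice versa.
        if delta >= self.rising_threshold and self.last_event != RISING:
            self.last_event = RISING
            return RISING
        if delta <= self.falling_threshold and self.last_event != FALLING:
            self.last_event = FALLING
            return FALLING
        return None
```

With a rising threshold of 1 on a counter such as erptrHealthTextChanges, any interval in which the counter moves at all produces a rising trap once the alarm has re-armed, which is one way a busy module could generate many traps a day.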
-Shawn
-------------------------------------------------------------------------
REPEATER/PORTswitch (see Matrix below):
DEFAULT ALARMS (NAME & OBJECTID)            TRIGGER OF EVENT
--------------------------------            -----------------
 1: pcomEsysNVRAMavailableOctets            There is no more memory for
    1.3.6.1.4.1.36.2.18.11.2.7.6.0          nonvolatile parameters.
 2: rptrTotalPartitionedPorts               One or more ports has been
    1.3.6.1.2.1.22.1.1.6.0                  autopartitioned, or a port that
                                            was previously autopartitioned
                                            is now operational.
 3: erptrHealthTextChanges                  The module's operational state
    1.3.6.1.4.1.36.2.18.11.5.1.1.1.1.4.0    has changed.
 4: erptrTotalPortEvents                    The total number of times a port
    1.3.6.1.4.1.36.2.18.11.5.1.1.1.1.5.0    has become nonoperational,
                                            autopartitioned, or unavailable.
 5: erptrTotalRptrErrors                    The total number of errors for
    1.3.6.1.4.1.36.2.18.11.5.1.1.1.1.6.0    this module.
 6: erptrDprTotalStateChange                The module's link state change has
    1.3.6.1.4.1.36.2.18.11.5.1.1.3.1.1.0    occurred while using redundant-link
                                            configuration.
 7: erptrSecurityRptrSecurityViolation      A security violation has occurred
    1.3.6.1.4.1.36.2.18.11.5.1.1.4.1.1.0    on one or more ports.
 8: erptrMauTotalMediaUnavailable           One or more media have become
    1.3.6.1.4.1.36.2.18.11.5.1.1.5.1.1.0    available or unavailable.
 9: erptrSecurityRptrSecurityViolation      A security violation has occurred
    1.3.6.1.4.1.36.2.18.11.5.1.1.4.1.1.0    on one or more ports.
10: erptrMauTotalMediaUnavailable           One or more media have become
    1.3.6.1.4.1.36.2.18.11.5.1.1.5.1.1.0    available or unavailable.
* indicates the module supports the alarm
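If it helps when reading trap logs, the list above can be turned into a small lookup. This is only a sketch in Python: the OIDs, names and trigger text are copied from the list, while the names `DEFAULT_ALARMS` and `describe_alarm` are invented for the example.

```python
# Alarmed objects copied from the list above: OID -> (object name, trigger).
DEFAULT_ALARMS = {
    "1.3.6.1.4.1.36.2.18.11.2.7.6.0": (
        "pcomEsysNVRAMavailableOctets",
        "There is no more memory for nonvolatile parameters."),
    "1.3.6.1.2.1.22.1.1.6.0": (
        "rptrTotalPartitionedPorts",
        "One or more ports has been autopartitioned, or a port that was "
        "previously autopartitioned is now operational."),
    "1.3.6.1.4.1.36.2.18.11.5.1.1.1.1.4.0": (
        "erptrHealthTextChanges",
        "The module's operational state has changed."),
    "1.3.6.1.4.1.36.2.18.11.5.1.1.1.1.5.0": (
        "erptrTotalPortEvents",
        "The total number of times a port has become nonoperational, "
        "autopartitioned, or unavailable."),
    "1.3.6.1.4.1.36.2.18.11.5.1.1.1.1.6.0": (
        "erptrTotalRptrErrors",
        "The total number of errors for this module."),
    "1.3.6.1.4.1.36.2.18.11.5.1.1.3.1.1.0": (
        "erptrDprTotalStateChange",
        "The module's link state change has occurred while using "
        "redundant-link configuration."),
    "1.3.6.1.4.1.36.2.18.11.5.1.1.4.1.1.0": (
        "erptrSecurityRptrSecurityViolation",
        "A security violation has occurred on one or more ports."),
    "1.3.6.1.4.1.36.2.18.11.5.1.1.5.1.1.0": (
        "erptrMauTotalMediaUnavailable",
        "One or more media have become available or unavailable."),
}


def describe_alarm(oid):
    """Return 'name: trigger' for an alarmed OID, tolerating a leading dot."""
    key = oid.strip().lstrip(".")
    name, trigger = DEFAULT_ALARMS.get(
        key, ("unknown object", "not in the default alarm list"))
    return f"{name}: {trigger}"
```

For example, `describe_alarm("1.3.6.1.4.1.36.2.18.11.5.1.1.5.1.1.0")` gives the erptrMauTotalMediaUnavailable entry, the object behind the falling alarm quoted in .0.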
| |||||
| 3323.2 | | IAMOSI::DANIEL | | Fri Mar 08 1996 01:04 | 18 |
Hi Shawn,
Thanks for your quick reply. The attachment I sent was not from my
customer. My customer's log did not reveal any details like the ones in the
attachment, only single-line summaries such as "RMON rising alarm exceeded 1",
"RMON falling alarm below 1", etc.
I have asked him to give me a detailed print-out of it.
Daniel.
| |||||
| 3323.3 | | NETCAD::MILLBRANDT | answer mam | Fri Mar 08 1996 10:36 | 9 |
from .0 -
> HUBWATCH 4.1.1 & MAM module V4.0.2
> DR900TM S/W V2.0.0, H/W V3, RO v04
You should be running V4.1 of the MAM with 4.1 HUBwatch.
Dotsie
| |||||