
Conference azur::mcc

Title:	DECmcc user notes file. Does not replace IPMT.
Notice:	Use IPMT for problems. Newsletter location in note 6187
Moderator:	TAEC::BEROUD

Created:	Mon Aug 21 1989
Last Modified:	Wed Jun 04 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	6497
Total number of notes:	27359

5735.0. "PNM 1.3/ipreachability down, but ping gets 'alive'" by MUNICH::SCHWEMMER () Mon Nov 15 1993 10:52

    
Our customer found unexpected behaviour in PNM 1.3/TCPIP_AM X1.3.7
when testing an SNMP node for ipreachability.

Over the same period of time, a PNM test for ipreachability gave the status
'ipreachability down' while the UCX>PING command answered with 'is alive'.

The snmp-node is wan-connected over a 64kbit-line; this line obviously
isn't very good.

In the first session he did:

MCC> SET MCC 0 TCPIP_AM UDP TIME 20,UDP RETR 3,ICMP TIME 20, ICMP RETR 3
MCC> SHOW SNMP IP.BE1003 IPREACH ,AT EVERY 00:00:03

In the second session he started UCX>ping/all with default parameters,
i.e. with timeout=20 seconds.

The ipreachability-test sometimes came back with ipreachability down,
whereas the ping-cmd always gave status 'is alive'.

Because ipreachability is normally checked by alarm rules, he is
confused by the wrong ipreachability message.

The customer uses UCX V2.0-d, but works with UCX$PING.EXE V2.0-0.

Is there anything we can do within MCC to correct this behaviour?
(Changing parameters obviously doesn't help: he has already tried
raising the ICMP Timeout value, but the behaviour was the same.)

The customer doesn't want to use the IP Poller, because it only reports
a change in reachability.


Any help would be much appreciated,

Mathilde Schwemmer,
DSC Munich

T.R	Title	User	Personal Name	Date	Lines
5735.1	X????	TOOK::MINTZ	Erik Mintz	`Mon Nov 15 1993 11:29`	9
	Surely you are kidding. X1.3.7 is an unsupported, internal pre-field test baselevel code. Why on earth is your customer running it at all? If the same behavior occurs in released code, then by all means request a fix through the support channels. But a report against this kind of software would just get bounced. -- Erik
5735.2	Occurs also with official module	MUNICH::SCHWEMMER		`Tue Nov 16 1993 04:41`	8
	The customer got TCPIP_AM 1.3.7 because the same problem occurred with the official module delivered with PNM 1.3. Sorry, I forgot to mention it. Mathilde.
5735.3		TOOK::MINTZ	Erik Mintz	`Tue Nov 16 1993 06:21`	6
	I would definitely suggest that you escalate this through the usual process (see note 7). X1.3.7 is a much earlier baselevel than the released code, and part of the purpose of the escalation process is to ensure that such things don't happen. -- Erik
5735.4	Clarification	BIKINI::KRAUSE	European NewProductEngineer for MCC	`Tue Nov 23 1993 04:31`	11
	Just to clarify the matter (and cool down Erik :-) : The MCC_TCPIP_AM image used here was built by Rahul Bose to fix a few bugs. The link date is 4-OCT-1993, so it is fairly recent. It just happens to have "X1.3.7" in its Component Version attribute. BTW: None of the fixed modules I got from engineering recently in response to an official CLD reflected the change in 'Component Version'. They all show "V1.3.0". So much for the reliability of this attribute... *Robert
5735.5	Check if the UCX results are correct before blaming DECmcc.	MOLAR::YAHEY::BOSE		`Tue Nov 23 1993 10:09`	14
	RE .0 There is a known bug in UCX where the ping or the loop command will return a status stating that the node is alive when in fact it is not reachable. Can you rlogin or ftp to that node? Can you ping that node from an Ultrix or OSF/1 station and compare the result? I apologise for the version nos. The version nos. are defined in an include file used globally by all the MMs, and the system where I built the executable must have had an older version. Rahul.
5735.6		MOLAR::YAHEY::BOSE		`Tue Nov 23 1993 10:13`	6
	One more thing. Try to increase the ICMP Timeout and Retry values and see if it makes a difference. Since you are trying to test the node over a WAN, the response time would be greater. Rahul.
5735.7	Three to one	BIKINI::KRAUSE	European NewProductEngineer for MCC	`Wed Nov 24 1993 09:30`	19
	Rahul, the customer already set the ICMP Timeout to 20 and Retry to 3. It didn't help. UCX PING/ALL sometimes shows delays in the 3 to 5 seconds range and every now and then a missing packet, but this shouldn't produce an IP Reachability = Down, especially given the high timeout and retry values. The IP reachability poller, running at the same time and polling even more frequently, never shows a Down event. Also Telnet sessions to this node are never interrupted. So there are three voting against the AM :-) Because of the transient nature this problem is not easy to reproduce. But I'll have to escalate it anyway because this customer is annoyed by false alarms. Do you have any ideas how to tackle the problem? Log bits? Regards, *Robert
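	The argument in .7 (occasional lost packets should not add up to a Down when the retry count is high) can be put into numbers with a small sketch. This is only an illustration under stated assumptions, not a description of how TCPIP_AM actually behaves: it assumes the probe reports Down only when the initial attempt and every retry are lost, that losses are independent, and that "retries" means attempts in addition to the first one. The function names and the 10% loss rate are hypothetical.

```python
def false_down_probability(loss_rate: float, retries: int) -> float:
    """Chance that the first attempt and every retry all go unanswered.

    Assumption: losses are independent and 'retries' counts attempts
    made in addition to the initial one.
    """
    attempts = retries + 1
    return loss_rate ** attempts


def worst_case_wait(timeout: float, retries: int) -> float:
    """Longest time one probe can take before giving up.

    Assumption: each attempt waits a full timeout. The result is in the
    same unit as the timeout attribute, which the thread does not state.
    """
    return timeout * (retries + 1)


if __name__ == "__main__":
    # Hypothetical 10% packet loss with the thread's settings (timeout 20, retries 3).
    print(false_down_probability(0.10, 3))
    print(worst_case_wait(20, 3))
```

	Under these assumptions a Down result should be rare even on a poor line, which is why the frequent Down reports point at something other than plain packet loss.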
5735.8	Problem seen at many sites in Sweden	ANTIK::WESTERBERG	Stefan Westerberg DS Stockholm	`Wed Nov 24 1993 12:11`	23
	Hi, this problem has been observed at many sites here in Sweden. Some things to look for to keep the symptoms at a minimum level are:
	1. Check for SNMP rules that generate a lot of exceptions. Try to rewrite them or remove them.
	2. Check the quality of the local LAN the PN station is connected to. This is not tested, but on segments with a high error frequency the IPreachability problem occurs more frequently.
	3. Increase the BYTLIM quota for UCX processes (see UCX$AUX_CONFIG.COM).
	Our general feeling is that it is UCX that causes the problem with IPreachability alarms. Another thing is that increasing the ICMP timeout and ICMP retries doesn't seem to have any effect. Sometimes it only seems to make it worse! Regards, Stefan. P.S. We have entered a CLD on this but I don't recall the number.
5735.9	CLD info?	BIKINI::KRAUSE	European NewProductEngineer for MCC	`Thu Nov 25 1993 04:54`	6
	Thanks Stefan! This makes me feel better. I was almost tempted to believe that I was seeing ghosts :-) Could you send me the CLD info (number, sent to UCX or MCC?) *Robert