| T.R | Title | User | Personal Name | Date | Lines |
|---|---|---|---|---|---|
| 865.1 |  | OPG::PHILIP | And through the square window... | Thu Jul 13 1995 14:54 | 17 | 
|  | Ian,
  Can you do the following...
  1) Shut PCM down
  2) Define/Sys Console$Debug "TERMINAL"
  3) Define/Sys Console$Debug_Level 6144
  4) Start up PCM V1.6
  When the errors have occurred, shut PCM down and post one of the
  controller_nn.log files here.
Cheers,
Phil
 | 
| 865.2 | info | SNOOTY::HAWLEYI | Mr Flibble says: Game over boys | Thu Jul 13 1995 17:25 | 65 | 
|  |     
    Phil,
    
    Here's the info:
    
    Author: Ian G Strachan, VSS, BCO      
    Date: 13-Jul-1995
    Posted-date: 12-Jul-1995
    
    $ set noon
    $ save_ver = f$verify (0)
    $ EXIT
    $ !
    $ ! Start a Child Controller process, name_num 1, child_num 1
    $ !                                                                
    $ CHILD :== $CONSOLE$IMAGE:CONSOLE$DAEMON.EXE
    $ CHILD "child" 1
    POLYCENTER Console Manager
    Console Controller Daemon Version V1.6-100
    Copyright (c) 1995 Digital Equipment Corporation. All Rights Reserved
    
     SYS$ASSIGN - Assigning Channel to LAT Device.
    Attempting to Map  Lat Terminal Start
    Cancelling QIOw timer  (status = 1, iosb[0] = 1)
    Attempting to Map  Lat Terminal End
    Attempting connect to Lat Terminal
    QIOw timer Timeout procedure called, cancelling I/O
    Cancelling QIOw timer  (status = 1, iosb[0] = 44)
    Connected to Lat Terminal, Status = 1
    iosb status was not normal value was <44>
     CMTerminalGetErrorMessages - Code is       : -190
     CMTerminalGetErrorMessages - Errno_Val is  : 44
     CMTerminalGetErrorMessages - Transport  is : 1
    Deleting LAT port 
     SYS$DASSGN - Deassigning Channel from LAT terminal->chan in Close.
     SYS$ASSIGN - Assigning Channel to LAT Device.
    Attempting to Map  Lat Terminal Start
    Cancelling QIOw timer  (status = 1, iosb[0] = 1)
    Attempting to Map  Lat Terminal End
    Attempting connect to Lat Terminal
    QIOw timer Timeout procedure called, cancelling I/O
    Cancelling QIOw timer  (status = 1, iosb[0] = 44)
    Connected to Lat Terminal, Status = 1
    iosb status was not normal value was <44>
     CMTerminalGetErrorMessages - Code is       : -190
     CMTerminalGetErrorMessages - Errno_Val is  : 44
     CMTerminalGetErrorMessages - Transport  is : 1
    Deleting LAT port 
     SYS$DASSGN - Deassigning Channel from LAT terminal->chan in Close.
     SYS$ASSIGN - Assigning Channel to LAT Device.
    Attempting to Map  Lat Terminal Start
    Cancelling QIOw timer  (status = 1, iosb[0] = 1)
    Attempting to Map  Lat Terminal End
    Attempting connect to Lat Terminal
    QIOw timer Timeout procedure called, cancelling I/O
    Cancelling QIOw timer  (status = 1, iosb[0] = 44)
    Connected to Lat Terminal, Status = 1
    iosb status was not normal value was <44>
     CMTerminalGetErrorMessages - Code is       : -190
     CMTerminalGetErrorMessages - Errno_Val is  : 44
     CMTerminalGetErrorMessages - Transport  is : 1
    Deleting LAT port 
     SYS$DASSGN - Deassigning Channel from LAT terminal->chan in Close.
    
    ...repeat to fade!...
 | 
| 865.3 |  | OPG::PHILIP | And through the square window... | Thu Jul 13 1995 17:45 | 25 | 
|  | Ian,
>>     SYS$ASSIGN - Assigning Channel to LAT Device.
>>    Attempting to Map  Lat Terminal Start
>>    Cancelling QIOw timer  (status = 1, iosb[0] = 1)
>>    Attempting to Map  Lat Terminal End
>>    Attempting connect to Lat Terminal
>>    QIOw timer Timeout procedure called, cancelling I/O
>>    Cancelling QIOw timer  (status = 1, iosb[0] = 44)
>>    Connected to Lat Terminal, Status = 1
  It would appear that we stalled trying to open the LTA
  device because our 5 second timer went off!!
  Now, the question is: why did our QIOW to connect to the LAT device
  stall for so long? The status of 44 (SS$_ABORT) returned
  when we did the cancel is normal, because we did the abort
  ourselves.
  Question, when you did a "set host/lat" how long did it take
  to actually connect?
Cheers,
Phil
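
To check how often the connect step ends in an abort across a whole controller_nn.log, the `iosb[0]` values could be tallied with a short script. This is only a sketch: the function name `summarize_iosb` is made up for illustration, and it decodes just the two values discussed in this thread (1 as SS$_NORMAL and 44 as SS$_ABORT), leaving any other code as a bare number.

```python
import re
from collections import Counter

# Only the two status values identified in this thread are named here;
# anything else is reported as its raw number rather than guessed at.
STATUS_NAMES = {1: "SS$_NORMAL", 44: "SS$_ABORT"}

# Matches debug lines such as:
#   Cancelling QIOw timer  (status = 1, iosb[0] = 44)
IOSB_RE = re.compile(
    r"Cancelling QIOw timer\s+\(status = (\d+), iosb\[0\] = (\d+)\)"
)


def summarize_iosb(log_text):
    """Count the iosb[0] values seen on 'Cancelling QIOw timer' lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = IOSB_RE.search(line)
        if match:
            code = int(match.group(2))
            counts[STATUS_NAMES.get(code, str(code))] += 1
    return counts


# One connect cycle copied from the log in .2
sample = """\
Attempting to Map  Lat Terminal Start
Cancelling QIOw timer  (status = 1, iosb[0] = 1)
Attempting to Map  Lat Terminal End
Attempting connect to Lat Terminal
QIOw timer Timeout procedure called, cancelling I/O
Cancelling QIOw timer  (status = 1, iosb[0] = 44)
"""

print(summarize_iosb(sample))
```

A log where every connect attempt shows SS$_ABORT right after the "Timeout procedure called" line would match the repeating pattern posted in .2.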
 | 
| 865.4 | change in 1.6? | SNOOTY::HAWLEYI | Mr Flibble says: Game over boys | Fri Jul 14 1995 15:18 | 10 | 
|  |     
    Phil,
    
    Is this a big change in 1.6?
    Can this be changed so that it allows more time?
    
    My customer doesn't think it takes 5 secs to establish a connection...
    but if it works under 1.5A how come it doesn't work under 1.6?
    
    Ian.
 | 
| 865.5 |  | OPG::PHILIP | And through the square window... | Fri Jul 14 1995 17:11 | 51 | 
|  | Ian,
  Looking a little more closely at the log output, it would appear
  something weird is happening...
>>     SYS$ASSIGN - Assigning Channel to LAT Device.
>>    Attempting to Map  Lat Terminal Start
>>    Cancelling QIOw timer  (status = 1, iosb[0] = 1)
>>    Attempting to Map  Lat Terminal End
>>    Attempting connect to Lat Terminal
  The above has done the QIOW to connect to the LAT device. Before
  we did this QIO we called SYS$SETIMR for 5 seconds.
>>    QIOw timer Timeout procedure called, cancelling I/O
  We are in the timer's AST routine here, meaning it took 5 seconds
  (maybe!!!) so we SYS$CANCEL the QIOW for the connect.
>>    Cancelling QIOw timer  (status = 1, iosb[0] = 44)
  The QIOW has returned, but neither its status nor IOSB[1] values are
  SS$_CANCEL, so we assume that the timer is still running, so we do a
  SYS$CANTIM on it...
  Now the IOSB[0] is 44, meaning the QIOW completed with SS$_ABORT. This
  normally happens when there is a problem with the terminal server (the
  port has hung up or something). Is there any chance of the customer
  using TSM or NCP to connect to the terminal server and doing a SHOW USER
  to see if something has grabbed the port on the server?
  The message I would have expected here if the QIOW terminated because of
  the SYS$CANCEL is an IOSB of SS$_CANCEL and a status of SS$_CANCEL
  resulting in debug output saying something like ...
QIOw was cancelled  (status = xx, iosb[0] = xx)
    Resetting status to SS$_TIMEOUT
  Now, it could be that the timer was completed prematurely because we don't
  use an event flag on it (we have had problems like this before). So, what
  I have done is added an event flag to the SYS$SETIMR call. This change will
  be in the FT ECO kit which we will release on Monday; it was to be today,
  but we have had quite a busy week. Can your customer try this ECO kit to see
  if it fixes their problems? If it doesn't, then I will tell you how to increase
  the 5 second timer and we will see if that makes a difference.
Cheers,
Phil
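
The sequence described above (arm a timer, issue the blocking connect, cancel the I/O from the timer routine, then cancel the timer once the I/O returns) can be sketched in Python. This is an analogy only, not PCM code: `connect_with_timeout` and `fake_connect` are hypothetical names, `threading.Timer` stands in for SYS$SETIMR, setting the event stands in for SYS$CANCEL, and `timer.cancel()` stands in for SYS$CANTIM.

```python
import threading


def connect_with_timeout(connect, timeout_s):
    """Run connect(cancel_event) under a watchdog timer.

    Returns ("ok", result) if connect finished before the timer fired,
    or ("timeout", None) if the timer routine had to cancel it.
    """
    cancel = threading.Event()   # tells the connect to give up
    fired = threading.Event()    # records that the timer routine ran

    def on_timeout():
        # Plays the role of the timer AST routine: note that we fired,
        # then cancel the outstanding I/O.
        fired.set()
        cancel.set()

    timer = threading.Timer(timeout_s, on_timeout)  # arm the timer
    timer.start()
    try:
        result = connect(cancel)                    # the blocking connect
    finally:
        timer.cancel()                              # harmless if already fired

    if fired.is_set():
        return ("timeout", None)
    return ("ok", result)


def fake_connect(duration_s):
    """Build a stand-in connect that takes duration_s unless cancelled."""
    def connect(cancel):
        cancel.wait(duration_s)   # returns early when cancel is set
        return "connected"
    return connect


print(connect_with_timeout(fake_connect(0.01), 0.5))   # fast connect
print(connect_with_timeout(fake_connect(1.0), 0.05))   # slow connect
```

In this sketch the outcome is decided by whether the timer routine ran, not by the status the cancelled operation happens to return, which sidesteps the question of which completion code a cancelled request reports.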
 
 | 
| 865.6 |  | 29067::BUTTERWORTH | Gun Control is a steady hand. | Fri Jul 14 1995 19:45 | 12 | 
|  |     >  Now, it could be that the timer was completed prematurely because we
    >don't use an event flag on it (we have had problems like this before) so,
    >what
    
    Phil,
      The event flag is *irrelevant* to the actual firing of the timer. If you
    specify 5 seconds, you'll get 5 seconds unless someone does a SET TIME
    command. Period - the end. The event flag is set when the timer
    expires.
    
    Regs,
      Dan
 | 
| 865.7 |  | OPG::PHILIP | And through the square window... | Sat Jul 15 1995 15:19 | 8 | 
|  | Dan,
  In which case I don't know what is happening here, except that the timer did
  fire, meaning it took at least 5 seconds to try the connect to the server.
  This would indicate either a LAT or terminal server problem to me.
Cheers,
Phil
 | 
| 865.8 | ou est le patch? | SNOOTY::HAWLEYI | Mr Flibble says: Game over boys | Mon Jul 17 1995 10:50 | 23 | 
|  |     
    Phil,
    
    But does this explain why it works in 1.5A and not in 1.6?
    The lat/terminal server setup is the same.
    
    We have connected to the terminal server and done a SHOW USER and
    there's NOTHING with any hold on the port.
    
    Everything seems to point to a change in operation in 1.6 that is
    incompatible with my customer's setup.
    
    I'd like to put the ECO patch on but the customer tells me that he is
    not allowed to put FT software on the system normally, but we may be
    able to make an exception. Where is the kit?
    
    Also, if you could tell me how to change the 5 second timer I would be
    very grateful as this would be a lot simpler and we are running out of
    time on this one.
    
    Thanks for all your help,
    
    Ian Hawley.
 | 
| 865.9 |  | OPG::PHILIP | And through the square window... | Mon Jul 17 1995 13:22 | 13 | 
|  | Ian,
  The patch kit isn't ready yet, sometime today or tomorrow we hope.
  In the character cell editor type "SET HIDDEN"; what you want to
  change is the value of "Console Open Timeout".
  Please remember, if you or your customer reports a problem and
  these hidden values have been changed WITHOUT A VERY VERY GOOD
  REASON then you are on your own.
Cheers,
Phil
 | 
| 865.10 | fixed | SNOOTY::HAWLEYI | Mr Flibble says: Game over boys | Mon Jul 17 1995 17:00 | 12 | 
|  |     
    Philip,
    
    Increasing the "Console Open Timeout" value has fixed the problem!
    So, testing continues...!
    
    I still can't see why it works under 1.5 but not under 1.6, but I
    guess mine is not to reason why!
    
    Thanks.
    
    Ian.
 | 
| 865.11 |  | OPG::PHILIP | And through the square window... | Mon Jul 17 1995 17:23 | 11 | 
|  | Hmm,
  It would be better if we understood this a little more. We chose 5 seconds
  as we figured that you would need to have a pretty bad network for it to
  take that long to open the LAT connection. I would still be inclined to
  have a close look at the customer's LAN to see why it's taking so long to
  open. Looking back at the code, it would appear that it worked in V1.5
  because this timer wasn't implemented for LAT in that version!
Cheers,
Phil
 | 
| 865.12 | lan probs | SNOOTY::HAWLEYI | Mr Flibble says: Game over boys | Tue Jul 18 1995 10:48 | 12 | 
|  |     
    Well, due to the way their network is set up it should take a little
    longer for it to establish a connection (dual ethernet = twice the
    work?). It takes more than 10 seconds in reality. I'm trying to suggest
    to the customer that he has a network problem. However, he is happy
    with the fix (it's set to 20 seconds). Whatever, it's definitely not a
    PCM problem. Let's hope that now he can test 1.6 properly, he doesn't
    have a repeat of the console extract problems that plagued him in 1.5A!
    
    Thanks,
    
    Ian.
 | 
| 865.13 |  | OPG::PHILIP | And through the square window... | Tue Jul 18 1995 12:23 | 10 | 
|  | Ian,
  Your customer should be made aware that he could wait up to 16 * 20
  (320) seconds before his child controllers are up and running properly.
  Nearly 5 and a half minutes is an awfully long time during which he won't
  be able to do ANYTHING on any of the systems' consoles because the daemon
  won't be ready to accept connects!!!!!!
Cheers,
Phil
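
The worst-case figure above is just the number of timed-out opens multiplied by the Console Open Timeout. A small sketch of the arithmetic, using the 16 and the 20 seconds quoted in this thread (the function name is made up for illustration):

```python
def worst_case_startup_seconds(attempts, open_timeout_s):
    """Upper bound on startup delay if every open runs to its timeout."""
    return attempts * open_timeout_s


total = worst_case_startup_seconds(16, 20)
minutes, seconds = divmod(total, 60)
print(f"{total} s = {minutes} min {seconds} s")   # 320 s = 5 min 20 s

# For comparison, the same bound with the original 5 second timer.
print(worst_case_startup_seconds(16, 5), "s")     # 80 s
```

The bound scales linearly with the timeout, so each extra second on Console Open Timeout can add up to 16 seconds to the worst-case startup.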
 | 
| 865.14 | 5½ minutes!!! | SNOOTY::HAWLEYI | Mr Flibble says: Game over boys | Tue Jul 18 1995 17:23 | 11 | 
|  |     
    Philip,
    
    UUUUuuuuuuuuuurgh!
    
    I'll tell him. I'm not very conversant with communications so I can't
    suggest where the problem may lie but we will work something out.
    
    Thanks for all your help,
    
    Ian.
 | 
| 865.15 | multiple LAT Links! | 60549::SIMMONDS | Universe of Indifference | Mon Mar 11 1996 23:59 | 13 | 
|  |     Re: .*
    
    There is definitely a case for a longer default interval for the
    Console Open Timeout value: the configuration in .0 matches the
    one that my Customer is using and we too saw the failure to connect to
    any terminal server ports on servers connected to LAT LINKs other than
    the default (LAT$LINK). Obviously the additional time is taken by
    LTDRIVER/LATACP trying to reach the server via each LAT link in turn.
    
    Where should I enter a QAR for this?
    
    Thanks,
    John.
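
If the links really are tried one after another, as suggested above, the time to reach a server grows with its position in the link order. The sketch below is a toy model of that idea, not a description of how LTDRIVER/LATACP actually behaves: the link name LAT$LINK_2, the 6 second per-link wait and the 0.5 second connect time are all invented for illustration.

```python
def time_to_reach(links, serving_link, per_link_wait_s, connect_s):
    """Elapsed seconds when links are tried in order.

    Every link tried before the one that reaches the server is assumed
    to cost per_link_wait_s before the next link is attempted.
    """
    misses = links.index(serving_link)
    return misses * per_link_wait_s + connect_s


CONSOLE_OPEN_TIMEOUT_S = 5.0            # the default discussed in this thread
links = ["LAT$LINK", "LAT$LINK_2"]      # second name is hypothetical

for link in links:
    elapsed = time_to_reach(links, link, per_link_wait_s=6.0, connect_s=0.5)
    verdict = "ok" if elapsed <= CONSOLE_OPEN_TIMEOUT_S else "times out"
    print(f"server on {link}: {elapsed:.1f} s -> {verdict}")
```

Under these made-up numbers a server on the default link connects well inside the timeout, while one on the second link exceeds it, which is the shape of the failure reported here.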
 | 
| 865.16 |  | CSC32::BUTTERWORTH | Gun Control is a steady hand. | Tue Mar 12 1996 12:17 | 7 | 
|  |     John,
      No QAR for released versions so please IPMT this. What's the maximum
    value you have found necessary?
    
    Regards,
       Dan
    
 | 
| 865.17 | Temp. workaround | 16660::ADKINS |  | Tue Mar 12 1996 15:02 | 8 | 
|  |     Well, one quick but sleazy workaround I've found is to define a service
    on the server. I was getting the timeout problem, but after defining
    a service on the server, my connections came up quickly. It looks like
    the service broadcast enters the server node information (LAT link and
    address) in the LAT database.
    
    Jim Adkins
    
 | 
| 865.18 |  | CSC32::BUTTERWORTH | Gun Control is a steady hand. | Wed Mar 13 1996 12:23 | 3 | 
|  |     That's a great tip Jim. Thanks much!!
    
    Dan
 |