| T.R | Title | User | Personal Name
 | Date | Lines | 
|---|
| 3544.1 |  | TOOK::SWIST | Jim Swist LKG2-2/T2 DTN 226-7102 | Wed Aug 12 1992 12:13 | 2 | 
|  |     setenv MCC_LOG 0x10000 and rerun the first test....
    
 | 
| 3544.2 |  | MICROW::LIM |  | Thu Aug 13 1992 09:22 | 37 | 
|  |     I'm having the same problem in my regression test collection:
    %MCC-E-RECEIVEERROR, error trying to receive a packet.
    
    I enroll the mcc process in the prologue, then sleep for 30 seconds.
    The first call to mcc in the first test fails with the error.  It does
    not always happen, but it does about 80% of the time.
    
    When I turned on logging, the following appears:
    
    %MCC-I-LOG, MCC_LOG = 10000
    RPC_LOG: REG CONN-OK: frm id=1, to id=16
    RPC_LOG: SEND: frm id=1, to id=16
    RPC_LOG: SEND: frm id=1, to id=16
    RPC_LOG: DISCONN-OK, id=1
    RPC_LOG: REG DISCONN: frm id=1, to id=16
    %MCC-E-RECEIVEERROR, error trying to receive a packet
    
    The next call, which succeeded, has the following log:
    DECmcc (T1.2.7)
    
    %MCC-I-LOG, MCC_LOG = 10000
    
    RPC_LOG: CONN-FAIL: frm id=1, to id=16
    RPC_LOG: DISCONN-OK, id=1
    RPC_LOG: REG DISCONN: frm id=1, to id=16
    Starting MM mcc_tps_am (enroll ID 16) from MM enroll id 1
    RPC_LOG: REG CONN-OK: frm id=1, to id=16
    RPC_LOG: SEND: frm id=1, to id=16
    RPC_LOG: SEND: frm id=1, to id=16
    RPC_LOG: RECV: frm id=16, to id=1
    
    TPCONTROLLER LOCAL_NS:.servershow
    AT YYYY-MM-DD-HH:MM:SS
    
    It appears the first call never started mcc_tps_am, but the second call
    did, but why?
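
To compare the two traces mechanically: the failing call shows SEND lines but no RECV, while the successful one ends with a RECV. Below is a minimal sketch in POSIX sh (the scripts in this thread are csh). The function name `classify_rpc_trace` and its three output labels are made up for illustration, and it assumes only the `RPC_LOG:` line formats that appear in the logs above.

```shell
#!/bin/sh
# Classify one MCC_LOG trace read from stdin.
# Prints "reply-seen" if an "RPC_LOG: RECV:" line follows at least one SEND,
# "no-reply" if requests were sent but nothing came back,
# and "no-send" if no SEND line appears at all.
classify_rpc_trace() {
    awk '
        /RPC_LOG: SEND:/ { sent++ }
        /RPC_LOG: RECV:/ { if (sent) recv++ }
        END {
            if (!sent)     print "no-send"
            else if (recv) print "reply-seen"
            else           print "no-reply"
        }
    '
}
```

Feeding it the first trace above should report `no-reply`, which matches the observation that mcc_tps_am was never started for that call.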
                         
 | 
| 3544.3 | Please try the V1.2 SSB kit | TOOK::GUERTIN | It fall down, go boom | Thu Aug 13 1992 11:46 | 4 | 
|  |     I believe there are a couple of bug fixes in V1.2.0 that could solve
    your receive errors.
    
    -Matt.
 | 
| 3544.4 |  | MICROW::LIM |  | Thu Aug 13 1992 13:11 | 3 | 
|  | I'm just trying to understand... why does this problem happen?
- Kyungae
 | 
| 3544.5 | Bug somewhere | TOOK::MINTZ | Erik Mintz, dtn 226-5033 | Thu Aug 13 1992 14:07 | 6 | 
|  | This can happen if a management module crashes.
However, there were some problems in the T1.2.7 RPC mechanism;
that is why Matt suggests that you upgrade.
-- Erik
 | 
| 3544.6 | will MCC set $status? | MACROW::LIM |  | Fri Aug 14 1992 10:26 | 4 | 
|  |     If %MCC-E-RECEIVEERROR is returned, will $status be set to a certain
    value?
    
    Kyungae
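
Until the `$status` question is settled, one cautious option is not to depend on it alone and to scan the command output for an `%MCC-E-` line as well. Below is a minimal sketch in POSIX sh (not csh). `run_checked` is a hypothetical helper, and it only recognizes the `-E-` severity seen in this thread.

```shell
#!/bin/sh
# Run a command, echo its combined output, and return non-zero if either
# the command itself failed or its output contains an "%MCC-E-" error line.
run_checked() {
    rc=0
    out=$("$@" 2>&1) || rc=$?
    printf '%s\n' "$out"
    case $out in
        *%MCC-E-*) return 1 ;;   # error text found, whatever the exit status was
    esac
    return "$rc"
}
```

In a test script this could wrap each call, for example `run_checked manage show entity1 name1 attr1 || echo "test failed"`.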
 | 
| 3544.7 | even worse with new kit.... | MICROW::SEVIGNY |  | Wed Aug 26 1992 14:56 | 8 | 
|  |     
    We took the advice, and upgraded to 1.2
    
    The "Error receiving packet" occurs more frequently now.  What should
    we do now?
    
    Marc
    
 | 
| 3544.8 | Something weird must be happening at enrollment | TOOK::GUERTIN | It fall down, go boom | Wed Aug 26 1992 15:33 | 18 | 
|  |     Why not set the log bit for the background process?  Perhaps the MM is
    self-destructing.  The foreground process (FCL) seems to be behaving
    itself.  I'm assuming that the FCL process is starting the background
    MM during enrollment.  The MM must be dying because later when you try
    to access it again, a message is displayed saying that it is starting
    it up (again).
    
    Also you could try running the MM in the foreground:
    
    % /<your-path>/<your-MM-name> 16 Y
                                  ^  ^
                                  |  |
         your MM's enrollment id -+  |
                                     +-- Enroll the MM [Y/N]?
    
    See if any error messages get displayed.
    
    -Matt.
 | 
| 3544.9 |  | MICROW::SEVIGNY |  | Thu Aug 27 1992 13:47 | 12 | 
|  |     
    Well, I took your advice, and set the MCC_LOG env var, and reran the
    tests while running the AM in the foreground.  Almost every request to
    the AM resulted in a segmentation fault.  
    
    When I used the debugger to determine where the AM dies, it seems to
    often die in MCC's RPC. 
    
    I hope that it is safe to assume that there are no incompatibilities
    between MCC's RPC and DCE's RPC, right?  Because the AM uses DCE RPC to
    communicate with the agent.  Sound suspicious?
    
 | 
| 3544.10 | Dispatch table in synch? | TOOK::MINTZ | Erik Mintz, dtn 226-5033 | Thu Aug 27 1992 13:52 | 9 | 
|  | Are you sure that the version of the AM that you are running is
EXACTLY the same as the one last enrolled?  And are you sure that
nobody else is writing to your dispatch table?
This kind of symptom often happens when the dispatch table gets out
of synch with the module (e.g., when the module is re-linked).
-- Erik
 | 
| 3544.11 |  | MICROW::SEVIGNY |  | Thu Aug 27 1992 16:00 | 37 | 
|  |     
    
     These are the steps that I took.
    
    1. Log onto a node with NO other users.
    2. manage enroll mcc_tps_am (MCC_MMEXE_LOCATION points to a known
    executable).
    3. manually kill the AM. (kill <pid>)
    4. make sure there are no other AMs running. (there were none)
    5. setenv MCC_LOG 0x10000
    6. mcc_tps_am 16 Y &  
    7. sleep 10  (give it some time to initialize)
    8. manage create ......
    
    When I look at /usr/mcc/mcc_system, I notice that
    mcc_dispatch_table.dat has been updated.  So I assume that the enroll
    did what it was supposed to do.
    
    This is the result of the most recent fault:  (doesn't look very
    meaningful to me)
    
    dbx /pdir/ptpmresults/cd5a_debug/mcc_tps_am
    core
    dbx version 2.10.1
    Type 'help' for help.
    Corefile produced from file "mcc_fcl_pm"
    Child died at pc 0x48050c of signal : Segmentation fault
    reading symbolic information ...
    warning: volatile variable in symbol table -- $datacache zeroed
    
    [using memory image in core]
    (dbx) t
    >  0 validate_time_now(0x0, 0x0, 0x0, 0x0, 0x0)
    ["mcc_desframe_internal.c":1895]
    (dbx)
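
Steps 6 and 7 above wait a fixed 10 seconds without checking that the AM survived initialization. Below is a sketch of one way to check, in POSIX sh rather than csh. `start_and_verify` is a hypothetical helper, and `kill -0` only tests that the process still exists, not that the AM is ready to accept requests.

```shell
#!/bin/sh
# Start a command in the background, wait a few seconds, and report
# whether it is still running.
# Usage: start_and_verify SECONDS command [args...]
start_and_verify() {
    wait_secs=$1
    shift
    "$@" &
    bg_pid=$!
    sleep "$wait_secs"
    if kill -0 "$bg_pid" 2>/dev/null; then
        echo "still running (pid $bg_pid)"
        return 0
    fi
    # The process has already exited; collect its exit status for the report.
    status=0
    wait "$bg_pid" || status=$?
    echo "exited early with status $status"
    return 1
}
```

Something like `start_and_verify 10 mcc_tps_am 16 Y || exit 1` ahead of the first `manage create` would stop the run as soon as the AM dies during startup, instead of surfacing later as a receive error.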
    
    
 | 
| 3544.12 |  | MICROW::SEVIGNY |  | Thu Aug 27 1992 16:31 | 6 | 
|  |     
    Again, I don't have to worry about CMA compatibility, do I?  The AM
    links in DECthreads version V1.10-030.
    
    I seem to remember hearing that DCE and MCC need to be in synch.
    
 | 
| 3544.13 | In case this matters | MICROW::SEVIGNY | Unity without Uniformity | Fri Aug 28 1992 10:38 | 18 | 
|  |     
    As an addendum, I might warn you that our test scripts are written in
    this manner:
    
    #!/bin/csh              
    source $TSRC/env_vars_setup.csh
    manage enroll mcc_tps_am
    
    manage create entity1 name1 attr1=foo attr2=bar
    
    manage show entity1  name1 attr1
    
    manage set entity1 name1 attr2=junk
    
    .
    .
    .
    
 | 
| 3544.14 |  | TOOK::SWIST | Jim Swist LKG2-2/T2 DTN 226-7102 | Fri Aug 28 1992 11:43 | 12 | 
|  |     I'm not sure it matters but why write your scripts so that you bring
    FCL up and down completely for each command?
    
    manage <<%
    enroll....
    set...
    show...
    %
    
    would be a lot more efficient.
    
    
 | 
| 3544.15 |  | MICROW::SEVIGNY | Unity without Uniformity | Fri Aug 28 1992 13:43 | 4 | 
|  |     
    We intersperse comment-like "echoes" between the commands.
    
    
 | 
| 3544.16 |  | MICROW::SEVIGNY |  | Mon Aug 31 1992 09:34 | 10 | 
|  |     
    
    Is there any further information that I can provide that would help to
    diagnose this problem?  We are really desperate...  Our deadlines are
    being affected by not being able to close this issue.
    
    Thanks,
    
    Marc
    
 | 
| 3544.17 | QAR 3395 | TOOK::MINTZ | Erik Mintz, dtn 226-5033 | Mon Aug 31 1992 09:43 | 2 | 
|  | Entered as QAR 3395 at high priority
 |