
Conference help::decnet-osi_for_vms

Title: DECnet/OSI for OpenVMS
Moderator: TUXEDO::FONSECA
Created: Thu Feb 21 1991
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 3990
Total number of notes: 19027

3857.0. "fal process hung" by PRSSOS::MAGENC () Thu Jan 30 1997 08:39

    
    


AlphaServer 2100 - VMS 6.2 - DECnet/OSI 6.3 ECO5 - UCX 3.3 ECO 13
Node name = COSMET
VAX 6510         - VMS 6.2 - DECnet/OSI 6.3 ECO6 - Pathway 2.5.1
Node name = SOUAL
DECnet/OSI over TCP/IP (Pathway)

A batch process running on the Alpha system performs a DECnet file transfer
to the VAX system.
The following problem appeared once:
- The copy operation seems to have succeeded (from the user's point of view),
but the batch process doesn't complete on the Alpha system, and
on the VAX the associated FAL process remains, never reverts to
a server_xxxx process, and doesn't close the output file.
- The customer had to STOP/ID the FAL process on the VAX; after that,
the output file was there, closed, and fully transferred.

Now, my questions:
1) Where could this problem come from? I suppose the DECnet link
was hung...
2) Given that this problem occurred only once between these 2 systems,
and once between 2 other systems (2 VAXes running DECnet/OSI 6.3 ECO 6
over a TCP/IP stack), how could it be investigated further?
Note: the customer refuses to consider forcing a crash dump.

		Thanks in advance, and best regards, Michele.

PS: please find below some related information provided by the customer,
especially the contents of the NET$SERVER.LOG file, and results
showing that the output file was not "locked by another user" at the
time the copy was "hung".
All this information was collected by the customer before he
killed the FAL process on the VAX.

On VAX System :
---------------

SOUAL>sh syst/net

OpenVMS V6.2  on node SOUAL  28-JAN-1997 09:41:14.12  Uptime  0 21:00:50
  Pid    Process Name    State  Pri      I/O       CPU       Page flts  Pages
00000689 FAL_14040013    LEF      6     1762   0 00:00:04.44       946    502  N

SOUAL>ana/syst    

OpenVMS (TM) VAX System analyzer

SDA> sh proc/id=00000689

Process index: 0089   Name: FAL_14040013   Extended PID: 00000689
-----------------------------------------------------------------
Status : 00240001 res,phdres,netwrk
Status2: 00000001 quantum_resched
PCB address              88303680    JIB address              880ECA40
PHD address              928ADE00    Swapfile disk address    00000000
Master internal PID      00030089    Subprocess count                0
Internal PID             00030089    Creator internal PID     00000000
Extended PID             00000689    Creator extended PID     00000000
State                       LEF      Termination mailbox          0014
Current priority                6    AST's enabled                KESU
Base priority                   4    AST's active                 NONE
UIC                [00010,000011]    AST's remaining                37
Mutex count                     0    Buffered I/O count/limit       38/40
Waiting EF cluster              0    Direct I/O count/limit         40/40
Starting wait time       1B001B1A    BUFIO byte count/limit      31424/31424
Event flag wait mask     FFFFFFFD    # open files allowed left     296
Local EF cluster 0       60000035    Timer entries allowed left     39
Local EF cluster 1       80000000    Active page table count         0
Global cluster 2 pointer 00000000    Process WS page count         365
Global cluster 3 pointer 00000000    Global WS page count          137
SDA> 
SDA> sh proc/id=00000689/cha

Process index: 0089   Name: FAL_14040013   Extended PID: 00000689
-----------------------------------------------------------------


                            Process active channels
                            -----------------------

Channel  Window           Status        Device/file accessed
-------  ------           ------        --------------------
  0010  00000000                        DSA12:
  0020  87DE91C0                        DSA10:[VMS$COMMON.SYSEXE]FAL.EXE;1 (section file)
  0030  00000000             Busy       MBA7180:
  0040  882FB618             Busy       NET981:
  0050  8820F980                        DSA12:[SOUAL.DTRF]COARMV.DAT;1
  0060  87DE5300                        DSA10:[VMS$COMMON.SYSEXE]DCL.EXE;1 (section file)
  0080  87E08040                        DSA10:[VMS$COMMON.SYSLIB]DCLTABLES.EXE;52 (section file)
  0090  87F83900                        DSA12:[SOUAL.DTRF]NET$SERVER.LOG;544
  00A0  87F63C80                        DSA10:[VMS$COMMON.SYSEXE]NET$SERVER.COM;1
SDA> sh dev mba7180         

I/O data structures
-------------------
MBA7180                                 MBX               UCB address:  882F9F80

Device status:   00000010 online
Characteristics: 0C150001 rec,shr,avl,mbx,idv,odv
                 00000200 nnm

Owner UIC [000010,000011]   Operation count          0   ORB address    88220CC0
      PID        00000000   Error count              0   DDB address    873496F0
Class/Type          A0/01   Reference count          2   DDT address    87291908
Def. buf. size         64   BOFF                  0000   CRB address    87349E84
DEVDEPEND        00000000   Byte count            0000   I/O wait queue    empty
DEVDEPND2        00000000   SVAPTE            00000000                          
FLCK index             28   DEVSTS                0002                          
DLCK address     87344200                                                       
Charge PID       00030089                                                       

        *** I/O request queue is empty ***
SDA> sh dev net981 

I/O data structures
-------------------
NET981                                  Unknown           UCB address:  882FB580

Device status:   00010010 online,deleteucb
Characteristics: 0C1C2000 net,avl,mnt,mbx,idv,odv
                 00000000 

Owner UIC [000010,000011]   Operation count          1   ORB address    87EFAA40
      PID        00030089   Error count              0   DDB address    87E1F840
Class/Type          00/00   Reference count          1   DDT address    884E0190
Def. buf. size        256   BOFF                  0000   VCB address    884E57D8
DEVDEPEND        00000001   Byte count            0000   CRB address    87E1F8C0
DEVDEPND2        00000000   SVAPTE            00000000   AMB address    882F9F80
FLCK index             34   DEVSTS                0002   I/O wait queue    empty
DLCK address     87345A00                                                       
Charge PID       00030089                                                       

        *** I/O request queue is empty ***




                --- Volume Control Block (VCB) 884E57D8 ---

Transactions           0    Mount count          111    AQB address     00000000
SDA>  Exit 
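
The LEF state and the "Event flag wait mask" field in the SHOW PROCESS display above can be cross-checked with a few lines of Python. This is only a sketch. It assumes that SDA prints the wait mask in complemented form (a clear bit marks a flag the process is waiting for) and that "Local EF cluster 0" is the current state of event flags 0 to 31. Both assumptions should be checked against the SDA documentation for the version in use. The function names are made up for this illustration.

```python
def waited_event_flags(wait_mask_hex, cluster=0):
    """Flags being waited for, ASSUMING the displayed mask is complemented
    (bit clear = flag is waited for)."""
    mask = int(wait_mask_hex, 16)
    return [32 * cluster + bit for bit in range(32) if not (mask >> bit) & 1]


def set_event_flags(cluster_hex, cluster=0):
    """Flags currently set in a 32-bit event flag cluster value."""
    value = int(cluster_hex, 16)
    return [32 * cluster + bit for bit in range(32) if (value >> bit) & 1]


# Values copied from the FAL_14040013 display on SOUAL.
waited = waited_event_flags("FFFFFFFD")   # "Event flag wait mask"
current = set_event_flags("60000035")     # "Local EF cluster 0"

print("waiting for flags:", waited)
print("flags currently set:", current)
print("wait satisfied:", any(flag in current for flag in waited))
```

Under those assumptions the FAL process is waiting on a single event flag that is not set, which fits a process sitting in LEF on an I/O that never completes. It does not say which of the two Busy channels that I/O belongs to.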

SOUAL>type DSA12:[SOUAL.DTRF]NET$SERVER.LOG;544
$!                      SYS$MANAGER:SYLOGIN.COM
$ set noverify

        --------------------------------------------------------

        Connect request received at 27-JAN-1997 23:13:57.17
            from remote process COSMET::"0=TRANSFERT"
            for object "SYS$COMMON:[SYSEXE]FAL.EXE"

        --------------------------------------------------------


========================================================
FAL V6.2-10 started execution on 27-JAN-1997 23:13:57.24
  with SYS$NET = COSMET::"0=TRANSFERT" and
  with FAL$LOG = 1/DISABLE=8

Requested file access operation: Create file    
Specified file: FDISK2:[SOUAL.DTRF]COARMV.DAT;
Resultant file: FDISK2:[SOUAL.DTRF]COARMV.DAT;1

<----- Note: the file ends there, even after the FAL process has been stopped.
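
To get an idea of how long the FAL process had been sitting there, the connect time in the log can be compared with the timestamp of the SHOW SYSTEM display taken on SOUAL. A minimal sketch follows; the helper name parse_vms_time is made up for this illustration, and it only handles the absolute time format that appears in this note.

```python
import re
from datetime import datetime

MONTHS = {name: number for number, name in enumerate(
    "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split(), start=1)}

# Matches times such as 27-JAN-1997 23:13:57.17 (hundredths of a second).
_VMS_TIME = re.compile(
    r"(\d{1,2})-([A-Z]{3})-(\d{4}) (\d{2}):(\d{2}):(\d{2})\.(\d{2})")


def parse_vms_time(text):
    """Extract the first VMS-style absolute time found in a line of text."""
    match = _VMS_TIME.search(text)
    if match is None:
        raise ValueError("no VMS time found in: " + text)
    day, month, year, hour, minute, second, hundredths = match.groups()
    return datetime(int(year), MONTHS[month], int(day),
                    int(hour), int(minute), int(second),
                    int(hundredths) * 10_000)


# Lines copied from NET$SERVER.LOG and from the "sh syst/net" output.
connect = parse_vms_time(
    "Connect request received at 27-JAN-1997 23:13:57.17")
observed = parse_vms_time(
    "OpenVMS V6.2  on node SOUAL  28-JAN-1997 09:41:14.12  Uptime  0 21:00:50")

print(observed - connect)  # prints 10:27:16.950000
```

So the FAL process was still present about ten and a half hours after the connect request was received.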

SOUAL>run directory_eexe:rmslock
Enter file name: DSA12:[SOUAL.DTRF]COARMV.DAT;1

 1 locks on resource "RMS$..6....FDISK2      ..." at node SOUAL::           
 File name is: DSA12:[SOUAL.DTRF]COARMV.DAT;1

      PID     NODE   process name    LOCK ID  RQ GR QUEUE    MSTLKID
    000016DC SOUAL  MERCUSH          52000AA0 NL NL GRANTED

<------ Note: MERCUSH is the user entering the above command.
This shows that nothing else holds RMS locks on the output file.

SOUAL>run directory_eexe:files_info
_Filespec: DSA12:[SOUAL.DTRF]COARMV.DAT;1

FILE: _DSA12:[SOUAL.DTRF]COARMV.DAT;1
Total access count of 1, XQP access 1, writers 1, size 0/17703

   PID     USERNAME      READS    WRITES   ACCESS CHARACTERISTICS
-------- ------------  --------  --------  ----------------------
00000689 TRANSFERT            5       186  Write, Sequential, NoReadShr

SOUAL>dir/size=all DSA12:[SOUAL.DTRF]COARMV.DAT;1

Directory DSA12:[SOUAL.DTRF]

COARMV.DAT;1           17703/17703  

Total of 1 file, 17703/17703 blocks.

------------------------------------------------------------------------------- 

On Alpha system:
----------------

COSMET>sh syst/bat

OpenVMS V6.2  on node COSMET  28-JAN-1997 09:44:51.82  Uptime  39 20:33:41
  Pid    Process Name    State  Pri      I/O       CPU       Page flts  Pages
0000EB25 BATCH_92        LEF      4     1807   0 00:00:02.14       220     23  B

COSMET>ana/syst

OpenVMS (TM) Alpha System analyzer

SDA> sh proc /id= 0000EB25

Process index: 0025   Name: BATCH_92   Extended PID: 0000EB25
-------------------------------------------------------------
Process status:        00044001  RES,BATCH,PHDRES
Required capabilities: 0000000C  QUORUM,RUN

PCB address              80BD5600    JIB address              80AC6A80
PHD address              83658000    Swapfile disk address    00000000
Master internal PID      00EB0025    Subprocess count                0
Internal PID             00EB0025    Creator internal PID     00000000
Extended PID             0000EB25    Creator extended PID     00000000
State                       LEF      Termination mailbox          0000
Previous CPU Id          00000000    Current CPU Id           00000000
Previous ASNSEQ  000000000000092E    Previous ASN     000000000000003B
Current priority                4    # of threads     0000000000000000
Initial process priority        2    Delete pending count         0
Base priority                   2    AST's active                 NONE
UIC                [00010,000011]    AST's remaining               247
Mutex count                     0    Buffered I/O count/limit      148/150
Waiting EF cluster              0    Direct I/O count/limit        150/150
Abs time of last event   14868ADE    BUFIO byte count/limit      98592/98592
Event flag wait mask     BFFFFFFF    # open files allowed left      96

Swapped copy of LEFC0    00000000    Timer entries allowed left      9
Swapped copy of LEFC1    00000000    Active page table count         0
Global cluster 2 pointer 00000000    Process WS page count          22
Global cluster 3 pointer 00000000    Global WS page count            1
SDA> sh proc /id= 0000EB25/cha

Process index: 0025   Name: BATCH_92   Extended PID: 0000EB25
-------------------------------------------------------------


                            Process active channels
                            -----------------------

Channel  Window           Status        Device/file accessed
-------  ------           ------        --------------------
  0010  00000000                        DKA400:
  0020  8093EC80                        DKA0:[VMS$COMMON.SYSEXE]COPY.EXE;1 (section file)
  0030  8094A900                        DKA0:[VMS$COMMON.SYSLIB]LIBOTS.EXE;1 (section file)
  0040  8094A7C0                        DKA0:[VMS$COMMON.SYSLIB]LIBRTL.EXE;1 (section file)
  0050  8093EF80                        DKA0:[VMS$COMMON.SYSEXE]DCL.EXE;1 (section file)
  0060  80946400                        DKA0:[VMS$COMMON.SYSLIB]DCLTABLES.EXE;113 (section file)
  0070  80AFBB00                        DKA400:[COSMET.DLOG]COPY_COARMV_SOUAL.LOG;26
  0080  80AD6740                        DKA400:[COSMET.DTMP]COPY_COARMV_SOUAL.COM;340
  0090  00000000                        NET2959:
  00A0  00000000                        DKA400:
  00B0  80B5ABC0                        DKA400:[COSMET.DDAT]COARMV.DAT;1
  00C0  808E2F44             Busy       NET2960:
SDA> sh dev net2960

I/O data structures
-------------------
NET2960                       Unknown                     UCB address:  808E2E80

Device status:   00010010 online,deleteucb
Characteristics: 0C1C2000 net,avl,mnt,mbx,idv,odv
                 00000000 

Owner UIC [000010,000011]   Operation count          1   ORB address    80CB8540
      PID        00EB0025   Error count              0   DDB address    80A330C0
Class/Type          00/00   Reference count          1   DDT address    89A26600
Def. buf. size        256   BOFF              00000000   VCB address    89A261A0
DEVDEPEND        00000001   Byte count        00000000   CRB address    80A34180
DEVDEPND2        00000000   SVAPTE            00000000   I/O wait queue 808E2EEC
DEVDEPND3        00000000   DEVSTS            00000002                          
FLCK index             3A                                                       
DLCK address     8312B500                                                       
Charge PID       00EB0025                                                       

        *** I/O request queue is empty ***



                --- Volume Control Block (VCB) 89A261A0 ---

Transactions           0    Mount count           72    AQB address     00000000
SDA>  Exit 
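
The "count/limit" pairs in the two SHOW PROCESS displays give a rough count of I/O requests still outstanding on each side. The sketch below assumes that the first number of each pair is the remaining quota, so that limit minus count is the number of requests in flight. That reading is an assumption and should be confirmed in the SDA documentation. The function name outstanding is made up for this illustration.

```python
def outstanding(count_limit):
    """Requests in flight, ASSUMING the display is 'remaining/limit'."""
    remaining, limit = (int(part) for part in count_limit.split("/"))
    return limit - remaining


# Values copied from the SDA displays for the two hung processes.
readings = {
    "SOUAL  FAL_14040013": {"buffered": "38/40", "direct": "40/40"},
    "COSMET BATCH_92": {"buffered": "148/150", "direct": "150/150"},
}

for process, pairs in readings.items():
    summary = ", ".join(
        f"{kind} I/O outstanding: {outstanding(pair)}"
        for kind, pair in pairs.items())
    print(f"{process}: {summary}")
```

Read this way, neither process has a direct (disk) I/O pending and each has two buffered I/Os pending. On SOUAL that number matches the two channels marked Busy (MBA7180 and NET981).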

COSMET>run directory_eexe:rmslock
Enter file name: DKA400:[COSMET.DDAT]COARMV.DAT;1

 1 locks on resource "RMS$.......FDISK4      ..." at node COSMET::          
 File name is: DKA400:[COSMET.DDAT]COARMV.DAT;1

      PID     NODE   process name    LOCK ID  RQ GR QUEUE    MSTLKID
    00011978 COSMET MERCUSH          7F000037 NL NL GRANTED
                      ^
                      |
                      --------- Customer's user name.
 
COSMET>run directory_eexe:files_info
_Filespec: DKA400:[COSMET.DDAT]COARMV.DAT;1

FILE: _COSMET$DKA400:[COSMET.DDAT]COARMV.DAT;1
Total access count of 1, XQP access 1, writers 0, size 17703/17703

   PID     USERNAME      READS    WRITES   ACCESS CHARACTERISTICS
-------- ------------  --------  --------  ----------------------
0000EB25 TRANSFERT          199         0  Read, Sequential, NoWriteShr

COSMET>dir/size=all DIRECTORY_DDAT:COARMV.DAT

Directory FDISK4:[COSMET.DDAT]

COARMV.DAT;1           17703/17703  

Total of 1 file, 17703/17703 blocks.
