Title: DECnet/OSI for OpenVMS
Moderator: TUXEDO::FONSECA
Created: Thu Feb 21 1991
Last Modified: Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 3990
Total number of notes: 19027
3857.0. "fal process hung" by PRSSOS::MAGENC () Thu Jan 30 1997 08:39
AlphaServer 2100 - VMS 6.2 - DECnet/OSI 6.3 ECO5 - UCX 3.3 ECO 13
Node name = COSMET
VAX 6510 - VMS 6.2 - DECnet/OSI 6.3 ECO6 - Pathway 2.5.1
Node name = SOUAL
DECnet/OSI over TCP/IP (Pathway)
A batch process running on the Alpha system performs a DECnet file transfer
to the VAX system.
The following problem appeared once:
- The copy operation seems to have succeeded (from the user's point of view),
but the batch process doesn't complete on the Alpha system, and
on the VAX the associated FAL process remains, never reverts to
a server_xxxx process, and doesn't close the output file.
- The customer had to stop/id the FAL process on the VAX, and then
the output file was there, closed and fully transferred.
Now, my questions:
1) Where could this problem come from? I suppose the DECnet link
was hung...
2) Given that this problem occurred only once between these 2 systems,
and once between 2 other systems (2 VAXes running DECnet/OSI 6.3 ECO 6
over a TCP/IP stack), how could it be investigated further?
Note: the customer refuses to consider forcing a crash dump.
Thanks in advance, and best regards, Michele.
PS: please find below some related information provided by the customer,
especially the contents of the net$server.log file, and results
showing that the output file was not "locked by another user" at the
time the copy was "hung".
All this information was collected by the customer before he
killed the FAL process on the VAX.
On VAX System :
---------------
SOUAL>sh syst/net
OpenVMS V6.2 on node SOUAL 28-JAN-1997 09:41:14.12 Uptime 0 21:00:50
Pid Process Name State Pri I/O CPU Page flts Pages
00000689 FAL_14040013 LEF 6 1762 0 00:00:04.44 946 502 N
SOUAL>ana/syst
OpenVMS (TM) VAX System analyzer
SDA> sh proc/id=00000689
Process index: 0089 Name: FAL_14040013 Extended PID: 00000689
-----------------------------------------------------------------
Status : 00240001 res,phdres,netwrk
Status2: 00000001 quantum_resched
PCB address 88303680 JIB address 880ECA40
PHD address 928ADE00 Swapfile disk address 00000000
Master internal PID 00030089 Subprocess count 0
Internal PID 00030089 Creator internal PID 00000000
Extended PID 00000689 Creator extended PID 00000000
State LEF Termination mailbox 0014
Current priority 6 AST's enabled KESU
Base priority 4 AST's active NONE
UIC [00010,000011] AST's remaining 37
Mutex count 0 Buffered I/O count/limit 38/40
Waiting EF cluster 0 Direct I/O count/limit 40/40
Starting wait time 1B001B1A BUFIO byte count/limit 31424/31424
Event flag wait mask FFFFFFFD # open files allowed left 296
Local EF cluster 0 60000035 Timer entries allowed left 39
Local EF cluster 1 80000000 Active page table count 0
Global cluster 2 pointer 00000000 Process WS page count 365
Global cluster 3 pointer 00000000 Global WS page count 137
SDA>
SDA> sh proc/id=00000689/cha
Process index: 0089 Name: FAL_14040013 Extended PID: 00000689
-----------------------------------------------------------------
Process active channels
-----------------------
Channel Window Status Device/file accessed
------- ------ ------ --------------------
0010 00000000 DSA12:
0020 87DE91C0 DSA10:[VMS$COMMON.SYSEXE]FAL.EXE;1 (section file)
0030 00000000 Busy MBA7180:
0040 882FB618 Busy NET981:
0050 8820F980 DSA12:[SOUAL.DTRF]COARMV.DAT;1
0060 87DE5300 DSA10:[VMS$COMMON.SYSEXE]DCL.EXE;1 (section file)
0080 87E08040 DSA10:[VMS$COMMON.SYSLIB]DCLTABLES.EXE;52 (section file)
0090 87F83900 DSA12:[SOUAL.DTRF]NET$SERVER.LOG;544
00A0 87F63C80 DSA10:[VMS$COMMON.SYSEXE]NET$SERVER.COM;1
SDA> sh dev mba7180
I/O data structures
-------------------
MBA7180 MBX UCB address: 882F9F80
Device status: 00000010 online
Characteristics: 0C150001 rec,shr,avl,mbx,idv,odv
00000200 nnm
Owner UIC [000010,000011] Operation count 0 ORB address 88220CC0
PID 00000000 Error count 0 DDB address 873496F0
Class/Type A0/01 Reference count 2 DDT address 87291908
Def. buf. size 64 BOFF 0000 CRB address 87349E84
DEVDEPEND 00000000 Byte count 0000 I/O wait queue empty
DEVDEPND2 00000000 SVAPTE 00000000
FLCK index 28 DEVSTS 0002
DLCK address 87344200
Charge PID 00030089
*** I/O request queue is empty ***
SDA> sh dev net981
I/O data structures
-------------------
NET981 Unknown UCB address: 882FB580
Device status: 00010010 online,deleteucb
Characteristics: 0C1C2000 net,avl,mnt,mbx,idv,odv
00000000
Owner UIC [000010,000011] Operation count 1 ORB address 87EFAA40
PID 00030089 Error count 0 DDB address 87E1F840
Class/Type 00/00 Reference count 1 DDT address 884E0190
Def. buf. size 256 BOFF 0000 VCB address 884E57D8
DEVDEPEND 00000001 Byte count 0000 CRB address 87E1F8C0
DEVDEPND2 00000000 SVAPTE 00000000 AMB address 882F9F80
FLCK index 34 DEVSTS 0002 I/O wait queue empty
DLCK address 87345A00
Charge PID 00030089
*** I/O request queue is empty ***
Press RETURN for more.
SDA>
I/O data structures
-------------------
--- Volume Control Block (VCB) 884E57D8 ---
Transactions 0 Mount count 111 AQB address 00000000
SDA> Exit
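A minimal sketch for reading the "Event flag wait mask" values in the SDA displays, under the assumption (not stated in the output itself) that the mask is stored in complemented form, so that the flags being waited for are the bits that are clear. The function names are hypothetical helpers, not part of SDA.

```python
def waited_event_flags(wait_mask_hex, cluster=0):
    """Return event flag numbers whose bits are CLEAR in the wait mask.

    Assumption: the mask is the one's complement of the flags waited for.
    """
    mask = int(wait_mask_hex, 16)
    base = 32 * cluster  # cluster 0 holds flags 0-31, cluster 1 holds 32-63
    return [base + bit for bit in range(32) if not (mask >> bit) & 1]


def flag_is_set(cluster_value_hex, flag):
    """Return True if the given flag (0-31) is set in a local EF cluster value."""
    return bool((int(cluster_value_hex, 16) >> flag) & 1)


# Values copied from the SDA displays in this note.
print(waited_event_flags("FFFFFFFD"))   # FAL process on SOUAL
print(waited_event_flags("BFFFFFFF"))   # batch process on COSMET
print(flag_is_set("60000035", 1))       # local EF cluster 0 on SOUAL
```

If the complement assumption holds, the FAL process is waiting on event flag 1 of cluster 0, and that flag is not set in its local EF cluster 0 (60000035), which would be consistent with the LEF state.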
SOUAL>type DSA12:[SOUAL.DTRF]NET$SERVER.LOG;544
$! SYS$MANAGER:SYLOGIN.COM
$ set noverify
--------------------------------------------------------
Connect request received at 27-JAN-1997 23:13:57.17
from remote process COSMET::"0=TRANSFERT"
for object "SYS$COMMON:[SYSEXE]FAL.EXE"
--------------------------------------------------------
========================================================
FAL V6.2-10 started execution on 27-JAN-1997 23:13:57.24
with SYS$NET = COSMET::"0=TRANSFERT" and
with FAL$LOG = 1/DISABLE=8
Requested file access operation: Create file
Specified file: FDISK2:[SOUAL.DTRF]COARMV.DAT;
Resultant file: FDISK2:[SOUAL.DTRF]COARMV.DAT;1
<----- note: the file ends there, even after the FAL process has been stopped
SOUAL>run directory_eexe:rmslock
Enter file name: DSA12:[SOUAL.DTRF]COARMV.DAT;1
1 locks on resource "RMS$�.6....FDISK2 ..." at node SOUAL::
File name is: DSA12:[SOUAL.DTRF]COARMV.DAT;1
PID NODE process name LOCK ID RQ GR QUEUE MSTLKID
000016DC SOUAL MERCUSH 52000AA0 NL NL GRANTED
<------ note: MERCUSH is the user entering the above command.
This shows that nothing else has RMS locks on the output file.
SOUAL>run directory_eexe:files_info
_Filespec: DSA12:[SOUAL.DTRF]COARMV.DAT;1
FILE: _DSA12:[SOUAL.DTRF]COARMV.DAT;1
Total access count of 1, XQP access 1, writers 1, size 0/17703
PID USERNAME READS WRITES ACCESS CHARACTERISTICS
-------- ------------ -------- -------- ----------------------
00000689 TRANSFERT 5 186 Write, Sequential, NoReadShr
SOUAL>dir/size=all DSA12:[SOUAL.DTRF]COARMV.DAT;1
Directory DSA12:[SOUAL.DTRF]
COARMV.DAT;1 17703/17703
Total of 1 file, 17703/17703 blocks.
-------------------------------------------------------------------------------
On Alpha system:
----------------
COSMET>sh syst/bat
OpenVMS V6.2 on node COSMET 28-JAN-1997 09:44:51.82 Uptime 39 20:33:41
Pid Process Name State Pri I/O CPU Page flts Pages
0000EB25 BATCH_92 LEF 4 1807 0 00:00:02.14 220 23 B
COSMET>ana/syst
OpenVMS (TM) Alpha System analyzer
SDA> sh proc /id= 0000EB25
Process index: 0025 Name: BATCH_92 Extended PID: 0000EB25
-------------------------------------------------------------
Process status: 00044001 RES,BATCH,PHDRES
Required capabilities: 0000000C QUORUM,RUN
PCB address 80BD5600 JIB address 80AC6A80
PHD address 83658000 Swapfile disk address 00000000
Master internal PID 00EB0025 Subprocess count 0
Internal PID 00EB0025 Creator internal PID 00000000
Extended PID 0000EB25 Creator extended PID 00000000
State LEF Termination mailbox 0000
Previous CPU Id 00000000 Current CPU Id 00000000
Previous ASNSEQ 000000000000092E Previous ASN 000000000000003B
Current priority 4 # of threads 0000000000000000
Initial process priority 2 Delete pending count 0
Base priority 2 AST's active NONE
UIC [00010,000011] AST's remaining 247
Mutex count 0 Buffered I/O count/limit 148/150
Waiting EF cluster 0 Direct I/O count/limit 150/150
Abs time of last event 14868ADE BUFIO byte count/limit 98592/98592
Event flag wait mask BFFFFFFF # open files allowed left 96
Press RETURN for more.
SDA>
Process index: 0025 Name: BATCH_92 Extended PID: 0000EB25
-------------------------------------------------------------
Swapped copy of LEFC0 00000000 Timer entries allowed left 9
Swapped copy of LEFC1 00000000 Active page table count 0
Global cluster 2 pointer 00000000 Process WS page count 22
Global cluster 3 pointer 00000000 Global WS page count 1
SDA> sh proc /id= 0000EB25/cha
Process index: 0025 Name: BATCH_92 Extended PID: 0000EB25
-------------------------------------------------------------
Process active channels
-----------------------
Channel Window Status Device/file accessed
------- ------ ------ --------------------
0010 00000000 DKA400:
0020 8093EC80 DKA0:[VMS$COMMON.SYSEXE]COPY.EXE;1 (section file)
0030 8094A900 DKA0:[VMS$COMMON.SYSLIB]LIBOTS.EXE;1 (section file)
0040 8094A7C0 DKA0:[VMS$COMMON.SYSLIB]LIBRTL.EXE;1 (section file)
0050 8093EF80 DKA0:[VMS$COMMON.SYSEXE]DCL.EXE;1 (section file)
0060 80946400 DKA0:[VMS$COMMON.SYSLIB]DCLTABLES.EXE;113 (section file)
0070 80AFBB00 DKA400:[COSMET.DLOG]COPY_COARMV_SOUAL.LOG;26
0080 80AD6740 DKA400:[COSMET.DTMP]COPY_COARMV_SOUAL.COM;340
0090 00000000 NET2959:
00A0 00000000 DKA400:
00B0 80B5ABC0 DKA400:[COSMET.DDAT]COARMV.DAT;1
Press RETURN for more.
SDA>
Process index: 0025 Name: BATCH_92 Extended PID: 0000EB25
-------------------------------------------------------------
Channel Window Status Device/file accessed
------- ------ ------ --------------------
00C0 808E2F44 Busy NET2960:
SDA> sh dev net2960
I/O data structures
-------------------
NET2960 Unknown UCB address: 808E2E80
Device status: 00010010 online,deleteucb
Characteristics: 0C1C2000 net,avl,mnt,mbx,idv,odv
00000000
Owner UIC [000010,000011] Operation count 1 ORB address 80CB8540
PID 00EB0025 Error count 0 DDB address 80A330C0
Class/Type 00/00 Reference count 1 DDT address 89A26600
Def. buf. size 256 BOFF 00000000 VCB address 89A261A0
DEVDEPEND 00000001 Byte count 00000000 CRB address 80A34180
DEVDEPND2 00000000 SVAPTE 00000000 I/O wait queue 808E2EEC
DEVDEPND3 00000000 DEVSTS 00000002
FLCK index 3A
DLCK address 8312B500
Charge PID 00EB0025
*** I/O request queue is empty ***
Press RETURN for more.
SDA>
I/O data structures
-------------------
--- Volume Control Block (VCB) 89A261A0 ---
Transactions 0 Mount count 72 AQB address 00000000
SDA> Exit
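As a rough aid for reading the "count/limit" pairs in the two SDA process displays, the sketch below computes how many I/Os (or bytes) each process still has outstanding. It assumes that SDA prints the remaining quota first and the limit second; the helper name `outstanding` is hypothetical.

```python
def outstanding(count_limit):
    """Return limit minus remaining for an SDA 'count/limit' pair.

    Assumption: the first number is the quota still available,
    the second is the process limit.
    """
    remaining, limit = (int(part) for part in count_limit.split("/"))
    return limit - remaining


# Pairs copied from the SDA displays in this note.
pairs = {
    "SOUAL FAL buffered I/O": "38/40",
    "SOUAL FAL direct I/O": "40/40",
    "SOUAL FAL BUFIO bytes": "31424/31424",
    "COSMET batch buffered I/O": "148/150",
    "COSMET batch direct I/O": "150/150",
    "COSMET batch BUFIO bytes": "98592/98592",
}

for label, pair in pairs.items():
    print(f"{label}: {outstanding(pair)} outstanding")
```

Under that assumption both processes have two buffered I/Os pending and no direct I/O in progress. On the VAX this would match the two channels marked Busy (MBA7180 and NET981).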
COSMET>run directory_eexe:rmslock
Enter file name: DKA400:[COSMET.DDAT]COARMV.DAT;1
1 locks on resource "RMS$.......FDISK4 ..." at node COSMET::
File name is: DKA400:[COSMET.DDAT]COARMV.DAT;1
PID NODE process name LOCK ID RQ GR QUEUE MSTLKID
00011978 COSMET MERCUSH 7F000037 NL NL GRANTED
^
|
--------- Customer's user name.
COSMET>run directory_eexe:files_info
_Filespec: DKA400:[COSMET.DDAT]COARMV.DAT;1
FILE: _COSMET$DKA400:[COSMET.DDAT]COARMV.DAT;1
Total access count of 1, XQP access 1, writers 0, size 17703/17703
PID USERNAME READS WRITES ACCESS CHARACTERISTICS
-------- ------------ -------- -------- ----------------------
0000EB25 TRANSFERT 199 0 Read, Sequential, NoWriteShr
COSMET>dir/size=all DIRECTORY_DDAT:COARMV.DAT
Directory FDISK4:[COSMET.DDAT]
COARMV.DAT;1 17703/17703
Total of 1 file, 17703/17703 blocks.
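For reference, the block counts reported by DIRECTORY/SIZE=ALL can be converted to bytes using the 512-byte OpenVMS disk block. The short sketch below does this for the used/allocated pair shown on both nodes; the function name is a hypothetical helper.

```python
VMS_BLOCK_BYTES = 512  # size of one OpenVMS disk block


def size_pair_to_bytes(size_pair):
    """Convert a 'used/allocated' block pair into a (used, allocated) byte tuple."""
    used, allocated = (int(part) for part in size_pair.split("/"))
    return used * VMS_BLOCK_BYTES, allocated * VMS_BLOCK_BYTES


used_bytes, allocated_bytes = size_pair_to_bytes("17703/17703")
print(used_bytes, allocated_bytes)
```

Both nodes report the same pair, so the source and destination files have the same size in blocks.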