| Title: | DIGITAL UNIX (FORMERLY KNOWN AS DEC OSF/1) | 
| Notice: | Welcome to the Digital UNIX Conference | 
| Moderator: | SMURF::DENHAM | 
| Created: | Thu Mar 16 1995 | 
| Last Modified: | Fri Jun 06 1997 | 
| Last Successful Update: | Fri Jun 06 1997 | 
| Number of topics: | 10068 | 
| Total number of notes: | 35879 | 
    A customer is testing an AlphaStation 500/500 in his homogeneous 
    SUN environment. They are experiencing massive NFS performance
    problems.
    
    Digital UNIX V4.0b
    Sun OS 4.1.
    
    They use automount although we could see no difference to normal
    NFS-mounts. ftp between the two systems gives the optimal Ethernet
    performance ( ~1MB/s). We did not see any network problems
    (Collisions,...). NFS-Performance with DU as NFS-server was between 300
    and 400 KB/s with loaded network and busy SUN NFS-server; they considered
    this to be good. NFS-Performance when reading from the SUN NFS-Server
    (DU as NFS client) was down to 30 - 40 KB/s. As their tool reads quite 
    a few 100 MB during startup this is not acceptable to them. 
    
    mount showed the mount as: v2, rw, udp, hard, intr.
    
    We tried to trace the problem using "tcpdump -s300 udp port 2049 and
    host mosel" where mosel is the Alpha. The main difference seems to be
    that for the slow performance we only see 3 or 4 interactions per
    second with 8KB each (fragmented) whereas in the good performing case
    we see ~10 to 15 interactions per second. Now where is the problem?
    
    I attach some tcpdump lines from the middle of the transfer:
    
    15:01:37.471103 mosel.226b740d > elbe.nfs-v2: 128 call read fh
    	503775232.16777216.2560.2198471936 8192 bytes @ 1417216
    15:01:37.473056 elbe.226b740d > mosel.nfs-v2: 1472 reply read (frag
    	18498:1480@0+)
    15:01:38.337314 mosel.216b740d > elbe.nfs-v2: 128 call read fh
    	503775232.16777216.2560.2198471936 8192 bytes @ 1409024
    15:01:38.340243 elbe.216b740d > mosel.nfs-v2: 1472 reply read (frag
    	18506:1480@0+)
    15:01:38.347079 mosel.236b740d > elbe.nfs-v2: 128 call read fh
    	503775232.16777216.2560.2198471936 8192 bytes @ 1425408
    15:01:38.347079 mosel.246b740d > elbe.nfs-v2: 128 call read fh
    	503775232.16777216.2560.2198471936 8192 bytes @ 1433600
    15:01:38.349032 elbe.236b740d > mosel.nfs-v2: 1472 reply read (frag
    	18507:1480@0+)
    15:01:38.355868 elbe.246b740d > mosel.nfs-v2: 1472 reply read (frag
    	18508:1480@0+)
    15:01:38.356845 mosel.256b740d > elbe.nfs-v2: 128 call read fh
    	503775232.16777216.2560.2198471936 8192 bytes @ 1441792
    15:01:38.362704 mosel.266b740d > elbe.nfs-v2: 128 call read fh
    	503775232.16777216.2560.2198471936 8192 bytes @ 1449984
    15:01:38.364657 elbe.266b740d > mosel.nfs-v2: 1472 reply read (frag
    	18509:1480@0+)
    15:01:39.230868 mosel.256b740d > elbe.nfs-v2: 128 call read fh
    	503775232.16777216.2560.2198471936 8192 bytes @ 1441792
    15:01:39.233798 elbe.256b740d > mosel.nfs-v2: 1472 reply read (frag
    	18694:1480@0+)
    
    The fragment traffic looks like this:
    
    15:56:17.547275 mosel.1018 > elbe.2049: udp 128
    15:56:17.547275 mosel.1023 > elbe.2049: udp 128
    15:56:17.549228 elbe.2049 > mosel.1018: udp 8292 (frag 18734:1480@0+)
    15:56:17.550204 elbe > mosel: (frag 18734:1480@1480+)
    15:56:17.552157 elbe > mosel: (frag 18734:1480@2960+)
    15:56:17.553134 elbe > mosel: (frag 18734:1480@4440+)
    15:56:17.554110 elbe > mosel: (frag 18734:1480@5920+)
    15:56:17.555087 elbe > mosel: (frag 18734:900@7400)
    15:56:17.555087 mosel.1018 > elbe.2049: udp 128
    15:56:17.556064 elbe.2049 > mosel.1023: udp 8292 (frag 18735:1480@0+)
    15:56:17.558017 elbe > mosel: (frag 18735:1480@1480+)
    15:56:17.558993 elbe > mosel: (frag 18735:1480@2960+)
    15:56:17.559970 elbe > mosel: (frag 18735:1480@4440+)
    15:56:17.560946 elbe > mosel: (frag 18735:1480@5920+)
    15:56:17.561923 elbe > mosel: (frag 18735:900@7400)
    15:56:17.562900 mosel.1023 > elbe.2049: udp 128
    
    Any advice?
    
    	Hartmut
                                                       
| T.R | Title | User | Personal Name | Date | Lines | 
|---|---|---|---|---|---|
| 9487.1 | Try NFS V3 | MIPSBX::"[email protected]" | Brian Haley | Mon Apr 14 1997 13:08 | 7 | 
| Hi, Why aren't you using NFS V3 - does SunOS 4.1 not support it? That should give you some performance improvement. -Brian [Posted by WWW Notes gateway] | |||||
| 9487.2 | Server is losing requests | NNTPD::"[email protected]" | Ric Werme | Tue Apr 15 1997 09:32 | 59 | 
|     15:01:38.349032 elbe.236b740d > mosel.nfs-v2: 1472 reply read
    15:01:38.355868 elbe.246b740d > mosel.nfs-v2: 1472 reply read
    15:01:38.356845 mosel.256b740d > elbe.nfs-v2: 128 call read
    15:01:38.362704 mosel.266b740d > elbe.nfs-v2: 128 call read
    15:01:38.364657 elbe.266b740d > mosel.nfs-v2: 1472 reply read
    15:01:39.230868 mosel.256b740d > elbe.nfs-v2: 128 call read
    15:01:39.233798 elbe.256b740d > mosel.nfs-v2: 1472 reply read
Thank you for the tcpdump traces.  Now if the news->usenet gateway
would get fixed and stay fixed I'd see these posting sooner....
The key here is to look at the transmision IDs, or XIDs.  XIDs are
sequence numbers attache to RPC messages that allow replies to be
matched to the request.  Ususally the XID increments each message
For little endian machines, they get printed backwards, so the high
byte increments first.
Look at this one:
    15:01:38.356845 mosel.256b740d > elbe.nfs-v2: 128 call read
                          ^^^XID^^
I assume that tcpdump was running on the client, so this says that
we sent the request.  We see no reply, so later on we retransmit:
    15:01:39.230868 mosel.256b740d > elbe.nfs-v2: 128 call read
    15:01:39.233798 elbe.256b740d > mosel.nfs-v2: 1472 reply read
The next step is to determine:
1) Did the Sun get the request?
2) Did it reply?
3) Did traffic get corrupted?
Since things ran fine when Dunix ws the server, you can probably rule
out corruption, though it would be worth trying writes to see what
happens the Sun sends long messages.  You mention a busy network,
NFS over UDP is extremely sensitive to lost messages.  In a busy network
the usual cause is "excessive collisions", which we report via netstat -i -s.
You don't have enough data to determine if the request or reply is getting
lost - I generally have tcpdump record all traffic between the two nodes.
If the first fragment of a reply is missing, but other fragments show
up, then that's proof the server got the request.
I'd look at more tcpdump data, look at nfsstat/netstat/ifconfig data on
client and server.  (One way to tell if the Sun got the requests is to
do nfsstat on client and server, read the file, do nfsstat and see if Sun
reports as many new READs as Dunix does.)
I'd also look at FTP tcpdump traces, looking for retransmits (also reported
in nfsstat).  TCP has several features that make it much less sensitive to
packet loss, so you may be losing packets and not notice.
At any rate, the key question is "Why aren't all my reads being answered?"
[Posted by WWW Notes gateway]
 | |||||
| 9487.3 | SunOS 4.x doesn't handle NFS V3 | NNTPD::"[email protected]" | Ric Werme | Tue Apr 15 1997 09:38 | 5 | 
| Think of SunOS 4.x kernels as the Sun analog to Ultrix. Never talked V3, never will. While "upgrading" to Solaris is sometimes possible, it is often not desirable, especially if you dislike SysV. [Posted by WWW Notes gateway] | |||||
| 9487.4 | more analysis | ALFAM7::STREPPEL | Mon Apr 28 1997 10:12 | 45 | |
|     We went on analyzing the problem and found the following pattern
    
    11:38:57.438876 mosel.67e14976 > elbe.nfs-v2: 132 call read fh 
    11:38:57.438876 mosel.68e14976 > elbe.nfs-v2: 132 call read fh 
    11:38:57.441806 elbe.67e14976 > mosel.nfs-v2: 1472 reply read 
    ...
    11:38:57.447665 mosel.69e14976 > elbe.nfs-v2: 132 call read 
    11:38:57.448642 elbe.68e14976 > mosel.nfs-v2: 1472 reply read 
    ...
    11:38:57.456454 mosel.6ae14976 > elbe.nfs-v2: 132 call read 
    11:38:57.459384 elbe.6ae14976 > mosel.nfs-v2: 1472 reply read 
    ...
    11:38:58.321689 mosel.69e14976 > elbe.nfs-v2: 132 call read 
    11:38:58.324618 elbe.69e14976 > mosel.nfs-v2: 1472 reply read 
    
    i.e. elbe, (Digital UNIX) issues two read requests in a row (very fast,
    look at the timestamp). Mosel (SUNos) asnwers the first(67e14..), elbe
    issues the third, mosel answers the second (68e...), elbe issues the
    fourth and mosel answers the fourth (6ae..) and forgets(!) the third
    (69e...).
    
    The performance problem comes from the fact that DU is waiting ~ 0.9s
    before it resends the third request - 11:38:57.447665 mosel.69e14976
    until the resend 11:38:58.321689 mosel.69e14976. There is no activity
    after the reply to the fourth request has been received. This
    calculates to 3 or 4 times 8KB per second, == ~30KB/s. exactly what we
    measured.
    
    The question is: is this normal behaviour? Can this be changed
    dynamically? Can I prevent UDP from sending requests before a previous
    one was answered??
    
    As .2 pointed out we figured out whether the requests made it into
    SUNos. Using nfsstat on the SUN did not show the non answered requests.
    A PD version of tcpdump from the Internet did not show them either. So
    they got lost, either in SUNos, in the SUN's ethernet hardware or on
    the wire - we had to insert a repeater as the customer only had a
    yellow cable and our AlphaStation only supported BNC and Twisted Pair.
    
    So the problem remains unsolved, as we had no time left to do low level
    analysis.
    
    	Regards
    		Hartmut
             
 | |||||