| T.R | Title | User | Personal Name
 | Date | Lines | 
|---|
| 1786.1 |  | KAOFS::S_HYNDMAN | Acronym Decoder Ring Architect | Thu Dec 15 1994 17:46 | 11 | 
|  |     
    
    	I'm not really clear on what you're trying to do: have backup
    connections to bypass the bridge, or provide redundant connections for 
    the server?  If it's the latter, why not go DAS FDDI and multi-home the 
    server on the ring?
    
    	Cabletron also makes redundant fiber transceivers.
    
    
    Scott  
 | 
| 1786.2 | It's a VAX 4500 Pathworks Server | MSDOA::REED | John Reed @CBO, DTN:367-6463, KB4FFE, SouthEast | Fri Dec 16 1994 09:17 | 23 | 
|  |     The 4000 series VAXes have Q-bus FDDI controllers as the only available
    option, and the customer feels that the throughput of the ISA Ethernet
    will be faster than a Q-bus attached FDDI controller.
    
    I need to have a way to reach this file server if the DEChub900 near it
    decides to crash.   The customer has experienced several hub crashes
    (it's on a UPS, and has three DECconcentrators, one DECbridge, and three
    power supply modules) and each time the hub reboots, he loses the
    connections to his file server.   He wants a way to keep the PCs
    running through the hub crashes.  
    
    The PCs are connected to Ethernets on various other DEChub-mounted
    DECbridge900s.   I believe that a Fault Tolerant Ethernet FOT
    attached to his ISA-0 Ethernet, with the primary fiber port fed to a
    repeater module in one hub and the backup fiber port fed to a module
    in a different hub, will work, as long as the repeater modules have
    ANOTHER working node on their PORT GROUP.  This will keep the spanning
    tree from shutting down either port, and allow the FOT to choose the 
    proper path to enable.   The customer would like to eventually put an
    FDDI PC file server on the ring, but I think that this will be a good
    starting point.
    
    JR
 | 
| 1786.3 |  | NETCAD::SLAWRENCE |  | Fri Dec 16 1994 12:23 | 8 | 
|  |     
    Ahh hah!  The hub is crashing?  It shouldn't be, so let's look at
    that...
    
    What are the firmware revs for the Hub and all modules?
    
    Are there error log entries? 
    
 | 
| 1786.4 | I HOPE the crashing has stopped... | MSDOA::REED | John Reed @CBO, (803) 781-9571 NIS Networker | Mon Dec 19 1994 09:13 | 31 | 
    The crashing appears to have stopped after we upgraded to the most
    recent revisions.  (It hasn't occurred for a week now, and it used to be
    several times a day).   They used to have DECcon FM 2.0.0, DECbridge900
    version 1.2.1, and Hub manager v3.0.0.   It ran wonderfully, until the
    imaging application on the Alphas came online.  They have an Alpha
    farm with Kubota(tm) graphics accelerators and funny little
    transmitters on top of their screens.  They wear 3-D glasses, and do
    molecular modelling.  The images spin around, suspended in the air in
    front of your monitor.  If you wear the glasses, and turn out the
    lights, it would make a great lava lamp at a 60's party...   They are a
    medical research and design firm, with a lobby full of patent grants
    and awards.
    
    They have since upgraded to v2.8.0 on the Conc, 1.4.0 on the Bridge, and
    3.1.0 on the Hub managers.  They feel the problem was traffic related,
    and they think the DECbridge900 "couldn't keep up with the traffic." 
    The customer's MIS department suffered a lot of grief during the
    period when the Hubs were rebooting.  The MIS staff doesn't want this
    to occur again, and they see how the link to their file server is a
    single point of failure.   (For that matter, having a single file
    server is also troublesome).   So, they are planning additional fault
    tolerance.   They like the FDDI, and the speed, and the way that it
    wraps around outages.  We are trying to add to their comfort level
    about the bridges, and give them some redundancy.  Ethernet and the STP
    will not bypass a fault as quickly as FDDI (the STP typically takes
    about 45 seconds), so their LAT and Pathworks disks might time out
    during a hub crash.  But I hope to create a config where the users can
    get back on quickly.
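
    For a rough sense of where a spanning tree outage of this size comes
    from, here is a minimal sketch.  It assumes the IEEE 802.1D default
    timers (max age 20 seconds, forward delay 15 seconds); the bridges at
    this site may be configured differently, and the function name is
    only illustrative.

```python
# Sketch only: outage estimate from IEEE 802.1D default timers (an assumption,
# not values read from the customer's bridges).
MAX_AGE = 20        # seconds before stale root information is discarded
FORWARD_DELAY = 15  # seconds spent in each of the listening and learning states


def stp_outage_seconds(direct_failure, max_age=MAX_AGE, forward_delay=FORWARD_DELAY):
    """Estimate how long a blocked port takes to start forwarding.

    A failure on a directly attached link is noticed at once; a remote
    failure is noticed only after the stored root information ages out.
    The port then passes through listening and learning before forwarding.
    """
    aging = 0 if direct_failure else max_age
    return aging + 2 * forward_delay


print(stp_outage_seconds(direct_failure=True))   # link lost on the bridge itself
print(stp_outage_seconds(direct_failure=False))  # failure elsewhere in the LAN
```

    With these defaults the estimate lands between 30 and 50 seconds,
    which is why sessions with shorter timeouts may drop during failover.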
    
    JR
     
    
 | 
| 1786.5 | The crashing should never have started... | NETCAD::SLAWRENCE |  | Mon Dec 19 1994 17:30 | 41 | 
|  |           
    I don't know how much comfort it will add, but here's more data, for
    what it's worth:
    
    The crash you saw was very well understood here; in fact, it took out
    our file servers here in DEChub Engineering before we ever released the
    bridge to field test.  
    
    The original problem was in the bridge, and was - in a way - traffic 
    related (your customer was right).  It started with a bug in the IP 
    fragmentation code in the bridge that occurred only if two IP packets 
    requiring fragmentation arrived from the FDDI _very_ close together, 
    such that they both were queued together in the bridge (this is a 
    very narrow window).  It took a little while, but a few of these
    crashed the bridge.  Combine that with some bugs in the hub manager,
    which had trouble with modules that crashed too frequently, and you end
    up with an unstable hub.  (Without this bug they keep up just fine, by
    the way.)
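
    To illustrate the kind of work involved when a full-size FDDI packet
    has to go out on Ethernet, here is a minimal sketch of standard IPv4
    fragmentation arithmetic.  It is not the bridge firmware; it assumes
    a 20-byte IP header with no options, the 4352-byte IP MTU for FDDI
    (RFC 1390) and the 1500-byte Ethernet MTU, and the function name is
    hypothetical.

```python
# Sketch only: generic IPv4 fragmentation sizes, not the DECbridge900 code.
IP_HEADER = 20       # bytes, assuming no IP options
FDDI_MTU = 4352      # IP MTU over FDDI (RFC 1390)
ETHERNET_MTU = 1500  # IP MTU over Ethernet


def fragment_sizes(datagram_len, out_mtu, header_len=IP_HEADER):
    """Return the total length (header plus data) of each fragment."""
    if datagram_len <= out_mtu:
        return [datagram_len]  # fits as-is, no fragmentation needed
    payload = datagram_len - header_len
    # Every fragment except the last must carry a multiple of 8 data bytes.
    max_data = (out_mtu - header_len) // 8 * 8
    sizes = []
    while payload > 0:
        data = min(payload, max_data)
        sizes.append(header_len + data)
        payload -= data
    return sizes


print(fragment_sizes(FDDI_MTU, ETHERNET_MTU))
```

    Each maximum-size packet from the ring becomes several Ethernet
    frames, so two such packets arriving back to back means two
    fragmentation jobs queued at once.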
    
    The good news is that all of the above are fixed in the latest
    releases.
    
    The bad news (for your customer) is that the bugs have been fixed for
    quite a while now, and they didn't get the fix.
    
    We have spent a great deal of energy here on trying to create a set of
    mechanisms that ensure that the latest releases of all our firmware are
    available to the field and (where possible) directly to the customer -
    but it does no good if you don't check them.  What your customer had
    was the very first field release of firmware - almost certain to have
    at least some minor problems (in this case, unfortunately, it was
    fairly serious for them because their Alphas were so fast).
    
    We _cannot_ guarantee that you will get the latest release of firmware
    when hardware is delivered to you.  You should _never_ assume that it
    is up to date.
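
    As an example of the kind of check this implies, here is a minimal
    sketch that compares installed revisions against the ones reported
    as stable in 1786.4.  The module labels and function names are made
    up for illustration, and the thread does not say which exact
    revision first contained each fix, so treat the table as a
    placeholder to be filled in from the release notes.

```python
# Sketch only: revisions are those quoted in reply 1786.4, not an official list.
KNOWN_GOOD = {
    "concentrator": "2.8.0",
    "bridge": "1.4.0",
    "hub manager": "3.1.0",
}


def parse_rev(rev):
    """Turn a string such as 'v3.0.0' or '1.2.1' into a comparable tuple."""
    return tuple(int(part) for part in rev.lstrip("vV").split("."))


def outdated(installed, known_good=KNOWN_GOOD):
    """Return the modules whose installed revision is older than known_good."""
    return sorted(
        name
        for name, rev in installed.items()
        if name in known_good and parse_rev(rev) < parse_rev(known_good[name])
    )


# The revisions the customer was originally running, per 1786.4.
original = {"concentrator": "2.0.0", "bridge": "1.2.1", "hub manager": "v3.0.0"}
print(outdated(original))
```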
    
    We have Internet and Easynet archives for the latest firmware, and
    mailing lists that you and your customers can subscribe to for release
    notices.  Pointers to both are in the owner's manuals and/or the release
    notes.
 |