    >	- How costly would it be, in terms of performance penalty, for one CPU 
    > to access another CPU's cache, either within a single system or between 
    > two TruCluster members?
    
    When you talk about databases (OPS on TruCluster in your example), it
    is important to define what you mean by cache.  To a database, cache is
    the buffer pool in memory, as opposed to the rest of the data out on
    disk somewhere.  We're NOT talking about accessing the CPUs' onboard
    caches.
    
    An Oracle instance on a node in a TruCluster has its database cache in
    shared memory, so all CPUs in that instance (that is, on that node) can
    access it equally.  No performance penalty.
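    As a rough illustration of that single-node case (this is ordinary shared
    memory via Python's standard library, not Oracle's actual SGA code; the
    segment name and size are invented for the example), two processes on one
    node can attach the same segment and read the same buffer with nothing
    copied between them:

        # Illustrative sketch only: plain shared memory, not Oracle internals.
        from multiprocessing import Process, shared_memory

        def reader(name):
            # Attach to the existing segment; no data is copied between processes.
            shm = shared_memory.SharedMemory(name=name)
            print(bytes(shm.buf[:5]))          # prints b'hello'
            shm.close()

        if __name__ == "__main__":
            seg = shared_memory.SharedMemory(create=True, size=4096, name="toy_sga")
            seg.buf[:5] = b"hello"             # "cache" a block in the shared segment
            p = Process(target=reader, args=("toy_sga",))
            p.start(); p.join()
            seg.close(); seg.unlink()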
    
    An Oracle instance on another node in the TruCluster, if it needs
    access to the data in the first instance's cache, must access it via
    the TruCluster-provided communications mechanisms: over memory channel,
    ultimately, using the Distributed Lock Manager and some aspect of
    BSS/BSC (the Block Shipment Server and Client pieces of DRD).  (I'm not
    intimately familiar with this part of the mechanism.)  This will be
    slower than accessing the shared memory on the same node would be.  How
    much slower depends upon a large number of factors: mainly, how much MC
    bandwidth is consumed by other DLM and BSS traffic, which in turn
    depends upon how well the data is partitioned to keep as much of it as
    possible localized to one particular instance, so that these internode
    transfers don't have to happen most of the time.
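    To get a feel for why the remote case hurts, here is a back-of-envelope
    model.  Every latency figure in it is an invented placeholder, not a
    measured Memory Channel, DLM, or block-shipping number; the only point is
    that the average cost per block is dominated by the fraction of blocks
    that have to come from the other node:

        # Back-of-envelope model; all latencies are hypothetical placeholders.
        LOCAL_US     = 1.0      # assumed: block already in local shared memory
        DLM_GRANT_US = 500.0    # assumed: lock round trip over the interconnect
        SHIP_US      = 2000.0   # assumed: shipping the block from the remote cache

        def avg_block_access_us(remote_fraction):
            """Average cost per block, given the fraction served by the other node."""
            local  = (1.0 - remote_fraction) * LOCAL_US
            remote = remote_fraction * (DLM_GRANT_US + SHIP_US)
            return local + remote

        for f in (0.01, 0.10, 0.50):
            print(f"{f:4.0%} remote -> {avg_block_access_us(f):7.1f} us per block")

    The steep jump as the remote fraction grows is exactly why the data
    partitioning mentioned above matters so much.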
    
    >	- How do we compare such a performance hit with our competitors' 
    > hybrid SMP/MPP servers?
    
    That's an evolving science.  In the general case, we refer to
    scaleability: measure performance (X) on one node and compare it to
    performance (Y) on N nodes; if Y = N*X, you've achieved 100%
    scaleability, and you ask the other guys what they can do.  Our
    achievement for scaleability in specific benchmarks so far ranges
    widely; one year ago, at OPS announcement, TPC-C numbers on a 4-node
    cluster were roughly 30K, compared to 11K for a single node; 100%
    scaleability would have been 44K for 4 nodes, so we achieved 30K/44K,
    or a little better than 67% (I repeat: a year ago, on this one
    application).  Each application must be carefully tuned.  This area is
    getting a lot of attention right now in OPS' case, and I'm not one of
    the people doing the work, so I'm not in a position to comment further.
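    The arithmetic from that example, spelled out (the numbers are the rounded
    figures quoted above, not exact benchmark results):

        # Scaleability arithmetic using the rounded figures quoted above.
        single_node = 11_000            # X: performance on one node
        cluster     = 30_000            # Y: performance on the 4-node cluster
        n_nodes     = 4

        ideal = n_nodes * single_node   # 100% scaleability target: 44K
        scaleability = cluster / ideal  # 30K / 44K
        print(f"{scaleability:.0%}")    # roughly 68%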
    
    DougO