| Title: | Digital Fortran | 
| Notice: | Read notes 1.* for important information | 
| Moderator: | QUARK::LIONEL | 
| Created: | Thu Jun 01 1995 | 
| Last Modified: | Fri Jun 06 1997 | 
| Last Successful Update: | Fri Jun 06 1997 | 
| Number of topics: | 1333 | 
| Total number of notes: | 6734 | 
    DEC Fortran 77 V4.0-1 Digital UNIX 4.0
    
    Can someone explain why there is an accuracy problem in the following
    program output ?   I feel there could be a problem in libraries that we
    are linking.  Can someone explain why this problem occurs ? 
    Should we be using a format statement instead of list-directed
    formatting ?
    
    As you may notice, if I use the exponent operator, the frequency of
    accuracy errors is less compared to using a direct multiplication.
    
    The customer wants to use direct multiplication because they want to
    avoid the overhead of function calls.  This customer is a scientific
    institution doing research on meteorology.  They want absolute
    accuracy and this is an issue with the customer.
    
    Where are we going wrong ?
    
    Thanks for any inputs.
    
    Regards,
    
    R. Devarajan 
    
    Program : test2.f
    
	program test
	real *8 x,y,z
	x=10.0
	y=1.0
	do i=1,50
	y=y*x
	z=x**i
	write (6,*) i,y,z
	end do
	end
    
    Compilation : f77 -o test1 test1.f
    
    Output :
           1   10.0000000000000        10.0000000000000     
           2   100.000000000000        100.000000000000     
           3   1000.00000000000        1000.00000000000     
           4   10000.0000000000        10000.0000000000     
           5   100000.000000000        100000.000000000     
           6   1000000.00000000        1000000.00000000     
           7   10000000.0000000        10000000.0000000     
           8   100000000.000000        100000000.000000     
           9   1000000000.00000        1000000000.00000     
          10   10000000000.0000        10000000000.0000     
          11   100000000000.000        100000000000.000     
          12   1000000000000.00        1000000000000.00     
          13   10000000000000.0        10000000000000.0     
          14   100000000000000.        100000000000000.     
          15  1.000000000000000E+015  1.000000000000000E+015
          16  1.000000000000000E+016  1.000000000000000E+016
          17  1.000000000000000E+017  1.000000000000000E+017
          18  1.000000000000000E+018  1.000000000000000E+018
          19  1.000000000000000E+019  1.000000000000000E+019
          20  1.000000000000000E+020  1.000000000000000E+020
          21  1.000000000000000E+021  1.000000000000000E+021
          22  1.000000000000000E+022  1.000000000000000E+022
          23  9.999999999999999E+022  9.999999999999999E+022
          24  1.000000000000000E+024  1.000000000000000E+024
          25  9.999999999999999E+024  1.000000000000000E+025
          26  9.999999999999999E+025  1.000000000000000E+026
          27  9.999999999999999E+026  1.000000000000000E+027
          28  1.000000000000000E+028  1.000000000000000E+028
          29  9.999999999999999E+028  9.999999999999999E+028
          30  9.999999999999999E+029  1.000000000000000E+030
          31  9.999999999999999E+030  1.000000000000000E+031
          32  9.999999999999999E+031  1.000000000000000E+032
          33  9.999999999999999E+032  1.000000000000000E+033
          34  9.999999999999999E+033  1.000000000000000E+034
          35  1.000000000000000E+035  1.000000000000000E+035
          36  9.999999999999999E+035  1.000000000000000E+036
          37  9.999999999999998E+036  1.000000000000000E+037
          38  9.999999999999998E+037  1.000000000000000E+038
          39  9.999999999999998E+038  1.000000000000000E+039
          40  9.999999999999998E+039  1.000000000000000E+040
          41  9.999999999999998E+040  1.000000000000000E+041
          42  9.999999999999999E+041  1.000000000000000E+042
          43  9.999999999999999E+042  1.000000000000000E+043
          44  9.999999999999999E+043  1.000000000000000E+044
          45  9.999999999999999E+044  1.000000000000000E+045
          46  1.000000000000000E+046  1.000000000000000E+046
          47  1.000000000000000E+047  1.000000000000000E+047
          48  1.000000000000000E+048  1.000000000000000E+048
          49  1.000000000000000E+049  1.000000000000000E+049
          50  1.000000000000000E+050  1.000000000000000E+050
    
| T.R | Title | User | Personal Name | Date | Lines | 
|---|---|---|---|---|---|
| 1188.1 | floating point is not absolutely accurate | HERON::BLOMBERG | Trapped inside the universe | Thu Feb 20 1997 05:30 | 33 | 
|     
            Note that there is no "absolute accuracy" in the whole range
            of floating point. Real*8 in G- or T-floating has a 53-bit
            fraction, that's it.
    
            Look at your example:
    
            x=10.0
    
            This is exactely represented, the fraction starts with 101,
            the rest is zero. (Forget about the hidden most significant
            bit for the moment.)
    
            y=y*x
    
            This will apparently calulate 100,1000,10000 etc.
            But each time you multiply with 10 the fraction will get around
            2.3 more non-zero bits. 53/2.3 = 23, so somewhere around
            10**23 you get underflow, the stored datum is not exactely 10**k.
            After that you get more underflow every round in the loop.
    
            I can't say precisely how z**i does it, some accumulated
            underflow is probably avoided, but still there is the
            limitation  of the 53-bit fraction.
    
            Finally, your program does not write out the eaxct values
            of y and z. Write(6,*) writes the values rounded to 16 digits.
            When investigating behaviours like this it's often more
            practical to type out the hexadecimal representation.
    
    
    /�ke
    
 | |||||
| 1188.2 | Computers are like that | PERFOM::HENNING | Thu Feb 20 1997 05:40 | 19 | |
|     "an accuracy problem" - um, not really.  Just an accuracy limitation.  
    Floating point numbers are stored in computers with a limited number of
    binary digits.
    
    "absolute accuracy" - then don't count anything that can't be counted
    as a reasonable size integer.  (In meterology??)
    
    "overhead of function calls" - inline code is sometimes faster than a
    function call, but not always.  If the function is decently written, it
    can more than pay back the overhead of its call.  A well-written
    function call will also often be MORE accurate than spelling out the
    steps in detail in your own program, because the folks writing the
    functions have carefully picked algorithms that tend to NOT accumulate
    error as they go.
    
    There's a whole field here - Numerical Analysis, Scientific Computing, 
    Algorithms, How To Program Good In Fortran - your customer may want to
    find a book with a title something like that, or a good course at a
    community college.
 | |||||
| 1188.3 | How the RTL does it. | WIBBIN::NOYCE | Pulling weeds, pickin' stones | Thu Feb 20 1997 08:35 | 24 | 
| By the way, the standard way to implement x**i for i not a constant, is to use steps of squaring and multiplying by x, rather than simply multiplying by x. I'm sure this is the method the RTL uses. For example, to compute x**35, it would follow a procedure such as y = x y = y*y = x**2 y = y*y = x**4 y = y*y = x**8 y = y*y = x**16 y = y*x = x**17 y = y*y = x**34 y = y*x = x**35 Only 7 multiplies, compared to 34 or 35 for the naive method in your program. Fewer multiplies means it's faster, and also that there are fewer opportunities for round-off error to accumulate. For some values of i, there are faster ways -- for example, computing x**15 this way takes 6 multiplies, but you can do it with only 5: z = x*x = x**2 z = z*z = x**4 z = z*x = x**5 y = z*z = x**10 y = y*z = x**15 but I don't think the RTL does this. | |||||
| 1188.4 | Alpha vs. UltraSparc | QCAV01::DEVARAJAN | Thu Feb 20 1997 10:18 | 13 | |
|     Re : .1-.3
    
    	Forgot one point.  The customer said, he had run the same program
    on a Sun UltraSparc system and he got exact results (meaning 10.0**23
    or 10.0**24, etc.)
    
    	So, where does Alpha differ ?
    
    	Regards.
    
    	R. Devarajan
    
    
 | |||||
| 1188.5 | Does Sun's output have few digits? | STEVEN::hobbs | Steven Hobbs | Thu Feb 20 1997 11:12 | 4 | 
| I notice that Digital chose to use 16 digits of precision in our list directed output. Is it possible that Sun chose to use only 15 digits of precision? You would not see this behavior on our system if we had chosen to print the results with only 15 digits. | |||||
| 1188.6 | Sun strikes again! | TLE::WHITLOCK | Stan Whitlock | Thu Feb 20 1997 14:12 | 8 | 
| The Sun Fortran docs I have do not specify what format is used for list-directed output, only to say that it chooses a "simple" representation over an accurate one. As .-1 noted, if you print this stuff out E20.15, they all are ".1000...000Enn" so they look accurate... they're really just "simple". /Stan | |||||
| 1188.7 | Thank you!! | QCAV01::DEVARAJAN | Fri Feb 21 1997 08:32 | 10 | |
|     Thanks for all the inputs.  I have sent a mail to the customer that
    they use a format statement with E20.15 for a REAL *8 variable and
    E21.16 for a REAL *16 variable.  I have checked with the program that
    if the precision goes beyond 15 digit for a R*8 variable, "accuracy" is
    lost (basically some calculations give 9s).  In these cases, R*16 is
    helpful.
    
    Thanks and regards,
    
    R. Devarajan
 | |||||