| Title: | Kuck Associates Preprocessor Users |
| Notice: | KAP V2.1 (f90,f77,C) SSB-kits - see note 2 |
| Moderator: | HPCGRP::DEGREGORY |
| Created: | Fri Nov 22 1991 |
| Last Modified: | Fri Jun 06 1997 |
| Last Successful Update: | Fri Jun 06 1997 |
| Number of topics: | 390 |
| Total number of notes: | 1440 |
Hello, happy HPC people!
I'm not a Fortran guru and I have a benchmark to run. I'm having trouble
with the following code. Which KAP directives should be used?
.
.
kavz(i44)=1
C$DIR BEGIN_TASKS
call loop2000(eo1,vcoul1,str1,i1,i2)
C$DIR NEXT_TASK
call loop2000(eo2,vcoul2,str2,i22,i3)
C$DIR NEXT_TASK
call loop2000(eo3,vcoul3,str3,i33,i4)
C$DIR NEXT_TASK
call loop2000(eo4,vcoul4,str4,i44,i5)
C$DIR END_TASKS
Thank you for your help. Jean-Pierre.
| T.R | Title | User | Personal Name | Date | Lines |
|---|---|---|---|---|---|
| 366.1 | a thought | HPCGRP::DEGREGORY | Karen 223-5801 | Mon Feb 10 1997 13:56 | 19 |
Jean-Pierre -
I believe those Cray directives are telling you to run those 4 calls in parallel. The KAP products support parallelism at the loop level, so my guess is that you would end up parallelizing loop2000 itself: the first call would run on all the processors, then the second call would run on all the processors, and so on. In other words, KAP would do the parallelism one level down.
If you have the module that contains loop2000, run the KAP automatic parallelizer over it and see whether it picks up and parallelizes the loop automatically. If it does, you will see a call to mppfrk, and the entire loop gets done as a subroutine named PKprogramname_loopnumber (programname is the name of this module, and loopnumber is incremented for every loop KAP parallelizes).
Karen
| 366.2 | 48641::BOIRIN | Thu Feb 13 1997 10:40 | 12 | ||
Hi Karen
Thank you for your help. I have removed all the C$DIR directives,
linked the four calls together and used KAP to do automatic
parallelization of the subroutine's loops.
The main issue is that this code was written with tasking in mind, and now
I have a lot of data dependencies within the loops. So the speedup = 0 !!
Do you think there is an easy way to simulate tasking with KAP?
Thanks again. Best regards. Jean-Pierre.
| 366.3 | Try doing what the source code said. | GEMGRP::PIEPER | Thu Feb 13 1997 11:28 | 36 | |
Jean-Pierre,
You need to change the code like this:
.
.
kavz(i44)=1
new_eo(1) = eo1
new_eo(2) = eo2
new_eo(3) = eo3
new_eo(4) = eo4
(similar assignments for vcoul[1-4] and str[1-4])
new_arg4(1) = i1
new_arg4(2) = i22
(etc., and similar for the 5th argument)
c$dir parallel do
do new_i = 1, 4
call loop2000( new_eo(new_i), new_vcoul(new_i), new_str(new_i), new_arg4(new_i), new_arg5(new_i))
end do
This will let KAP parallelize the four calls just the way it used to do.
You may have to spell "c$dir parallel do" some other way for KAP -- that is the PCF spelling. Karen can advise better about details. (maybe you need a concurrent_call directive too?)
You are limited to 4-way parallelism, but the code as written had that restriction, too.
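The packing idea in this reply can be tried in a small standalone program. What follows is only a sketch under stated assumptions, not the benchmark's code: the thread never says whether eo1..eo4 are scalars or arrays, so here they are assumed to be equal-length arrays packed into the columns of a 2-D new_eo; the body of loop2000, its two-argument signature, and the length n are hypothetical; and an OpenMP directive stands in for the KAP/PCF spelling. If loop2000 writes to its arguments, the results also have to be copied back afterwards, which the sketch does explicitly.

```fortran
program packed_args_sketch
  implicit none
  integer, parameter :: n = 8          ! hypothetical array length
  real :: eo1(n), eo2(n), eo3(n), eo4(n)
  real :: new_eo(n, 4)                 ! one column per task
  integer :: new_i

  eo1 = 1.0
  eo2 = 2.0
  eo3 = 3.0
  eo4 = 4.0

  ! Copy in: pack the four separate arguments into one indexed array.
  new_eo(:, 1) = eo1
  new_eo(:, 2) = eo2
  new_eo(:, 3) = eo3
  new_eo(:, 4) = eo4

  ! Assumption: OpenMP replaces the KAP/PCF directive here.
  ! Each iteration works on its own column, so iterations are independent.
  !$omp parallel do
  do new_i = 1, 4
     call loop2000(new_eo(:, new_i), n)
  end do
  !$omp end parallel do

  ! Copy out: needed only because loop2000 modifies its argument.
  eo1 = new_eo(:, 1)
  eo2 = new_eo(:, 2)
  eo3 = new_eo(:, 3)
  eo4 = new_eo(:, 4)

  print *, eo1(1), eo2(1), eo3(1), eo4(1)

contains

  ! Hypothetical stand-in for the benchmark's loop2000.
  subroutine loop2000(eo, m)
    integer, intent(in) :: m
    real, intent(inout) :: eo(m)
    eo = 2.0 * eo
  end subroutine loop2000

end program packed_args_sketch
```

Built without OpenMP support, the directive is treated as a comment and the loop runs serially with the same results.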
| 366.4 | Is this what you want? | HPCGRP::MANLEY | Thu Feb 13 1997 13:41 | 32 | |
Re: .0, .2
Change this:
kavz(i44)=1
C$DIR BEGIN_TASKS
call loop2000(eo1,vcoul1,str1,i1,i2)
C$DIR NEXT_TASK
call loop2000(eo2,vcoul2,str2,i22,i3)
C$DIR NEXT_TASK
call loop2000(eo3,vcoul3,str3,i33,i4)
C$DIR NEXT_TASK
call loop2000(eo4,vcoul4,str4,i44,i5)
C$DIR END_TASKS
to something like this:
C*$* ASSERT CONCURRENT CALL
C*$* ASSERT DO( CONCURRENT )
DO I=1,4
IF( I.EQ.1 )THEN
call loop2000(eo1,vcoul1,str1,i1,i2)
ELSEIF( I.EQ.2 )THEN
call loop2000(eo2,vcoul2,str2,i22,i3)
ELSEIF( I.EQ.3 )THEN
call loop2000(eo3,vcoul3,str3,i33,i4)
ELSE
call loop2000(eo4,vcoul4,str4,i44,i5)
ENDIF
ENDDO
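For readers without KAP, the same pattern can be sketched as a self-contained program. This is an illustration under assumptions rather than the benchmark itself: the body of loop2000, its three-argument signature, and the length n are hypothetical, and an OpenMP directive stands in for the two KAP assertions. The stand-in loop2000 deliberately contains a loop-carried dependence, which is the situation described in .2: the inner loop cannot be parallelized, but the four calls can run side by side because each one touches a different array.

```fortran
program task_loop_sketch
  implicit none
  integer, parameter :: n = 1000       ! hypothetical array length
  real :: eo1(n), eo2(n), eo3(n), eo4(n)
  integer :: i

  ! Assumption: OpenMP replaces the KAP assertions here.
  ! One iteration per original task; each call writes a different array.
  !$omp parallel do
  do i = 1, 4
     if (i == 1) then
        call loop2000(eo1, n, 1.0)
     else if (i == 2) then
        call loop2000(eo2, n, 2.0)
     else if (i == 3) then
        call loop2000(eo3, n, 3.0)
     else
        call loop2000(eo4, n, 4.0)
     end if
  end do
  !$omp end parallel do

  print *, eo1(n), eo2(n), eo3(n), eo4(n)

contains

  ! Hypothetical stand-in for the benchmark's loop2000.
  subroutine loop2000(eo, m, step)
    integer, intent(in) :: m
    real, intent(out) :: eo(m)
    real, intent(in) :: step
    integer :: k
    eo(1) = step
    do k = 2, m
       ! Each element depends on the previous one, so this loop
       ! is serial even though the four calls are independent.
       eo(k) = eo(k-1) + step
    end do
  end subroutine loop2000

end program task_loop_sketch
```

As in the original Cray tasking, parallelism is capped at four because there are only four iterations.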
| 366.5 | That's it | GEMGRP::PIEPER | Thu Feb 13 1997 18:54 | 2 | |
That is very much in the spirit of the original directives. And a lot less typing. Kudos, Mr. Manley!
| 366.6 | 48641::BOIRIN | Fri Feb 14 1997 10:30 | 11 | ||
Thank you, everybody. I ran my tests this afternoon and achieved
a speedup of 3.5 on 4 processors.
The customer is very impressed by our numbers.
We are competing for a 32-processor cluster against SGI, Convex and Sun.
Thank you again. I have learned a few things (but not enough) about HPC
with this bid. We'll win !!!!
JP.
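For reference, the reported figure can be put in terms of parallel efficiency. With S the speedup and p the processor count, and assuming (this is an assumption, not something measured in the thread) that the shortfall follows Amdahl's law with a serial fraction f, the numbers work out as:

```latex
E = \frac{S}{p} = \frac{3.5}{4} = 0.875,
\qquad
S = \frac{1}{f + \frac{1-f}{p}}
\;\Longrightarrow\;
f = \frac{\frac{1}{S} - \frac{1}{p}}{1 - \frac{1}{p}}
  = \frac{0.2857 - 0.25}{0.75} \approx 0.048
```

So the four tasks ran at about 87.5% efficiency. The implied 5% or so of non-overlapped time could equally come from load imbalance between the four calls, which this simple model does not distinguish from serial code.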