
Conference nicctr::kap-users

Title:	Kuck Associates Preprocessor Users
Notice:	KAP V2.1 (f90,f77,C) SSB-kits - see note 2
Moderator:	HPCGRP::DEGREGORY

Created:	Fri Nov 22 1991
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	390
Total number of notes:	1440

366.0. "Replacement for C$DIR.. ?" by 48641::BOIRIN () Mon Feb 10 1997 08:15

    Hello happy HPC people!
    
    I'm not a Fortran guru and I have a benchmark to run. I have some
    issues with the following: which KAP directives should be used?
    
    	     .
    	     .		
             kavz(i44)=1
    C$DIR BEGIN_TASKS
            call loop2000(eo1,vcoul1,str1,i1,i2)
    C$DIR NEXT_TASK
            call loop2000(eo2,vcoul2,str2,i22,i3)
    C$DIR NEXT_TASK
            call loop2000(eo3,vcoul3,str3,i33,i4)
    C$DIR NEXT_TASK
            call loop2000(eo4,vcoul4,str4,i44,i5)
    C$DIR END_TASKS
              
    Thank you for your help. Jean-Pierre.

T.R	Title	User	Personal Name	Date	Lines
366.1	a thought	HPCGRP::DEGREGORY	Karen 223-5801	`Mon Feb 10 1997 13:56`	19
	Jean-Pierre - I believe what those Cray directives are telling you is
	that you should run those 4 calls in parallel. The KAP products support
	parallelism at the loop level, so my guess is that you would end up
	parallelizing loop2000 itself. The first call would run on all the
	processors, then the second call would run on all the processors, etc.
	In other words, KAP would do the parallelism one level down.

	If you have the module that contains loop2000, run the KAP automatic
	parallelizer over it and see if it picks up and parallelizes the loop
	automatically. You will see a call to mppfrk, and the entire loop gets
	done as a subroutine named PKprogramname_loopnumber, where programname
	is the name of this module and loopnumber is incremented for every
	loop KAP parallelizes.

	Karen
366.2		48641::BOIRIN		`Thu Feb 13 1997 10:40`	12
	Hi Karen,

	Thank you for your help. I have suppressed all the C$DIR directives,
	linked the four calls together and used KAP to do automatic
	parallelization of the subroutine's loops. The main issue is that this
	code was written with tasking in mind, and now I have a lot of data
	dependencies within the loops. So the speed-up = 0 !!

	Do you think there is an easy way to simulate tasking with KAP?

	Thanks again. Best regards.
	Jean-Pierre.
366.3	Try doing what the source code said.	GEMGRP::PIEPER		`Thu Feb 13 1997 11:28`	36
	Jean-Pierre,

	You need to change the code like this:

	     .
	     .
	        kavz(i44)=1
	        new_eo(1) = eo1
	        new_eo(2) = eo2
	        new_eo(3) = eo3
	        new_eo(4) = eo4
	        (similar assignments for vcoul[1-4] and str[1-4])
	        new_arg4(1) = i1
	        new_arg4(2) = i22
	        (etc., and similar for the 5th argument)
	c$dir parallel do
	        do new_i = 1, 4
	           call loop2000( new_eo(new_i), new_vcoul(new_i),
	     &          new_str(new_i), new_arg4(new_i), new_arg5(new_i))
	        end do

	This will let KAP parallelize the four calls just the way it used to
	do. You may have to spell "c$dir parallel do" some other way for KAP;
	that is the PCF spelling. Karen can advise better about details.
	(Maybe you need a concurrent_call directive too?)

	You are limited to 4-way parallelism, but the code as written had that
	restriction, too.
366.4	Is this what you want?	HPCGRP::MANLEY		`Thu Feb 13 1997 13:41`	32
	Re: .0, .2

	Change this:

	        kavz(i44)=1
	C$DIR BEGIN_TASKS
	        call loop2000(eo1,vcoul1,str1,i1,i2)
	C$DIR NEXT_TASK
	        call loop2000(eo2,vcoul2,str2,i22,i3)
	C$DIR NEXT_TASK
	        call loop2000(eo3,vcoul3,str3,i33,i4)
	C$DIR NEXT_TASK
	        call loop2000(eo4,vcoul4,str4,i44,i5)
	C$DIR END_TASKS

	to something like this:

	C$      ASSERT CONCURRENT CALL
	C$      ASSERT DO( CONCURRENT )
	        DO I=1,4
	          IF( I.EQ.1 )THEN
	            call loop2000(eo1,vcoul1,str1,i1,i2)
	          ELSEIF( I.EQ.2 )THEN
	            call loop2000(eo2,vcoul2,str2,i22,i3)
	          ELSEIF( I.EQ.3 )THEN
	            call loop2000(eo3,vcoul3,str3,i33,i4)
	          ELSE
	            call loop2000(eo4,vcoul4,str4,i44,i5)
	          ENDIF
	        ENDDO
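
	The DO/IF rewrite above relies on the four calls being independent of
	one another. A minimal serial sketch of that property follows. It is
	not the benchmark code: the loop2000 stub, the REAL and INTEGER
	argument types, and all data values are assumptions made for
	illustration, since the thread shows neither the real routine nor its
	declarations. The loop is run in reverse order to show that the
	results do not depend on iteration order, which is the condition the
	concurrency assertions state.

```fortran
      program tasks
c     Sketch only: this loop2000 is a hypothetical stub, and all
c     argument types and values below are assumptions.
      implicit none
      real eo1, eo2, eo3, eo4, ref1, ref2, ref3, ref4
      real vcoul1, vcoul2, vcoul3, vcoul4
      real str1, str2, str3, str4
      integer i, i1, i2, i22, i3, i33, i4, i44, i5
      data vcoul1, vcoul2, vcoul3, vcoul4 /1.0, 2.0, 3.0, 4.0/
      data str1, str2, str3, str4 /0.5, 0.5, 0.5, 0.5/
      data i1, i2, i22, i3 /1, 2, 3, 4/
      data i33, i4, i44, i5 /5, 6, 7, 8/

c     Reference: the four calls in their original straight-line order.
      call loop2000(ref1, vcoul1, str1, i1, i2)
      call loop2000(ref2, vcoul2, str2, i22, i3)
      call loop2000(ref3, vcoul3, str3, i33, i4)
      call loop2000(ref4, vcoul4, str4, i44, i5)

c     Same calls dispatched from a loop, deliberately in reverse.
c     Each iteration writes a different variable, so order is free.
      do i = 4, 1, -1
         if (i .eq. 1) then
            call loop2000(eo1, vcoul1, str1, i1, i2)
         else if (i .eq. 2) then
            call loop2000(eo2, vcoul2, str2, i22, i3)
         else if (i .eq. 3) then
            call loop2000(eo3, vcoul3, str3, i33, i4)
         else
            call loop2000(eo4, vcoul4, str4, i44, i5)
         end if
      end do

      if (eo1 .ne. ref1 .or. eo2 .ne. ref2 .or.
     &    eo3 .ne. ref3 .or. eo4 .ne. ref4) then
         print *, 'mismatch'
      else
         print *, 'dispatch loop matches straight-line calls'
      end if
      end

      subroutine loop2000(eo, vcoul, str, ia, ib)
c     Hypothetical stub: writes only its first argument.
      implicit none
      real eo, vcoul, str
      integer ia, ib
      eo = vcoul * str + real(ia + ib)
      end
```

	If the real loop2000 writes to shared state (for example a COMMON
	block) or if two calls share an argument that is written, this
	independence would not hold and the assertions would be unsafe.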
366.5	That's it	GEMGRP::PIEPER		`Thu Feb 13 1997 18:54`	2
	That is very much in the spirit of the original directives. And a lot less typing. Kudos, Mr. Manley!
366.6		48641::BOIRIN		`Fri Feb 14 1997 10:30`	11
	Thank you everybody.

	I ran my tests this afternoon and achieved a speedup of 3.5 on 4
	processors. The customer is very impressed by our numbers. We are
	fighting for a 32-processor cluster against SGI, Convex and Sun.

	Thank you again. I have learned a few things (but not enough) about
	HPC with this bid. We'll win !!!!

	JP.