FV3 Bundle
mpp.F90
Go to the documentation of this file.
1 !***********************************************************************
2 !* GNU Lesser General Public License
3 !*
4 !* This file is part of the GFDL Flexible Modeling System (FMS).
5 !*
6 !* FMS is free software: you can redistribute it and/or modify it under
7 !* the terms of the GNU Lesser General Public License as published by
8 !* the Free Software Foundation, either version 3 of the License, or (at
9 !* your option) any later version.
10 !*
11 !* FMS is distributed in the hope that it will be useful, but WITHOUT
12 !* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
13 !* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
14 !* for more details.
15 !*
16 !* You should have received a copy of the GNU Lesser General Public
17 !* License along with FMS. If not, see <http://www.gnu.org/licenses/>.
18 !***********************************************************************
19 !-----------------------------------------------------------------------
20 ! Communication for message-passing codes
21 !
22 ! AUTHOR: V. Balaji (V.Balaji@noaa.gov)
23 ! SGI/GFDL Princeton University
24 !
25 ! This program is free software; you can redistribute it and/or modify
26 ! it under the terms of the GNU General Public License as published by
27 ! the Free Software Foundation; either version 2 of the License, or
28 ! (at your option) any later version.
29 !
30 ! This program is distributed in the hope that it will be useful,
31 ! but WITHOUT ANY WARRANTY; without even the implied warranty of
32 ! MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
33 ! GNU General Public License for more details.
34 !
35 ! For the full text of the GNU General Public License,
36 ! write to: Free Software Foundation, Inc.,
37 ! 675 Mass Ave, Cambridge, MA 02139, USA.
38 !-----------------------------------------------------------------------
39 module mpp_mod
40 !a generalized communication package for use with shmem and MPI
41 !will add: co_array_fortran, MPI2
42 !Balaji (V.Balaji@noaa.gov) 11 May 1998
43 
44 ! <CONTACT EMAIL="V.Balaji@noaa.gov">
45 ! V. Balaji
46 ! </CONTACT>
47 
48 ! <HISTORY SRC="http://www.gfdl.noaa.gov/fms-cgi-bin/cvsweb.cgi/FMS/"/>
49 ! <RCSLOG SRC="http://www.gfdl.noaa.gov/~vb/changes_mpp.html"/>
50 
51 ! <OVERVIEW>
52 ! <TT>mpp_mod</TT>, is a set of simple calls to provide a uniform interface
53 ! to different message-passing libraries. It currently can be
54 ! implemented either in the SGI/Cray native SHMEM library or in the MPI
55 ! standard. Other libraries (e.g MPI-2, Co-Array Fortran) can be
56 ! incorporated as the need arises.
57 ! </OVERVIEW>
58 
59 ! <DESCRIPTION>
60 ! The data transfer between a processor and its own memory is based
61 ! on <TT>load</TT> and <TT>store</TT> operations upon
62 ! memory. Shared-memory systems (including distributed shared memory
63 ! systems) have a single address space and any processor can acquire any
64 ! data within the memory by <TT>load</TT> and
65 ! <TT>store</TT>. The situation is different for distributed
66 ! parallel systems. Specialized MPP systems such as the T3E can simulate
67 ! shared-memory by direct data acquisition from remote memory. But if
68 ! the parallel code is distributed across a cluster, or across the Net,
69 ! messages must be sent and received using the protocols for
70 ! long-distance communication, such as TCP/IP. This requires a
71 ! ``handshaking'' between nodes of the distributed system. One can think
72 ! of the two different methods as involving <TT>put</TT>s or
73 ! <TT>get</TT>s (e.g the SHMEM library), or in the case of
74 ! negotiated communication (e.g MPI), <TT>send</TT>s and
75 ! <TT>recv</TT>s.
76 !
77 ! The difference between SHMEM and MPI is that SHMEM uses one-sided
78 ! communication, which can have very low-latency high-bandwidth
79 ! implementations on tightly coupled systems. MPI is a standard
80 ! developed for distributed computing across loosely-coupled systems,
81 ! and therefore incurs a software penalty for negotiating the
82 ! communication. It is however an open industry standard whereas SHMEM
83 ! is a proprietary interface. Besides, the <TT>put</TT>s or
84 ! <TT>get</TT>s on which it is based cannot currently be implemented in
85 ! a cluster environment (there are recent announcements from Compaq that
86 ! occasion hope).
87 !
88 ! The message-passing requirements of climate and weather codes can be
89 ! reduced to a fairly simple minimal set, which is easily implemented in
90 ! any message-passing API. <TT>mpp_mod</TT> provides this API.
91 !
92 ! Features of <TT>mpp_mod</TT> include:
93 !
94 ! 1) Simple, minimal API, with free access to underlying API for
95 ! more complicated stuff.<BR/>
96 ! 2) Design toward typical use in climate/weather CFD codes.<BR/>
97 ! 3) Performance to be not significantly lower than any native API.
98 !
99 ! This module is used to develop higher-level calls for <LINK
100 ! SRC="mpp_domains.html">domain decomposition</LINK> and <LINK
101 ! SRC="mpp_io.html">parallel I/O</LINK>.
102 !
103 ! Parallel computing is initially daunting, but it soon becomes
104 ! second nature, much the way many of us can now write vector code
105 ! without much effort. The key insight required while reading and
106 ! writing parallel code is in arriving at a mental grasp of several
107 ! independent parallel execution streams through the same code (the SPMD
108 ! model). Each variable you examine may have different values for each
109 ! stream, the processor ID being an obvious example. Subroutines and
110 ! function calls are particularly subtle, since it is not always obvious
111 ! from looking at a call what synchronization between execution streams
112 ! it implies. An example of erroneous code would be a global barrier
113 ! call (see <LINK SRC="#mpp_sync">mpp_sync</LINK> below) placed
114 ! within a code block that not all PEs will execute, e.g:
115 !
116 ! <PRE>
117 ! if( pe.EQ.0 )call mpp_sync()
118 ! </PRE>
119 !
120 ! Here only PE 0 reaches the barrier, where it will wait
121 ! indefinitely. While this is a particularly egregious example to
122 ! illustrate the coding flaw, more subtle versions of the same are
123 ! among the most common errors in parallel code.
124 !
125 ! It is therefore important to be conscious of the context of a
126 ! subroutine or function call, and the implied synchronization. There
127 ! are certain calls here (e.g <TT>mpp_declare_pelist, mpp_init,
128 ! mpp_malloc, mpp_set_stack_size</TT>) which must be called by all
129 ! PEs. There are others which must be called by a subset of PEs (here
130 ! called a <TT>pelist</TT>) which must be called by all the PEs in the
131 ! <TT>pelist</TT> (e.g <TT>mpp_max, mpp_sum, mpp_sync</TT>). Still
132 ! others imply no synchronization at all. I will make every effort to
133 ! highlight the context of each call in the MPP modules, so that the
134 ! implicit synchronization is spelt out.
135 !
136 ! For performance it is necessary to keep synchronization as limited
137 ! as the algorithm being implemented will allow. For instance, a single
138 ! message between two PEs should only imply synchronization across the
139 ! PEs in question. A <I>global</I> synchronization (or <I>barrier</I>)
140 ! is likely to be slow, and is best avoided. But codes first
141 ! parallelized on a Cray T3E tend to have many global syncs, as very
142 ! fast barriers were implemented there in hardware.
143 !
144 ! Another reason to use pelists is to run a single program in MPMD
145 ! mode, where different PE subsets work on different portions of the
146 ! code. A typical example is to assign an ocean model and atmosphere
147 ! model to different PE subsets, and couple them concurrently instead of
148 ! running them serially. The MPP module provides the notion of a
149 ! <I>current pelist</I>, which is set when a group of PEs branch off
150 ! into a subset. Subsequent calls that omit the <TT>pelist</TT> optional
151 ! argument (seen below in many of the individual calls) assume that the
152 ! implied synchronization is across the current pelist. The calls
153 ! <TT>mpp_root_pe</TT> and <TT>mpp_npes</TT> also return the values
154 ! appropriate to the current pelist. The <TT>mpp_set_current_pelist</TT>
155 ! call is provided to set the current pelist.
156 
157 ! </DESCRIPTION>
158 ! <PUBLIC>
159 ! F90 is a strictly-typed language, and the syntax pass of the
160 ! compiler requires matching of type, kind and rank (TKR). Most calls
161 ! listed here use a generic type, shown here as <TT>MPP_TYPE_</TT>. This
162 ! is resolved in the pre-processor stage to any of a variety of
163 ! types. In general the MPP operations work on 4-byte and 8-byte
164 ! variants of <TT>integer, real, complex, logical</TT> variables, of
165 ! rank 0 to 5, leading to 48 specific module procedures under the same
166 ! generic interface. Any of the variables below shown as
167 ! <TT>MPP_TYPE_</TT> is treated in this way.
168 ! </PUBLIC>
169 
170 ! Define rank(X) for PGI compiler
171 #ifdef __PGI
172 #define rank(X) size(shape(X))
173 #endif
174 
175 #include <fms_platform.h>
176 
177 #if defined(use_libSMA) && defined(sgi_mipspro)
178  use shmem_interface
179 #endif
180 
181 #if defined(use_libMPI) && defined(sgi_mipspro)
182  use mpi
183 #endif
184 
194  use mpp_parameter_mod, only : mpp_parameter_version=>version
195  use mpp_parameter_mod, only : default_tag
202  use mpp_data_mod, only : stat, mpp_stack, ptr_stack, status, ptr_status, sync, ptr_sync
203  use mpp_data_mod, only : mpp_from_pe, ptr_from, remote_data_loc, ptr_remote
204  use mpp_data_mod, only : mpp_data_version=>version
205 
206 implicit none
207 private
208 
209 #if defined(use_libSMA)
210 #include <mpp/shmem.fh>
211 #endif
212 
213 #if defined(use_libMPI) && !defined(sgi_mipspro)
214 #include <mpif.h>
215 !sgi_mipspro gets this from 'use mpi'
216 #endif
217 
218  !--- public paramters -----------------------------------------------
219  public :: mpp_verbose, mpp_debug, all_pes, any_pe, null_pe, note, warning, fatal
220  public :: mpp_clock_sync, mpp_clock_detailed, clock_component, clock_subcomponent
221  public :: clock_module_driver, clock_module, clock_routine, clock_loop, clock_infra
222  public :: maxpes, event_recv, event_send, input_str_length
223  public :: comm_tag_1, comm_tag_2, comm_tag_3, comm_tag_4
224  public :: comm_tag_5, comm_tag_6, comm_tag_7, comm_tag_8
225  public :: comm_tag_9, comm_tag_10, comm_tag_11, comm_tag_12
226  public :: comm_tag_13, comm_tag_14, comm_tag_15, comm_tag_16
227  public :: comm_tag_17, comm_tag_18, comm_tag_19, comm_tag_20
228  public :: mpp_fill_int,mpp_fill_double
229 
230  !--- public data from mpp_data_mod ------------------------------
231 ! public :: request
232 
233  !--- public interface from mpp_util.h ------------------------------
234  public :: stdin, stdout, stderr, stdlog, lowercase, uppercase, mpp_error, mpp_error_state
235  public :: mpp_set_warn_level, mpp_sync, mpp_sync_self, mpp_set_stack_size, mpp_pe
236  public :: mpp_node, mpp_npes, mpp_root_pe, mpp_set_root_pe, mpp_declare_pelist
237  public :: mpp_get_current_pelist, mpp_set_current_pelist, mpp_get_current_pelist_name
238  public :: mpp_clock_id, mpp_clock_set_grain, mpp_record_timing_data, get_unit
239  public :: read_ascii_file, read_input_nml, mpp_clock_begin, mpp_clock_end
240  public :: get_ascii_file_num_lines
241  public :: mpp_record_time_start, mpp_record_time_end
242 
243  !--- public interface from mpp_comm.h ------------------------------
245  public :: mpp_sum_ad
246  public :: mpp_broadcast, mpp_malloc, mpp_init, mpp_exit
248  public :: mpp_type, mpp_byte, mpp_type_create, mpp_type_free
249 #ifdef use_MPI_GSM
250  public :: mpp_gsm_malloc, mpp_gsm_free
251 #endif
252 
253  !*********************************************************************
254  !
255  ! public data type
256  !
257  !*********************************************************************
258  !peset hold communicators as SHMEM-compatible triads (start, log2(stride), num)
259  type :: communicator
260  private
261  character(len=32) :: name
262  integer, pointer :: list(:) =>null()
263  integer :: count
264  integer :: start, log2stride ! dummy variables when libMPI is defined.
265  integer :: id, group ! MPI communicator and group id for this PE set.
266  ! dummy variables when libSMA is defined.
267  end type communicator
268 
269  type :: event
270  private
271  character(len=16) :: name
272  integer(LONG_KIND), dimension(MAX_EVENTS) :: ticks, bytes
273  integer :: calls
274  end type event
275 
276  !a clock contains an array of event profiles for a region
277  type :: clock
278  private
279  character(len=32) :: name
280  integer(LONG_KIND) :: tick
281  integer(LONG_KIND) :: total_ticks
282  integer :: peset_num
283  logical :: sync_on_begin, detailed
284  integer :: grain
285  type(event), pointer :: events(:) =>null() !if needed, allocate to MAX_EVENT_TYPES
286  logical :: is_on !initialize to false. set true when calling mpp_clock_begin
287  ! set false when calling mpp_clock_end
288  end type clock
289 
291  private
292  character(len=16) :: name
293  real(DOUBLE_KIND) :: msg_size_sums(max_bins)
294  real(DOUBLE_KIND) :: msg_time_sums(max_bins)
295  real(DOUBLE_KIND) :: total_data
296  real(DOUBLE_KIND) :: total_time
297  integer(LONG_KIND) :: msg_size_cnts(max_bins)
298  integer(LONG_KIND) :: total_cnts
299  end type clock_data_summary
300 
302  private
303  character(len=16) :: name
304  type(clock_data_summary) :: event(max_event_types)
305  end type summary_struct
306 
307  ! Data types for generalized data transfer (e.g. MPI_Type)
308  type :: mpp_type
309  private
310  integer :: counter ! Number of instances of this type
311  integer :: ndims
312  integer, allocatable :: sizes(:)
313  integer, allocatable :: subsizes(:)
314  integer, allocatable :: starts(:)
315  integer :: etype ! Elementary data type (e.g. MPI_BYTE)
316  integer :: id ! Identifier within message passing library (e.g. MPI)
317 
318  type(mpp_type), pointer :: prev => null()
319  type(mpp_type), pointer :: next => null()
320  end type mpp_type
321 
322  ! Persisent elements for linked list interaction
324  private
325  type(mpp_type), pointer :: head => null()
326  type(mpp_type), pointer :: tail => null()
327  integer :: length
328  end type mpp_type_list
329 
330 !***********************************************************************
331 !
332 ! public interface from mpp_util.h
333 !
334 !***********************************************************************
335  ! <INTERFACE NAME="mpp_error">
336  ! <OVERVIEW>
337  ! Error handler.
338  ! </OVERVIEW>
339  ! <DESCRIPTION>
340  ! It is strongly recommended that all error exits pass through
341  ! <TT>mpp_error</TT> to assure the program fails cleanly. An individual
342  ! PE encountering a <TT>STOP</TT> statement, for instance, can cause the
343  ! program to hang. The use of the <TT>STOP</TT> statement is strongly
344  ! discouraged.
345  !
346  ! Calling mpp_error with no arguments produces an immediate error
347  ! exit, i.e:
348  ! <PRE>
349  ! call mpp_error
350  ! call mpp_error(FATAL)
351  ! </PRE>
352  ! are equivalent.
353  !
354  ! The argument order
355  ! <PRE>
356  ! call mpp_error( routine, errormsg, errortype )
357  ! </PRE>
358  ! is also provided to support legacy code. In this version of the
359  ! call, none of the arguments may be omitted.
360  !
361  ! The behaviour of <TT>mpp_error</TT> for a <TT>WARNING</TT> can be
362  ! controlled with an additional call <TT>mpp_set_warn_level</TT>.
363  ! <PRE>
364  ! call mpp_set_warn_level(ERROR)
365  ! </PRE>
366  ! causes <TT>mpp_error</TT> to treat <TT>WARNING</TT>
367  ! exactly like <TT>FATAL</TT>.
368  ! <PRE>
369  ! call mpp_set_warn_level(WARNING)
370  ! </PRE>
371  ! resets to the default behaviour described above.
372  !
373  ! <TT>mpp_error</TT> also has an internal error state which
374  ! maintains knowledge of whether a warning has been issued. This can be
375  ! used at startup in a subroutine that checks if the model has been
376  ! properly configured. You can generate a series of warnings using
377  ! <TT>mpp_error</TT>, and then check at the end if any warnings has been
378  ! issued using the function <TT>mpp_error_state()</TT>. If the value of
379  ! this is <TT>WARNING</TT>, at least one warning has been issued, and
380  ! the user can take appropriate action:
381  !
382  ! <PRE>
383  ! if( ... )call mpp_error( WARNING, '...' )
384  ! if( ... )call mpp_error( WARNING, '...' )
385  ! if( ... )call mpp_error( WARNING, '...' )
386  ! ...
387  ! if( mpp_error_state().EQ.WARNING )call mpp_error( FATAL, '...' )
388  ! </PRE>
389  ! </DESCRIPTION>
390  ! <TEMPLATE>
391  ! call mpp_error( errortype, routine, errormsg )
392  ! </TEMPLATE>
393  ! <IN NAME="errortype">
394  ! One of <TT>NOTE</TT>, <TT>WARNING</TT> or <TT>FATAL</TT>
395  ! (these definitions are acquired by use association).
396  ! <TT>NOTE</TT> writes <TT>errormsg</TT> to <TT>STDOUT</TT>.
397  ! <TT>WARNING</TT> writes <TT>errormsg</TT> to <TT>STDERR</TT>.
398  ! <TT>FATAL</TT> writes <TT>errormsg</TT> to <TT>STDERR</TT>,
399  ! and induces a clean error exit with a call stack traceback.
400  ! </IN>
401  ! </INTERFACE>
402  interface mpp_error
403  module procedure mpp_error_basic
404  module procedure mpp_error_mesg
405  module procedure mpp_error_noargs
406  module procedure mpp_error_is
407  module procedure mpp_error_rs
408  module procedure mpp_error_ia
409  module procedure mpp_error_ra
410  module procedure mpp_error_ia_ia
411  module procedure mpp_error_ia_ra
412  module procedure mpp_error_ra_ia
413  module procedure mpp_error_ra_ra
414  module procedure mpp_error_ia_is
415  module procedure mpp_error_ia_rs
416  module procedure mpp_error_ra_is
417  module procedure mpp_error_ra_rs
418  module procedure mpp_error_is_ia
419  module procedure mpp_error_is_ra
420  module procedure mpp_error_rs_ia
421  module procedure mpp_error_rs_ra
422  module procedure mpp_error_is_is
423  module procedure mpp_error_is_rs
424  module procedure mpp_error_rs_is
425  module procedure mpp_error_rs_rs
426  end interface
427 
428  interface array_to_char
429  module procedure iarray_to_char
430  module procedure rarray_to_char
431  end interface
432 
433 !***********************************************************************
434 !
435 ! public interface from mpp_comm.h
436 !
437 !***********************************************************************
438 #ifdef use_libSMA
439  !currently SMA contains no generic shmem_wait for different integer kinds:
440  !I have inserted one here
441  interface shmem_integer_wait
442  module procedure shmem_int4_wait_local
443  module procedure shmem_int8_wait_local
444  end interface
445 #endif
446 
447 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
448  ! !
449  ! ROUTINES TO INITIALIZE/FINALIZE MPP MODULE: mpp_init, mpp_exit !
450  ! !
451 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
452 
453  ! <SUBROUTINE NAME="mpp_init">
454  ! <OVERVIEW>
455  ! Initialize <TT>mpp_mod</TT>.
456  ! </OVERVIEW>
457  ! <DESCRIPTION>
458  ! Called to initialize the <TT>mpp_mod</TT> package. It is recommended
459  ! that this call be the first executed line in your program. It sets the
460  ! number of PEs assigned to this run (acquired from the command line, or
461  ! through the environment variable <TT>NPES</TT>), and associates an ID
462  ! number to each PE. These can be accessed by calling <LINK
463  ! SRC="#mpp_npes"><TT>mpp_npes</TT></LINK> and <LINK
464  ! SRC="#mpp_pe"><TT>mpp_pe</TT></LINK>.
465  ! </DESCRIPTION>
466  ! <TEMPLATE>
467  ! call mpp_init( flags )
468  ! </TEMPLATE>
469  ! <IN NAME="flags" TYPE="integer">
470  ! <TT>flags</TT> can be set to <TT>MPP_VERBOSE</TT> to
471  ! have <TT>mpp_mod</TT> keep you informed of what it's up to.
472  ! </IN>
473  ! </SUBROUTINE>
474 
475  ! <SUBROUTINE NAME="mpp_exit">
476  ! <OVERVIEW>
477  ! Exit <TT>mpp_mod</TT>.
478  ! </OVERVIEW>
479  ! <DESCRIPTION>
480  ! Called at the end of the run, or to re-initialize <TT>mpp_mod</TT>,
481  ! should you require that for some odd reason.
482  !
483  ! This call implies synchronization across all PEs.
484  ! </DESCRIPTION>
485  ! <TEMPLATE>
486  ! call mpp_exit()
487  ! </TEMPLATE>
488  ! </SUBROUTINE>
489 
490  !#######################################################################
491  ! <SUBROUTINE NAME="mpp_malloc">
492  ! <OVERVIEW>
493  ! Symmetric memory allocation.
494  ! </OVERVIEW>
495  ! <DESCRIPTION>
496  ! This routine is used on SGI systems when <TT>mpp_mod</TT> is
497  ! invoked in the SHMEM library. It ensures that dynamically allocated
498  ! memory can be used with <TT>shmem_get</TT> and
499  ! <TT>shmem_put</TT>. This is called <I>symmetric
500  ! allocation</I> and is described in the
501  ! <TT>intro_shmem</TT> man page. <TT>ptr</TT> is a <I>Cray
502  ! pointer</I> (see the section on <LINK
503  ! SRC="#PORTABILITY">portability</LINK>). The operation can be expensive
504  ! (since it requires a global barrier). We therefore attempt to re-use
505  ! existing allocation whenever possible. Therefore <TT>len</TT>
506  ! and <TT>ptr</TT> must have the <TT>SAVE</TT> attribute
507  ! in the calling routine, and retain the information about the last call
508  ! to <TT>mpp_malloc</TT>. Additional memory is symmetrically
509  ! allocated if and only if <TT>newlen</TT> exceeds
510  ! <TT>len</TT>.
511  !
512  ! This is never required on Cray PVP or MPP systems. While the T3E
513  ! manpages do talk about symmetric allocation, <TT>mpp_mod</TT>
514  ! is coded to remove this restriction.
515  !
516  ! It is never required if <TT>mpp_mod</TT> is invoked in MPI.
517  !
518  ! This call implies synchronization across all PEs.
519  ! </DESCRIPTION>
520  ! <TEMPLATE>
521  ! call mpp_malloc( ptr, newlen, len )
522  ! </TEMPLATE>
523  ! <IN NAME="ptr">
524  ! a cray pointer, points to a dummy argument in this routine.
525  ! </IN>
526  ! <IN NAME="newlen" TYPE="integer">
527  ! the required allocation length for the pointer ptr
528  ! </IN>
529  ! <IN NAME="len" TYPE="integer">
530  ! the current allocation (0 if unallocated).
531  ! </IN>
532  ! </SUBROUTINE>
533 
534  !#####################################################################
535 
536  ! <SUBROUTINE NAME="mpp_set_stack_size">
537  ! <OVERVIEW>
538  ! Allocate module internal workspace.
539  ! </OVERVIEW>
540  ! <DESCRIPTION>
541  ! <TT>mpp_mod</TT> maintains a private internal array called
542  ! <TT>mpp_stack</TT> for private workspace. This call sets the length,
543  ! in words, of this array.
544  !
545  ! The <TT>mpp_init</TT> call sets this
546  ! workspace length to a default of 32768, and this call may be used if a
547  ! longer workspace is needed.
548  !
549  ! This call implies synchronization across all PEs.
550  !
551  ! This workspace is symmetrically allocated, as required for
552  ! efficient communication on SGI and Cray MPP systems. Since symmetric
553  ! allocation must be performed by <I>all</I> PEs in a job, this call
554  ! must also be called by all PEs, using the same value of
555  ! <TT>n</TT>. Calling <TT>mpp_set_stack_size</TT> from a subset of PEs,
556  ! or with unequal argument <TT>n</TT>, may cause the program to hang.
557  !
558  ! If any MPP call using <TT>mpp_stack</TT> overflows the declared
559  ! stack array, the program will abort with a message specifying the
560  ! stack length that is required. Many users wonder why, if the required
561  ! stack length can be computed, it cannot also be specified at that
562  ! point. This cannot be automated because there is no way for the
563  ! program to know if all PEs are present at that call, and with equal
564  ! values of <TT>n</TT>. The program must be rerun by the user with the
565  ! correct argument to <TT>mpp_set_stack_size</TT>, called at an
566  ! appropriate point in the code where all PEs are known to be present.
567  ! </DESCRIPTION>
568  ! <TEMPLATE>
569  ! call mpp_set_stack_size(n)
570  ! </TEMPLATE>
571  ! <IN NAME="n" TYPE="integer"></IN>
572  ! </SUBROUTINE>
573 
574 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
575 ! !
576 ! DATA TRANSFER TYPES: mpp_type_create !
577 ! !
578 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
579 
580  interface mpp_type_create
581  module procedure mpp_type_create_int4
582  module procedure mpp_type_create_int8
583  module procedure mpp_type_create_real4
584  module procedure mpp_type_create_real8
585  module procedure mpp_type_create_logical4
586  module procedure mpp_type_create_logical8
587  end interface mpp_type_create
588 
589 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
590  ! !
591  ! GLOBAL REDUCTION ROUTINES: mpp_max, mpp_sum, mpp_min !
592  ! !
593 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
594 
595  ! <INTERFACE NAME="mpp_max">
596  ! <OVERVIEW>
597  ! Reduction operations.
598  ! </OVERVIEW>
599  ! <DESCRIPTION>
600  ! Find the max of scalar a the PEs in pelist
601  ! result is also automatically broadcast to all PEs
602  ! </DESCRIPTION>
603  ! <TEMPLATE>
604  ! call mpp_max( a, pelist )
605  ! </TEMPLATE>
606  ! <IN NAME="a">
607  ! <TT>real</TT> or <TT>integer</TT>, of 4-byte of 8-byte kind.
608  ! </IN>
609  ! <IN NAME="pelist">
610  ! If <TT>pelist</TT> is omitted, the context is assumed to be the
611  ! current pelist. This call implies synchronization across the PEs in
612  ! <TT>pelist</TT>, or the current pelist if <TT>pelist</TT> is absent.
613  ! </IN>
614  ! </INTERFACE>
615 
616  interface mpp_max
617  module procedure mpp_max_real8_0d
618  module procedure mpp_max_real8_1d
619 #ifndef no_8byte_integers
620  module procedure mpp_max_int8_0d
621  module procedure mpp_max_int8_1d
622 #endif
623 #ifdef OVERLOAD_R4
624  module procedure mpp_max_real4_0d
625  module procedure mpp_max_real4_1d
626 #endif
627  module procedure mpp_max_int4_0d
628  module procedure mpp_max_int4_1d
629  end interface
630 
631  interface mpp_min
632  module procedure mpp_min_real8_0d
633  module procedure mpp_min_real8_1d
634 #ifndef no_8byte_integers
635  module procedure mpp_min_int8_0d
636  module procedure mpp_min_int8_1d
637 #endif
638 #ifdef OVERLOAD_R4
639  module procedure mpp_min_real4_0d
640  module procedure mpp_min_real4_1d
641 #endif
642  module procedure mpp_min_int4_0d
643  module procedure mpp_min_int4_1d
644  end interface
645 
646 
647  ! <INTERFACE NAME="mpp_sum">
648  ! <OVERVIEW>
649  ! Reduction operation.
650  ! </OVERVIEW>
651  ! <DESCRIPTION>
652  ! <TT>MPP_TYPE_</TT> corresponds to any 4-byte and 8-byte variant of
653  ! <TT>integer, real, complex</TT> variables, of rank 0 or 1. A
654  ! contiguous block from a multi-dimensional array may be passed by its
655  ! starting address and its length, as in <TT>f77</TT>.
656  !
657  ! Library reduction operators are not required or guaranteed to be
658  ! bit-reproducible. In any case, changing the processor count changes
659  ! the data layout, and thus very likely the order of operations. For
660  ! bit-reproducible sums of distributed arrays, consider using the
661  ! <TT>mpp_global_sum</TT> routine provided by the <LINK
662  ! SRC="mpp_domains.html"><TT>mpp_domains</TT></LINK> module.
663  !
664  ! The <TT>bit_reproducible</TT> flag provided in earlier versions of
665  ! this routine has been removed.
666  !
667  !
668  ! If <TT>pelist</TT> is omitted, the context is assumed to be the
669  ! current pelist. This call implies synchronization across the PEs in
670  ! <TT>pelist</TT>, or the current pelist if <TT>pelist</TT> is absent.
671  ! </DESCRIPTION>
672  ! <TEMPLATE>
673  ! call mpp_sum( a, length, pelist )
674  ! </TEMPLATE>
675  ! <IN NAME="length"></IN>
676  ! <IN NAME="pelist"></IN>
677  ! <INOUT NAME="a"></INOUT>
678  ! </INTERFACE>
679 
680  interface mpp_sum
681 #ifndef no_8byte_integers
682  module procedure mpp_sum_int8
683  module procedure mpp_sum_int8_scalar
684  module procedure mpp_sum_int8_2d
685  module procedure mpp_sum_int8_3d
686  module procedure mpp_sum_int8_4d
687  module procedure mpp_sum_int8_5d
688 #endif
689  module procedure mpp_sum_real8
690  module procedure mpp_sum_real8_scalar
691  module procedure mpp_sum_real8_2d
692  module procedure mpp_sum_real8_3d
693  module procedure mpp_sum_real8_4d
694  module procedure mpp_sum_real8_5d
695 #ifdef OVERLOAD_C8
696  module procedure mpp_sum_cmplx8
697  module procedure mpp_sum_cmplx8_scalar
698  module procedure mpp_sum_cmplx8_2d
699  module procedure mpp_sum_cmplx8_3d
700  module procedure mpp_sum_cmplx8_4d
701  module procedure mpp_sum_cmplx8_5d
702 #endif
703  module procedure mpp_sum_int4
704  module procedure mpp_sum_int4_scalar
705  module procedure mpp_sum_int4_2d
706  module procedure mpp_sum_int4_3d
707  module procedure mpp_sum_int4_4d
708  module procedure mpp_sum_int4_5d
709 #ifdef OVERLOAD_R4
710  module procedure mpp_sum_real4
711  module procedure mpp_sum_real4_scalar
712  module procedure mpp_sum_real4_2d
713  module procedure mpp_sum_real4_3d
714  module procedure mpp_sum_real4_4d
715  module procedure mpp_sum_real4_5d
716 #endif
717 #ifdef OVERLOAD_C4
718  module procedure mpp_sum_cmplx4
719  module procedure mpp_sum_cmplx4_scalar
720  module procedure mpp_sum_cmplx4_2d
721  module procedure mpp_sum_cmplx4_3d
722  module procedure mpp_sum_cmplx4_4d
723  module procedure mpp_sum_cmplx4_5d
724 #endif
725  end interface
726 
727  interface mpp_sum_ad
728 #ifndef no_8byte_integers
729  module procedure mpp_sum_int8_ad
730  module procedure mpp_sum_int8_scalar_ad
731  module procedure mpp_sum_int8_2d_ad
732  module procedure mpp_sum_int8_3d_ad
733  module procedure mpp_sum_int8_4d_ad
734  module procedure mpp_sum_int8_5d_ad
735 #endif
736  module procedure mpp_sum_real8_ad
737  module procedure mpp_sum_real8_scalar_ad
738  module procedure mpp_sum_real8_2d_ad
739  module procedure mpp_sum_real8_3d_ad
740  module procedure mpp_sum_real8_4d_ad
741  module procedure mpp_sum_real8_5d_ad
742 #ifdef OVERLOAD_C8
743  module procedure mpp_sum_cmplx8_ad
744  module procedure mpp_sum_cmplx8_scalar_ad
745  module procedure mpp_sum_cmplx8_2d_ad
746  module procedure mpp_sum_cmplx8_3d_ad
747  module procedure mpp_sum_cmplx8_4d_ad
748  module procedure mpp_sum_cmplx8_5d_ad
749 #endif
750  module procedure mpp_sum_int4_ad
751  module procedure mpp_sum_int4_scalar_ad
752  module procedure mpp_sum_int4_2d_ad
753  module procedure mpp_sum_int4_3d_ad
754  module procedure mpp_sum_int4_4d_ad
755  module procedure mpp_sum_int4_5d_ad
756 #ifdef OVERLOAD_R4
757  module procedure mpp_sum_real4_ad
758  module procedure mpp_sum_real4_scalar_ad
759  module procedure mpp_sum_real4_2d_ad
760  module procedure mpp_sum_real4_3d_ad
761  module procedure mpp_sum_real4_4d_ad
762  module procedure mpp_sum_real4_5d_ad
763 #endif
764 #ifdef OVERLOAD_C4
765  module procedure mpp_sum_cmplx4_ad
766  module procedure mpp_sum_cmplx4_scalar_ad
767  module procedure mpp_sum_cmplx4_2d_ad
768  module procedure mpp_sum_cmplx4_3d_ad
769  module procedure mpp_sum_cmplx4_4d_ad
770  module procedure mpp_sum_cmplx4_5d_ad
771 #endif
772  end interface
773 
774  !#####################################################################
775  ! <INTERFACE NAME="mpp_gather">
776  ! <OVERVIEW>
777  ! gather information onto root pe.
778  ! </OVERVIEW>
779  ! </INTERFACE>
780  interface mpp_gather
781  module procedure mpp_gather_logical_1d
782  module procedure mpp_gather_int4_1d
783  module procedure mpp_gather_real4_1d
784  module procedure mpp_gather_real8_1d
785  module procedure mpp_gather_logical_1dv
786  module procedure mpp_gather_int4_1dv
787  module procedure mpp_gather_real4_1dv
788  module procedure mpp_gather_real8_1dv
789  module procedure mpp_gather_pelist_logical_2d
790  module procedure mpp_gather_pelist_logical_3d
791  module procedure mpp_gather_pelist_int4_2d
792  module procedure mpp_gather_pelist_int4_3d
793  module procedure mpp_gather_pelist_real4_2d
794  module procedure mpp_gather_pelist_real4_3d
795  module procedure mpp_gather_pelist_real8_2d
796  module procedure mpp_gather_pelist_real8_3d
797  end interface
798 
799  !#####################################################################
800  ! <INTERFACE NAME="mpp_scatter">
801  ! <OVERVIEW>
802  ! gather information onto root pe.
803  ! </OVERVIEW>
804  ! </INTERFACE>
805  interface mpp_scatter
806  module procedure mpp_scatter_pelist_int4_2d
807  module procedure mpp_scatter_pelist_int4_3d
808  module procedure mpp_scatter_pelist_real4_2d
809  module procedure mpp_scatter_pelist_real4_3d
810  module procedure mpp_scatter_pelist_real8_2d
811  module procedure mpp_scatter_pelist_real8_3d
812  end interface
813 
814  !#####################################################################
815  ! <interface name="mpp_alltoall">
816  ! <overview>
817  ! scatter a vector across all PEs
818  ! (e.g. transpose the vector and PE index)
819  ! </overview>
820  ! </interface>
821  interface mpp_alltoall
822  module procedure mpp_alltoall_int4
823  module procedure mpp_alltoall_int8
824  module procedure mpp_alltoall_real4
825  module procedure mpp_alltoall_real8
826  module procedure mpp_alltoall_logical4
827  module procedure mpp_alltoall_logical8
828  module procedure mpp_alltoall_int4_v
829  module procedure mpp_alltoall_int8_v
830  module procedure mpp_alltoall_real4_v
831  module procedure mpp_alltoall_real8_v
832  module procedure mpp_alltoall_logical4_v
833  module procedure mpp_alltoall_logical8_v
834  module procedure mpp_alltoall_int4_w
835  module procedure mpp_alltoall_int8_w
836  module procedure mpp_alltoall_real4_w
837  module procedure mpp_alltoall_real8_w
838  module procedure mpp_alltoall_logical4_w
839  module procedure mpp_alltoall_logical8_w
840  end interface
841 
842 
843  !#####################################################################
844 
845  ! <INTERFACE NAME="mpp_transmit">
846  ! <OVERVIEW>
847  ! Basic message-passing call.
848  ! </OVERVIEW>
849  ! <DESCRIPTION>
850  ! <TT>MPP_TYPE_</TT> corresponds to any 4-byte and 8-byte variant of
851  ! <TT>integer, real, complex, logical</TT> variables, of rank 0 or 1. A
852  ! contiguous block from a multi-dimensional array may be passed by its
853  ! starting address and its length, as in <TT>f77</TT>.
854  !
855  ! <TT>mpp_transmit</TT> is currently implemented as asynchronous
856  ! outward transmission and synchronous inward transmission. This follows
857  ! the behaviour of <TT>shmem_put</TT> and <TT>shmem_get</TT>. In MPI, it
858  ! is implemented as <TT>mpi_isend</TT> and <TT>mpi_recv</TT>. For most
859  ! applications, transmissions occur in pairs, and are here accomplished
860  ! in a single call.
861  !
862  ! The special PE designations <TT>NULL_PE</TT>,
863  ! <TT>ANY_PE</TT> and <TT>ALL_PES</TT> are provided by use
864  ! association.
865  !
866  ! <TT>NULL_PE</TT>: is used to disable one of the pair of
867  ! transmissions.<BR/>
868  ! <TT>ANY_PE</TT>: is used for unspecific remote
869  ! destination. (Please note that <TT>put_pe=ANY_PE</TT> has no meaning
870  ! in the MPI context, though it is available in the SHMEM invocation. If
871  ! portability is a concern, it is best avoided).<BR/>
872  ! <TT>ALL_PES</TT>: is used for broadcast operations.
873  !
874  ! It is recommended that <LINK
875  ! SRC="#mpp_broadcast"><TT>mpp_broadcast</TT></LINK> be used for
876  ! broadcasts.
877  !
878  ! The following example illustrates the use of
879  ! <TT>NULL_PE</TT> and <TT>ALL_PES</TT>:
880  !
881  ! <PRE>
882  ! real, dimension(n) :: a
883  ! if( pe.EQ.0 )then
884  ! do p = 1,npes-1
885  ! call mpp_transmit( a, n, p, a, n, NULL_PE )
886  ! end do
887  ! else
888  ! call mpp_transmit( a, n, NULL_PE, a, n, 0 )
889  ! end if
890  !
891  ! call mpp_transmit( a, n, ALL_PES, a, n, 0 )
892  ! </PRE>
893  !
894  ! The do loop and the broadcast operation above are equivalent.
895  !
896  ! Two overloaded calls <TT>mpp_send</TT> and
897  ! <TT>mpp_recv</TT> have also been
898  ! provided. <TT>mpp_send</TT> calls <TT>mpp_transmit</TT>
899  ! with <TT>get_pe=NULL_PE</TT>. <TT>mpp_recv</TT> calls
900  ! <TT>mpp_transmit</TT> with <TT>put_pe=NULL_PE</TT>. Thus
901  ! the do loop above could be written more succinctly:
902  !
903  ! <PRE>
904  ! if( pe.EQ.0 )then
905  ! do p = 1,npes-1
906  ! call mpp_send( a, n, p )
907  ! end do
908  ! else
909  ! call mpp_recv( a, n, 0 )
910  ! end if
911  ! </PRE>
912  ! </DESCRIPTION>
913  ! <TEMPLATE>
914  ! call mpp_transmit( put_data, put_len, put_pe, get_data, get_len, get_pe )
915  ! </TEMPLATE>
916  ! </INTERFACE>
917  interface mpp_transmit
918  module procedure mpp_transmit_real8
919  module procedure mpp_transmit_real8_scalar
920  module procedure mpp_transmit_real8_2d
921  module procedure mpp_transmit_real8_3d
922  module procedure mpp_transmit_real8_4d
923  module procedure mpp_transmit_real8_5d
924 #ifdef OVERLOAD_C8
925  module procedure mpp_transmit_cmplx8
926  module procedure mpp_transmit_cmplx8_scalar
927  module procedure mpp_transmit_cmplx8_2d
928  module procedure mpp_transmit_cmplx8_3d
929  module procedure mpp_transmit_cmplx8_4d
930  module procedure mpp_transmit_cmplx8_5d
931 #endif
932 #ifndef no_8byte_integers
933  module procedure mpp_transmit_int8
934  module procedure mpp_transmit_int8_scalar
935  module procedure mpp_transmit_int8_2d
936  module procedure mpp_transmit_int8_3d
937  module procedure mpp_transmit_int8_4d
938  module procedure mpp_transmit_int8_5d
939  module procedure mpp_transmit_logical8
940  module procedure mpp_transmit_logical8_scalar
941  module procedure mpp_transmit_logical8_2d
942  module procedure mpp_transmit_logical8_3d
943  module procedure mpp_transmit_logical8_4d
944  module procedure mpp_transmit_logical8_5d
945 #endif
946 
947  module procedure mpp_transmit_real4
948  module procedure mpp_transmit_real4_scalar
949  module procedure mpp_transmit_real4_2d
950  module procedure mpp_transmit_real4_3d
951  module procedure mpp_transmit_real4_4d
952  module procedure mpp_transmit_real4_5d
953 
954 #ifdef OVERLOAD_C4
955  module procedure mpp_transmit_cmplx4
956  module procedure mpp_transmit_cmplx4_scalar
957  module procedure mpp_transmit_cmplx4_2d
958  module procedure mpp_transmit_cmplx4_3d
959  module procedure mpp_transmit_cmplx4_4d
960  module procedure mpp_transmit_cmplx4_5d
961 #endif
962  module procedure mpp_transmit_int4
963  module procedure mpp_transmit_int4_scalar
964  module procedure mpp_transmit_int4_2d
965  module procedure mpp_transmit_int4_3d
966  module procedure mpp_transmit_int4_4d
967  module procedure mpp_transmit_int4_5d
968  module procedure mpp_transmit_logical4
969  module procedure mpp_transmit_logical4_scalar
970  module procedure mpp_transmit_logical4_2d
971  module procedure mpp_transmit_logical4_3d
972  module procedure mpp_transmit_logical4_4d
973  module procedure mpp_transmit_logical4_5d
974  end interface
975  interface mpp_recv
976  module procedure mpp_recv_real8
977  module procedure mpp_recv_real8_scalar
978  module procedure mpp_recv_real8_2d
979  module procedure mpp_recv_real8_3d
980  module procedure mpp_recv_real8_4d
981  module procedure mpp_recv_real8_5d
982 #ifdef OVERLOAD_C8
983  module procedure mpp_recv_cmplx8
984  module procedure mpp_recv_cmplx8_scalar
985  module procedure mpp_recv_cmplx8_2d
986  module procedure mpp_recv_cmplx8_3d
987  module procedure mpp_recv_cmplx8_4d
988  module procedure mpp_recv_cmplx8_5d
989 #endif
990 #ifndef no_8byte_integers
991  module procedure mpp_recv_int8
992  module procedure mpp_recv_int8_scalar
993  module procedure mpp_recv_int8_2d
994  module procedure mpp_recv_int8_3d
995  module procedure mpp_recv_int8_4d
996  module procedure mpp_recv_int8_5d
997  module procedure mpp_recv_logical8
998  module procedure mpp_recv_logical8_scalar
999  module procedure mpp_recv_logical8_2d
1000  module procedure mpp_recv_logical8_3d
1001  module procedure mpp_recv_logical8_4d
1002  module procedure mpp_recv_logical8_5d
1003 #endif
1004 
1005  module procedure mpp_recv_real4
1006  module procedure mpp_recv_real4_scalar
1007  module procedure mpp_recv_real4_2d
1008  module procedure mpp_recv_real4_3d
1009  module procedure mpp_recv_real4_4d
1010  module procedure mpp_recv_real4_5d
1011 
1012 #ifdef OVERLOAD_C4
1013  module procedure mpp_recv_cmplx4
1014  module procedure mpp_recv_cmplx4_scalar
1015  module procedure mpp_recv_cmplx4_2d
1016  module procedure mpp_recv_cmplx4_3d
1017  module procedure mpp_recv_cmplx4_4d
1018  module procedure mpp_recv_cmplx4_5d
1019 #endif
1020  module procedure mpp_recv_int4
1021  module procedure mpp_recv_int4_scalar
1022  module procedure mpp_recv_int4_2d
1023  module procedure mpp_recv_int4_3d
1024  module procedure mpp_recv_int4_4d
1025  module procedure mpp_recv_int4_5d
1026  module procedure mpp_recv_logical4
1027  module procedure mpp_recv_logical4_scalar
1028  module procedure mpp_recv_logical4_2d
1029  module procedure mpp_recv_logical4_3d
1030  module procedure mpp_recv_logical4_4d
1031  module procedure mpp_recv_logical4_5d
1032  end interface
1033  interface mpp_send
1034  module procedure mpp_send_real8
1035  module procedure mpp_send_real8_scalar
1036  module procedure mpp_send_real8_2d
1037  module procedure mpp_send_real8_3d
1038  module procedure mpp_send_real8_4d
1039  module procedure mpp_send_real8_5d
1040 #ifdef OVERLOAD_C8
1041  module procedure mpp_send_cmplx8
1042  module procedure mpp_send_cmplx8_scalar
1043  module procedure mpp_send_cmplx8_2d
1044  module procedure mpp_send_cmplx8_3d
1045  module procedure mpp_send_cmplx8_4d
1046  module procedure mpp_send_cmplx8_5d
1047 #endif
1048 #ifndef no_8byte_integers
1049  module procedure mpp_send_int8
1050  module procedure mpp_send_int8_scalar
1051  module procedure mpp_send_int8_2d
1052  module procedure mpp_send_int8_3d
1053  module procedure mpp_send_int8_4d
1054  module procedure mpp_send_int8_5d
1055  module procedure mpp_send_logical8
1056  module procedure mpp_send_logical8_scalar
1057  module procedure mpp_send_logical8_2d
1058  module procedure mpp_send_logical8_3d
1059  module procedure mpp_send_logical8_4d
1060  module procedure mpp_send_logical8_5d
1061 #endif
1062 
1063  module procedure mpp_send_real4
1064  module procedure mpp_send_real4_scalar
1065  module procedure mpp_send_real4_2d
1066  module procedure mpp_send_real4_3d
1067  module procedure mpp_send_real4_4d
1068  module procedure mpp_send_real4_5d
1069 
1070 #ifdef OVERLOAD_C4
1071  module procedure mpp_send_cmplx4
1072  module procedure mpp_send_cmplx4_scalar
1073  module procedure mpp_send_cmplx4_2d
1074  module procedure mpp_send_cmplx4_3d
1075  module procedure mpp_send_cmplx4_4d
1076  module procedure mpp_send_cmplx4_5d
1077 #endif
1078  module procedure mpp_send_int4
1079  module procedure mpp_send_int4_scalar
1080  module procedure mpp_send_int4_2d
1081  module procedure mpp_send_int4_3d
1082  module procedure mpp_send_int4_4d
1083  module procedure mpp_send_int4_5d
1084  module procedure mpp_send_logical4
1085  module procedure mpp_send_logical4_scalar
1086  module procedure mpp_send_logical4_2d
1087  module procedure mpp_send_logical4_3d
1088  module procedure mpp_send_logical4_4d
1089  module procedure mpp_send_logical4_5d
1090  end interface
1091 
1092  ! <INTERFACE NAME="mpp_broadcast">
1093 
1094  ! <OVERVIEW>
1095  ! Parallel broadcasts.
1096  ! </OVERVIEW>
1097  ! <DESCRIPTION>
1098  ! The <TT>mpp_broadcast</TT> call has been added because the original
1099  ! syntax (using <TT>ALL_PES</TT> in <TT>mpp_transmit</TT>) did not
1100  ! support a broadcast across a pelist.
1101  !
1102  ! <TT>MPP_TYPE_</TT> corresponds to any 4-byte and 8-byte variant of
1103  ! <TT>integer, real, complex, logical</TT> variables, of rank 0 or 1. A
1104  ! contiguous block from a multi-dimensional array may be passed by its
1105  ! starting address and its length, as in <TT>f77</TT>.
1106  !
1107  ! Global broadcasts through the <TT>ALL_PES</TT> argument to <LINK
1108  ! SRC="#mpp_transmit"><TT>mpp_transmit</TT></LINK> are still provided for
1109  ! backward-compatibility.
1110  !
1111  ! If <TT>pelist</TT> is omitted, the context is assumed to be the
1112  ! current pelist. <TT>from_pe</TT> must belong to the current
1113  ! pelist. This call implies synchronization across the PEs in
1114  ! <TT>pelist</TT>, or the current pelist if <TT>pelist</TT> is absent.
1115  ! </DESCRIPTION>
1116  ! <TEMPLATE>
1117  ! call mpp_broadcast( data, length, from_pe, pelist )
1118  ! </TEMPLATE>
1119  ! <IN NAME="length"> </IN>
1120  ! <IN NAME="from_pe"> </IN>
1121  ! <IN NAME="pelist"> </IN>
1122  ! <INOUT NAME="data(*)"> </INOUT>
1123  ! </INTERFACE>
1124  interface mpp_broadcast
1125  module procedure mpp_broadcast_char
1126  module procedure mpp_broadcast_real8
1127  module procedure mpp_broadcast_real8_scalar
1128  module procedure mpp_broadcast_real8_2d
1129  module procedure mpp_broadcast_real8_3d
1130  module procedure mpp_broadcast_real8_4d
1131  module procedure mpp_broadcast_real8_5d
1132 #ifdef OVERLOAD_C8
1133  module procedure mpp_broadcast_cmplx8
1134  module procedure mpp_broadcast_cmplx8_scalar
1135  module procedure mpp_broadcast_cmplx8_2d
1136  module procedure mpp_broadcast_cmplx8_3d
1137  module procedure mpp_broadcast_cmplx8_4d
1138  module procedure mpp_broadcast_cmplx8_5d
1139 #endif
1140 #ifndef no_8byte_integers
1141  module procedure mpp_broadcast_int8
1142  module procedure mpp_broadcast_int8_scalar
1143  module procedure mpp_broadcast_int8_2d
1144  module procedure mpp_broadcast_int8_3d
1145  module procedure mpp_broadcast_int8_4d
1146  module procedure mpp_broadcast_int8_5d
1147  module procedure mpp_broadcast_logical8
1148  module procedure mpp_broadcast_logical8_scalar
1149  module procedure mpp_broadcast_logical8_2d
1150  module procedure mpp_broadcast_logical8_3d
1151  module procedure mpp_broadcast_logical8_4d
1152  module procedure mpp_broadcast_logical8_5d
1153 #endif
1154 
1155  module procedure mpp_broadcast_real4
1156  module procedure mpp_broadcast_real4_scalar
1157  module procedure mpp_broadcast_real4_2d
1158  module procedure mpp_broadcast_real4_3d
1159  module procedure mpp_broadcast_real4_4d
1160  module procedure mpp_broadcast_real4_5d
1161 
1162 #ifdef OVERLOAD_C4
1163  module procedure mpp_broadcast_cmplx4
1164  module procedure mpp_broadcast_cmplx4_scalar
1165  module procedure mpp_broadcast_cmplx4_2d
1166  module procedure mpp_broadcast_cmplx4_3d
1167  module procedure mpp_broadcast_cmplx4_4d
1168  module procedure mpp_broadcast_cmplx4_5d
1169 #endif
1170  module procedure mpp_broadcast_int4
1171  module procedure mpp_broadcast_int4_scalar
1172  module procedure mpp_broadcast_int4_2d
1173  module procedure mpp_broadcast_int4_3d
1174  module procedure mpp_broadcast_int4_4d
1175  module procedure mpp_broadcast_int4_5d
1176  module procedure mpp_broadcast_logical4
1177  module procedure mpp_broadcast_logical4_scalar
1178  module procedure mpp_broadcast_logical4_2d
1179  module procedure mpp_broadcast_logical4_3d
1180  module procedure mpp_broadcast_logical4_4d
1181  module procedure mpp_broadcast_logical4_5d
1182  end interface
1183 
1184  !#####################################################################
1185  ! <INTERFACE NAME="mpp_chksum">
1186 
1187  ! <OVERVIEW>
1188  ! Parallel checksums.
1189  ! </OVERVIEW>
1190  ! <DESCRIPTION>
1191  ! <TT>mpp_chksum</TT> is a parallel checksum routine that returns an
1192  ! identical answer for the same array irrespective of how it has been
1193  ! partitioned across processors. <TT>LONG_KIND</TT>is the <TT>KIND</TT>
1194  ! parameter corresponding to long integers (see discussion on
1195  ! OS-dependent preprocessor directives) defined in
1196  ! the header file <TT>fms_platform.h</TT>. <TT>MPP_TYPE_</TT> corresponds to any
1197  ! 4-byte and 8-byte variant of <TT>integer, real, complex, logical</TT>
1198  ! variables, of rank 0 to 5.
1199  !
1200  ! Integer checksums on FP data use the F90 <TT>TRANSFER()</TT>
1201  ! intrinsic.
1202  !
1203  ! The <LINK SRC="http://www.gfdl.noaa.gov/fms-cgi-bin/cvsweb.cgi/FMS/shared/chksum/chksum.html">serial checksum module</LINK> is superseded
1204  ! by this function, and is no longer being actively maintained. This
1205  ! provides identical results on a single-processor job, and to perform
1206  ! serial checksums on a single processor of a parallel job, you only
1207  ! need to use the optional <TT>pelist</TT> argument.
1208  ! <PRE>
1209  ! use mpp_mod
1210  ! integer :: pe, chksum
1211  ! real :: a(:)
1212  ! pe = mpp_pe()
1213  ! chksum = mpp_chksum( a, (/pe/) )
1214  ! </PRE>
1215  !
1216  ! The additional functionality of <TT>mpp_chksum</TT> over
1217  ! serial checksums is to compute the checksum across the PEs in
1218  ! <TT>pelist</TT>. The answer is guaranteed to be the same for
1219  ! the same distributed array irrespective of how it has been
1220  ! partitioned.
1221  !
1222  ! If <TT>pelist</TT> is omitted, the context is assumed to be the
1223  ! current pelist. This call implies synchronization across the PEs in
1224  ! <TT>pelist</TT>, or the current pelist if <TT>pelist</TT> is absent.
1225  ! </DESCRIPTION>
1226  ! <TEMPLATE>
1227  ! mpp_chksum( var, pelist )
1228  ! </TEMPLATE>
1229  ! <IN NAME="pelist" TYPE="integer" DIM="(:)"> </IN>
1230  ! <IN NAME="var" TYPE="MPP_TYPE_"> </IN>
1231  ! </INTERFACE>
1232  interface mpp_chksum
1233 #ifndef no_8byte_integers
1234  module procedure mpp_chksum_i8_1d
1235  module procedure mpp_chksum_i8_2d
1236  module procedure mpp_chksum_i8_3d
1237  module procedure mpp_chksum_i8_4d
1238  module procedure mpp_chksum_i8_5d
1239  module procedure mpp_chksum_i8_1d_rmask
1240  module procedure mpp_chksum_i8_2d_rmask
1241  module procedure mpp_chksum_i8_3d_rmask
1242  module procedure mpp_chksum_i8_4d_rmask
1243  module procedure mpp_chksum_i8_5d_rmask
1244 
1245 #endif
1246  module procedure mpp_chksum_i4_1d
1247  module procedure mpp_chksum_i4_2d
1248  module procedure mpp_chksum_i4_3d
1249  module procedure mpp_chksum_i4_4d
1250  module procedure mpp_chksum_i4_5d
1251  module procedure mpp_chksum_i4_1d_rmask
1252  module procedure mpp_chksum_i4_2d_rmask
1253  module procedure mpp_chksum_i4_3d_rmask
1254  module procedure mpp_chksum_i4_4d_rmask
1255  module procedure mpp_chksum_i4_5d_rmask
1256  module procedure mpp_chksum_r8_0d
1257  module procedure mpp_chksum_r8_1d
1258  module procedure mpp_chksum_r8_2d
1259  module procedure mpp_chksum_r8_3d
1260  module procedure mpp_chksum_r8_4d
1261  module procedure mpp_chksum_r8_5d
1262 #ifdef OVERLOAD_C8
1263  module procedure mpp_chksum_c8_0d
1264  module procedure mpp_chksum_c8_1d
1265  module procedure mpp_chksum_c8_2d
1266  module procedure mpp_chksum_c8_3d
1267  module procedure mpp_chksum_c8_4d
1268  module procedure mpp_chksum_c8_5d
1269 #endif
1270 #ifdef OVERLOAD_R4
1271  module procedure mpp_chksum_r4_0d
1272  module procedure mpp_chksum_r4_1d
1273  module procedure mpp_chksum_r4_2d
1274  module procedure mpp_chksum_r4_3d
1275  module procedure mpp_chksum_r4_4d
1276  module procedure mpp_chksum_r4_5d
1277 #endif
1278 #ifdef OVERLOAD_C4
1279  module procedure mpp_chksum_c4_0d
1280  module procedure mpp_chksum_c4_1d
1281  module procedure mpp_chksum_c4_2d
1282  module procedure mpp_chksum_c4_3d
1283  module procedure mpp_chksum_c4_4d
1284  module procedure mpp_chksum_c4_5d
1285 #endif
1286  end interface
1287 
1288 !***********************************************************************
1289 !
1290 ! module variables
1291 !
1292 !***********************************************************************
1293  integer, parameter :: peset_max = 10000
1294  integer :: current_peset_max = 32
1295  type(communicator), allocatable :: peset(:) ! Will be allocated starting from 0, 0 is a dummy used to hold single-PE "self" communicator
1296  logical :: module_is_initialized = .false.
1297  logical :: debug = .false.
1298  integer :: npes=1, root_pe=0, pe=0
1299  integer(LONG_KIND) :: tick, ticks_per_sec, max_ticks, start_tick, end_tick, tick0=0
1300  integer :: mpp_comm_private
1302  real(DOUBLE_KIND) :: mpi_count0=0 ! use to prevent integer overflow
1303  real(DOUBLE_KIND) :: mpi_tick_rate=0.d0 ! clock rate for mpi_wtick()
1304  logical :: mpp_record_timing_data=.true.
1305  type(clock),save :: clocks(max_clocks)
1306  integer :: log_unit, etc_unit
1307  character(len=32) :: configfile='logfile'
1309  integer :: world_peset_num !the world communicator
1310  integer :: error
1311  integer :: clock_num=0, num_clock_ids=0,current_clock=0, previous_clock(max_clocks)=0
1312  real :: tick_rate
1313 
1315  type(mpp_type), target :: mpp_byte
1316 
1317  integer :: cur_send_request = 0
1318  integer :: cur_recv_request = 0
1319  integer, allocatable :: request_send(:)
1320  integer, allocatable :: request_recv(:)
1321  integer, allocatable :: size_recv(:)
1322  integer, allocatable :: type_recv(:)
1323 ! if you want to save the non-root PE information uncomment out the following line
1324 ! and comment out the assigment of etcfile to '/dev/null'
1325 #ifdef NO_DEV_NULL
1326  character(len=32) :: etcfile='._mpp.nonrootpe.msgs'
1327 #else
1328  character(len=32) :: etcfile='/dev/null'
1329 #endif
1330 
1331 #ifdef SGICRAY
1332  integer :: in_unit=100, out_unit=101, err_unit=102 !see intro_io(3F): to see why these values are used rather than 5,6,0
1333 #else
1334  integer :: in_unit=5, out_unit=6, err_unit=0
1335 #endif
1336 
1337  integer :: stdout_unit
1338 
1339  !--- variables used in mpp_util.h
1340  type(summary_struct) :: clock_summary(max_clocks)
1341  logical :: warnings_are_fatal = .false.
1342  integer :: error_state=0
1343  integer :: clock_grain=clock_loop-1
1344 
1345  !--- variables used in mpp_comm.h
1346 #ifdef use_libMPI
1347 #ifdef _CRAYT3E
1348  !BWA: mpif.h on t3e currently does not contain MPI_INTEGER8 datatype
1349  !(O2k and t90 do)
1350  !(t3e: fixed on 3.3 I believe)
1351  integer, parameter :: mpi_integer8=mpi_integer
1352 #endif
1353 #endif /* use_libMPI */
1354 #ifdef use_MPI_SMA
1355 #include <mpp/shmem.fh>
1356  integer :: psync(shmem_barrier_sync_size)
1357  pointer( p_psync, psync ) !used by SHPALLOC
1358 #endif
1359 
1360  integer :: clock0 !measures total runtime from mpp_init to mpp_exit
1362  logical :: verbose=.false.
1363 #ifdef _CRAY
1364  integer(LONG_KIND) :: word(1)
1365 #endif
1366 #if defined(sgi_mipspro) || defined(__ia64)
1367  integer(INT_KIND) :: word(1)
1368 #endif
1369 
1370  integer :: get_len_nocomm = 0 ! needed for mpp_transmit_nocomm.h
1371 
1372 !***********************************************************************
1373 ! variables needed for subroutine read_input_nml (include/mpp_util.inc)
1374 !
1375 ! parameter defining length of character variables
1376  integer, parameter :: input_str_length = 256
1377 ! public variable needed for reading input.nml from an internal file
1378  character(len=INPUT_STR_LENGTH), dimension(:), allocatable, target, public :: input_nml_file
1379  logical :: read_ascii_file_on = .false.
1380 !***********************************************************************
1381 
1382 ! Include variable "version" to be written to log file.
1383 #include<file_version.h>
1384  public version
1385 
1386  integer, parameter :: max_request_min = 10000
1387  integer :: request_multiply = 20
1388 
1389  logical :: etc_unit_is_stderr = .false.
1390  integer :: max_request = 0
1391  logical :: sync_all_clocks = .false.
1393 
1394  contains
1395 #include <system_clock.h>
1396 #include <mpp_util.inc>
1397 #include <mpp_comm.inc>
1398 
1399  end module mpp_mod
1400 
1401 
1402 
1403 
integer, parameter peset_max
Definition: mpp.F90:1293
integer, parameter, public max_event_types
integer root_pe
Definition: mpp.F90:1298
integer, parameter, public clock_loop
integer, parameter, public comm_tag_13
integer clock_grain
Definition: mpp.F90:1343
integer request_multiply
Definition: mpp.F90:1387
integer, parameter, public comm_tag_9
integer, public default_tag
integer, parameter, public note
integer, parameter, public input_str_length
Definition: mpp.F90:1376
integer, parameter, public null_pe
integer, parameter, public comm_tag_14
integer(long_kind) max_ticks
Definition: mpp.F90:1299
real(double_kind), parameter, public mpp_fill_double
integer get_len_nocomm
Definition: mpp.F90:1370
logical module_is_initialized
Definition: mpp.F90:1296
integer, parameter, public mpp_clock_sync
integer current_peset_num
Definition: mpp.F90:1308
integer cur_recv_request
Definition: mpp.F90:1318
integer, parameter, public comm_tag_11
integer, parameter, public comm_tag_5
logical, public mpp_record_timing_data
Definition: mpp.F90:1304
integer mpp_stack_hwm
Definition: mpp.F90:1361
integer, parameter, public comm_tag_10
integer, parameter, public clock_module
integer, parameter, public mpp_clock_detailed
integer, parameter, public comm_tag_19
integer mpp_stack_size
Definition: mpp.F90:1361
integer, parameter, public comm_tag_20
integer, parameter, public event_alltoall
integer in_unit
Definition: mpp.F90:1332
integer, parameter, public event_send
character(len=32) etcfile
Definition: mpp.F90:1326
Definition: mpp.F90:39
integer stdout_unit
Definition: mpp.F90:1337
integer etc_unit
Definition: mpp.F90:1306
type(summary_struct), dimension(max_clocks) clock_summary
Definition: mpp.F90:1340
integer(long_kind), parameter, public mpp_wait
integer num_clock_ids
Definition: mpp.F90:1311
character(len=input_str_length), dimension(:), allocatable, target, public input_nml_file
Definition: mpp.F90:1378
real(double_kind) mpi_count0
Definition: mpp.F90:1302
integer clock0
Definition: mpp.F90:1360
integer, parameter, public comm_tag_6
integer world_peset_num
Definition: mpp.F90:1309
logical debug
Definition: mpp.F90:1297
integer, parameter, public clock_component
integer, parameter, public comm_tag_3
type(communicator), dimension(:), allocatable peset
Definition: mpp.F90:1295
integer out_unit
Definition: mpp.F90:1332
integer, dimension(shmem_barrier_sync_size) psync
Definition: mpp.F90:1356
integer clock_num
Definition: mpp.F90:1311
integer, parameter, public mpp_debug
logical first_call_system_clock_mpi
Definition: mpp.F90:1301
integer, parameter, public comm_tag_2
integer, parameter, public event_allreduce
integer error
Definition: mpp.F90:1310
integer, parameter, public max_events
integer, parameter, public any_pe
integer(int_kind), parameter, public mpp_fill_int
logical read_ascii_file_on
Definition: mpp.F90:1379
integer current_peset_max
Definition: mpp.F90:1294
integer, parameter, public max_bins
integer peset_num
Definition: mpp.F90:1308
integer log_unit
Definition: mpp.F90:1306
integer, parameter, public comm_tag_12
integer err_unit
Definition: mpp.F90:1332
integer, parameter, public all_pes
logical etc_unit_is_stderr
Definition: mpp.F90:1389
integer, parameter, public maxpes
integer, parameter, public comm_tag_15
type(mpp_type_list) datatypes
Definition: mpp.F90:1314
integer, parameter, public event_type_create
character(len=32) configfile
Definition: mpp.F90:1307
integer, parameter, public max_clocks
integer, parameter, public event_recv
integer, parameter, public event_broadcast
integer, parameter, public mpp_verbose
integer, parameter, public comm_tag_18
integer, dimension(:), allocatable request_recv
Definition: mpp.F90:1320
integer, parameter, public clock_subcomponent
integer, parameter, public event_type_free
integer, parameter, public event_wait
integer(long_kind), dimension(1) word
Definition: mpp.F90:1364
integer error_state
Definition: mpp.F90:1342
integer, parameter, public fatal
integer, dimension(:), allocatable size_recv
Definition: mpp.F90:1321
real(double_kind) mpi_tick_rate
Definition: mpp.F90:1303
integer, parameter, public comm_tag_7
integer current_clock
Definition: mpp.F90:1311
integer(long_kind) end_tick
Definition: mpp.F90:1299
integer, dimension(max_clocks) previous_clock
Definition: mpp.F90:1311
logical sync_all_clocks
Definition: mpp.F90:1391
integer, parameter mpi_integer8
Definition: mpp.F90:1351
integer(long_kind) tick0
Definition: mpp.F90:1299
real tick_rate
Definition: mpp.F90:1312
integer, parameter, public comm_tag_4
logical verbose
Definition: mpp.F90:1362
integer, parameter, public warning
integer max_request
Definition: mpp.F90:1390
integer, parameter, public clock_module_driver
integer, dimension(:), allocatable request_send
Definition: mpp.F90:1319
integer(long_kind) ticks_per_sec
Definition: mpp.F90:1299
integer(long_kind) start_tick
Definition: mpp.F90:1299
integer, parameter max_request_min
Definition: mpp.F90:1386
integer, parameter, public clock_routine
integer, parameter, public comm_tag_16
type(mpp_type), target, public mpp_byte
Definition: mpp.F90:1315
type(clock), dimension(max_clocks), save clocks
Definition: mpp.F90:1305
integer, parameter, public clock_infra
integer(long_kind), parameter, public mpp_ready
integer, parameter, public comm_tag_8
logical warnings_are_fatal
Definition: mpp.F90:1341
integer, parameter, public comm_tag_1
integer, parameter, public comm_tag_17
integer, dimension(:), allocatable type_recv
Definition: mpp.F90:1322
integer npes
Definition: mpp.F90:1298