[mpich-devel] MPICH2 hang

Jim Dinan dinan at mcs.anl.gov
Fri Dec 14 23:52:35 CST 2012


Hi Bob,

The ftmain2.f90 test fails on MPICH2 1.2.1p1, which was released on 
2-22-2010, well before any of the MPI-3 changes.  Could you provide some 
more information on when this test was reporting a failure instead of 
hanging?

It looks like this test case generates a context ID exhaustion pattern 
where context IDs are still available at every process, but the processes 
have no free context ID in common.  Because there is no common context ID 
available, allocation can't succeed, and the allocation loop spins 
indefinitely.  This is a resource exhaustion pattern that, AFAIK, MPICH 
has not detected in the past.
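
To make the failure mode concrete, here is a simplified sketch of how 
context ID allocation of this flavor works (illustrative only, not 
MPICH's actual code; the mask width and all names are made up):

#include <mpi.h>

#define MASK_WORDS 64                    /* 64 * 32 = 2048 context IDs */

/* Bit b of word w is set iff context ID (w*32 + b) is free locally. */
static unsigned local_mask[MASK_WORDS];

/* Collective over comm.  Returns an ID that is free at every process,
 * or -1 if the processes' free sets have an empty intersection.  A
 * caller that simply retries on -1 reproduces the hang seen here: each
 * process still has free IDs, so "too many communicators" is never
 * raised, but the AND of the masks stays zero forever. */
static int try_alloc_context_id(MPI_Comm comm)
{
    unsigned common[MASK_WORDS];
    int w, b;

    /* IDs free at *all* processes: bitwise AND of everyone's mask */
    MPI_Allreduce(local_mask, common, MASK_WORDS,
                  MPI_UNSIGNED, MPI_BAND, comm);

    for (w = 0; w < MASK_WORDS; w++)
        for (b = 0; b < 32; b++)
            if (common[w] & (1u << b)) {
                local_mask[w] &= ~(1u << b);    /* claim it locally */
                return w * 32 + b;
            }
    return -1;    /* free IDs exist locally, but none in common */
}

With one single-rank comm created per iteration in round-robin order, 
each rank retains only the IDs for the comms it belongs to, so the 
retained IDs end up interleaved across ranks (with two processes, 
plausibly evens on one rank and odds on the other).  Every rank then 
still has free IDs, but the intersection is empty.  That would also 
explain the --np 1 versus --np 2 difference reported below: with one 
process the mask empties completely, which is detectable locally, while 
with two or more processes the allocation loop just spins.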

For reference, I attached a C translation of this test that is a little 
easier to grok; it also fails on MPICH releases going back to MPICH2 
1.2.1p1.

  ~Jim.

On 12/14/12 11:27 AM, Jim Dinan wrote:
> Hi Bob,
>
> Thanks for the detailed bug report and test cases.  I confirmed the
> failure you are seeing on the MPICH trunk.  This is likely related to
> changes we made to support MPI-3 MPI_Comm_create_group().  I created a
> ticket to track this:
>
> https://trac.mpich.org/projects/mpich/ticket/1768
>
>   ~Jim.
>
> On 12/12/12 5:38 PM, Bob Cernohous wrote:
>>
>> I've had a hang reported on BG/Q after about 2K MPI_Comm_create's.
>>
>>   It hangs on the latest 2 releases (mpich2 v1.5.x and v1.4.x) on BG/Q.
>>
>>   It also hangs on Linux with the 64-bit (MPI over PAMI) MPICH2 library.
>>
>>   On older mpich 1.? (BG/P) it failed with 'too many communicators' and
>>   didn't hang, which is what they expected.
>>
>>   It seems like it's stuck in the while (*context_id == 0) loop in
>>   commutil.c, repeatedly calling allreduce and never settling on a
>>   context id.  I didn't do a lot of debugging, but it seems to be in
>>   vanilla mpich code, not something we modified.
>>
>>   ftmain.f90 fails if you run it on >2k ranks (it creates one comm per
>> rank).  This was the original customer test case.
>>
>> ftmain2.f90 loops over communicator creation instead, so you can
>> reproduce the failure on fewer ranks.
>>
>>
>>
>>
>> I just noticed that with --np 1, I get the 'too many communicators' from
>> ftmain2.  But --np 2 and up hangs.
>>
>> stdout[0]:  check_newcomm do-start       0 , repeat         2045 , total
>>         2046
>> stderr[0]: Abort(1) on node 0 (rank 0 in comm 1140850688): Fatal error
>> in PMPI_Comm_create: Other MPI error, error stack:
>> stderr[0]: PMPI_Comm_create(609).........:
>> MPI_Comm_create(MPI_COMM_WORLD, group=0xc80700f6, new_comm=0x1dbfffb520)
>> failed
>> stderr[0]: PMPI_Comm_create(590).........:
>> stderr[0]: MPIR_Comm_create_intra(250)...:
>> stderr[0]: MPIR_Get_contextid(521).......:
>> stderr[0]: MPIR_Get_contextid_sparse(752): Too many communicators
-------------- next part --------------
/* -*- mode: c; c-basic-offset:4 ; indent-tabs-mode:nil ; -*- */
/*
 *  (c) 2012 by Argonne National Laboratory.
 *      See COPYRIGHT in top-level directory.
 */

/* This test attempts to create a large number of communicators, in an effort
 * to exceed the number of communicators that the MPI implementation can
 * provide.  It checks that the implementation correctly detects and
 * handles this error.
 */
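
/* Build/run note (added for reference; file name and paths follow the
 * usual MPICH test suite layout and are assumptions, not verified for
 * this tree):
 *   mpicc -I<mpich>/test/mpi/include too_many_comms.c \
 *         <mpich>/test/mpi/util/mtest.c -o too_many_comms
 *   mpiexec -n 2 ./too_many_comms
 */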

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include "mpitest.h"

#define MAX_NCOMM 100000

static const int verbose = 1;

int main(int argc, char **argv) {
    int       rank, nproc, mpi_errno;
    int       i, ncomm;
    int       errors = 1;   /* cleared only when the expected "too many
                             * communicators" error is actually reported */
    MPI_Comm *comm_hdls;
    MPI_Group world_group;

    MPI_Init(&argc, &argv);

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);

    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);
    comm_hdls = malloc(sizeof(MPI_Comm) * MAX_NCOMM);

    ncomm = 0;
    for (i = 0; i < MAX_NCOMM; i++) {
        int       incl = i % nproc;
        MPI_Group comm_group;

        /* Comms include ranks: 0; 1; 2; ...; 0; 1; ... */
        MPI_Group_incl(world_group, 1, &incl, &comm_group);

        /* Note: the comms we create all contain one rank from MPI_COMM_WORLD */
        mpi_errno = MPI_Comm_create(MPI_COMM_WORLD, comm_group, &comm_hdls[i]);

        if (mpi_errno == MPI_SUCCESS) {
            if (verbose) printf("%d: Created comm %d\n", rank, i);
            ncomm++;
        } else {
            if (verbose) printf("%d: Error creating comm %d\n", rank, i);
            MPI_Group_free(&comm_group);
            errors = 0;   /* expected resource-exhaustion error was reported */
            break;
        }

        MPI_Group_free(&comm_group);
    }

    for (i = 0; i < ncomm; i++) {
        /* Ranks outside comm i's group got MPI_COMM_NULL, which must not
         * be passed to MPI_Comm_free */
        if (comm_hdls[i] != MPI_COMM_NULL)
            MPI_Comm_free(&comm_hdls[i]);
    }

    free(comm_hdls);
    MPI_Group_free(&world_group);

    MTest_Finalize(errors);
    MPI_Finalize();

    return 0;
}

