Function only runs on processor0 when computing in parallel
-
Hello everyone,
I have run into a strange problem when running a solver I wrote myself in parallel. It only appears in parallel runs; serial runs are fine. The details are as follows:
My solver has a member function called updateVelocity that retrieves the velocity at certain observation points in the computational domain (the positions of these points may change with time).
Here is the call in the solver's time loop:

while (runTime.loop())
{
    Info << "Time = " << runTime.timeName() << nl << endl;

    #include "CourantNo.H"

    // Pressure-velocity PISO corrector
    {
        #include "UEqn.H"

        // --- PISO loop
        while (piso.correct())
        {
            #include "pEqn.H"
        }
    }

    laminarTransport.correct();
    turbulence->correct();

    //>>>>>>>>>>>>>Below>>>>>>>
    Nettings.updateVelocity(U, mesh);
    //<<<<<<<<<<<<<Above<<<<<<

    runTime.write();

    Info << "ExecutionTime = " << runTime.elapsedCpuTime() << " s"
         << " ClockTime = " << runTime.elapsedClockTime() << " s"
         << nl << endl;
}

Info << "End\n" << endl;

return 0;
}
The source of this function is given below. It simply computes the distance from each cell centre to the observation point and takes the velocity of the cell whose centre is closest to that point:
void Foam::netPanel::updateVelocity
(
    const volVectorField &U,
    const fvMesh &mesh
)
{
    const vectorField &centres(mesh.C());
    List<vector> fluidVelocities(structuralElements_memb.size(), vector::zero);

    Info << "In updateVelocity, number of mesh is " << centres.size() << endl;
    Info << "In updateVelocity, number of U is " << U.size() << endl;

    scalar maxDistance(1);

    forAll(EPcenter, Elemi)
    {
        maxDistance = 1;
        vector nearestCell(vector::zero);
        scalar loops(0);

        forAll(centres, cellI)  // loop through all the cells
        {
            scalar k1(calcDist(centres[cellI], EPcenter[Elemi]));

            if (k1 < maxDistance)
            {
                maxDistance = k1;
                fluidVelocities[Elemi] = U[cellI];
                nearestCell = centres[cellI];
                loops += 1;

                Info << "After " << loops
                     << " times of loop, the nearest cell is " << nearestCell
                     << "to point " << EPcenter << "\n" << endl;
            }
        }
    }

    fluidVelocity_memb = fluidVelocities;  // only assign once

    Info << "the velocity on nodes are " << fluidVelocity_memb << endl;
}
It is a very simple function, but it misbehaves in parallel.
First, the serial run:

Starting time loop

Time = 0.01

Courant Number mean: 0.000565 max: 0.0904
smoothSolver: Solving for Ux, Initial residual = 1, Final residual = 2.4007e-06, No Iterations 1
smoothSolver: Solving for Uy, Initial residual = 0.891308, Final residual = 1.23902e-06, No Iterations 1
smoothSolver: Solving for Uz, Initial residual = 0.895257, Final residual = 1.31102e-06, No Iterations 1
GAMG: Solving for p, Initial residual = 1, Final residual = 8.40918e-07, No Iterations 35
time step continuity errors : sum local = 9.50237e-10, global = 1.27782e-10, cumulative = 1.27782e-10
smoothSolver: Solving for epsilon, Initial residual = 1, Final residual = 0.00445555, No Iterations 1
smoothSolver: Solving for k, Initial residual = 1, Final residual = 0.004493, No Iterations 1
In updateVelocity, number of mesh is 184320
In updateVelocity, number of U is 184320
After 1 times of loop, the nearest cell is (-0.49375 -0.21875 -0.39375)to point (0 0.05 -0.1)
...
...
After 45 times of loop, the nearest cell is (-0.00625 0.05625 -0.20625)to point (0 0.05 -0.2)
the velocity on nodes are 4((0.226059 -2.8946e-08 -3.59708e-08) (0.226059 3.97379e-08 -4.07855e-08) (0.226059 2.65165e-08 -2.22689e-08) (0.226059 -4.12251e-08 -1.8172e-08))
ExecutionTime = 2.31 s  ClockTime = 2 s
The result is fine, and updateVelocity finds the velocities at the four observation points without any trouble.
The problem is that in parallel, updateVelocity can no longer find the velocities at these four points:
Starting time loop

Time = 0.01

Courant Number mean: 0.000565 max: 0.0904
smoothSolver: Solving for Ux, Initial residual = 1, Final residual = 2.4007e-06, No Iterations 1
smoothSolver: Solving for Uy, Initial residual = 0.892185, Final residual = 1.24606e-06, No Iterations 1
smoothSolver: Solving for Uz, Initial residual = 0.895991, Final residual = 1.31577e-06, No Iterations 1
GAMG: Solving for p, Initial residual = 1, Final residual = 9.93486e-07, No Iterations 34
time step continuity errors : sum local = 1.12264e-09, global = -1.18173e-10, cumulative = -1.18173e-10
smoothSolver: Solving for epsilon, Initial residual = 1, Final residual = 0.0107252, No Iterations 1
smoothSolver: Solving for k, Initial residual = 1, Final residual = 0.0106817, No Iterations 1
In updateVelocity, number of mesh is 23177
In updateVelocity, number of U is 23177
the velocity on nodes are 4{(0 0 0)}
ExecutionTime = 0.57 s  ClockTime = 1 s
The parallel and serial runs use the same mesh; the parallel case is decomposed into eight subdomains with the scotch method.
Note in particular:
In the serial run:   In updateVelocity, number of mesh is 184320
In the parallel run: In updateVelocity, number of mesh is 23177
Here 23177 is exactly the number of cells in processor0. Below is the U file inside processor0:

FoamFile
{
    version     2.0;
    format      ascii;
    class       volVectorField;
    location    "0.1";
    object      U;
}
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //

dimensions      [0 1 -1 0 0 0 0];

internalField   nonuniform List<vector>
23177
(
(0.225112 5.13224e-05 2.5452e-05)
In other words, when running in parallel this function only reads the velocity and mesh of processor0; processor1, 2, 3, ... are never included.
Has anyone run into something like this before?
-
@huicfd When running in parallel, each process handles its own part of the work. What you are seeing is only the output of process 0 (OpenFOAM seems to treat that process as the master processor by default), while the results of the other processes are simply not printed. To collect the data computed on the other processes you need to synchronise it, using OpenFOAM's Pstream::gatherList together with Pstream::scatterList.
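(Side note: Info only writes from the master process, while Pout writes from every rank with a [procNo] prefix, so it is handy for checking what each subdomain actually holds. A minimal illustration, not taken from your solver:

// Illustrative only: compare Info vs Pout inside the time loop.
// 'mesh' is the usual fvMesh reference available in the solver.
Info << "printed once, by the master: local nCells = "
     << mesh.nCells() << endl;

Pout << "printed by every rank: this subdomain owns "
     << mesh.nCells() << " cells" << endl;
)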
Here is a piece of my test code, hope it helps:

if (Pstream::parRun())
{
    List<scalarField> test(Pstream::nProcs());
    test[Pstream::myProcNo()].setSize(3, Pstream::myProcNo());
    Pstream::gatherList<scalarField>(test);
    Pstream::scatterList(test);
    // Pout << test << endl;

    List<scalar> numcell(Pstream::nProcs());
    numcell[Pstream::myProcNo()] = mesh.nCells();
    Pstream::gatherList<scalar>(numcell);
    Pstream::scatterList<scalar>(numcell);
    // reduce( numcell, sumOp<List<scalar> >() );

    scalar nCells = mesh.nCells();      // local cell count before the global sum
    reduce(nCells, sumOp<scalar>());

    // Pout << "CPU = " << Pstream::myProcNo() << ", masterProcNo = "
    //      << Pstream::masterNo() << endl;

    if (Pstream::master())
    {
        scalar size = 0.0;
        for (label procI = 0; procI < Pstream::nProcs(); procI++)
        {
            size += numcell[procI];
        }
        // Pout << "grid size = " << size << endl;
        // Pout << "Cell size in each Process is " << numcell
        //      << ", Total cell size = " << nCells << endl;
    }
}
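Applied to your updateVelocity, one common pattern is to let every processor find its own local nearest cell and then reduce over all processors. Below is a rough sketch using the same member names you posted (EPcenter, calcDist, structuralElements_memb, fluidVelocity_memb); I have not compiled it against your class, so treat it as an outline rather than a drop-in replacement:

void Foam::netPanel::updateVelocity
(
    const volVectorField &U,
    const fvMesh &mesh
)
{
    const vectorField &centres(mesh.C());
    List<vector> fluidVelocities(structuralElements_memb.size(), vector::zero);

    forAll(EPcenter, Elemi)
    {
        // Each processor searches only the cells it owns.
        scalar minDist(GREAT);
        vector localU(vector::zero);

        forAll(centres, cellI)
        {
            const scalar d = calcDist(centres[cellI], EPcenter[Elemi]);
            if (d < minDist)
            {
                minDist = d;
                localU = U[cellI];
            }
        }

        // Global minimum distance over all processors.
        scalar globalMinDist = minDist;
        reduce(globalMinDist, minOp<scalar>());

        // Only the processor owning the nearest cell keeps its velocity;
        // the others contribute zero, so a sum gives the final value.
        // (An exact tie between two processors would need an extra rule.)
        if (minDist > globalMinDist)
        {
            localU = vector::zero;
        }
        reduce(localU, sumOp<vector>());

        fluidVelocities[Elemi] = localU;
    }

    fluidVelocity_memb = fluidVelocities;
}

Since reduce does nothing outside a parallel run, the same code should keep working in serial as well.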