Parallel NetCDF API

All C interfaces are prefixed with ncmpi_, and all Fortran interfaces with nfmpi_.
Every function returns an integer NetCDF status value.

1. Variable and Parameter Types

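Functions use the MPI_Offset type for size arguments. Unlike size_t, which is only 32 bits wide on 32-bit platforms, MPI_Offset is typically a 64-bit integer, so the sizes it can describe are practically unlimited.

Arguments such as the starting index start, the per-dimension lengths count, and the per-dimension interval stride, whether scalar or vector, must all be declared as MPI_Offset.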
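As a rough illustration of how start and count are usually filled in, the sketch below splits one dimension of length len across nprocs processes. To stay compilable without MPI it uses int64_t as a stand-in for MPI_Offset; the type name sketch_offset and the helper block_range are hypothetical and not part of PnetCDF.

```c
#include <stdint.h>

/* Assumption: stand-in for MPI_Offset so the sketch builds without <mpi.h>. */
typedef int64_t sketch_offset;

/* Hypothetical helper: compute the contiguous block of a dimension of
 * length len that belongs to process rank out of nprocs processes.
 * The first (len % nprocs) ranks receive one extra element. */
void block_range(sketch_offset len, int nprocs, int rank,
                 sketch_offset *start, sketch_offset *count)
{
    sketch_offset base = len / nprocs;  /* elements every rank gets */
    sketch_offset rem  = len % nprocs;  /* leftover elements */

    *count = base + (rank < rem ? 1 : 0);
    *start = rank * base + (rank < rem ? rank : rem);
}
```

When len is smaller than nprocs, the trailing ranks get a count of zero, which is still a valid request for a collective call.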

2. Dataset Functions

ncmpi_create and ncmpi_open take an additional MPI_Info argument, which is mainly used to pass hints. Pass MPI_INFO_NULL to ignore this feature.

int ncmpi_create(MPI_Comm comm,
                 const char *path,
                 int cmode,
                 MPI_Info info,
                 int *ncidp);

int ncmpi_open(MPI_Comm comm,
               const char *path,
               int omode,
               MPI_Info info,
               int *ncidp);

3. Define Mode Functions

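All processes must call these functions with the same values. When the definitions are finished, the definitions made by all processes are checked and compared; if they differ, ncmpi_enddef returns an error code.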

4. Inquiry Functions

Inquiry functions can be called in either define mode or data mode.

5. Attribute Functions

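Attributes store scalars or vectors in a NetCDF file, mainly to describe variables.

In the original interface, attribute functions can be called in either define mode or data mode. However, modifying attribute values in data mode may fail, mainly because the space the file needs may change.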

6. Data Mode Functions

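Data mode has two states: collective mode and independent mode. After ncmpi_enddef or ncmpi_open is called, the file automatically enters collective mode.

In collective mode, all processes must call the same function at the same point in the code, although arguments such as start, count, and stride may differ between processes. In independent mode, processes do not have to make API calls together.

Independent mode cannot be entered from define mode. Call ncmpi_enddef first to leave define mode and enter data mode.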

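Data mode functions fall into two categories. The first mimics the traditional NetCDF functions, which makes it simple to migrate from the traditional NetCDF interface to the Parallel NetCDF interface. We call this the high level data mode interface.

The second category uses more MPI functionality to handle in-memory data layouts better and to expose the capabilities of MPI-IO to the application more fully. All functions in the first category are implemented in terms of these functions. We call this the flexible data mode interface.

Both categories provide independent and collective operations. The names of collective functions end in _all, and all processes must call such a function together.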

6.1. High Level Data Mode Interface

Each function here is similar to its counterpart in the NetCDF data mode interface. The main change is that MPI_Offset replaces size_t.

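ncmpi_put_var_<type> writes all values of a variable to the NetCDF file;
ncmpi_put_vara_<type> writes a subarray: the start vector gives the starting position and count gives the length along each dimension;
ncmpi_put_vars_<type> writes a strided subarray: start gives the starting position, count gives the length along each dimension, and stride gives the interval along each dimension;
ncmpi_put_varm_<type> writes a mapped strided subarray: in addition to start, count, and stride, an imap vector describes how the values are laid out in memory.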
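To make the meaning of start, count, and stride concrete, the sketch below lists the indices along one dimension that such a triple selects. It is plain C with no PnetCDF calls: the function name strided_indices is hypothetical, and long long stands in for MPI_Offset.

```c
/* Hypothetical helper: fill idx[0..count-1] with the indices along one
 * dimension selected by a start/count/stride triple, that is,
 * start, start + stride, start + 2*stride, and so on.
 * long long is a stand-in for MPI_Offset. */
void strided_indices(long long start, long long count, long long stride,
                     long long *idx)
{
    for (long long i = 0; i < count; i++)
        idx[i] = start + i * stride;
}
```

With start = 1, count = 3, and stride = 2, the selected indices are 1, 3, and 5. A vara call corresponds to the special case stride = 1.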

6.2. Flexible Data Mode Interface

6.3. Mapping Between NetCDF and MPI Types

7. Q & A

For more details, please refer to Parallel netCDF Q&A

Q: How do I use the buffered nonblocking write APIs?

A: Buffered nonblocking write APIs copy the contents of user buffers into an internally allocated buffer, so the user buffers can be reused immediately after the calls return. A typical way to use these APIs is described below.

First, tell PnetCDF how much space can be allocated to be used by the APIs.
Make calls to the buffered put APIs.
Make calls to the (collective) wait APIs.
Free the space allocated by the internal buffer.

For further information about the buffered nonblocking APIs, readers are referred to this page.

Q: What is the difference between collective and independent APIs?

A: Collective APIs require all MPI processes to participate in the call. This requirement allows MPI-IO and PnetCDF to coordinate the I/O requesting processes and rearrange requests into a form that can achieve the best performance from the underlying file system. In contrast, independent APIs (also referred to as non-collective) have no such requirement. All PnetCDF collective APIs (except create, open, and close) have a suffix of _all, which distinguishes them from their independent counterparts. To switch from collective data mode to independent mode, users must call ncmpi_begin_indep_data. To exit independent mode, call ncmpi_end_indep_data.

Q: Should I use collective APIs or independent APIs?

A: Users are encouraged to use collective APIs whenever possible. Collective API calls require the participation of all MPI processes that open the shared file. This requirement allows MPI-IO and PnetCDF to coordinate the I/O requesting processes and rearrange requests into a form that can achieve the best performance from the underlying file system. If the nature of the user's I/O does not permit calling collective APIs (for example, the number of requests is not equal among processes, or is determined at run time), then we recommend the following.

Force all processes to participate in the collective calls. When a process has nothing to request, it can still call a collective API with a zero-length request. This is achieved by setting the contents of argument count to zero.
Use nonblocking APIs. Individual processes can make any number of calls to nonblocking APIs independently of other processes. At the end, calling a collective wait API, ncmpi_wait_all, is recommended so that all nonblocking requests are committed to the file system.

Summary: collective APIs are recommended. Even when they do not fit the I/O pattern naturally, try to use them anyway.

8. Example

/*********************************************************************
 *
 *  Copyright (C) 2012, Northwestern University and Argonne National Laboratory
 *  See COPYRIGHT notice in top-level directory.
 *
 *********************************************************************/
/* $Id$ */

/* simple demonstration of pnetcdf
 * text attribute on dataset
 * write out rank into 1-d array collectively.
 * The most basic way to do parallel i/o with pnetcdf */

/* This program creates a file, say named output.nc, with the following
   contents, shown by running the ncmpidump command.

    % mpiexec -n 4 pnetcdf-write-standard /orangefs/wkliao/output.nc

    % ncmpidump /orangefs/wkliao/output.nc
    netcdf output {
    // file format: CDF-2 (large file)
    dimensions:
            d1 = 4 ;
            time = UNLIMITED ; // (2 currently)
    variables:
            int v1(time, d1) ;
            int v2(d1) ;

    // global attributes:
                :string = "Hello World\n",
        "" ;
    data:

         v1 =
            0, 1, 2, 3,
            1, 2, 3, 4 ;

         v2 = 0, 1, 2, 3 ;
    }
*/

#include <stdlib.h>
#include <mpi.h>
#include <pnetcdf.h>
#include <stdio.h>

/* Print the PnetCDF error message for status and abort all processes. */
static void handle_error(int status, int lineno)
{
    fprintf(stderr, "Error at line %d: %s\n", lineno, ncmpi_strerror(status));
    MPI_Abort(MPI_COMM_WORLD, 1);
}

int main(int argc, char **argv) {
    int ret, ncfile, nprocs, rank, dimid1, dimid2, varid1, varid2, ndims;
    MPI_Offset start, count=1;
    int t, i;
    int v1_dimid[2];
    MPI_Offset v1_start[2], v1_count[2];
    int v1_data[4];
    char buf[13] = "Hello World\n";
    int data;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    if (argc != 2) {
        if (rank == 0) printf("Usage: %s filename\n", argv[0]);
        MPI_Finalize();
        exit(-1);
    }

    /* NC_64BIT_OFFSET selects the CDF-2 format reported by ncmpidump in
     * the header comment; NC_CLOBBER alone creates a classic CDF-1 file */
    ret = ncmpi_create(MPI_COMM_WORLD, argv[1],
                       NC_CLOBBER|NC_64BIT_OFFSET, MPI_INFO_NULL, &ncfile);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    /* define mode: every process makes the same definitions */
    ret = ncmpi_def_dim(ncfile, "d1", nprocs, &dimid1);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    ret = ncmpi_def_dim(ncfile, "time", NC_UNLIMITED, &dimid2);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    /* v1(time, d1): the unlimited dimension comes first */
    v1_dimid[0] = dimid2;
    v1_dimid[1] = dimid1;
    ndims = 2;
    ret = ncmpi_def_var(ncfile, "v1", NC_INT, ndims, v1_dimid, &varid1);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    ndims = 1;
    ret = ncmpi_def_var(ncfile, "v2", NC_INT, ndims, &dimid1, &varid2);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    /* length 13 includes the terminating NUL of buf */
    ret = ncmpi_put_att_text(ncfile, NC_GLOBAL, "string", 13, buf);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    /* all processors defined the dimensions, attributes, and variables,
     * but here in ncmpi_enddef is the one place where metadata I/O
     * happens.  Behind the scenes, rank 0 takes the information and writes
     * the netcdf header.  All processes communicate to ensure they have
     * the same (cached) view of the dataset */
    ret = ncmpi_enddef(ncfile);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    /* v2: each process collectively writes its rank at index rank */
    start=rank, count=1, data=rank;
    ret = ncmpi_put_vara_int_all(ncfile, varid2, &start, &count, &data);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    /* v1: for each of two time records, each process collectively writes
     * the single value rank+t at position (t, rank) */
    for (t = 0; t<2; t++){
        v1_start[0] = t, v1_start[1] = rank;
        v1_count[0] = 1, v1_count[1] = 1;
        for (i = 0; i<4; i++){
            v1_data[i] = rank+t;
        }
        /* only v1_data[0] is written because the request has one element */
        ret = ncmpi_put_vara_int_all(ncfile, varid1, v1_start, v1_count, v1_data);
        if (ret != NC_NOERR) handle_error(ret, __LINE__);
    }

    ret = ncmpi_close(ncfile);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    MPI_Finalize();
    return 0;
}
