Introduction to Parallel NetCDF
Parallel NetCDF API
- All C interface functions carry the ncmpi_ prefix.
- Functions return an integer NetCDF status value.
1. Variable and Parameter Types
2. Dataset Functions
int ncmpi_create(MPI_Comm    comm,
                 const char *path,
                 int         cmode,
                 MPI_Info    info,
                 int        *ncidp);
int ncmpi_open(MPI_Comm    comm,
               const char *path,
               int         omode,
               MPI_Info    info,
               int        *ncidp);
3. Define Mode Functions
4. Inquiry Functions
Inquiry functions can be called in either define mode or data mode.
5. Attribute Functions
6. Data Mode Functions
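Data mode is divided into two states: collective mode and independent mode. When the user calls ncmpi_enddef, the file enters collective mode by default.
In collective mode, all processes must call the same function at the same point in the code, although arguments such as start, count, and stride may differ between processes. In independent mode, processes need not make API calls together.
Independent mode cannot be entered from define mode; ncmpi_enddef must be called first.
Data mode functions fall into two categories. The first category mimics the traditional NetCDF functions and offers a simple migration path from the traditional NetCDF interface to the parallel NetCDF interface. We call this category the high level data mode interface.
The second category uses more MPI functionality to handle in-memory data more flexibly and to expose the capabilities of MPI-IO more fully to the application. All functions of the first category are implemented in terms of this second category. We call this category the flexible data mode interface.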
6.1. High Level Data Mode Interface
6.2. Flexible Data Mode Interface
6.3. Mapping Between NetCDF and MPI Types
7. Q & A
For more details, please refer to Parallel netCDF Q&A
Q: How do I use the buffered nonblocking write APIs?
A: Buffered nonblocking write APIs copy the contents of user buffers into an internally allocated buffer, so the user buffers can be reused immediately after the calls return. A typical way to use these APIs is described below.
- First, tell PnetCDF how much space can be allocated to be used by the APIs.
- Make calls to the buffered put APIs.
- Make calls to the (collective) wait APIs.
- Free the space allocated for the internal buffer.
For further information about the buffered nonblocking APIs, readers are referred to this page.
Q: What is the difference between collective and independent APIs?
A: Collective APIs require all MPI processes to participate in the call. This requirement allows MPI-IO and PnetCDF to coordinate the requesting processes and rearrange their requests into a form that achieves the best performance from the underlying file system. In contrast, independent APIs (also referred to as non-collective) have no such requirement. All PnetCDF collective APIs (except create, open, and close) have the suffix _all, which distinguishes them from their independent counterparts. To switch from collective data mode to independent mode, users must call ncmpi_begin_indep_data. The API ncmpi_end_indep_data exits the independent mode.
Q: Should I use collective APIs or independent APIs?
A: Users are encouraged to use collective APIs whenever possible. Collective API calls require the participation of all MPI processes that open the shared file. This requirement allows MPI-IO and PnetCDF to coordinate the requesting processes and rearrange their requests into a form that achieves the best performance from the underlying file system. If the nature of the user's I/O does not permit collective calls (for example, the number of requests differs among processes, or is determined only at run time), then we recommend the following.
- Force all processes to participate in the collective calls. When a process has nothing to request, it can still call a collective API with a zero-length request. This is achieved by setting the contents of the argument count to zero.
- Use nonblocking APIs. Individual processes can make any number of calls to nonblocking APIs independently of other processes. At the end, a collective wait API is recommended, so that all nonblocking requests are committed to the file system.
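The zero-length case in the first recommendation can be illustrated with plain arithmetic. The sketch below is not PnetCDF code: the Slice type and block_slice function are hypothetical, and long long stands in for MPI_Offset so the example compiles without MPI. It splits a 1-D range across ranks and gives a rank a count of zero when there is nothing left for it.

```c
#include <stdio.h>

/* Hypothetical type: the start/count pair one rank would pass to a
 * collective call. */
typedef struct {
    long long start;
    long long count;
} Slice;

/* Split `total` elements over `nprocs` ranks as evenly as possible.
 * Ranks left without data get count == 0, so they can still take part
 * in the collective call with a zero-length request. */
Slice block_slice(long long total, int nprocs, int rank)
{
    Slice s;
    long long base  = total / nprocs;   /* elements every rank gets     */
    long long extra = total % nprocs;   /* first `extra` ranks get one more */

    s.count = base + (rank < extra ? 1 : 0);
    s.start = rank * base + (rank < extra ? rank : extra);
    if (s.count == 0)
        s.start = 0;                    /* keep the unused start in range */
    return s;
}
```

With 3 elements and 4 processes, ranks 0 to 2 each get one element and rank 3 gets a count of zero. Each rank would then pass its start and count to the same collective put or get call.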
Summary: collective APIs are recommended. Even when they are not a natural fit, try to use them anyway.
8. Example
/*
 *  Copyright (C) 2012, Northwestern University and Argonne National Laboratory
 *  See COPYRIGHT notice in top-level directory.
 */
/* $Id$ */
/* simple demonstration of pnetcdf
* text attribute on dataset
* write out rank into 1-d array collectively.
* The most basic way to do parallel i/o with pnetcdf */
/* This program creates a file, named by its command-line argument, with the
   following contents, shown by running the ncmpidump command.

    % mpiexec -n 4 pnetcdf-write-standard /orangefs/wkliao/
    % ncmpidump /orangefs/wkliao/
    netcdf output {
    // file format: CDF-2 (large file)
    dimensions:
            d1 = 4 ;
            time = UNLIMITED ; // (2 currently)
    variables:
            int v1(time, d1) ;
            int v2(d1) ;

    // global attributes:
                    :string = "Hello World\n",
        "" ;
    data:

     v1 =
      0, 1, 2, 3,
      1, 2, 3, 4 ;

     v2 = 0, 1, 2, 3 ;
    }
*/
#include <stdlib.h>
#include <mpi.h>
#include <pnetcdf.h>
#include <stdio.h>

/* Report a PnetCDF error together with its source line, then abort. */
static void handle_error(int status, int lineno)
{
    fprintf(stderr, "Error at line %d: %s\n", lineno, ncmpi_strerror(status));
    MPI_Abort(MPI_COMM_WORLD, 1);
}

int main(int argc, char **argv) {
    int ret, ncfile, nprocs, rank, dimid1, dimid2, varid1, varid2, ndims;
    MPI_Offset start, count=1;
    int t, i;
    int v1_dimid[2];
    MPI_Offset v1_start[2], v1_count[2];
    int v1_data[4];
    char buf[13] = "Hello World\n";
    int data;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    if (argc != 2) {
        if (rank == 0) printf("Usage: %s filename\n", argv[0]);
        MPI_Finalize();
        exit(-1);
    }

    /* NC_64BIT_OFFSET selects the CDF-2 format reported in the dump above */
    ret = ncmpi_create(MPI_COMM_WORLD, argv[1],
                       NC_CLOBBER|NC_64BIT_OFFSET, MPI_INFO_NULL, &ncfile);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    ret = ncmpi_def_dim(ncfile, "d1", nprocs, &dimid1);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    ret = ncmpi_def_dim(ncfile, "time", NC_UNLIMITED, &dimid2);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    /* v1 is two-dimensional: (time, d1) */
    v1_dimid[0] = dimid2;
    v1_dimid[1] = dimid1;
    ndims = 2;
    ret = ncmpi_def_var(ncfile, "v1", NC_INT, ndims, v1_dimid, &varid1);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    ndims = 1;
    ret = ncmpi_def_var(ncfile, "v2", NC_INT, ndims, &dimid1, &varid2);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    /* the length 13 includes the terminating NUL of buf */
    ret = ncmpi_put_att_text(ncfile, NC_GLOBAL, "string", 13, buf);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    /* all processors defined the dimensions, attributes, and variables,
     * but here in ncmpi_enddef is the one place where metadata I/O
     * happens.  Behind the scenes, rank 0 takes the information and writes
     * the netcdf header.  All processes communicate to ensure they have
     * the same (cached) view of the dataset */
    ret = ncmpi_enddef(ncfile);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    /* each process writes its own rank into element [rank] of v2 */
    start=rank, count=1, data=rank;
    ret = ncmpi_put_vara_int_all(ncfile, varid2, &start, &count, &data);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    for (t = 0; t<2; t++){
        v1_start[0] = t, v1_start[1] = rank;
        v1_count[0] = 1, v1_count[1] = 1;
        for (i = 0; i<4; i++){
            v1_data[i] = rank+t;
        }
        /* in this simple example every process writes its rank to two 1d variables */
        ret = ncmpi_put_vara_int_all(ncfile, varid1, v1_start, v1_count, v1_data);
        if (ret != NC_NOERR) handle_error(ret, __LINE__);
    }

    ret = ncmpi_close(ncfile);
    if (ret != NC_NOERR) handle_error(ret, __LINE__);

    MPI_Finalize();
    return 0;
}
