PnetCDF-python

Overview

PnetCDF-python is a Python interface to PnetCDF, a high-performance parallel I/O library for accessing netCDF files. This integration with Python allows for easy manipulation, analysis, and visualization of netCDF data using the rich ecosystem of Python's scientific computing libraries, making it a valuable tool for Python-based applications that require high-performance access to netCDF files.

More about PnetCDF-python

At a granular level, PnetCDF-python is a library that consists of the following components:

| Component | Description |
|---|---|
| File | pncpy.File is a high-level object representing a netCDF file, which provides a Pythonic interface to create, read, and write within a netCDF file. A File object serves as the root container for dimensions, variables, and attributes. Together they describe the meaning of data and the relations among data fields stored in a netCDF file. |
| Attribute | In the library, netCDF attributes can be created, accessed, and manipulated using Python dictionary-like syntax. A Pythonic interface for metadata operations is provided both in the File class (for global attributes) and the Variable class (for variable attributes). |
| Dimension | A Dimension defines the shape and structure of variables and stores coordinate data for multidimensional arrays. The Dimension object, which is also a key component of the File class, provides an interface to create, access, and manipulate dimensions. |
| Variable | A Variable is a core component of a netCDF file representing an array of data values organized along one or more dimensions, with associated metadata in the form of attributes. The Variable object in the library provides operations to read and write the data and metadata of a variable within a netCDF file. In particular, data mode operations have a flexible interface, where reads and writes can be done through either explicit function-call style methods or indexer-style (numpy-like) syntax. |
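To make the containment relationships above concrete, here is a toy model in plain Python. It is only an illustration of how a file, its dimensions, its variables, and their attributes relate to each other. The class and method names (ToyFile, add_dimension, add_variable) are hypothetical and are not the pncpy classes or their API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Dimension:
    """A named axis with a length (hypothetical, not pncpy.Dimension)."""
    name: str
    size: int


@dataclass
class Variable:
    """An array description: a name, a type, dimensions, and attributes."""
    name: str
    dtype: str
    dimensions: tuple
    attributes: dict = field(default_factory=dict)

    @property
    def shape(self):
        # The shape of a variable follows from the dimensions it is defined on.
        return tuple(dim.size for dim in self.dimensions)


@dataclass
class ToyFile:
    """Root container for dimensions, variables, and global attributes."""
    dimensions: dict = field(default_factory=dict)
    variables: dict = field(default_factory=dict)
    attributes: dict = field(default_factory=dict)

    def add_dimension(self, name, size):
        self.dimensions[name] = Dimension(name, size)
        return self.dimensions[name]

    def add_variable(self, name, dtype, dim_names):
        # Dimensions must exist before a variable can refer to them.
        dims = tuple(self.dimensions[d] for d in dim_names)
        self.variables[name] = Variable(name, dtype, dims)
        return self.variables[name]


toy = ToyFile()
toy.attributes["title"] = "example dataset"      # global attribute
toy.add_dimension("time", 4)
toy.add_dimension("x", 8)
temp = toy.add_variable("temperature", "float64", ("time", "x"))
temp.attributes["units"] = "K"                   # variable attribute
print(temp.shape)  # (4, 8)
```

The dictionary-style attribute access in the sketch mirrors the idea described in the Attribute row; the real library performs these operations against a file through PnetCDF rather than in memory.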

Dependencies

  • Python 3.9 or above
  • PnetCDF C library
  • Python libraries mpi4py, numpy
  • To work with the in-development version, you need to install Cython
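The Python-side requirements above can be checked before installing. The following is a minimal sketch using only the standard library; the helper names (python_ok, missing_packages) are made up for this example, and the script cannot tell whether the PnetCDF C library itself is installed.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 9)                      # "Python 3.9 or above"
RUNTIME_PACKAGES = ("numpy", "mpi4py")   # required Python libraries
DEV_PACKAGES = ("Cython",)               # only for the in-development version


def python_ok(version_info=None):
    """Return True if the given (or running) interpreter is new enough."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


def missing_packages(names):
    """Return the names that Python cannot locate as importable modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python >= 3.9:", python_ok())
    print("missing runtime packages:", missing_packages(RUNTIME_PACKAGES))
    print("missing dev packages:", missing_packages(DEV_PACKAGES))
```

An empty list for the runtime packages means numpy and mpi4py are both importable in the current environment.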

Installation

Currently our PyPI wheels don't cover all systems. If you already have a working MPI installation with the mpicc compiler wrapper on your search path, as well as a PnetCDF C installation, you can install with pip:

CC=mpicc PNETCDF_DIR=/path/to/pnetcdf/dir/ pip install pncpy==0.0.3

Development installation

  • Clone the GitHub repository

  • Make sure numpy, mpi4py and Cython are installed and you have Python 3.9 or newer.

  • Make sure a working MPI implementation and the PnetCDF C library are installed, with PnetCDF built with shared libraries (--enable-shared), and that the pnetcdf-config utility is in your Unix $PATH (or specify the pnetcdf-config file path in setup.cfg).

  • (Optional) Create a Python virtual environment and activate it

  • Run CC=mpicc python3 setup.py build, then CC=mpicc python3 setup.py install
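Before running the build step, it can help to confirm that the two command-line tools mentioned in the steps above are discoverable. This is a small sketch, assuming both are expected on PATH; the function names are hypothetical, and it does not cover the alternative of pointing setup.cfg at pnetcdf-config.

```python
import shutil


def find_tools(tools=("mpicc", "pnetcdf-config"), which=shutil.which):
    """Map each tool name to its resolved path, or None if not on PATH."""
    return {tool: which(tool) for tool in tools}


def missing_tools(found):
    """Return the sorted names of tools that were not found."""
    return sorted(name for name, path in found.items() if path is None)


if __name__ == "__main__":
    found = find_tools()
    for name, path in found.items():
        print(f"{name}: {path or 'NOT FOUND'}")
    if missing_tools(found):
        print("Add the missing tools to PATH before building.")
```

The which parameter exists only so the lookup can be replaced in tests; in normal use the default shutil.which is enough.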

Current build status

The project is under active development. Below is a summary of the current implementation status.

| Component | Implemented | To be implemented next (w/ priority*) |
|---|---|---|
| File API | ncmpi_strerror, ncmpi_strerrno, ncmpi_create, ncmpi_open/close, ncmpi_enddef/redef, ncmpi_sync, ncmpi_begin/end_indep_data, ncmpi_inq_path, ncmpi_inq, ncmpi_wait, ncmpi_wait_all, ncmpi_inq_nreqs, ncmpi_inq_buffer_usage/size, ncmpi_cancel, ncmpi_set_fill, ncmpi_set_default_format, ncmpi_inq_file_info, ncmpi_inq_put/get_size | ncmpi_inq_libvers (2), ncmpi_delete (2), ncmpi_sync_numrecs (2), ncmpi__enddef (2), ncmpi_abort (3), ncmpi_inq_files_opened (2), ncmpi_inq (3) |
| Dimension API | ncmpi_def_dim, ncmpi_inq_ndims, ncmpi_inq_dimlen, ncmpi_inq_dim, ncmpi_inq_dimname, ncmpi_rename_dim | |
| Attribute API | ncmpi_put/get_att_text, ncmpi_put/get_att, ncmpi_inq_att, ncmpi_inq_natts, ncmpi_inq_attname, ncmpi_rename_att, ncmpi_del_att | ncmpi_copy_att (2) |
| Variable API | ncmpi_def_var, ncmpi_def_var_fill, ncmpi_inq_varndims, ncmpi_inq_varname, ncmpi_put/get_vara, ncmpi_put/get_vars, ncmpi_put/get_var1, ncmpi_put/get_var, ncmpi_put/get_varn, ncmpi_put/get_varm, ncmpi_put/get_vara_all, ncmpi_put/get_vars_all, ncmpi_put/get_var1_all, ncmpi_put/get_var_all, ncmpi_put/get_varn_all, ncmpi_put/get_varm_all, ncmpi_iput/iget_var, ncmpi_iput/iget_vara, ncmpi_iput/iget_var1, ncmpi_iput/iget_vars, ncmpi_iput/iget_varm, ncmpi_iput/iget_varn, ncmpi_bput_var, ncmpi_bput_var1, ncmpi_bput_vara, ncmpi_bput_vars, ncmpi_bput_varm, ncmpi_bput_varn, ncmpi_fill_var_rec | All type-specific put/get functions, e.g. ncmpi_put_var1_double_all (3); all put/get_vard functions (3); all mput/mget_var functions (3) |
| Inquiry API | ncmpi_inq, ncmpi_inq_ndims, ncmpi_inq_dimname, ncmpi_inq_varnatts, ncmpi_inq_nvars, ncmpi_inq_vardimid, ncmpi_inq_var_fill, ncmpi_inq_buffer_usage, ncmpi_inq_buffer_size, ncmpi_inq_natts, ncmpi_inq_malloc_max_size, ncmpi_inq_malloc_size, ncmpi_inq_format, ncmpi_inq_file_format, ncmpi_inq_num_rec_vars, ncmpi_inq_num_fix_vars, ncmpi_inq_unlimdim, ncmpi_inq_varnatts, ncmpi_inq_varndims, ncmpi_inq_varname, ncmpi_inq_vartype, ncmpi_inq_varoffset, ncmpi_inq_header_size, ncmpi_inq_header_extent, ncmpi_inq_recsize, ncmpi_inq_version, ncmpi_inq_striping | ncmpi_inq_dimid (3), ncmpi_inq_dim (3), ncmpi_inq_malloc_list (2), ncmpi_inq_var (3), ncmpi_inq_varid (3) |

*Priority levels 1/2/3 correspond to first, second, and third priority.
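The Variable API entries in the summary follow the PnetCDF C naming convention: a trailing _all marks a collective call, names without it are independent calls (made between ncmpi_begin_indep_data and ncmpi_end_indep_data), iput/iget are nonblocking requests, and bput is a buffered nonblocking write. The sketch below decodes that convention for the flexible (non type-specific) function names. The classify helper is hypothetical and reflects the C library's naming only, not how pncpy exposes these operations.

```python
import re

# Matches flexible data-access names such as ncmpi_put_vara_all or ncmpi_iget_varn.
_NAME = re.compile(
    r"^ncmpi_(?P<prefix>[ib]?)(?P<op>put|get)_(?P<kind>var[a-z0-9]*)(?P<all>_all)?$"
)


def classify(name):
    """Describe a flexible put/get function name, or return None if it is not one."""
    m = _NAME.match(name)
    if m is None:
        return None
    prefix, op, has_all = m["prefix"], m["op"], m["all"] is not None
    if prefix == "b" and op == "get":
        return None  # buffered calls exist for writes only
    if prefix and has_all:
        return None  # nonblocking requests have no _all variants
    if prefix == "i":
        mode = "nonblocking"
    elif prefix == "b":
        mode = "buffered nonblocking"
    else:
        mode = "blocking"
    return {
        "direction": "write" if op == "put" else "read",
        "mode": mode,
        # For nonblocking requests this is decided later, by calling
        # ncmpi_wait (independent) or ncmpi_wait_all (collective).
        "collective": has_all if mode == "blocking" else None,
    }


print(classify("ncmpi_put_vara_all"))
print(classify("ncmpi_iget_varn"))
```

In the summary, an entry such as ncmpi_put/get_vara_all is shorthand for the pair ncmpi_put_vara_all and ncmpi_get_vara_all, so each half should be passed to the helper separately.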

Testing

  • To run all the existing tests, execute
./test_all.csh [test_file_output_dir]
  • To run a single test, execute
mpiexec -n [num_process] python3 test/tst_program.py [test_file_output_dir]

The optional test_file_output_dir argument tells the test program to save the generated test files in that directory.
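When single tests are launched from a script, the command line shown above can be assembled programmatically. This is a sketch only: build_test_command is a hypothetical helper, the default of 4 processes is an arbitrary choice, and the test file path is the placeholder used above.

```python
import shlex


def build_test_command(test_file, num_procs=4, output_dir=None, python="python3"):
    """Return the argv list for running one test program under mpiexec."""
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    cmd = ["mpiexec", "-n", str(num_procs), python, test_file]
    if output_dir is not None:
        # Optional directory where the test saves its generated files.
        cmd.append(output_dir)
    return cmd


cmd = build_test_command("test/tst_program.py", num_procs=4, output_dir="/tmp/out")
print(shlex.join(cmd))  # mpiexec -n 4 python3 test/tst_program.py /tmp/out
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where mpiexec and the test programs are available.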
