-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use nccl in CuPy #181
Use nccl in CuPy #181
Conversation
docs/source/installation/guide.rst
Outdated
with the ``--no-nccl`` flag.:: | ||
|
||
$ python setup.py install --no-nccl | ||
typically for testing for debugging purpose, ChainerMN can be used by not installing CuPy. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whole sentence would be like this:
Users who want to try ChainerMN in CPU-only environment may skip installation of CuPy. Non-GPU set up may not be performant as GPU-enabled set up, but would would be useful for testing or debugging training program in non-GPU environment such as laptops or CI jobs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your comment. I will fix it.
docs/source/installation/guide.rst
Outdated
|
||
.. note:: | ||
|
||
To try current version of ChainerMN in a CPU-only enviroment, you not require ``--no-nccl`` flag, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Current version of ChainerMN does not need --no-nccl
flag for CPU-only environment at installation any more. It would be just ignored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will fix it.
I think CuPy must be properly installed with NCCL enabled. I'd recommend adding a sentence how to check CuPy installation, before ChainerMN installation like |
Also we'd need update on a nccl check at if use_nccl and not nccl._available:
raise RuntimeError(
'NCCL is not available. '
'Please confirm that NCCL can be found by dynamic linkers, '
'and ChainerMN is installed without --no-nccl flag.'
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comments are mainly on English, while the code seems nice.
Changes are good; while I tested again and unfortunately found potential fix for non-nccl checking: >>> import chainermn
>>> c = chainermn.create_communicator('pure_nccl')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/kuenishi/src/chainermn/chainermn/communicators/__init__.py", line 77, in create_communicator
return PureNcclCommunicator(mpi_comm=mpi_comm)
File "/home/kuenishi/src/chainermn/chainermn/communicators/pure_nccl_communicator.py", line 12, in __init__
if nccl.get_version() < 2000:
AttributeError: module 'chainermn.nccl' has no attribute 'get_version'
>>> This is because of nccl version checking |
Thank you for testing. I confirmed this error. |
Some errors may occur when NCCL versions used for ChainerMN and CuPy are different. So I modify ChainerMN to use nccl in Cupy and remove NCCL wrapper in ChainerMN.