[Feedback] (the dataset download link gets 403 error) docs/components/training/user-guides/pytorch.md | #3927

Open
@itay-nvn

Description

issue:

I'm following this guide:
https://www.kubeflow.org/docs/components/training/user-guides/pytorch/

The guide uses this image:

gcr.io/kubeflow-ci/pytorch-dist-mnist_test:1.0

The image attempts to download this file:

http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz

As of today, requesting this link returns a 403 status.

The expected output for this image can be seen here:
https://developer-qa.nvidia.com/blog/gpu-containers-runtime/#:~:text=Try%20running%20the%20MNIST%20training%20example%20included%20with%20the%20container%3A

suggestions:

  1. Use links from this mirror instead; it is hosted on GitHub and will probably be more reliable:
https://github.com/fgnt/mnist
  2. Allow the links to these files to be provided through env vars, to avoid hardcoding links that may go dead at some point.
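
To illustrate the env var suggestion, here is a minimal sketch of how a download script could read its base URLs from the environment and fall back to the next one on failure. I don't know how the script inside the image is actually written, so everything here is an assumption: the variable name `MNIST_BASE_URLS` and the helper names are hypothetical, and the only URL taken from this issue is the default one that currently returns 403.

```python
import os
import urllib.error
import urllib.request

# Hypothetical env var name (not defined by the image or by Kubeflow):
# a comma-separated list of base URLs, tried in order.
MIRROR_ENV_VAR = "MNIST_BASE_URLS"

# Default base URL from this issue; it is the one currently returning 403.
DEFAULT_BASE_URLS = ["http://yann.lecun.com/exdb/mnist/"]


def base_urls(environ=os.environ):
    """Return the base URLs from the env var, or the default if it is unset or empty."""
    raw = environ.get(MIRROR_ENV_VAR, "")
    urls = [u.strip() for u in raw.split(",") if u.strip()]
    urls = urls or DEFAULT_BASE_URLS
    # Normalize so that base + filename always forms a valid URL.
    return [u if u.endswith("/") else u + "/" for u in urls]


def candidate_urls(filename, environ=os.environ):
    """Build one candidate URL per configured base URL, in order."""
    return [base + filename for base in base_urls(environ)]


def download(filename, dest_dir, environ=os.environ, opener=urllib.request.urlopen):
    """Try each candidate URL in turn and save the first successful response."""
    errors = []
    for url in candidate_urls(filename, environ):
        try:
            with opener(url, timeout=30) as resp:
                data = resp.read()
        except OSError as exc:
            # urllib's HTTPError (e.g. 403) and URLError are OSError subclasses.
            errors.append(f"{url}: {exc}")
            continue
        path = os.path.join(dest_dir, filename)
        with open(path, "wb") as fh:
            fh.write(data)
        return path
    raise RuntimeError("all download attempts failed:\n" + "\n".join(errors))
```

With something like this in place, the job spec could set `MNIST_BASE_URLS` to one or more mirrors without rebuilding the image, e.g. `download("train-images-idx3-ubyte.gz", "/tmp")` would then try each configured base URL in order.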

notes:
I assume this link is hardcoded in a script used by the Dockerfile that builds this image.
I found several references to this link across the Kubeflow GitHub organization:
https://github.com/search?q=org%3Akubeflow%20%22train-images-idx3-ubyte.gz%22&type=code
but I couldn't trace the Dockerfile used to build this image, or determine which of these scripts it uses.
