Windows FAQ

Building from source
Extension
- Cpp Extension

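Speeding up CUDA builds on Windows

Visual Studio doesn't currently support parallel custom tasks. As an alternative, we can use Ninja to parallelize the CUDA build tasks. It takes only a few lines to set up.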

REM Install ninja first.
pip install ninja
REM Set it as the CMake generator.
set CMAKE_GENERATOR=Ninja
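
If the build is driven from a Python script rather than typed at the prompt, the same setting can be applied through the environment passed to the build process. The following is only a sketch: ninja_build_env and ninja_available are hypothetical helper names, not part of PyTorch, and the final build command is shown as a comment because it depends on your checkout.

```python
import os
import shutil


def ninja_build_env(base_env=None):
    """Return a copy of the environment with CMAKE_GENERATOR set to Ninja.

    Hypothetical helper for illustration; not part of PyTorch.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CMAKE_GENERATOR"] = "Ninja"
    return env


def ninja_available():
    """Report whether a ninja executable can be found on PATH."""
    return shutil.which("ninja") is not None


if __name__ == "__main__":
    if not ninja_available():
        print("ninja not found on PATH; run 'pip install ninja' first")
    env = ninja_build_env()
    print("CMAKE_GENERATOR =", env["CMAKE_GENERATOR"])
    # The build itself would then be launched with this environment, e.g.
    # subprocess.run([sys.executable, "setup.py", "install"], env=env, check=True)
```

Copying the environment instead of mutating os.environ keeps the setting local to the build process.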

One key install script

You can take a look at the script here; it will lead the way for you.

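CFFI Extension

Support for CFFI extensions is very experimental. Enabling it on Windows generally takes two steps.

First, specify additional libraries in the Extension object so that it can be built on Windows.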

ffi = create_extension(
    '_ext.my_lib',
    headers=headers,
    sources=sources,
    define_macros=defines,
    relative_to=__file__,
    with_cuda=with_cuda,
    extra_compile_args=["-std=c99"],
    libraries=['ATen', '_C']  # Append CUDA libraries when necessary, like cudart
)

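Second, work around the "unresolved external symbol state" linker error caused by extern THCState *state; by changing the source code from C to C++. An example is listed below: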

#include <THC/THC.h>
#include <ATen/ATen.h>
THCState *state = at::globalContext().thc_state;
extern "C" int my_lib_add_forward_cuda(THCudaTensor *input1, THCudaTensor *input2,
                                        THCudaTensor *output)
{
    if (!THCudaTensor_isSameSizeAs(state, input1, input2))
        return 0;
    THCudaTensor_resizeAs(state, output, input1);
    THCudaTensor_cadd(state, output, input1, 1.0, input2);
    return 1;
}
extern "C" int my_lib_add_backward_cuda(THCudaTensor *grad_output, THCudaTensor *grad_input)
{
    THCudaTensor_fill(state, grad_input, 1);
    return 1;
}

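Cpp Extension

Compared with the CFFI extension, this type of extension has better support. However, it still needs some manual configuration. First, open the x86_x64 Cross Tools Command Prompt for VS 2017. Then, launch Git-Bash from within it; it is usually located at C:\Program Files\Git\git-bash.exe. Finally, start your compiling process.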

Package not found in win-32 channel

PyTorch doesn't work on 32-bit systems. Please use 64-bit versions of Windows and Python.
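
A quick way to check which kind of Python is installed is to look at the pointer size of the running interpreter. This is a minimal sketch using only the standard library; python_bits is a hypothetical helper name introduced for illustration.

```python
import struct
import sys


def python_bits():
    """Return the pointer width of the running interpreter in bits."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    bits = python_bits()
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is a {bits}-bit build")
    if bits != 64:
        print("A 64-bit Python is required for the PyTorch Windows packages.")
```

Note that a 32-bit Python can be installed on 64-bit Windows, so the interpreter has to be checked separately from the operating system.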

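Why are there no Python 2 packages for Windows?

Because it's not stable enough. Some issues need to be solved before we officially release it. You can build it yourself.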

Import error

from torch._C import *
ImportError: DLL load failed: The specified module could not be found.

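This problem is caused by missing essential files, such as the VC 2017 redistributable. You can resolve it by running the following command:

conda install -c peterjc123 vc vs2017_runtime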

Another possible cause is that you are using the GPU version without an NVIDIA graphics card. Please replace your GPU package with the CPU one.

Multiprocessing error without if-clause protection

RuntimeError:
    An attempt has been made to start a new process before the
    current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

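The implementation of multiprocessing is different on Windows: it uses spawn instead of fork. We therefore have to guard the code with an if clause to keep it from being executed multiple times. Refactor your code so that the entry point runs only under if __name__ == '__main__':.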
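
The guarded structure can be sketched with the standard library alone. The names square and main are placeholders chosen for this illustration; in a PyTorch script, the loop that iterates over the DataLoader would sit inside main() in the same way.

```python
import multiprocessing as mp


def square(x):
    """Work done in a child process; defined at module level so it can be pickled."""
    return x * x


def main():
    # Everything that starts child processes lives inside main().
    with mp.Pool(processes=2) as pool:
        return pool.map(square, range(5))


if __name__ == "__main__":
    # Under spawn, each child re-imports this module. The guard stops the
    # children from calling main() again during that import.
    mp.freeze_support()  # only matters when frozen into an executable
    print(main())
```

Without the guard, a top-level call to main() would run again in every spawned child, which is what triggers the RuntimeError above.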

Multiprocessing error "Broken pipe"

ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe

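This error occurs when the child process ends before the parent process finishes sending data. There may be something wrong with your code. You can debug it by reducing num_workers of DataLoader to zero and seeing whether the problem persists.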
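
One way to make this debugging step easy to toggle is to build the DataLoader arguments from a flag. The sketch below is an assumption about how a script might be organized: loader_kwargs is a hypothetical helper, and the DataLoader call is shown only as a comment so that the example runs without PyTorch installed.

```python
def loader_kwargs(debug, workers=4):
    """Build keyword arguments for torch.utils.data.DataLoader.

    Hypothetical helper: with debug=True, num_workers is 0, so data is
    loaded in the main process and the underlying exception is shown directly.
    """
    return {"num_workers": 0 if debug else workers}


if __name__ == "__main__":
    print(loader_kwargs(debug=True))
    # Assumed usage, where dataset is your own Dataset object:
    # loader = torch.utils.data.DataLoader(dataset, batch_size=32, **loader_kwargs(debug=True))
```

If the error disappears with num_workers set to zero, the problem most likely lies in code that runs inside the worker processes.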

Multiprocessing error "driver shut down"

Couldn’t open shared file mapping: <torch_14808_1591070686>, error code: <1455> at torch\lib\TH\THAllocator.c:154
[windows] driver shut down

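Please update your graphics driver. If the problem persists, your graphics card may be too old or the computation may be too heavy for it. Please update the TDR settings according to this post.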

CUDA IPC operations

THCudaCheck FAIL file=torch\csrc\generic\StorageSharing.cpp line=252 error=63 : OS call failed or operation not supported on this OS

CUDA IPC operations are not supported on Windows, so share CPU tensors instead. Make sure your custom Dataset returns CPU tensors.