PyTorch v1.5.1 Release Notes
Release Date: 2020-06-18
- Backwards Incompatible Changes
- Known Issues and Workarounds
- Critical Fixes
- Crashes and Error Fixes
- Other Fixes
Backwards Incompatible Changes
Autograd: Operations that return integer-type tensors now always return tensors that don't require grad (#37789).
This most notably affects torch.argmin, torch.argmax, and torch.argsort. This change is BC-breaking because in 1.5.0 one could obtain an integer-type tensor that requires grad. However, such tensors were not usable by autograd; calling .backward() on them resulted in an error, so most users are unlikely to have relied on this behavior.
Version 1.5.0:
    >>> tensor = torch.randn(3, requires_grad=True)
    >>> torch.argmax(tensor).requires_grad
    True
Version 1.5.1:
    >>> tensor = torch.randn(3, requires_grad=True)
    >>> torch.argmax(tensor).requires_grad
    False
Known Issues and Workarounds
When using multiprocessing, PyTorch 1.5.1 and 1.5.0 may error out with complaints about incompatibility between MKL and libgomp (#37377)
You may see error messages like the following when using the torch.multiprocessing package. This bug has primarily affected users with AMD CPUs.
`Error: mkl-service + Intel(R) MKL: MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library. Try to import numpy first or set the threading layer accordingly. Set MKL_SERVICE_FORCE_INTEL to force it.`
You can get rid of the error and the error message by setting the environment variable MKL_THREADING_LAYER=GNU. This can be done either by including the following in your Python code:
    import os
    os.environ['MKL_THREADING_LAYER'] = 'GNU'
or by specifying the environment variable when running your script:
    MKL_THREADING_LAYER=GNU python my_script.py
To learn more about what triggers this bug, and about other workarounds if the above isn't working, please read the discussion on the issue (#37377).
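When the script is started from another Python process, the same variable can be placed in the child's environment instead. The following is a minimal sketch using only the standard library; the helper names `env_with_gnu_threading` and `run_child` are hypothetical and not part of PyTorch, and the child command is a stand-in for your own script.

```python
import os
import subprocess
import sys


def env_with_gnu_threading(base_env=None):
    """Return a copy of the environment with MKL_THREADING_LAYER=GNU set.

    Hypothetical helper for illustration; not part of PyTorch.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MKL_THREADING_LAYER"] = "GNU"
    return env


def run_child(args):
    """Run a child Python interpreter with the workaround in its environment."""
    return subprocess.run(
        [sys.executable, *args],
        env=env_with_gnu_threading(),
        capture_output=True,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    # Stand-in for `python my_script.py`: the child only reports the variable.
    result = run_child(
        ["-c", "import os; print(os.environ['MKL_THREADING_LAYER'])"]
    )
    print(result.stdout.strip())
```

Copying the environment rather than mutating `os.environ` keeps the parent process unchanged, which may matter if the parent itself should keep its current threading layer.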
Critical Fixes
torch.multinomial: Fixed a bug where CUDA multinomial generated the same sequence over and over again with a shift of 4. (#38046)
nn.Conv2d: Fixed a bug where circular padding applied padding across the wrong dimension (#37881)
Version 1.5.0:
    >>> circular = nn.Conv2d(6, 1, (3, 3), padding=(0, 1), padding_mode='circular')
    >>> circular(torch.zeros(1, 6, 10, 10)).shape
    # Notice the padding is incorrectly on the H dimension, not the W dimension.
    torch.Size([1, 1, 10, 8])
Version 1.5.1:
    >>> circular = nn.Conv2d(6, 1, (3, 3), padding=(0, 1), padding_mode='circular')
    >>> circular(torch.zeros(1, 6, 10, 10)).shape
    torch.Size([1, 1, 8, 10])
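The two shapes in the example above can be checked by hand. The sketch below assumes the usual convolution output-size formula with stride 1 and dilation 1, and that circular padding adds `padding` elements on each side like other padding modes; the helper names are hypothetical and the code does not call PyTorch.

```python
def conv_output_size(size, kernel, padding, stride=1, dilation=1):
    """Output length along one dimension of a convolution."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def conv2d_output_hw(in_hw, kernel_hw, padding_hw):
    """Apply the formula to H and W, where padding_hw is (pad_H, pad_W)."""
    return tuple(
        conv_output_size(size, k, p)
        for size, k, p in zip(in_hw, kernel_hw, padding_hw)
    )


# Values from the example: 10x10 input, 3x3 kernel, padding=(0, 1).
correct = conv2d_output_hw((10, 10), (3, 3), (0, 1))
# Swapping the padding pair reproduces the shape reported by 1.5.0.
swapped = conv2d_output_hw((10, 10), (3, 3), (1, 0))
print(correct, swapped)  # (8, 10) (10, 8)
```

With padding=(0, 1), only W is padded, so H shrinks from 10 to 8 while W stays at 10, which matches the 1.5.1 result.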
Fixed bug where asserts in CUDA kernels were mistakenly disabled, leading to many silent kernel errors. (#38943, #39047, #39218)
torch.gather, torch.scatter: Added checks for illegal input dtypes that caused silently incorrect behaviors (#38025, #38646)
torch.argmin, torch.argmax: Fixed silently incorrect result for inputs with more than 2^32 elements (#39212)
C++ Custom Operators: Fixed a bug where custom operators stopped working with autograd and ignored the requires_grad=True flag. (#37355)
Crashes and Error Fixes
Fixed CUDA reduction operations on inputs with more than 2^32 elements (#37788)
Version 1.5.0:
    >>> torch.zeros(5, 14400, 14400, device='cuda').sum(0)
    RuntimeError: sub_iter.strides(0)[0] == 0 INTERNAL ASSERT FAILED at /pytorch/aten/src/ATen/native/cuda/Reduce.cuh:706, please report a bug to PyTorch.
Version 1.5.1:
    >>> torch.zeros(5, 14400, 14400, device='cuda').sum(0)
    # No problem
Fixed pickling of PyTorch operators (#38033)
Version 1.5.0:
    >>> pickle.dumps(torch.tanh)
    PicklingError: Can't pickle : it's not the same object as torch._C._VariableFunctions
Version 1.5.1:
    >>> pickle.dumps(torch.tanh)
    # No problem
nn.LeakyReLU: Fixed a bug where using autograd with in-place nn.LeakyReLU with a slope of 0 incorrectly errored out. (#37453, #37559)
Version 1.5.0:
    >>> tensor = torch.randn(3, requires_grad=True)
    >>> other = tensor + 1
    >>> output = nn.LeakyReLU(0, inplace=True)(other)
    >>> output.sum().backward()
    RuntimeError: In-place leakyReLu backward calculation is triggered with a non-positive slope which is not supported. This is caused by calling in-place forward function with a non-positive slope, please call out-of-place version instead.
Version 1.5.1:
    >>> tensor = torch.randn(3, requires_grad=True)
    >>> other = tensor + 1
    >>> output = nn.LeakyReLU(0, inplace=True)(other)
    >>> output.sum().backward()
    # No error
torch.as_strided: Fixed crash when passed sizes and strides of different lengths. (#39301)
nn.SyncBatchNorm.convert_sync_batchnorm: Fixed bug where it did not respect the devices of the original BatchNorm module, resulting in device mismatch errors (#39344)
nn.utils.clip_grad_norm_: Fixed ability to operate on tensors on different devices (#38615)
torch.min, torch.max: Added check for illegal output dtypes (#38850)
MacOS: Fixed import torch error (#36941).
C++ Extensions: Fixed compilation error when building with older versions of nvcc (#37221)
This bug mainly affected users of ubuntu 16.04. We're certain it affected the following configurations:
- ubuntu 16.04 + cuda 9.2 + gcc 5
- ubuntu 16.04 + cuda 9.2 + gcc 7
- ubuntu 16.04 + cuda 10.0 + gcc 5
C++ Extensions: Fixed ability to compile with paths that include spaces (#38860, #38670)
C++ Extensions: Fixed ability to compile with relative include_dirs for ahead-of-time compilation (#38264)
Other Fixes
nn.Conv1d, nn.Conv2d, nn.Conv3d: Fixed a bug where convolutions were using more memory than previous versions of PyTorch. (#38674)
Fixed in-place floor division magic method (#38695)
In 1.5.0, the in-place floor division magic method mistakenly performed the floor division out-of-place. We've fixed this in 1.5.1.
Version 1.5.0:
    >>> tensor = torch.ones(1)
    >>> expected_data_ptr = tensor.data_ptr()
    >>> tensor //= 1
    >>> tensor.data_ptr() == expected_data_ptr
    False
Version 1.5.1:
    >>> tensor = torch.ones(1)
    >>> expected_data_ptr = tensor.data_ptr()
    >>> tensor //= 1
    >>> tensor.data_ptr() == expected_data_ptr
    True
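For readers unfamiliar with the magic method involved: Python routes `x //= y` to `__ifloordiv__`, and whether the operation is in-place depends on whether that method mutates and returns `self`. The toy classes below are hypothetical and are not PyTorch code; they only illustrate the difference between the two behaviors, using object identity in place of `data_ptr()`.

```python
class Buffer:
    """Toy container (hypothetical, not a PyTorch class) wrapping a list."""

    def __init__(self, values):
        self.values = list(values)

    def __floordiv__(self, other):
        # Out-of-place: allocate a new Buffer and leave self untouched.
        return Buffer(v // other for v in self.values)

    def __ifloordiv__(self, other):
        # In-place: mutate the existing storage and return self, so that
        # `buf //= x` keeps the same object.
        for i, v in enumerate(self.values):
            self.values[i] = v // other
        return self


class OutOfPlaceBuffer(Buffer):
    """Mimics the 1.5.0 symptom: `//=` silently builds a new object."""

    def __ifloordiv__(self, other):
        return OutOfPlaceBuffer(v // other for v in self.values)


good = Buffer([7, 8, 9])
good_before = good
good //= 2
print(good is good_before, good.values)        # True [3, 4, 4]

bad = OutOfPlaceBuffer([7, 8, 9])
bad_before = bad
bad //= 2
print(bad is bad_before, bad_before.values)    # False [7, 8, 9]
```

In the second case the name is rebound to a new object and the original storage is left unmodified, so any other reference to it does not see the update.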
Documentation: Fixed link to java docs. (#39039)
Quantization: Fixed weight quantization inaccuracies for LSTM (#35961)
Weight quantization was done incorrectly for LSTMs: the statistics for all weights (across layers) were combined in the observer. As a result, weights for later layers in an LSTM would use sub-optimal scales, which hurt accuracy. The problem gets worse as the number of layers increases.
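To see why combined statistics give sub-optimal scales, consider a toy calculation. This is only an illustration, assuming symmetric int8 quantization with a max-abs scale; it is not PyTorch's observer implementation, and the weights and function names are made up.

```python
def symmetric_scale(weights, qmax=127):
    """Scale chosen from the largest absolute weight (max-abs statistics)."""
    return max(abs(w) for w in weights) / qmax


def max_roundtrip_error(weights, scale, qmax=127):
    """Largest |w - dequantize(quantize(w))| for the given scale."""
    worst = 0.0
    for w in weights:
        q = max(-qmax, min(qmax, round(w / scale)))
        worst = max(worst, abs(w - q * scale))
    return worst


layer1 = [-4.0, -1.5, 0.7, 3.2]        # large-magnitude weights (made up)
layer2 = [-0.10, -0.03, 0.02, 0.08]    # small-magnitude weights (made up)

combined = symmetric_scale(layer1 + layer2)   # one set of statistics for all layers
per_layer = symmetric_scale(layer2)           # statistics for layer 2 only

err_combined = max_roundtrip_error(layer2, combined)
err_per_layer = max_roundtrip_error(layer2, per_layer)
print(err_combined > err_per_layer)  # True
```

The combined scale is dictated by the layer with the largest weights, so a layer with small weights is mapped onto only a handful of integer levels and loses precision.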
DistributedDataParallel: Fixed single-process multi-GPU use case (#36503)
RPC: Fixed future callbacks not capturing and restoring autograd context id (#38512)
TorchScript: Fixed support with torch.unique (#38156)
ONNX: Fixed pow operator export (#39791)