nn.Conv1d, nn.Conv2d, nn.Linear
Contents
- nn.Linear
- nn.Conv1d
- nn.Conv2d
nn.Linear
Args:
in_features: size of each input sample
out_features: size of each output sample
bias: If set to False, the layer will not learn an additive bias. Default: True
Shape:
- Input: :math:`(*, H_{in})` where :math:`*` means any number of dimensions including none and :math:`H_{in} = \text{in\_features}`.
- Output: :math:`(*, H_{out})` where all but the last dimension are the same shape as the input and :math:`H_{out} = \text{out\_features}`.
Examples::
>>> m = nn.Linear(20, 30)
>>> input = torch.randn(128, 20)  # the last dimension of the input to nn.Linear(in_features, out_features) must equal in_features
>>> output = m(input)
>>> print(output.size())
torch.Size([128, 30])
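Since the input shape is :math:`(*, H_{in})`, the same layer also accepts inputs with extra leading dimensions; a minimal sketch (the sizes here are chosen only for illustration):
>>> m = nn.Linear(20, 30)
>>> x = torch.randn(128, 4, 20)  # any number of leading dims; the last must equal in_features
>>> print(m(x).size())
torch.Size([128, 4, 30])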
nn.Conv1d
In the simplest case, the output value of the layer with input size :math:`(N, C_{\text{in}}, L)` and output :math:`(N, C_{\text{out}}, L_{\text{out}})` can be precisely described as:
.. math::
    \text{out}(N_i, C_{\text{out}_j}) = \text{bias}(C_{\text{out}_j}) + \sum_{k = 0}^{C_{\text{in}} - 1} \text{weight}(C_{\text{out}_j}, k) \star \text{input}(N_i, k)
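Here :math:`\star` is the 1-D cross-correlation operator. As a sanity check, the formula can be reproduced channel by channel; below is a minimal sketch with sizes chosen only for illustration (not part of the original docs example):
import torch
import torch.nn.functional as F

torch.manual_seed(0)
m = torch.nn.Conv1d(in_channels=2, out_channels=3, kernel_size=3)
x = torch.randn(1, 2, 8)  # (N, C_in, L) with N = 1

with torch.no_grad():
    channels = []
    for j in range(3):      # loop over output channels C_out_j
        acc = m.bias[j]
        for k in range(2):  # sum over input channels k
            # cross-correlation of weight(C_out_j, k) with input(N_i, k)
            acc = acc + F.conv1d(x[:, k:k+1], m.weight[j:j+1, k:k+1])[0, 0]
        channels.append(acc)
    out_manual = torch.stack(channels).unsqueeze(0)  # back to (N, C_out, L_out)

print(torch.allclose(out_manual, m(x), atol=1e-6))  # True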
Shape:
- Input: :math:`(N, C_{in}, L_{in})` or :math:`(C_{in}, L_{in})`
- Output: :math:`(N, C_{out}, L_{out})` or :math:`(C_{out}, L_{out})`, where
.. math::
    L_{out} = \left\lfloor\frac{L_{in} + 2 \times \text{padding} - \text{dilation} \times (\text{kernel\_size} - 1) - 1}{\text{stride}} + 1\right\rfloor
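For example, plugging in the nn.Conv1d(16, 33, 3, stride=2) layer from the example below (:math:`L_{in} = 50`, padding = 0, dilation = 1):

.. math::
    L_{out} = \left\lfloor\frac{50 + 2 \times 0 - 1 \times (3 - 1) - 1}{2} + 1\right\rfloor = \left\lfloor 24.5 \right\rfloor = 24

which matches the torch.Size([20, 33, 24]) noted in that example.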
Attributes:
weight (Tensor): the learnable weights of the module of shape
:math:`(\text{out\_channels}, \frac{\text{in\_channels}}{\text{groups}}, \text{kernel\_size})`.
The values of these weights are sampled from
:math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})` where
:math:`k = \frac{groups}{C_\text{in} * \text{kernel\_size}}`
bias (Tensor): the learnable bias of the module of shape
(out_channels). If :attr:`bias` is True, then the values of these weights are
sampled from :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})` where
:math:`k = \frac{groups}{C_\text{in} * \text{kernel\_size}}`
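A quick empirical check of this initialization range (a sketch using the layer sizes from the example below; with groups = 1, :math:`k = 1 / (C_\text{in} \times \text{kernel\_size})`):
import math
import torch

conv = torch.nn.Conv1d(in_channels=16, out_channels=33, kernel_size=3)  # groups = 1
bound = math.sqrt(1 / (16 * 3))  # sqrt(k) with k = groups / (C_in * kernel_size)
print(conv.weight.abs().max() <= bound)  # expected: tensor(True)
print(conv.bias.abs().max() <= bound)    # expected: tensor(True)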
Examples::
# For an input tensor of shape [N, C, L] to nn.Conv1d(in_channels, out_channels), the channel dimension C must equal in_channels
>>> m = nn.Conv1d(16, 33, 3, stride=2)
>>> input = torch.randn(20, 16, 50)
>>> output = m(input) # torch.Size([20, 33, 24])
nn.Conv1d with kernel_size=1 vs. nn.Linear
Both can serve as a fully connected layer and implement the same MLP computation, but they take different input layouts: nn.Conv1d expects a 3-D tensor of shape [batch, channel, length], while nn.Linear accepts a tensor of variable shape [batch, *, in_features]. When comparing the two, make sure the nn.Linear input is also three-dimensional.
nn.Conv1d acts on the second dimension (channel), while nn.Linear acts on the last dimension (in_features), so to compute equivalent results on the same input the tensor has to be rearranged with tensor.permute to swap the axis order.
nn.Linear is faster than nn.Conv1d with kernel_size=1.
import torch

def count_parameters(model):
    """Count the number of parameters in a model."""
    return sum(p.numel() for p in model.parameters())
conv = torch.nn.Conv1d(8,32,1)
print(count_parameters(conv))
# 288
linear = torch.nn.Linear(8,32)
print(count_parameters(linear))
# 288
print(conv.weight.shape)
# torch.Size([32, 8, 1])
print(linear.weight.shape)
# torch.Size([32, 8])
# use same initialization
linear.weight = torch.nn.Parameter(conv.weight.squeeze(2))
linear.bias = torch.nn.Parameter(conv.bias)
tensor = torch.randn(128, 256, 8)                       # [batch, length, in_features] for nn.Linear
permuted_tensor = tensor.permute(0, 2, 1).contiguous()  # [batch, channel, length] for nn.Conv1d
out_linear = linear(tensor)                             # shape [128, 256, 32]
print(out_linear.mean())
# tensor(0.0067, grad_fn=<MeanBackward0>)
out_conv = conv(permuted_tensor)
print(out_conv.mean())
# tensor(0.0067, grad_fn=<MeanBackward0>)
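Beyond comparing means, the two outputs agree elementwise once the conv output's channel and length axes are swapped back (a small addition to the snippet above):
print(torch.allclose(out_linear, out_conv.permute(0, 2, 1), atol=1e-6))
# True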
Speed test:
%%timeit
_ = linear(tensor)
# 151 µs ± 297 ns per loop
%%timeit
_ = conv(permuted_tensor)
# 1.43 ms ± 6.33 µs per loop
nn.Conv2d
Shape:
- Input: :math:`(N, C_{in}, H_{in}, W_{in})` or :math:`(C_{in}, H_{in}, W_{in})`
- Output: :math:`(N, C_{out}, H_{out}, W_{out})` or :math:`(C_{out}, H_{out}, W_{out})`, where
.. math::
    H_{out} = \left\lfloor\frac{H_{in} + 2 \times \text{padding}[0] - \text{dilation}[0] \times (\text{kernel\_size}[0] - 1) - 1}{\text{stride}[0]} + 1\right\rfloor

    W_{out} = \left\lfloor\frac{W_{in} + 2 \times \text{padding}[1] - \text{dilation}[1] \times (\text{kernel\_size}[1] - 1) - 1}{\text{stride}[1]} + 1\right\rfloor
Examples:
# For an input tensor of shape [B, C, H, W] to nn.Conv2d(in_channels, out_channels), the channel dimension C must equal in_channels
>>> # With square kernels and equal stride
>>> m = nn.Conv2d(16, 33, 3, stride=2)
>>> # non-square kernels and unequal stride and with padding
>>> m = nn.Conv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2))
>>> # non-square kernels and unequal stride and with padding and dilation
>>> m = nn.Conv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2), dilation=(3, 1))
>>> input = torch.randn(20, 16, 50, 100)
>>> output = m(input)
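Plugging the last layer's parameters into the formulas above (kernel_size = (3, 5), stride = (2, 1), padding = (4, 2), dilation = (3, 1), input 50 × 100):

.. math::
    H_{out} = \left\lfloor\frac{50 + 2 \times 4 - 3 \times (3 - 1) - 1}{2} + 1\right\rfloor = 26, \qquad
    W_{out} = \left\lfloor\frac{100 + 2 \times 2 - 1 \times (5 - 1) - 1}{1} + 1\right\rfloor = 100

so output.size() is torch.Size([20, 33, 26, 100]).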