CS231n Assignment 3(3) : Generative Adversarial Networks

2024. 12. 19. 22:47·Stanford CS231n

Inline Question 1

What does your final vanilla GAN image look like?

 

Your Answer : In the early iterations the samples are mostly black backgrounds with blurry shapes; as training progresses, faint digit-like forms appear, and in the final output roughly half of the samples are recognizable as digits.

Inline Question 2

What does your final LSGAN image look like?

 

Your Answer : Compared to the vanilla GAN, the final output contains noticeably more recognizable images, and the still-blurry digits appear in fairly tight clusters.

Inline Question 3

What does your final DCGAN image look like?

 

Your Answer : Compared to the vanilla GAN and LSGAN, the final output has far fewer blurry regions, and most samples look like digits written by a real person. Many of the digits appear recognizable.

Inline Question 4

We will look at an example to see why alternating minimization of the same objective (like in a GAN) can be tricky business.
Consider $f(x,y)=xy$. What does $\min_x \max_y f(x,y)$ evaluate to? (Hint: minmax tries to minimize the maximum value achievable.)
Now try to evaluate this function numerically for 6 steps, starting at the point (1,1), by using alternating gradient (first updating y, then updating x using that updated y) with step size 1. Here step size is the learning_rate, and steps will be learning_rate * gradient. You'll find that writing out the update step in terms of $x_t, y_t, x_{t+1}, y_{t+1}$ will be useful. Briefly explain what $\min_x \max_y f(x,y)$ evaluates to and record the six pairs of explicit values for $(x_t,y_t)$ in the table below.

 

Your Answer : $\min_x \max_y f(x,y)$ evaluates to 0: for any fixed $x \neq 0$ the inner maximum over $y$ is $+\infty$, so the minimum is achieved at $x = 0$, where $f(0,y) = 0$. With the alternating updates $y_{t+1} = y_t + x_t$ (gradient ascent on $y$) followed by $x_{t+1} = x_t - y_{t+1}$ (gradient descent on $x$ using the updated $y$), starting from $(1,1)$ with step size 1, the six pairs are $(-1,2), (-2,1), (-1,-1), (1,-2), (2,-1), (1,1)$.

Inline Question 5

Using this method, will we ever reach the optimal value? Why or why not?

 

Your Answer : No. In this example the alternating updates fall into a cycle of period 6, so we never reach the optimal value and instead keep orbiting around the initial point.
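
A minimal plain-Python sketch (no PyTorch needed) that reproduces this period-6 cycle, assuming the update order from the question (ascend on y first, then descend on x, step size 1):

# f(x, y) = x * y, so df/dy = x and df/dx = y
x, y = 1.0, 1.0
trajectory = [(x, y)]
for _ in range(6):
    y = y + x   # gradient ascent on y with step size 1
    x = x - y   # gradient descent on x using the updated y
    trajectory.append((x, y))

print(trajectory)
# [(1.0, 1.0), (-1.0, 2.0), (-2.0, 1.0), (-1.0, -1.0), (1.0, -2.0), (2.0, -1.0), (1.0, 1.0)]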

 

Inline Question 6

If the generator loss decreases during training while the discriminator loss stays at a constant high value from the start, is this a good sign? Why or why not? A qualitative answer is sufficient.

 

Your Answer : No. Ideally the generator and discriminator compete with each other and improve together. If the discriminator's performance never improves, the generator will also find it hard to improve through training.

 

Code

This function generates the noise. Since torch.rand samples uniformly from [0, 1), we return 2 * noise - 1 so that the values lie in (-1, 1).

def sample_noise(batch_size, dim, seed=None):
    """
    Generate a PyTorch Tensor of uniform random noise.

    Input:
    - batch_size: Integer giving the batch size of noise to generate.
    - dim: Integer giving the dimension of noise to generate.

    Output:
    - A PyTorch Tensor of shape (batch_size, dim) containing uniform
      random noise in the range (-1, 1).
    """
    if seed is not None:
        torch.manual_seed(seed)

    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    return 2 * torch.rand(batch_size, dim) - 1  # torch.rand samples uniformly from [0, 1)

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
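
A quick sanity check of sample_noise (a sketch; the printed min and max should lie within [-1, 1)):

noise = sample_noise(128, 96, seed=0)
print(noise.shape)                              # torch.Size([128, 96])
print(noise.min().item(), noise.max().item())   # both within [-1, 1)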

 

 

This is the discriminator, which classifies an input image as real (1) or fake (0).

 

The input image features are passed through linear layers and reduced to a single score; this one value is used to decide whether the image is real (1) or fake (0).

def discriminator(seed=None):
    """
    Build and return a PyTorch model implementing the architecture above.
    """

    if seed is not None:
        torch.manual_seed(seed)

    model = None

    ##############################################################################
    # TODO: Implement architecture                                               #
    #                                                                            #
    # HINT: nn.Sequential might be helpful. You'll start by calling Flatten().   #
    ##############################################################################
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    model = nn.Sequential(
      Flatten(),
      nn.Linear(784, 256),
      nn.LeakyReLU(0.01),
      nn.Linear(256, 256),
      nn.LeakyReLU(0.01),
      nn.Linear(256, 1)
    )

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    ##############################################################################
    #                               END OF YOUR CODE                             #
    ##############################################################################
    return model

 

 

This is the generator, which produces images from noise. Linear layers map the noise to shape [batch, 784]; the 784 values are later reshaped into a 28*28 image for visualization.

def generator(noise_dim=NOISE_DIM, seed=None):
    """
    Build and return a PyTorch model implementing the architecture above.
    """

    if seed is not None:
        torch.manual_seed(seed)

    model = None

    ##############################################################################
    # TODO: Implement architecture                                               #
    #                                                                            #
    # HINT: nn.Sequential might be helpful.                                      #
    ##############################################################################
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    model = nn.Sequential(
      nn.Linear(noise_dim, 1024),
      nn.ReLU(),
      nn.Linear(1024,1024),
      nn.ReLU(),
      nn.Linear(1024, 784),
      nn.Tanh()
    )

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    ##############################################################################
    #                               END OF YOUR CODE                             #
    ##############################################################################
    return model

 

 

The loss functions for the discriminator and generator.

 

The two losses are:

 

$$\ell_G  =  -\mathbb{E}_{z \sim p(z)}\left[\log D(G(z))\right]$$
$$ \ell_D = -\mathbb{E}_{x \sim p_\text{data}}\left[\log D(x)\right] - \mathbb{E}_{z \sim p(z)}\left[\log \left(1-D(G(z))\right)\right]$$

 

The bce_loss used in these losses is:

$$ bce(s, y) = -y * \log(s) - (1 - y) * \log(1 - s) $$

Here s = sigmoid(input) and y is the target. In the code we actually use BCEWithLogitsLoss (= BCE(sigmoid(input))), which takes raw logits, so the input does not need to lie in [0, 1]. The loss approaches 0 as s approaches y and grows as they diverge.
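
For reference, the bce_loss helper provided by the assignment behaves roughly like the following sketch (not the official implementation):

import torch.nn as nn

def bce_loss(input, target):
    # Numerically stable binary cross-entropy on raw logits:
    # applies the sigmoid and the BCE in a single call.
    return nn.BCEWithLogitsLoss()(input, target)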

 

def discriminator_loss(logits_real, logits_fake):
    """
    Computes the discriminator loss described above.

    Inputs:
    - logits_real: PyTorch Tensor of shape (N,) giving scores for the real data.
    - logits_fake: PyTorch Tensor of shape (N,) giving scores for the fake data.

    Returns:
    - loss: PyTorch Tensor containing (scalar) the loss for the discriminator.
    """
    loss = None
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    logits_real = logits_real.reshape(-1)
    logits_fake = logits_fake.reshape(-1)

    real_labels = torch.ones(logits_real.size()).type(dtype)
    fake_labels = torch.zeros(logits_fake.size()).type(dtype)
    loss = bce_loss(logits_real, real_labels) + bce_loss(logits_fake, fake_labels)

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    return loss

def generator_loss(logits_fake):
    """
    Computes the generator loss described above.

    Inputs:
    - logits_fake: PyTorch Tensor of shape (N,) giving scores for the fake data.

    Returns:
    - loss: PyTorch Tensor containing the (scalar) loss for the generator.
    """
    loss = None
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    logits_fake = logits_fake.reshape(-1)

    loss = bce_loss(logits_fake, torch.ones(logits_fake.size()).type(dtype))

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    return loss
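
A quick numeric check of the two losses (a sketch; assumes dtype is the FloatTensor type set earlier in the notebook). A discriminator that is confidently correct should give a near-zero discriminator loss and a large generator loss:

logits_real = torch.tensor([10.0, 10.0]).type(dtype)    # D is confident these are real
logits_fake = torch.tensor([-10.0, -10.0]).type(dtype)  # D is confident these are fake
print(discriminator_loss(logits_real, logits_fake))     # close to 0
print(generator_loss(logits_fake))                      # large, about 10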

 

 

This function returns the optimizer: Adam with learning rate 1e-3, beta1 = 0.5, beta2 = 0.999.

def get_optimizer(model):
    """
    Construct and return an Adam optimizer for the model with learning rate 1e-3,
    beta1=0.5, and beta2=0.999.

    Input:
    - model: A PyTorch model that we want to optimize.

    Returns:
    - An Adam optimizer for the model with the desired hyperparameters.
    """
    optimizer = None
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    learning_rate = 1e-3
    beta1 = 0.5
    beta2 = 0.999

    optimizer = optim.Adam(model.parameters(), lr = learning_rate, betas = (beta1, beta2))

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    return optimizer
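
Typical usage when setting up training (a sketch mirroring how the notebook wires the models and optimizers together):

D = discriminator().type(dtype)
G = generator().type(dtype)
D_solver = get_optimizer(D)   # Adam(lr=1e-3, betas=(0.5, 0.999)) for the discriminator
G_solver = get_optimizer(G)   # a separate Adam instance for the generator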

 

 

The generator and discriminator losses introduced in Least Squares GAN. The generator loss is:

 

$$\ell_G  =  \frac{1}{2}\mathbb{E}_{z \sim p(z)}\left[\left(D(G(z))-1\right)^2\right]$$
and the discriminator loss:
$$ \ell_D = \frac{1}{2}\mathbb{E}_{x \sim p_\text{data}}\left[\left(D(x)-1\right)^2\right] + \frac{1}{2}\mathbb{E}_{z \sim p(z)}\left[ \left(D(G(z))\right)^2\right]$$

 

I translated the formulas directly into code; since I have not read the paper, I will skip a detailed explanation.

def ls_discriminator_loss(scores_real, scores_fake):
    """
    Compute the Least-Squares GAN loss for the discriminator.

    Inputs:
    - scores_real: PyTorch Tensor of shape (N,) giving scores for the real data.
    - scores_fake: PyTorch Tensor of shape (N,) giving scores for the fake data.

    Outputs:
    - loss: A PyTorch Tensor containing the loss.
    """
    loss = None
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    loss = 0.5 * torch.mean((scores_real - 1).pow(2)) + 0.5 * torch.mean((scores_fake).pow(2))

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    return loss

def ls_generator_loss(scores_fake):
    """
    Computes the Least-Squares GAN loss for the generator.

    Inputs:
    - scores_fake: PyTorch Tensor of shape (N,) giving scores for the fake data.

    Outputs:
    - loss: A PyTorch Tensor containing the loss.
    """
    loss = None
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    loss = 0.5 * torch.mean((scores_fake - 1).pow(2))

    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    return loss
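
A tiny hand-checkable example of the least-squares losses (a sketch with made-up scores):

scores_real = torch.tensor([1.0, 0.5])
scores_fake = torch.tensor([0.0, 0.5])
# D loss: 0.5 * mean([0.0, 0.25]) + 0.5 * mean([0.0, 0.25]) = 0.0625 + 0.0625
print(ls_discriminator_loss(scores_real, scores_fake))  # tensor(0.1250)
# G loss: 0.5 * mean([1.0, 0.25]) = 0.3125
print(ls_generator_loss(scores_fake))                   # tensor(0.3125)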

 

 

The DCGAN generator and discriminator. The vanilla GAN written earlier generated images using only linear layers; now we generate images with a deep convolutional network. The results are, of course, much better (see Inline Question 3).

def build_dc_classifier(batch_size):
    """
    Build and return a PyTorch model for the DCGAN discriminator implementing
    the architecture above.
    """

    ##############################################################################
    # TODO: Implement architecture                                               #
    #                                                                            #
    # HINT: nn.Sequential might be helpful.                                      #
    ##############################################################################
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    model = nn.Sequential(
      Unflatten(batch_size, 1, 28, 28),
      nn.Conv2d(1, 32, 5, stride=1),
      nn.LeakyReLU(0.01),
      nn.MaxPool2d(2, stride=2),
      nn.Conv2d(32, 64, 5, stride=1),
      nn.LeakyReLU(0.01),
      nn.MaxPool2d(2, stride=2),
      Flatten(),  # 28 -> 24 -> 12 -> 8 -> 4 after two conv + pool stages, so 64 * 4 * 4 = 1024 features
      nn.Linear(1024, 1024),
      nn.LeakyReLU(0.01),
      nn.Linear(1024,1)
    )

    return model
    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    ##############################################################################
    #                               END OF YOUR CODE                             #
    ##############################################################################


def build_dc_generator(noise_dim=NOISE_DIM):
    """
    Build and return a PyTorch model implementing the DCGAN generator using
    the architecture described above.
    """

    ##############################################################################
    # TODO: Implement architecture                                               #
    #                                                                            #
    # HINT: nn.Sequential might be helpful.                                      #
    ##############################################################################
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

    model = nn.Sequential(
      nn.Linear(noise_dim, 1024),
      nn.ReLU(inplace=True),
      nn.BatchNorm1d(1024),
      nn.Linear(1024, 7*7*128),
      nn.ReLU(inplace=True),
      nn.BatchNorm1d(7*7*128),
      Unflatten(),  # reshape to [batch, 128, 7, 7]
      nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),
      nn.ReLU(inplace=True),
      nn.BatchNorm2d(64),
      nn.ConvTranspose2d(64, 1, 4, stride=2, padding=1),
      nn.Tanh(),
      Flatten()
    )

    return model
    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    ##############################################################################
    #                               END OF YOUR CODE                             #
    ##############################################################################
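
A quick shape check for the two DCGAN models (a sketch; assumes NOISE_DIM = 96 as defined in the notebook and a batch size of 128):

batch_size = 128
D = build_dc_classifier(batch_size)
G = build_dc_generator(noise_dim=96)

fake_images = G(sample_noise(batch_size, 96))   # shape [128, 784]
scores = D(fake_images)                         # shape [128, 1]
print(fake_images.shape, scores.shape)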

 
