Image Generation using Generative Adversarial Networks (GANs)

Last Updated : 21 Jun, 2024

Generative Adversarial Networks (GANs) represent a revolutionary approach to artificial intelligence, particularly for generating images. Introduced by Ian Goodfellow and his colleagues in 2014, GANs have significantly advanced the ability to create realistic, high-quality images from random noise.

In this article, we will train a GAN on the MNIST dataset to generate images of handwritten digits.

Training GANs for Image Generation

Generative Adversarial Networks (GANs) employ two neural networks, the Generator and the Discriminator, in a competitive framework: the Generator synthesizes images from random noise, striving to produce outputs indistinguishable from real data, while the Discriminator learns to tell the two apart.
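
Formally, the two networks play a minimax game over the value function from the original 2014 paper, where $D(x)$ is the Discriminator's probability that $x$ is real and $G(z)$ is the Generator's output for noise $z$:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$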

Training Generative Adversarial Networks (GANs) is an iterative process that revolves around the interaction between two neural networks:

Training the Discriminator

The Discriminator is trained on a mix of real images from the dataset and fake images produced by the Generator. Its goal is to differentiate between the two. Through backpropagation and gradient descent, the Discriminator adjusts its parameters to classify real and generated images more accurately.
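
In loss terms, the Discriminator minimizes binary cross-entropy with real images labelled 1 and generated images labelled 0. Below is a minimal sketch of that objective; the function name and variables are illustrative and not part of the training code later in this article:

import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()

def discriminator_loss(real_probs, fake_probs):
    # Real images should be scored close to 1, generated images close to 0
    real_loss = bce(tf.ones_like(real_probs), real_probs)
    fake_loss = bce(tf.zeros_like(fake_probs), fake_probs)
    return real_loss + fake_loss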

Training the Generator

Concurrently, the Generator is trained to produce images that are increasingly difficult for the Discriminator to distinguish from real ones. Early in training its outputs look like random noise, but as training progresses it learns to generate images that resemble those in the training dataset. The Generator's parameters are adjusted based on feedback from the Discriminator, steadily improving the realism and quality of the generated images.
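
The corresponding Generator objective, in its common non-saturating form, trains against all-ones labels so the Generator is rewarded when the Discriminator scores its samples as real. Again, the names below are illustrative:

import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()

def generator_loss(fake_probs):
    # The Generator succeeds when the Discriminator assigns
    # probabilities near 1 (real) to generated images
    return bce(tf.ones_like(fake_probs), fake_probs)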

Implementing Generative Adversarial Networks (GANs) for Image Generation

Step 1: Import Necessary Libraries and Load Dataset

Import necessary libraries including TensorFlow, Keras layers and models, NumPy for numerical operations, and Matplotlib for plotting.

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models, optimizers
import matplotlib.pyplot as plt

Proper data preparation is crucial for the successful training of neural networks. For the MNIST dataset, the preprocessing steps include loading the dataset, reshaping the images to ensure they are in the correct format for TensorFlow processing, and normalizing the pixel values to the range [0,1]. Normalization helps stabilize the training process by keeping the input values small.

# Step 1: Dataset Preparation
# Assuming you have a dataset of images (e.g., MNIST), load and preprocess them
(x_train, _), (_, _) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape((-1, 28, 28, 1)).astype('float32') / 255.0

Step 2: Building the Models

This step involves defining the architecture for both the generator and the discriminator using convolutional neural network (CNN) layers, tailored to efficiently process and generate image data.

Generator Model with CNN Layers

The generator’s role in a GAN is to synthesize new images that mimic the distribution of a given dataset. In this case, we use convolutional transpose layers, which are effective for upscaling the input and creating detailed images from a lower-dimensional noise vector.

  • Dense Layer: Converts the input 100-dimensional noise vector into a high-dimensional feature map.
  • Reshape: Transforms the feature map into a 3D shape that can be processed by convolutional layers.
  • Conv2DTranspose Layers: These layers perform upscaling and convolution simultaneously, gradually increasing the resolution of the generated image.
  • BatchNormalization: Stabilizes the learning process and helps in faster convergence.
  • Activation Functions: ‘ReLU’ is used for non-linearity in intermediate layers, while ‘sigmoid’ is used in the output layer to normalize the pixel values between 0 and 1.
def build_generator_cnn():
    model = models.Sequential([
        # Start with a fully connected layer to convert the input noise vector into a suitable shape
        layers.Dense(7*7*128, input_dim=100, activation='relu'),
        layers.Reshape((7, 7, 128)),  # Reshape into an initial image format

        # First upsampling and convolution to increase image size to 14x14
        layers.Conv2DTranspose(128, kernel_size=4, strides=2, padding='same', activation='relu'),
        layers.BatchNormalization(),

        # Second upsampling and convolution to increase image size to 28x28
        layers.Conv2DTranspose(128, kernel_size=4, strides=2, padding='same', activation='relu'),
        layers.BatchNormalization(),

        # Final convolution to produce a 28x28 image with 1 output channel (grayscale)
        layers.Conv2D(1, kernel_size=7, activation='sigmoid', padding='same')
    ])
    return model
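
As a quick sanity check, you can feed a single noise vector through a freshly built generator and confirm the output shape; this snippet is purely illustrative and not required for training:

# Verify that 100-dimensional noise maps to a 28x28x1 image
gen = build_generator_cnn()
sample_noise = np.random.normal(0, 1, (1, 100)).astype('float32')
print(gen(sample_noise).shape)  # Expected: (1, 28, 28, 1)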

Discriminator Model with CNN Layers

The discriminator is a binary classifier that determines whether a given image is real (from the dataset) or fake (generated by the generator).

  • Conv2D Layers: Perform convolutions with a stride of 2 to downsample the image, reducing its spatial dimensions while increasing the receptive field of the filters.
  • BatchNormalization: Used here as well to ensure stable training.
  • Flatten: Converts the 2D feature maps into a 1D feature vector necessary for classification.
  • Dense Output Layer: Outputs a single probability indicating the likelihood that the input image is real.
def build_discriminator_cnn():
    model = models.Sequential([
        # Input layer, starting convolution
        layers.Conv2D(64, kernel_size=3, strides=2, input_shape=(28, 28, 1), padding='same', activation='relu'),

        # Second convolution to further downsample
        layers.Conv2D(128, kernel_size=3, strides=2, padding='same', activation='relu'),
        layers.BatchNormalization(),

        # Flatten the convolution output to connect to a dense output layer
        layers.Flatten(),
        layers.Dense(1, activation='sigmoid')
    ])
    return model
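
Analogously, you can confirm that the discriminator maps a 28x28 grayscale image to a single probability (again, illustrative only):

# Verify that a 28x28x1 image maps to one probability in [0, 1]
disc = build_discriminator_cnn()
dummy_image = np.zeros((1, 28, 28, 1), dtype='float32')
print(disc(dummy_image).shape)  # Expected: (1, 1)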

Step 3: Compiling the Models

First, compile and set up the combined GAN model, which connects the generator and discriminator. This setup is crucial for training the generator while keeping the discriminator’s parameters fixed during the generator’s training updates.

# Instantiate the generator and discriminator models
generator_cnn = build_generator_cnn()
discriminator_cnn = build_discriminator_cnn()

# Compile the Discriminator
discriminator_cnn.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0002),
                          loss='binary_crossentropy',
                          metrics=['accuracy'])

# Freeze the discriminator's weights for the combined model below; because the
# discriminator was compiled before this flag was set, it still trains normally
# when called directly via train_on_batch
discriminator_cnn.trainable = False

# Combined model
gan_input = layers.Input(shape=(100,))
gan_output = discriminator_cnn(generator_cnn(gan_input))
gan_cnn = models.Model(gan_input, gan_output)
gan_cnn.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0002),
                loss='binary_crossentropy')
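
Because the freeze happened before gan_cnn was compiled, gradient updates through the combined model reach only the generator. A quick, illustrative way to verify this:

# The combined model's trainable weights should be exactly the generator's
print(len(generator_cnn.trainable_weights))  # generator weight tensors
print(len(gan_cnn.trainable_weights))        # should print the same number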

Step 4: Model Training and Visualizing

The training loop alternates between training the discriminator and the generator. The discriminator learns to distinguish real images from the fakes produced by the generator, while the generator learns to fool the discriminator with increasingly realistic images. Note that each iteration below trains on a single batch, so "epoch" here really means one training step.

epochs = 10000
batch_size = 64

for epoch in range(epochs):
    # Random noise for the generator (verbose=0 silences predict's progress bar)
    noise = np.random.normal(0, 1, (batch_size, 100))
    generated_images = generator_cnn.predict(noise, verbose=0)

    # Get a random batch of real images
    idx = np.random.randint(0, x_train.shape[0], batch_size)
    real_images = x_train[idx]

    # Labels for real and generated data
    real_labels = np.ones((batch_size, 1))
    fake_labels = np.zeros((batch_size, 1))

    # Train the Discriminator
    d_loss_real = discriminator_cnn.train_on_batch(real_images, real_labels)
    d_loss_fake = discriminator_cnn.train_on_batch(generated_images, fake_labels)
    d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)

    # Train the Generator
    noise = np.random.normal(0, 1, (batch_size, 100))
    valid_labels = np.ones((batch_size, 1))
    g_loss = gan_cnn.train_on_batch(noise, valid_labels)

    # Output training progress
    if epoch % 100 == 0:
        print(f"Epoch {epoch}: D Loss: {d_loss[0]}, G Loss: {g_loss}")

    # Save and display generated images at intervals
    if epoch % 1000 == 0:
        test_noise = np.random.normal(0, 1, (1, 100))
        test_img = generator_cnn.predict(test_noise, verbose=0)[0].reshape(28, 28)
        plt.imshow(test_img, cmap='gray')
        plt.axis('off')
        plt.show()
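
Once the loop finishes, it can be convenient to persist the trained generator and render a grid of samples. A minimal sketch; the file name is illustrative:

# Save the trained generator for later reuse (file name is illustrative)
generator_cnn.save('mnist_generator.keras')

# Generate and display a 4x4 grid of samples from fresh noise
grid_noise = np.random.normal(0, 1, (16, 100))
samples = generator_cnn.predict(grid_noise, verbose=0)

fig, axes = plt.subplots(4, 4, figsize=(6, 6))
for img, ax in zip(samples, axes.flat):
    ax.imshow(img.reshape(28, 28), cmap='gray')
    ax.axis('off')
plt.tight_layout()
plt.show()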

Complete Code to Generate Images using GANs

Python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models, optimizers
import matplotlib.pyplot as plt

# Step 1: Dataset Preparation
# Assuming you have a dataset of images (e.g., MNIST), load and preprocess them
(x_train, _), (_, _) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape((-1, 28, 28, 1)).astype('float32') / 255.0

def build_generator_cnn():
    model = models.Sequential([
        # Start with a fully connected layer to interpret the seed
        layers.Dense(7*7*128, input_dim=100, activation='relu'),
        layers.Reshape((7, 7, 128)),  # Reshape into an image format

        # Upsample to 14x14
        layers.Conv2DTranspose(128, kernel_size=4, strides=2, padding='same', activation='relu'),
        layers.BatchNormalization(),

        # Upsample to 28x28
        layers.Conv2DTranspose(128, kernel_size=4, strides=2, padding='same', activation='relu'),
        layers.BatchNormalization(),

        # Output layer with the shape of the target image, 1 channel for grayscale
        layers.Conv2D(1, kernel_size=7, activation='sigmoid', padding='same')
    ])
    return model

def build_discriminator_cnn():
    model = models.Sequential([
        # Input layer with the shape of the target image
        layers.Conv2D(64, kernel_size=3, strides=2, input_shape=(28, 28, 1), padding='same', activation='relu'),
        
        # Downsample to 14x14
        layers.Conv2D(128, kernel_size=3, strides=2, padding='same', activation='relu'),
        layers.BatchNormalization(),

        # Flatten the feature maps to feed into a dense output layer
        layers.Flatten(),
        layers.Dense(1, activation='sigmoid')
    ])
    return model

# Instantiate the CNN-based Generator and Discriminator
generator_cnn = build_generator_cnn()
discriminator_cnn = build_discriminator_cnn()

# Compile the Discriminator
discriminator_cnn.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0002), loss='binary_crossentropy', metrics=['accuracy'])

# Set the Discriminator's weights to non-trainable (important when we train the combined GAN model)
discriminator_cnn.trainable = False

# Combined GAN model with CNN
gan_input = layers.Input(shape=(100,))
gan_output = discriminator_cnn(generator_cnn(gan_input))
gan_cnn = models.Model(gan_input, gan_output)
gan_cnn.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0002), loss='binary_crossentropy')

epochs = 10000
batch_size = 64

for epoch in range(epochs):
    ############################
    # 1. Train the Discriminator
    ############################
    
    # Generate batch of noise
    noise = np.random.normal(0, 1, (batch_size, 100))
    generated_images = generator_cnn.predict(noise, verbose=0)

    # Get a random batch of real images
    idx = np.random.randint(0, x_train.shape[0], batch_size)
    real_images = x_train[idx]

    # Labels for generated and real data
    fake_labels = np.zeros((batch_size, 1))
    real_labels = np.ones((batch_size, 1))

    # Train the Discriminator (real classified as ones and generated as zeros)
    d_loss_real = discriminator_cnn.train_on_batch(real_images, real_labels)
    d_loss_fake = discriminator_cnn.train_on_batch(generated_images, fake_labels)

    #################################
    # 2. Train the Generator (via GAN)
    #################################
    
    # Train the generator (note that we want the Discriminator to mistake images as real)
    noise = np.random.normal(0, 1, (batch_size, 100))
    valid_labels = np.ones((batch_size, 1))
    g_loss = gan_cnn.train_on_batch(noise, valid_labels)

    # Plot the progress
    if epoch % 100 == 0:
        print(f"Epoch {epoch}: D Loss Real: {d_loss_real[0]}, D Loss Fake: {d_loss_fake[0]}, G Loss: {g_loss}")
    
    # Optionally, save generated images and display
    if epoch % 1000 == 0:
        generated_image = generator_cnn.predict(np.random.normal(0, 1, (1, 100)), verbose=0)
        plt.imshow(generated_image[0, :, :, 0], cmap='gray')
        plt.axis('off')
        plt.show()
        plt.close()

  

Output:

Generated Images

Challenges and Considerations

Training Generative Adversarial Networks (GANs) presents several challenges, including:

  1. Mode Collapse: Occurs when the Generator produces limited varieties of outputs, failing to explore the full diversity of the data distribution.
  2. Training Instability: Manifests as oscillations or divergence during training, where the Generator and Discriminator struggle to reach equilibrium (a common mitigation is sketched after this list).
  3. Hyperparameter Sensitivity: Parameters such as learning rates and network architectures significantly impact GANs’ performance and stability.
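
One widely used stabilization trick is one-sided label smoothing: train the Discriminator against a softened real label such as 0.9 instead of 1.0, which discourages overconfident predictions. A minimal, drop-in sketch for the training loop above (the 0.9 value is a common convention, not a requirement):

# One-sided label smoothing: soften only the "real" targets
real_labels = np.full((batch_size, 1), 0.9)  # instead of np.ones((batch_size, 1))
fake_labels = np.zeros((batch_size, 1))      # fake targets stay at 0

d_loss_real = discriminator_cnn.train_on_batch(real_images, real_labels)
d_loss_fake = discriminator_cnn.train_on_batch(generated_images, fake_labels)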

Conclusion

Generative Adversarial Networks have redefined image generation capabilities, offering powerful tools for creating diverse and realistic visual content. Despite challenges, ongoing research and advancements promise even greater applications and innovations in the field.


