Fundamental 22
2022. 2. 4. 11:09
ImageNet Challenge
이미지넷은 2010년 ILSVRC2010을 시작으로 대량의 이미지 데이터를 포함하는 데이터셋이다. 이미지넷은 1만 개가 넘는 카테고리에 대해 100만 장 규모의 이미지를 가지고 있고 이미지 데이터를 수집할 당시 10억 장의 이미지에서 167개국에서 모인 5만 명의 작업자가 라벨링에 참여하였다.
ImageNet Pretrained Model의 Accurancy
- Top-1 Accuracy: 자신이 예측한 클래스중 확률이 가장 높은 클래스가 정답일 확률
- Top-5 Accuracy: 자신이 예측한 클래스중 확률이 상위 5개 클래스에서 정답일 확률
딥네트워크의 시작

AlexNet
- ReLU활성화 함수
- 드롭아웃
- 오버래핑 풀링
- 7개의 CNN (Pre-training)
- 2개의 FCN
CNN을 잘쓰자

VGG
- 간결한 구조로 많은 활용이 이루어졌다.
- 10개의 CNN을 가지고 총 16개, 19개의 층으로 이뤄진다.
- 3x3의 커널을 사용해서 더 많은 레이어를 쌓고 이미지의 비서형적 특성을 더 잘 잡아낼 수 있게 하였다.
멀리 있으면 잘 안들려요

기울기 소실(경사소실, Vanishing Gradient)이란?
네트워크를 깊게 쌓으면 모델의 학습을 위한 Gradient가 사라지는 현상이 발생한다.
- 레이어가 깊어지면서 Gradient가 매우 커지거나 작아진다. 레이어의 가중치가 반복돼서 곱해지면, 1보다 작을 때는 0에 너무 가까워 져버리고, 1보다 클 때에는 그 값이 기하급수적으오 커지게 된다.
ResNet
- 총 152개 이상의 레이어
- Skip connertion을 활용하여 Vanishing/Exploding Gradient 문제 해결


Skip Connection은 위의 그림처럼 레이어의 입력을 다른 곳에 이어서 Gradient가 깊은 곳까지 이어지도록한다.
Model API
초기에는 Tensorflow와 Keras의 포함 관계는 없었다. 이 후 Keras의 창시자인 프랑소아 숄레가 구글에 함류하고, Tensorflow 2.0에 이르러서 Keras를 직접 지원하면서 포함된 형태를 보이게 되었다.
VGG16
데이터 셋 가져오기
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
# CIFAR100 데이터셋을 가져옵시다.
cifar100 = keras.datasets.cifar100
(x_train, y_train), (x_test, y_test) = cifar100.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
print("x_train:", len(x_train), "x_test:", len(x_test))
기본적인 CNN 모델 구현
img_input = keras.Input(shape=(32, 32, 3))
x = keras.layers.Conv2D(16, 3, activation='relu')(img_input)
x = keras.layers.MaxPool2D((2,2))(x)
x = keras.layers.Conv2D(32, 3, activation='relu')(x)
x = keras.layers.MaxPool2D((2,2))(x)
x = keras.layers.Flatten()(x)
x = keras.layers.Dense(256, activation='relu')(x)
predictions = keras.layers.Dense(100, activation='softmax')(x)
model = keras.Model(inputs=img_input, outputs=predictions)
model.summary()
'''
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 32, 32, 3)] 0
_________________________________________________________________
conv2d (Conv2D) (None, 30, 30, 16) 448
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 15, 15, 16) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 13, 13, 32) 4640
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 6, 6, 32) 0
_________________________________________________________________
flatten (Flatten) (None, 1152) 0
_________________________________________________________________
dense (Dense) (None, 256) 295168
_________________________________________________________________
dense_1 (Dense) (None, 100) 25700
=================================================================
Total params: 325,956
Trainable params: 325,956
Non-trainable params: 0
_________________________________________________________________
'''
VGG16모델 구현
# Block 1
x = layers.Conv2D(64, (3, 3),
activation='relu',
padding='same',
name='block1_conv1')(img_input)
x = layers.Conv2D(64, (3, 3),
activation='relu',
padding='same',
name='block1_conv2')(x)
x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)
# Block 2
x = layers.Conv2D(128, (3, 3),
activation='relu',
padding='same',
name='block2_conv1')(x)
x = layers.Conv2D(128, (3, 3),
activation='relu',
padding='same',
name='block2_conv2')(x)
x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)
# Block 3
x = layers.Conv2D(256, (3, 3),
activation='relu',
padding='same',
name='block3_conv1')(x)
x = layers.Conv2D(256, (3, 3),
activation='relu',
padding='same',
name='block3_conv2')(x)
x = layers.Conv2D(256, (3, 3),
activation='relu',
padding='same',
name='block3_conv3')(x)
x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)
# Block 4
x = layers.Conv2D(512, (3, 3),
activation='relu',
padding='same',
name='block4_conv1')(x)
x = layers.Conv2D(512, (3, 3),
activation='relu',
padding='same',
name='block4_conv2')(x)
x = layers.Conv2D(512, (3, 3),
activation='relu',
padding='same',
name='block4_conv3')(x)
x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)
# Block 5
x = layers.Conv2D(512, (3, 3),
activation='relu',
padding='same',
name='block5_conv1')(x)
x = layers.Conv2D(512, (3, 3),
activation='relu',
padding='same',
name='block5_conv2')(x)
x = layers.Conv2D(512, (3, 3),
activation='relu',
padding='same',
name='block5_conv3')(x)
x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)
# Block 6
x = layers.Flatten(name='flatten')(x)
x = layers.Dense(4096, activation='relu', name='fc1')(x)
x = layers.Dense(4096, activation='relu', name='fc2')(x)
classes=100
x = layers.Dense(classes, activation='softmax', name='predictions')(x) # CIFAR100을 위한 모델 Output
model = keras.Model(name="VGG-16", inputs=img_input, outputs=x)
model.summary()
'''
Model: "VGG-16"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 32, 32, 3)] 0
_________________________________________________________________
block1_conv1 (Conv2D) (None, 32, 32, 64) 1792
_________________________________________________________________
block1_conv2 (Conv2D) (None, 32, 32, 64) 36928
_________________________________________________________________
block1_pool (MaxPooling2D) (None, 16, 16, 64) 0
_________________________________________________________________
block2conv1 (Conv2D) (None, 16, 16, 128) 73856
_________________________________________________________________
block2conv2 (Conv2D) (None, 16, 16, 128) 147584
_________________________________________________________________
block2_pool (MaxPooling2D) (None, 8, 8, 128) 0
_________________________________________________________________
block3conv1 (Conv2D) (None, 8, 8, 256) 295168
_________________________________________________________________
block3conv2 (Conv2D) (None, 8, 8, 256) 590080
_________________________________________________________________
block3conv3 (Conv2D) (None, 8, 8, 256) 590080
_________________________________________________________________
block3pool (MaxPooling2D) (None, 4, 4, 256) 0
_________________________________________________________________
block4conv1 (Conv2D) (None, 4, 4, 512) 1180160
_________________________________________________________________
block4conv2 (Conv2D) (None, 4, 4, 512) 2359808
_________________________________________________________________
block4conv3 (Conv2D) (None, 4, 4, 512) 2359808
_________________________________________________________________
block4pool (MaxPooling2D) (None, 2, 2, 512) 0
_________________________________________________________________
block5conv1 (Conv2D) (None, 2, 2, 512) 2359808
_________________________________________________________________
block5conv2 (Conv2D) (None, 2, 2, 512) 2359808
_________________________________________________________________
block5conv3 (Conv2D) (None, 2, 2, 512) 2359808
_________________________________________________________________
block5pool (MaxPooling2D) (None, 1, 1, 512) 0
_________________________________________________________________
flatten (Flatten) (None, 512) 0
_________________________________________________________________
fc1 (Dense) (None, 4096) 2101248
_________________________________________________________________
fc2 (Dense) (None, 4096) 16781312
_________________________________________________________________
predictions (Dense) (None, 100) 409700
=================================================================
Total params: 34,006,948
Trainable params: 34,006,948
Non-trainable params: 0
_________________________________________________________________
'''
ResNet-50
- 블록마다 feature의 크기가 서로 다르다.
# 추가로 import해야 할 패키지들을 먼저 가져옵니다.
from tensorflow.keras import backend, regularizers, initializers, models
# block 안에 반복적으로 활용되는 L2 regularizer를 선언해 줍니다.
def _gen_l2_regularizer(use_l2_regularizer=True, l2_weight_decay=1e-4):
return regularizers.l2(l2_weight_decay) if use_l2_regularizer else None
Conv block
def conv_block(input_tensor,
kernel_size,
filters,
stage,
block,
strides=(2, 2),
use_l2_regularizer=True,
batch_norm_decay=0.9,
batch_norm_epsilon=1e-5):
"""A block that has a conv layer at shortcut.
Note that from stage 3,
the second conv layer at main path is with strides=(2, 2)
And the shortcut should have strides=(2, 2) as well
Args:
input_tensor: input tensor
kernel_size: default 3, the kernel size of middle conv layer at main path
filters: list of integers, the filters of 3 conv layer at main path
stage: integer, current stage label, used for generating layer names
block: 'a','b'..., current block label, used for generating layer names
strides: Strides for the second conv layer in the block.
use_l2_regularizer: whether to use L2 regularizer on Conv layer.
batch_norm_decay: Moment of batch norm layers.
batch_norm_epsilon: Epsilon of batch borm layers.
Returns:
Output tensor for the block.
"""
filters1, filters2, filters3 = filters
if backend.image_data_format() == 'channels_last':
bn_axis = 3
else:
bn_axis = 1
conv_name_base = 'res' + str(stage) + block + '_branch'
bn_name_base = 'bn' + str(stage) + block + '_branch'
x = layers.Conv2D(
filters1, (1, 1),
use_bias=False,
kernel_initializer='he_normal',
kernel_regularizer=_gen_l2_regularizer(use_l2_regularizer),
name=conv_name_base + '2a')(
input_tensor)
x = layers.BatchNormalization(
axis=bn_axis,
momentum=batch_norm_decay,
epsilon=batch_norm_epsilon,
name=bn_name_base + '2a')(
x)
x = layers.Activation('relu')(x)
x = layers.Conv2D(
filters2,
kernel_size,
strides=strides,
padding='same',
use_bias=False,
kernel_initializer='he_normal',
kernel_regularizer=_gen_l2_regularizer(use_l2_regularizer),
name=conv_name_base + '2b')(
x)
x = layers.BatchNormalization(
axis=bn_axis,
momentum=batch_norm_decay,
epsilon=batch_norm_epsilon,
name=bn_name_base + '2b')(
x)
x = layers.Activation('relu')(x)
x = layers.Conv2D(
filters3, (1, 1),
use_bias=False,
kernel_initializer='he_normal',
kernel_regularizer=_gen_l2_regularizer(use_l2_regularizer),
name=conv_name_base + '2c')(
x)
x = layers.BatchNormalization(
axis=bn_axis,
momentum=batch_norm_decay,
epsilon=batch_norm_epsilon,
name=bn_name_base + '2c')(
x)
shortcut = layers.Conv2D(
filters3, (1, 1),
strides=strides,
use_bias=False,
kernel_initializer='he_normal',
kernel_regularizer=_gen_l2_regularizer(use_l2_regularizer),
name=conv_name_base + '1')(
input_tensor)
shortcut = layers.BatchNormalization(
axis=bn_axis,
momentum=batch_norm_decay,
epsilon=batch_norm_epsilon,
name=bn_name_base + '1')(
shortcut)
x = layers.add([x, shortcut])
x = layers.Activation('relu')(x)
return x
Identity block
def identity_block(input_tensor,
kernel_size,
filters,
stage,
block,
use_l2_regularizer=True,
batch_norm_decay=0.9,
batch_norm_epsilon=1e-5):
"""The identity block is the block that has no conv layer at shortcut.
Args:
input_tensor: input tensor
kernel_size: default 3, the kernel size of middle conv layer at main path
filters: list of integers, the filters of 3 conv layer at main path
stage: integer, current stage label, used for generating layer names
block: 'a','b'..., current block label, used for generating layer names
use_l2_regularizer: whether to use L2 regularizer on Conv layer.
batch_norm_decay: Moment of batch norm layers.
batch_norm_epsilon: Epsilon of batch borm layers.
Returns:
Output tensor for the block.
"""
filters1, filters2, filters3 = filters
if backend.image_data_format() == 'channels_last':
bn_axis = 3
else:
bn_axis = 1
conv_name_base = 'res' + str(stage) + block + '_branch'
bn_name_base = 'bn' + str(stage) + block + '_branch'
x = layers.Conv2D(
filters1, (1, 1),
use_bias=False,
kernel_initializer='he_normal',
kernel_regularizer=_gen_l2_regularizer(use_l2_regularizer),
name=conv_name_base + '2a')(
input_tensor)
x = layers.BatchNormalization(
axis=bn_axis,
momentum=batch_norm_decay,
epsilon=batch_norm_epsilon,
name=bn_name_base + '2a')(
x)
x = layers.Activation('relu')(x)
x = layers.Conv2D(
filters2,
kernel_size,
padding='same',
use_bias=False,
kernel_initializer='he_normal',
kernel_regularizer=_gen_l2_regularizer(use_l2_regularizer),
name=conv_name_base + '2b')(
x)
x = layers.BatchNormalization(
axis=bn_axis,
momentum=batch_norm_decay,
epsilon=batch_norm_epsilon,
name=bn_name_base + '2b')(
x)
x = layers.Activation('relu')(x)
x = layers.Conv2D(
filters3, (1, 1),
use_bias=False,
kernel_initializer='he_normal',
kernel_regularizer=_gen_l2_regularizer(use_l2_regularizer),
name=conv_name_base + '2c')(
x)
x = layers.BatchNormalization(
axis=bn_axis,
momentum=batch_norm_decay,
epsilon=batch_norm_epsilon,
name=bn_name_base + '2c')(
x)
x = layers.add([x, input_tensor])
x = layers.Activation('relu')(x)
return x
ResNet-50
def resnet50(num_classes,
batch_size=None,
use_l2_regularizer=True,
rescale_inputs=False,
batch_norm_decay=0.9,
batch_norm_epsilon=1e-5):
"""Instantiates the ResNet50 architecture.
Args:
num_classes: `int` number of classes for image classification.
batch_size: Size of the batches for each step.
use_l2_regularizer: whether to use L2 regularizer on Conv/Dense layer.
rescale_inputs: whether to rescale inputs from 0 to 1.
batch_norm_decay: Moment of batch norm layers.
batch_norm_epsilon: Epsilon of batch borm layers.
Returns:
A Keras model instance.
"""
input_shape = (32, 32, 3) # CIFAR100을 위한 input_shape 조정입니다.
img_input = layers.Input(shape=input_shape, batch_size=batch_size)
if rescale_inputs:
# Hub image modules expect inputs in the range [0, 1]. This rescales these
# inputs to the range expected by the trained model.
x = layers.Lambda(
lambda x: x * 255.0 - backend.constant(
imagenet_preprocessing.CHANNEL_MEANS,
shape=[1, 1, 3],
dtype=x.dtype),
name='rescale')(
img_input)
else:
x = img_input
if backend.image_data_format() == 'channels_first':
x = layers.Permute((3, 1, 2))(x)
bn_axis = 1
else: # channels_last
bn_axis = 3
block_config = dict(
use_l2_regularizer=use_l2_regularizer,
batch_norm_decay=batch_norm_decay,
batch_norm_epsilon=batch_norm_epsilon)
x = layers.ZeroPadding2D(padding=(3, 3), name='conv1_pad')(x)
x = layers.Conv2D(
64, (7, 7),
strides=(2, 2),
padding='valid',
use_bias=False,
kernel_initializer='he_normal',
kernel_regularizer=_gen_l2_regularizer(use_l2_regularizer),
name='conv1')(
x)
x = layers.BatchNormalization(
axis=bn_axis,
momentum=batch_norm_decay,
epsilon=batch_norm_epsilon,
name='bn_conv1')(
x)
x = layers.Activation('relu')(x)
x = layers.MaxPooling2D((3, 3), strides=(2, 2), padding='same')(x)
x = conv_block(
x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1), **block_config)
x = identity_block(x, 3, [64, 64, 256], stage=2, block='b', **block_config)
x = identity_block(x, 3, [64, 64, 256], stage=2, block='c', **block_config)
x = conv_block(x, 3, [128, 128, 512], stage=3, block='a', **block_config)
x = identity_block(x, 3, [128, 128, 512], stage=3, block='b', **block_config)
x = identity_block(x, 3, [128, 128, 512], stage=3, block='c', **block_config)
x = identity_block(x, 3, [128, 128, 512], stage=3, block='d', **block_config)
x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a', **block_config)
x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b', **block_config)
x = identity_block(x, 3, [256, 256, 1024], stage=4, block='c', **block_config)
x = identity_block(x, 3, [256, 256, 1024], stage=4, block='d', **block_config)
x = identity_block(x, 3, [256, 256, 1024], stage=4, block='e', **block_config)
x = identity_block(x, 3, [256, 256, 1024], stage=4, block='f', **block_config)
x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a', **block_config)
x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b', **block_config)
x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c', **block_config)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(
num_classes,
kernel_initializer=initializers.RandomNormal(stddev=0.01),
kernel_regularizer=_gen_l2_regularizer(use_l2_regularizer),
bias_regularizer=_gen_l2_regularizer(use_l2_regularizer),
name='fc1000')(
x)
# A softmax that is followed by the model loss must be done cannot be done
# in float16 due to numeric issues. So we pass dtype=float32.
x = layers.Activation('softmax', dtype='float32')(x)
# Create model.
return models.Model(img_input, x, name='resnet50')'AIFFEL > fundametal' 카테고리의 다른 글
| Fundamental 24 (0) | 2022.02.09 |
|---|---|
| Fundamental 23 (0) | 2022.02.07 |
| Fundamental 21 (0) | 2022.01.28 |
| Fundamental 20 (0) | 2022.01.26 |
| Fundamental 19 (0) | 2022.01.24 |



