고양이는 털털해
SSACxAIFFEL Week 10: 21'03.08. ~ 03.12; Transfer Learning, keras
Transfer Learning
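Transfer learning means reusing a pre-trained model that has already been trained well on a dataset and problem similar to the ones I want to solve. Building a pipeline out of models made by other people and then fine-tuning them, a modular way of thinking, is said to have been one of the innovations of deep learning. Transfer learning may not work well when the problem I want to solve differs from the data the model was originally trained on, but having the option of starting from an existing foundation should help a lot. So how can we take a model that someone else trained, cut it up, and put it to use?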
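- The lesson node I studied used vgg19 loaded from keras, as shown below.
- keras applications also provides other models such as ResNet, VGG16, EfficientNet, and Xception (see the official tf.keras documentation).
- Let's look at how the node cut the model and how it used the result.
- Cutting layers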
In [1]:
from tensorflow.keras import applications
vgg = applications.vgg19.VGG19(
    include_top=False,  # whether to include the 3 fully connected layers at the top of the network
    weights="imagenet",  # initial weights: None (random init), "imagenet", or a path to a saved weights file; the default is "imagenet"
    input_shape=(None, None, 3)  # optional; a custom shape may only be given when include_top=False
)
In [2]:
vgg.summary()
Model: "vgg19"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, None, None, 3)]   0
block1_conv1 (Conv2D)        (None, None, None, 64)    1792
block1_conv2 (Conv2D)        (None, None, None, 64)    36928
block1_pool (MaxPooling2D)   (None, None, None, 64)    0
block2_conv1 (Conv2D)        (None, None, None, 128)   73856
block2_conv2 (Conv2D)        (None, None, None, 128)   147584
block2_pool (MaxPooling2D)   (None, None, None, 128)   0
block3_conv1 (Conv2D)        (None, None, None, 256)   295168
block3_conv2 (Conv2D)        (None, None, None, 256)   590080
block3_conv3 (Conv2D)        (None, None, None, 256)   590080
block3_conv4 (Conv2D)        (None, None, None, 256)   590080
block3_pool (MaxPooling2D)   (None, None, None, 256)   0
block4_conv1 (Conv2D)        (None, None, None, 512)   1180160
block4_conv2 (Conv2D)        (None, None, None, 512)   2359808
block4_conv3 (Conv2D)        (None, None, None, 512)   2359808
block4_conv4 (Conv2D)        (None, None, None, 512)   2359808
block4_pool (MaxPooling2D)   (None, None, None, 512)   0
block5_conv1 (Conv2D)        (None, None, None, 512)   2359808
block5_conv2 (Conv2D)        (None, None, None, 512)   2359808
block5_conv3 (Conv2D)        (None, None, None, 512)   2359808
block5_conv4 (Conv2D)        (None, None, None, 512)   2359808
block5_pool (MaxPooling2D)   (None, None, None, 512)   0
=================================================================
Total params: 20,024,384
Trainable params: 20,024,384
Non-trainable params: 0
_________________________________________________________________
- The node wraps the loaded vgg model in the Model class, reusing vgg's input as inputs and setting outputs to the output of the layer at index 20 of the model's layers (block5_conv4), which removes the last pooling layer.
In [17]:
from tensorflow.keras import Model
vgg_new = Model(inputs=vgg.input, outputs=vgg.layers[20].output)
In [18]:
vgg_new.summary()
Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, None, None, 3)]   0
block1_conv1 (Conv2D)        (None, None, None, 64)    1792
block1_conv2 (Conv2D)        (None, None, None, 64)    36928
block1_pool (MaxPooling2D)   (None, None, None, 64)    0
block2_conv1 (Conv2D)        (None, None, None, 128)   73856
block2_conv2 (Conv2D)        (None, None, None, 128)   147584
block2_pool (MaxPooling2D)   (None, None, None, 128)   0
block3_conv1 (Conv2D)        (None, None, None, 256)   295168
block3_conv2 (Conv2D)        (None, None, None, 256)   590080
block3_conv3 (Conv2D)        (None, None, None, 256)   590080
block3_conv4 (Conv2D)        (None, None, None, 256)   590080
block3_pool (MaxPooling2D)   (None, None, None, 256)   0
block4_conv1 (Conv2D)        (None, None, None, 512)   1180160
block4_conv2 (Conv2D)        (None, None, None, 512)   2359808
block4_conv3 (Conv2D)        (None, None, None, 512)   2359808
block4_conv4 (Conv2D)        (None, None, None, 512)   2359808
block4_pool (MaxPooling2D)   (None, None, None, 512)   0
block5_conv1 (Conv2D)        (None, None, None, 512)   2359808
block5_conv2 (Conv2D)        (None, None, None, 512)   2359808
block5_conv3 (Conv2D)        (None, None, None, 512)   2359808
block5_conv4 (Conv2D)        (None, None, None, 512)   2359808
=================================================================
Total params: 20,024,384
Trainable params: 20,024,384
Non-trainable params: 0
_________________________________________________________________
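A hard-coded index such as 20 is easy to get wrong, so here is a minimal sketch of the same cut done by layer name. It assumes the layer names shown in the summary (block5_conv4 in particular) and uses weights=None only to skip the ImageNet download; the names index_by_name, cut_index and vgg_cut are made up for this illustration and are not from the node.

```python
from tensorflow.keras import Model, applications

# weights=None skips the ImageNet download; the architecture is the same.
vgg = applications.VGG19(include_top=False, weights=None, input_shape=(None, None, 3))

# Map every layer name to its position in vgg.layers.
index_by_name = {layer.name: i for i, layer in enumerate(vgg.layers)}
cut_index = index_by_name["block5_conv4"]

# Cut by name instead of by a hard-coded index.
vgg_cut = Model(inputs=vgg.input, outputs=vgg.get_layer("block5_conv4").output)

print(cut_index, vgg_cut.layers[-1].name)
```

Looking the layer up by name keeps the cut point readable and still works if the position of the layer in the list ever shifts.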
- Since outputs was specified by indexing into the model's layers attribute, layers seems to be an ordered sequence that can be indexed and iterated over. Let's take a look at layers.
In [3]:
vgg.layers
Out[3]:
[<tensorflow.python.keras.engine.input_layer.InputLayer at 0x7f78ec27aa90>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f78176a77d0>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f78ec27ee90>, <tensorflow.python.keras.layers.pooling.MaxPooling2D at 0x7f7816e787d0>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f781441e710>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f781441e210>, <tensorflow.python.keras.layers.pooling.MaxPooling2D at 0x7f78144372d0>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f781443c450>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f7814437b90>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f78143cb210>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f78176c98d0>, <tensorflow.python.keras.layers.pooling.MaxPooling2D at 0x7f781773bf10>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f78177431d0>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f78181cc7d0>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f78198a9150>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f7819896990>, <tensorflow.python.keras.layers.pooling.MaxPooling2D at 0x7f781996ac50>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f781988a650>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f78144333d0>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f78144336d0>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f78143f8490>, <tensorflow.python.keras.layers.pooling.MaxPooling2D at 0x7f78143fdc90>]
In [5]:
type(vgg.layers)
Out[5]:
list
In [4]:
len(vgg.layers)
Out[4]:
22
- It is a list and it has a length. So can the layers be handled the same way as an ordinary list?
In [6]:
vgg.layers.pop()
Out[6]:
<tensorflow.python.keras.layers.pooling.MaxPooling2D at 0x7f78143fdc90>
- pop() appears to work. Let's check with the summary() method whether the model changed.
In [12]:
vgg.summary()
Model: "vgg19"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, None, None, 3)]   0
block1_conv1 (Conv2D)        (None, None, None, 64)    1792
block1_conv2 (Conv2D)        (None, None, None, 64)    36928
block1_pool (MaxPooling2D)   (None, None, None, 64)    0
block2_conv1 (Conv2D)        (None, None, None, 128)   73856
block2_conv2 (Conv2D)        (None, None, None, 128)   147584
block2_pool (MaxPooling2D)   (None, None, None, 128)   0
block3_conv1 (Conv2D)        (None, None, None, 256)   295168
block3_conv2 (Conv2D)        (None, None, None, 256)   590080
block3_conv3 (Conv2D)        (None, None, None, 256)   590080
block3_conv4 (Conv2D)        (None, None, None, 256)   590080
block3_pool (MaxPooling2D)   (None, None, None, 256)   0
block4_conv1 (Conv2D)        (None, None, None, 512)   1180160
block4_conv2 (Conv2D)        (None, None, None, 512)   2359808
block4_conv3 (Conv2D)        (None, None, None, 512)   2359808
block4_conv4 (Conv2D)        (None, None, None, 512)   2359808
block4_pool (MaxPooling2D)   (None, None, None, 512)   0
block5_conv1 (Conv2D)        (None, None, None, 512)   2359808
block5_conv2 (Conv2D)        (None, None, None, 512)   2359808
block5_conv3 (Conv2D)        (None, None, None, 512)   2359808
block5_conv4 (Conv2D)        (None, None, None, 512)   2359808
block5_pool (MaxPooling2D)   (None, None, None, 512)   0
=================================================================
Total params: 20,024,384
Trainable params: 20,024,384
Non-trainable params: 0
_________________________________________________________________
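- The last pooling layer is still there. model.layers returns a shallow-copy list of the model's layers, so layers.pop() only removes the last entry from that copy and returns it; the original model is not modified.
  - To modify the original model, you reportedly have to go through model._layers. This is a private attribute, so it may behave differently or be missing in other Keras versions.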
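The difference between popping from a copy and popping from the underlying list can be shown with plain Python. ToyModel below is a hypothetical stand-in written for this note; it only mimics the idea of a layers property that hands out a fresh list and is not how Keras is actually implemented.

```python
class ToyModel:
    """Hypothetical stand-in for a model; not the Keras implementation."""

    def __init__(self, layer_names):
        self._layers = list(layer_names)

    @property
    def layers(self):
        # A new list is built on every access, so callers never touch self._layers.
        return list(self._layers)


toy = ToyModel(["block5_conv3", "block5_conv4", "block5_pool"])

popped = toy.layers.pop()        # pops from a temporary copy
after_public_pop = toy.layers    # still three entries

toy._layers.pop()                # pops from the list the object actually holds
after_private_pop = toy.layers   # now two entries

print(popped, after_public_pop, after_private_pop)
```

The first pop() returns an element, which is why it looks like it worked, but the object's own list only shrinks after the second one.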
In [7]:
vgg._layers
Out[7]:
[<tensorflow.python.keras.engine.input_layer.InputLayer at 0x7f78ec27aa90>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f78176a77d0>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f78ec27ee90>, <tensorflow.python.keras.layers.pooling.MaxPooling2D at 0x7f7816e787d0>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f781441e710>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f781441e210>, <tensorflow.python.keras.layers.pooling.MaxPooling2D at 0x7f78144372d0>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f781443c450>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f7814437b90>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f78143cb210>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f78176c98d0>, <tensorflow.python.keras.layers.pooling.MaxPooling2D at 0x7f781773bf10>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f78177431d0>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f78181cc7d0>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f78198a9150>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f7819896990>, <tensorflow.python.keras.layers.pooling.MaxPooling2D at 0x7f781996ac50>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f781988a650>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f78144333d0>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f78144336d0>, <tensorflow.python.keras.layers.convolutional.Conv2D at 0x7f78143f8490>, <tensorflow.python.keras.layers.pooling.MaxPooling2D at 0x7f78143fdc90>]
In [8]:
type(vgg._layers)
Out[8]:
list
In [13]:
vgg._layers.pop()
Out[13]:
<tensorflow.python.keras.layers.pooling.MaxPooling2D at 0x7f5cf02f3490>
In [14]:
vgg.summary()
Model: "vgg19"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, None, None, 3)]   0
block1_conv1 (Conv2D)        (None, None, None, 64)    1792
block1_conv2 (Conv2D)        (None, None, None, 64)    36928
block1_pool (MaxPooling2D)   (None, None, None, 64)    0
block2_conv1 (Conv2D)        (None, None, None, 128)   73856
block2_conv2 (Conv2D)        (None, None, None, 128)   147584
block2_pool (MaxPooling2D)   (None, None, None, 128)   0
block3_conv1 (Conv2D)        (None, None, None, 256)   295168
block3_conv2 (Conv2D)        (None, None, None, 256)   590080
block3_conv3 (Conv2D)        (None, None, None, 256)   590080
block3_conv4 (Conv2D)        (None, None, None, 256)   590080
block3_pool (MaxPooling2D)   (None, None, None, 256)   0
block4_conv1 (Conv2D)        (None, None, None, 512)   1180160
block4_conv2 (Conv2D)        (None, None, None, 512)   2359808
block4_conv3 (Conv2D)        (None, None, None, 512)   2359808
block4_conv4 (Conv2D)        (None, None, None, 512)   2359808
block4_pool (MaxPooling2D)   (None, None, None, 512)   0
block5_conv1 (Conv2D)        (None, None, None, 512)   2359808
block5_conv2 (Conv2D)        (None, None, None, 512)   2359808
block5_conv3 (Conv2D)        (None, None, None, 512)   2359808
block5_conv4 (Conv2D)        (None, None, None, 512)   2359808
=================================================================
Total params: 20,024,384
Trainable params: 20,024,384
Non-trainable params: 0
_________________________________________________________________
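- The summary confirms that block5_pool is gone from the model's layer list. As far as I can tell, this only edits the list that summary() reads and does not rewire the model's output tensor, so wrapping the model with Model, as the node did, looks like the safer way to actually cut the network.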
- Freezing weights
Whether the whole model is trainable can be set through model.trainable, and it can reportedly also be set layer by layer through model.layers[index].trainable.
In [10]:
vgg.trainable
Out[10]:
True
In [11]:
vgg.layers[10].trainable
Out[11]:
True
In [24]:
vgg.trainable = False
print(vgg.trainable, vgg.layers[2].trainable)
False False
- Accessing an individual layer and changing its trainable flag changes only that layer.
In [34]:
print(f'vgg layer idx 2 trainable: {vgg.layers[2].trainable}')
vgg.layers[2].trainable = True
print(f'vgg layer idx 2 trainable: {vgg.layers[2].trainable}, idx 10 trainable: {vgg.layers[10].trainable}')
vgg layer idx 2 trainable: True
vgg layer idx 2 trainable: True, idx 10 trainable: False
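The per-layer flag can be put to work for fine-tuning. The sketch below keeps the model-level flag True and freezes every layer outside block5, a pattern commonly used when only the last block should be trained. The choice of block5 and the variable names are assumptions for illustration, and weights=None is used only to avoid the ImageNet download.

```python
from tensorflow.keras import applications

vgg = applications.VGG19(include_top=False, weights=None, input_shape=(None, None, 3))

# Keep the model itself trainable, then decide layer by layer:
# only layers whose name starts with "block5" stay trainable.
vgg.trainable = True
for layer in vgg.layers:
    layer.trainable = layer.name.startswith("block5")

# Layers that are trainable and actually own weights (pooling layers have none).
trainable_layer_names = [
    layer.name for layer in vgg.layers if layer.trainable and layer.weights
]
# Each Conv2D contributes a kernel and a bias tensor.
num_trainable_tensors = len(vgg.trainable_weights)

print(trainable_layer_names, num_trainable_tensors)
```

Checking vgg.trainable_weights after changing the flags is a quick way to confirm which parameters an optimizer would actually update.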