
Variable Scope Study Notes

by 행뱁 2019. 8. 13.

Source read: https://tensorflowkorea.gitbooks.io/tensorflow-kr/content/g3doc/how_tos/variable_scope/

Key functions

- tf.get_variable(<name>, <shape>, <initializer>): creates or returns the variable with the given name

- tf.variable_scope(<scope_name>): manages the namespace for names passed to tf.get_variable() (used with a with statement)

- tf.get_variable_scope(): returns the current variable scope

 

tf.get_variable()

- Unlike tf.Variable(), it does not take a value directly; it takes an initializer

tf.constant_initializer(value): initializes everything to the given value
tf.random_uniform_initializer(a, b): initializes uniformly from [a, b]
tf.random_normal_initializer(mean, stddev): initializes from a normal distribution with the given mean and stddev
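
One way to picture the difference from tf.Variable(): an initializer is a recipe that produces values once a shape is known, not a value itself. The sketch below is a pure-Python toy model of the three initializers listed above, not TensorFlow's implementation. The flat `size` argument, the `seed` parameter, and the use of Python's random module are assumptions made for illustration.

```python
import random


def constant_initializer(value):
    """Toy recipe: every entry gets the same value."""
    def init(size):
        return [value] * size
    return init


def random_uniform_initializer(a, b, seed=None):
    """Toy recipe: every entry is drawn uniformly between a and b."""
    rng = random.Random(seed)

    def init(size):
        return [rng.uniform(a, b) for _ in range(size)]
    return init


def random_normal_initializer(mean=0.0, stddev=1.0, seed=None):
    """Toy recipe: every entry is drawn from a normal distribution."""
    rng = random.Random(seed)

    def init(size):
        return [rng.gauss(mean, stddev) for _ in range(size)]
    return init


# The recipe is only turned into values when a size is supplied.
zeros = constant_initializer(0.0)(3)
uniform = random_uniform_initializer(-1.0, 1.0, seed=0)(1000)
normal = random_normal_initializer(mean=5.0, stddev=0.1, seed=0)(1000)
```

Because the recipe is separate from the values, the same initializer can be handed to many variables of different shapes.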

 

- Depending on the reuse flag of the current variable scope, it either 1) creates a variable or 2) returns an existing one

1) When tf.get_variable_scope().reuse == False
- The full variable name is 'current variable scope name + given name', and TensorFlow checks whether a variable with that name already exists
- If such a variable exists, a ValueError is raised
- If no such variable exists, the variable is created

2) When tf.get_variable_scope().reuse == True
- The full variable name is 'current variable scope name + given name', and TensorFlow checks whether a variable with that name already exists
- If such a variable exists, that variable is returned
- If no such variable exists, a ValueError is raised
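
The two cases above can be sketched as a small dictionary-backed store. This is a toy model under stated assumptions, not TensorFlow's code: ToyVariableStore, its method signature, and the error messages are hypothetical, and the scope name and reuse flag are passed in explicitly instead of being read from the current scope.

```python
class ToyVariableStore:
    """Toy model of the create-or-return rule (not TensorFlow's implementation)."""

    def __init__(self):
        self._vars = {}

    def get_variable(self, scope_name, name, shape, reuse):
        # Full name = current scope name + given name.
        full_name = f"{scope_name}/{name}" if scope_name else name
        exists = full_name in self._vars

        if reuse:
            # reuse == True: the variable must already exist.
            if not exists:
                raise ValueError(f"Variable {full_name} does not exist")
            return self._vars[full_name]

        # reuse == False: the variable must not exist yet.
        if exists:
            raise ValueError(f"Variable {full_name} already exists")
        self._vars[full_name] = {"name": full_name, "shape": list(shape)}
        return self._vars[full_name]


store = ToyVariableStore()
v = store.get_variable("foo", "v", [1], reuse=False)       # created
v_again = store.get_variable("foo", "v", [1], reuse=True)  # returned

# Both remaining combinations are errors.
errors = []
for scope, name, reuse in [("foo", "v", False), ("foo", "missing", True)]:
    try:
        store.get_variable(scope, name, [1], reuse=reuse)
    except ValueError as exc:
        errors.append(str(exc))
```

Of the four combinations of "exists or not" and "reuse or not", only two succeed, so an accidental duplicate or an accidental fresh variable always surfaces as an error.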

 

tf.variable_scope()

- Calling tf.get_variable_scope().reuse_variables() sets the reuse flag of the current variable scope to True (the reuse flag indicates whether the variables in that scope are reused, i.e. shared)

- However, the reuse flag cannot be set back to False

Reason:
This restriction makes it possible to compose functions that create models. Suppose you write a function such as the my_image_filter(inputs) function in the examples below. Someone who calls that function inside a variable scope with reuse=True expects every internal variable to be reused. If reuse=False could be forced inside the function, that contract would be broken, and sharing parameters this way would become difficult.

 

- If the parent variable scope's reuse flag is True, the child variable scope's reuse flag is also True (reuse is inherited by sub-scopes)

- Exiting a variable scope whose reuse flag is True returns you to the enclosing variable scope, whose reuse flag can still be False

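import tensorflow as tf  # TensorFlow 1.x API

with tf.variable_scope("root"):
    # At the start, the scope is not reusing.
    assert tf.get_variable_scope().reuse == False

    with tf.variable_scope("foo"):
        # A newly opened sub-scope is still not reusing.
        assert tf.get_variable_scope().reuse == False

    with tf.variable_scope("foo", reuse=True):
        # Explicitly opened a reusing scope.
        assert tf.get_variable_scope().reuse == True

        with tf.variable_scope("bar"):
            # The sub-scope inherits the reuse flag.
            assert tf.get_variable_scope().reuse == True

    # Exited the reusing scope; back in a non-reusing one.
    assert tf.get_variable_scope().reuse == False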

 

- With with ... as ..., you can capture a variable scope object and re-enter it later

with tf.variable_scope("foo") as foo_scope:
    assert foo_scope.name == "foo"

with tf.variable_scope("bar"):
    with tf.variable_scope("baz") as other_scope:
        assert other_scope.name == "bar/baz"
        
        # Note: passing the captured scope object re-enters "foo" itself,
        # so the name is not "bar/baz/foo".
        with tf.variable_scope(foo_scope) as foo_scope2:
            assert foo_scope2.name == "foo"

 

- An initializer set on the variable scope itself becomes the default initializer for every variable created under it (an initializer passed explicitly to tf.get_variable() overrides it)

# Note: v.eval() needs a default session and initialized variables.
with tf.variable_scope("foo", initializer=tf.constant_initializer(0.4)):
    v = tf.get_variable("v", [1])
    assert v.eval() == 0.4  # Default initializer set on the scope

    w = tf.get_variable("w", [1], initializer=tf.constant_initializer(0.3))
    assert w.eval() == 0.3  # Explicit initializer overrides the default

    with tf.variable_scope("bar"):
        v = tf.get_variable("v", [1])
        assert v.eval() == 0.4  # Default initializer inherited from "foo"

    with tf.variable_scope("baz", initializer=tf.constant_initializer(0.2)):
        v = tf.get_variable("v", [1])
        assert v.eval() == 0.2  # Default initializer changed for "baz"
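
The lookup order in this snippet (explicit initializer first, then the nearest enclosing scope that defines one) can be sketched with parent pointers. This is a toy model, not TensorFlow's implementation: ToyScope, get_value, and constant are hypothetical names, and the toy raises an error when no initializer is found, whereas TensorFlow has its own fallback for that case.

```python
class ToyScope:
    """Toy scope with a parent pointer and an optional default initializer."""

    def __init__(self, parent=None, initializer=None):
        self.parent = parent
        self.initializer = initializer

    def default_initializer(self):
        # Walk outward until an enclosing scope defines an initializer.
        scope = self
        while scope is not None:
            if scope.initializer is not None:
                return scope.initializer
            scope = scope.parent
        return None

    def get_value(self, initializer=None):
        # An explicit initializer wins over the scope default.
        chosen = initializer if initializer is not None else self.default_initializer()
        if chosen is None:
            # Toy simplification: no fallback initializer is modeled.
            raise ValueError("no initializer available in this toy model")
        return chosen()


def constant(value):
    """Toy initializer that always produces the same value."""
    return lambda: value


foo = ToyScope(initializer=constant(0.4))
bar = ToyScope(parent=foo)                             # inherits 0.4
baz = ToyScope(parent=foo, initializer=constant(0.2))  # changes the default

values = [
    foo.get_value(),               # scope default
    foo.get_value(constant(0.3)),  # explicit override
    bar.get_value(),               # inherited default
    baz.get_value(),               # changed default
]
```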

Examples

Incorrect example

def my_image_filter(input_images):
    # conv1
    conv1_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
                                name="conv1_weights")
    conv1_biases = tf.Variable(tf.zeros([32]), name="conv1_biases")
    
    conv1 = tf.nn.conv2d(input_images, conv1_weights,
                         strides=[1, 1, 1, 1], padding='SAME')
    relu1 = tf.nn.relu(conv1 + conv1_biases)
    
    # conv2
    conv2_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
                                name="conv2_weights")
    conv2_biases = tf.Variable(tf.zeros([32]), name="conv2_biases")
    
    conv2 = tf.nn.conv2d(relu1, conv2_weights,
                         strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv2 + conv2_biases)

# Apply my_image_filter to two images
result1 = my_image_filter(image1)
result2 = my_image_filter(image2)

➔ Why this is wrong: image1 and image2 should go through my_image_filter with the same set of variables, but each call creates its own set (no error is raised, but the behavior is not what was intended)
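
To make the problem concrete, here is a toy model of a graph that, like tf.Variable(), always creates a fresh variable. ToyGraph and toy_image_filter are hypothetical names, and the numeric-suffix scheme for repeated names is an assumption made for illustration.

```python
class ToyGraph:
    """Toy stand-in for a graph in which every request creates a new variable."""

    def __init__(self):
        self.variables = []
        self._name_counts = {}

    def variable(self, name):
        # A repeated name is made unique with a numeric suffix (assumed scheme),
        # so a second call never collides with, or reuses, the first.
        count = self._name_counts.get(name, 0)
        self._name_counts[name] = count + 1
        unique = name if count == 0 else f"{name}_{count}"
        self.variables.append(unique)
        return unique


def toy_image_filter(graph):
    """Create the four variables that one call of the filter needs."""
    names = ("conv1_weights", "conv1_biases", "conv2_weights", "conv2_biases")
    return [graph.variable(n) for n in names]


graph = ToyGraph()
first = toy_image_filter(graph)   # variables for image1
second = toy_image_filter(graph)  # a second, separate set for image2
```

After two calls the toy graph holds eight variables instead of four, and the two calls have no variable in common.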

 

Correct example

def conv_relu(input, kernel_shape, bias_shape):
    # Create variable named "weights"
    weights = tf.get_variable("weights", kernel_shape,
                              initializer=tf.random_normal_initializer())
    # Create variable named "biases"
    biases = tf.get_variable("biases", bias_shape,
                             initializer=tf.constant_initializer(0.0))
    
    conv = tf.nn.conv2d(input, weights, strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv + biases)

def my_image_filter(input_images):
    with tf.variable_scope("conv1"):
        # Variables created here will be named "conv1/weights", "conv1/biases"
        relu1 = conv_relu(input_images, [5, 5, 32, 32], [32])
        
    with tf.variable_scope("conv2"):
        # Variables created here will be named "conv2/weights", "conv2/biases"
        return conv_relu(relu1, [5, 5, 32, 32], [32])
        
# The second call raises ValueError (conv1/weights already exists)
result1 = my_image_filter(image1)
result2 = my_image_filter(image2)

# Use this instead
with tf.variable_scope("image_filters") as scope:
    result1 = my_image_filter(image1)
    scope.reuse_variables()
    result2 = my_image_filter(image2)

The structure of my_image_filter is conv1 + relu1 + conv2 + relu2. Since that amounts to two (conv + relu) pairs, the pair is factored out into the conv_relu() function. Inside conv_relu(), the variables named weights and biases are obtained with tf.get_variable(). However, conv1 and conv2 need their own weights and biases, so tf.variable_scope() is used to give them separate names: conv1/weights, conv1/biases, conv2/weights, and conv2/biases.

When my_image_filter is applied to the two images image1 and image2, conv1/weights, conv1/biases, conv2/weights, and conv2/biases must be shared across both images. Simply calling my_image_filter twice with that intent raises ValueError(... conv1/weights already exists ...). This check exists to catch variables that would otherwise be shared by accident. To share variables, you have to state explicitly that they will be reused by calling reuse_variables().

 

 
