[ 딥러닝 ] LSTM

Updated: February 13, 2022

Long Short Term Memory (LSTM)

LSTM 구조.jpg

Untitled

LSTM cell 구조.png

RNN의 hidden state에 cell state를 추가한 구조
- cell state : 유용한 정보만 저장한다.
입력
- input
- previous cell state
- previous hidden state (output)
출력
- output (혹은 hidden state)
- next cell state
- next hidden state

Forget Gate

Untitled

\[f_t = \sigma(W_f \cdot[h_{t-1},x_t]) = \sigma(W_{xh\_f}x_t+W_{hh\_f}h_{t-1})\]

$\sigma$ : sigmoid function
어떤 정보를 cell state에서 제거할 것인가?

Input Gate

Untitled

\[\begin{align*} &i_t \odot \tilde{C}_t \\ & i_t = \sigma(W_i \cdot[h_{t-1},x_t]) = \sigma(W_{xh\_i}x_t+W_{hh\_i}h_{t-1}) \\ & \tilde{C}_t = tanh(W_C \cdot[h_{t-1},x_t]) = tanh(W_{xh\_\tilde{C}}x_t+W_{hh\_\tilde{C}}h_{t-1}) \end{align*}\]

어떤 정보를 cell state에서 더해줄 것인가?
$i_t$ : 현재 정보를 cell state에 올릴지 말지 결정
$\tilde{C}_t$ : 현재 정보와 이전 output으로 얻어지는 cell state candidate

Update Cell

Untitled

\[c_t=f_t \odot c_{t-1}+i_t \odot \tilde{C}_t\]

forget gate와 input gate를 통과한 결과들을 가지고 cell state를 업데이트한다.
cell state에 대한 정보는 cell 밖으로 나가지 않는다.

Output Gate

Untitled

\[\begin{align*} & o_t = \sigma(W_{xh\_o}x_t+W_{hh\_o}h_{t-1}) \\ & h_t = o_t \odot tanh(c_t) \end{align*}\]

어떤 정보를 읽을 것인가?
업데이트 된 cell state를 기반으로 결과를 출력시킨다.

Twitter Facebook LinkedIn

JeongBeen Seo

[ 딥러닝 ] LSTM

Long Short Term Memory (LSTM)

Forget Gate

Input Gate

Update Cell

Output Gate

You May Also Enjoy

[ 프로그래머스 ] 올바른 괄호 (Python)

[ 프로그래머스 ] 같은 숫자는 싫어 (Python)

[ 데이터구조 ] DFS, BFS

[ 백준 ] 11729번 - 하노이 탑 이동 순서 (Python)