BERT

제이gnoej 2019. 8. 20. 17:55

Articles I read to understand BERT

1. https://medium.com/dissecting-bert/dissecting-bert-part-1-d3c3d495cdb3

Good for a high-level understanding. It shows how the input matrix gets embedded using actual example matrices, which makes it easy to follow.

Dissecting BERT Part 1: The Encoder

2. http://jalammar.github.io/illustrated-transformer/

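The internal structure is illustrated in great detail, which makes this the best one for understanding it. Note, however, that the explanation of positional embedding contains an error: sine and cosine should be applied to the even and odd dimensions of the encoding vector respectively, but the post says the first half is sine and the remaining half is cosine.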

The Illustrated Transformer
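
The even/odd layout mentioned in the note can be sketched roughly as follows. This is only a minimal plain-Python sketch of the sinusoidal encoding as defined in "Attention Is All You Need"; the function name `sinusoidal_positional_encoding` and the toy sizes are made up for illustration and do not come from the linked post.

```python
import math

def sinusoidal_positional_encoding(num_positions, d_model):
    """Build a num_positions x d_model table: sine on even dims, cosine on odd dims."""
    if d_model % 2 != 0:
        raise ValueError("this sketch assumes an even d_model")
    table = []
    for pos in range(num_positions):
        row = [0.0] * d_model
        # i walks over the even dimension indices 0, 2, 4, ...
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            row[i] = math.sin(angle)      # even dimension
            row[i + 1] = math.cos(angle)  # odd dimension shares the same frequency
        table.append(row)
    return table

# Toy sizes, chosen only for illustration.
pe = sinusoidal_positional_encoding(num_positions=3, d_model=4)
print(pe[0])  # [0.0, 1.0, 0.0, 1.0]
```

Position 0 gives sin(0) = 0 on every even dimension and cos(0) = 1 on every odd dimension, so the interleaving is visible directly in the first row.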

3. https://medium.com/@_init_/how-self-attention-with-relative-position-representations-works-28173b8c245a

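Honestly, this was the part I was most curious about. I could not understand why positional embedding matters in the Transformer architecture, or why it works, but this article made it click. It shows what goes wrong when position is not embedded: the same word produces the same output regardless of where it appears.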

How Self-Attention with Relative Position Representations works
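
The "same word, same output" problem can be reproduced with a toy example. The sketch below makes several assumptions: the embedding values are made up, the query/key/value projections are left as identity, and position is injected by adding absolute sinusoidal vectors as a stand-in, which is not the relative-position scheme the linked article describes. All names (`self_attention`, `position_vector`, `toy_embeddings`) are hypothetical.

```python
import math

def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(X):
    """Scaled dot-product self-attention with identity Q/K/V projections (a simplification)."""
    d = len(X[0])
    outputs = []
    for query in X:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in X]
        weights = softmax(scores)
        outputs.append([sum(w * value[j] for w, value in zip(weights, X)) for j in range(d)])
    return outputs

def position_vector(pos, d_model):
    """Sinusoidal position vector: sine on even dims, cosine on odd dims."""
    vec = []
    for i in range(0, d_model, 2):
        angle = pos / (10000 ** (i / d_model))
        vec += [math.sin(angle), math.cos(angle)]
    return vec

# Made-up 4-dimensional word embeddings, for illustration only.
toy_embeddings = {
    "the": [1.0, 0.0, 0.5, 0.2],
    "dog": [0.0, 1.0, 0.3, 0.7],
    "saw": [0.4, 0.4, 1.0, 0.0],
    "cat": [0.9, 0.1, 0.0, 1.0],
}
tokens = ["the", "dog", "saw", "the", "cat"]  # "the" appears at positions 0 and 3

# Without position information, both "the" tokens are indistinguishable.
X = [toy_embeddings[t] for t in tokens]
out_without_pos = self_attention(X)

# With a position vector added to each embedding, the two "the" tokens differ.
X_pos = [[x + p for x, p in zip(row, position_vector(i, 4))] for i, row in enumerate(X)]
out_with_pos = self_attention(X_pos)

print(out_without_pos[0] == out_without_pos[3])  # True
print(out_with_pos[0] == out_with_pos[3])        # False
```

Without position vectors, the two "the" tokens issue the same query against the same set of keys and values, so their outputs are identical. Once position is mixed in, the queries differ and so do the outputs.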
