Posts in the 'All categories' category

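AI content detectors (feat. LLM watermarking, GPTZero)

Reference: https://youtu.be/-vToUx5SDW4?si=AzCDhJ51m6VI9fvZ (teacher Coffee Bean is the best) How can we tell whether a text was generated by AI? There are broadly two approaches: 1) perplexity / burstiness and 2) watermarking. With the first, you can easily build a detector as long as you have (a lot of) model-generated text, so users can build one themselves. The second slightly modifies the model's decoding process to embed a unique watermark, so only the model's developer can do it. 1. Perplexity / Burstiness GPTZero is a model built with exactly this first approach (when you copy and paste a text, whether it is model generated text ..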

NLP 2023. 11. 17. 22:05
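
The perplexity / burstiness idea from the post above can be sketched in a few lines. This is a minimal illustration, not GPTZero's actual implementation: the function names and the token log-probabilities are made up for the example, and burstiness is taken here to mean the spread of per-sentence perplexities, which is one common reading of the term. In practice the log-probabilities would come from a language model scoring the text.

```python
import math
import statistics

def perplexity(token_logprobs):
    """Perplexity = exp of the negative mean log-probability per token."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def burstiness(per_sentence_logprobs):
    """Population std-dev of the per-sentence perplexities."""
    return statistics.pstdev([perplexity(lp) for lp in per_sentence_logprobs])

# Hypothetical natural-log token probabilities, one inner list per sentence.
uniform_text = [[-1.0, -1.0, -1.0], [-1.0, -1.0, -1.0]]  # evenly predictable
varied_text = [[-0.5, -0.5], [-3.0, -3.0]]               # one surprising sentence

print("uniform:", burstiness(uniform_text))
print("varied: ", burstiness(varied_text))
```

Under this heuristic, text whose sentences are all about equally predictable scores low on burstiness, while text that mixes predictable and surprising sentences scores higher.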

FrameNet, PropBank, VerbNet

Slides summarizing the differences: Can FrameNet, PropBank, and VerbNet all be regarded as lexical databases? FrameNet: The basic idea is straightforward: that the meanings of most words can best be understood on the basis of a semantic frame: a description of a type of event, relation, or entity and the participants in it. (ref: https://www.nltk.org/howto/framenet.html) => So words that share a similar frame are grouped together, and the group's description, the group's ..

NLP / Miscellaneous 2023. 10. 4. 18:18

[Scrap] Does the batch size really have to be a power of 2?

https://wandb.ai/datenzauberai/Batch-Size-Testing/reports/Do-Batch-Sizes-Actually-Need-To-Be-Powers-of-2---VmlldzoyMDkwNDQx Do Batch Sizes Actually Need To Be Powers of 2? Is the fixation on powers of 2 for efficient GPU utilization an urban myth? In this article, we explore whether this argument is true when using today's GPUs. wandb.ai => Compares total runtime across several batch sizes. In the end, a size of 2 to the ..

Uncategorized 2023. 4. 12. 16:12

[Scrap] DDP communication types, e.g. all-reduce, scatter..

1. Communication strategies available for each DDP backend type https://pytorch.org/docs/stable/distributed.html 2. Explanation of each communication strategy (with figures) https://pytorch.org/tutorials/intermediate/dist_tuto.html

Programming 2023. 4. 11. 23:15
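
To make the collective names in the post above concrete, here is a pure-Python simulation of what each rank ends up holding. It deliberately does not use `torch.distributed`; the function names are hypothetical, and position `i` in each outer list stands for the data held by rank `i`.

```python
def all_reduce_sum(rank_values):
    """Every rank ends up with the element-wise sum over all ranks."""
    total = [sum(column) for column in zip(*rank_values)]
    return [list(total) for _ in rank_values]

def scatter(chunks_on_src):
    """The source rank holds one chunk per rank; rank i receives chunk i."""
    return [list(chunk) for chunk in chunks_on_src]

def broadcast(value_on_src, world_size):
    """Every rank receives a copy of the source rank's value."""
    return [list(value_on_src) for _ in range(world_size)]

# Hypothetical per-rank gradients for a world size of 3.
grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

reduced = all_reduce_sum(grads)
scattered = scatter([[0, 1], [2, 3], [4, 5]])
copies = broadcast([7, 8], world_size=3)

print(reduced)
print(scattered)
print(copies)
```

All-reduce is the pattern used to synchronize gradients: each rank contributes its local values and all ranks finish with the same reduced result.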

Python GIL - Why does DataParallel help less with speedup than DDP?

https://www.youtube.com/watch?v=SyIvYUgDbjA => Explained brilliantly.. worth consulting. Bottom line: Python does not really do one process, multiple threads. You can write the code that way, but it does not actually execute that way. The reason is the GIL (Global Interpreter Lock) => it comes from the fact that Python manages variable memory through reference counting. First, multiple threads within one process share variables and data. So if multi-threading were allowed, when thread 2 tries to access a variable that thread 1 has already deleted - (to say it once more, multi th..

Uncategorized 2023. 3. 31. 01:36
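
The effect described in the post above can be observed with a small timing experiment. This is a sketch under assumptions: `count_down`, `N`, and `JOBS` are names chosen for the example, and the timings depend on the machine and the interpreter. On a standard CPython build with the GIL, the threaded run usually takes about as long as the serial one for CPU-bound work; a free-threaded build may behave differently.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def count_down(n):
    """CPU-bound busy loop executed entirely in Python bytecode."""
    while n > 0:
        n -= 1
    return n

N = 1_000_000  # iterations per job
JOBS = 4       # number of jobs / threads

# Run the jobs one after another in the main thread.
start = time.perf_counter()
serial_results = [count_down(N) for _ in range(JOBS)]
serial_seconds = time.perf_counter() - start

# Run the same jobs on separate threads inside one process.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=JOBS) as pool:
    threaded_results = list(pool.map(count_down, [N] * JOBS))
threaded_seconds = time.perf_counter() - start

print(f"serial:   {serial_seconds:.2f}s")
print(f"threaded: {threaded_seconds:.2f}s")
```

This is relevant to the title because `torch.nn.DataParallel` runs in a single process with multiple threads, whereas DDP uses one process per device, so each process has its own interpreter and its own GIL.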

Accuracy goes up, but loss increases too?! A sign of overfitting

http://www.jussihuotari.com/2018/01/17/why-loss-and-accuracy-metrics-conflict/ Why Loss and Accuracy Metrics Conflict? – Jussi Huotari's Web A loss function is used to optimize a machine learning algorithm. An accuracy metric is used to measure the algorithm’s performance (accuracy) in an interpretable way. It goes against my intuition that these two sometimes conflict: loss is getting bett www...

Deep learning (general) 2023. 2. 9. 22:45
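
A small numeric example shows how accuracy and cross-entropy loss can move in opposite directions, as the post above discusses. The probabilities are invented for illustration: each number is the probability a hypothetical binary classifier assigns to the true class of one validation example at two different epochs.

```python
import math

def accuracy(true_class_probs, threshold=0.5):
    """Fraction of examples whose true class gets probability above threshold."""
    return sum(p > threshold for p in true_class_probs) / len(true_class_probs)

def cross_entropy(true_class_probs):
    """Mean negative log-probability assigned to the true class."""
    return -sum(math.log(p) for p in true_class_probs) / len(true_class_probs)

# Hypothetical predictions on the same five validation examples.
epoch_a = [0.60, 0.60, 0.60, 0.45, 0.45]  # cautious: 3 of 5 correct
epoch_b = [0.95, 0.95, 0.95, 0.90, 0.01]  # confident: 4 of 5 correct, 1 badly wrong

for name, probs in [("epoch A", epoch_a), ("epoch B", epoch_b)]:
    print(f"{name}: accuracy={accuracy(probs):.2f}, loss={cross_entropy(probs):.3f}")
```

Accuracy only counts which side of the threshold each prediction falls on, while the loss also penalizes how confident a wrong prediction is. One very confident mistake in epoch B is enough to raise the mean loss even though more examples are classified correctly.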

Calculus on Computational Graphs: Backpropagation

https://colah.github.io/posts/2015-08-Backprop/

Uncategorized 2023. 1. 26. 21:42

Input normalization / Batch normalization / Layer normalization

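1. Input normalization is needed for fast learning. Suppose x1 takes values from 1 to 1000 and x2 takes values from 0 to 1. The weights corresponding to each input feature then also end up on different scales. As a result, the cost function takes a shape like the one on the left, very narrow (elongated) in one direction and wide in the other, and in that case you have to make the learning rate very small and take a great many steps before reaching the optimum (see the lower-left image above). That is why applying input normalization to align the scales of the two input features gives the shape at the lower right, and in the end i..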

NLP 2023. 1. 26. 21:33
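
The input normalization step from the post above can be sketched as standardizing each feature to zero mean and unit variance. This is one common choice rather than necessarily the one the post uses; the `standardize` helper and the sample values are made up to mirror the x1 (1 to 1000) and x2 (0 to 1) example.

```python
import statistics

def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

# Hypothetical samples of two features on very different scales.
x1 = [1.0, 250.0, 500.0, 750.0, 1000.0]
x2 = [0.0, 0.25, 0.5, 0.75, 1.0]

x1_scaled = standardize(x1)
x2_scaled = standardize(x2)

print([round(v, 3) for v in x1_scaled])
print([round(v, 3) for v in x2_scaled])
```

After scaling, both features span a comparable range, so a single learning rate suits both weight directions.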

Naive Bayes: Generative Learning Algorithm (↔ Discriminative)

"This post summarizes the Naive Bayes Classifier lecture from Andrew Ng's Machine learning class." Discriminative model vs Generative model Logistic regression is said to be a typical example of a discriminative model. It models $P(y|x)$, that is, given the features $x$ it directly estimates the probability of $y$. As in the figure below, the goal in binary classification is to find the green line that separates the two classes. A classifier can also be built with a generative model, but the approach is a bit different. Formally speaking, with a generative model, $P(y|x)$ ..

Uncategorized 2022. 10. 28. 17:20
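
For reference, the standard textbook formulation (stated here independently of the truncated excerpt above): a generative classifier models $P(x|y)$ and $P(y)$, then obtains the posterior through Bayes' rule. Naive Bayes additionally assumes the $n$ features $x_1, \dots, x_n$ are conditionally independent given $y$.

$$P(y|x) = \frac{P(x|y)\,P(y)}{P(x)}, \qquad P(x|y) = \prod_{j=1}^{n} P(x_j|y)$$

Since $P(x)$ does not depend on $y$, the predicted class is $\arg\max_y P(x|y)\,P(y)$.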

Dependency parsing

"This post summarizes Chapter 14, Dependency Parsing, of Professor Dan Jurafsky's Speech and Language Processing, and all images are taken from the book." Throughout this post, each child node is assumed to have exactly one parent node. 1. Transition-based approach In the transition-based approach, a parser (= predictor / oracle) predicts a sequence of operators (left arc operator, right arc operator, shift operator, reduce operator). Looking at each step, the current ste..

NLP / Miscellaneous 2022. 6. 8. 18:16
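
The four operators named in the post above match the arc-eager transition system, which can be sketched as a small state machine over a stack and a buffer. This is a simplified illustration: the `parse` function is hypothetical, a hand-written transition sequence stands in for the trained predictor, dependency labels are omitted, and the usual preconditions (for example, REDUCE only when the stack top already has a head) are not checked.

```python
def parse(words, transitions):
    """Apply arc-eager transitions and return (head, dependent) arcs."""
    stack = ["ROOT"]
    buffer = list(words)
    arcs = []
    for op in transitions:
        if op == "SHIFT":
            # Move the front of the buffer onto the stack.
            stack.append(buffer.pop(0))
        elif op == "LEFT-ARC":
            # The front of the buffer becomes the head of the stack top,
            # and the stack top is popped.
            arcs.append((buffer[0], stack.pop()))
        elif op == "RIGHT-ARC":
            # The stack top becomes the head of the front of the buffer,
            # which is then pushed onto the stack.
            arcs.append((stack[-1], buffer[0]))
            stack.append(buffer.pop(0))
        elif op == "REDUCE":
            # Pop a stack top that already has its head.
            stack.pop()
        else:
            raise ValueError(f"unknown transition: {op}")
    return arcs

# Hand-written sequence standing in for the predictor's output.
sentence = ["She", "eats", "fish"]
sequence = ["SHIFT", "LEFT-ARC", "RIGHT-ARC", "RIGHT-ARC", "REDUCE", "REDUCE"]
arcs = parse(sentence, sequence)
print(arcs)
```

Every word ends up as the dependent of exactly one arc, which is consistent with the single-parent assumption stated in the post.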

Studying Jay's Blog (공부하는 제이의 블로그)
