
All Posts (80)

[ Reinforcement Learning ] 3. Finite Markov Decision Processes

This is an important chapter: it introduces the problem that the rest of the book addresses, and we regard any method that solves this problem as reinforcement learning. The chapter gives an overview of what the reinforcement learning problem is and discusses its applications. It also covers the mathematically idealized form of the problem and studies the key elements of its mathematical structure, such as the Bellman equation and value functions. 3.1. The Agent-Environment Interface As mentioned repeatedly, in reinforcement learning the agent selects actions, and the environment responds to those actions by presenting new situations to the agent and generating rewards. Over time, the agent then tries to .. that rew..
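The interaction described above can be sketched as a simple loop. This is only an illustrative sketch: `CorridorEnv`, `run_episode`, and the reward scheme are hypothetical names and choices made for this example, not something taken from the book or from any library.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: a 1-D corridor with states 0..length-1.

    The episode ends when the agent reaches the last state, which pays reward 1.
    """

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the walls clip the move
        self.state = min(max(self.state + action, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def run_episode(env, policy, max_steps=1000):
    """Run one agent-environment loop and return the total reward collected."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)                  # the agent selects an action
        state, reward, done = env.step(action)  # the environment responds
        total_reward += reward
        if done:
            break
    return total_reward


def random_policy(state):
    """Ignore the state and move left or right with equal probability."""
    return random.choice([-1, 1])


print(run_episode(CorridorEnv(), random_policy))
```

The environment only exposes `reset` and `step`, so the agent sees nothing but the new situation and the reward, which is the separation the section title refers to.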

[ Medical Imaging ] Medical Image Segmentation (5. Graph Cut Optimization)

This post reviews and summarizes the MOOC course 'Medical Image Analysis Using Computer Vision, Machine Learning, and Deep Learning'. It covers how to define a graph model and how to optimize over that model to obtain labels. First, a quick review. With N=9 nodes, as in the figure above, it is easy to compute each probability and multiply them together. But even a 100x100 image has N=10,000 nodes, and multiplying 10,000 probability values between 0 and 1 produces a result that is practically 0. To prevent this, we apply -log to the product of the likelihood probability and the prior probability. By the properties of logarithms, the log of a product splits into a sum of logs, so in the end the two Pro..
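A small numerical sketch of why the -log is needed. The per-node probability of 0.5 and the function name `negative_log_posterior` are assumptions made for illustration only.

```python
import math

n_nodes = 100 * 100          # a 100x100 image has 10,000 nodes
probs = [0.5] * n_nodes      # hypothetical per-node probability

# Multiplying the probabilities directly underflows double precision.
product = 1.0
for p in probs:
    product *= p

# Summing negative logs keeps the same information in a usable range.
neg_log = -math.fsum(math.log(p) for p in probs)


def negative_log_posterior(likelihoods, priors):
    """-log of prod(likelihood * prior), computed as a sum of logs."""
    return -math.fsum(math.log(l) + math.log(p)
                      for l, p in zip(likelihoods, priors))


print(product)   # the raw product is no longer distinguishable from zero
print(neg_log)   # a finite value that can still be compared and minimized
```

Because -log is monotonically decreasing, maximizing the product of probabilities is the same as minimizing this sum.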

[ Medical Imaging ] Medical Image Segmentation (4. Graph Cut Method)

This post reviews and summarizes the MOOC course 'Medical Image Analysis Using Computer Vision, Machine Learning, and Deep Learning'. When the graph cut method is used for a segmentation task, the image is defined as a graph model and the task is treated as the problem of assigning labels to the graph. Assume that black in the image on the right is background and white is foreground. In the graph model, the observed color values of the image are called observations. To separate the blue region from the purple one, blue must be assigned 0 and purple 1 (labeling). The upper part of the figure on the right shows the image as a graph, and what we want to find are the X values. This corresponds to P(X1 ~ X9 | Z1 ~ Z9) ..

[ Medical Imaging ] Medical Image Segmentation (3. Region Growing / Watershed Algorithm)

This post reviews and summarizes the MOOC course 'Medical Image Analysis Using Computer Vision, Machine Learning, and Deep Learning'. Region Growing The left image is the result of the thresholding method, and the right image is the result of applying morphological processing to it. The noise and holes have been removed. (The image was painted with a markup tool; it is not the output of actual processing.) With this image as input, once the user supplies a seed point, the region spreads outward from that point until it reaches pixels whose value is not 255. Only the area reached by region growing is labeled, which yields a binary map in which only the desired part is segmented. Bi..
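The procedure above can be sketched as a breadth-first flood fill. This is a minimal sketch, not the course's implementation: the function name `region_grow`, the use of 4-connectivity, and the tiny test image are all assumptions.

```python
from collections import deque


def region_grow(image, seed, target=255):
    """Grow a region from `seed` over 4-connected pixels equal to `target`.

    Returns a binary map (1 = inside the grown region, 0 = outside).
    """
    rows, cols = len(image), len(image[0])
    mask = [[0] * cols for _ in range(rows)]
    seed_row, seed_col = seed
    if image[seed_row][seed_col] != target:
        return mask  # the seed itself is not foreground, nothing to grow

    mask[seed_row][seed_col] = 1
    queue = deque([(seed_row, seed_col)])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not mask[nr][nc] and image[nr][nc] == target:
                mask[nr][nc] = 1
                queue.append((nr, nc))
    return mask


# Two separate 255-valued blobs; only the one containing the seed is labeled.
image = [
    [255, 255, 0, 0],
    [255, 0, 0, 255],
    [0, 0, 0, 255],
]
mask = region_grow(image, seed=(0, 0))
for row in mask:
    print(row)
```

The blob on the right also has value 255, but it is not connected to the seed, so it stays 0 in the binary map.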

[ Medical Imaging ] Medical Image Segmentation (2. Morphological Processing)

This post reviews and summarizes the MOOC course 'Medical Image Analysis Using Computer Vision, Machine Learning, and Deep Learning'. The image above is the thresholding result covered in the previous post. It is easy to see that it is not quite the result we want: there is noise, and there are spots inside the target region that were not segmented correctly. Morphological processing can remove the noise and fill the holes. 1. Dilation To expand the foreground marked in blue in the figure on the right, we can use dilation. Morphological processing is quite similar to convolution. First, in convolution ..
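A minimal sketch of binary dilation written as a sliding window, in the same spirit as a convolution. The square structuring element and the function name `dilate` are assumptions for illustration; the course may use a different kernel shape.

```python
def dilate(image, kernel_size=3):
    """Binary dilation with a square kernel_size x kernel_size window.

    An output pixel becomes 1 if any input pixel under the window is 1.
    Windows are clipped at the image border.
    """
    rows, cols = len(image), len(image[0])
    half = kernel_size // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            out[r][c] = int(any(
                image[nr][nc]
                for nr in range(max(r - half, 0), min(r + half + 1, rows))
                for nc in range(max(c - half, 0), min(c + half + 1, cols))
            ))
    return out


image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
dilated = dilate(image)
for row in dilated:
    print(row)
```

The single foreground pixel grows into a 3x3 block. Where a convolution multiplies and sums the values under the window, dilation takes a logical OR over them.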

[ Medical Imaging ] Medical Image Segmentation (1. Thresholding Method)

This post reviews and summarizes the MOOC course 'Medical Image Analysis Using Computer Vision, Machine Learning, and Deep Learning'. Segmentation is the task of partitioning an image, and it is used in many kinds of research, including longitudinal studies that compare images taken at different points in time. In my current project, too, skull stripping (that is, brain extraction) was performed on the 3D TOF MRA images before ROI cropping. For example, segmenting the organ or abnormal region to be examined in the axial plane matters because it can serve as a criterion when making a diagnosis. Many techniques have been proposed for this. 1. Thresholding Method..

0. First Day at Work

2021-07-05, first day at work. Department of Biomedical Systems Informatics, Yonsei University College of Medicine, TAILab https://sites.google.com/view/tailab/home?authuser=0 Translational AI Lab: Research Group for AI & Machine Learning in Medicine at Yonsei University College of Medicine, Severance Hospital. Time to learn!

[ Reinforcement Learning ] 2. Multi-arm Bandits

Part I. Tabular Solution Methods This part covers the simplest forms of reinforcement learning: cases in which the state and action spaces are small enough for the action-value function to be represented as an array or table. In such cases it is often possible to find the optimal value function and the optimal policy. This contrasts with much larger problems, where only approximate solutions can be found. The most important feature that distinguishes reinforcement learning from other kinds of learning is that it evaluates the actions taken rather than instructing by providing the correct actions. This is precisely what .. of active exploration..
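As a rough illustration of an action-value function stored as a table, the sketch below keeps one estimate per (state, action) pair and updates it as a running average of observed rewards. The sample-average update rule, the table sizes, and the name `update_action_value` are assumptions for this example; the excerpt itself only says that the values fit in an array or table.

```python
def update_action_value(q_table, counts, state, action, reward):
    """Move Q[state][action] toward the running mean of observed rewards."""
    counts[state][action] += 1
    step = 1.0 / counts[state][action]
    q_table[state][action] += step * (reward - q_table[state][action])


n_states, n_actions = 3, 2   # small enough to enumerate every pair
q_table = [[0.0] * n_actions for _ in range(n_states)]
counts = [[0] * n_actions for _ in range(n_states)]

# Hypothetical rewards observed after taking action 1 in state 0.
for reward in (1.0, 0.0, 1.0, 1.0):
    update_action_value(q_table, counts, state=0, action=1, reward=reward)

print(q_table[0][1])  # running mean of the four rewards above
```

Each reward only evaluates the action that was actually taken; nothing in the update says which action would have been correct.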

[ Reinforcement Learning ] 1. The Reinforcement Learning Problem

Learning which action maximizes the reward in a given situation (state). Learning through interaction with the environment, not from training data that comes with answers. The action selected now affects the sequence of future rewards (Delayed Reward). There is no external supervisor. [ Trade-off between Exploitation and Exploration ] To select actions that earn reward, the agent either exploits what it has already experienced or explores, interacting with the environment so that it can select better actions in the future. It has to choose whichever of the two will lead to the better outcome. Components of reinforcement learning..
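One common way to balance the two is epsilon-greedy selection, sketched below. The excerpt does not name a specific method, so epsilon-greedy is a swapped-in illustration, and `epsilon_greedy` and its parameters are hypothetical names.

```python
import random


def epsilon_greedy(action_values, epsilon, rng=random):
    """Pick an action index from a list of estimated action values.

    With probability `epsilon` explore (uniformly random action);
    otherwise exploit (action with the highest current estimate).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(action_values))
    return max(range(len(action_values)), key=action_values.__getitem__)


estimates = [0.1, 0.9, 0.3]          # hypothetical value estimates
greedy_choice = epsilon_greedy(estimates, epsilon=0.0)   # always exploits
mixed_choice = epsilon_greedy(estimates, epsilon=0.1)    # explores 10% of the time
print(greedy_choice, mixed_choice)
```

A larger `epsilon` spends more steps exploring and gathering information; a smaller one leans on what the agent already knows.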

[ Reinforcement Learning ] 0. Introduction

Reinforcement Learning An agent exploring an environment perceives the current state, takes an action, and receives a reward from the environment. A reinforcement learning algorithm is a method for finding a policy, defined as the sequence of actions that maximizes the reward the agent will accumulate from now on. The important point is that the action selected now affects the sequence of future rewards (Delayed Reward). Drawing on the book above and on Professor Park Yu-seong's book, I am studying with the goal of understanding reinforcement learning theory and implementing it with the Python OpenAI Gym library, and I will summar.. the material ..
