๐Ÿ” ๋กœ์ง€์Šคํ‹ฑ ํšŒ๊ท€ (Logistic Regression) ์‹ฌ์ธต ๋ณด๊ณ ์„œ

1. Overview

Logistic regression is a classic supervised learning algorithm for solving binary classification problems. Despite its name, it is a classification model, not a regression model.

Examples:

  • Whether an email is spam or not
  • Whether an image shows a cat or a dog
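
For orientation, here is a minimal end-to-end sketch using scikit-learn. The synthetic data from `make_classification` is a stand-in for real features (e.g., spam indicators), not a dataset from this report; each step is unpacked in the sections below.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data standing in for, e.g., spam features
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(clf.predict(X_test[:5]))        # hard class labels (0 or 1)
print(clf.predict_proba(X_test[:5]))  # per-class probabilities
```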

2. How It Works

2.1 Linear Combination

First, compute a linear combination of the input features:

$z = w_0 + w_1 x_1 + w_2 x_2 + \cdots + w_n x_n = \mathbf{w}^T \mathbf{x}$

2.2 Sigmoid Function

The linear combination is then passed through the sigmoid function, which maps it to a probability:

$\sigma(z) = \frac{1}{1 + e^{-z}}$

This value represents the probability that the sample belongs to class 1. Classification typically uses a threshold of 0.5:

$\hat{y} = \begin{cases} 1 & \text{if } \sigma(z) \geq 0.5 \\ 0 & \text{otherwise} \end{cases}$
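
A from-scratch sketch of these two steps in NumPy (here the bias $w_0$ is kept as a separate `b` term rather than folded into $\mathbf{w}$):

```python
import numpy as np

def sigmoid(z):
    # sigma(z) = 1 / (1 + e^{-z})
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(X, w, b):
    # z = w^T x + b for each row of X, squashed into (0, 1)
    return sigmoid(X @ w + b)

def predict(X, w, b, threshold=0.5):
    # Class 1 when the estimated probability reaches the threshold
    return (predict_proba(X, w, b) >= threshold).astype(int)
```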

3. Cost Function

The cost function the model minimizes during training is **binary cross-entropy**:

$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log(\hat{y}^{(i)}) + (1 - y^{(i)}) \log(1 - \hat{y}^{(i)}) \right]$

where:

  • $m$: number of training samples
  • $y^{(i)}$: true class label (0 or 1)
  • $\hat{y}^{(i)}$: predicted probability

Optimization is typically performed with **gradient descent**.
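
A minimal sketch of the loss and one gradient-descent update, continuing the NumPy conventions from Section 2 (for binary cross-entropy with a sigmoid output, the gradients simplify to $\frac{1}{m} X^T (\hat{y} - y)$ for the weights and the mean residual for the bias):

```python
import numpy as np

def bce_loss(y, y_hat, eps=1e-12):
    # J = -(1/m) * sum[ y log(y_hat) + (1 - y) log(1 - y_hat) ]
    y_hat = np.clip(y_hat, eps, 1 - eps)  # guard against log(0)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

def gradient_step(X, y, w, b, lr=0.1):
    # One gradient-descent update on w and b
    m = len(y)
    y_hat = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    grad_w = X.T @ (y_hat - y) / m      # dJ/dw
    grad_b = np.mean(y_hat - y)         # dJ/db
    return w - lr * grad_w, b - lr * grad_b
```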

4. Key Characteristics

| Item | Description |
| --- | --- |
| Classification type | Binary, or multiclass via One-vs-Rest (OvR) |
| Output | Class probabilities |
| Training speed | Fast |
| Interpretability | High (weights can be inspected) |
| Regularization | L1 / L2 penalties supported (see the sketch below) |
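
As noted in the last table row, both penalty types are available in scikit-learn; a brief sketch of the relevant options:

```python
from sklearn.linear_model import LogisticRegression

# L2 (ridge-style) penalty is scikit-learn's default; C is the *inverse*
# regularization strength, so smaller C means a stronger penalty
clf_l2 = LogisticRegression(penalty="l2", C=1.0)

# L1 (lasso-style) penalty requires a solver that supports it,
# e.g. liblinear or saga
clf_l1 = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
```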

5. Advantages and Disadvantages

✅ Advantages

  • Simple, intuitive, and fast
  • Interpretable (feature importance can be assessed)
  • Provides probability-based predictions

❌ Disadvantages

  • Limited to linear decision boundaries
  • Sensitive to outliers
  • Struggles to learn complex patterns

6. Multiclass Extension

  • Logistic regression is fundamentally a binary classifier
  • For multiclass problems, the One-vs-Rest (OvR) scheme is used, as shown in the sketch below
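
A minimal OvR sketch with scikit-learn (the 3-class synthetic data is assumed purely for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Synthetic 3-class data, used purely for illustration
X, y = make_classification(n_samples=300, n_features=6, n_classes=3,
                           n_informative=3, random_state=0)

# OvR fits one binary logistic regression per class and predicts
# the class whose classifier reports the highest score
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)
print(ovr.predict(X[:5]))
```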

7. Performance Evaluation Metrics

Model performance is evaluated with metrics such as the following (computed in the sketch after the list):

  • Accuracy
  • Precision
  • Recall
  • F1-score
  • ROC-AUC
  • Confusion matrix

8. Logistic Regression vs. Other Models

| Model | Characteristics |
| --- | --- |
| Logistic regression | Linear classification, easy to interpret |
| SVM | Margin-based, well suited to high-dimensional data |
| Decision tree | Handles nonlinear data, interpretable |
| KNN | Instance-based, intuitive |
| Neural network | Learns complex patterns, hard to interpret |

9. Conclusion

Logistic regression delivers fast, effective results on linear problems and offers a good balance of interpretability and predictive power. It is particularly well suited to deriving business insights and to problems that call for probability-based decision making. Before reaching for a more complex model, it is always worth trying as a baseline.
