SP24) Random Variable III

lol, my head is slowly starting to spin

Today's topic is the expected value, $E$. More precisely, the concept of moments!

Expected Value

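Before getting into what a moment is, let's start with the easy part. So what's an expected value~

It's what we otherwise call the mean! For a random variable $X$ with probabilities $P(X=x)$, the expected value $E[X]$ is defined as follows.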

\begin{equation}\label{eq1}
E[X] = \sum_{x\in S_x} x P(X=x) \tag{1.a}
\end{equation}

$\ref{eq1}$ is the discrete case. So what about the continuous case?!

\begin{equation}\label{eq2}
E[X] = \int_{-\infty}^{\infty} x f_X(x)dx \tag{1.b}
\end{equation}

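Here $f_X(x)$ is the probability density function (PDF). lol, every time the professor says PDF or CDF, Siri pipes up with an energetic "Yes!" all over the classroom, and I'm so used to it by now that I'd almost miss it if it stopped

So then, what about $Y=g(X)$?!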

\begin{equation}\label{eq3}
E[Y] = \int_{-\infty}^{\infty} y f_Y(y)dy= \int_{-\infty}^{\infty} g(x) f_X(x)dx = E[g(X)] \tag{1.c}
\end{equation}

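This might feel odd at first, so picture this. We roll a die $n$ times and let $X$ be the number of times a divisor of 6 comes up; each time that happens we win 500 won, and otherwise we lose 600 won. Then $Y=500X-600(n-X)=1100X-600n$, right!? So we never have to work out the distribution of $Y$ separately: we can weight $g(x)$ by the distribution of $X$ as is, since $Y$ is completely determined by $X$ anyway.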
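Here's a small sketch of the die game to check that both routes in (1.c) agree. The game is discrete, so the integrals become sums as in (1.a). The choice $n=3$ and the helper names `binomial_pmf` and `payoff` are my own assumptions for illustration, and the payoff is taken as net winnings (+500 won per hit, -600 won per miss).

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """Return {x: P(X = x)} for X ~ Binomial(n, p)."""
    return {x: comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)}


def payoff(x, n):
    """Net winnings g(x): +500 won per hit, -600 won per miss (assumed reading)."""
    return 500 * x - 600 * (n - x)


n = 3                 # number of rolls (arbitrary choice for the example)
p = Fraction(4, 6)    # faces 1, 2, 3, 6 are divisors of 6
pmf_x = binomial_pmf(n, p)

# Route 1: E[g(X)] = sum of g(x) P(X = x), using only the distribution of X.
e_lotus = sum(payoff(x, n) * px for x, px in pmf_x.items())

# Route 2: build the PMF of Y first, then E[Y] = sum of y P(Y = y).
pmf_y = {}
for x, px in pmf_x.items():
    y = payoff(x, n)
    pmf_y[y] = pmf_y.get(y, 0) + px
e_direct = sum(y * py for y, py in pmf_y.items())

print(e_lotus, e_direct)  # both routes give 400
```

Route 1 never constructs the distribution of $Y$, which is the whole point of $E[Y]=E[g(X)]$.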

Moments of Random Variable

Alright, let's speed through the stuff we already know ^^

The expected value introduced above is, so to speak, the 1st moment. The n-th moment is defined as $E[X^n]$ and written like this.

\begin{equation}\label{eq4}
E[X^n] = \int_{-\infty}^{\infty} x^n f_X(x)dx \tag{2.a}
\end{equation}

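The 3rd moment goes with skewness... and the 4th with kurtosis (strictly speaking, those are the standardized 3rd and 4th central moments), which are, um.. concepts I don't really know ^^

I pulled this from Wikipedia.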

"the Kurtosis, which indicates the degree of central 'peakedness' or, equivalently, the 'fatness' of the outer tails."

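Yep, still no idea what that means; I'll figure it out eventually, I guess ^^

And what else comes to mind when you hear mean.. variance!! Shout it, variance!! Standard deviation!!! There's an n-th version of that too. We define the variance as the mean of the squared deviation, right? Extending that, $E[(X-E[X])^n]$ is defined as the "n-th central moment".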

\begin{equation}\label{eq5}
E[(X-E[X])^n] = \int_{-\infty}^{\infty} (x-\mu_X)^n f_X(x)dx \tag{2.b}
\end{equation}
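To make (2.a) and (2.b) a bit more concrete, here's a minimal sketch for a fair die. That's a discrete example I picked myself, so the integrals turn into sums over the PMF as in (1.a). The function names `raw_moment` and `central_moment` are just illustrative.

```python
from fractions import Fraction


def raw_moment(pmf, n):
    """n-th moment E[X^n] of a discrete random variable given as {x: P(X = x)}."""
    return sum(x**n * p for x, p in pmf.items())


def central_moment(pmf, n):
    """n-th central moment E[(X - E[X])^n]."""
    mu = raw_moment(pmf, 1)
    return sum((x - mu) ** n * p for x, p in pmf.items())


die = {k: Fraction(1, 6) for k in range(1, 7)}  # fair six-sided die

mean = raw_moment(die, 1)               # 7/2
second = raw_moment(die, 2)             # 91/6
variance = central_moment(die, 2)       # 35/12
third_central = central_moment(die, 3)  # 0, since the PMF is symmetric about the mean

print(mean, second, variance, third_central)
```

The 2nd central moment matches the familiar shortcut $E[X^2]-E[X]^2 = 91/6 - 49/4 = 35/12$.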

Apparently joint moments can be defined too. Phew, what a lot of fuss

\begin{equation}\label{eq6}
M_{ij} = E[X^iY^j] = \iint_{-\infty}^{\infty} x^iy^jf(x,y)dxdy \tag{2.c}
\end{equation}

Whoa, TIL the double integral is \iint.. clever Claude3 indeed... And this one can be written as a central moment too

\begin{equation}\label{eq7}
M_{ij} = \iint_{-\infty}^{\infty} (x - \mu_X)^i(y - \mu_Y)^jf(x,y)dxdy \tag{2.d}
\end{equation}
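It bugs me a little that both are written $M_{ij}$, but I'll let it slide just this once ^^ (From here on, $M_{11}$ means the central version.)

Anyway, that's a thing. But.. that's not the end! This joint moment seems to be kind of important. It's also used to get the correlation coefficient of $X$ and $Y$: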

\begin{equation}\label{eq8}
\begin{split}
\rho_{XY} & = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sqrt{E[(X - \mu_X)^2]E[(Y - \mu_Y)^2]}} \\ &=\frac{M_{11}}{\sqrt{\sigma_X^2\sigma_Y^2}} = \frac{M_{11}}{\sigma_X\sigma_Y} \\ &= \frac{\text{Cov}(X,Y)}{\sqrt{\text{Var}(X)\text{Var}(Y)}} \end{split} \tag{2.e}
\end{equation}

Hmm, but by now you start wondering: what even is a correlation coefficient? Shouldn't we know that before computing it!? Claude, what's a correlation coefficient?

The correlation coefficient $\rho_{XY}$ is a normalized version of the covariance, taking values between -1 and 1. It measures the strength and direction of the linear relationship between two random variables $X$ and $Y$.
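Here's a rough sketch of how (2.e) plays out on a tiny joint PMF. The four probabilities below are made up purely for illustration, and `joint_central_moment` is a hypothetical helper that implements the discrete analogue of the central joint moment (2.d).

```python
from fractions import Fraction
from math import sqrt


def joint_central_moment(pmf, i, j):
    """Discrete analogue of (2.d): E[(X - mu_X)^i (Y - mu_Y)^j]."""
    mu_x = sum(x * p for (x, _), p in pmf.items())
    mu_y = sum(y * p for (_, y), p in pmf.items())
    return sum((x - mu_x) ** i * (y - mu_y) ** j * p for (x, y), p in pmf.items())


# Made-up joint PMF {(x, y): P(X = x, Y = y)} in which X and Y tend to agree.
pmf = {
    (0, 0): Fraction(2, 5),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10),
    (1, 1): Fraction(2, 5),
}

cov = joint_central_moment(pmf, 1, 1)    # M_11 = Cov(X, Y) = 3/20
var_x = joint_central_moment(pmf, 2, 0)  # Var(X) = 1/4
var_y = joint_central_moment(pmf, 0, 2)  # Var(Y) = 1/4
rho = float(cov) / sqrt(var_x * var_y)   # about 0.6

print(cov, var_x, var_y, rho)
```

Dividing by $\sigma_X\sigma_Y$ is what removes the units and pins the value inside $[-1, 1]$; here the positive sign reflects that $X$ and $Y$ mostly take the same value.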

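Notice that the numerator in $\ref{eq8}$ has the form of a covariance. lol, so what's a covariance then.. help me, Wikipedia..

Covariance is a value that describes the linear relationship between two random variables. If one variable tends to increase when the other increases, so that the two are positively linearly related, the covariance is positive. Conversely, if one variable tends to decrease when the other increases, the covariance is negative.
...
Random variables whose covariance is 0 are called uncorrelated random variables.

Oh, I see.

Let's move on for now ^^

And this covariance, i.e. $M_{11}$, ties into the uncorrelatedness mentioned above and into orthogonality. When $X$ and $Y$ are uncorrelated, $M_{11}=0$, so: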

\begin{equation}\label{eq9}
\begin{split}
M_{11} & = E[(X - \mu_X)(Y - \mu_Y)] \\ & = E[XY-\mu_YX-\mu_XY+\mu_X\mu_Y] = E[XY]-\mu_X\mu_Y = 0, \\
\therefore \quad & E[XY] = \mu_X\mu_Y = E[X]E[Y]
\end{split} \tag{2.f}
\end{equation}

If $E[XY]$ itself is 0, $X$ and $Y$ are said to be orthogonal.

Now for a breather(?): a true/false quiz

If $X$ and $Y$ are uncorrelated, they are statistically independent ? No.
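- Counterexample: let $X$ be uniform on $S_X=\{-1, 0, 1\}$ and $Y=X^2$. Then $E[X]=0$ and $E[XY]=E[X^3]=0$, so they are uncorrelated, yet they are clearly not independent, because $Y$ is completely determined by $X$.
If $X$ and $Y$ are statistically independent, they are uncorrelated ? Yes!
Ooh, nicely put~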

If $X$ and $Y$ are uncorrelated, they are orthogonal ? No.. like.. obviously..
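If $X$ and $Y$ are orthogonal, they are uncorrelated ? Not in general either. $E[XY]=0$ gives $\text{Cov}(X,Y)=-\mu_X\mu_Y$, so it only holds when at least one of the means is 0.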
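A quick numeric check of the quiz, as a sketch. All three joint PMFs below are examples I made up, and `summary` and `is_independent` are hypothetical helpers. Case A is $X$ uniform on $\{-1,0,1\}$ with $Y=X^2$, case B puts all mass on $(1,0)$ and $(0,1)$, and case C is two independent variables uniform on $\{1,2\}$.

```python
from fractions import Fraction


def expect(pmf, g):
    """E[g(X, Y)] for a joint PMF given as {(x, y): probability}."""
    return sum(g(x, y) * p for (x, y), p in pmf.items())


def summary(pmf):
    """Return E[XY] (orthogonality test) and Cov(X, Y) (uncorrelatedness test)."""
    ex = expect(pmf, lambda x, y: x)
    ey = expect(pmf, lambda x, y: y)
    exy = expect(pmf, lambda x, y: x * y)
    return {"E[XY]": exy, "cov": exy - ex * ey}


def is_independent(pmf):
    """Check P(X = x, Y = y) == P(X = x) P(Y = y) on the whole grid."""
    xs = {x for x, _ in pmf}
    ys = {y for _, y in pmf}
    px = {x: sum(p for (a, _), p in pmf.items() if a == x) for x in xs}
    py = {y: sum(p for (_, b), p in pmf.items() if b == y) for y in ys}
    return all(pmf.get((x, y), 0) == px[x] * py[y] for x in xs for y in ys)


third, half, quarter = Fraction(1, 3), Fraction(1, 2), Fraction(1, 4)

case_a = {(-1, 1): third, (0, 0): third, (1, 1): third}      # Y = X^2
case_b = {(1, 0): half, (0, 1): half}                        # XY is always 0
case_c = {(x, y): quarter for x in (1, 2) for y in (1, 2)}   # independent

for name, pmf in [("A", case_a), ("B", case_b), ("C", case_c)]:
    print(name, summary(pmf), "independent:", is_independent(pmf))
```

Case A is uncorrelated but dependent, case B is orthogonal but has covariance $-1/4$, and case C is independent (so uncorrelated) with $E[XY]=9/4\neq 0$, so not orthogonal.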

Moment Generating Function and Characteristic Function

The MGF is defined as follows.

\begin{equation} \label{eq10}
\Psi_X(t) \equiv E[e^{tX}] = \int_{-\infty}^{\infty} e^{tx} f_X(x) dx \tag{3.a}
\end{equation}

But why on earth do we define and use the MGF?! As a kid who likes applications more than theory, I can't let this one go.

Why is the MGF useful? There are basically two reasons for this. First, the MGF of $X$ gives us all moments of $X$. That is why it is called the moment generating function. Second, the MGF (if it exists) uniquely determines the distribution. That is, if two random variables have the same MGF, then they must have the same distribution. Thus, if you find the MGF of a random variable, you have indeed determined its distribution. We will see that this method is very useful when we work on sums of several independent random variables. Let's discuss these in detail.

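Hmm, so to summarize: if the MGF exists, we can get the moments straight from it (differentiate $n$ times and set $t=0$) without painfully writing out an integral for each one. So that's why it's called the moment-generating function~.~ And this seems to be the important part: if the MGF exists, it uniquely determines the probability distribution?! In other words, the MGF is another way to fully pin down the distribution~ no wonder Wikipedia was going on about "alternative" something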
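As a rough sanity check of the "moments from the MGF" claim, here's a sketch that differentiates the MGF of a fair die numerically at $t=0$. The die is my own example (discrete, so (3.a) becomes a sum), and the central finite differences with step `h` are only an approximation of the derivatives, not an exact method.

```python
from math import exp


def mgf_die(t):
    """MGF of a fair six-sided die: E[e^{tX}] = (1/6) * sum of e^{tk}, k = 1..6."""
    return sum(exp(t * k) for k in range(1, 7)) / 6


h = 1e-3  # finite-difference step (assumed; small enough for a few digits)

# First derivative at t = 0 approximates E[X] = 3.5.
first_moment = (mgf_die(h) - mgf_die(-h)) / (2 * h)

# Second derivative at t = 0 approximates E[X^2] = 91/6.
second_moment = (mgf_die(h) - 2 * mgf_die(0.0) + mgf_die(-h)) / h**2

print(first_moment, second_moment)
```

Both values agree with the direct sums $\sum_k k/6$ and $\sum_k k^2/6$ to within the finite-difference error.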

The characteristic function is defined like this.

\begin{equation}\label{eq11}
\Phi_X(\omega) \equiv E[e^{j\omega X}] = \int_{-\infty}^{\infty} e^{j\omega x} f_X(x) dx \tag{3.b}
\end{equation}

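Why write it this way? Because the characteristic function has the form of the Fourier transform of the PDF or PMF. So from a signals-and-systems viewpoint, $e^{j\omega X}$ is a kind of signal (it does show up a lot lol), and $\Phi_X(\omega)$ represents the distribution of $X$ in what the Fourier transform calls the 'frequency' domain. Wikipedia, some backup please~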

In probability theory and statistics, the characteristic function of any real-valued random variable completely defines its probability distribution. If a random variable admits a probability density function, then the characteristic function is the Fourier transform (with sign reversal) of the probability density function. Thus it provides an alternative route to analytical results compared with working directly with probability density functions or cumulative distribution functions. There are particularly simple results for the characteristic functions of distributions defined by the weighted sums of random variables.

So in the end they're all routes to knowing the probability distribution, sigh

The use of the characteristic function is almost identical to that of the moment generating function:
1. it can be used to easily derive the moments of a random variable;
2. it uniquely determines its associated probability distribution; it is often used to prove that two distributions are equal.
The CF has an important advantage over the moment generating function: while some random variables do not possess the latter, all random variables have a characteristic function.

I grabbed this from here: apparently some RVs have no MGF, but every RV has a characteristic function!? So you could say it's a bit more general(?) than the MGF.

Ugh, and then the professor wrote this super cryptic expression

\begin{equation}\label{eq12}
\begin{split}
\Phi_X(\omega) & = \int_{-\infty}^{\infty} \left(1+j\omega x+\frac{(j\omega x)^2}{2!} + \cdots \right) f_X(x) dx \\& = \int_{-\infty}^{\infty}f_X(x) dx + j\omega\int_{-\infty}^{\infty} xf_X(x) dx + \frac{-\omega^2}{2!}\int_{-\infty}^{\infty}x^2f_X(x) dx + \cdots \\
&= 1 + j\omega E[X] + \frac{-\omega^2}{2!}E[X^2] + \cdots + \frac{(j\omega)^k}{k!}E[X^k] + \cdots
\end{split}
\tag{3.c}
\end{equation}

Mm, okay, so they say..
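To see that the expansion in (3.c) really adds back up to $\Phi_X(\omega)$, here's a small sketch using a fair die again. The die, the test frequency `w = 0.3`, and the 30-term cutoff are all my own assumptions.

```python
import cmath
from math import factorial


def cf_die(w):
    """Characteristic function of a fair die: E[e^{jwX}] as a finite sum."""
    return sum(cmath.exp(1j * w * k) for k in range(1, 7)) / 6


def raw_moment(n):
    """E[X^n] for a fair die."""
    return sum(k**n for k in range(1, 7)) / 6


def cf_series(w, terms):
    """Truncated moment series from (3.c): sum of (jw)^k / k! * E[X^k]."""
    return sum((1j * w) ** k / factorial(k) * raw_moment(k) for k in range(terms))


w = 0.3
exact = cf_die(w)
approx = cf_series(w, 30)

print(exact, approx, abs(exact - approx))
```

For a bounded variable like the die the series converges quickly; for distributions whose moments grow fast (or don't exist) this term-by-term expansion may not be valid.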

And a very important property of this characteristic function! When you expand it as a Maclaurin series, the derivatives at $\omega=0$ give the n-th moments, up to a factor of $j^n$.

\begin{equation}\label{eq13}
\Phi_X(\omega) = \Phi_X(0) + \frac{d\Phi_X}{d\omega}\bigg\vert_{\omega=0}\omega+\frac{1}{2!}\frac{d^2\Phi_X}{d\omega^2}\bigg\vert_{\omega=0}\omega^2 + \cdots + \frac{1}{k!}\frac{d^k\Phi_X}{d\omega^k}\bigg\vert_{\omega=0}\omega^k + \cdots\tag{3.d}
\end{equation}

Huh? I have no idea what this is saying, so time to go ask
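One way to read it, as a sketch that assumes the moments exist and that swapping the sum and the integral in (3.c) is allowed: (3.c) and (3.d) are two power series in $\omega$ for the same function, so the coefficients of $\omega^k$ have to match.

\begin{equation}
\frac{1}{k!}\frac{d^k\Phi_X}{d\omega^k}\bigg\vert_{\omega=0} = \frac{j^k}{k!}E[X^k] \quad\Longrightarrow\quad E[X^k] = \frac{1}{j^k}\frac{d^k\Phi_X}{d\omega^k}\bigg\vert_{\omega=0} \tag{3.e}
\end{equation}

For $k=1$ this gives $E[X] = -j\,\Phi_X'(0)$, and for $k=2$ it gives $E[X^2] = -\Phi_X''(0)$. The tag (3.e) is my own numbering, not the professor's.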
