Winter 2024 Midterm Exam 1

This exam was administered in person. Students had 50 minutes to take this exam.

Problem 1

Consider a dataset D with 5 data points \{7,5,1,2,a\}, where a is a positive real number. Note that a is not necessarily an integer.

Problem 1.1

Express the mean of D as a function of a, simplifying the expression as much as possible.

\text{Mean($D$)} = \frac{a}{5} + 3

Problem 1.2

Depending on the range of a, the median of D could assume one of three possible values. Write out all possible medians of D along with the corresponding range of a for each case. Express the ranges using double inequalities, e.g., 3<a\leq8:

\begin{cases} \text{Median($D$)} = 2 & \text{if a is in the range of } 0<a\leq2 \\ \text{Median($D$)} = a & \text{if a is in the range of } 2<a\leq5 \\ \text{Median($D$)} = 5 & \text{if a is in the range of } 5<a<\infty \end{cases}
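The three cases can be spot-checked numerically. The following is a minimal sketch, not part of the original exam; the helper name `median_of_d` and the sample values of a are chosen here for illustration only.

```python
from statistics import median

def median_of_d(a):
    """Median of the dataset {7, 5, 1, 2, a}."""
    return median([7, 5, 1, 2, a])

# One sample value of a from each range, plus the boundary values 2 and 5.
for a in [0.5, 2, 3.5, 5, 8]:
    print(a, median_of_d(a))
```

The boundary values a = 2 and a = 5 give the same median under either neighboring case, so it does not matter which case they are assigned to.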

Problem 1.3

Determine the range of a that satisfies \text{Mean}(D) < \text{Median}(D). Make sure to show your work.

\dfrac{15}{4}<a<10

Since there are 3 possible median values, we will have to discuss each situation separately.

In case 1, when 0<a\leq2, \text{Median}(D) = 2. So, we have:

\begin{align*} \text{Mean}(D) &< \text{Median}(D)\\ 3 + \frac{a}{5} &< 2\\ a&<-5 \end{align*}

But a<-5 conflicts with the condition 0<a\leq2, so there is no solution in this case: the inequality cannot hold when Median(D) = 2.

In case 2, when 2<a\leq5, \text{Median}(D) = a. So, we have:

\begin{align*} \text{Mean}(D) &< \text{Median}(D)\\ 3 + \frac{a}{5} &< a\\ 3 &< \frac{4}{5} a\\ a &> \frac{15}{4} \end{align*}

So a has to be larger than \frac{15}{4}. But recall that this case requires 2<a\leq5.

To satisfy both conditions, we must have \frac{15}{4}<a\leq5.

In case 3, when a>5, \text{Median}(D) = 5. So, we have:

\begin{align*} \text{Mean}(D) &< \text{Median}(D)\\ 3 + \frac{a}{5} &< 5\\ a&<10 \end{align*}

Combining this with the condition for this case, we have 5<a<10.

Combining the range of all three cases, we have \dfrac{15}{4}<a<10 as our final answer.
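As a sanity check on the final range, one can scan a grid of values of a and record where the inequality holds. This is only an illustrative sketch: the grid spacing of 0.01, the upper limit of 12, and the function name `mean_less_than_median` are assumptions made here, and exact fractions are used to avoid rounding at the boundaries.

```python
from fractions import Fraction
from statistics import median

def mean_less_than_median(a):
    """True when Mean(D) < Median(D) for D = {7, 5, 1, 2, a}."""
    data = [7, 5, 1, 2, a]
    return sum(data) / len(data) < median(data)

# Scan a = 0.01, 0.02, ..., 12.00 using exact fractions.
grid = [Fraction(k, 100) for k in range(1, 1201)]
satisfying = [a for a in grid if mean_less_than_median(a)]

# Smallest and largest grid points where the inequality holds.
print(float(min(satisfying)), float(max(satisfying)))
```

The smallest and largest grid points that satisfy the inequality sit just inside 15/4 and 10, which agrees with the strict inequalities in the answer.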

Problem 2

Let R_{sq}(h) represent the mean squared error of a constant prediction h for a given dataset. For the dataset \{3, y_{1}\}, the graph of R_{sq}(h) has its minimum at the point (5,r_{1}). Find the values of y_{1} and r_{1}.

y_1 = 7, r_1 = 4

Indexing the n data points as y_0, \ldots, y_{n-1}, the mean squared error is written as: \begin{align*} R_{sq}(h) = \frac{1}{n}\sum_{i=0}^{n-1}(y_{i}-h)^2 \end{align*}

Since we only have two data points (n=2), the equation simplifies to:

\begin{align*} R_{sq}(h) = \frac{1}{2}((y_{0}-h)^2+ (y_{1}-h)^2) \end{align*}

Taking the derivative with respect to h, we have: \begin{align*} \frac{dR_{sq}(h)}{dh} = -(y_{0}-h)- (y_{1}-h) \end{align*}

We know that the derivative has to be 0 at the minimum, and y_0 = 3, therefore at h=5, we have:

\begin{align*} \frac{dR_{sq}(h)}{dh} = -(3-5)- (y_{1}-5) &= 0\\ 2-y_1+5 &=0\\ y_1 &= 7 \end{align*}

So we know that the dataset is \{3,7\}. Given this information, we can calculate r_1 with:

\begin{align*} R_{sq}(5) &= \frac{1}{2}((y_{0}-5)^2+ (y_{1}-5)^2)\\ &=\frac{1}{2}((3-5)^2+ (7-5)^2)\\ &=\frac{1}{2}(4+4)=4 \end{align*}
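The result can be checked numerically with a short sketch. The function name `mse` is chosen here for illustration; the check relies on the fact, consistent with the derivative computed above, that the mean of the data minimizes mean squared error.

```python
def mse(h, data):
    """Mean squared error of the constant prediction h."""
    return sum((y - h) ** 2 for y in data) / len(data)

data = [3, 7]                     # y_0 = 3 and the solved value y_1 = 7
h_star = sum(data) / len(data)    # the mean minimizes mean squared error
r1 = mse(h_star, data)

print(h_star, r1)
# Moving h slightly away from h_star in either direction increases the error.
print(mse(h_star - 0.1, data) > r1, mse(h_star + 0.1, data) > r1)
```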

Problem 3

The hyperbolic cosine function is defined as cosh(x) = \frac{1}{2}(e^{x} + e^{-x}). In this problem, we aim to prove the convexity of this function using power series expansion.

Problem 3.1

Take the second derivative of f(x) = x^{n}:

\begin{align*} f'(x) &= nx^{n-1}\\ f''(x) &= n(n-1)x^{n-2} \end{align*}

If n is a positive even integer, then n-2 is also even, so x^{n-2} \geq 0 for every x, and n(n-1) > 0. Therefore f''(x) = n(n-1)x^{n-2} \geq 0 for all x. This means the second derivative of f(x) is never negative, so f passes the second derivative test and is convex.

Problem 3.2

Power series expansion is a powerful tool to analyze complicated functions. In power series expansion, a function can be written as an infinite sum of polynomial functions with certain coefficients. For example, the exponential function can be written as: \begin{align*} e^{x} = \sum_{n=0}^{\infty}\frac{x^{n}}{n!} = 1 + x + \frac{x^{2}}{2} + \frac{x^{3}}{6} + \frac{x^{4}}{24} + ... \end{align*}

where n! denotes the factorial of n, defined as the product of all positive integers up to n, i.e. n! = 1\cdot 2\cdot 3\cdot ... \cdot (n-1)\cdot n. Given the power series expansion of e^{x} above, write the power series expansion of e^{-x} and explicitly specify the first 5 terms, i.e., similar to the format of the equation above.

By plugging -x in for each x, we get:

e^{-x} = \displaystyle\sum_{n=0}^{\infty}\frac{(-x)^{n}}{n!}=1-x+\frac{x^{2}}{2} - \frac{x^{3}}{6}+\frac{x^{4}}{24}+ ...
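A partial sum of this series can be compared against the library value of e^{-x}. This is a minimal sketch; the function name `exp_neg_series`, the truncation at 20 terms, and the test point x = 1.5 are arbitrary choices made here.

```python
import math

def exp_neg_series(x, terms=20):
    """Partial sum of sum_{n>=0} (-x)^n / n!, truncated after `terms` terms."""
    return sum((-x) ** n / math.factorial(n) for n in range(terms))

x = 1.5
print(exp_neg_series(x), math.exp(-x))
```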

Problem 3.3

Using the conclusions you reached in part (a) and part (b), prove that cosh(x) = \frac{1}{2}(e^{x} + e^{-x}) is convex.

Given that:

\begin{align*} e^{x} &= \sum_{n=0}^{\infty}\frac{x^{n}}{n!} = 1 + x + \frac{x^{2}}{2} + \frac{x^{3}}{6} + \frac{x^{4}}{24} + ....\\ e^{-x} &= \sum_{n=0}^{\infty}\frac{(-x)^{n}}{n!} = 1 - x + \frac{x^{2}}{2} - \frac{x^{3}}{6} + \frac{x^{4}}{24} + .... \end{align*}

We can add their power series expansion together, and we will obtain:

\begin{align*} e^{x} + e^{-x} &= \sum_{n=0}^{\infty}\frac{x^{n}}{n!} + \sum_{n=0}^{\infty}\frac{(-x)^{n}}{n!}\\ &=\sum_{n=0}^{\infty}\frac{(x)^{n} + (-x)^{n}}{n!} \end{align*}

Within this infinite sum, if n is even, then the negative sign in (-x)^{n} will disappear; if n is odd, then the negative sign in (-x)^{n} will be kept and travel out of the parenthesis. Therefore we have:

\begin{align*} e^{x} + e^{-x} &= \sum_{n=0}^{\infty}\frac{x^{n}+x^{n}}{n!} \mathrm{(for\; even\; n)} + \sum_{n=0}^{\infty}\frac{x^{n}-x^{n}}{n!}\mathrm{(for\; odd\; n)}\\ &=\sum_{n=0}^{\infty}\frac{2x^{n}}{n!} \mathrm{(for\; even\; n)} \end{align*}

Therefore, cosh(x)=\displaystyle\frac{e^{x}+e^{-x}}{2} is a sum of terms \frac{x^{n}}{n!}, where n is even and every coefficient \frac{1}{n!} is positive. Since we have already proved in part (a) that x^{n} is convex for even n, and a sum of convex functions with non-negative coefficients is convex, cosh(x) is also convex.
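The even-power series can be checked against the library value of cosh, and the convexity claim can be spot-checked with the midpoint inequality. This is a sketch under assumptions: the name `cosh_series`, the truncation at 15 terms, and the sample points are chosen here, and a midpoint check at a few points illustrates convexity but does not prove it.

```python
import math

def cosh_series(x, terms=15):
    """Partial sum of sum over even n of x^n / n!, written with n = 2k."""
    return sum(x ** (2 * k) / math.factorial(2 * k) for k in range(terms))

x = 2.0
print(cosh_series(x), math.cosh(x))

# Midpoint inequality for a convex function: f((u+v)/2) <= (f(u) + f(v)) / 2.
pairs = [(-3.0, 1.0), (0.5, 2.5), (-1.0, 1.0)]
midpoint_ok = all(
    math.cosh((u + v) / 2) <= (math.cosh(u) + math.cosh(v)) / 2
    for u, v in pairs
)
print(midpoint_ok)
```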

Problem 4

Note that we have two simplified closed form expressions for the estimated slope w in simple linear regression that you have already seen in discussions and lectures:

\begin{align*} w &= \frac{\sum_i (x_i - \overline{x}) y_i}{\sum_i (x_i - \overline{x})^2} \tag{1} \\ w &= \frac{\sum_i (y_i - \overline{y}) x_i }{\sum_i (x_i - \overline{x})^2} \tag{2} \end{align*}

where we have dataset D = [(x_1,y_1), \ldots, (x_n,y_n)] and sample means \overline{x} = {1 \over n} \sum_{i} x_i, \quad \overline{y} = {1 \over n} \sum_{i} y_i. Unless stated otherwise, \sum_i means \sum_{i=1}^n.

Problem 4.1

Are (1) and (2) equivalent? That is, is the following equality true? Prove or disprove it. \sum_i (x_i - \overline{x}) y_i = \sum_i (y_i - \overline{y}) x_i

True. \begin{align*} & \sum_i (x_i - \overline{x}) y_i = \sum_i (y_i - \overline{y}) x_i \\ & \Leftrightarrow \sum_i x_i y_i - \overline{x} \sum_i y_i = \sum_i x_i y_i - \overline{y} \sum_i x_i \\ & \Leftrightarrow \overline{x} \sum_i y_i = \overline{y} \sum_i x_i \\ & \Leftrightarrow {1 \over n} \sum_i x_i \sum_i y_i = {1 \over n} \sum_i y_i \sum_i x_i \end{align*} The last line holds because both sides are the same product, so the original equality is true and the two expressions for w are equivalent.
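The identity can also be confirmed on a concrete dataset. The numbers below are made up for illustration and do not come from the exam; exact fractions are used so that the two sides can be compared with `==`.

```python
from fractions import Fraction

# Hypothetical small dataset, chosen only for illustration.
xs = [1, 2, 4, 7]
ys = [3, 1, 6, 5]
n = len(xs)

x_bar = Fraction(sum(xs), n)
y_bar = Fraction(sum(ys), n)

lhs = sum((x - x_bar) * y for x, y in zip(xs, ys))   # sum_i (x_i - x_bar) y_i
rhs = sum((y - y_bar) * x for x, y in zip(xs, ys))   # sum_i (y_i - y_bar) x_i

print(lhs, rhs, lhs == rhs)
```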

Problem 4.2

True or False: If the dataset is shifted right by a constant distance a, that is, we have the new dataset D_a = [(x_1 + a,y_1), \ldots, (x_n + a,y_n)], then the estimated slope w will change.

False. By (1), w depends on the x values only through x_i - \overline{x}, which is unchanged after shifting horizontally because each x_i and \overline{x} both increase by a. Therefore, w is unchanged.

Problem 4.3

True or False: If the dataset is shifted up by a constant distance b, that is, we have the new dataset D_b = [(x_1,y_1 + b), \ldots, (x_n,y_n + b)], then the estimated slope w will change.

False. By (2), w depends on the y values only through y_i - \overline{y}, which is unchanged after shifting vertically because each y_i and \overline{y} both increase by b. The x values are not affected at all. Therefore, w is unchanged.
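Both invariance claims can be checked numerically. The sketch below uses the first closed-form expression for w from the start of this problem; the dataset and the shift amounts a = 10 and b = -3 are made up for illustration, and `slope` is a helper name chosen here.

```python
from fractions import Fraction

def slope(xs, ys):
    """Slope w = sum_i (x_i - x_bar) y_i / sum_i (x_i - x_bar)^2."""
    n = len(xs)
    x_bar = Fraction(sum(xs), n)
    num = sum((x - x_bar) * y for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den

# Hypothetical dataset and shift amounts, chosen only for illustration.
xs, ys = [1, 2, 4, 7], [3, 1, 6, 5]
a, b = 10, -3

w = slope(xs, ys)
w_right = slope([x + a for x in xs], ys)   # dataset shifted right by a
w_up = slope(xs, [y + b for y in ys])      # dataset shifted up by b

print(w, w_right, w_up)
```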

Problem 5

X = \begin{bmatrix} 1 & 2\\ 1 & -1 \end{bmatrix} \quad \vec{y} = \begin{bmatrix} a\\ b\\ \end{bmatrix} \quad \vec{w}^{*} = \begin{bmatrix} 1\\ 2\\ \end{bmatrix}

where X is the design matrix, \vec{y} is the observation vector, and \vec{w}^{*} is the optimal parameter vector. Solve for the parameters a and b using the normal equations, and show your work.

\begin{cases} a = 5\\ b = -1\\ \end{cases}

Since \vec{w}^{*} is the optimal parameter vector, it must satisfy the normal equations:

\begin{align*} X^{T}X\vec{w}^{*} = X^{T}\vec{y} \end{align*}

The left hand side of the equation will read:

\begin{align*} X^{T}X\vec{w}^{*} &= \begin{bmatrix} 1 & 1\\ 2 & -1 \end{bmatrix} \begin{bmatrix} 1 & 2\\ 1 & -1 \end{bmatrix} \begin{bmatrix} 1\\ 2 \end{bmatrix} \\ &= \begin{bmatrix} 2 & 1\\ 1 & 5 \end{bmatrix} \begin{bmatrix} 1\\ 2 \end{bmatrix} \\ &= \begin{bmatrix} 4\\ 11 \end{bmatrix} \end{align*}

The right hand side of the equation is given by:

\begin{align*} X^{T}\vec{y} &= \begin{bmatrix} 1 & 1\\ 2 & -1 \end{bmatrix} \begin{bmatrix} a\\ b \end{bmatrix} \\ &= \begin{bmatrix} a+b\\ 2a-b \end{bmatrix} \end{align*}

By setting the left hand side and right hand side equal to each other, we will obtain the following system of equations:

\begin{align*} \begin{bmatrix} 4\\ 11 \end{bmatrix} = \begin{bmatrix} a+b\\ 2a-b \end{bmatrix} \end{align*}

\begin{cases} 4 = a + b\\ 11 = 2a-b \end{cases}

To solve this system of equations, we can add them together: \begin{align*} 4+11 &= a + b + 2a -b\\ 3a &= 15\\ a &= 5 \end{align*} Substituting a = 5 into 4 = a + b gives b = -1. Therefore:

\begin{cases} a = 5\\ b = -1\\ \end{cases}
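The solution can be verified by plugging a = 5 and b = -1 back into the normal equations. This is a minimal pure-Python sketch; the helpers `matvec` and `transpose` are written here for illustration, and X is read off from the X^{T} that appears in the solution above.

```python
def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

X = [[1, 2], [1, -1]]   # design matrix, read off from X^T in the solution
w_star = [1, 2]         # optimal parameter vector
y = [5, -1]             # observation vector with a = 5, b = -1

Xt = transpose(X)
lhs = matvec(Xt, matvec(X, w_star))   # X^T X w*
rhs = matvec(Xt, y)                   # X^T y

print(lhs, rhs)
```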
