Multivariate statistics
=======================

Multivariate statistics includes all statistical techniques for analyzing
samples made of two or more variables. The data set (a :math:`N \times P`
matrix :math:`\mathbf{X}`) is a collection of :math:`N` independent samples,
the column **vectors** :math:`[\mathbf{x}_{1}, \ldots, \mathbf{x}_{i}, \ldots,
\mathbf{x}_{N}]` of length :math:`P`, stored row-wise:

.. math::

   \mathbf{X} =
   \begin{bmatrix}
   -\mathbf{x}_{1}^T- \\
   \vdots \\
   -\mathbf{x}_{i}^T- \\
   \vdots \\
   -\mathbf{x}_{N}^T-
   \end{bmatrix} =
   \begin{bmatrix}
   x_{11} & \cdots & x_{1j} & \cdots & x_{1P} \\
   \vdots &        & \vdots &        & \vdots \\
   x_{i1} & \cdots & x_{ij} & \cdots & x_{iP} \\
   \vdots &        & \vdots &        & \vdots \\
   x_{N1} & \cdots & x_{Nj} & \cdots & x_{NP}
   \end{bmatrix} =
   \begin{bmatrix}
   x_{11} & \ldots & x_{1P} \\
   \vdots &        & \vdots \\
          & \mathbf{X} &    \\
   \vdots &        & \vdots \\
   x_{N1} & \ldots & x_{NP}
   \end{bmatrix}_{N \times P}.

Linear Algebra
--------------

Euclidean norm and distance
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Euclidean norm of a vector :math:`\mathbf{a} \in \mathbb{R}^P` is denoted

.. math::

   \|\mathbf{a}\|_2 = \sqrt{\sum_{i=1}^P {a_i}^2}

The Euclidean distance between two vectors :math:`\mathbf{a}, \mathbf{b} \in
\mathbb{R}^P` is

.. math::

   \|\mathbf{a}-\mathbf{b}\|_2 = \sqrt{\sum_{i=1}^P (a_i-b_i)^2}

Dot product and projection
~~~~~~~~~~~~~~~~~~~~~~~~~~

Source: `Wikipedia <https://en.wikipedia.org/wiki/Dot_product>`__

**Algebraic definition**

The dot product, denoted ":math:`\cdot`", of two :math:`P`-dimensional vectors
:math:`\mathbf{a} = [a_1, a_2, ..., a_P]` and
:math:`\mathbf{b} = [b_1, b_2, ..., b_P]` is defined as

.. math::

   \mathbf{a} \cdot \mathbf{b} = \mathbf{a}^T \mathbf{b} = \sum_{i=1}^P a_i b_i =
   \begin{bmatrix}
   a_{1} & \ldots & \mathbf{a}^T & \ldots & a_{P}
   \end{bmatrix}
   \begin{bmatrix}
   b_{1} \\
   \vdots \\
   \mathbf{b} \\
   \vdots \\
   b_{P}
   \end{bmatrix}.

The Euclidean norm of a vector can be computed using the dot product, as

.. math::

   \left\|\mathbf{a}\right\|_2 = \sqrt{\mathbf{a} \cdot \mathbf{a}}.

**Geometric definition: projection**

In Euclidean space, a Euclidean vector is a geometrical object that possesses
both a magnitude and a direction. A vector can be pictured as an arrow. Its
magnitude is its length, and its direction is the direction that the arrow
points. The magnitude of a vector :math:`\mathbf{a}` is denoted by
:math:`\|\mathbf{a}\|_2`.

The dot product of two Euclidean vectors :math:`\mathbf{a}` and
:math:`\mathbf{b}` is defined by

.. math::

   \mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\|_2\ \|\mathbf{b}\|_2 \cos \theta,

where :math:`\theta` is the angle between :math:`\mathbf{a}` and
:math:`\mathbf{b}`.

In particular, if :math:`\mathbf{a}` and :math:`\mathbf{b}` are orthogonal,
then the angle between them is 90° and

.. math::

   \mathbf{a} \cdot \mathbf{b} = 0.

At the other extreme, if they are codirectional, then the angle between them
is 0° and

.. math::

   \mathbf{a} \cdot \mathbf{b} = \left\|\mathbf{a}\right\|_2\,\left\|\mathbf{b}\right\|_2

This implies that the dot product of a vector :math:`\mathbf{a}` with itself is

.. math::

   \mathbf{a} \cdot \mathbf{a} = \left\|\mathbf{a}\right\|_2^2.

The scalar projection (or scalar component) of a Euclidean vector
:math:`\mathbf{a}` in the direction of a Euclidean vector :math:`\mathbf{b}`
is given by

.. math::

   a_{b} = \left\|\mathbf{a}\right\|_2 \cos \theta,

where :math:`\theta` is the angle between :math:`\mathbf{a}` and
:math:`\mathbf{b}`. In terms of the geometric definition of the dot product,
this can be rewritten as

.. math::

   a_{b} = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{b}\|_2}.

.. figure:: images/Dot_Product.png
   :alt: Projection.

   Projection.
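As a minimal numerical sketch of these definitions (the vectors ``a`` and
``b`` below are arbitrary examples, not taken from the text), the norm, the
distance and the scalar projection can be computed directly with NumPy:

.. code:: ipython3

    import numpy as np

    a = np.array([3., 4.])
    b = np.array([1., 0.])

    norm_a = np.sqrt(np.sum(a ** 2))         # Euclidean norm ||a||_2 = 5
    dist_ab = np.linalg.norm(a - b)          # Euclidean distance ||a - b||_2
    a_b = np.dot(a, b) / np.linalg.norm(b)   # scalar projection of a onto b = 3

    print(norm_a, dist_ab, a_b)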
.. code:: ipython3

    import numpy as np

    np.random.seed(42)
    a = np.random.randn(10)
    b = np.random.randn(10)

    np.dot(a, b)

.. parsed-literal::

    -4.085788532659924

Mean vector
-----------

The mean is the (:math:`P \times 1`) column vector :math:`\mathbf{\mu}`, whose
estimator is

.. math::

   \bar{\mathbf{x}} = \frac{1}{N}\sum_{i=1}^N \mathbf{x_i} =
   \frac{1}{N}\sum_{i=1}^N
   \begin{bmatrix}
   x_{i1} \\
   \vdots \\
   x_{ij} \\
   \vdots \\
   x_{iP}
   \end{bmatrix} =
   \begin{bmatrix}
   \bar{x}_{1} \\
   \vdots \\
   \bar{x}_{j} \\
   \vdots \\
   \bar{x}_{P}
   \end{bmatrix}.

Covariance matrix
-----------------

-  The covariance matrix :math:`\mathbf{\Sigma_{XX}}` is a **symmetric**
   positive semi-definite matrix whose element in the :math:`j, k` position is
   the covariance between the :math:`j^{th}` and :math:`k^{th}` elements of a
   random vector, i.e. the :math:`j^{th}` and :math:`k^{th}` columns of
   :math:`\mathbf{X}`.

-  The covariance matrix generalizes the notion of covariance to multiple
   dimensions.

-  The covariance matrix describes the shape of the sample distribution around
   the mean, assuming an elliptical distribution:

.. math::

   \mathbf{\Sigma_{XX}} = E\left[(\mathbf{X}-E(\mathbf{X}))^T(\mathbf{X}-E(\mathbf{X}))\right],

whose estimator :math:`\mathbf{S_{XX}}` is a :math:`P \times P` matrix given by

.. math::

   \mathbf{S_{XX}} = \frac{1}{N-1}(\mathbf{X}- \mathbf{1} \bar{\mathbf{x}}^T)^T
   (\mathbf{X}- \mathbf{1} \bar{\mathbf{x}}^T).

If we assume that :math:`\mathbf{X}` is centered, i.e. :math:`\mathbf{X}` is
replaced by :math:`\mathbf{X} - \mathbf{1}\bar{\mathbf{x}}^T`, then the
estimator is

.. math::

   \mathbf{S_{XX}} = \frac{1}{N-1} \mathbf{X}^T\mathbf{X} =
   \frac{1}{N-1}
   \begin{bmatrix}
   x_{11} & \cdots & x_{N1} \\
   x_{1j} & \cdots & x_{Nj} \\
   \vdots &        & \vdots \\
   x_{1P} & \cdots & x_{NP}
   \end{bmatrix}
   \begin{bmatrix}
   x_{11} & \cdots & x_{1k} & x_{1P} \\
   \vdots &        & \vdots & \vdots \\
   x_{N1} & \cdots & x_{Nk} & x_{NP}
   \end{bmatrix} =
   \begin{bmatrix}
   s_{1} & \ldots & s_{1k}  & s_{1P} \\
         & \ddots & s_{jk}  & \vdots \\
         &        & s_{k}   & s_{kP} \\
         &        &         & s_{P}
   \end{bmatrix},

where

.. math::

   s_{jk} = s_{kj} = \frac{1}{N-1} \mathbf{x_j}^T \mathbf{x_k} =
   \frac{1}{N-1} \sum_{i=1}^N x_{ij} x_{ik}

is an estimator of the covariance between the :math:`j^{th}` and :math:`k^{th}`
variables.

.. code:: ipython3

    ## Avoid warnings and force inline plot
    %matplotlib inline
    import warnings
    warnings.filterwarnings("ignore")
    ##
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns
    import pystatsml.plot_utils

    np.random.seed(42)
    colors = sns.color_palette()

    n_samples, n_features = 100, 2

    mean, Cov, X = [None] * 4, [None] * 4, [None] * 4
    mean[0] = np.array([-2.5, 2.5])
    Cov[0] = np.array([[1, 0],
                       [0, 1]])
    mean[1] = np.array([2.5, 2.5])
    Cov[1] = np.array([[1, .5],
                       [.5, 1]])
    mean[2] = np.array([-2.5, -2.5])
    Cov[2] = np.array([[1, .9],
                       [.9, 1]])
    mean[3] = np.array([2.5, -2.5])
    Cov[3] = np.array([[1, -.9],
                       [-.9, 1]])

    # Generate dataset
    for i in range(len(mean)):
        X[i] = np.random.multivariate_normal(mean[i], Cov[i], n_samples)

    # Plot
    for i in range(len(mean)):
        # Points
        plt.scatter(X[i][:, 0], X[i][:, 1], color=colors[i], label="class %i" % i)
        # Means
        plt.scatter(mean[i][0], mean[i][1], marker="o", s=200, facecolors='w',
                    edgecolors=colors[i], linewidth=2)
        # Ellipses representing the covariance matrices
        pystatsml.plot_utils.plot_cov_ellipse(Cov[i], pos=mean[i],
                                              facecolor='none', linewidth=2,
                                              edgecolor=colors[i])

    plt.axis('equal')
    _ = plt.legend(loc='upper left')

.. image:: stat_multiv_files/stat_multiv_4_0.png
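As a quick numerical check of the estimators above (a sketch on simulated
data; the sample size and parameters below are arbitrary, not from the text),
the centered cross-product formula for :math:`\mathbf{S_{XX}}` agrees with
NumPy's ``np.cov``:

.. code:: ipython3

    import numpy as np

    np.random.seed(42)
    N = 1000
    mu_true = np.array([1., -1.])
    Cov_true = np.array([[1., .6],
                         [.6, 2.]])
    X = np.random.multivariate_normal(mu_true, Cov_true, N)

    xbar = X.mean(axis=0)        # estimated mean vector
    Xc = X - xbar                # centered data
    S = Xc.T @ Xc / (N - 1)      # covariance estimator S_XX

    # np.cov (columns as variables) uses the same N-1 normalization
    assert np.allclose(S, np.cov(X, rowvar=False))
    print(xbar.round(2))
    print(S.round(2))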
Correlation matrix
------------------

.. code:: ipython3

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns

    url = 'https://python-graph-gallery.com/wp-content/uploads/mtcars.csv'
    df = pd.read_csv(url)

    # Compute the correlation matrix
    corr = df.corr()

    # Generate a mask for the upper triangle
    mask = np.zeros_like(corr, dtype=bool)
    mask[np.triu_indices_from(mask)] = True

    f, ax = plt.subplots(figsize=(5.5, 4.5))
    cmap = sns.color_palette("RdBu_r", 11)
    # Draw the heatmap with a correct aspect ratio
    # (pass mask=mask instead of mask=None to hide the redundant upper triangle)
    _ = sns.heatmap(corr, mask=None, cmap=cmap, vmax=1, center=0,
                    square=True, linewidths=.5, cbar_kws={"shrink": .5})

.. image:: stat_multiv_files/stat_multiv_6_0.png

Re-order the correlation matrix using ``AgglomerativeClustering``:

.. code:: ipython3

    # Convert correlations to distances
    d = 2 * (1 - np.abs(corr))

    from sklearn.cluster import AgglomerativeClustering

    clustering = AgglomerativeClustering(n_clusters=3, linkage='single',
                                         affinity="precomputed").fit(d)
    clusters = [list(corr.columns[clustering.labels_ == lab])
                for lab in set(clustering.labels_)]
    print(clusters)

    reordered = np.concatenate(clusters)
    R = corr.loc[reordered, reordered]

    f, ax = plt.subplots(figsize=(5.5, 4.5))
    # Draw the re-ordered heatmap
    _ = sns.heatmap(R, mask=None, cmap=cmap, vmax=1, center=0,
                    square=True, linewidths=.5, cbar_kws={"shrink": .5})

.. parsed-literal::

    [['mpg', 'cyl', 'disp', 'hp', 'wt', 'qsec', 'vs', 'carb'], ['am', 'gear'], ['drat']]

.. image:: stat_multiv_files/stat_multiv_8_1.png

Precision matrix
----------------

In statistics, precision is the reciprocal of the variance, and the precision
matrix is the matrix inverse of the covariance matrix. It is related to
**partial correlations**, which measure the degree of association between two
variables while controlling for the effect of the other variables.

.. code:: ipython3

    import numpy as np

    Cov = np.array([[1.0, 0.9, 0.9, 0.0, 0.0, 0.0],
                    [0.9, 1.0, 0.9, 0.0, 0.0, 0.0],
                    [0.9, 0.9, 1.0, 0.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0, 1.0, 0.9, 0.0],
                    [0.0, 0.0, 0.0, 0.9, 1.0, 0.0],
                    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]])

    print("# Precision matrix:")
    Prec = np.linalg.inv(Cov)
    print(Prec.round(2))

    print("# Partial correlations:")
    Pcor = np.zeros(Prec.shape)
    Pcor[::] = np.nan

    for i, j in zip(*np.triu_indices_from(Prec, 1)):
        Pcor[i, j] = - Prec[i, j] / np.sqrt(Prec[i, i] * Prec[j, j])

    print(Pcor.round(2))

.. parsed-literal::

    # Precision matrix:
    [[ 6.79 -3.21 -3.21  0.    0.    0.  ]
     [-3.21  6.79 -3.21  0.    0.    0.  ]
     [-3.21 -3.21  6.79  0.    0.    0.  ]
     [ 0.   -0.   -0.    5.26 -4.74 -0.  ]
     [ 0.    0.    0.   -4.74  5.26  0.  ]
     [ 0.    0.    0.    0.    0.    1.  ]]
    # Partial correlations:
    [[  nan  0.47  0.47 -0.   -0.   -0.  ]
     [  nan   nan  0.47 -0.   -0.   -0.  ]
     [  nan   nan   nan -0.   -0.   -0.  ]
     [  nan   nan   nan   nan  0.9   0.  ]
     [  nan   nan   nan   nan   nan -0.  ]
     [  nan   nan   nan   nan   nan   nan]]

Mahalanobis distance
--------------------

-  The Mahalanobis distance is a measure of the distance between two points
   :math:`\mathbf{x}` and :math:`\mathbf{\mu}` where the dispersion (i.e. the
   covariance structure) of the samples is taken into account.

-  The dispersion is considered through the covariance matrix.

This is formally expressed as

.. math::

   D_M(\mathbf{x}, \mathbf{\mu}) = \sqrt{(\mathbf{x} - \mathbf{\mu})^T \mathbf{\Sigma}^{-1}(\mathbf{x} - \mathbf{\mu})}.

**Intuitions**

-  Distances along the principal directions of dispersion are contracted
   since they correspond to likely dispersion of points.
-  Distances orthogonal to the principal directions of dispersion are dilated
   since they correspond to unlikely dispersion of points.

For example,

.. math::

   D_M(\mathbf{1}) = \sqrt{\mathbf{1}^T \mathbf{\Sigma}^{-1}\mathbf{1}}.

.. code:: ipython3

    ones = np.ones(Cov.shape[0])
    d_euc = np.sqrt(np.dot(ones, ones))
    d_mah = np.sqrt(np.dot(np.dot(ones, Prec), ones))

    print("Euclidean norm of ones=%.2f. Mahalanobis norm of ones=%.2f" % (d_euc, d_mah))

.. parsed-literal::

    Euclidean norm of ones=2.45. Mahalanobis norm of ones=1.77

The first dot product, :math:`\mathbf{1}^T \mathbf{\Sigma}^{-1}`, shows that
distances along the principal directions of dispersion are contracted:

.. code:: ipython3

    print(np.dot(ones, Prec))

.. parsed-literal::

    [0.35714286 0.35714286 0.35714286 0.52631579 0.52631579 1.        ]

.. code:: ipython3

    import numpy as np
    import scipy.spatial.distance
    import scipy.linalg
    import matplotlib.pyplot as plt
    import seaborn as sns
    import pystatsml.plot_utils

    %matplotlib inline
    np.random.seed(40)
    colors = sns.color_palette()

    mean = np.array([0, 0])
    Cov = np.array([[1, .8],
                    [.8, 1]])
    samples = np.random.multivariate_normal(mean, Cov, 100)
    x1 = np.array([0, 2])
    x2 = np.array([2, 2])

    plt.scatter(samples[:, 0], samples[:, 1], color=colors[0])
    plt.scatter(mean[0], mean[1], color=colors[0], s=200, label="mean")
    plt.scatter(x1[0], x1[1], color=colors[1], s=200, label="x1")
    plt.scatter(x2[0], x2[1], color=colors[2], s=200, label="x2")

    # plot covariance ellipsis
    pystatsml.plot_utils.plot_cov_ellipse(Cov, pos=mean, facecolor='none',
                                          linewidth=2, edgecolor=colors[0])
    # Compute distances
    d2_m_x1 = scipy.spatial.distance.euclidean(mean, x1)
    d2_m_x2 = scipy.spatial.distance.euclidean(mean, x2)

    Covi = scipy.linalg.inv(Cov)
    dm_m_x1 = scipy.spatial.distance.mahalanobis(mean, x1, Covi)
    dm_m_x2 = scipy.spatial.distance.mahalanobis(mean, x2, Covi)

    # Plot distances
    vm_x1 = (x1 - mean) / d2_m_x1
    vm_x2 = (x2 - mean) / d2_m_x2
    jitter = .1
    plt.plot([mean[0] - jitter, d2_m_x1 * vm_x1[0] - jitter],
             [mean[1], d2_m_x1 * vm_x1[1]], color='k')
    plt.plot([mean[0] - jitter, d2_m_x2 * vm_x2[0] - jitter],
             [mean[1], d2_m_x2 * vm_x2[1]], color='k')

    plt.plot([mean[0] + jitter, dm_m_x1 * vm_x1[0] + jitter],
             [mean[1], dm_m_x1 * vm_x1[1]], color='r')
    plt.plot([mean[0] + jitter, dm_m_x2 * vm_x2[0] + jitter],
             [mean[1], dm_m_x2 * vm_x2[1]], color='r')

    plt.legend(loc='lower right')
    plt.text(-6.1, 3,
             'Mahalanobis: d(m, x1) = %.1f > d(m, x2) = %.1f' % (dm_m_x1, dm_m_x2),
             color='r')
    plt.axis('equal')
    print('Euclidean   d(m, x1) = %.2f < d(m, x2) = %.2f' % (d2_m_x1, d2_m_x2))
    print('Mahalanobis d(m, x1) = %.2f > d(m, x2) = %.2f' % (dm_m_x1, dm_m_x2))

.. parsed-literal::

    Euclidean   d(m, x1) = 2.00 < d(m, x2) = 2.83
    Mahalanobis d(m, x1) = 3.33 > d(m, x2) = 2.11

.. image:: stat_multiv_files/stat_multiv_15_1.png

If the covariance matrix is the identity matrix, the Mahalanobis distance
reduces to the Euclidean distance. If the covariance matrix is diagonal, then
the resulting distance measure is called a normalized Euclidean distance.

More generally, the Mahalanobis distance is a measure of the distance between
a point :math:`\mathbf{x}` and a distribution
:math:`\mathcal{N}(\mathbf{x}|\mathbf{\mu}, \mathbf{\Sigma})`. It is a
multi-dimensional generalization of the idea of measuring how many standard
deviations away :math:`\mathbf{x}` is from the mean. This distance is zero if
:math:`\mathbf{x}` is at the mean, and grows as :math:`\mathbf{x}` moves away
from the mean: along each principal component axis, it measures the number of
standard deviations from :math:`\mathbf{x}` to the mean of the distribution.
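A minimal sketch of the identity-covariance special case mentioned above (the
point ``x`` below is arbitrary): with :math:`\mathbf{\Sigma} = \mathbf{I}`,
``scipy.spatial.distance.mahalanobis`` returns exactly the Euclidean distance.

.. code:: ipython3

    import numpy as np
    from scipy.spatial import distance

    x = np.array([1., 2.])
    mu = np.array([0., 0.])
    Sigma_inv = np.linalg.inv(np.eye(2))   # identity covariance => identity inverse

    d_mah = distance.mahalanobis(x, mu, Sigma_inv)
    d_euc = distance.euclidean(x, mu)
    assert np.isclose(d_mah, d_euc)
    print(d_mah, d_euc)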
Multivariate normal distribution
--------------------------------

The distribution, or probability density function (PDF) (sometimes just
density), of a continuous random variable is a function that describes the
relative likelihood for this random variable to take on a given value.

The multivariate normal distribution, or multivariate Gaussian distribution,
of a :math:`P`-dimensional random vector
:math:`\mathbf{x} = [x_1, x_2, \ldots, x_P]^T` is

.. math::

   \mathcal{N}(\mathbf{x}|\mathbf{\mu}, \mathbf{\Sigma}) =
   \frac{1}{(2\pi)^{P/2}|\mathbf{\Sigma}|^{1/2}}
   \exp\left\{-\frac{1}{2} (\mathbf{x} - \mathbf{\mu})^T \mathbf{\Sigma}^{-1}(\mathbf{x} - \mathbf{\mu})\right\}.

.. code:: ipython3

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.stats import multivariate_normal
    from mpl_toolkits.mplot3d import Axes3D

    def multivariate_normal_pdf(X, mean, sigma):
        """Multivariate normal probability density function over X
        (n_samples x n_features)"""
        P = X.shape[1]
        det = np.linalg.det(sigma)
        norm_const = 1.0 / (((2 * np.pi) ** (P / 2)) * np.sqrt(det))
        X_mu = X - mean
        inv = np.linalg.inv(sigma)
        d2 = np.sum(np.dot(X_mu, inv) * X_mu, axis=1)
        return norm_const * np.exp(-0.5 * d2)

    # mean and covariance
    mu = np.array([0, 0])
    sigma = np.array([[1, -.5],
                      [-.5, 1]])

    # x, y grid
    x, y = np.mgrid[-3:3:.1, -3:3:.1]
    X = np.stack((x.ravel(), y.ravel())).T
    norm = multivariate_normal_pdf(X, mu, sigma).reshape(x.shape)

    # Do it with scipy
    norm_scpy = multivariate_normal(mu, sigma).pdf(np.stack((x, y), axis=2))
    assert np.allclose(norm, norm_scpy)

    # Plot
    fig = plt.figure(figsize=(10, 7))
    ax = fig.add_subplot(111, projection='3d')
    surf = ax.plot_surface(x, y, norm, rstride=3, cstride=3,
                           cmap=plt.cm.coolwarm, linewidth=1, antialiased=False)

    ax.set_zlim(0, 0.2)
    ax.zaxis.set_major_locator(plt.LinearLocator(10))
    ax.zaxis.set_major_formatter(plt.FormatStrFormatter('%.02f'))

    ax.set_xlabel('X')
    ax.set_ylabel('Y')
    ax.set_zlabel('p(x)')

    plt.title('Bivariate Normal/Gaussian distribution')
    fig.colorbar(surf, shrink=0.5, aspect=7)
    plt.show()

.. image:: stat_multiv_files/stat_multiv_18_0.png

Exercises
---------

Dot product and Euclidean norm
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Given :math:`\mathbf{a} = [2, 1]^T` and :math:`\mathbf{b} = [1, 1]^T`:

1. Write a function ``euclidean(x)`` that computes the Euclidean norm of a
   vector :math:`\mathbf{x}`.

2. Compute the Euclidean norm of :math:`\mathbf{a}`.

3. Compute the Euclidean distance :math:`\|\mathbf{a}-\mathbf{b}\|_2`.

4. Compute the projection of :math:`\mathbf{b}` in the direction of the
   vector :math:`\mathbf{a}`: :math:`b_{a}`.

5. Simulate a dataset :math:`\mathbf{X}` of :math:`N=100` samples of
   2-dimensional vectors.

6. Project all samples in the direction of the vector :math:`\mathbf{a}`.

Covariance matrix and Mahalanobis norm
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. Sample a dataset :math:`\mathbf{X}` of :math:`N=100` samples of
   2-dimensional vectors from the bivariate normal distribution
   :math:`\mathcal{N}(\mathbf{\mu}, \mathbf{\Sigma})` where
   :math:`\mathbf{\mu}=[1, 1]^T` and
   :math:`\mathbf{\Sigma}=\begin{bmatrix} 1 & 0.8\\0.8 & 1 \end{bmatrix}`.

2. Compute the mean vector :math:`\mathbf{\bar{x}}` and center
   :math:`\mathbf{X}`. Compare the estimated mean :math:`\mathbf{\bar{x}}` to
   the true mean, :math:`\mathbf{\mu}`.

3. Compute the empirical covariance matrix :math:`\mathbf{S}`. Compare the
   estimated covariance matrix :math:`\mathbf{S}` to the true covariance
   matrix, :math:`\mathbf{\Sigma}`.
4. Compute :math:`\mathbf{S}^{-1}` (``Sinv``), the inverse of the covariance
   matrix, using ``scipy.linalg.inv(S)``.

5. Write a function ``mahalanobis(x, xbar, Sinv)`` that computes the
   Mahalanobis distance of a vector :math:`\mathbf{x}` to the mean,
   :math:`\mathbf{\bar{x}}`.

6. Compute the Mahalanobis and Euclidean distances of each sample
   :math:`\mathbf{x}_i` to the mean :math:`\mathbf{\bar{x}}`. Store the
   results in a :math:`100 \times 2` dataframe.