.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_gallery/data_numpy.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end `
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_gallery_data_numpy.py:


NumPy: Arrays and Matrices
==========================

NumPy is an extension to the Python programming language, adding support for large, multi-dimensional (numerical) arrays and matrices, along with a large library of high-level mathematical functions to operate on these arrays.

Most NumPy functions are executed by **compiled C or Fortran libraries**, providing the performance of compiled languages.

**Sources**: `Kevin Markham `_

Computation time:

.. GENERATED FROM PYTHON SOURCE LINES 16-32

.. code-block:: Python

    import numpy as np
    import time

    start_time = time.time()
    l = [v for v in range(10 ** 8)]
    s = 0
    for v in l:
        s += v
    print("Python code, time elapsed: %.2fs" % (time.time() - start_time))

    start_time = time.time()
    arr = np.arange(10 ** 8)
    arr.sum()
    print("Numpy code, time elapsed: %.2fs" % (time.time() - start_time))

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Python code, time elapsed: 6.13s
    Numpy code, time elapsed: 0.23s

.. GENERATED FROM PYTHON SOURCE LINES 33-38

Create arrays
-------------

Create ndarrays from lists.
Note: all elements must have the same type (they are converted to a common type if possible).

.. GENERATED FROM PYTHON SOURCE LINES 38-47

.. code-block:: Python

    data1 = [1, 2, 3, 4, 5]  # list
    arr = np.array(data1)  # 1d array
    data = [range(1, 5), range(5, 9)]  # list of lists
    arr = np.array(data)  # 2d array
    print(arr)
    arr.tolist()  # convert array back to list

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[1 2 3 4]
     [5 6 7 8]]

    [[1, 2, 3, 4], [5, 6, 7, 8]]

.. GENERATED FROM PYTHON SOURCE LINES 48-49

Create special arrays

..
.. GENERATED FROM PYTHON SOURCE LINES 49-58

.. code-block:: Python

    np.zeros(10)  # [0, 0, ..., 0]
    np.zeros((3, 6))  # 3 x 6 array of zeros
    np.ones(10)
    np.linspace(0, 1, 5)  # 0 to 1 (inclusive) with 5 points
    np.logspace(0, 3, 4)  # 10^0 to 10^3 (inclusive) with 4 points
    np.arange(10)  # [0, 1 ..., 9]

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

.. GENERATED FROM PYTHON SOURCE LINES 59-60

Examining arrays

.. GENERATED FROM PYTHON SOURCE LINES 60-79

.. code-block:: Python

    print("Shape of the array: ", arr.shape)
    print("Type of the array: ", arr.dtype)
    print("Number of items in the array: ", arr.size)
    print("Memory size of one array item in bytes: ", arr.itemsize)

    # memory size of numpy array in bytes
    print("Memory size of numpy array in bytes: %i, and in bits: %i" %
          (arr.size * arr.itemsize, arr.size * arr.itemsize * 8))

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Shape of the array:  (2, 4)
    Type of the array:  int64
    Number of items in the array:  8
    Memory size of one array item in bytes:  8
    Memory size of numpy array in bytes: 64, and in bits: 512

.. GENERATED FROM PYTHON SOURCE LINES 80-82

Selection
---------

.. GENERATED FROM PYTHON SOURCE LINES 82-85

.. code-block:: Python

    arr[1, 2]  # Get third item of the second row

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    np.int64(7)

.. GENERATED FROM PYTHON SOURCE LINES 86-95

Slicing
~~~~~~~

Syntax: ``start:stop:step`` with ``start`` *(default 0)*, ``stop`` *(default last)* and ``step`` *(default 1)*. The ``stop`` index is excluded.

- ``:`` is equivalent to ``0:end:1``; i.e., take all elements, from 0 to the end with step = 1.
- ``:k`` is equivalent to ``0:k:1``; i.e., take all elements, from 0 to k (excluded) with step = 1.
- ``k:`` is equivalent to ``k:end:1``; i.e., take all elements, from k to the end with step = 1.
- ``::-1`` takes all elements in reverse order, from the last to the first, with step = -1. With a negative step, ``start`` defaults to the last element and ``stop`` to just before the first one.

.. GENERATED FROM PYTHON SOURCE LINES 95-110

..
.. code-block:: Python

    arr[0, :]  # Get first row
    arr[:, 2]  # Get third column
    arr[:, :2]  # columns strictly before index 2 (2 first columns)
    arr[:, 2:]  # columns from index 2 (included) onward
    arr2 = arr[:, 1:4]  # columns between index 1 (included) and 4 (excluded)
    print(arr2)

    # Slicing returns a view (not a copy)
    # Modification
    arr2[0, 0] = 33
    print(arr2)
    print(arr)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[2 3 4]
     [6 7 8]]
    [[33  3  4]
     [ 6  7  8]]
    [[ 1 33  3  4]
     [ 5  6  7  8]]

.. GENERATED FROM PYTHON SOURCE LINES 111-112

Reverse order of row 0

.. GENERATED FROM PYTHON SOURCE LINES 112-116

.. code-block:: Python

    print(arr[0, ::-1])

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [ 4  3 33  1]

.. GENERATED FROM PYTHON SOURCE LINES 117-123

Fancy indexing: Integer or boolean array indexing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fancy indexing returns a copy, not a view.

Integer array indexing

.. GENERATED FROM PYTHON SOURCE LINES 123-131

.. code-block:: Python

    arr2 = arr[:, [1, 2, 3]]  # return a copy
    print(arr2)
    arr2[0, 0] = 44
    print(arr2)
    print(arr)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[33  3  4]
     [ 6  7  8]]
    [[44  3  4]
     [ 6  7  8]]
    [[ 1 33  3  4]
     [ 5  6  7  8]]

.. GENERATED FROM PYTHON SOURCE LINES 132-133

Boolean array indexing

.. GENERATED FROM PYTHON SOURCE LINES 133-142

.. code-block:: Python

    arr2 = arr[arr > 5]  # return a copy

    print(arr2)
    arr2[0] = 44
    print(arr2)
    print(arr)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [33  6  7  8]
    [44  6  7  8]
    [[ 1 33  3  4]
     [ 5  6  7  8]]

.. GENERATED FROM PYTHON SOURCE LINES 143-145

However, when fancy indexing is used as an lvalue (on the left-hand side of an assignment), it modifies the original array in place.

.. GENERATED FROM PYTHON SOURCE LINES 145-149

.. code-block:: Python

    arr[arr > 5] = 0
    print(arr)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[1 0 3 4]
     [5 0 0 0]]

.. GENERATED FROM PYTHON SOURCE LINES 150-159

Array indexing: copy or view?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

General rules:

- Slicing always returns a view.
- Fancy indexing (boolean mask, integers) returns a copy.
- lvalue indexing, i.e. when the indices are placed on the left-hand side of an assignment, behaves like a view: the original array is modified in place.

.. GENERATED FROM PYTHON SOURCE LINES 162-164

Array manipulation
------------------

.. GENERATED FROM PYTHON SOURCE LINES 166-167

Reshaping

.. GENERATED FROM PYTHON SOURCE LINES 167-172

.. code-block:: Python

    arr = np.arange(10, dtype=float).reshape((2, 5))
    print(arr.shape)
    print(arr.reshape(5, 2))

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    (2, 5)
    [[0. 1.]
     [2. 3.]
     [4. 5.]
     [6. 7.]
     [8. 9.]]

.. GENERATED FROM PYTHON SOURCE LINES 173-174

Add an axis

.. GENERATED FROM PYTHON SOURCE LINES 174-182

.. code-block:: Python

    a = np.array([0, 1])
    print(a)
    a_col = a[:, np.newaxis]
    print(a_col)
    # or
    a_col = a[:, None]

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [0 1]
    [[0]
     [1]]

.. GENERATED FROM PYTHON SOURCE LINES 183-184

Transpose

.. GENERATED FROM PYTHON SOURCE LINES 184-187

.. code-block:: Python

    print(a_col.T)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[0 1]]

.. GENERATED FROM PYTHON SOURCE LINES 188-189

Flatten: always returns a flat copy of the original array

.. GENERATED FROM PYTHON SOURCE LINES 189-195

.. code-block:: Python

    arr_flt = arr.flatten()
    arr_flt[0] = 33
    print(arr_flt)
    print(arr)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [33.  1.  2.  3.  4.  5.  6.  7.  8.  9.]
    [[0. 1. 2. 3. 4.]
     [5. 6. 7. 8. 9.]]

.. GENERATED FROM PYTHON SOURCE LINES 196-197

Ravel: returns a view of the original array whenever possible.

.. GENERATED FROM PYTHON SOURCE LINES 197-203

.. code-block:: Python

    arr_flt = arr.ravel()
    arr_flt[0] = 33
    print(arr_flt)
    print(arr)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [33.  1.  2.  3.  4.  5.  6.  7.  8.  9.]
    [[33.  1.  2.  3.  4.]
     [ 5.  6.  7.  8.  9.]]

.. GENERATED FROM PYTHON SOURCE LINES 204-206

Stack arrays
`NumPy Joining Array `_

..
.. GENERATED FROM PYTHON SOURCE LINES 206-210

.. code-block:: Python

    a = np.array([0, 1])
    b = np.array([2, 3])

.. GENERATED FROM PYTHON SOURCE LINES 211-212

Horizontal stacking

.. GENERATED FROM PYTHON SOURCE LINES 212-215

.. code-block:: Python

    np.hstack([a, b])

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    array([0, 1, 2, 3])

.. GENERATED FROM PYTHON SOURCE LINES 216-217

Vertical stacking

.. GENERATED FROM PYTHON SOURCE LINES 217-220

.. code-block:: Python

    np.vstack([a, b])

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    array([[0, 1],
           [2, 3]])

.. GENERATED FROM PYTHON SOURCE LINES 221-222

``np.stack`` joins arrays along a new axis (axis 0 by default, which is vertical here)

.. GENERATED FROM PYTHON SOURCE LINES 222-226

.. code-block:: Python

    np.stack([a, b])

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    array([[0, 1],
           [2, 3]])

.. GENERATED FROM PYTHON SOURCE LINES 227-246

Advanced Numpy: reshaping/flattening and selection
--------------------------------------------------

NumPy internals: by default NumPy uses the C convention, i.e., row-major order: the matrix is stored by rows. In C, the last index changes most rapidly as one moves through the array as stored in memory.

For 2D arrays, moving sequentially through memory will:

- iterate over rows (axis 0);
- within each row, iterate over columns (axis 1).

For 3D arrays, moving sequentially through memory will:

- iterate over planes (axis 0);
- within each plane, iterate over rows (axis 1);
- within each row, iterate over columns (axis 2).

.. figure:: ../images/numpy_array3d.png

.. GENERATED FROM PYTHON SOURCE LINES 246-250

.. code-block:: Python

    x = np.arange(2 * 3 * 4)
    print(x)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]

.. GENERATED FROM PYTHON SOURCE LINES 251-252

Reshape into 3D (axis 0, axis 1, axis 2)

.. GENERATED FROM PYTHON SOURCE LINES 252-256

.. code-block:: Python

    x = x.reshape(2, 3, 4)
    print(x)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[[ 0  1  2  3]
      [ 4  5  6  7]
      [ 8  9 10 11]]

     [[12 13 14 15]
      [16 17 18 19]
      [20 21 22 23]]]

..
.. GENERATED FROM PYTHON SOURCE LINES 257-258

Selection: get the first plane

.. GENERATED FROM PYTHON SOURCE LINES 258-261

.. code-block:: Python

    print(x[0, :, :])

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[ 0  1  2  3]
     [ 4  5  6  7]
     [ 8  9 10 11]]

.. GENERATED FROM PYTHON SOURCE LINES 262-263

Selection: get the first row of each plane

.. GENERATED FROM PYTHON SOURCE LINES 263-266

.. code-block:: Python

    print(x[:, 0, :])

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[ 0  1  2  3]
     [12 13 14 15]]

.. GENERATED FROM PYTHON SOURCE LINES 267-268

Selection: get the first column of each plane

.. GENERATED FROM PYTHON SOURCE LINES 268-272

.. code-block:: Python

    print(x[:, :, 0])

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[ 0  4  8]
     [12 16 20]]

.. GENERATED FROM PYTHON SOURCE LINES 273-274

Ravel

.. GENERATED FROM PYTHON SOURCE LINES 274-278

.. code-block:: Python

    print(x.ravel())

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]

.. GENERATED FROM PYTHON SOURCE LINES 279-281

Vectorized operations
---------------------

.. GENERATED FROM PYTHON SOURCE LINES 281-315

..
.. code-block:: Python

    nums = np.arange(5)
    nums * 10  # multiply each element by 10
    nums = np.sqrt(nums)  # square root of each element
    np.ceil(nums)  # also floor, rint (round to nearest int)
    np.isnan(nums)  # checks for NaN
    nums + np.arange(5)  # add element-wise
    np.maximum(nums, np.array([1, -2, 3, -4, 5]))  # compare element-wise

    # Compute Euclidean distance between 2 vectors
    vec1 = np.random.randn(10)
    vec2 = np.random.randn(10)
    dist = np.sqrt(np.sum((vec1 - vec2) ** 2))

    # math and stats
    rnd = np.random.randn(4, 2)  # random normals in 4x2 array
    rnd.mean()
    rnd.std()
    rnd.argmin()  # index of minimum element (in the flattened array)
    rnd.sum()
    rnd.sum(axis=0)  # sum of columns
    rnd.sum(axis=1)  # sum of rows

    # methods for boolean arrays
    (rnd > 0).sum()  # counts number of positive values
    (rnd > 0).any()  # checks if any value is True
    (rnd > 0).all()  # checks if all values are True

    # random numbers
    np.random.seed(12234)  # Set the seed
    np.random.rand(2, 3)  # 2 x 3 matrix of uniform values in [0, 1)
    np.random.randn(10)  # random normals (mean 0, sd 1)
    np.random.randint(0, 2, 10)  # 10 randomly picked 0 or 1

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    array([0, 0, 0, 1, 1, 0, 1, 1, 1, 1])

.. GENERATED FROM PYTHON SOURCE LINES 316-340

Broadcasting
------------

Sources: https://docs.scipy.org/doc/numpy-1.13.0/user/basics.broadcasting.html

Implicit conversion to allow operations on arrays of different sizes.

- The smaller array is stretched or “broadcasted” across the larger array so that they have compatible shapes.
- Fast vectorized operation in C instead of Python.
- No needless copies.

Rules
~~~~~

Starting with the trailing axis and working backward, NumPy compares the dimensions of the arrays.

- If two dimensions are equal, it continues.
- If one of the operands has dimension 1, it is stretched to match the larger one.
- If the two dimensions differ and neither is 1, the shapes are incompatible and a ``ValueError`` is raised.
- When one of the shapes runs out of dimensions (because it has fewer dimensions than the other shape), NumPy uses 1 in the comparison process until the other shape's dimensions run out as well.

..
.. figure:: ../images/numpy_broadcasting.png

    Source: http://www.scipy-lectures.org

.. GENERATED FROM PYTHON SOURCE LINES 340-351

.. code-block:: Python

    a = np.array([[ 0,  0,  0],
                  [10, 10, 10],
                  [20, 20, 20],
                  [30, 30, 30]])

    b = np.array([0, 1, 2])

    print(a + b)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[ 0  1  2]
     [10 11 12]
     [20 21 22]
     [30 31 32]]

.. GENERATED FROM PYTHON SOURCE LINES 352-353

Center data column-wise

.. GENERATED FROM PYTHON SOURCE LINES 353-356

.. code-block:: Python

    a - a.mean(axis=0)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    array([[-15., -15., -15.],
           [ -5.,  -5.,  -5.],
           [  5.,   5.,   5.],
           [ 15.,  15.,  15.]])

.. GENERATED FROM PYTHON SOURCE LINES 357-358

Scale (center, normalise) data column-wise

.. GENERATED FROM PYTHON SOURCE LINES 358-363

.. code-block:: Python

    (a - a.mean(axis=0)) / a.std(axis=0)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    array([[-1.34164079, -1.34164079, -1.34164079],
           [-0.4472136 , -0.4472136 , -0.4472136 ],
           [ 0.4472136 ,  0.4472136 ,  0.4472136 ],
           [ 1.34164079,  1.34164079,  1.34164079]])

.. GENERATED FROM PYTHON SOURCE LINES 364-387

Examples

Shapes of operands A, B and result:

::

    A      (2d array):  5 x 4
    B      (1d array):      1
    Result (2d array):  5 x 4

    A      (2d array):  5 x 4
    B      (1d array):      4
    Result (2d array):  5 x 4

    A      (3d array):  15 x 3 x 5
    B      (3d array):  15 x 1 x 5
    Result (3d array):  15 x 3 x 5

    A      (3d array):  15 x 3 x 5
    B      (2d array):       3 x 5
    Result (3d array):  15 x 3 x 5

    A      (3d array):  15 x 3 x 5
    B      (2d array):       3 x 1
    Result (3d array):  15 x 3 x 5

.. GENERATED FROM PYTHON SOURCE LINES 391-394

Exercises
---------

Given the array:

.. GENERATED FROM PYTHON SOURCE LINES 394-397

.. code-block:: Python

    X = np.random.randn(4, 2)  # random normals in 4x2 array

.. GENERATED FROM PYTHON SOURCE LINES 398-401

- For each column, find the row index of the minimum value.
- Write a function ``standardize(X)`` that returns an array whose columns are centered and scaled (by std-dev).

..
.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 6.367 seconds)


.. _sphx_glr_download_auto_gallery_data_numpy.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: data_numpy.ipynb `

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: data_numpy.py `

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: data_numpy.zip `


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_
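
As a starting point for the exercises in this tutorial, here is a minimal sketch, not a reference solution. It assumes the population standard deviation (NumPy's default, ``ddof=0``) and that no column is constant. The helper name ``argmin_rows`` and the seed value are hypothetical choices made for illustration.

.. code-block:: Python

    import numpy as np


    def argmin_rows(X):
        """Return, for each column of X, the row index of its minimum value."""
        return np.asarray(X).argmin(axis=0)


    def standardize(X):
        """Center each column of X and scale it by its standard deviation.

        Assumption: no column is constant, otherwise the division by zero
        produces NaN or inf.
        """
        X = np.asarray(X, dtype=float)
        # X has shape (n, p); the mean and std have shape (p,), so
        # broadcasting stretches them across the n rows.
        return (X - X.mean(axis=0)) / X.std(axis=0)


    np.random.seed(42)  # arbitrary seed, only to make the run repeatable
    X = np.random.randn(4, 2)  # random normals in 4x2 array
    print(argmin_rows(X))
    Z = standardize(X)
    print(Z.mean(axis=0))  # close to 0 for each column
    print(Z.std(axis=0))  # close to 1 for each column

Both functions rely on ``axis=0`` to reduce over rows, so each result has one entry per column, and the subtraction and division then broadcast those per-column values back across all rows.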