Learn Machine Learning With Python

Python Libraries for Machine Learning: NumPy

Introduction

So far, you have set up a machine learning environment using Anaconda.

Python supports machine learning through a range of libraries. From this chapter onwards, we will explore and study them one by one.

We will start with NumPy or Numerical Python.

What is Python NumPy?

Numeric, the ancestor of NumPy, was developed by Jim Hugunin. Another package, Numarray, was also developed and offered some additional functionality. In 2005, Travis Oliphant created the NumPy package by incorporating the features of Numarray into the Numeric package. Many contributors now work on this open-source project.

NumPy, short for Numerical Python, is a Python library that provides the following:

a powerful N-dimensional array object
sophisticated (broadcasting) functions
tools for integrating C/C++ and Fortran code
useful linear algebra, Fourier Transform, and random number capabilities.

NumPy can also be used as an efficient multi-dimensional container of generic data, and arbitrary data types can be defined. The official website is www.numpy.org
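As a rough illustration of the features listed above, the sketch below builds a small N-dimensional array, broadcasts a one-dimensional row across it, and calls one linear algebra routine. The variable names (grid, row, shifted, m) are only illustrative.

```python
import numpy as np

# A 2x3 array: two rows, three columns.
grid = np.array([[1, 2, 3], [4, 5, 6]])

# Broadcasting: the 1-D row is stretched across both rows of grid.
row = np.array([10, 20, 30])
shifted = grid + row
print(shifted)

# Linear algebra: determinant of a small diagonal matrix.
m = np.array([[2.0, 0.0], [0.0, 3.0]])
print(np.linalg.det(m))
```

Broadcasting works here because the trailing dimension of grid (3) matches the length of row.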

Installing NumPy in Python

1. Ubuntu/ Linux

sudo apt update -y
sudo apt upgrade -y
sudo apt install python3-tk python3-pip -y
pip3 install numpy

2. Anaconda

conda install -c anaconda numpy

NumPy Array

A NumPy array is an N-dimensional grid of elements that all share one data type. A two-dimensional array is laid out in rows and columns. We can initialize NumPy arrays from nested Python lists and access their elements by index.

A NumPy array is not the same as the standard library class array.array, which handles only one-dimensional arrays.

Single Dimensional NumPy Array

import numpy as np
a = np.array([1,2,3])
print(a)

Multi-Dimensional arrays

import numpy as np
a = np.array([[1,2,3],[4,5,6]])
print(a)

The above code will result in
[[1 2 3]
 [4 5 6]]
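Accessing elements of an array created from a nested list can be sketched as follows. Indices start at 0, and a comma separates the row index from the column index. The names first_row, item and second_col are illustrative only.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

first_row = a[0]       # the whole first row
item = a[1, 2]         # row 1, column 2
second_col = a[:, 1]   # every row, column 1

print(first_row)    # [1 2 3]
print(item)         # 6
print(second_col)   # [2 5]
```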

NumPy Array Attributes

ndarray.ndim

It returns the number of axes (dimensions) of the array.

import numpy as np
a = np.array([[1,2,3],[4,5,6]])
print(a.ndim)

The output of the above code will be 2, since 'a' is a 2D array

ndarray.shape

It returns a tuple giving the size of the array along each dimension. For a 2D array this is (n, m), where n is the number of rows and m is the number of columns.

import numpy as np
a = np.array([[1,2,3],[4,5,6]])
print(a.shape)

The output of the above code will be (2, 3), i.e. 2 rows and 3 columns

ndarray.size

It returns the total number of elements of the array.

import numpy as np
a = np.array([[1,2,3],[4,5,6]])
print(a.size)

The output of the above code will be 6 i.e. 2 x 3

ndarray.dtype

It returns an object describing the type of the elements in the array.

import numpy as np
a = np.array([[1,2,3],[4,5,6]])
print(a.dtype)

The output of the above code depends on the platform's default integer type. It is "int64" (64-bit integer) on most 64-bit systems, and "int32" (32-bit integer) on Windows with NumPy versions before 2.0.

We can explicitly define the data type of a NumPy array

import numpy as np
a = np.array([[1,2,3],[4,5,6]], dtype = float)
print(a.dtype)

The above code will return "float64" i.e. 64-bit float

ndarray.itemsize

It returns the size in bytes of each element of the array.

import numpy as np
a = np.array([[1,2,3],[4,5,6]])
print(a.itemsize)

The output of the above code will be 4 (i.e. 32/8) if the array's dtype is int32, or 8 (i.e. 64/8) if it is int64
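The link between dtype, itemsize and the total memory used can be checked with a short sketch. The dtype is set explicitly here so the numbers do not depend on the platform. The names a32 and a64 are illustrative, and nbytes is a standard ndarray attribute not covered above.

```python
import numpy as np

# Fix the dtype explicitly so the result is the same on every platform.
a32 = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)
a64 = a32.astype(np.int64)   # same values, wider integer type

print(a32.itemsize, a64.itemsize)   # 4 8
print(a32.size * a32.itemsize)      # 24
print(a32.nbytes)                   # 24
```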

ndarray.data

It returns the buffer containing the actual elements of the array. This attribute is rarely needed, because elements are normally accessed through indexing.

import numpy as np
a = np.array([[1,2,3],[4,5,6]])
print(a.data)

The above code will print a memoryview object such as <memory at 0x...>, not the list of elements

ndarray.sum()

This method returns the sum of all the elements of the ndarray

import numpy as np
a = np.random.random( (2,3) )
print(a)
print(a.sum())

The matrix generated for me was
[[0.46541517 0.66668157 0.36277909]
 [0.7115755  0.57306008 0.64267163]]

so for me the above code returned 3.422183052180838. Because the elements are random, you will not get the same output

ndarray.min()

This method returns the smallest element of the ndarray

import numpy as np
a = np.random.random( (2,3) )
print(a)
print(a.min())

For a generated matrix of
[[0.46541517 0.66668157 0.36277909]
 [0.7115755  0.57306008 0.64267163]]

the above code returns 0.36277909. Because the elements are random, you will not get the same output

ndarray.max()

This method returns the largest element of the ndarray

import numpy as np
a = np.random.random( (2,3) )
print(a)
print(a.max())

For a generated matrix of
[[0.46541517 0.66668157 0.36277909]
 [0.7115755  0.57306008 0.64267163]]

the above code returns 0.7115755. Because the elements are random, you will not get the same output
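If repeatable random numbers are needed, one option is a seeded generator. The sketch below assumes NumPy 1.17 or later, where np.random.default_rng is available, and the seed 42 is arbitrary. It also shows the axis argument of sum() and max() on a fixed array, which is easier to check by hand.

```python
import numpy as np

# Seeded generator: the same seed gives the same numbers on every run.
rng = np.random.default_rng(42)
r = rng.random((2, 3))
print(r.sum(), r.min(), r.max())

# A fixed array makes the axis behaviour easy to verify.
a = np.array([[1, 2, 3], [4, 5, 6]])
col_sums = a.sum(axis=0)   # add down each column -> 5, 7, 9
row_sums = a.sum(axis=1)   # add along each row   -> 6, 15
row_max = a.max(axis=1)    # largest in each row  -> 3, 6
print(col_sums, row_sums, row_max)
```

Without an axis argument, these methods reduce over every element and return a single value.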

NumPy Functions

1. type()

Syntax

type(array)

type() is a built-in Python function, not a NumPy function. It returns the type of the object passed to it. For a NumPy array, it returns numpy.ndarray

import numpy as np
a = np.array([[1,2,3],[4,5,6]])
print(type(a))

The above code will print <class 'numpy.ndarray'>

2. numpy.zeros()

Syntax

numpy.zeros((rows,columns), dtype)

The above function will create a numpy array of the given dimensions with each element being zero. If no dtype is given, the default dtype float64 is used

import numpy as np
a = np.zeros((3,3))
print(a)

The above code will result in a 3x3 numpy array with each element being zero.

3. numpy.ones()

Syntax

numpy.ones((rows,columns), dtype)

The above function will create a numpy array of the given dimensions with each element being one. If no dtype is given, the default dtype float64 is used.

import numpy as np
a = np.ones((3,3))
print(a)

The above code will result in a 3x3 numpy array with each element being one.

4. numpy.empty()

Syntax

numpy.empty((rows,columns))

The above function creates an array without initializing its elements. The initial content is arbitrary and depends on the state of the memory, so it should be overwritten before it is used.

import numpy as np
a = np.empty((3,3))
print(a)

The above code will result in a 3x3 numpy array whose elements are arbitrary, uninitialized values.
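The three creation functions can be compared side by side. This is a minimal sketch with illustrative variable names (z, o, e). It also shows the optional dtype argument and a typical way to fill an array created with numpy.empty() before reading from it.

```python
import numpy as np

z = np.zeros((2, 3))             # float64 by default
o = np.ones((2, 3), dtype=int)   # integer ones
e = np.empty((2, 3))             # contents are whatever was in memory

print(z.dtype, o.dtype, e.shape)

# Fill the uninitialized array before reading from it.
e[:] = 7.5
print(e)
```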

5. numpy.arange()

Syntax

numpy.arange(start, stop, step)

The above function is used to make a numpy array whose elements begin at the start value and increase by the step value, stopping before the stop value is reached. The stop value itself is never included.

import numpy as np
a=np.arange(5,25,4)
print(a)

The output of the above code will be [ 5  9 13 17 21]

6. numpy.linspace()

Syntax

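numpy.linspace(start, stop, num_of_elements)

The above function is used to make a numpy array of num_of_elements evenly spaced values between the start and stop value, with both endpoints included. The default dtype of the result is float64.

import numpy as np
a=np.linspace(5,25,5)
print(a)

The output of the above code will be [ 5. 10. 15. 20. 25.]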
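The difference between numpy.arange() and numpy.linspace() is easiest to see with the same start and stop values. In this sketch, by_step and by_count are illustrative names.

```python
import numpy as np

by_step = np.arange(5, 25, 5)      # step of 5, stop value 25 is excluded
by_count = np.linspace(5, 25, 5)   # 5 values, stop value 25 is included

print(by_step)    # [ 5 10 15 20]
print(by_count)   # [ 5. 10. 15. 20. 25.]
```

arange() is convenient when the spacing is known, linspace() when the number of points is known.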

7. numpy.logspace()

Syntax

numpy.logspace(start, stop, num_of_elements)

The above function is used to make a numpy array of num_of_elements values spaced evenly on a logarithmic scale. The exponents run evenly from start to stop, and each element is 10 raised to the corresponding exponent, so the array spans 10**start to 10**stop. The default dtype of the result is float64.

import numpy as np
a=np.logspace(5,25,5)
print(a)

The output of the above code will be [1.e+05 1.e+10 1.e+15 1.e+20 1.e+25]
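One way to check how numpy.logspace() relates to numpy.linspace() is to build the exponents separately and raise the base to them. The sketch below uses small exponents so the values are easy to read. The base keyword is part of numpy.logspace(), and the variable names are illustrative.

```python
import numpy as np

exponents = np.linspace(0, 3, 4)   # evenly spaced exponents 0, 1, 2, 3
powers = np.logspace(0, 3, 4)      # 10 raised to each exponent

print(powers)
print(np.allclose(powers, 10.0 ** exponents))   # True

# The base keyword switches from powers of 10 to another base.
powers_of_two = np.logspace(0, 3, 4, base=2)
print(powers_of_two)
```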

8. numpy.sin()

Syntax

numpy.sin(numpy.ndarray)

The above function returns the sine of each element of the given array. The input values are interpreted as radians.

import numpy as np
a=np.logspace(5,25,2)
print(np.sin(a))

The output of the above code will be [ 0.0357488 -0.3052578]

Similarly, there are cos(), tan(), etc.
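Because the trigonometric functions work in radians, angles given in degrees need converting first. A minimal sketch using numpy.deg2rad(), with illustrative variable names:

```python
import numpy as np

degrees = np.array([0, 30, 90])
radians = np.deg2rad(degrees)   # convert degrees to radians

sines = np.sin(radians)         # close to 0, 0.5 and 1
cosines = np.cos(radians)       # close to 1, 0.866 and 0
print(np.round(sines, 3))
print(np.round(cosines, 3))
```

The results are floating-point approximations, so compare them with numpy.isclose() or numpy.allclose() rather than ==.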

9. numpy.reshape()

Syntax

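numpy.reshape(array, new_shape)
array.reshape(new_shape)

The above function is used to change the shape of a numpy array without changing its data. The arguments passed to reshape give the size of each dimension of the result, and their product must equal the number of elements in the original array.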

import numpy as np
a=np.arange(9).reshape(3,3)
print(a)

The output of the above code will be a 2D array with 3x3 dimensions
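Two details of reshape are worth trying out: a dimension given as -1 is inferred from the others, and a shape that does not match the number of elements raises a ValueError. This is a small sketch, and reshape_failed is just an illustrative flag.

```python
import numpy as np

a = np.arange(12)

b = a.reshape(3, 4)    # 3 rows, 4 columns
c = a.reshape(2, -1)   # -1 lets NumPy work out the missing dimension
print(b.shape, c.shape)   # (3, 4) (2, 6)

# 12 elements cannot fill a 5x5 array, so this raises ValueError.
try:
    a.reshape(5, 5)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)   # True
```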

10. numpy.random.random()

Syntax

numpy.random.random( (rows, column) )

The above function returns a numpy ndarray with the given dimensions, with each element drawn at random from the interval [0.0, 1.0).

import numpy as np
a = np.random.random((2,2))

The above code will return a 2x2 ndarray

11. numpy.exp()

Syntax

numpy.exp(numpy.ndarray)

The above function returns an ndarray containing the exponential (e raised to the power x) of every element x

import numpy as np
b = np.exp([10])

The above code returns an array with one element, approximately 22026.4657948

12. numpy.sqrt()

Syntax

numpy.sqrt(numpy.ndarray)

The above function returns an ndarray containing the square root of every element

import numpy as np
b = np.sqrt([16])

The above code returns an array with one element, 4.0
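Both functions are applied to every element, which becomes clearer with a longer array. The sketch below also uses numpy.log(), the natural logarithm, to undo numpy.exp(). The variable names are illustrative.

```python
import numpy as np

x = np.array([1.0, 4.0, 9.0, 16.0])

roots = np.sqrt(x)           # square root of each element: 1, 2, 3, 4
growth = np.exp(x)           # e raised to each element
recovered = np.log(growth)   # natural log undoes exp

print(roots)
print(np.allclose(recovered, x))   # True
```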

NumPy Basic Operations

import numpy as np
a = np.array( [ 5, 10, 15, 20, 25] )
b = np.array( [ 0, 1, 2, 3, 4 ] )

1. The below code will return the elementwise difference between the two arrays. The arrays must have compatible shapes, here the same length

c = a - b

2. The below code will return the arrays containing the square of each element

b**2

3. The below code will return the value according to the given expression

10* np.sin(a)

4. The below code will return a boolean array that is True at every element position which satisfies the given condition and False elsewhere

a<15
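The four operations above can be run together as one short script. This sketch assumes both arrays have five elements, since elementwise arithmetic needs compatible shapes. It also uses the boolean result as a mask to select elements. The names diff, squares, scaled, mask and small are illustrative.

```python
import numpy as np

a = np.array([5, 10, 15, 20, 25])
b = np.array([0, 1, 2, 3, 4])   # same length as a (an assumption of this sketch)

diff = a - b              # elementwise difference: 5, 9, 13, 17, 21
squares = b ** 2          # elementwise square: 0, 1, 4, 9, 16
scaled = 10 * np.sin(a)   # expression applied to every element
mask = a < 15             # True, True, False, False, False
small = a[mask]           # keep only elements where the mask is True

print(diff, squares, mask, small)
```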

NumPy Array Basic Operations

a = np.array( [[1,1], [0,1]])
b = np.array( [[2,0],[3,4]])

1. The below code will return the elementwise product of both the arrays

a * b

2. Either of the below lines will return the matrix product of the two arrays

a @ b

a.dot(b)
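The difference between the two products is easy to verify by hand on these 2x2 arrays. In the sketch below, elementwise and matrix are illustrative names.

```python
import numpy as np

a = np.array([[1, 1], [0, 1]])
b = np.array([[2, 0], [3, 4]])

elementwise = a * b   # multiplies values at matching positions
matrix = a @ b        # rows of a combined with columns of b

print(elementwise)   # [[2 0]
                     #  [0 4]]
print(matrix)        # [[5 4]
                     #  [3 4]]
```

For example, the top-left entry of the matrix product is 1*2 + 1*3 = 5, while the elementwise product there is just 1*2 = 2.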

Conclusion

In this chapter, we studied Python NumPy. In the next chapter, we will learn about Python Pandas.

Python Pandas is an excellent library used mainly for data manipulation and analysis.

Author

Rohit Gupta
