**Why Python?**

Python has become one of the most widely used programming languages in recent years. Its main advantages are a wide range of libraries, a simple and dynamic design, a free and open-source license, clear syntax, ease of use, and the ability to interoperate with many other languages and platforms. Python is used in many domains, such as web development, data analysis, game development, machine learning and artificial intelligence, and web scraping. Its strong data-handling libraries make it one of the most popular languages for machine learning.

A Python library is a reusable collection of code that you can include in your programs or projects. Let us look at some of the most used Python libraries for machine learning.

*SCIENTIFIC COMPUTING:*

- Pandas
- NumPy
- SciPy

*ALGORITHMIC LIBRARIES:*

- Scikit-learn
- Statsmodels

*DATA VISUALIZATION:*

- Matplotlib
- Seaborn

**1-Pandas:**

Pandas is a Python library used for data analysis and data manipulation. It allows merging and filtering of data, as well as loading it from external sources such as Excel. It provides fast, expressive, and flexible data structures for working with structured and time-series data. Pandas provides two main data structures: Series (one-dimensional) and DataFrame (two-dimensional). Older releases also offered Panel (three-dimensional), which has since been removed. The key data structure is the DataFrame, in which data is aligned in rows and columns.

To import pandas library: **import pandas**

To create data structures in pandas:

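- pandas.Series(data, index, dtype, copy)
- pandas.DataFrame(data, index, columns, dtype, copy)
- pandas.Panel(data, items, major_axis, minor_axis, dtype, copy) (older pandas releases only; Panel has since been removed)

The main parameters are:

- data – It can be an ndarray, list, dict, etc.
- index – The length of the index should be the same as the length of the data. Index values are usually unique, although pandas does not enforce this.
- copy (optional) – Whether to copy the input data. The default is False for Series; for DataFrame the default depends on the type of data.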
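As a minimal sketch of the Series constructor described above, the snippet below passes `data`, `index`, and `dtype` explicitly. The fruit counts and prices are hypothetical values chosen only for illustration.

```python
import pandas as pd

# Hypothetical data: fruit counts labelled by a custom string index.
counts = pd.Series([24, 34, 28], index=["apple", "orange", "mango"], dtype="int64")

# Lookups use the index label rather than the position.
orange_count = counts["orange"]

# A dict also works as data; its keys become the index.
prices = pd.Series({"apple": 0.5, "orange": 0.8})

print(counts)
print(orange_count)
print(prices.index.tolist())
```

Note that the index passed to the constructor has the same length as the data, as the parameter description requires.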

*Creation of DataFrame:*

import pandas as pd

data = [['apple', 24], ['orange', 34], ['mango', 28]]

df = pd.DataFrame(data, columns=['fruit', 'count'])

print(df)

Output:

fruit count

0 apple 24

1 orange 34

2 mango 28
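The merging and filtering that pandas supports can be sketched with the same fruit table. The `prices` table, its values, and the threshold of 25 are assumptions made up for this example.

```python
import pandas as pd

# The fruit table from the example above.
df = pd.DataFrame(
    [["apple", 24], ["orange", 34], ["mango", 28]],
    columns=["fruit", "count"],
)

# Hypothetical second table holding a price per fruit.
prices = pd.DataFrame(
    {"fruit": ["apple", "orange", "mango"], "price": [0.5, 0.8, 1.2]}
)

# Merge the two tables on their shared "fruit" column.
merged = df.merge(prices, on="fruit")

# Filter rows with a boolean condition on a column.
popular = merged[merged["count"] > 25]

print(merged)
print(popular)
```

Data stored in Excel could be loaded into a DataFrame with `pd.read_excel`, which needs an additional engine package such as openpyxl to be installed.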

**2-NumPy:**

NumPy stands for Numerical Python. It is a Python library for working with large multidimensional arrays and matrices. Operations on NumPy arrays are typically much faster than the equivalent operations on Python lists. NumPy's array class is called ndarray (n-dimensional array). Its features include mathematical and logical operations on arrays, Fourier transforms, routines for shape manipulation, and built-in functions for linear algebra and random number generation.

To import NumPy library: **import numpy**

NumPy array creation: numpy.array(data, ndmin)

ndmin = the minimum number of dimensions the resulting array should have
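A short sketch of how `ndmin` affects the result: when the input has fewer dimensions than requested, NumPy prepends axes of length one; when it already has enough, `ndmin` has no effect. The variable names are illustrative only.

```python
import numpy as np

# Without ndmin, a flat list becomes a one-dimensional array.
flat = np.array([1, 2, 3, 4])

# With ndmin=2, an extra leading axis is added.
padded = np.array([1, 2, 3, 4], ndmin=2)

# Input that is already two-dimensional is left unchanged by ndmin=1.
matrix = np.array([[1, 2], [3, 4]], ndmin=1)

print(flat.shape)    # (4,)
print(padded.shape)  # (1, 4)
print(matrix.ndim)   # 2
```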

Code:

import numpy as np

arr = np.array([1,2,3,4])

print(arr)

Output:

[1 2 3 4]
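The features listed earlier (element-wise operations, shape manipulation, linear algebra, Fourier transforms, and random numbers) might be exercised as in the sketch below. The seed value and variable names are arbitrary choices for the example.

```python
import numpy as np

arr = np.array([1, 2, 3, 4])

# Mathematical and logical operations apply element-wise.
doubled = arr * 2   # [2 4 6 8]
mask = arr > 2      # [False False  True  True]

# Shape manipulation: view the same four values as a 2x2 matrix.
grid = arr.reshape(2, 2)

# Linear algebra: determinant of [[1, 2], [3, 4]] is 1*4 - 2*3 = -2.
det = np.linalg.det(grid)

# Fourier transform: the first coefficient equals the sum of the inputs (10).
spectrum = np.fft.fft(arr)

# Random number generation with a seeded generator for reproducibility.
rng = np.random.default_rng(seed=0)
samples = rng.random(3)

print(doubled, mask, grid, det, spectrum, samples, sep="\n")
```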

**3-SciPy:**

SciPy is an open-source Python library used for mathematics, scientific computing, and technical computing. It contains a variety of sub-packages that help solve the most common problems in scientific computation.

To import the SciPy library: **import scipy**

Some of the sub-packages of SciPy:

- File input/output – scipy.io
- Special Function – scipy.special
- Statistics and random numbers – scipy.stats
- Optimization and fit – scipy.optimize
- Image manipulation – scipy.ndimage
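A minimal sketch touching three of the sub-packages listed above. The quadratic objective and the sample data are hypothetical and were picked so the expected results are easy to check by hand.

```python
from scipy import optimize, special, stats

# scipy.optimize: minimize f(x) = (x - 2)^2, whose minimum is at x = 2.
result = optimize.minimize_scalar(lambda x: (x - 2) ** 2)

# scipy.special: the gamma function satisfies gamma(5) = 4! = 24.
gamma_5 = special.gamma(5)

# scipy.stats: summary statistics for a small sample.
desc = stats.describe([1, 2, 3, 4, 5])

print(result.x)       # close to 2
print(gamma_5)        # 24.0
print(desc.mean)      # 3.0
print(desc.variance)  # 2.5 (sample variance)
```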

**4-Scikit-learn:**

Scikit-learn, imported as sklearn, is a Python library with many efficient tools for machine learning and statistical modeling, including classification, regression, clustering, and dimensionality reduction. It features various algorithms such as support vector machines, random forests, and k-nearest neighbors.

Sample implementation:

from sklearn import linear_model

reg = linear_model.LinearRegression()

reg.fit([[0, 0], [1, 1], [2, 2]], [0, 1, 2])

Output:

LinearRegression()
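After fitting, the model can be inspected and used for prediction. The sketch below repeats the fit from the sample and then reads the learned coefficients; the query point `[3, 3]` is an assumption added for illustration.

```python
from sklearn import linear_model

reg = linear_model.LinearRegression()
reg.fit([[0, 0], [1, 1], [2, 2]], [0, 1, 2])

# Learned weights for the two input features (about 0.5 each here).
print(reg.coef_)

# Learned intercept (very close to 0 for this data).
print(reg.intercept_)

# Predict the target for a new, unseen point (about 3 here).
prediction = reg.predict([[3, 3]])
print(prediction)
```

Because the two input columns are identical in this toy data, the weight is split evenly between them.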

**5-Statsmodels:**

Statsmodels is a Python package that allows users to explore data, estimate statistical models, and perform statistical tests.

*Fitting a model in statsmodels typically involves 3 easy steps:*

- Use the model class to describe the model
- Fit the model using a class method
- Inspect the results using a summary method.

**6-Matplotlib:**

Matplotlib is a comprehensive plotting library for creating static, animated, and interactive visualizations in Python. It comes with a wide variety of plots, which help to understand trends and patterns and to spot correlations. Matplotlib, along with packages like SciPy (Scientific Python) and NumPy, is widely used as a replacement for MATLAB, a popular platform for technical computing.

Code:

import matplotlib.pyplot as plt

plt.plot([1, 2, 3, 4], [1, 2, 3, 4], 'ro-')

plt.axis([0, 10, 0, 10])

plt.xlabel("x axis")

plt.ylabel("y axis")

plt.show()

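**Output:** a red line with circular markers through the points (1, 1) to (4, 4), drawn on axes that run from 0 to 10.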

**7-Seaborn:**

Seaborn offers a dataset-oriented API for examining relationships between multiple variables. It provides specialized support for using categorical variables to show observations or aggregate statistics, and a high-level interface for drawing attractive and informative statistical graphics. Distplot stands for distribution plot: it takes an array as input and plots a curve corresponding to the distribution of points in the array. Note that distplot is deprecated in recent seaborn releases in favor of histplot and displot.

Code:

import seaborn as sns

import matplotlib.pyplot as plt

sns.distplot([0, 1, 2, 3, 4, 5])

plt.show()

*Output:* a histogram of the six values with a smoothed density curve drawn over it.
