If you want autocompletion in Jupyter, install the Tabnine extension:
pip3 install jupyter-tabnine --user
jupyter nbextension install --py jupyter_tabnine --user
jupyter nbextension enable --py jupyter_tabnine --user
jupyter serverextension enable --py jupyter_tabnine --user
Load CSV File With NumPy
import numpy
filename = 'xxx.csv'
raw_data = open(filename, 'rt')
data = numpy.loadtxt(raw_data, delimiter=",")
print(data.shape)
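Because 'xxx.csv' above is only a placeholder, the same call can be tried without a file on disk. The sketch below is an illustration under assumptions: the CSV text is made up, and io.StringIO stands in for the opened file.

```python
import io
import numpy

# Made-up CSV content used only for illustration (not from a real dataset).
csv_text = "1,2,3\n4,5,6\n"

# numpy.loadtxt accepts a file-like object, so StringIO can replace open().
with io.StringIO(csv_text) as raw_data:
    data = numpy.loadtxt(raw_data, delimiter=",")

print(data.shape)
```

With a real file, the same "with" pattern around open(filename, 'rt') closes the file automatically once loading is done.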
Load CSV using Pandas
import pandas
filename = 'xxx.csv'
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']  # column names
data = pandas.read_csv(filename, names=names)
print(data.shape)
Convert a one-dimensional list of data to an array
# one dimensional example
from numpy import array
# list of data
data = [11, 22, 33, 44, 55]
# array of data
data = array(data)
print(data)
print(type(data))
Convert a two-dimensional list of data to an array
# two dimensional example
from numpy import array
# list of data
data = [[11, 22],
[33, 44],
[55, 66]]
# array of data
data = array(data)
print(data)
print(type(data))
Array indexing
#for one dimensional:
# simple indexing
from numpy import array
# define array
data = array([11, 22, 33, 44, 55])
# index data
print(data[0])
print(data[4])
print(data[-2])
#data slicing
print(data[:])
print(data[0:1])
print(data[-2:])
11
55
44
[11 22 33 44 55]
[11]
[44 55]
#Two dimensional indexing
from numpy import array
# define array
data = array([[11, 22], [33, 44], [55, 66]])
# index data
print(data[0,0])
print(data[0,])
# data slicing
# separate data
data = array([[11, 22, 33],
[44, 55, 66],
[77, 88, 99]])
X, y = data[:, :-1], data[:, -1]
print(X, y)
11
[11 22]
[[11 22]
 [44 55]
 [77 88]] [33 66 99]
To train the model, split the data into train and test rows
The dataset is divided into two parts: the first is used to train the model and the second to test the accuracy of the trained model. Split the rows at a chosen point and keep all columns by specifying ':' in the second dimension index. The training dataset is all rows from the beginning up to the split point, and the test dataset is all rows from the split point to the end.
# split train and test
from numpy import array
# define array
data = array([[11, 22, 33],
[44, 55, 66],
[77, 88, 99],
[12, 14, 44],
[17,15, 18],
[45, 23, 22]])
# separate data
split = int(len(data) * 0.8)  # 80% of the rows (rounded down) go to training, the rest to testing
train,test = data[:split,:],data[split:,:]
print(train)
print(test)
split
[[11 22 33]
[44 55 66]
[77 88 99]
[12 14 44]]
[[17 15 18]
[45 23 22]]
4
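The split above can be wrapped in a small helper. This is a sketch, not a standard API: the name split_rows and the train_fraction parameter are hypothetical, and it assumes the rows are already in the order you want (no shuffling).

```python
from numpy import array


def split_rows(data, train_fraction=0.8):
    """Split a 2D array into train and test rows (hypothetical helper)."""
    # int() truncates, so 6 rows * 0.8 = 4.8 gives a split point of 4.
    split = int(len(data) * train_fraction)
    return data[:split, :], data[split:, :]


data = array([[11, 22, 33],
              [44, 55, 66],
              [77, 88, 99],
              [12, 14, 44],
              [17, 15, 18],
              [45, 23, 22]])

train, test = split_rows(data)
print(train.shape)  # (4, 3)
print(test.shape)   # (2, 3)
```

Note that with only six rows, truncation means four rows (about 67%) end up in the training set, not exactly 80%.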
Array reshaping:
After slicing, you may need to reshape the data.
The shape attribute of an array is a tuple holding the size of each dimension. You can read individual sizes from it and use them when specifying a new shape.
# array shape
from numpy import array
# list of data
data = [[11, 22],
[33, 44],
[55, 66]]
# array of data
data = array(data)
print(data.shape)
print('Rows: %d' % data.shape[0])
print('Cols: %d' % data.shape[1])
(3, 2)
Rows: 3
Cols: 2
Reshape 1D to 2D Array
It's common to need to reshape a 1D array into a 2D array.
When reshaping a one-dimensional array into a two-dimensional array with one column, the new shape is a tuple with the length of the array (data.shape[0]) as the first dimension and 1 as the second dimension.
from numpy import array
# define array
data = array([11, 22, 33, 44, 55])
print(data.shape)
# reshape
data = data.reshape((data.shape[0], 1))
print(data.shape)
(5,)
(5, 1)
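As an alternative sketch, reshape also accepts -1 for one dimension and infers its size from the number of elements, so the row count does not have to be written out. The variable names column and explicit are illustrative.

```python
from numpy import array

data = array([11, 22, 33, 44, 55])

# -1 asks NumPy to infer this dimension from the number of elements.
column = data.reshape(-1, 1)
print(column.shape)  # (5, 1)

# Same result as passing the length explicitly.
explicit = data.reshape((data.shape[0], 1))
print((column == explicit).all())  # True
```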
Reshape 2D to 3D array
We can use the sizes in the array's shape attribute to specify the number of samples (rows) and time steps (columns), and fix the number of features at 1.
from numpy import array
# list of data
data = [[11, 22],
[33, 44],
[55, 66]]
# array of data
data = array(data)
print(data.shape)
# reshape
data = data.reshape((data.shape[0], data.shape[1], 1))
print(data.shape)
(3, 2)
(3, 2, 1)
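To see that reshaping only changes how the values are indexed and not the values themselves, the sketch below checks an element before and after. The variable names data_2d, data_3d and restored are illustrative.

```python
from numpy import array

data_2d = array([[11, 22],
                 [33, 44],
                 [55, 66]])

# Add a trailing axis of size 1: (samples, time steps) -> (samples, time steps, features).
data_3d = data_2d.reshape((data_2d.shape[0], data_2d.shape[1], 1))

print(data_3d.shape)     # (3, 2, 1)
print(data_3d[1, 0, 0])  # 33

# Dropping the trailing axis recovers the original array.
restored = data_3d.reshape(data_2d.shape)
print((restored == data_2d).all())  # True
```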
Have a nice day!