MACHINE LEARNING AND STATISTICS WORLD

Posts

K-MEANS CLUSTERING

June 22, 2021

K-Means clustering is an unsupervised, centroid-based algorithm. It seeks to minimize the distance between the points in a cluster and the cluster centroid. The dataset I used is the seeds dataset from: https://archive.ics.uci.edu/ml/datasets/seeds

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt

    # loading dataset
    df = pd.read_csv('seeds_dataset.csv')
    df.head()

    # taking compactness and perimeter columns
    z = df.iloc[:, [2, 3]].values

    # applying elbow method to find the optimal number of clusters
    from sklearn.cluster import KMeans
    elbow_list = []
    for i in range(1, 11):
        kmeans = KMeans(n_clusters=i, init='k-means++', random_state=42)
        kmeans.fit(z)
        # inertia_ is the within-cluster sum of squared distances
        elbow_list.append(kmeans.inertia_)

    plt.plot(range(1, 11), elbow_list)
    plt.title('The Elbow Method Graph')
    plt.xlabel('Number of clusters(k)')
    plt.ylabel('elbow_list')
    ...
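To make the quantity plotted in the elbow graph concrete, here is a minimal sketch of the assign-and-update loop (Lloyd's algorithm) in plain Python. It is not what scikit-learn runs internally: the toy points are made up for illustration, the function names are hypothetical, and the starting centroids are picked at random rather than with k-means++.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
        for p in points
    ]


def kmeans(points, k, n_iter=100, seed=42):
    """Plain Lloyd's algorithm; returns (centroids, labels, inertia)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random start, NOT k-means++
    for _ in range(n_iter):
        labels = assign(points, centroids)
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # move the centroid to the mean of its members
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                # keep the old centroid if a cluster ends up empty
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    labels = assign(points, centroids)
    # inertia: sum of squared distances from each point to its centroid
    inertia = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, inertia


# Made-up toy data: two small, well-separated groups of points.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]

inertia_by_k = {k: kmeans(points, k)[2] for k in (1, 2, 3)}
for k, value in inertia_by_k.items():
    print(k, round(value, 4))
```

The inertia for k = 1 is the total sum of squares around the overall mean, so splitting the data into more clusters can only bring it down. The elbow is the k after which the drop becomes small.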

SIMULATION OF AUTOREGRESSIVE PROCESS AR(2) IN R

June 19, 2021

Here is the code in R for the AR(2) process:

    set.seed(2017)
    X.ts <- arima.sim(list(ar = c(.7, .2)), n=1000)
    par(mfrow=c(2,1))
    plot(X.ts, main="AR(2) Time Series, phi1=.7, phi2=.2")
    X.acf = acf(X.ts, main="Autocorrelation of AR(2) Time Series")
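For comparison with the sample ACF plot, the theoretical autocorrelations of a stationary AR(2) process follow from the Yule-Walker equations: rho(1) = phi1 / (1 - phi2) and rho(k) = phi1 * rho(k-1) + phi2 * rho(k-2) for k >= 2. Below is a small Python sketch of that recursion, together with the usual stationarity conditions. The function names are my own and not part of any library.

```python
def is_stationary_ar2(phi1, phi2):
    """Stationarity conditions for an AR(2) process."""
    return (phi1 + phi2 < 1) and (phi2 - phi1 < 1) and (abs(phi2) < 1)


def ar2_theoretical_acf(phi1, phi2, max_lag):
    """Theoretical ACF of a stationary AR(2) via the Yule-Walker recursion."""
    rho = [1.0, phi1 / (1.0 - phi2)]
    for k in range(2, max_lag + 1):
        rho.append(phi1 * rho[k - 1] + phi2 * rho[k - 2])
    return rho[: max_lag + 1]


# Same coefficients as in the R simulation above.
phi1, phi2 = 0.7, 0.2
print(is_stationary_ar2(phi1, phi2))
acf_values = ar2_theoretical_acf(phi1, phi2, 10)
for lag, value in enumerate(acf_values):
    print(lag, round(value, 4))
```

With phi1 = .7 and phi2 = .2 the autocorrelations start at 0.875 for lag 1 and decay slowly, which is the pattern to look for in the simulated correlogram.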

SIMULATION OF AUTOREGRESSIVE PROCESS AR(1) IN R

June 19, 2021

Here is the code in R for AR(1):

    set.seed(20190)
    n = 10000
    phi = .6
    Z = rnorm(n, 0, 1)
    X = NULL
    X[1] = Z[1]
    for (t in 2:n) {
      X[t] = Z[t] + phi*X[t-1]
    }
    X.ts = ts(X)
    par(mfrow=c(2,1))
    plot(X.ts, main="AR(1) Time Series on White Noise, phi=.6")
    X.acf = acf(X.ts, main="AR(1) Time Series on White Noise, phi=.6")
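The same recursion can be sketched in Python to check a known property of AR(1): the autocorrelation at lag k is phi^k. This is only a rough translation of the R code above. The helper names are hypothetical, and Python's random number generator will not reproduce the R draws even with the same seed value.

```python
import random


def simulate_ar1(phi, n, seed=20190):
    """Simulate X[t] = Z[t] + phi * X[t-1] with standard normal noise Z."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1)]
    for _ in range(1, n):
        x.append(rng.gauss(0, 1) + phi * x[-1])
    return x


def sample_acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]


phi = 0.6
x = simulate_ar1(phi, 10000)
acf = sample_acf(x, 5)
for k, r in enumerate(acf):
    # sample value next to the theoretical value phi**k
    print(k, round(r, 3), round(phi ** k, 3))
```

The sample values should land close to 0.6, 0.36, 0.216 and so on, with small deviations that depend on the random draws.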

SIMULATION OF MOVING AVERAGE PROCESS IN R

June 19, 2021

Here is the code in R for the simulation of a moving average process:

    # simulating MA(3) process
    noise = rnorm(10000)
    ma3 = NULL
    for (i in 4:10000) {
      ma3[i] = noise[i] + 0.8*noise[i-1] + 0.5*noise[i-2] + 0.3*noise[i-3]
    }
    # drop the first three entries, which are undefined (NA)
    moving_average = ma3[4:10000]

    # changing the series into time series
    moving_average = ts(moving_average)
    par(mfrow=c(2,1))
    plot(moving_average, col='blue')
    acf(moving_average)

Conclusion: In the autocorrelation graph, the ACF cuts off after lag 3, which is what we expect from an MA(3) process.
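The cut-off can also be checked by hand. For an MA(q) process with coefficients theta_0 = 1, theta_1, ..., theta_q, the autocovariance at lag k is proportional to the sum of theta_i * theta_(i+k), which is empty once k exceeds q. Here is a short Python sketch of that calculation for the coefficients used above. The function name is my own.

```python
def ma_theoretical_acf(theta, max_lag):
    """Theoretical ACF of X[t] = Z[t] + theta[0]*Z[t-1] + theta[1]*Z[t-2] + ..."""
    psi = [1.0] + list(theta)  # weight 1 on the current noise term
    gamma0 = sum(c * c for c in psi)
    acf = []
    for k in range(max_lag + 1):
        # overlapping products; the range is empty when k > q, so the sum is 0
        gamma_k = sum(psi[i] * psi[i + k] for i in range(len(psi) - k))
        acf.append(gamma_k / gamma0)
    return acf


# Coefficients of the MA(3) process simulated above.
acf = ma_theoretical_acf([0.8, 0.5, 0.3], 6)
for lag, value in enumerate(acf):
    print(lag, round(value, 4))
```

Lags 1 to 3 come out as roughly 0.68, 0.37 and 0.15, and every lag from 4 onward is exactly zero.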

SIMULATION OF A RANDOM WALK IN R

June 19, 2021

Here is the code in R for the simulation of a random walk:

    x = NULL
    x[1] = 0
    for (i in 2:10000) {
      x[i] = x[i-1] + rnorm(1)
    }
    print(x)

    # converting it into a time series data
    random_walk = ts(x)
    plot(random_walk, main='visualization of a random walk', xlab='days', ylab=' ')
    acf(random_walk)

The correlogram shows high correlation that dies out very slowly, which tells us the random walk is a non-stationary process.

    # making the series stationary by differencing the values
    z <- diff(random_walk)
    plot(z)    # we get white noise
    acf(z)

Conclusion: After differencing, there is no significant autocorrelation at any non-zero lag. Thus we obtained a stationary series by differencing the time series.
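The effect of differencing can be put into numbers with a rough Python version of the same experiment. The sketch below compares the lag-1 autocorrelation of the walk with that of its first difference. The names are hypothetical, and the seed is arbitrary, so the draws will not match the R run.

```python
import random


def lag1_autocorr(series):
    """Sample autocorrelation at lag 1."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(d * d for d in dev)
    return sum(dev[t] * dev[t + 1] for t in range(n - 1)) / c0


rng = random.Random(0)  # arbitrary seed
steps = [rng.gauss(0, 1) for _ in range(9999)]

# random walk: start at 0 and add one standard normal step at a time
walk = [0.0]
for s in steps:
    walk.append(walk[-1] + s)

# first difference, the counterpart of diff() in R
diffed = [b - a for a, b in zip(walk, walk[1:])]

print("lag-1 autocorrelation of the walk:      ", round(lag1_autocorr(walk), 4))
print("lag-1 autocorrelation of the difference:", round(lag1_autocorr(diffed), 4))
```

Differencing simply recovers the white noise steps that built the walk, so the first value should be close to 1 and the second close to 0.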

ESTIMATION OF PI USING MONTE CARLO METHOD USING PYTHON

June 17, 2021

The value of pi is estimated with the Monte Carlo method by taking a square with sides of 1 unit and inscribing a circle in it. The radius of the circle is 0.5 units. The ratio of the area of the circle to the area of the square is pi/4, so multiplying this ratio by 4 gives us pi. The code below samples the quarter of a unit circle that lies inside the square [0,1] x [0,1], which has the same area ratio of pi/4.

Python code:

    import random

    n = 1000000
    c_points = 0   # points inside circle
    s_points = 0   # points inside square

    for i in range(n):
        x = random.uniform(0, 1)
        y = random.uniform(0, 1)
        d = x**2 + y**2
        if d <= 1:
            c_points += 1
        s_points += 1

    pi = 4*(c_points/s_points)
    print("pi value is:", pi)

Conclusion: The higher the value of n, the more accurate the estimate of pi tends to be.
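To see how the accuracy depends on n, the estimate can be wrapped in a function and run for a few sample sizes. This is only a sketch: the function name and the seed are my own choices, and the exact errors will change with the seed. Since each point is a hit or a miss with probability pi/4, the standard error of the estimate is about 1.64/sqrt(n), so 100 times more points buys roughly one more correct digit.

```python
import math
import random


def estimate_pi(n, seed=0):
    """Monte Carlo estimate of pi from n uniform points in the unit square."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n):
        x = rng.random()
        y = rng.random()
        if x * x + y * y <= 1.0:  # point falls inside the quarter circle
            inside += 1
    return 4 * inside / n


for n in (100, 10_000, 100_000):
    estimate = estimate_pi(n)
    print(n, estimate, "abs error:", round(abs(estimate - math.pi), 5))
```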
