Data science with Coconut

This is an attempt to rewrite an introductory data science and machine learning notebook in the Coconut language.

Linear Regression

Basic function

import numpy as np

warm_up_exercise = -> np.eye(5, dtype=int)

Run warm_up_exercise():

warm_up_exercise() |> print
[[1 0 0 0 0]
 [0 1 0 0 0]
 [0 0 1 0 0]
 [0 0 0 1 0]
 [0 0 0 0 1]]

Plotting

The dataset ex1data1.txt has only two columns: city population (first column) and profit (second column).

import numpy as np

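# np.loadtxt accepts a path directly, so no file handle is left open.
data = "data/ml/ex1data1.txt" |> np.loadtxt$(?, delimiter=",")
# Column 0 is the population (X), column 1 is the profit (y).
(X, y) = [0, 1] |> fmap$(i -> data[:, i])
# m is the number of training examples.
m = y |> len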
m
97
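
For readers new to Coconut's pipe and partial-application syntax, the loading step above can be read as the following plain Python. This is only a sketch: the in-memory sample and its values are made up for illustration and stand in for the real file.

```python
import io

import numpy as np

# Hypothetical three-row sample standing in for ex1data1.txt;
# these values are invented for illustration only.
sample = io.StringIO("1.0,2.0\n3.0,4.0\n5.0,6.0\n")

# source |> np.loadtxt$(?, delimiter=",")  calls  np.loadtxt(source, delimiter=",")
data = np.loadtxt(sample, delimiter=",")

# [0, 1] |> fmap$(i -> data[:, i])  builds one column slice per index
X, y = [data[:, i] for i in [0, 1]]

# y |> len  calls  len(y)
m = len(y)

print(X, y, m)
```

Here `?` marks the argument position that the piped value fills, so the partial application only fixes `delimiter`.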

plot_data() visualizes the data with a scatter plot.

import matplotlib.pyplot as plt


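def plot_data(x, y):
    # Draw markers only (no connecting line), which gives a scatter plot.
    plt.plot(x, y, linestyle='', marker='x', color='r', label='Training data')
    plt.xlabel('Population of City in 10,000s')
    plt.ylabel('Profit in $10,000s')

plt.figure()
# |*> unpacks the tuple into positional arguments: plot_data(X, y)
(X, y) |*> plot_data
plt.show()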
[Figure: scatter plot of the training data, Profit in $10,000s against Population of City in 10,000s]

Gradient descent

In this part, we will fit the linear regression parameters to our dataset using gradient descent. Parameters are initialized as follows:

  • Add a column of ones to \(x\) to accommodate the \(\theta_0\) intercept term: