<DD>Using the already trained network, the evaluate function evaluates the test samples: for a given sample, it calculates the probability of membership in each of the 87 classes. </p></UL>
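As a rough illustration of that evaluation step, a forward pass through a trained one-hidden-layer network could look like the sketch below. All names here are hypothetical and the real implementation may differ; it only shows how one sample is turned into one score per class.

```python
import math

def evaluate(weights_hidden, weights_output, sample):
    """Forward-pass a sample through a trained one-hidden-layer network
    and return one sigmoid activation per output class (87 in this project).

    weights_hidden: one weight list per hidden neuron,
    weights_output: one weight list per output class."""
    sig = lambda z: 1.0 / (1.0 + math.exp(-z))
    hidden = [sig(sum(w * x for w, x in zip(ws, sample)))
              for ws in weights_hidden]
    return [sig(sum(w * h for w, h in zip(ws, hidden)))
            for ws in weights_output]
```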

<br>

<p>The required utility functions are those below:</p>

<p>with</p>

<p><img src="http://2.bp.blogspot.com/-PqGvtE_NEmE/TraZYTHEfLI/AAAAAAAAAoo/swDZg20vd4Q/s1600/Screen+shot+2011-11-06+at+11.27.03+AM.png" border="0"></p>

<p>the sigmoid function, which is defined in the mathutils module.</p>
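In Python, the sigmoid can be written in a couple of lines (a minimal sketch; the exact signature in mathutils may differ):

```python
import math

def sigmoid(z):
    """Logistic sigmoid: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))
```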

<p>In our case Θ is the set of parameters and, with m = len(y), the cost function is given by the following formula:</p>

<p>We are, however, going to use the regularized version of the cost function, so a regularization term needs to be added. The regularized cost function follows:</p><p><img src="http://3.bp.blogspot.com/-qNym-oCdMIg/Trd03YeslWI/AAAAAAAAApQ/GUfXiJ3vpUE/s400/Screen+shot+2011-11-07+at+3.03.55+AM.png" border="0" height="48" width="400"></p>

<p><blockquote>"In mathematics, the <b>gradient</b> is a generalization of the usual concept of derivative to functions of several variables. If <span class="texhtml"><i>f</i>(<i>x</i><sub>1</sub>, ..., <i>x</i><sub><i>n</i></sub>)</span> is a differentiable function of several variables, also called a 'scalar field', its <b>gradient</b> is the vector of the <i>n</i> partial derivatives of <i>f</i>. It is thus a vector-valued function, also called a vector field." (Wikipedia extract)</blockquote></p>

<p>Thus the gradient of the regularized cost function is a vector whose elements are defined as follows:</p>
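Continuing the sketch above (same hypothetical names, same conventions), each element of that gradient is the average of (hypothesis − label) times the corresponding feature, plus the regularization term for every parameter except the bias:

```python
import math

def regularized_gradient(theta, X, y, lam):
    """Gradient of the regularized cost: one partial derivative per parameter."""
    m = len(y)
    grad = [0.0] * len(theta)
    for xi, yi in zip(X, y):
        z = sum(t * x for t, x in zip(theta, xi))
        h = 1.0 / (1.0 + math.exp(-z))
        for j, xj in enumerate(xi):
            grad[j] += (h - yi) * xj
    for j in range(len(theta)):
        grad[j] /= m
        if j > 0:  # the bias term is conventionally not regularized
            grad[j] += (lam / m) * theta[j]
    return grad
```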


<p>I decided not to use these modules as I wanted to make every step of the algorithm explicit. Implementing a neural network and a learning algorithm with these modules can basically be reduced to two function calls, but I was interested in what is going on behind the scenes. I am well aware that my algorithm's performance lies far below that of the algorithms used in these modules.</p>

...

...


<h2> Implementation </h2>

<p>My implementation is a basic feed-forward neural network consisting of one input layer, one hidden layer, and one output layer.


To adjust the weights during training I use the <a href="http://en.wikipedia.org/wiki/Backpropagation">backpropagation algorithm</a>.

The main functions of the algorithm are the initialisation of the neural network, the training of the network with the training data, and the evaluation of the testing data. Several other functions were necessary as helpers to realise these three functions. More detailed information can be found in the code.

</p>
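The initialisation step, for instance, might look like this: small random starting weights for the hidden and output layers, sized to match the network described above (names and the weight range are my own assumptions, not taken from the code):

```python
import random

def init_network(n_in, n_hidden, n_out):
    """Initialise a one-hidden-layer feed-forward network with small
    random weights: one weight list per hidden neuron and per output neuron."""
    rand = lambda: random.uniform(-0.5, 0.5)
    hidden = [[rand() for _ in range(n_in)] for _ in range(n_hidden)]
    output = [[rand() for _ in range(n_hidden)] for _ in range(n_out)]
    return hidden, output
```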

...

...


where n depends on the length of the input wav file. This means that the input varies in length.

This results in 17*n input nodes for the network (it does not matter that 2-dimensional data is fed in as 1-dimensional, as the network will find its own way to extract the information).

The problem, however, is that the input length is variable, whereas in feed-forward networks the input must always have the same length.


For variable-length input, other networks such as recurrent networks (which are beyond my knowledge) are more suitable.

</p>
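One common workaround for the fixed-input-length constraint (not necessarily the one used in this project) is to truncate or zero-pad every flattened sample to a fixed length before feeding it to the network:

```python
def to_fixed_length(sample, target_len, pad_value=0.0):
    """Truncate or zero-pad a 1-D sample so every input has target_len values."""
    if len(sample) >= target_len:
        return sample[:target_len]
    return sample + [pad_value] * (target_len - len(sample))
```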

...

...


<p>We ran some tests of all the algorithms, using the full training data, with 75% of it for learning and 25% for verification. The neural network algorithm performed quite well, reaching, on a limited set of training samples, a correspondence index of 77% (against the


benchmark of around 50% of the random algorithm). The linear regression, on the other hand, was not a viable choice, as Python could not cope with the matrix of parameters coming out of it.</p>

<h1> 7. Further development </h1>

...

...


can be done on the verification process, which could better replicate the one used in the Kaggle competition, and on making the reading and writing of the files operating-system independent.</p>


<p>The K-neighbours algorithm of course needs to be fixed, and generally all the algorithms could be implemented more efficiently, as complexity and efficiency are of concern with such amounts of data.</p>


<p>A better tuning of the parameters of the algorithms could also be done,