A Survival Machine's blog

[Code, Notes, Rants, Writings]*

Open Solaris GEEK HUNT

leave a comment »

The Open Solaris Geek hunt is a series of quizzes geared towards the novice to intermediate users of Open Solaris. So, if you love using Open Solaris, time to get rewarded for it! Its a lot of fun. The second quiz is now up at http://www.thinkdigit.com/osolgeekhunt/

Videos: http://www.thinkdigit.com/opensolaris/video.html

Resources: http://www.thinkdigit.com/opensolaris/resource.html

The fun guy Abhishek is in charge, so expect to have a lot of fun :)

Written by Amit

February 9, 2010 at 3:49 am

Posted in Uncategorized

RRTD: Random rant of the Day

leave a comment »

Random and Rant pretty much define a lot of my state of being. I rant a lot, and I do a lot of random stuff, sometimes for constructive work, like Genetic Algorithms. (GAs).

So, here is the RRTD ( a new blog series probably :) :

  • I have been in cribbing mood off late thanks to my pathetic performance in entrance exams off late, like the one on the coming February 14th. I dread it, God. Somewhere during my undergraduate college, I lost faith in studying for marks. That’s it. I haven’t gathered it back, and will probably never do it again. Its kind of perilous, since India needs you to do exactly that to get into a post-graduate degree programme. But what the heck! I would rather not study for marks, since I really don’t see the utility for it. If nothing else, Bangalore, Pune, here I come.
  • I am writing a motivation letter for a scholarship. Since its for a scholarship,  I need to make it “dazzle” like Ben Campbell in 21. Huh! My life and dazzling. LOLs. But hey, as I am writing this- its actually been one hell of a ride. WOW. I love me. Thanks to all the wonderful people- past and present, in my life.
  • Oh, I am loving this song, Uff Teri Ada from a upcoming Bollywood film. I have heard it over 100 times today, I think.
  • My current status message is, “No Pads! No helmets! Only balls”, to Simple Plan, yeah, I have balls. FTW.

That’s it folks. Time for more ranting :D

Written by Amit

February 6, 2010 at 7:00 pm

Posted in Uncategorized

Tagged with ,

Eigen values, vectors and Fibonacci sequence

with 2 comments

Eigen values and Eigen vectors were just numbers to me, thanks to the way they were first taught to me. In a larger scale, Linear Algebra, I guessed was a dry topic then. Not any more now. I find Linear Algebra fun and awesome now. I understand a bit of it too now. So, this is my first post dedicated to this awesomeness.

I came across application of using Eigen values and Eigen vectors to find the N’th Fibonacci number very easily in Prof. Gilbert Strang’s lecture series. Though I found one great blog post, which almost led to me not writing this post, but I use this blog as another way to keep track of the things I learn. So here it goes. Errors may have crept in my understanding. Please correct me, if you find one.

Fibonacci numbers

A lot about Fibonacci numbers is known. The least I can say is that they are a sequence of numbers such that every term is a sum of the previous two terms: F_{n+2} = F_{n} + F_{n+1}. The first few terms of this series are 0,1,1,2,3,5,8,13..

Linear Dynamical System

Consider a N-dimensional vector \vec{X}. Assume that the variation of \vec{X} with respect to time, when it varies in discrete steps may be represented as: X_{t+1}=AX_{t}, where A is constant matrix  Such a system is called a Linear Dynamical System. For example, X_{n+1}=2X_n is a simple Linear Dynamical System.

Fibonacci Number sequence as a Linear Dynamical system

Let us now make an attempt to model the Fibonacci number sequence as a Linear Dynamical System. This is our first step towards this beautiful application of Eigen values and vectors.

Consider the following equations:

F_{n+2}=F_n + F_{n+1}                                                                                                                                       F_{n+1}=F_{n+1}

The second equation is written to enable us to write the above as a Linear dynamical system. Now, can we write the above equations as follows:

which is of the form U_{n+1}=AU_{n}, where A is the constant matrix as above.

The matrices S and \Lambda

The matrix S is said to be the Eigen vector matrix, i.e S= [x_1 x_2 .. x_n], considering x_1, x_2, , x_n are the n Eigen vectors of the constant matrix A, defined above. The \Lambda matrix is a diagonal matrix defined as follows:

where \lambda_{1}, \lambda_{2}, \lambda_{3}, .. \lambda_{n} are the Eigen vectors of the same matrix.

From the basic equation of Eigen values and vectors Ax= \lambda x, we can write the above as AS=S\Lambda. From this equation, we can write :                                                                                                                                                                                                  A = S \Lambda S^{-1} , AA = (S\Lambda S^{-1}) (S\Lambda S^{-1}), AA = S\Lambda (S^{-1}S) \Lambda S^{-1} = S\Lambda ^{2}S^{-1}, A^2 = S\Lambda ^{2}S^{-1} , A^n = S\Lambda ^{n}S^{-1}

This is an useful and important relationship. How? Earlier, we wrote:   U_{n+1} = AU_{n}, U_{1} = AU_{0} , U_{2} =AU_{1} = A(AU_{0}) = A^{2}U_0 U_n = A^{n}U_{0}

Now, substitute A^n  by S\Lambda ^{n}S^{-1}U_0 can be written in a more simplified manner, by writing it as a combination of its Eigen vectors, U_0=Sc where c is a coefficient matrix, from which we can write c=S^{-1}U_0. Thus we have:

U_n = S\Lambda ^{n}S^{-1}Sc

U_n = S\Lambda ^{n}c

So what does the last equation say? It says, the current state (n^{th} state) of our model is expressed as matrix obtained by the product of the Eigen vector matrix S, the Eigen value matrix \Lambda, raised to the power k and the matrix c.

Let us now make use of this equation to get the Nth Fibonacci number.

Eigen Values and Vectors of A

The Eigen values of A can be shown to be: \lambda_{1}= \frac{1+\sqrt{5}}{2} and \lambda_{2}= \frac{1-\sqrt{5}}{2}. The Eigen vectors of A turns out to be x_1 = \frac{1+\sqrt(5)}{2} and x_2 = \frac{1-\sqrt(5)}{2}.

S and \Lambda

The coefficient matric c
As derived earlier,  c=S^{-1} u_0. We already have S, S^{-1} turns out to be:

The first two Fibonacci numbers are 1,1, so we can take u_0 =\begin{bmatrix}  1 \\  1 \end{bmatrix}. Hence:

Finding the Nth Fibonacci Number

Now, let’s rewrite the earlier equation U_n = S\Lambda ^{n}c, by substituting the values such that it matches our Fibonacci model:

We can rewrite the above equation as follows:

So, we have it! Just substitute the value of ‘n’ and we get the F_{n+1} and F_{n+2} Fibonacci numbers. For example, let n=0 and see the values of F_1 and F_2, they should come out to be 1, 1 respectively. This piece of Python code prints the first ‘n’ Fibonacci numbers using the above formulae:

#!/usr/bin/python
# Print fibonacci numbers by forming it as a linear dynamical
# system
# Amit Saha (amitsaha.in@gmail.com)
# http://amitksaha.wordpress.com

import math
import numpy as np

# Eigen vectors of A
eig_vec1= (1.0 + math.sqrt(5))/2.0
eig_vec2= (1.0 - math.sqrt(5))/2.0

# coefficients
c_1 = ((5.0 + math.sqrt(5.0))/10.0)
c_2 = ((5.0 - math.sqrt(5.0))/10.0)

# matrices
mat1 = np.matrix([[eig_vec1],[1]])
mat2 = np.matrix([[eig_vec2],[1]])

# set n accordingly to get the first (n+1) fibonacci numbers
n=99
for n in range(0,n):
    sum_mat = (eig_vec1**n)*c_1*mat1 + (eig_vec2**n)*c_2*mat2
    print int(sum_mat[1])

Parting notes

A very interesting thing to note here is that the second term of the last equation, contains the term \frac{1-\sqrt{5}}{2}. The absolute value of this term t is between 0 and 1, hence as ‘n’ increases its value goes closer and closer to 0 and almost vanishes. Thus, with large values of ‘n’, the value of the above expression can be assumed to be determined by the first term. This has an interesting implication, for which I shall direct you to THE awesome post I was talking about.

Hope you enjoyed reading this, as much as I enjoyed learning about it! Please let me know if there are any Mathematical errors remaining.

Written by Amit

February 2, 2010 at 9:09 pm

Posted in Mathematics

Tagged with ,

CS 229: Machine learning- Notes on Lecture #3

leave a comment »

For a brief recap of lecture #2, please see my notes.

In the third lecture, Prof. Andrew Ng continues his lecture on regression. He talks about non-parametric learning algorithms, a probabilistic interpretation of linear regression (introduced in the 2nd lecture), a simple classification algorithm and very brief idea about perceptron.

Underfitting and Overfitting: To start with, the concepts of underfitting and overfitting in linear regression are explained very informally and clearly. The choice of the degree of the fitting polynomial and hence, the parameters \vec{\theta} can lead to either of the above to happen. This is a motivating factor to study non-parametric learning algorithms.

Non-parametric learning algorithms: Whereas the name may falsely imply that the parameters are done away with, the actual meaning is different, which is kind of weird. The learning algorithm that we studied in the 2nd lecture, had  a fixed number of parameters, independent of the size of the data set m. In non-parametric learning algorithms, the number of parameters varies linearly with m. We will see what this means shortly.

Locally Weighted Regression: Locally Weighted Regression (LWR) is a non-parametric learning algorithm. As is apparent, it has two important characteristics:  it works locally and there are weights involved. Let’s see how.

Demonstrative sketch to demonstrate LWR (drawn from the lecture)

Let’s say, we are interested in finding the value of Y_i  for a particular X_i. The LWR acts locally, so it considers a certain vicinity around X_i and their corresponding Y_i’s. Then, its tries to fit \vec{\theta} so as to minimize the function:     \sum_{i=1}^m w^{(i)}(y^{(i)}-{\theta}^T X^{(i)})^2, where w^{(i)}=e^{\frac{-(x^{(i)}-x)^2}{2}}.

From the expression, if w^{(i)} it is clear that for points which are near, w^{(i)}=1 and w^{(i)}=0 for points far away from each other. Thus the weight is higher when we are considering points which are close to each other.

Probabilistic Interpretation of Linear regression: This is a very interesting part of the lecture. In the second lecture, we estimated \vec{\theta} by minimizing J(\vec{\theta})=\frac{1}{2}\sum_{i=1}^m(h_\theta(x^{(i)}-y^{(i)})^2. Why was the mean square error minimized and not the third or fourth power of the error? This part of the lecture gives a proof to that. The starting point of the proof is to consider y^{(i)} = \theta^{T}x^{(i)} + \epsilon^{i}, where \epsilon is random noise, and assumed to be a gaussian random variable with mean m and \sigma^{2} its variance. Using this definition, it defines a likelihood function L(\vec{\theta})=P(\vec{Y}|X;\theta). Finally, after defining a logarithmic function on L(\vec{\theta}), the function to be minimized turns out to be the same as J(\vec{\theta}) defined earlier.

Besides the above, classification and perceptrons were discussed. I didn’t take much notes in those sections. Here is the video lecture:

It is highly recommended to go through the section notes on probability theory now. See you next time!

Written by Amit

January 24, 2010 at 8:44 am

Posted in Course Notes

Tagged with

Of the classics and the dolphins: My contribution to the Sun history

with 3 comments

This couldn’t have come at a more right time. I have been meaning to write this for 6 months now. 22nd July, 2009 was my last working day at Sun Microsystems and I am no more in danger of being filed for saying anything, probably harmful, since Sun Microsystems is no more. Its a part  of history now. What happened to it will surely be a case study for upcoming business honchos sometime soon in the future. I am too little a person to even trying to know such big happenings.

I joined Sun in a hurry. I left Sun in the same hurried fashion. Both times, it was completely unexpected. But, when I joined the company of my dream, I had no idea that I would be leaving it after 14 months, and more importantly that it would already be sold to another company even before I had completed a year! But that wasn’t the reason for me to leave. I just made a different career choice, having found my true calling. Whether it was a justified choice or not is a matter of debate- with myself, my friends and parents. Only time will tell.

My 14 months at Sun Microsystems, Inc. India

May 26th, 2008 was the day I was going to work in the company of my dream. How I got a call for interview (and hence employed) is again a mystery. Probably someone from inside had referred me, because of my Code For Freedom contributions. Let’s move on. So, how was the first day like? Seen Charlie in a Chocolate factory? Exactly,  I was Charlie. So, I learnt that I would be working in the MySQL team at the India Engineering Center. BTW, for a bit of perspective, MySQL AB and hence their products were all then Sun’s after the buyout in early 2008. Even before joining Sun, I had some experience with the Sun employees, thanks to the large scale open source efforts that they embarked in, like the NetBeans IDE and OpenOffice.org projects. The free coffee, the free t-shirts and the experience of the first  job in a wonderful city like Bangalore was soaking me in. It was wonderful.

At work however, things were not so fun. Ours was a new team, and the only “Sun Classic” team in MySQL at that point of time. It wasn’t easy to suddenly start working as part of a team which had just been inherited. The “dolphins” were already working together as a team for years, and it was difficult on both sides to get adjusted to each other. They relied heavily on IRC as a communication medium, like true Open Source projects. We, including me was more used to mailing lists. That was just one of the conflicts. Besides, we had this usual inter-personal issues and the geographic distance didn’t help. We were in India and they were all over Europe! Time zone was always an issue. Besides, there was something distinctly different between Sun engineers and the “dolphins”. We felt it. What, we couldn’t explain. However, they had a very high standard of “go getters” in their team. Folks who were brilliant hackers! We were striving hard to get upto speed with them.

The “ice breaker” came with our first meeting in Riga, Latvia around September, 2008 (Yes! My first trip abroad). It helped a bit, but we were still feeling kinda “left out”. May be it was just the attitude you get in a hack fest. But, this was no hack fest! It was a company meeting, and people were employees first and then hackers! Anyways, it helped to talk to the “dolphins”.

I as a part of the team, started getting up to speed with the work. SQL, by the way and databases in general is something I hate, even today. But, I couldn’t deny a offer from Sun Microsystems, Could I?

The first big thing came around December, 2008- IBM was supposedly in talks to buy Sun Microsystems. damn. (This proved to be true later on, though, I think). But, what happened in January, 2009 was even worse. More than a 100 odd folks were “layed off” from the IEC. Another experience in life- Surviving my first layoff! Supposedly, it was a part of a six month layoff cycle!

Sun Microsystems was bleeding. I hated to see it. I loved the company so much. Infact, all the people who worked there had a deep sense of respect for the company, more than anything else!

The big thing happened in April, 2009. Oracle and Sun had entered into a non-definitive agreement that the latter would be bought over. Damn. That was it. It was finally happening. And it finally happened couple of days back.

Oh by the way, in May, 2009 (I think) the Sun IEC celebrated its 10 years of inception and that day I danced for the first time in my life. Whom should I thank? Sun or Manasi Scott? Probably both :)

I left Sun in July, 2009 as a Sun employee leaving Bangalore, leaving the dream company and leaving my team and leaving some great friends and colleagues.

Rest in peace, Sun Microsystems. The world will miss thy name!

My Sun blog: http://blogs.sun.com/amitsaha/

Written by Amit

January 22, 2010 at 8:28 pm

Posted in Uncategorized

Tagged with , ,

Not your Grandmother’s Genetic Algorithms (GA): video series

leave a comment »

Excuse me for sounding like Barney, but see and hear the legendary David E. Goldberg speak on Genetic Algorithms, a field that he has shaped develop over the past 2 decades. (The slides are here on Slideshare)

Part 1

Part 2

Part 3

Part 4


Written by Amit

January 19, 2010 at 10:51 pm

CS 229: Machine learning- Notes on Lecture #2

with one comment

In the second lecture, Prof. Andrew Ng starts talking about supervised learning methods. He begins with Linear Regression, in which the relationship between the input is assumed to be linear, such as  h_\theta(x)=\theta_0 + \theta_1x. The parameters \theta_0 and \theta_1 need to be found, for which couple of approaches are discussed:

  1. The first one involves minimizing the function:                                                                                                     J(\vec{\theta})=\frac{1}{2}\sum_{i=1}^m(h_\theta(x^{(i)}-y^{(i)})^2, where m is the number of training examples, x(i) is the ith sample input, and y(i) is the corresponding output. Couple of methods are discussed for the minimization task above. The first method that is discussed is the Gradient Descent method, which roughly is:
    1. Start with some value of the \vec{\theta} (say \vec{\theta} = \vec{0})
    2. Keep updating \vec{\theta} to reduce J(\vec{\theta}) as follows:                                           {\theta}_i = {\theta}_i - \alpha \frac{\partial}{\partial {\theta}_i} J(\theta)
    3. Stop when a desired reduced value of J(\vec{\theta}) is reached.
  2. A faster method in case of large data sets is the stochastic gradient descent method is then described

  3. The second approach to estimate the value of \vec{\theta} uses linear algebraic techniques to obtain a closed form formulae for the parameters, \vec{\theta}

This is the video lecture:

As I told in the notes of my first lecture, this is a good time to review the section notes on Linear Algebra. Lecture Notes 1 have some notes on the first two lectures.

Looks like we will do a lot of regression in the next lecture. See you then!

Written by Amit

January 19, 2010 at 8:00 pm

Posted in Course Notes

Tagged with ,

Measure of population diversity in Genetic Algorithms

with one comment

Population diversity is a desirable characteristic in a Genetic Algorithm(GA). What it basically implies is that the search space should be well represented. This gives the GA a chance to search for all the possible solutions uniformly. Premature convergence often results in poor or sub-optimal GA performance.

GA literature most often talks about population diversity in a qualitative manner. However, a quantitative measure is sometimes needed, if not for anything else, but for own satisfaction that a population is indeed able to maintain its diversity or sometimes to explain premature convergence, in which the population loses its diversity before it has found all the solutions. In my current research work, I needed such a metric which would give an idea of diversity of the population and give me an idea when there is a premature convergence versus a proper one. I shall now summarize a couple of the efforts that I have found very intuitive: (There may  be more; I found couple more but I haven’t yet found them intuitive and/or explored them fully )

  • Leung et.al, in their work Degree of population diversity-a perspective on premature convergence in genetic algorithms and its markov chain analysis defines a metric, \lambda(x) to define a diversity metric, considering the GA population as a matrix. Unfortunately, by its very definition, the metric is only applicable to a Binary coded GA.
  • Morrison and De Jong in their work Measurement of Population Diversity proposes a novel measure of diversity. Their idea is derived from the concepts of centroid and moment of inertia of a rotating object. They define the moment of inertia of a GA population about its centroid. The larger the value of the moment of inertia, the diverse the population. This comes from the definition itself. This is of course applicable to a real coded GA and hence more useful and widely applicable.

Illustrative plot:

To illustrate the point, the following is a plot showing the variations of the moment of inertia for a population, in the case of a good convergence(RED) and in case of a bad convergence (GREEN):

Illustrative plot showing the population diversity loss

Such a quantitative measure of population diversity, is also useful in setting the GA parameters such as crossover probability or mutation probability. For example, if we see that there is an unexpected convergence, increasing the mutation probability may be a good idea.

Hope you find this post useful!

    Written by Amit

    January 18, 2010 at 3:07 pm

    Posted in Genetic Algorithms

    Tagged with ,

    CS 229: Machine learning- Notes on Lecture #1

    with one comment

    With this blog post, I am making a bold attempt to follow an online course to a decent degree of successful completion. I shall post some notes, links to relevant lecture notes and other misc. stuff as I start on my quest to follow the CS 229: Machine Learning course from Stanford University.

    This is the video lecture:

    My summarized notes:

    The first 30 minutes or so of this lecture is some logistic information for the course, introduction to the course and the field of machine learning in general. The actual stuff starts at around 31:36, so you may want to skip till that time
    • Machine learning definition: Arthur Samuel(1959), Tom Mitchel(1998)

    • In the rest of the lecture, the four main parts of the course is described in some detail along with illustrative examples.
    • Supervised learning: providing the algorithm a dataset, supervising- learn the association between input & output, regression problems, classification problems, support vector machines- infinite number of features

    • Learning theory

    • Unsupervised learning: clustering, Cocktail party problem, indepedent component analysis

    • Reinforcement learning: reward function (good dog, bad dog), feedback function

    Miscellaneous

    Before proceeding further, you may also want to take a look at the course materials. The Lecture notes and the video lectures are not in a 1-1 mapping, its a many-one mapping. So you have 20 video lectures and 12 lecture notes. Corresponding to the first video lecture, there is no corresponding lecture note.

    The first of the Section Notes on Linear Algebra carries a nice review of the concepts that will be useful for a quick review. From what I understand and know, Linear Algebra seems to very important to the study of Machine Learning. So, its a good idea to just skim over the notes given on the website.

    Hope you enjoy following the first video lecture! See you.

    Written by Amit

    January 9, 2010 at 5:36 pm

    Posted in Course Notes

    Tagged with

    LaTex in your WordPress posts

    leave a comment »

    • A circle at centre (0,0) with radius 0.0 or a point in 2D?:
      • x^2 + y^2 = 0
    • How beautiful are you? Golden ratio:
      • \phi = \frac {(1+\sqrt{5})}{2} \approx{1.6180..}

    No trivial trivia this post, but just a sample post to try out LaTex in WordPress. See here for help.

    Written by Amit

    January 8, 2010 at 3:53 pm

    Posted in updates

    Tagged with ,