Sunday, October 29, 2006

The Mont Blanc Pen

I got a Mont Blanc Pen as a farewell gift from my dear friends and colleagues. It's a Mont Blanc, it looks very professional, it has the new, and very cool Mont Blanc floating star, it is one of the new styles, and I generally love their products... but.. but... but I just don't like that style at all. It is too "complex" for me.


See the many fine rings it has? The black and silver? The two black rings?

I take simplicity to the extremes I guess. I like simple, clean, straight designs. Nothing fancy at all. "Less is More" is one of my favourite proverbs. And this applies to design too.

So, as much as I appreciate the gift, I just had to exchange it. I want to have a gift that I like. It still seems unethical to me, but I proceeded. I went with my wife to Mont Blanc in Seef Mall with the Pen. I didn't have a receipt. So I did not know if they would accept an exchange. I just told them I got it as a gift two days ago, and I want to exchange it. They gladly exchanged it for another.

So now, I have a gift from my dear friends and colleagues, and it is something I like. A clean, straight, and simple, yet sophisticated Mont Blanc.


I also like the fact that this pen has a cap that you need to screw to open/lock it. It sorts of lets you think before you use the pen. Not just any cap you get from a Bic. You make sort of an effort to use it. After all you are using a Mont Blanc :-)

What do you think?

Saturday, October 28, 2006

Prada

My wife read the book, "The Devil Wears Prada" and she enjoyed it. Today we went to see the movie.

I was reluctant at first. I knew the movie was about fashion, main character is a she, so I thought it's a chick flick. It's not like that, although it does show a lot of chicks and a lot of fashion. I did like the movie and the points it was trying to make.

The movie is about Andrea, an educated smart girl, who is not into fashion, not that pretty, or at least thinks and acts as if she is not. She has no other opportunity but to work as an assistant to the editor of Runway magazine, Miranda. Miranda is so bossy, so over-the-top and a goddess of fashion. I will not spoil any more of the movie, but it is very good.

For me, the main point in the movie is that it doesn't matter if you have something that a million other persons dream to have, what matters is do you have what you want? And even if you think you don't have a choice but to do what you have to do, you always do have a choice, even if it is simply walking away.

Tuesday, October 17, 2006

MySQL and index use

And yet another post about MySQL and how it uses indexes.
I have a transaction log file that contains a million plus records. Two essential fields I almost always use are TranDate and TranTime, CHAR(6) and CHAR(8) respectively. They contain the date and time in YYMMDD and HHMMSSss.

After the log table contained a million records, my queries were starting to take longer and longer, even though I had an index on TranDate and TranTime columns called DateTime. I needed to search for transactions between Date Time (YYMMDDHHMMSSss). The query had the following syntax:

SELECT *
FROM log
WHERE Concat(TranDate, TranTime) BETWEEN '061001' AND '061010'


I initially thought that the DateTime index will be used in the above query, but an EXPLAIN command on MySQL showed a full table scan. My index was not used!

I went digging in, and surely enough, MySQL will not be able to use the DateTime index, because the TranDate and TranTime are NOT used directly in the WHERE clause. The query basically tells MySQL to take the TranDate and TranTime columns in each row, combine them, and only return those rows between the given values. MySQL is no magician to know that the combined values are indexed already.

So, the solution was very simple. Create another column, TranDateTime that is the concatenation of both fields, and then modify the DateTime index to index that single column. The query will also change to replace the Concat function with the TranDateTime column.

Another EXPLAIN command, and the DateTime index is used, cutting the execution time to less than a second instead of few minutes.

The rules of index use in MySQL (and probably other databases) are very simple:
1) The columns that in the WHERE clause will be subject to index use if they are part of and index.
2) An index on (col1, col2, col3) can be used to search for any of the following combinations of columns: (col1), (col1,col2), (col1, col2, col3). The same index cannot be used to search for col2 only. Just think of the index as a sorted array of string, and you should figure out why.
3) The index used will be the one that returns the least number of rows.

Moral of the story?


1) Ask the database engine what is it doing. MySQL may not be the most sophisticated database, especially with multiple tables, but it does the job. Imagine what a database like Oracle, Sybase or one of the big guys will do.
All you have to do is ask.

2) Understand why is the engine doing what it is doing. Why is that seemingly suitable index not used? Is the right column present in a query WHERE clause? is that column used in a function that makes the index unusable? A LIKE '%value%' clause will not use any index.

3) Try to create a better index suitable for that query, or try to fine tune the query so the best index can be used.

Re profile, and continue until you get fed up.

Remember that you cannot solve your bad query problems by getting better hardware!

Monday, October 16, 2006

MySQL Crosstabs

I was writing some reports, and needed a better way of visualising some data. Instead of of having something like:

Term Resp Count
ATM01 APP 100
ATM01 DEC 10
ATM01 REV 5
ATM02 APP 250
ATM02 DEC 23

Listing 1

something like the following is much easier to look at, and graph:

Term APP DEC REV
ATM01 100 10 5
ATM02 250 23 null

Listing 2


There is no built-in crosstab or pivot construct in MySQL, but this is much easier than it seems. After you create it once, it will become second nature. I am using it very frequently to create all sorts of statistics and graphs.

So, let us assume you have the above data in a table like Listing 1. You need to transform it, or create another query that will generate to you the result in Listing 2.

Here is the Query:

SELECT Term,
SUM( IF(Resp = 'APP', 1, null) ) as APP,
SUM( IF(Resp = 'DEC', 1, null) ) as DEC,
SUM( IF(Resp = 'REV', 1, null) ) as REV
FROM mytable
GROUP BY Term



The trick is using a SUM of 1 or null. 1 will be added when the condition in the IF applies, and null otherwise. The beauty of this, is that you can have any arbitrary complex expression in the IF statement to group records that must be added in the APP column. For example let us say you need to have only two columns for APP and OTHERS. Your query will be:

SELECT Term,
SUM( IF(Resp = 'APP', 1, null) ) as APP,
SUM( IF(Resp <> 'DEC', 1, null) ) as OTHERS
FROM mytable


This gives you much more flexibility in constructing your crosstabs. It May get a bit complicated if you need many columns, or if you need a single column per value. If you have this, then probably the best thing is use the right tool for this. Use Microsoft Excel. Or use any one of several techniques that build the SQL for you. Just do a Google search on MySQL crosstab, and you will find some good ones. One good article on MySQL site is here.

Thursday, October 05, 2006

Genetic Algorithms with Ruby

I was very excited when Shark Hunter presented his first session on AI. He started with Genetic Algorithms. The idea that a solution "magically" evolves is fascinating. All you need to do is present your candidate solutions as a sequence "chromosome" of some objects "genes", then assign a fitness to each chromosome, then mimic nature by mating and mutating the population. Eventually "life finds a way", and solution is found.

One of the nice sites with demos is here.

So I thought this will be a very cool exercise to practice my newly acquired Ruby skills. It took me a few hours to come up with a simple "framework" to represent a general GA.

I will keep updating my GA framework and add more tools and functionality to it. Ultimately I want to be able to modify my sub Chromosome class only, and just add a fitness method, then fire my simulation and let the search begin.


Basically the framework consists of a Chromosome class, and a Population Class.


The Chromosome Class

This class represents a Base Chromosome. It is basically a sequence of genes that have a fitness, and possibly represent a certain value. The Base class knows how to marry, or crossover. The Base class crossover is a single point, and no mutation is done. A subclass will be created that can have two crossover points. The framework is still in very early stages, and will be re-factored. But basically, that is all there is to it.


The Population Class

This class represents a Population of Chromosomes. It holds the entire population and is responsible for Selection and Generating offsprings. The Base class does a simple Roulette Wheel Selection based on a fitness value. To make the initial code simple, the fitness value must be higher for fitter solutions. So, a chromosome of fitness = 10 will be 10 times likely to be selected that a chromosome with fitness = 1.


Genes?

I did not create any classes for genes. I wanted to make things simple, so the sequence itself will represent any kind of genes. It is tested as a Strings in this initial version. The main thing is finding a way to represent the fitness of the chromosome. I will see if representing binary numbers, integers as sequences, or real numbers will require any changes in the sequence attribute or require the creation of a Gene class. But since Ruby is so dynamic, I doubt this will be needed.


The Code

Anxious to see the code and try it out? Drop my a line, and I will send it to you or publish the current version somewhere. It is still in very early stages, but works for simple problems.


Just Google it!