saveabend: 08/01/2006

Wednesday, August 30, 2006

My Database Design Rules

I've been designing and using simple, and complex databases for at least ten years now. These are my rules of thumb database design rules:

Database design:

It is easier to think in terms of Entity Relationships. Entities then become tables. Entity Attributes then become columns. ER can be mapped to classes very easily.
Uses consistent names for fields. If you can, stick with full words or abbreviate very long, but well known columns and tables.
Table names are plurals. Classes, Messages...
Use underscores to separate words.
When naming columns or tales, start with the more general entity, then narrow down. This makes searches and lists much easier. Case Attachments, Case_Notes, User_Roles, User_Groups and so on.
There is no need to prefix tables with tbl or the database name. Have one database or catalog for each application. This is just my personal preference. Since most GUI tools group tables and views, why replicate this? What good would it make to prefix eveything with tbl or v?
Normalize, normalize and normalize even more. However you may need to copy totals, and some user entered or varying data in some tables and denormalize or duplicate. An example is for some critical applications, e.g. financial apps, the lookup data may change, and it may be needed later. E.g. lookup of a location of a terminal may be copied to a transaction record, even though the terminal ID, which contains the location is present in the transaction record. In this case copy the lookup data to the table.
This is related to above point. To make your queries simpler, denormalize some tables by creating some detail views that lookup the master data from the master tables. You can then simplify alot of your queries and reports by not linking the master tables every time you need master data in detail report.
For enterprise applications, it is best to have two columns in every table as: USER_ID and SYSTEM_DATE. Name these two columns anything you want, but use the same name in all tables. I used CREATED_BY and CREATE_DATE sometime. These two columns must be set by triggers and must get the values from the system. If the tables are critical, have the triggers update audit tables (TABLE_NAME_AUDIT) with before and after data for the main table. The point is, these USER_ID and SYSTEM_DATE columns, and the audit tables must be maintained at the database level. This way the application never needs to care about updating them, and the database will contain correct data about each record, no matter what the source of the change is. Come on, even MySQL has triggers now, so there is no excuse.
I had a very strong argument with my Masters Degree DB teacher about this one, and I finally convinced him: If you have many-to-many relationship, you did something wrong. The relationship should be broken down to another entity (table) and then you will see that this link table probably has some more attributes to it. SO... it is not a many-to-many relationship after all. This many-to-many relationship is a new entity.

About Primary Keys:

You must have one column called ID for each table. This must be the primary key for the table, and if you can, have it auto-increment. Even if it does not seem needed at first, it will be later. What if you need to delete or update that single record from the table? you have it's ID.
Use this ID, and only this ID, as link in foreign tables, and prefix the foreign key column with the foreign table name, and an underscore. Always use database constraints for foreign keys. For example, you have one to many relationship between cases and attachments. In Cases, there is ID column. In Attachments, you have ID (Prikey) and Case_ID. The name of the foreign key column must be the singular form of the foreign table.
Never EVER EVER use user entered data as a primary key. If user entered data must be unique, then use a unique key constraint. This way all your links will not be affected if a user enters the wrong "key" and everything must be in sync again. It may seem overkill, but trust me on this.

Sunday, August 27, 2006

Ruby Scripts - Simple SQL dumps

We have an old version of SQL GUI tool that can return the results of an SQL statement as tab delimited data. This works fine for small result sets. However I created a Reconciliation system that needed few 10's of thousands of records. The GUI just could not handle that. So, a nice exercise to use Ruby to the rescue and learn more about it's DBI.

Here is the sample Ruby code:

require 'DBI'

# make an ODBC connection
conn = DBI.connect('DBI:ODBC:','[USERNAME-HERE]','[PASSWORD-HERE]')

sth = conn.execute("
[SELECT STATEMENT HERE]
")
# If you need column names, then you can use the below statement to get them
# cols=sth.column_names
sth.each do |row|
puts = row.join("\t")
end
sth.finish

Nothing except the Ruby Installer was needed to run this.

Thursday, August 24, 2006

A bit of history

I have been coding for well over 15 years. I like creating new stuff and seeing them work, or break :-) So programming was the ultimate toy, sort of speak. I can actually write something and see things happen on screen. From the very old basic moving circles on my MSX Sakhr, solving quadratic equations (before I learned them in school), plotting x/y and time functions, Programming is FUN!
Back in the good old days of the Commodore 64, I typed pages and pages (probably 30+) of numbers from my Programming The Commodore 64 book just to create an assembler. Then the fun began with Assembly Language. Things were really fast in assembly when you have an 8-bit 1-Mhz processor with three registers, and a full 64K (almost) of RAM! I developed my first compression routine when I was probably 16. Later I knew it was called RLE (Run Length Encoding). I had my first "games" with moving sprites and changing colors.
So why did I do that? No internet for me at that time, no help from anyone. Just my book, my C-64 and lots of reset button presses (there is no debugging in assembly). Why? Because I simply loved doing that. Still do. Programming for Programming's sake. Born to Code.

Wednesday, August 23, 2006

Take One

Okay, this will be the first blog here. This will mostly some tech blog, about coding, hacking, new techniques, and probably some Ruby programming.
Stay tuned.

saveabend