Curious Database Indentity Fields
Ah databases, now I’m no DBA by any stretch of the imagination, but I’ve recently come across a new trend in database design which I’m pretty sure isn’t ideal.
When designing databases I’ve always taken the basic principle of using identity fields in each table and using foreign keys to ‘link’ the tables, a pretty standard affair I thought. I primarily work with SQL server, and as your average joe, I generally use an ‘int’ for my identity field. It’s never let me down and gives me a possible 2,147,483,647 (2^31 -1) indentities. The largest tables I tend to work with have about 8.5 million rows, so plenty of scope for expansion. I suppose if I were really worried I could use a bigint, which gives me a staggering 9,223,372,036,854,775,807 (2^63 – 1) possible indentities.
Back to that less than ideal design….. two databases I’ve recently worked with (both SQL server based and both ‘live’ commercial applications) have taken a different approach to indentities. In fact neither use a traditional identity field at all, both ‘maintain’ their own ids. Both shall remain nameless to protect the innocent.
The first uses a curious varchar(20) identifier, which is split into 2 parts, comma delimited no less. The first part is essentially the ‘customer identifier’ which is a number, the second part after the comma is a base36 alphanumeric. The arguments behind this are that, firstly it allows for easy merging of records from mulitiple ‘customers’ into the same table and secondly that base36 numbers are easy to read/recognise when large!!?! For example “1,BCV32J”, ah yes I know what that is, not!
The second database ‘indentifier’ design is possibly even more curious, each table has a composite key consisting of a bigint and a guid. There seems to be no logical rhyme nor reason to this, other than to make sure we have plenty of indentifiers. This we sure have, a guid has about 2^128 combinations and the bigint about 2^63, giving a possible total number of combinations of 2^191! Which is erm…. a lot. Put into perspective I could generate 1,000,000 identifiers per second and still not run out before the Sun expands and swallows planet Earth.
I suppose there is one plus side, at least they are using SQL server and not an Excel spreadsheet (where most ‘enterprise’ databases seem to get stored these days).

February 18th, 2007 at 11:40 am
I know you are doubting yourself due to these seemingly ridiculous implementations. But let me persuade to keep the faith. These cowboys have obviously no concept of relational database design. Loading 2 columns into a single column is more common than you’d think, and there is almost always no good reason for it.
Your non-intelligent keys are almost always the best way to proceed, although i can admit to skipping this best practice on occasion – for small jobs – e.g. using username as a key instead of a separate identity/sequence for user_id