Tuesday, October 24, 2006

Some Interesting MySQL facts

I've been doing a lot of research into the internals of MySQL in order to develop DrySQL, and here are some useful bits of information that I came across:

1) Check constraints are not supported in the latest version of MySQL (5.1.x). The syntax for check constraints is supported, but any check constraints in your alter/create table syntax are just ignored.
Official MySQL documentation.

2) The REFERENTIAL_CONSTRAINTS table was added to the INFORMATION_SCHEMA database as of MySQL 5.1.10.
Bug Report/Feature Request
Official MySQL documentation

This is in compliance with the ISO SQL:2003 standard. This table is useful to those who want information about the "ON UPDATE" and "ON DELETE" rules associated with a table constraint. Currently, the only solution for retrieving this info is to parse the SHOW CREATE TABLE output, which is a pretty ugly hack and a performance hit compared to retrieving the constraint info directly from the INFORMATION_SCHEMA DB.

3) MySQL creates implicit default values for columns if no default is specified. If the column is nullable, then MySQL sets the default value to NULL. If the column is not nullable, the default is value is set the MySQL's implicit default value for that column type (check this link for details about implicit defaults).

Suppose I create the following table:

CREATE TABLE TEST (ID CHAR(11) NOT NULL)

If I query the information schema's COLUMNS table to find out whether a default value has been specified for the ID column in the TEST table, MySQL will return to me the empty string as the default value. Does this mean that the empty string was defined as the default value for this column, or that no default was specified at all? Unfortunately, there is no way to make this determination in the latest version of MySQL, which is a pain for those of us developing DB tools against MySQL.

Errata: As of version 5.0.2, MySQL no longer creates implicit default values when no default is specified

4) Roland Bouman created an excellent clickable diagram of the MySQL Information Schema, which can be found here.

Labels: ,

Sunday, October 22, 2006

ORM Frameworks...the horror! (And a solution)

I've used a lot of Object-Relational Mapping frameworks over the years in the enterprise space. They're supposed to make it easier to objectify relational data, and they do. However, they all seem to force me to contravene one of the most important principles (IMO) of software development: Don't Repeat Yourself.

In my experience in the enterprise space, a database schema is carefully crafted to ensure data integrity. The database can then be used by a number of applications, none of whom need to worry about data integrity, right? Then why, after I define referential constraints, unique constraints, check constraints, not-null conditions, and default values in my DB schema, do I need to define many of these again in a config file used by my ORM framework? What happens if I change the structure of my DB?


Hibernate

Hibernate is the ORM framework I have used most often in the workplace, and it is great in a number of areas. Handling DB table changes in a dynamic way is not one of them. If I change my underlying DB structure, I need to change associated config files, re-generate data classes, re-build and re-deploy my app(s). I may be jaded from my experience using Smalltalk, but I find this lack of dynamicism unacceptable. I'll admit that the Smalltalk ORM frameworks I've used are no better (worse, in fact), but I can see that the language itself allows for a much more dynamic framework to be built. I'm not sure that's the case with Java, but would love to hear (politely, of course) that I'm wrong (yes, I've heard of Trails, but I'm not sold on it at present).


There are a number of tools available that support some degree of round-trip engineering with Hibernate. XDoclet, MiddleGen, and even the Hibernate tools package can be very useful for autogenerating code and config files. However, I personally don't want to have to use 4 different frameworks to support round-trip engineering. And I don't want to have to re-build my app after a DB table change.


Ruby on Rails

So after complaining about these sorts of problems for awhile (read: years), I decided to work on a solution. I've been experimenting with Ruby and ActiveRecord (the part of the Rails framework that interfaces with databases) for a few months now. Ruby has the dynamic, reflective nature of Smalltalk, which won me over immediately. And ActiveRecord sure makes ORM a lot more simple and pleasant than most other frameworks. It is not without weaknesses, though.


On any given table, I can change the names of the columns, remove/add columns, and my Rails app will handle these changes without requiring changes to the code. I just need to re-start my app. However, ActiveRecord uses naming conventions to infer properties of tables, and even table names themselves. For example, unless I specifically call
set_primary_key in my class definition, ActiveRecord will assume that the primary key column for my Invoices table is invoice_id. Thankfully, I can override such assumptions in my class definitions, but I shouldn't have to. My primary key is defined in my database schema in the form of a table constraint, and can be retrieved from the information_schema tables (or system catalogs, or whatever terminology applies to the metadata schema in your DB of choice).

To summarize my long-winded rant:


1) I personally can gain a lot of productivity from having a very dynamic, reflective ORM framework that doesn't require me to define my DB schema anywhere but in my DB itself.


2) I have found that Ruby/Rails/ActiveRecord comes closest to giving me what I need, but could really use some enhancements to better support legacy DB tables, and really use the DB schema to its full potential



A Solution

I have nearly completed a solution to this "problem" in the form of a Ruby Gem called DrySQL that is an extension to ActiveRecord. I will explain DrySQL fully in a subsequent post, but put simply your data class can look like this:

class Invoice
end

--> or if your DB table isn't named Invoices, your class would look like this:

class Invoice
set_table_name "XYZ123"
end


You do not need to specify primary/foreign keys, associations, or validations. DrySQL will retrieve all your table's constraints and metadata from the DB, and generate the necessary logic in your Ruby app.


This is extremely useful to me, and I hope that others will find it useful as well.

Labels: , ,