Archive for the ‘SQL’ Category

Fun with SQLObject and mxDateTime

Thursday, November 29th, 2007

I’m working on a small CherryPy web service that among other things saves timestamps to a database. The timestamp is in RFC 3339 format (like 2007-07-31T16:05:00.000-05:00), and I needed to store the timezone.

Luckily mxDateTime and SQLObject’s DateTimeCol both support full dates with times and time zone. Unfortunately its not immediately obvious from SQLObject’s lackluster documentation how to use mxDateTime instead of Python’s built-in datetime.

A little searching brought me to a mailing list post about how to use mxDateTime by default in SQLObject. (I don’t know why the sample code includes the conditional as I would think you’d want your code to outright fail if you’re unable to use the datetime library you expect.)

So my model’s code looks something like this:

from sqlobject import *
from sqlobject import col
 
col.default_datetime_implementation = MXDATETIME_IMPLEMENTATION
 
class Foo(SQLObject):
    timestamp = DateTimeCol(default=DateTimeCol.now)

Then my parsing code looks something like this:

import model
from mx import DateTime
 
timestamp = '2007-07-31T16:05:00.000-05:00'
bar = model.Foo(timestamp=DateTime.DateTimeFrom(timestamp))
print 'UTC Timestamp:', bar.timestamp
print 'Local Timestamp:', bar.timestamp.localtime()

Basically once you use the magic line col.default_datetime_implementation = MXDATETIME_IMPLEMENTATION, everything Just Works.

PostgreSQL 8.3 Beta

Tuesday, October 9th, 2007

Warning: IANADBA
I was just reading the release notes for PostgreSQL 8.3 beta1, and there’s quite a lot of exciting improvements coming:

  • Full text search is now a built-in feature
  • Support for the SQL/XML standard, including a new xml builtin data type (For the enterprisy users)
  • enum data types (I’m not a big fan of enums in a database, but lots of people like them.)
  • UUID data type, similar to that defined by RFC 4122. (Very useful feature in some clustering situations. Some people even argue GUIDs/UUIDs should replace integer surrogate keys most of the time.)
  • ORDER BY … NULLS FIRST/LAST (A minor little feature, but I remember wishing for it once upon a time.)
  • And some excellent performance improvements, especially if you’re data doesn’t have to be flushed to disk to finish a transaction (potentially dangerous, but acceptable for many use cases).

While I haven’t had the pleasure of using PostgreSQL for a few months, I prefer it over MySQL and recommend it to anyone looking for an open source database with advanced features and excellent reliability.

Drupal Database Maintenance Script

Friday, August 3rd, 2007

I love Drupal, but its insistence on saving every session its ever created causes the sessions table to become massive. So I usually end up creating a shell script like the following:

drupal-db-maint.sh (I posted a copy to pastebin since WordPress makes the code impossible to read.)

#!/bin/sh
echo `date` - Starting Drupal database maintenance
MYSQL_USERNAME=’insert your MySQL username here
MYSQL_PASSWORD=’insert your MySQL password here
MYSQL_DATABASE=’insert your database name here

# Note: 2592000 = sessions over 30 days old.
# Adjust the timespan as desired.
CLEAN_SESSIONS_SQL=’DELETE FROM `sessions` WHERE `timestamp` < (UNIX_TIMESTAMP() - 2592000)’
OPTIMIZE_CACHES_SQL=’OPTIMIZE TABLE `cache` , `cache_filter` , `cache_menu` , `cache_page` , `cache_views` , `sessions`’

# Note: I have tons of modules installed.
# I tried cleaning out their table names before posting, but you’ll want to double check this SQL.
OPTIMIZE_REST_SQL=’OPTIMIZE TABLE `access` , `accesslog` , `aggregator_category` , `aggregator_category_feed` , `aggregator_category_item` , `aggregator_feed` , `aggregator_item` , `authmap` , `blocks` , `blocks_roles` , `boxes` , `client` , `client_system` , `comments` , `contact` , `files` , `file_revisions` , `filters` , `filter_formats` , `flood` , `history` , `invite` , `menu` , `node` , `node_access` , `node_comment_statistics` , `node_counter` , `node_revisions` , `node_type` , `permission` , `profile_fields` , `profile_values` , `role` , `search_dataset` , `search_index` , `search_total` , `sequences` , `system` , `term_data` , `term_hierarchy` , `term_node` , `term_relation` , `term_synonym` , `url_alias` , `users` , `users_roles` , `variable` , `vocabulary` , `vocabulary_node_types` , `watchdog`’

# Note: I run this script on a Linode which has limited IO.
# Adding a short wait between commands lessens the load on the server.
SLEEP_TIME=5

echo $CLEAN_SESSIONS_SQL | mysql -u $MYSQL_USERNAME -p$MYSQL_PASSWORD $MYSQL_DATABASE
sleep $SLEEP_TIME

echo $OPTIMIZE_CACHES_SQL | mysql -u $MYSQL_USERNAME -p$MYSQL_PASSWORD $MYSQL_DATABASE > /dev/null
sleep $SLEEP_TIME

echo $OPTIMIZE_REST_SQL | mysql -u $MYSQL_USERNAME -p$MYSQL_PASSWORD $MYSQL_DATABASE > /dev/null

echo `date` - Done with Drupal database maintenance

Then just run crontab -e to edit your crontab and add a line similar to:
46 4 * * * /path/to/drupal-db-maint.sh

That will optimize your database nightly at 4:46 AM server time (usually UTC).

If you don’t like my solution, there are plenty of more elegant session cleaning solutions out there. I’m just more comfortable writing shell scripts and crontabs than Drupal modules that use hook_cron.

Posting code in WordPress so that its readable is pretty much impossible. Any suggestions?

Debian Etch Released

Sunday, April 8th, 2007

It appears Debian GNU/Linux 4.0, codenamed Etch, (quite a mouthful) has been released as the latest stable version of Debian.

Go download it using BitTorrent to be cool. ;)

Tip #1:  You do not have to download all of the CDs, just get one and Debian’s installer will download the rest.)

I’m a happy and loyal Debian user and strongly encourage any Linux power users to try out Etch. Ubuntu is still my recommendation for users who want things to “Just Work.” However, I highly recommend Debian for any kind of server (file, print, application, database, web, e-mail).

Tip #2: Check out Debian Multimedia for all of your proprietary codec needs!

Mini-Review

I’ve been using Etch (and Sid) for a while now, so its tough to do a real review from a “normal” end user’s perspective.

Some highlights:

  • Developer’s should be happy as PHP, Python, Mono, MySQL, and PostgreSQL are all fairly up-to-date and extremely easy to install.
  • Gnome is stable and speedy.
  • The graphical installer brings Debian out of the 1990s and offers tons of features.
  • As usual, Debian’s package repository is huge. Forget vendor-supplied RPMs or custom installers, if its not in Debian’s repository (or Debian Multimedia), you probably don’t need it. ;)

Some downsides:

  • Relatively ancient version of the standard C library (2.3.6) means poor compatibility with 3rd party binaries (which Debian has always hated anyway).
  • Gnome is 2 stable versions behind in Debian (2.14 vs. 2.18). This is very frustrating as Gnome constantly produces nice incremental improvements.
  • Debian just isn’t as easy to use by default as Ubuntu. Debian needs to install and integrate some of the more cutting edge desktop apps like Beagle, Tomboy, Deskbar, and Ubuntu’s nicer package management interfaces.
  • Secure APT is a great feature, but the error messages it can produce are going to scare the pants off of new users. More effort needs to be made to integrate all of the various places user’s encounter public key crypto. Right now its very schizophrenic and confusing: GPG errors are too common, and command line magic that even confuses me is necessary to add new repositories.
  • Not related to the release, Debian’s support system is a confusing jumble of web interfaces, mailing lists, IRC channels, and an excellent wiki (that I rarely remember to use). Hopefully this is something a new Debian DPL will address.

PostgreSQL Killed My Harddrive

Wednesday, March 14th, 2007

The harddrive died in the server that dumps our firewall’s syslog output into a nice PostgreSQL database for easy reporting. PostgreSQL didn’t do anything evil in order to kill the drive, but because it syncs data to the harddrive for safety, it kept my ancient server’s harddrive spinning 24/7.

I should mention this was my fault. I really should have turned fsync off to let the server cache the trickle of ICMP packets that keep our firewall awake all night. Instead the old Quantum Fireball Plus LM harddrive wrote every byte right as it came in.

Luckily the harddrive still worked enough that I could attach it to another computer via a IDE-USB adapter and copy off any data I wanted to keep.

Below are the SMART statistics for the drive as collected by smartmontools and munin. As I was able to retrieve data off the disk, I would say SMART did its job even if it didn’t give me much warning.

moodletremontlocal-smart_hda-day.png
moodletremontlocal-smart_hda-month.png

Connecting SQLObjects to PostgreSQL via SSL

Wednesday, March 7th, 2007

I’m trying to learn TurboGears, but I got stumped yesterday trying to figure out how to connect to PostgreSQL via SSL.

Turns out this works in SQLObject (TurboGear’s default ORM):

sqlobject.dburi="postgres://username:pass@servername/mydb sslmode=require"

Turns out the syntax I was trying to use works with SQLAlchemy (which it appears most cool kids use):

postgres://username:pass@servername/mydb?sslmode=require

Notice the only difference is the question mark between the database and extra parameters. Kind of makes me want to cry…

To all my friends who have listened to me extol the virtues of Python, let me also add that learning a new framework sucks. Especially when that framework is a moving target.