All of the interesting technological, artistic or just plain fun subjects I'd investigate if I had an infinite number of lifetimes. In other words, a dumping ground...

Wednesday 30 April 2008

Postgresql - versioning the schema and running totals, sums and moving averages

Robert Treat

http://people.planetpostgresql.org/xzilla/index.php?/archives/341-getddl-now-available.htmlgetddl now available

One common discussion we see on the postgresql mailing lists is that of how to track changes and do versioning of schema within a database. One of the common solutions offered is the idea of grabbing schema from the system catalogs, writing it to a file, and then committing that to svn. In theory you could write an elegant tool for such a task, but given the number of times someone has asked me to send them a copy of the script we use at OmniTI, I'm guess that even the hacky script we use will probably be helpful. To that end, I've finally gone ahead and put a public copy of our getddl tool up on labs. Be aware that the script has some issues; I've include a TODO of some items that I'd like to see improved, but it meets our current needs, so you're better off writing patches than waiting for a new release (;-)), but hopefully it will still be of some value.

Leo Hsu and Regina Obe

How to calculate Running Totals and Sums in SQL

People have asked us how to calculate running totals a number of times; not a lot but enough that we feel we should document the general technique. This approach is fairly ANSI-SQL standard and involves using SELF JOINS. In a later article we shall describe how to calculate moving averages which follows a similar technique but with some extra twists.

Note that the below examples can also be done with a correlated sub-select in the SELECT clause and in some cases that sometimes works better. Perhaps we shall show that approach in a later issue. We tend to prefer the look of the SELF JOIN though and in practice it is generally more efficient since its easier for planners to optimize and doesn't always result in a nested loop strategy. Just feels a little cleaner and if you are totaling a lot of columns (e.g number of items, products) etc, much more efficient.



No comments:

tim's shared items

Add to Google Reader or Homepage