display | more...

So, Mr. Banker, you're using transactions in your computer programs to protect the integrity of your data; that's good!

About transactions

Here's a quick introduction or review for those who aren't. Anytime you need to modify your database using more than one operation in sequence, you run the risk of a system or program failure occuring after you've begun but before you finish.

The canonical example, as Big Ben is well aware, is the task of moving a sum of money from one account to another. One first subtracts $X from account A, and then adds it to account B. A failure between the two steps leaves you with a sum of money missing, and nobody likes that. Thus, in SQL, one begins a transaction before the first step, and commits it after the last. The database software ensures that the net effect is either a succesful, complete movement of money, or no change at all.

And now back to savepoints

That's all wonderful, and a boon to programmers the world over. There must be a catch, right? Right. Committing a transaction is an expensive operation (in terms of time; Oracle won't send you a bill). Now let's say that before the Fed closes for the night, Ben has to put on his green visor and do a thousand such transfers. If he does them all in one transaction, he'll get home in time and the Missus won't yell at him; committing after each one will take longer and he'll be lucky if she leaves him leftovers from the dog's dinner.

Unfortunately, the ANSI committee that standardized SQL played a joke, and even Big Al couldn't talk them out of it. They said that any error that occurs inside of a transaction renders the transaction void, and the database will require a rollback. This means that if you've done 999 transfers, but then while you're trying to move $20,000 from Fred's Bank to the Bank of Ethel, the database barks at you because Fred hasn't got it, then all your work is undone and now even the kibble will be cold when you get home.

Are you destined never to get that hot pot roast with apple pie warm from the oven? Was it because you shortchanged an Assyrian rug merchant in a prior life? Well, it's not hopeless. SQL provided you a way to have your cake and eat it, too. (Cake with pot roast?)

Within a transaction, you are allowed to declare a SAVEPOINT. What this does is, it allows you later, after an error occurs, to say to the database, "Okay, I'll rollback. But only back to my savepoint!" This allows you to quickly rewrite your nightly task something like this:

Begin transaction
For each transfer to be done
    Savepoint MyBacon
    Subtract $X from A
    Add $X to B
    If there was any error
        Rollback to savepoint MyBacon
Commit transaction
Now, all your successful transfers will get committed, even if there are errors, and you only have to commit once so you can get home. Though you ought at least to make a note of Fred's overdraft for the morning crew to take care of, so you may miss the very beginning of the salad course.

That is the canonical way to use savepoints. At least, it's the only way I ever have. SQL allows more complicated setups, in that you can have more than one savepoint outstanding at a time (which is why you give it a name when you declare it), but I doubt I'll ever have a use for that.

One hint, if you're using savepoints with PostgreSQL. In a departure from the standard, PostgreSQL maintains stacks of like-named savepoints, rather than releasing[1] the current one when a new one is declared with the same name. Hence the above code would tend to accumulate many of them for no reason. Therefore I actually put a "Release savepoint MyBacon" statement at the end of the loop.

[1] There is a related SQL command called RELEASE, which causes a named savepoint to be destroyed. In the case of PostgreSQL, RELEASEing a savepoint removes the most recently declared one with the given name, restoring visibility to the next most recently declared (and unreleased) savepoint with the same name. Why they did it that way, and why you'd want to take advantage of it, is beyond me.

Well, the reason they did it that way is so that if a stored procedure called during your transaction unknowingly (to you) used the same name for its savepoint as you did for yours, it wouldn't mess with yours. Assuming it was released before the procedure returned, of course.

Log in or register to write something here or to contact authors.