psycopgda reconnection and conflict handling

Bug #2088 reported by Stuart Bishop
Affects: Launchpad itself
Status: Fix Released
Importance: Medium
Assigned to: Stuart Bishop

Bug Description

When an unserializable transaction exception occurs, Z3 should raise a ReadConflict so that Z3 automatically retries the request. When a psycopg.OperationalError occurs, PsycopgDA should attempt to reconnect (but make no more than a fixed number of reconnection attempts per minute).
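The reconnection half of this request could look roughly like the sketch below. It is not the PsycopgDA code: `OperationalError` is a local stand-in for psycopg.OperationalError so the example runs without a database, and `ReconnectLimiter`, `run_query`, and the `connect`/`query` callables are hypothetical names chosen for illustration. The limit of 3 attempts per 60 seconds is an assumed default, not a value from this report.

```python
import time
from collections import deque


class OperationalError(Exception):
    """Stand-in for psycopg.OperationalError (assumption, no database needed)."""


class ReconnectLimiter:
    """Allow at most max_attempts reconnection attempts per sliding window."""

    def __init__(self, max_attempts=3, window=60.0, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.window = window
        self.clock = clock
        self._attempts = deque()  # timestamps of recent attempts

    def allow(self):
        now = self.clock()
        # Forget attempts that have fallen out of the window.
        while self._attempts and now - self._attempts[0] >= self.window:
            self._attempts.popleft()
        if len(self._attempts) >= self.max_attempts:
            return False
        self._attempts.append(now)
        return True


def run_query(connect, query, limiter):
    """Run query(conn); on OperationalError, reconnect while the limiter allows."""
    conn = connect()
    while True:
        try:
            return query(conn)
        except OperationalError:
            if not limiter.allow():
                raise  # budget exhausted: let the error surface
            conn = connect()
```

Passing the clock in as a parameter keeps the limiter testable without sleeping; a real adapter would probably also want to log each reconnection attempt.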

Steve Alexander wrote:
> 2005-01-13 Federico Di Gregorio <email address hidden>
>
> * ZPsycopgDA/db.py (DB.query): applied patch from Jonathan
> Stoneman to automatically try to reconnect *once* on
> OperationalError. This fix the problem with Zope loosing the
> connection to the database when PostgreSQL is restarted.
>
> http://initd.org/pub/software/psycopg/ChangeLog
>
> I think this is the version in breezy.
>
> Also, Mark reported some IntegrityErrors being changed to
> ProgrammingErrors in the psycopg in breezy.

This is the Zope2 DA, not the Zope3 one btw.

We should do something similar upstream in the Z3 DA, as well as raising the
relevant conflict exceptions when a deadlock or unserializable transaction
is encountered to make Z3 retry the transaction.
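A minimal sketch of the conflict translation described here, under stated assumptions: `DatabaseError` and `ConflictError` are local stand-ins (for the driver's error and for the conflict exception Z3 retries on), the `pgcode` attribute is assumed to carry the PostgreSQL SQLSTATE, and `execute` and `run_with_retries` are hypothetical helpers, not part of any Zope or psycopg API. The two SQLSTATE values are PostgreSQL's codes for serialization failure and deadlock.

```python
SERIALIZATION_FAILURE = "40001"  # PostgreSQL SQLSTATE serialization_failure
DEADLOCK_DETECTED = "40P01"      # PostgreSQL SQLSTATE deadlock_detected


class DatabaseError(Exception):
    """Stand-in for a driver error carrying a SQLSTATE code (assumption)."""

    def __init__(self, message, pgcode=None):
        super().__init__(message)
        self.pgcode = pgcode


class ConflictError(Exception):
    """Stand-in for the conflict exception that makes the publisher retry."""


def execute(cursor_execute, sql):
    """Run a statement, turning retryable database errors into ConflictError."""
    try:
        return cursor_execute(sql)
    except DatabaseError as exc:
        if exc.pgcode in (SERIALIZATION_FAILURE, DEADLOCK_DETECTED):
            raise ConflictError(str(exc)) from exc
        raise  # anything else is a real error and is not retried


def run_with_retries(request, max_retries=3):
    """Mimic a publisher that re-runs the whole request on ConflictError."""
    for attempt in range(max_retries + 1):
        try:
            return request()
        except ConflictError:
            if attempt == max_retries:
                raise
```

Only the two retryable codes are translated, so errors such as constraint violations still reach the caller unchanged.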

Dafydd Harries (daf)
Changed in launchpad:
status: Unconfirmed → Confirmed
Stuart Bishop (stub)
Changed in launchpad:
assignee: nobody → stub
Stuart Bishop (stub) wrote :

With the new session machinery in place, this is now much more important; despite the techniques we are using to avoid concurrent updates, they still happen occasionally.

summary: + When an unserializable transaction exception occurs, Z3 should raise a
+ ReadConflict so Z3 automatically retries the request. When an
+ psycopg.OperationalError occurs, PsycopgDA should attempt to reconnect
+ (but no more than a fixed number of reconnection attempts per minute)
Chris Moore (dooglus) wrote :

I don't know if it's relevant or not, but the only time I've seen this problem (which I reported in the duplicate bug #28721) it was immediately after launchpad had gone down for its weekly update. When I refreshed the page it worked, and still showed something to the effect of "launchpad will be going down very, very soon" even though it was actually on the way back up at the time.

Dafydd Harries (daf) wrote :

This is the source of many oops reports, so I'm reassigning to the oops milestone.

Stuart Bishop (stub)
Changed in launchpad:
status: Confirmed → Fix Committed
Stuart Bishop (stub) wrote :

Serialization and deadlock exception handling is now done, with the code pushed upstream into Z3's psycopgda (but not the tests).

Reconnection after database outages is not yet working. I'll open this as a separate bug due to the large number of serialization errors flagged as duplicates of this one.

Dafydd Harries (daf)
Changed in launchpad:
status: Fix Committed → Fix Released