How do long-running transactions affect performance?

Asked by Baron Schwartz

I began this conversation with Paul over personal email but wanted to move it here. My question is what happens during long-running transactions. Paul answered:

> Performance is something I am currently looking at in detail for the
> first time since I implemented the fully-durable version of PBXT.
>
> I am currently working on the DBT2 performance of PBXT. The last release
> was not so good at DBT2, but I am making some progress.
>
> That was just "by the way" because DBT2 is not relevant to long running
> transactions (it only has very short transactions).
>
> There is no limit to the length of a transaction in PBXT.
>
> However, a long-running transaction will affect the overall performance
> of PBXT. The problem is that the sweeper thread cannot clean up older
> record versions as long as the long transaction continues.
>
> This makes foreground threads work harder because they run into more and
> more older record versions (which are ignored, but must still be read
> sometimes).

I was also curious if the performance impact on the sweeper thread itself is noticeable. I see there are some linked-list data structures to traverse. These will not necessarily grow long just because of a long-running transaction, I think -- I believe you need the combination of a long-running transaction and lots of updates that create old row versions.

Of course this is standard fodder for an MVCC architecture, but I'm curious whether there are any ways to avoid the performance impact. You've got some clever designs in other aspects.

What is the cost of a rollback in such circumstances?

Baron

Question information

Language: English
Status: Answered
For: PBXT
Assignee: No assignee
Paul McCullagh (paul-mccullagh) said :
#1

Hi Baron,

Congratulations! You are the first (of many I hope) to post a question on this forum!

You are right. The linked list of old versions of a row only gets longer if a row is repeatedly updated.

This is a problem for the sweeper, as you say, because the sweeper must traverse the list in order to remove a record version. But the sweeper's activity has not been a problem in any of my tests so far. Even when it is many thousands of transactions behind, it can catch up in a couple of seconds.
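To make the interaction between a long transaction and the sweeper concrete, here is a rough toy model in Python. It is not PBXT's code: the `Row` class, the `sweep` function, and the rule that a transaction sees every version written by a lower transaction ID are all simplifying assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Row:
    # Version chain, newest first, as (writer_txn_id, value) pairs.
    # Hypothetical layout; PBXT's real record format is different.
    versions: list = field(default_factory=list)

    def update(self, txn_id, value):
        # Each update pushes a new version onto the front of the chain.
        self.versions.insert(0, (txn_id, value))


def sweep(row, oldest_active_txn):
    """Remove versions that no active transaction can still see.

    Assumption: a transaction sees versions written by lower txn IDs.
    We keep every version at or above the oldest active txn ID, plus
    the newest version below it (the one that transaction reads).
    Returns the number of versions removed.
    """
    kept = []
    snapshot_version_kept = False
    for txn_id, value in row.versions:  # walks the whole chain
        if txn_id >= oldest_active_txn:
            kept.append((txn_id, value))
        elif not snapshot_version_kept:
            kept.append((txn_id, value))
            snapshot_version_kept = True
    removed = len(row.versions) - len(kept)
    row.versions = kept
    return removed


row = Row()
for txn in range(1, 1001):  # 1000 updates to the same row
    row.update(txn, f"v{txn}")

blocked = sweep(row, oldest_active_txn=2)    # long transaction 2 still open
freed = sweep(row, oldest_active_txn=1001)   # long transaction has ended
```

In this model the first sweep removes nothing, because the open transaction pins the whole chain. The second sweep collapses the chain to a single version in one pass, which fits the observation that catching up is quick once the long transaction ends.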

Traversing the lists is also not such a problem for the foreground threads, because a thread will usually read only the first version of a record (i.e. the record at the start of the list), conclude that this is the current version, and then not need to read any further.

What is expensive is that index scans and table scans run into the old record versions. So, even though it is a quick process, each version must still be checked for visibility.
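The scan cost can be sketched with another toy model, again not PBXT's implementation. The flat `records` list, the `is_visible` rule (a version is visible if its writer's transaction ID is below the reader's snapshot ID), and the `table_scan` function are assumptions chosen only to show that the work grows with the number of versions rather than the number of rows.

```python
def is_visible(writer_txn, snapshot_txn):
    # Simplified visibility rule (an assumption): versions written by
    # earlier transactions are visible, later ones are not.
    return writer_txn < snapshot_txn


def table_scan(records, snapshot_txn):
    """Scan every stored record version and return the visible rows.

    records: iterable of (row_id, writer_txn, value), one entry per
    stored version, old versions included.
    Returns ({row_id: value}, number_of_visibility_checks).
    """
    checks = 0
    newest = {}  # row_id -> (writer_txn, value) of newest visible version
    for row_id, writer_txn, value in records:
        checks += 1  # every version costs one check, even if discarded
        if is_visible(writer_txn, snapshot_txn):
            best = newest.get(row_id)
            if best is None or writer_txn > best[0]:
                newest[row_id] = (writer_txn, value)
    rows = {row_id: value for row_id, (_, value) in newest.items()}
    return rows, checks


# Three rows, one version each.
swept = [(1, 10, "a"), (2, 11, "b"), (3, 12, "c")]

# Same three rows, but row 1 carries 50 versions the sweeper could not remove.
unswept = [(1, txn, f"a{txn}") for txn in range(10, 60)]
unswept += [(2, 60, "b"), (3, 61, "c")]

rows_swept, checks_swept = table_scan(swept, snapshot_txn=100)
rows_unswept, checks_unswept = table_scan(unswept, snapshot_txn=100)
```

Both scans return three rows, but the second performs 52 visibility checks instead of 3. Each check is cheap, yet the total grows with every version a long-running transaction keeps alive.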

A clever idea of how to improve this situation would be welcome!

Rollback is not affected by this situation, at least not in the sense that it becomes slower. Commit is not slower either. But both operations usually add to the number of old versions, which does affect overall performance.

Paul
