Zim

slow task list updates

Asked by Morten Nielsen

Hi

I have been a happy Zim user since 2009, and I have kept my notes and tasks in Zim. It works and I am able to find and retrieve what I did last year. Thank you for that.

It also means that I have lots and lots of tasks. Currently I have 360 open tasks with something like 30 different tags, and most tasks have multiple tags. All of this is spread out over numerous subpages.

The issue I am encountering is that whenever I update a todo list on a page, Zim freezes for a "long" time at 100% CPU while it updates. The question is what to do about it. I have not had any crashes and it is not a bug as such, just an optimization issue.

I am running Zim version 0.54 on Debian testing.

Do you have any suggestion to mitigate this?

Question information

Language: English
Status: Answered
For: Zim
Assignee: No assignee

Jaap Karssenberg (jaap.karssenberg) said :
#1

On Thu, Jun 28, 2012 at 3:41 PM, Morten Nielsen
<email address hidden> wrote:
> I have been a happy Zim user since 2009, and I have kept my notes and tasks in Zim. It works and I am able to find and retrieve what I did last year. Thank you for that.
>
> It also means that I have lots and lots of tasks. Currently I have 360 open tasks with something like 30 different tags, and most tasks have multiple tags. All of this is spread out over numerous subpages.
>
> The issue I am encountering is that whenever I update a todo list on a page, Zim freezes for a "long" time at 100% CPU while it updates. The question is what to do about it. I have not had any crashes and it is not a bug as such, just an optimization issue.
>
> I am running Zim version 0.54 on Debian testing.
>
> Do you have any suggestion to mitigate this?

I think it should depend on the size of the page you are updating.
Basically with each update zim re-indexes all tasks on that page. This
should work quite well as long as you use small to medium size pages.
But if you have one page with a long list of tasks it may hit a limit
where it becomes blocking.

Now 360 tasks doesn't sound like a lot to me (well, it sounds like you
have a lot of stuff to do, but it doesn't sound like a number that
should cause this kind of behavior). But maybe you have a lot of
closed tasks on that page?

Or does it also happen when you make changes to pages with only a few
tasks? In that case there may be a bug in the updating.

Regards,

Jaap

Morten Nielsen (mbn) said :
#2

Based on your reply, I considered what could be wrong.

The first thing I did was to turn off the setting "Consider all checkboxes as tasks" (in Edit -> Preferences -> Plugins -> Task List -> Configure). That helped. Re-indexing was noticeably faster with that option turned off.

The wait appears most often on pages with a lot of tasks, but it also occurs on pages with 10-15 tasks. My personal conclusion is that it is not the page itself, but the data structure the page data is inserted into.

Does Zim keep a master list of all tasks, both open and closed?
I am Python literate, so if you give me a hint of the file to look in, I could try to dig out the actual size of the task list. That would tell us if it is the task list size that is the problem.

Morten Nielsen (mbn) said :
#3

btw,

$ python --version
Python 2.7.2+

Jaap Karssenberg (jaap.karssenberg) said :
#4

On Thu, Jul 5, 2012 at 11:46 AM, Morten Nielsen
<email address hidden> wrote:
> Question #201715 on Zim changed:
> https://answers.launchpad.net/zim/+question/201715
>
> Status: Answered => Open
>
> Morten Nielsen is still having a problem:
> Based on your reply, I considered what could be wrong.
>
> The first thing I did was to turn off the setting "Consider all checkboxes as
> tasks" (in Edit -> Preferences -> Plugins -> Task List -> Configure). That
> helped. Re-indexing was noticeably faster with that option turned
> off.
>
> The wait appears most often on pages with a lot of tasks, but it
> also occurs on pages with 10-15 tasks. My personal conclusion is that it is not
> the page itself, but the data structure the page data is inserted into.
>
> Does Zim keep a master list of all tasks, both open and closed?
> I am Python literate, so if you give me a hint of the file to look in, I could try to dig out the actual size of the task list. That would tell us if it is the task list size that is the problem.

It is stored in the index database. This is a SQLite database stored
in the ".zim" cache folder in the notebook. The code that indexes the page
and updates this table is in the plugin in zim/plugins/tasklist.py.

There are several tools to inspect the database file - I sometimes use
this one: http://sqlitebrowser.sourceforge.net/
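
For a quick look without a GUI tool, the row counts can also be read with Python's built-in sqlite3 module. This is only a sketch: the helper name table_row_counts is made up for illustration, and the path in the usage comment is an assumption based on the ".zim" cache folder mentioned above.

```python
import sqlite3


def table_row_counts(conn):
    """Return {table_name: row_count} for every table in the database."""
    names = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    counts = {}
    for name in names:
        # The names come from sqlite_master itself; quote them defensively.
        quoted = '"' + name.replace('"', '""') + '"'
        query = "SELECT count(*) FROM %s" % quoted
        counts[name] = conn.execute(query).fetchone()[0]
    return counts


# Usage (hypothetical path, adjust to your own notebook):
#   conn = sqlite3.connect("ZimNotes/.zim/index.db")
#   for name, count in sorted(table_row_counts(conn).items()):
#       print("%-12s %d" % (name, count))
```

It is probably safest to run this on a copy of index.db, or while Zim is closed, so that it does not compete with Zim for the database lock.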

Maybe you can add a few print statements in the plugin to check in which
part of the code the delay is located.
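
One way to do that without scattering start/stop bookkeeping through the plugin is a small timing helper. The sketch below is generic Python, not part of Zim; the name timed and the function in the usage comment are hypothetical.

```python
import sys
import time
from contextlib import contextmanager


@contextmanager
def timed(label, out=sys.stderr):
    """Write how long the wrapped block took, in milliseconds, to `out`."""
    start = time.time()
    try:
        yield
    finally:
        elapsed_ms = (time.time() - start) * 1000.0
        out.write("%s took %.1f ms\n" % (label, elapsed_ms))


# Usage inside the code under suspicion (the function name is hypothetical):
#   with timed("index tasks on page"):
#       some_indexing_function(page)
```

Wrapping a few candidate sections this way and starting Zim from a terminal should show which section accounts for the freeze.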

Regards,

Jaap

Jaap Karssenberg (jaap.karssenberg) said :
#5

P.S. If it does not depend on the page, there may be a genuine bug or
missing optimization. That is something I would want to fix, so let me know
where I can help.

-- Jaap

Morten Nielsen (mbn) said :
#6

I have done some digging.

Before opening the task list window for the first time after starting, there is no problem.
It then takes a couple of seconds to show the task list window, which is similar to the delay I encounter whenever a page is autosaved.

In tasklist.py I inserted a counter in TaskListTreeView._appendtask.
It shows that whenever a page is autosaved, _appendtask is called recursively a lot:
with 'consider all checkboxes as tasks' it was 335 times, and without it was 107.
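
For anyone who wants to repeat this kind of measurement, a call counter can be added as a decorator. This is a generic sketch and not the counter used above; count_calls, call_counts and append_task are made-up names, and append_task is only a stand-in for a recursive method.

```python
import functools
from collections import Counter

call_counts = Counter()


def count_calls(func):
    """Count every call to `func`, including recursive ones."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        call_counts[func.__name__] += 1
        return func(*args, **kwargs)
    return wrapper


@count_calls
def append_task(task):
    # Stand-in for a recursive method: visit the task, then its children.
    for child in task.get("children", ()):
        append_task(child)


tree = {"children": [{"children": [{}, {}]}, {}]}
append_task(tree)
print(call_counts["append_task"])  # prints 5: the root plus four descendants
```

Because the recursive call goes through the decorated name, nested calls are counted as well, which is what makes the totals comparable between the two settings.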

My conclusion is that it relates to the recursive updates of the listview data in the tasklist window. In TaskListTreeView._appendtask it seems that the rows are processed 5 times at every call - it seems like a good candidate for optimizing :-)

This, of course, is much worse for re-indexing, which recalculates whenever a page with tasks is encountered.

I have included some stats about my notebook. Would you consider 1055 pages and 1155 tasks a lot?
---
Some stats from sqlite about my notebook below
without 'consider all checkboxes as tasks'
$ ZimNotes/.zim
sqlite3 index.db
sqlite> .tables
links meta pagetypes tagsources
linktypes pages tags tasklist
sqlite> select count(*) from links;
605
sqlite> select count(*) from linktypes;
0
sqlite> select count(*) from meta;
2
sqlite> select count(*) from pages;
1054
sqlite> select count(*) from pagetypes;
0
sqlite> select count(*) from tags;
93
sqlite> select count(*) from tagsources;
289
sqlite> select count(*) from tasklist;
1155

with 'consider all checkboxes as tasks'
sqlite> select count(*) from links;
605
sqlite> select count(*) from linktypes;
0
sqlite> select count(*) from meta;
2
sqlite> select count(*) from pages;
1055
sqlite> select count(*) from pagetypes;
0
sqlite> select count(*) from tags;
93
sqlite> select count(*) from tagsources;
289
sqlite> select count(*) from tasklist;
3179

Jaap Karssenberg (jaap.karssenberg) said :
#7

What you refer to is logic to update the dialog, not the indexing
itself. This is something I had not thought about. We can definitely
make the dialog update asynchronously so that it does not block the
indexing, and thereby the saving.

I also see something else that is bad: the dialog is not destroyed
properly, so if you open and close it multiple times, there are multiple
instances in the background still updating. So each time you open the
dialog, the lag should increase...

So the fix would be:
1) disconnect dialog update from indexing / update async
2) properly destroy dialog
3) check optimizing dialog update
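
Points 1 and 2 can be illustrated with a small, GTK-free sketch. Everything here is hypothetical: Index and TaskDialog are stand-ins invented for illustration and do not correspond to the actual classes in zim/plugins/tasklist.py. The idea is that the signal handler only marks the dialog as stale, the expensive refresh runs later and at most once, and destroying the dialog disconnects its handler.

```python
class Index(object):
    """Stand-in for an object that emits a signal after indexing a page."""

    def __init__(self):
        self._handlers = {}
        self._next_id = 0

    def connect(self, handler):
        self._next_id += 1
        self._handlers[self._next_id] = handler
        return self._next_id

    def disconnect(self, handler_id):
        self._handlers.pop(handler_id, None)

    def handler_count(self):
        return len(self._handlers)

    def emit_page_indexed(self):
        for handler in list(self._handlers.values()):
            handler()


class TaskDialog(object):
    """Stand-in for the task list dialog."""

    def __init__(self, index):
        self.index = index
        self.dirty = False
        self.refresh_count = 0
        self._handler_id = index.connect(self.on_page_indexed)

    def on_page_indexed(self):
        # (1) Cheap: only note that the view is stale, do not rebuild it here.
        self.dirty = True

    def refresh_if_dirty(self):
        # Meant to be called later, e.g. from an idle or timeout callback.
        if self.dirty:
            self.dirty = False
            self.refresh_count += 1  # the expensive rebuild would go here

    def destroy(self):
        # (2) Without this, a closed dialog keeps reacting to every signal.
        self.index.disconnect(self._handler_id)
        self._handler_id = None


index = Index()
dialog = TaskDialog(index)
for _ in range(10):
    index.emit_page_indexed()   # ten pages indexed in a row
dialog.refresh_if_dirty()       # one refresh instead of ten
dialog.destroy()
```

In a PyGTK application the deferred refresh could presumably be scheduled with gobject.idle_add, but how that fits into the real dialog code is not something this sketch tries to answer.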

Probably I can have a look over the weekend, but if you feel like
digging in, go ahead.

Regards,

Jaap


Jaap Karssenberg (jaap.karssenberg) said :
#8

On Thu, Jul 5, 2012 at 4:55 PM, Morten Nielsen
<email address hidden> wrote:
...
> I have included some stats about my notebook. Would you consider 1055 pages and 1155 tasks a lot?

Not at all - indexing handles one page at a time and for a database
these are very small numbers to handle.

An interesting statistic would be the max / average number of tasks per page.
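
Such a statistic could be pulled from the index database with a grouped query. The sketch below makes a loud assumption: that the tasklist table has a column identifying the page each task came from. It is called "source" here, but that name is a guess and should be checked against the real schema (for example with ".schema tasklist" in the sqlite3 shell) before relying on the result.

```python
import sqlite3


def tasks_per_page_stats(conn, table="tasklist", page_column="source"):
    """Return (max, average) number of tasks per page.

    `page_column` is an assumed column name, not a confirmed part of the
    Zim schema. Pages without any tasks are not part of the average.
    """
    query = (
        "SELECT max(n), avg(n) FROM "
        "(SELECT count(*) AS n FROM %s GROUP BY %s)" % (table, page_column)
    )
    return conn.execute(query).fetchone()


# Usage (hypothetical path, adjust to your own notebook):
#   conn = sqlite3.connect("ZimNotes/.zim/index.db")
#   print(tasks_per_page_stats(conn))
```

On an empty table both values come back as None, since there are no groups to aggregate.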

Memo to self: add a small routine that lists these statistics and puts
them in the debug output when debug is enabled. Likewise for depth of
pages, links etc.

-- Jaap
