Comment 92 for bug 1439288

In , Aliakc (aliakc) wrote :

(In reply to André Miranda from comment #88)
> Well, I'll try to reproduce this bug later, meanwhile if some one could
> apply Ali's patch, build thunar, remove the "misc-file-size-binary" property
> and give it a shot would be great.

After applying one of the two offered patches, preferably the "improved patch", you don't have to remove or set any property at all. The thunar.xml can be empty (stop and start xfconfd and run thunar --quit afterwards).

Here is an attempt at an explanation:

Race conditions depend on many factors: the speed of the computer (hard disk, CPU), compile flags, the compiler used, different types of libraries, and thread-safe vs. non-thread-safe libs.

My investigation went as follows:

Is dbus 1.6.x the cause? I don't think so, otherwise tons of other programs on my Fedora 20 installation (which I consider rock solid) would have triggered similar issues.

Is xfconf 4.10/4.11/4.12 the cause? I don't think so, otherwise ... (same as above).

Is Thunar 1.6.4+ the cause? My answer here is "most likely". On some machines the bug does not appear, or appears only rarely. This may be due to different (newer) libraries which might be a tad more robust or offer better thread safety. Some machines might be faster, others not. Even compile options and architecture might play a role here.

Setting the "binary size option" was my initial idea because I assumed that not having this flag set might leave an undefined state. This is not the case, as I learned later, since the preferences object sets internal default values.

Now looking at it from different perspectives:

Leave the code of thunar GIT (or 1.6.x) as is and only remove all traces of xfconf from the preferences getter: this lets me run thunar stably most of the time. Crashes become really rare!

This means that the code that "may cause" the crash stays as is and thunar "works", just without any connection to xfconf.

This leads to the belief that either xfconf or dbus may be buggy (let's assume so) or that xfconf_property_get returns some undefined values which then get copied into the preferences object. This is not the case either! But then I came back to my theory that dbus and xfconf are used plenty of times on my system, and that it might be a race condition.

I then took a closer look at the methods that introduced the referencing of the preferences object in order to get the "binary size" attribute. I realized that referencing the preferences object and getting the "binary size" attribute is done at the beginning of the methods (nearly all the time), even though it is often not necessary to do it there, because the attribute is only needed inside a nested if/else statement.

So when I enter the function and reference the preferences object at the beginning, that code is executed ALL the time (reference object, call to xfconf, call to dbus), even if we never hit the nested if/else statements. Placing the preferences object reference close to where it's required makes sense. This heavily reduces activity on the dbus and lets the if/else statements exit without triggering it at all (a performance improvement as well).
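
To illustrate the idea, here is a small stand-alone sketch. It is not Thunar code: fake_prefs_get_file_size_binary() is a made-up stand-in for "reference the preferences object, ask xfconf, talk to dbus", and unit_eager()/unit_lazy() are hypothetical functions that only mimic the shape of the methods in question. The counter just shows how often the expensive lookup is triggered in each variant.

```c
#include <stdbool.h>
#include <stddef.h>

/* Counts how often the (simulated) preferences lookup is triggered. */
unsigned long lookup_count = 0;

/* Hypothetical stand-in for the real lookup, which would reference the
 * preferences object and go through xfconf and dbus. */
bool
fake_prefs_get_file_size_binary (void)
{
  lookup_count++;
  return true;
}

/* Eager variant: the lookup sits at the top of the function, so it runs
 * on every call, even when one of the early returns is taken. */
const char *
unit_eager (size_t n_files, unsigned long total_size)
{
  bool file_size_binary = fake_prefs_get_file_size_binary ();

  if (n_files == 0)
    return NULL;
  if (total_size == 0)
    return NULL;

  return file_size_binary ? "KiB" : "kB";
}

/* Lazy variant: the lookup is moved next to the only place that needs
 * it, so the early returns never touch the preferences at all. */
const char *
unit_lazy (size_t n_files, unsigned long total_size)
{
  if (n_files == 0)
    return NULL;
  if (total_size == 0)
    return NULL;

  return fake_prefs_get_file_size_binary () ? "KiB" : "kB";
}
```

Calling unit_eager() with an empty file list still bumps lookup_count, while unit_lazy() with the same arguments leaves it untouched. In the real code every such avoided lookup would be one round trip less to xfconf and dbus.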

Whenever the user requests "copying", a new thread is created, and after the copy is done the thread exits again (gdb prints the threads being entered and left). What I observed (if we leave thunar as is) is that most of the time when we request "copy" or "move", the thread is entered but never left and thunar crashes (although the file or dir we triggered it on is still copied).

With this in mind I assumed the following:

The file "thunar-transfer-job.c" has two if/return statements close to the beginning of a function. If something is empty (file list and/or size) it returns instantly. I don't know how often this function is called, but I assume it is called very frequently (depending on the number of files and directories moved - it can also be the other way :) ). So we enter that method, trigger the object, return from the if/return clause and deal with the other files and dirs while "dbus and xfconf" are still dealing with the "initial" request for that particular "misc binary size" attribute.

That leaves the final question:

Why does it disappear when we explicitly set the "misc binary size" attribute in the preferences (and leave the code untouched)?

Answer:

Race condition, thread safety, <insert anything else>