sshfs often breaks (azureus related?)

Asked by Bogdan Butnaru on 2007-03-06

This is on Ubuntu Feisty, using fuse.

I have two computers, one of which (say, A) has a big hard drive. The other (call it B) is a laptop I use as my primary machine. A is connected to my home network via Ethernet. When I'm at home, B is connected to the same network via WiFi.

The network works, as far as I can tell. I can always connect to all computers, I did long transfers between them, etc.

Recently I tried using sshfs/fuse to make A's big hard drive accessible from B. (It's a USB drive that used to be connected to B, but I got sick of plugging it in and out all the time.) This works at first: it mounts without any error, I can browse it normally, I can watch movies off it. However, it suddenly fails, always with the same error message: "Mar 6 22:08:26 cimer collectd[6905]: statv?fs failed: Transport endpoint is not connected" (snipped from syslog). If I unmount it and mount it again, it seems to work all right. I couldn't find anything else related in the logs.

I'm not sure, but it might be related to Azureus. I have Azureus set up to download files on the laptop, but once a download is done it moves the files to the remote disk and seeds from there. The problem seems to appear soon after I start Azureus, and I don't think I have seen it without Azureus running, but I'm not sure, and I noticed no explicit connection. Also, I _think_ I once saw the error occur (so I couldn't access the remote disk anymore) while two torrents were seeding; one of them reported the error, but the other seemed to keep working. It could have been served from a cache, though.

Question information

English
Ubuntu
No assignee
Solved by: Michael Bienia

Jeff Greene (jeffgreene) said : #1

The problem could be WiFi related. For instance, if your laptop lost its signal for a moment, the network connection would drop. This might confuse the fuse mount, which would explain why it stops working until it is remounted. Try connecting both computers with Ethernet cables and see whether the problem persists. If it goes away, try staying in range of the WiFi access point, and consider submitting a bug report about fuse not reconnecting automatically after a temporary network outage.

I hope that helps!

Bogdan Butnaru (bogdanb) said : #2

I just tried with a wired connection. It happens the same way. When I start seeding with Azureus, the error appears almost immediately.

If I don't use Azureus, it seems to take longer. I would watch some TV show, it would work for a couple episodes, then suddenly it stops working. I don't know what causes it in those cases. The error is always the same.

I can just "sudo umount -l -f /mnt/mountpoint" (it never works without the lazy option), then just a normal "mount /mnt/mountpoint" makes it work again.
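The remount workaround above can be wrapped in a small helper. This is only a sketch of the workaround, not a fix for the underlying problem. The function names `is_stale` and `remount_if_stale` are made up for illustration, and it assumes the mount point has an /etc/fstab entry (as the plain "mount /mnt/mountpoint" suggests) and that stat reports the same "Transport endpoint is not connected" text that shows up in syslog.

```shell
#!/bin/sh
# Hypothetical helper functions; names are illustrative only.

# is_stale PATH: succeeds if accessing PATH fails with the error
# "Transport endpoint is not connected" (the symptom in this thread).
is_stale() {
    # Capture only stderr of stat; LC_ALL=C keeps the message in English.
    err=$(LC_ALL=C stat "$1" 2>&1 >/dev/null) || true
    case "$err" in
        *"Transport endpoint is not connected"*) return 0 ;;
        *) return 1 ;;
    esac
}

# remount_if_stale PATH: lazy, forced unmount followed by a normal mount,
# exactly the two commands quoted above. Assumes an /etc/fstab entry for PATH.
remount_if_stale() {
    if is_stale "$1"; then
        sudo umount -l -f "$1" && mount "$1"
    fi
}
```

Usage would be something like `remount_if_stale /mnt/mountpoint`, run by hand or from a periodic job.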

So, should I report this under fuse?

Jeff Greene (jeffgreene) said : #3

No, you wouldn't have enough information yet. Try launching Azureus from a terminal and see if it prints any errors.

Bogdan Butnaru (bogdanb) said : #4

OK, I tried running Azureus in a console. It pretty much says the same thing: an exception with a big stack trace is thrown a few times. The message is:

DEBUG::Thu Mar 08 01:24:24 CET 2007::com.aelitis.azureus.core.diskmanager.file.impl.FMFileImpl::openSupport::383: /media/LACIE/torrent/files.done/[filename] (Transport endpoint is not connected)
[rest of stack trace omitted]

This happens a few more times, with pretty much the same header:

DEBUG::Thu Mar 08 01:24:24 CET 2007::com.aelitis.azureus.core.diskmanager.file.impl.FMFileImpl::openSupport::383:
DEBUG::Thu Mar 08 01:24:24 CET 2007::org.gudy.azureus2.core3.disk.impl.access.impl.DMReaderImpl$requestDispatcher::failed::575:
DEBUG::Thu Mar 08 01:24:24 CET 2007::com.aelitis.azureus.core.diskmanager.file.impl.FMFileImpl::openSupport::383:
DEBUG::Thu Mar 08 01:24:24 CET 2007::com.aelitis.azureus.core.diskmanager.file.impl.FMFileImpl::openSupport::383:

Then it gives up with:

DEBUG::Thu Mar 08 01:24:24 CET 2007::org.gudy.azureus2.core3.disk.impl.access.impl.DMReaderImpl$requestDispatcher::failed::575:
  com.aelitis.azureus.core.diskmanager.cache.CacheFileManagerException: open fails

Presumably it has a set number of retries, or it just started four or five threads that try to read at the same time.

Interestingly, syslog has a single entry at 1:24:25, reading

Mar 8 01:24:25 cimer collectd[5953]: statv?fs failed: Transport endpoint is not connected

But then it continues with many identical lines (only the timestamp changes). These are interspersed with many other warnings/errors from pulseaudio and ntpd, but I get those all the time, and the problem appears almost exclusively when running Azureus.
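To line up the first syslog entry with the Azureus timestamps, the relevant lines can be pulled out of the noise with grep. A minimal sketch, assuming the log is a plain text file (the path /var/log/syslog in the usage note is an assumption); the function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers for picking the sshfs/fuse failure out of a syslog file.

# first_endpoint_error FILE: print the first line that mentions the error,
# which gives the timestamp at which the mount went away.
first_endpoint_error() {
    grep 'Transport endpoint is not connected' "$1" | head -n 1
}

# count_endpoint_errors FILE: print how many lines mention the error.
# Note: grep -c prints 0 but exits non-zero when there are no matches.
count_endpoint_errors() {
    grep -c 'Transport endpoint is not connected' "$1"
}
```

For example, `first_endpoint_error /var/log/syslog` would print the earliest matching line, whose timestamp can then be compared with the first exception in the Azureus console output.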

Jeff Greene (jeffgreene) said : #5

Hi again,
Perhaps this error is caused by Azureus using the sshfs mount for cache/temporary files. It may be writing too aggressively, not knowing it is a network mount. sshfs may have limitations that make it unsuitable as a download location for Azureus.

Alternatively, why don't you try turning on "Remote Desktop" (System --> Preferences --> Remote Desktop) or downloading the torrent through SSH? This way, you can still initiate the torrent download from the laptop, but all the downloading will be done on the other computer.

I thought of a better solution:
Azureus has a web interface that you can enable. This allows you to be able to control Azureus completely from a web browser. Set it up on the desktop computer with the big hard drive and then start all the downloads from the laptop through the web interface.

I hope one of those solutions works. Good luck!

Bogdan Butnaru (bogdanb) said : #6

Hi! First of all, I'm almost sure Azureus is not writing, but reading. I have configured it very carefully to use the remote folders only for completed downloads. It could do a bit of intense reading at start-up, I don't know.

Anyway, I know I can run a torrent client on the other host, but that's not exactly the point. I want to trust the ssh-mounted fs, because it contains a lot of relatively important data. That's partly why I use SSH instead of NFS or SMB. If it breaks just because Azureus does a few quick reads then it may break when I copy my photos to it, or something similar, and I could lose data. A network filesystem should not break unless the network fails, and this doesn't happen, because other simultaneous SSH sessions don't break. (It should also resume when the network is again available, but that's another thing.)
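On the parenthetical point about resuming once the network is back: sshfs has a `reconnect` mount option, and it passes ssh_config options such as `ServerAliveInterval` through to ssh. The sketch below shows how they might be combined. It is an assumption that this is relevant here at all, since the failure described in this thread happens while other SSH sessions stay up. The function name and both arguments are placeholders.

```shell
#!/bin/sh
# Hypothetical wrapper; REMOTE (e.g. user@A:/path/on/A) and MOUNTPOINT are placeholders.
mount_with_reconnect() {
    remote=$1
    mountpoint=$2
    # reconnect: sshfs option to re-establish the SSH connection after it drops.
    # ServerAliveInterval / ServerAliveCountMax: ssh_config options handed to ssh,
    # so an unresponsive server is noticed after roughly 15 s x 3 probes.
    sshfs -o reconnect,ServerAliveInterval=15,ServerAliveCountMax=3 \
        "$remote" "$mountpoint"
}
```

The interval and count values are arbitrary examples, not recommendations from this thread.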

Initially the hard drive was connected directly to my computer, and I had a lot of applications set up to use it. I moved it to another computer and expected to just mount it through sshfs at the same mount point and go on. Azureus just happened to use that mount point, too.

I could go to the trouble of moving Azureus to the other computer and connecting remotely -- I did that for Amarok -- but that's not the point. I need to find out what the problem is with sshfs, because I can't move _all_ processes to the other computer and I need to use that drive.

Best Michael Bienia (geser) said : #7

This is a known regression in fuse 2.6.2, which is fixed in fuse 2.6.3. fuse 2.6.3 should hopefully appear in Feisty soon.
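To see which side of that fix a machine is on, the installed fuse version can be compared against 2.6.3. A rough sketch: it assumes `fusermount -V` prints a line of the form "fusermount version: X.Y.Z" and that GNU `sort -V` is available; the function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative only.

# version_ge A B: succeeds if dotted version A is >= B.
# Relies on GNU sort -V (version sort): A >= B iff B sorts first (or they are equal).
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# fuse_has_fix: succeeds if the installed fusermount reports version 2.6.3 or later.
# Assumes output like "fusermount version: 2.6.3"; fails if no version is found.
fuse_has_fix() {
    ver=$(fusermount -V 2>/dev/null | sed -n 's/.*version: *\([0-9.]*\).*/\1/p')
    [ -n "$ver" ] && version_ge "$ver" 2.6.3
}
```

For example, `fuse_has_fix && echo "fuse is 2.6.3 or newer"` prints the message only when the check passes.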

Bogdan Butnaru (bogdanb) said : #8

User confirmed that the question is solved.

Bogdan Butnaru (bogdanb) said : #9

Thanks! Which of the bugs there caused my problem?