bzr branch freezes

Asked by Lucius

I am running:

bzr branch bzr+ssh://alias@100.101.102.103/d:/repos/main

and getting:

Connected (version 2.0, client WeOnlyDo)
Authentication (publickey) successful!
Secsh channel 1 opened.
[#########\ ] 927KB 2KB/s | Fetching revisions:Inserting stream

The screen stays frozen at this point indefinitely. On the server side, I do see how my ssh server shows spawns bzr.exe process and I do see a lot CPU and I/O activity by both SSH server and bzr.exe. Then activity stops for both processes. Though SSH keeps serving other requests. I have tried a couple of times and client side bzr keeps freezing.

What are the best ways to debug this problem. Is there verbose logging I can turn on or anything else that allows me to narrow down the problem?

The OS is Windows Server 2008 on both client and server.
Thanks

Question information

Language:
English Edit question
Status:
Needs information
For:
Bazaar Edit question
Assignee:
No assignee Edit question
Last query:
Last reply:
Revision history for this message
Martin Pool (mbp) said :
#1

Hi Lucius,

Could you please run the bzr command as "bzr branch -Dhpss -Dhpssdetail ....". That will record some debug information into bzr.log. whose location is shown by 'bzr version.' Then just attach the last section of that log in this question.

It's also worth checking you're running the latest release, either bzr 1.18 or bzr 2.0rc1 on both sides.

Revision history for this message
Lucius (luciusf) said :
#2

I am using version 17.1 on both client and server. There is no Windows Installer yet available for 1.18 or 2.x. Once the 1.18 windows installer has been released, I will try it out. In the mean time, the relevant log portion is shown below. What is interesting is that the last "n byte part read" line appears to be truncated to "665 byte pa":

Thu 2009-08-27 17:58:37 -0700
0.194 bzr arguments: [u'branch', u'-Dhpss', u'-Dhpssdetail', u'bzr+ssh://alias@100.100.100.100/d:/repos/main']
0.211 looking for plugins in C:/Users/aliasf/AppData/Roaming/bazaar/2.0/plugins
0.211 looking for plugins in C:/Program Files (x86)/Bazaar/plugins
0.566 encoding stdout as sys.stdout encoding 'cp437'
0.657 hpss: Built a new medium: SmartSSHClientMedium
1.107 bzr-svn: using Subversion 1.5.4 ()
1.490 opening SVN RA connection to 'bzr+ssh://alias@100.100.100.100/d%3A/repos/main'
1.491 hpss call: 'BzrDir.open', 'd:/repos/main/'
1.491 (to bzr+ssh://alias@100.100.100.100/d%3A/repos/main/)
1.495 falling back to default implementation
1.496 failed to load system host keys: [Errno 2] No such file or directory: 'C:\\users\\default/.ssh/known_hosts'
[10148] 2009-08-27 17:58:39.335 INFO: Connected (version 2.0, client WeOnlyDo)
[10148] 2009-08-27 17:58:39.986 INFO: Authentication (publickey) successful!
[10148] 2009-08-27 17:58:40.086 INFO: Secsh channel 1 opened.
3.125 result: ('yes',)
3.126 hpss call: 'BzrDir.open_branchV2', 'd:/repos/main/'
3.126 (to bzr+ssh://alias@100.100.100.100/d%3A/repos/main/)
3.380 result: ('branch', 'Bazaar Branch Format 6 (bzr 0.15)\n')
3.380 hpss call: 'BzrDir.find_repositoryV3', 'd:/repos/main/'
3.380 (to bzr+ssh://alias@100.100.100.100/d%3A/repos/main/)
3.453 result: ('ok', '..', 'no', 'no', 'no', 'Bazaar pack repository format 1 (needs bzr 0.92)\n')
3.454 opening SVN RA connection to 'bzr+ssh://alias@100.100.100.100/d%3A/repos'
3.455 hpss call: 'BzrDir.open', 'd:/repos/'
3.455 (to bzr+ssh://alias@100.100.100.100/d%3A/repos/main/)
3.526 result: ('yes',)
3.527 hpss call: 'BzrDir.find_repositoryV3', 'd:/repos/'
3.527 (to bzr+ssh://alias@100.100.100.100/d%3A/repos/main/)
3.604 result: ('ok', '', 'no', 'no', 'no', 'Bazaar pack repository format 1 (needs bzr 0.92)\n')
3.604 hpss call: 'Repository.is_shared', 'd:/repos/'
3.604 (to bzr+ssh://alias@100.100.100.100/d%3A/repos/main/)
3.677 result: ('yes',)
3.679 hpss call: 'Branch.get_stacked_on_url', 'd:/repos/main/'
3.679 (to bzr+ssh://alias@100.100.100.100/d%3A/repos/main/)
3.751 result: ('UnstackableBranchFormat', 'Branch format 6', 'chroot-26491248:///d:/repos/main/')
3.751 hpss call: 'Branch.last_revision_info', 'd:/repos/main/'
3.751 (to bzr+ssh://alias@100.100.100.100/d%3A/repos/main/)
3.825 result: ('ok', '670', '<email address hidden>')
3.828 hpss call: 'BzrDir.cloning_metadir', 'd:/repos/main/', 'False'
3.828 (to bzr+ssh://alias@100.100.100.100/d%3A/repos/main/)
3.912 result: ('Bazaar-NG meta directory, format 1\n', 'Bazaar pack repository format 1 (needs bzr 0.92)\n', ('branch', 'Bazaar Branch Format 6 (bzr 0.15)\n'))
4.021 creating repository in file:///C:/users/default/main/.bzr/.
4.112 Using fetch logic to copy between RemoteRepository(bzr+ssh://alias@100.100.100.100/d%3A/repos/.bzr/)(<RemoteRepositoryFormat>) and KnitPackRepository('file:///C:/users/default/main/.bzr/repository/')(<RepositoryFormatKnitPack1>)
4.114 hpss call w/body: 'Repository.get_stream', 'd:/repos/', 'Bazaar pack repository format 1 (needs bzr 0.92)\n' ('ancestry-of\nzifengh@'...)
4.114 61 bytes
4.306 result: ('ok',)
4.307 42 byte part read
4.308 54 byte part read
4.309 499 byte part read
4.333 494 byte part read
... many more of the same type lines
14.670 911 byte part read
14.671 425 byte part read
14.672 1483 byte part read
14.672 407 byte part read
14.672 1358 byte part read
14.673 891 byte part read
14.673 424 byte part read
14.674 665 byte pa
Thu 2009-08-27 18:00:14 -0700
0.090 bzr arguments: [u'version']
0.104 looking for plugins in C:/Users/aliasf/AppData/Roaming/bazaar/2.0/plugins
0.104 looking for plugins in C:/Program Files (x86)/Bazaar/plugins
0.219 encoding stdout as sys.stdout encoding 'cp437'
0.232 return code 0

Revision history for this message
Martin Pool (mbp) said :
#3

There are now exe installers for 1.18, could you please retest with that?

Revision history for this message
Andrew Bennetts (spiv) said :
#4

> 14.673 424 byte part read
> 14.674 665 byte pa
> Thu 2009-08-27 18:00:14 -0700

That's weird. It looks like the log file output is buffered, so we're losing the most interesting part :( I thought we tried to open it unbuffered, but maybe there's a windows-specific buglet there?

What's the version of bzr on the server? Is there traceback in the server's .bzr.log? My guess is that the server is encountering an unexpected error while generating the stream, and that a bug is stopping that error from being sent to the client. I vaguely recall fixing a bug in the server-side error handling in a recent-ish release, so perhaps upgrading both ends to 1.18 will at least give a clearer failure. I certainly don't know of any outstanding bugs in server or client error handling in 1.18.

(Hmm... we really should start muttering the reported server version in .bzr.log...)

Another thing to try is hitting Ctrl+Break when it hangs, then generating a backtrace by typing "bt" then pressing Enter. I'd expect that backtrace to confirm that the client is hung while waiting for a message from the server, but it would be good to be certain given then the buffering of the bzr.log.

Revision history for this message
Lucius (luciusf) said :
#5

I upgraded both client and server to bzr version 18.1. This time I am using an entirely new clean OS install of Windows 7 (original repro was on Windows Server 2008). Server still on Windows Server 2008.

Same problem.

To repeat: I am able to pull and push from a different machines. I just can't download a new branch.
I can't do Ctrl+Break or Ctrl+C to unfreeze. I have to kill the client process.

Log has the same problem that the interesting info is not in there:

Mon 2009-09-07 08:50:21 -0700
0.046 bzr arguments: [u'branch', u'-Dhpss', u'-Dhpssdetail', u'bzr+ssh://myalias@55.206.142.242/d:/repos/main']
0.052 looking for plugins in C:/Users/myalias/AppData/Roaming/bazaar/2.0/plugins
0.052 looking for plugins in C:/Program Files (x86)/Bazaar/plugins
0.146 encoding stdout as sys.stdout encoding 'cp437'
0.162 hpss: Built a new medium: SmartSSHClientMedium
0.177 bzr-svn: using Subversion 1.5.6 ()
0.354 hpss call: 'BzrDir.open', 'd:/repos/main/'
0.354 (to bzr+ssh://myalias@55.206.142.242/d%3A/repos/main/)
0.356 falling back to default implementation
0.356 failed to load system host keys: [Errno 2] No such file or directory: 'C:\\Users\\myalias/.ssh/known_hosts'
[ 4912] 2009-09-07 08:50:21.811 INFO: Connected (version 2.0, client WeOnlyDo)
[ 4912] 2009-09-07 08:50:22.556 INFO: Authentication (publickey) successful!
[ 4912] 2009-09-07 08:50:22.657 INFO: Secsh channel 1 opened.
1.829 result: ('yes',)
1.830 hpss call: 'BzrDir.open_branchV2', 'd:/repos/main/'
1.830 (to bzr+ssh://myalias@55.206.142.242/d%3A/repos/main/)
2.121 result: ('branch', 'Bazaar Branch Format 6 (bzr 0.15)\n')
2.122 hpss call: 'BzrDir.find_repositoryV3', 'd:/repos/main/'
2.122 (to bzr+ssh://myalias@55.206.142.242/d%3A/repos/main/)
2.210 result: ('ok', '..', 'no', 'no', 'no', 'Bazaar pack repository format 1 (needs bzr 0.92)\n')
2.211 hpss call: 'BzrDir.open', 'd:/repos/'
2.211 (to bzr+ssh://myalias@55.206.142.242/d%3A/repos/main/)
2.297 result: ('yes',)
2.298 hpss call: 'BzrDir.find_repositoryV3', 'd:/repos/'
2.298 (to bzr+ssh://myalias@55.206.142.242/d%3A/repos/main/)
2.387 result: ('ok', '', 'no', 'no', 'no', 'Bazaar pack repository format 1 (needs bzr 0.92)\n')
2.387 hpss call: 'Repository.is_shared', 'd:/repos/'
2.387 (to bzr+ssh://myalias@55.206.142.242/d%3A/repos/main/)
2.475 result: ('yes',)
2.476 hpss call: 'Branch.get_stacked_on_url', 'd:/repos/main/'
2.476 (to bzr+ssh://myalias@55.206.142.242/d%3A/repos/main/)
2.565 result: ('UnstackableBranchFormat', 'Branch format 6', 'chroot-25811216:///d:/repos/main/')
2.565 hpss call: 'Branch.last_revision_info', 'd:/repos/main/'
2.565 (to bzr+ssh://myalias@55.206.142.242/d%3A/repos/main/)
2.647 result: ('ok', '793', '<email address hidden>')
2.649 hpss call: 'BzrDir.cloning_metadir', 'd:/repos/main/', 'False'
2.649 (to bzr+ssh://myalias@55.206.142.242/d%3A/repos/main/)
2.747 result: ('Bazaar-NG meta directory, format 1\n', 'Bazaar pack repository format 1 (needs bzr 0.92)\n', ('branch', 'Bazaar Branch Format 6 (bzr 0.15)\n'))
2.778 creating repository in file:///C:/Users/myalias/Depots/Rdrn/main/.bzr/.
2.816 Using fetch logic to copy between RemoteRepository(bzr+ssh://myalias@55.206.142.242/d%3A/repos/.bzr/)(<RemoteRepositoryFormat>) and KnitPackRepository('file:///C:/Users/myalias/Depots/Rdrn/main/.bzr/repository/')(<RepositoryFormatKnitPack1>)
2.818 hpss call w/body: 'Repository.get_stream', 'd:/repos/', 'Bazaar pack repository format 1 (needs bzr 0.92)\n' ('ancestry-of\nzifengh@'...)
2.818 61 bytes
3.016 result: ('ok',)
3.016 42 byte part read
3.016 54 byte part read
3.017 480 byte part read
3.021 478 byte part read
... Many more lines of the same type
8.006 410 byte part read
11.826 70254 byte part read <--- Last line in log

Revision history for this message
Lucius (luciusf) said :
#6

The client process does unfreeze if I kill the corresponding bzr process that SSH creates on the server side.

In this case the error is:

bzr: interrupted
HPSS calls: 10 (0 vfs) SmartSSHClientMedium(connected=False, username='myalias', host='55.206.142.242', port=None)

From that I conclude that the connection between client and server is established.

Here is the relevant server log:

Mon 2009-09-07 09:11:07 -0700
0.055 bzr arguments: [u'serve', u'--inet', u'--directory=/', u'--allow-writes']

If there is any other information I can provide, please let me know.

Revision history for this message
Launchpad Janitor (janitor) said :
#7

This question was expired because it remained in the 'Open' state without activity for the last 15 days.

Revision history for this message
Lucius (luciusf) said :
#8

Additional information:
Co-workers experience the same problem, but when using Windows 7 (RTM version). In other words we can repro this problem every time on all machines using WIndows 7.

Revision history for this message
Martin Pool (mbp) said :
#9

Do you know where this "WeOnlyDo" thing is coming from? Could it be that you're using an external ssh client and that's causing a problem? How about if you use paramiko?

Revision history for this message
Lucius (luciusf) said :
#10

Not entirely sure where "WeDoThis" comes from. My best guess is, that this is from the SSH Server we use. We are using freeSSHD for Windows from http://www.freesshd.com/. This product has been working fine for us, but it is conceivable that it has problems with larger streams. Our depot has gotten bigger (~4000 files, 200MB) and branches take a longer time to pull down. It reproes now 100% on various clients. As a work around we zip the entire depot including history, copy it over and hook it back up.

We do NOT use an external SSH client. We use exactly what is installed with the Windows Standalone Installer and do not install any extra tools or apply unusual configuration client side.

Revision history for this message
Martin Pool (mbp) said :
#11

Is there a firewall or ADSL line between you and the server? It may be that things are actually hanging at the network level.

Could you try interrupting the client (I think with Ctrl-Pause) which should put you into a debugger, and then type 'bt' there.

Can you help with this problem?

Provide an answer of your own, or ask Lucius for more information if necessary.

To post a message you must log in.