bzr memory usage

Asked by Dimitrios Apostolou

Hi, I sent an email to the Bazaar mailing list about memory usage, but since I'm not subscribed it has not got through after 5 days. My experience in brief:

Commands:

(1) bzr branch lp:gcc
(2) bzr branch --stacked lp:gcc
(3) bzr checkout lp:gcc
(4) bzr checkout --lightweight lp:gcc

As you can see, I am desperately trying to check out a very large project, and I need to be able to do it on old machinery (128MB RAM). Nevertheless, I ran the commands on a recent machine with lots of RAM, and here is what I got as total virtual memory sizes:

(1) and (2) succeeded with too much memory used (Peak at ~800MB, avg about 500-600MB)
(3) was OOM killed (!) after surpassing 850MB
(4) had downloaded more than 8GB of data after 5 or 6 hours and I had to kill it. Its memory usage was much lower but still significant, about 250MB I think when I killed it.

So my questions:
* Is there a leak somewhere? All memory sizes were constantly increasing.
* Is bzr the proper tool for such large projects? Or should I use svn, which I was trying to avoid?
* Is there hope for doing a simple checkout on old machinery? Should I resort to downloading from recent hardware, compressing a tarball, and scp'ing that?
* The successful branch commands took 3-4 hours, is that normal? I got the impression that for the first hour or so it downloaded data but didn't write anything to disk...
* Does the Bazaar mailing list support posts from non-subscribers? If not, it should be mentioned somewhere.

Thanks in advance,
Dimitris

Question information

Language: English
Status: Answered
For: Bazaar
Assignee: No assignee

Dimitrios Apostolou (jimis) said :
#1

I forgot to post info about bzr version:

$ bzr --version
Bazaar (bzr) 2.3.1
  Python interpreter: /usr/bin/python2 2.7.1
  Python standard library: /usr/lib/python2.7
  Platform: Linux-2.6.37-ARCH-i686-Intel-R-_Core-TM-2_Quad_CPU_Q6600_@_2.40GHz-with-glibc2.0
  bzrlib: /usr/lib/python2.7/site-packages/bzrlib
  Bazaar configuration: /home/jimis/.bazaar
  Bazaar log file: /home/jimis/.bzr.log

Andrew Bennetts (spiv) said :
#2

Sorry about the mailing list messages getting stuck in moderation. I've approved it now. Messages from non-subscribers are allowed, but have to be manually moderated to prevent a flood of spam getting through.

Unfortunately we have a fair few memory consumption bugs: <https://bugs.launchpad.net/bzr/+bugs?field.tag=memory>. We're making steady progress on improving bzr's memory consumption, but we're probably some time away from meeting your requirements. Working on a 128MB system isn't a case we've been focussing on, as most systems are much more powerful than that.

Much of the consumption isn't due to straightforward leaks; e.g. in some cases bzr tends to calculate a lot of work up front before it writes the results to disk. That said, it's possible there are some cheap improvements we can make for your specific situation. I'll take a quick look to see if there is anything obvious I can improve for bzr 2.4, and report back with what, if anything, I find.

3-4 hours is unexpected, even for fetching such a large branch as lp:gcc. It should be much quicker.

Dimitrios Apostolou (jimis) said :
#3

I thought that development would be impossible on such low-end machines, but simply downloading the project should be feasible on even lower specs. The way I understood it, a lightweight checkout is exactly this: fetch all the data like wget, with no metadata at all.

But a lightweight checkout of lp:gcc simply doesn't work for me: much more data is downloaded than the project's total size, and it never seems to finish. Can you reproduce this behaviour?

I should note that I tried all the commands without logging in to Launchpad, and without running "init" in the directory beforehand.

Andrew Bennetts (spiv) said :
#4

See my mailing list reply about lightweight checkouts; they are designed for completely local use, and it's a known problem that they are terrible over a network, e.g. bug 762330.

John A Meinel (jameinel) said :
#5

Logging into Launchpad should give better characteristics in general,
since we then use a custom protocol to communicate, rather than reading
the raw files over HTTP.

It should have lower peak memory, but I won't say that it would be
enough to manage on a machine with 128MB of memory.

As for 3 hours, most likely you're just swapping to death.

One thing you could try (though no guarantees), is to:

a) log into launchpad, so we can issue protocol requests
b) fetch only part of the history at a time. So do:
  bzr branch lp:gcc -r 1000
  cd gcc
  for i in `seq 2000 1000 90000`; do bzr pull -r $i; done
  bzr pull
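
The stepwise fetch in (b) could be wrapped in a small script along these lines. This is only a sketch: the function names `revision_steps` and `pull_in_steps` and the `DRY_RUN` variable are made up for illustration, and the 90000 upper bound is just the rough figure used above. Note that `seq` takes its arguments in the order first, increment, last.

```shell
#!/bin/sh
# Hypothetical helper: print the revision numbers to pull, one per line.
# Arguments: first revision, step size, last revision.
revision_steps() {
    # seq expects FIRST INCREMENT LAST, in that order
    seq "$1" "$2" "$3"
}

# Hypothetical wrapper: pull the history in steps and stop at the first
# failure, so an out-of-memory kill does not go unnoticed.
# Set DRY_RUN=1 to print the commands instead of running them.
pull_in_steps() {
    for rev in $(revision_steps "$1" "$2" "$3"); do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "bzr pull -r $rev"
        else
            bzr pull -r "$rev" || return 1
        fi
    done
}
```

After the initial `bzr branch lp:gcc -r 1000`, this would be run from inside the branch as `pull_in_steps 2000 1000 90000 && bzr pull`. A smaller step size might lower peak memory per pull at the cost of more round trips, though that is an assumption and not something measured here.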

Also, I do agree we could use a bit less memory for lightweight checkouts.
At the moment we struggle a bit to balance batching vs bandwidth vs memory
consumption. Also, lightweight checkouts treat the source as a 'dumb'
source, so they don't optimize for streaming in the same way that
branch/push/pull do. (Again, bzr generally assumes close/fast connections
for lightweight checkouts, since you have to access the source for most
operations.)

John
=:->
