Revno/revision history of central branch in distributed-style version control

Asked by Brian Cox

Hello,

I'm having a hard time understanding how the revision number and history are set for the central branch when working in the distributed-style setup.

It appears that when a user pushes their branch to the central branch, the central branch gets the revision history of the user branch that was pushed.

For example, earlier today I pushed my branch and the bzr log of the central branch showed only my commits, with the latest revno being 55. Another user merged my changes, committed after resolving a conflict, then pushed. Now the bzr log shows his commits, with the latest revno being 47.

I stumbled across this while trying to get bazaar-hookless-email set up and was troubleshooting why so many messages (one for each commit on a user branch) were being sent out. It might be related to this I suppose.

Do I possibly have something set up incorrectly in Bazaar to have the revision history change like this? Or am I doing something else wrong?

Any ideas/help is greatly appreciated.

Thanks!
Brian

Question information

Language: English
Status: Solved
For: Bazaar
Assignee: No assignee
Solved by: Brian Cox
Revision history for this message
John A Meinel (jameinel) said :
#1

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 9/14/2011 1:10 AM, Brian Cox wrote:
> New question #171044 on Bazaar:
> https://answers.launchpad.net/bzr/+question/171044

If you want to preserve monotonically increasing revnos, you can do:

  bzr config append_revisions_only=True -d $BRANCH

That will prevent users from pushing changes that would allow the
mainline revisions to change. For a graphical example (time goes down):

A
|
B
|
C
|
D # Your original history

A
|
B
|\
C X # User branched off an old version, did some changes
| |
D |
 \|
  Z # Merged your new tip

This is what the history looks like in *their* branch, and what the
trunk branch looks like after they do 'bzr push':
A
|
B
|\
X C
| |
| D
|/
Z

And 'bzr log -n1' will now show A B X Z instead of A B C D.
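
As a rough illustration of why the log changes, here is a toy Python model of the graphs above. It is only a sketch, not bzr's implementation: the `HISTORY` dictionary and the `mainline` helper are made up for this example, and it assumes the mainline is simply the chain of first ("left-hand") parents, with the revno equal to the length of that chain.

```python
# Toy model only: each revision maps to its ordered list of parents.
# The first parent is the "left-hand" parent that defines the mainline.
HISTORY = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["C"],
    "X": ["B"],
    "Z": ["X", "D"],  # committed in the user's branch: X first, then D
}


def mainline(history, tip):
    """Return the first-parent chain from the oldest revision up to tip."""
    chain = []
    rev = tip
    while rev is not None:
        chain.append(rev)
        parents = history[rev]
        rev = parents[0] if parents else None
    return list(reversed(chain))


before = mainline(HISTORY, "D")  # trunk before the push
after = mainline(HISTORY, "Z")   # trunk after the user's push
print(before, "revno", len(before))
print(after, "revno", len(after))
```

In this model C and D are still in the ancestry of Z; they have only moved off the first-parent chain, which is why a mainline-only log stops showing them.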

We recommend setting append_revisions_only=True for shared branches.
We just haven't come up with a great way of exposing that to people.

Note that with append_revisions_only=True, each user needs 2 local
branches for doing work. One is the feature branch; the other stays a
"pristine" copy of the trunk branch, which the user switches to, merges
the feature branch into, commits and pushes to trunk. You can't push
from your feature branch, because you would be changing the mainline
history of trunk.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk5xpVgACgkQJdeBCYSNAAPdCACdE4v+E5CjQO55bcPFRdZKTkEP
jX4AoIHi9p61XekfsunhGgC7w/3UMRlv
=k0kK
-----END PGP SIGNATURE-----

Revision history for this message
Brian Cox (brian-cox) said :
#2

This answered my question, but I have a follow up question now.

Why is it different for a user to have a "pristine" copy of the trunk versus just pushing from their development branch? Isn't the trunk mainline history going to be affected either way?

My attempt at visualizing this is below. If I understand, then a user would make 2 branches when they start development. The C/D branch below would be their development branch, that they would push into the J/K branch, which would then be pushed into the trunk.

A
|
B--
|\ \
X J C
| | |
| | |
| | D
| | /
| K
| /
Z

Or... are you saying that they have to create a new intermediate branch every time they want to push in to the trunk?

Up to this point, our process has been:
- User gets branch from trunk
- user makes changes
- user commits changes on his branch
- user merges from trunk, resolving any conflicts
- user commits these changes
- user pushes to trunk

So you're saying this will not work with the append_revisions_only setting set to True?

Thank you again for the help!

Revision history for this message
John A Meinel (jameinel) said :
#3

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

...
> So you're saying this will not work with the append_revisions_only
> setting set to True?

Correct. There is one step that needs to change:

> - User gets branch from trunk
> - user makes changes
> - user commits changes on his branch
> - user merges from trunk, resolving any conflicts

Becomes
  - user switches to their trunk branch
  - user pulls latest trunk tip
  - user merges their changes into trunk

> - user commits these changes
> - user pushes to trunk

If they do that, my earlier graph looks like:

 A
 |
 B
 |\
 C X
 | |
 D |
 |/
 Z

Which is subtly different from:
 A
 |
 B
 |\
 C X
 | |
 D |
  \|
   Z

Specifically, the order of the parents in the first is "D,X"; in the
second it is "X,D".

If you push the first one, the 'mainline' history is only added to. Once
a revision is in the mainline history, it stays there. In the
'append_revisions_only=False' case, you still don't "lose" revisions;
they are just no longer mainline after the push.
append_revisions_only=True is strictly about making sure that revisions
which become mainline on a given branch will always be mainline
revisions on that branch.
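
The rule can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not bzr's own code: `lefthand_history` and `push_allowed_append_only` are hypothetical helpers, and the check assumed here is "the old tip must still be on the first-parent chain of the new tip".

```python
def lefthand_history(parents_of, tip):
    """Follow first parents from tip back to the root (newest first)."""
    chain = []
    rev = tip
    while rev is not None:
        chain.append(rev)
        parents = parents_of[rev]
        rev = parents[0] if parents else None
    return chain


def push_allowed_append_only(parents_of, old_tip, new_tip):
    """True if old_tip stays on the mainline once new_tip becomes the tip."""
    return old_tip in lefthand_history(parents_of, new_tip)


BASE = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "X": ["B"]}
landed = dict(BASE, Z=["D", "X"])   # merged X while sitting on trunk at D
updated = dict(BASE, Z=["X", "D"])  # merged D while sitting on the feature at X

print(push_allowed_append_only(landed, "D", "Z"))   # True
print(push_allowed_append_only(updated, "D", "Z"))  # False
```

Both graphs contain exactly the same revisions; only the parent order of Z differs, and that alone decides whether D remains a mainline revision.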

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk5yPw0ACgkQJdeBCYSNAAN+DACfR1yTwR7ddO0ldSLDw+GApmDm
dgkAoLgJHCMnnvTaXtGTQBpXoSVB1DT2
=mrP7
-----END PGP SIGNATURE-----

Revision history for this message
Brian Cox (brian-cox) said :
#4

Thanks again John. At the risk of repeating myself I'm going to ask some more questions so I completely understand it before I explain it to the other developers.

The user version of trunk, that's something they can keep merging (or is pulling more appropriate?) to so that it stays current without the need for creating a new branch each time, right?

And I'm still not crystal clear on how using the intermediate trunk branch changes things.

(note, my Clearcase background may be clouding my judgement here)

Let's say I have a central trunk branch and a user creates two branches, one as their "trunk" and one as their development branch. They make changes in their branch and in the meantime someone else makes a separate change to the central trunk.

When the user is ready to bring their changes to the central branch so everyone else gets them they would:
- merge (or pull?) from the central branch to their version of the trunk branch
- merge the development branch into the intermediate trunk branch (does the development branch need to be updated with the current trunk before doing this?)
- commit the changes
- push to the central trunk

I think one of the things I am having trouble wrapping my head around is how does Bazaar know the difference between the two branches that the developer makes so that it does

 A
 |
 B
 |\
 C X
 | |
 D |
 |/
 Z

instead of

 A
 |
 B
 |\
 C X
 | |
 D |
  \|
   Z

?

The first is what I want so that bazaar-hookless-email looks back and sends an email for going from D to Z (not details about everything on the development branch, though all of this explains the weird (to me) behavior that I was seeing given the way the revision history seems to move around).

Brian

Revision history for this message
Brian Cox (brian-cox) said :
#5

And another follow up question. If I set append_revisions_only=True, what are the consequences if I revert that to false?

Revision history for this message
John A Meinel (jameinel) said :
#6

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 9/22/2011 7:05 PM, Brian Cox wrote:
> Question #171044 on Bazaar changed:
> https://answers.launchpad.net/bzr/+question/171044
>
> Brian Cox gave more information on the question: And another follow
> up question. If I set append_revisions_only=True, what are the
> consequences if I revert that to false?
>

If you ever set it to False, then you are back at the default
behavior. Any commit which has the existing tip in its ancestry is
allowed to become the new tip. (I can give more details if you need.)
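
For comparison, the default rule described here can be modelled as a plain ancestry test. Again this is only a sketch with hypothetical names (`ancestry`, `push_allowed_default`); it assumes the default check looks at all parents rather than only the first one.

```python
def ancestry(parents_of, tip):
    """Return every revision reachable from tip through any parent."""
    seen = set()
    stack = [tip]
    while stack:
        rev = stack.pop()
        if rev in seen:
            continue
        seen.add(rev)
        stack.extend(parents_of[rev])
    return seen


def push_allowed_default(parents_of, old_tip, new_tip):
    """Default behaviour: the old tip only has to be somewhere in the ancestry."""
    return old_tip in ancestry(parents_of, new_tip)


GRAPH = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["C"],
    "X": ["B"],
    "Z": ["X", "D"],  # D is the second parent here
}

print(push_allowed_default(GRAPH, "D", "Z"))  # True: D is an ancestor of Z
print(push_allowed_default(GRAPH, "D", "X"))  # False: X never merged D
```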

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk58dSMACgkQJdeBCYSNAAOiRACfSDT3ZC8Kk4Cch1YcRtckLjc4
jPUAn1XN7h+EMbikBC4vErzdyY9h/gBo
=sYk3
-----END PGP SIGNATURE-----

Revision history for this message
John A Meinel (jameinel) said :
#7

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 9/15/2011 9:20 PM, Brian Cox wrote:
> Question #171044 on Bazaar changed:
> https://answers.launchpad.net/bzr/+question/171044
>
> Status: Answered => Open
>
> Brian Cox is still having a problem: Thanks again John. At the risk
> of repeating myself I'm going to ask some more questions so I
> completely understand it before I explain it to the other
> developers.
>
> The user version of trunk, that's something they can keep merging
> (or is pulling more appropriate?) to so that it stays current
> without the need for creating a new branch each time, right?
>

Correct. You would generally only pull into it, because you want an
exact 'mirror' of trunk, not something that is merely 'similar'.

> And I'm still not crystal clear on how using the intermediate
> trunk branch changes things.

bzr does not treat merging as 100% symmetric. Merging A => B gives a
slightly different result than B => A. The only major effect is that
of mainline history, and how that impacts things like 'bzr log'.

>
> (note, my Clearcase background may be clouding my judgement here)
>
> Let's say I have a central trunk branch and a user creates two
> branches, one as their "trunk" and one as their development branch.
> They make changes in their branch and in the meantime someone else
> makes a separate change to the central trunk.
>
> When the user is ready to bring their changes to the central branch
> so everyone else gets them they would:
> - merge (or pull?) from the central branch to their version of the
>   trunk branch

pull

> - merge the development branch into the intermediate trunk branch
>   (does the development branch need to be updated with the current
>   trunk before doing this?)
> - commit the changes
> - push to the central trunk

Correct.

>
> I think one of the things I am having trouble wrapping my head
> around is how does Bazaar know the difference between the two
> branches that the developer makes so that it does
>
>  A
>  |
>  B
>  |\
>  C X
>  | |
>  D |
>  |/
>  Z
>
> instead of
>
>  A
>  |
>  B
>  |\
>  C X
>  | |
>  D |
>   \|
>    Z
>
> ?
>
> The first is what I want so that bazaar-hookless-email looks back
> and sends an email for going from D to Z (not details about
> everything on the development branch, though all of this explains
> the weird (to me) behavior that I was seeing given the way the
> revision history seems to move around).
>
> Brian
>

If you are in a tree at revision "D", and you do "bzr merge X" it sets
the parents to "D, X". If you are in a tree at X and you merge D, it
sets the parents to "X, D". Similar, but not identical.

The general idea is a difference of intent. There are times when you
are "landing" your branch, and there are times when you are "merging
trunk to bring it up to date and handle conflicts, etc." Both are
effectively a merge of the same two branches. But the intent is
different. Setting "append_revisions_only=True" encourages (requires?)
people to clarify the intent, because we won't let them push a "merge
trunk to come up to date" commit.

There are some really nice benefits long term from this. Like "bzr log
lp:bzr". Some people claim that merge commits are noise. I find them a
way to progressively display information. So a simple "bzr log -n1"
gives you the summary of what feature landed at each revision, and
"bzr log -n0" gives you the detailed information about individual
changes that contributed to the overall feature.
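
To make the two levels of detail concrete, here is a small Python sketch. It is an illustration only, not how 'bzr log' is implemented: `ancestry` and `log_levels` are hypothetical helpers, and the revisions X1 and X2 are invented feature-branch commits. Each mainline revision is paired with the revisions it brought in through a merge.

```python
def ancestry(parents_of, tip):
    """Return every revision reachable from tip through any parent."""
    seen = set()
    stack = [tip]
    while stack:
        rev = stack.pop()
        if rev not in seen:
            seen.add(rev)
            stack.extend(parents_of[rev])
    return seen


def log_levels(parents_of, tip):
    """List (mainline revision, revisions it merged in), newest first."""
    entries = []
    rev = tip
    while rev is not None:
        parents = parents_of[rev]
        left = parents[0] if parents else None
        merged = set()
        if len(parents) > 1:
            # Everything new in this revision that the left-hand parent lacked.
            merged = ancestry(parents_of, rev) - ancestry(parents_of, left) - {rev}
        entries.append((rev, sorted(merged)))
        rev = left
    return entries


GRAPH = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["C"],
    "X1": ["B"], "X2": ["X1"],   # feature branch commits (made up)
    "Z": ["D", "X2"],            # feature landed on trunk
}

for rev, merged in log_levels(GRAPH, "Z"):
    print(rev, merged)
```

Reading only the first column corresponds to the summary view; the second column is the extra detail that becomes visible when nested revisions are shown.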

If you don't distinguish between "land on trunk" and "merge to get
latest trunk", then future log information cannot tease it apart easily.

For example, Samba recently switched to git, but adopted a policy that
all new landings onto trunk *must be rebased*, so that you can tell
what the new commits are, rather than having them intertwined in
history and hard to tease apart. 'append_revisions_only=True' gives us
the same separation, without having to rewrite revisions, etc.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk58dPEACgkQJdeBCYSNAAPIzQCgjUH//YvMtNkT7caWB8K7qL52
0OwAoLu0PKnjehpYToJnFoHZB3s+QbPQ
=toZb
-----END PGP SIGNATURE-----

Revision history for this message
Brian Cox (brian-cox) said :
#8

Hi John,

I think your last response cleared a lot of things up for me. After talking with a couple of guys here, we do have a couple more smaller questions.

1. Can you clarify how the development branch gets updated? Merging from the trunk? Or the intermediate branch?

2. Is it possible to get away with not using the intermediate branch by merging from the trunk to the development branch, then merging (not pushing) from the development branch back to the trunk?

3. And assuming we need the intermediate branch, is there an issue with creating one integration branch (rather than one extra branch for each developer) that all the developer branches merge to and then push from there to the trunk?

Thanks again, you've been a huge help.
Brian

Revision history for this message
John A Meinel (jameinel) said :
#9

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 9/24/2011 12:50 AM, Brian Cox wrote:
> Question #171044 on Bazaar changed:
> https://answers.launchpad.net/bzr/+question/171044
>
> Status: Answered => Open
>
> Brian Cox is still having a problem: Hi John,
>
> I think your last response cleared a lot of things up for me.
> After talking with a couple of guys here, we do have a couple more
> smaller questions.
>
> 1. Can you clarify how the development branch gets updated? Merging
> from the trunk? Or the intermediate branch?

In bzr, we use a 'PQM robot' to manage trunk. We submit a "please
merge this branch" request to it; it does the merge, runs the test
suite, and, if all tests pass, commits the changes to trunk. If you
don't want that level of complexity, the other projects I work on do
this:

 bzr init-repo project
 cd project
 bzr co $CENTRAL_LOCATION/trunk trunk
 bzr branch trunk myfeature

At this point, you have a pristine copy of the trunk development
branch, and a place to do your work. When it comes time to land
"myfeature", you then do:

 cd ../trunk
 bzr up
 bzr merge ../myfeature
 bzr commit -m "Land myfeature which adds frib and frobnaz"

If you are working on 'myfeature' and you need to bring in code from
another branch, or from trunk or wherever, you just do:

 cd myfeature
 bzr merge $OTHER_BRANCH
 bzr commit -m "Merging $OTHER_BRANCH to get juicy goodness."

You can push this branch somewhere so other people can see it, etc,
but you generally won't push changes to trunk.

>
> 2. Is it possible to get away with not using the intermediate
> branch by merging from the trunk to the development branch, then
> merging (not pushing) from the development branch back to the
> trunk?

You have to have a 'tree' where files exist to be able to merge. It
isn't possible to merge into a remote location.

>
> 3. And assuming we need the intermediate branch, is there an issue
> with creating one integration branch (rather than one extra branch
> for each developer) that all the developer branches merge to and
> then push from there to the trunk?
>

If you just have an integration branch, then you essentially have
trunk by another name.

> Thanks again, you've been a huge help. Brian
>

How about if you write down the steps of how you think you want your
developers to work, and I'll tweak/critique it.

In general, you are going to have some central location where
developers collaborate. Then you will have local working trees where
the developer actually makes changes. There are quite a few ways to
link the two. (Local trees always being a checkout of a remote branch,
local branches that get pushed to the central location, etc.)

I generally recommend lots of feature branches with a short lifetime,
rather than big integration branches with a long lifetime. But 'it
depends' on what you are trying to achieve.

If you are worried about developers having trouble keeping track of
branches, then I would suggest having 1 "integration" branch per
developer, which is periodically merged into trunk (either by them or
by someone acting as a gatekeeper).

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk6Aj4IACgkQJdeBCYSNAAMr1ACgm5HLDUN3RihxtPpIk1IluhwO
RTEAn1piUFByqkfKVVKzL1EIYvNZ8qcv
=vWeJ
-----END PGP SIGNATURE-----

Revision history for this message
Brian Cox (brian-cox) said :
#10

Hi John,

That's a good idea, here goes...

First some background. We are early in our development process. Very little code is being written right now; it is mostly ICDs, design, and other documentation that we are working on right now. This means that we also have systems engineers with their hands in this who are not used to working in a version controlled environment (or at least this one). Given the early state of development, it is likely that each developer will have a development branch (or two) that will be used for a long time. Later on, we absolutely plan to switch to shorter term branches for individual features, bug fixes, etc, but I don't think we're going to be ready for that for a while. With that in mind...

This is how I think you have described it:
- The developer creates a branch off of trunk to use for their integration into trunk.
- The developer creates another branch off of trunk to use for their development/documentation.
- The developer does work in their development branch.
- The developer commits their changes.
- When ready to merge their work to trunk they pull from trunk to their integration branch to get a clean copy.
- The developer merges from their development branch to their integration branch.
- The developer commits their changes to the integration branch.
- The developer pushes their changes to trunk.
- So now the trunk has their changes, but the latest version of trunk has not made it to their development branch. So I think we pull from the trunk to the development branch to get the development branch up to date.

I have concerns about adding more steps than I have to for some users, thus my proposal of using one singular integration branch (not one per user). And I understand that it would basically be another version of trunk. (Though I think this behavior would be useful down the road when we are working on parallel release/bug fix branches. But I digress...)

In this case we would do the following (changes marked with a "*"):
*- I (as a gatekeeper, and to minimize impact on developers) create a single integration branch off of trunk
- The developer creates a branch off of trunk to use for their development/documentation.
- The developer does work in their development branch.
- The developer commits their changes.
*- When ready to merge their work to trunk they notify me they are ready.
*- I pull from trunk to the integration branch to get a clean copy.
*- I merge the changes from their development branch to the integration branch.
*- I commit their changes to the integration branch.
*- I push their changes to trunk.
*- The developer pulls from the trunk to their development branch to get the development branch up to date.

Hopefully this makes sense. I think I get what you're saying, I'm just trying to find a path to what we want with minimal impact on the developers. Even if that means a little more work on my part.

Thanks again,
Brian

Revision history for this message
Brian Cox (brian-cox) said :
#11

Oh, and I meant to ask in my last post, some of the overhead of having one integration branch could be handled through some scripting, right?

Revision history for this message
Brian Cox (brian-cox) said :
#12

John - Sorry to bug you but did you have anything to add about comment #10? Thanks, Brian

Revision history for this message
Martin Pool (mbp) said :
#13

Hi, Brian, I think your plan described there would work fine.

You can simplify it a bit further by not having a separate integration branch; instead, you as the integrator just keep a checkout of the trunk:

*- I (as a gatekeeper, and to minimize impact on developers) have a checkout of trunk
- The developer creates a branch off of trunk to use for their development/documentation.
- The developer does work in their development branch.
- The developer commits their changes.
*- When ready to merge their work to trunk they notify me they are ready.
*- I update my checkout of trunk
*- I merge the changes from their development branch into the trunk checkout
*- I review or test their changes as desired
*- I commit their changes to trunk
*- The developer pulls from the trunk to their development branch to get the development branch up to date.

hth
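
On the scripting question, the gatekeeper steps above lend themselves to a small helper. The following is a minimal sketch, assuming it is run from inside the gatekeeper's checkout of trunk; the function name `landing_commands`, the branch path and the commit message are all made up for illustration. It only builds and prints the commands, so the review or test step can still happen by hand between the merge and the commit.

```python
import shlex


def landing_commands(dev_branch, message):
    """Commands to run, in order, inside the gatekeeper's checkout of trunk."""
    return [
        ["bzr", "update"],                 # bring the checkout up to date
        ["bzr", "merge", dev_branch],      # merge the developer's branch
        ["bzr", "commit", "-m", message],  # in a checkout this commits to trunk
    ]


# Hypothetical branch location and message, printed as a dry run.
for argv in landing_commands("../alice-docs", "Land alice-docs: ICD updates"):
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Each entry is an argument list, so it could later be handed to something like `subprocess.run(argv, check=True)` once the dry-run output looks right.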

Revision history for this message
John A Meinel (jameinel) said :
#14

I think your #10 is fine, though I think Martin's idea works a little bit better.

Basically, developers should be encouraged to do their work on their own branch. Having them work directly on an integration branch is possible, but tends to muddy the internals of what is going on. (Developer A notices that things are broken, but it was because of what Developer B did, and it is difficult to sort out who actually did what because everything is intermixed in the history.)

Revision history for this message
Brian Cox (brian-cox) said :
#15

Excellent, thank you both for your time and expertise!