github pull requests: odd listings of hundreds of commits

I suspect that the series of May 6 commits ending with a50874b711983cba505ecdb2801c4996eccf3812 has tangled the history in such a way that some (but not all) older pull requests on github like this one

https://github.com/matplotlib/matplotlib/pull/1

are showing hundreds of commits and diffs.

Eric

Darned if I know what I did differently that time. (I'm sure I hit git's "misunderstanding" feature again).

It only seems to have affected 2 or 3 of the pull requests -- I suspect rebasing those branches off of the current master would fix the problem, but maybe there's an easier way.

Mike

···

On 05/17/2011 03:16 PM, Eric Firing wrote:

I suspect that the series of May 6 commits ending with
a50874b711983cba505ecdb2801c4996eccf3812 has tangled the history in such
a way that some (but not all) older pull requests on github like this one

https://github.com/matplotlib/matplotlib/pull/1

are showing hundreds of commits and diffs.

Eric

------------------------------------------------------------------------------
Achieve unprecedented app performance and reliability
What every C/C++ and Fortran developer should know.
Learn how Intel has extended the reach of its next-generation tools
to help boost performance applications - inlcuding clusters.
http://p.sf.net/sfu/intel-dev2devmay
_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-devel
   
--
Michael Droettboom
Science Software Branch
Space Telescope Science Institute
Baltimore, Maryland, USA

Darned if I know what I did differently that time. (I'm sure I hit
git's "misunderstanding" feature again).

It only seems to have affected 2 or 3 of the pull requests -- I suspect
rebasing those branches off of the current master would fix the problem,
but maybe there's an easier way.

Mike,

First, regarding the pull requests, I think that in some cases it is not a problem because they are being closed (rejected or superseded), and in any case, it is easy to handle them locally and then ensure a clean changeset to be pushed up to the main repo. So they look terrible in the github web view, but no real damage is done to the information in the branches they represent when one pulls those branches and works with them locally.

Second, I think the problems came from one or two exceptions to recommended git practice (this is coming from a git non-expert, so take with a grain of salt; git experts, please chime in and correct me as needed), which appear to be of two types: one is pulling from "origin" when you intended to pull from "upstream":

* | | commit 14406a68c039dc6578730f23955bf34d34008a08

\ \ \ Merge: fdf5fae 132f967
> > > Author: Michael Droettboom <mdboom@...149...>
> > > Date: Fri May 6 15:25:01 2011 -0400
> > >
> > > Merge remote branch 'origin/v1.0.x'

(this is the one that deleted scads of files, which you replaced in the subsequent commit)

and the other is merging from origin or upstream into a feature branch:

> * | commit 668ef6d826eff5eb4aac6251e1ddd9ac713616e5
> >\ \ Merge: de18d9a 53f8139
> >/ / Author: Michael Droettboom <mdboom@...149...>
>/| | Date: Thu May 12 11:43:48 2011 -0400
> > >
> > > Merge remote branch 'upstream/v1.0.x' into gtk_crash

My understanding is that:

1) merges should always be from a feature branch into v1.0.x or master (and from v1.0.x into master, not the reverse); this can be done by updating local copies of v1.0.x etc from upstream, merging from the feature branch, and then pushing to upstream.

2) If a feature branch is local only (no one else has it) then it can be rebased so that merging it into v1.0.x etc. does not involve a merge commit; but if the feature branch has been published on github, then it should not be rebased.

3) We don't have to always push sets of changes from an original pull request to upstream; they can be consolidated using any of a variety of methods to form a new local feature branch with the same net effect but fewer commits (maybe only one), and then that can be merged (which should be fast-forward, no merge commit) and pushed to upstream. This takes a little more work than simply accepting (merging) a pull request as-is, but in many cases it may be worth it because it can yield a cleaner history. Similarly, if someone is developing a feature branch on github, and the net effect is correct but the branch has intermediate commits that distract from the net result, then a good practice would be for that person to consolidate the changes into a new feature branch with a cleaner history, close the pull request on the old one and open a new request for the polished branch.

Regarding your 14406a commit: if I understand correctly, a better recovery method would have been to use

git reset --hard HEAD^

to simply back up the branch pointer by one.

The bad commit would still be in your local tree, but it would be un-named (that is, it would have no branch name) and would not go anywhere else.

Eric

···

On 05/17/2011 10:14 AM, Michael Droettboom wrote:

Mike

On 05/17/2011 03:16 PM, Eric Firing wrote:

I suspect that the series of May 6 commits ending with
a50874b711983cba505ecdb2801c4996eccf3812 has tangled the history in such
a way that some (but not all) older pull requests on github like this one

https://github.com/matplotlib/matplotlib/pull/1

are showing hundreds of commits and diffs.

Eric

I believe this script more or less automates the process: https://github.com/jeresig/pulley

Gerald.

···

On 18/05/2011 5:14 AM, Eric Firing wrote:

3) We don't have to always push sets of changes from an original pull
request to upstream; they can be consolidated using any of a variety of
methods to form a new local feature branch with the same net effect but
fewer commits (maybe only one), and then that can be merged (which
should be fast-forward, no merge commit) and pushed to upstream. This
takes a little more work than simply accepting (merging) a pull request
as-is, but in many cases it may be worth it because it can yield a
cleaner history. Similarly, if someone is developing a feature branch
on github, and the net effect is correct but the branch has intermediate
commits that distract from the net result, then a good practice would be
for that person to consolidate the changes into a new feature branch
with a cleaner history, close the pull request on the old one and open a
new request for the polished branch.

Don’t know if this was a mistake or not, but I see that commit e7f1e83 (the one to fix a clipping issue when a patch’s line width is 1 but there is no color) seems to have been merged back into itself… somehow… in commit 0c886b8. I have seen things like this before, and I never quite understood how they happen. Plus, do we want to get that patch merged down to master?

I think we definitely need to see what sort of controls we can put in place to prevent mix-ups in the future. One thing I did like about SVN was that it was next to impossible to change the history. Meanwhile, with git, it becomes possible. Is there some way we can disallow forced pushes, maybe? Just a thought…

Ben Root

···

On Tue, May 17, 2011 at 9:07 PM, Gerald Storer <gds@…943…> wrote:

On 18/05/2011 5:14 AM, Eric Firing wrote:

  1. We don’t have to always push sets of changes from an original pull

request to upstream; they can be consolidated using any of a variety of

methods to form a new local feature branch with the same net effect but

fewer commits (maybe only one), and then that can be merged (which

should be fast-forward, no merge commit) and pushed to upstream. This

takes a little more work than simply accepting (merging) a pull request

as-is, but in many cases it may be worth it because it can yield a

cleaner history. Similarly, if someone is developing a feature branch

on github, and the net effect is correct but the branch has intermediate

commits that distract from the net result, then a good practice would be

for that person to consolidate the changes into a new feature branch

with a cleaner history, close the pull request on the old one and open a

new request for the polished branch.

I believe this script more or less automates the process:

https://github.com/jeresig/pulley

Gerald.

     > 3) We don't have to always push sets of changes from an original pull
     > request to upstream; they can be consolidated using any of a
    variety of
     > methods to form a new local feature branch with the same net
    effect but
     > fewer commits (maybe only one), and then that can be merged (which
     > should be fast-forward, no merge commit) and pushed to upstream.
      This
     > takes a little more work than simply accepting (merging) a pull
    request
     > as-is, but in many cases it may be worth it because it can yield a
     > cleaner history. Similarly, if someone is developing a feature
    branch
     > on github, and the net effect is correct but the branch has
    intermediate
     > commits that distract from the net result, then a good practice
    would be
     > for that person to consolidate the changes into a new feature branch
     > with a cleaner history, close the pull request on the old one and
    open a
     > new request for the polished branch.
     >
    I believe this script more or less automates the process:
    https://github.com/jeresig/pulley

    Gerald.

Don't know if this was a mistake or not, but I see that commit e7f1e83
(the one to fix a clipping issue when a patch's line width is 1 but
there is no color) seems to have been merged back into itself...
somehow... in commit 0c886b8. I have seen things like this before, and
I never quite understood how they happen. Plus, do we want to get that
patch merged down to master?

Yes, it needs to get merged to master; but no, I don't think anything was "merged back into itself"; instead, it is just a case where a fast-forward with no merge commit would have left a simpler history--one commit instead of two.

Merging from v1.0.x to master doesn't have to be done after every change to v1.0.x, but it shouldn't be left undone for very long. A few days ago I wasted a chunk of time thrashing around on master because of a bug that had been fixed on 1.0.x but was not yet merged into master--and I had not thought to check their relative states. The bug had actually been introduced to master via a recent change merged from 1.0.x.

We are experiencing some bumps on the git/github learning curve, but not nearly enough to make me pine for svn.

I think we definitely need to see what sort of controls we can put in
place to prevent mix-ups in the future. One thing I did like about SVN
was that it was next to impossible to change the history. Meanwhile,
with git, it becomes possible. Is there some way we can disallow forced
pushes, maybe? Just a thought...

I suspect the anomalies have not resulted from forced pushes, but from local pulls and merges followed by innocuous pushes. So the key is understanding how to ensure one's local branches have the desired history before pushing to github. (And making sure one is pushing from the correct source to the correct destination. Trying first with --dry-run can help.)

Eric

···

On 05/18/2011 08:47 AM, Benjamin Root wrote:

On Tue, May 17, 2011 at 9:07 PM, Gerald Storer <gds@…943… > <mailto:gds@…943…>> wrote:
    On 18/05/2011 5:14 AM, Eric Firing wrote:

Ben Root

------------------------------------------------------------------------------
What Every C/C++ and Fortran developer Should Know!
Read this article and learn how Intel has extended the reach of its
next-generation tools to help Windows* and Linux* C/C++ and Fortran
developers boost performance applications - including clusters.
http://p.sf.net/sfu/intel-dev2devmay

_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-devel

Before pushing, I also recommend inspecting the history graph, either
with "gitk --all" or "git log --oneline --graph --all". I try to
remember to make sure the history graph looks the way I expect it
should before I push anywhere.

···

On Wed, May 18, 2011 at 3:22 PM, Eric Firing <efiring@...229...> wrote:

On 05/18/2011 08:47 AM, Benjamin Root wrote:

On Tue, May 17, 2011 at 9:07 PM, Gerald Storer <gds@…943… >> <mailto:gds@…943…>> wrote:

On 18/05/2011 5:14 AM, Eric Firing wrote:
 &gt; 3\) We don&#39;t have to always push sets of changes from an original pull
 &gt; request to upstream; they can be consolidated using any of a
variety of
 &gt; methods to form a new local feature branch with the same net
effect but
 &gt; fewer commits \(maybe only one\), and then that can be merged \(which
 &gt; should be fast\-forward, no merge commit\) and pushed to upstream\.
  This
 &gt; takes a little more work than simply accepting \(merging\) a pull
request
 &gt; as\-is, but in many cases it may be worth it because it can yield a
 &gt; cleaner history\.  Similarly, if someone is developing a feature
branch
 &gt; on github, and the net effect is correct but the branch has
intermediate
 &gt; commits that distract from the net result, then a good practice
would be
 &gt; for that person to consolidate the changes into a new feature branch
 &gt; with a cleaner history, close the pull request on the old one and
open a
 &gt; new request for the polished branch\.
 &gt;
I believe this script more or less automates the process:
https://github.com/jeresig/pulley

Gerald\.

Don't know if this was a mistake or not, but I see that commit e7f1e83
(the one to fix a clipping issue when a patch's line width is 1 but
there is no color) seems to have been merged back into itself...
somehow... in commit 0c886b8. I have seen things like this before, and
I never quite understood how they happen. Plus, do we want to get that
patch merged down to master?

Yes, it needs to get merged to master; but no, I don't think anything
was "merged back into itself"; instead, it is just a case where a
fast-forward with no merge commit would have left a simpler history--one
commit instead of two.

Merging from v1.0.x to master doesn't have to be done after every change
to v1.0.x, but it shouldn't be left undone for very long. A few days
ago I wasted a chunk of time thrashing around on master because of a bug
that had been fixed on 1.0.x but was not yet merged into master--and I
had not thought to check their relative states. The bug had actually
been introduced to master via a recent change merged from 1.0.x.

We are experiencing some bumps on the git/github learning curve, but not
nearly enough to make me pine for svn.

I think we definitely need to see what sort of controls we can put in
place to prevent mix-ups in the future. One thing I did like about SVN
was that it was next to impossible to change the history. Meanwhile,
with git, it becomes possible. Is there some way we can disallow forced
pushes, maybe? Just a thought...

I suspect the anomalies have not resulted from forced pushes, but from
local pulls and merges followed by innocuous pushes. So the key is
understanding how to ensure one's local branches have the desired
history before pushing to github. (And making sure one is pushing from
the correct source to the correct destination. Trying first with
--dry-run can help.)

QGit is another alternative.

Eric

···

On 05/18/2011 11:45 AM, Darren Dale wrote:

I suspect the anomalies have not resulted from forced pushes, but from
local pulls and merges followed by innocuous pushes. So the key is
understanding how to ensure one's local branches have the desired
history before pushing to github. (And making sure one is pushing from
the correct source to the correct destination. Trying first with
--dry-run can help.)

Before pushing, I also recommend inspecting the history graph, either
with "gitk --all" or "git log --oneline --graph --all". I try to
remember to make sure the history graph looks the way I expect it
should before I push anywhere.