github workflow

_John_Hunter · September 1, 2011, 2:54am

github workflow: this seems to present a different workflow than that
espoused in gitwash used by mpl and other projects

http://scottchacon.com/2011/08/31/github-flow.html

I like the idea of lots of feature branches off upstream/master and
master always being deployable (nightly builds?). What is the
advantage of core devs working in their own forks, as we currently do,
over working on feature branches off of
https://github.com/matplotlib/matplotlib? Seems like a lighter-weight
approach that works, and it would probably make it easier for users to
follow mpl development by tracking the mpl repo and all the branches
off of it, rather than having to pull in the various devs' forked
branches.

_Matthew_Brett · September 1, 2011, 3:16am

Yo,

The issue being - why not have all the development branches in the
same main repo?

Because:

a) Everyone needs write access to the main repo
b) It's much less tempting to start experimental and highly unstable branches
c) You can get a very similar effect by adding remotes to your own repo.
d) It only very slightly simplifies an unusual case (what's developer
X working on today?).

Less tempting
------------------

Just as a minor example, here's my nipy branch list:

https://github.com/matthew-brett/nipy/branches

Lots of crap in there; I just made a branch with a single extra commit
that I may well throw away, the branch I'm currently working on:

https://github.com/matthew-brett/nipy/tree/fmristat-test-refactor

- I am constantly rebasing and reorganizing while I try to work out
what I'm doing. I'd think much harder about that if I thought other
people were expecting to pull down all my stuff. Thinking harder =
slower coding (for me at least :))

Similar effect
----------------

- I'd like to see what Gael and Jonathan Taylor are up to from time to time:

Once:
git clone git@...679...:matthew-brett/nipy.git # origin remote
git remote add gael git://github.com/GaelVaroquaux/nipy.git
git remote add jonathan git://github.com/jtaylor/nipy.git

From time to time:

git fetch --all

- same effect, and it allows me to choose who I'm following. But
actually, I very rarely do that in the abstract; I look when they tell
me to look at something, and I'm pretty sure it's the same for them
and my stuff.
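
A minimal, self-contained sketch of the remote-following pattern above. It assumes throwaway local repositories in a temporary directory standing in for the GitHub forks; the names `alice`, `bob`, `mine`, `feature-x`, `bugfix-y` and the `make_repo` helper are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
set -e

# Throwaway sandbox; nothing here touches a real project or the network.
work=$(mktemp -d)
cd "$work"

# Hypothetical helper: create a repository with one commit on the named branch.
make_repo() {
    git init -q "$1"
    git -C "$1" symbolic-ref HEAD "refs/heads/$2"
    git -C "$1" -c user.name="Demo User" -c user.email="demo@example.com" \
        commit -q --allow-empty -m "Work on $2"
}

make_repo alice feature-x    # stands in for one colleague's fork
make_repo bob bugfix-y       # stands in for another colleague's fork
make_repo mine master        # stands in for my own clone

# Once: register each colleague's repository as a named remote.
cd mine
git remote add alice "$work/alice"
git remote add bob "$work/bob"

# From time to time: fetch from every remote, then list what they have.
git fetch -q --all
remote_branches=$(git branch -r)
echo "$remote_branches"
```

The listing is namespaced by remote name, so each person's branches stay separate from your own and you only see the people you chose to add.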

See you,

Matthew

Benjamin_Root1 · September 1, 2011, 3:34am

I agree with Matthew here, but I could see a possible hybrid approach.

Let’s say we were a more organized group (I will wait for the laughter to die down…), then one could imagine having branches with names of approved milestones/goals/planned features for the next release. The names would convey the features actively being worked on, and provide a focus for us. When a feature is finished, we then merge into master. If a feature is not ready for prime-time, then we can just hold it off.
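
As a rough sketch of that hybrid idea, the commands below create a branch named for a planned feature, do some work on it, and merge it into master once it is finished. Everything runs in a throwaway repository under a temporary directory; the branch name `feature/polar-axes` and the commit messages are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
set -e

# Throwaway sandbox repository; nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo User"          # local identity for the demo commits
git config user.email "demo@example.com"
git symbolic-ref HEAD refs/heads/master   # make sure the main branch is called master
git commit -q --allow-empty -m "Initial commit"

# Start a branch named after the planned feature (hypothetical name).
git checkout -q -b feature/polar-axes
git commit -q --allow-empty -m "Work on the planned feature"

# When the feature is finished, merge it into master and drop the branch.
git checkout -q master
git merge -q --no-ff -m "Merge feature/polar-axes" feature/polar-axes
git branch -q -d feature/polar-axes

remaining_branches=$(git for-each-ref --format='%(refname:short)' refs/heads)
echo "$remaining_branches"
```

Using `--no-ff` keeps a merge commit in master's history even when a fast-forward would be possible, so the point where each named feature landed stays visible after the branch is deleted.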

Just a thought,
Ben Root

_John_Hunter · September 1, 2011, 3:37am

The issue being - why not have all the development branches in the
same main repo?

Because:

a) Everyone needs write access to the main repo

I'm thinking about core devs here -- they all have write access to the
main repo. Users and non core devs can continue with the fork
approach.

b) It's much less tempting to start experimental and highly unstable branches

This can still be done in forks. And experimental and unstable
branches are a minor threat -- they may lower the signal-to-noise
ratio, but dead branches can be pruned and users and devs can probably
get a pretty good feel for which are active by looking at the "last
update" time on the branch list.
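
A minimal sketch of the "last update" check from the command line, assuming a throwaway repository with two hypothetical branches (`stale-idea` and `active-fix`) whose commit dates are set by hand for the demo. It uses `git for-each-ref` sorted by committer date, which is one way to get this view and not necessarily what the GitHub branch list does internally.

```shell
#!/bin/sh
set -e

# Throwaway sandbox repository; nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

# An old branch whose last commit is backdated (hypothetical name and date).
git symbolic-ref HEAD refs/heads/stale-idea
GIT_AUTHOR_DATE="2010-01-15 12:00:00 +0000" \
GIT_COMMITTER_DATE="2010-01-15 12:00:00 +0000" \
    git commit -q --allow-empty -m "Old experiment"

# A branch with a more recent commit (hypothetical name and date).
git checkout -q -b active-fix
GIT_AUTHOR_DATE="2011-08-30 12:00:00 +0000" \
GIT_COMMITTER_DATE="2011-08-30 12:00:00 +0000" \
    git commit -q --allow-empty -m "Recent fix"

# List local branches, most recently updated first.
branches_by_date=$(git for-each-ref --sort=-committerdate \
    --format='%(committerdate:short) %(refname:short)' refs/heads)
echo "$branches_by_date"
```

In a real clone, pointing the same command at `refs/remotes` instead of `refs/heads` lists the remote-tracking branches, which is the view a user following the main repo would care about.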

c) You can get a very similar effect by adding remotes to your own repo.

Yes, I do this and I'm sure other mpl developers do too, but you need
to know who to follow, which is harder for the casual developer or
user. By having the core devs develop in feature branches off of
upstream, it makes it easier for users and other developers to see
what all the other core devs are up to w/o having to specify who to
track. They track the main repo, they see the main work of the core
devs as they come and go.

d) It only very slightly simplifies an unusual case (what's developer
X working on today?).

I think it simplifies it dramatically, because the average user or
part time developer doesn't have to ask "which developers should I
follow?" and do the work to add them as remotes. They can assume
that by tracking the upstream branches, they see the important
non-experimental branches the core developers are working on. It is
easy to follow developer X if you know a priori who X is. But since
95% of the work is done by people who have write access to the central
repo, and 95% of the users want to track this, it makes sense to me to
push more of the workflow into the central repo, while still
supporting external contributions via pull requests from forks.

Maybe I'm missing something, but I feel the gitwash workflow is more
complicated than it needs to be and this article reinforces that
view.

JDH

Fernando_Perez1 · September 1, 2011, 9:07am

Limited internet access here, so no time for a long discussion... Just
to say that I'm totally in agreement with Matthew here.

We only make branches in the main ipython repo under exceptional
circumstances, when there's a major piece of work that requires
multiple-developer commit collaboration to beat into shape and
cross-pulling from personal repos would just get annoying. But once
those are ready and merged we delete them as visible branches right
away.

For example, since we moved to github, we've only done this *twice*:
once for the big parallel rewrite, and once for the notebook work.
Both of these were *major* efforts that took months to shape up, so it
made sense to have them in there. But we make such a decision only
for such special cases; otherwise, following the workflow Matthew
points out seems to work really well.

Once you get into the habit of using multiple remotes to get a handle
on an entire team's worth of contributions to a project, you realize
how simple and effective it is.

Cheers,

f

Michael_Droettboom · September 6, 2011, 3:38pm

I think most of the points being made here are valid. However, a common occurrence (at least for me) is for a user to struggle against a bug that I'm currently working on in one of my branches. Looking at the main repository, it isn't very discoverable that a solution may already exist, and the user can waste time wondering if it's a bug or user error etc. Perhaps a compromise between these two approaches would be to have a wiki page which is a directory of any branches that developers consider interesting and want to point people toward? Maybe that's just creating busy work, of course.

Mike

_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel@lists.sourceforge.net

--
Michael Droettboom
Science Software Branch
Space Telescope Science Institute
Baltimore, Maryland, USA

Michael_Droettboom · September 6, 2011, 4:53pm

It occurred to me that it's also possible to file pull requests very early on while working on a branch. This would make these branches that others may care about more visible. We would just want some convention to say "wait -- this branch isn't done yet, don't merge".

Mike

_Fernando_Perez1 · September 6, 2011, 5:58pm

We do that all the time: we simply say "this PR isn't meant for merge
yet, just to get the discussion going while the problem is worked on".

In IPython, we've merged over 250 PRs since switching to github, and I
think we've had *two* long-lived branches in the main repo
(newparallel and htmlnotebook). I still think that's the right
approach, as situations like these should be exceptional.

I think getting used to many long-lived branches in the main repo
precisely encourages the kind of workflow that leads to
hard-to-review, hard-to-integrate branches. By *not* putting them in
the main repo, there's a certain pressure on keeping things small,
self-contained and easy to review in little pull requests.

Each time we've done one of these monster branches there's been a
solid reason to do it, but it has required summoning extra resources,
committing big chunks of time for difficult and lengthy review
periods, and being very careful about how they can go out of sync with
the rest of the repo. So while occasionally necessary, these things
have such a high cost that I absolutely want a workflow that
discourages them in everyday practice.

HTH,

f

_Matthew_Brett · September 6, 2011, 6:33pm

Hi,

On Tue, Sep 6, 2011 at 8:38 AM, Michael Droettboom <mdroe@...31...> wrote:

I think most of the points being made here are valid. However, a common
occurrence (at least for me) is for a user to struggle against a bug
that I'm currently working on in one of my branches. Looking at the
main repository, it isn't very discoverable that a solution may already
exist, and the user can waste time wondering if it's a bug or user error
etc. Perhaps a compromise between these two approaches would be to have
a wiki page which is a directory of any branches that developers
consider interesting and want to point people toward? Maybe that's just
creating busy work, of course.

Maybe the summary is that putting the branches in the main repo labels
those branches somehow.

You're suggesting the label means 'you might consider merging these to
see if they fix your bug'.

John is suggesting the label means 'here are the main threads of
development going on'.

Of course you have another virtual label which is 'branch in pull
request state'. Maybe, as Fernando says, that's the best label to use
for a branch that the user might consider merging?

See you,

Matthew