
All the Perl that's Practical to Extract and Report


Journal of nicholas (3034)

Sunday August 09, 2009
01:02 PM

Dear git-using lazyweb...

[ #39430 ]

Dear lazyweb,

Say one has a git repository, and one would like to produce bi-weekly e-mails summarising activity...

As I understand it, branches in git, whilst first class, don't store history. That is, right now there is a well-defined "head" for blead, but there is no state that tells me what the "head" for blead was yesterday, or even last week. So, if I want my summary to say "activity on blead", "activity on maint-5.10", "activity on target-the-SuperCollider-VM" etc., I don't have any way to infer the history of the branches from a single current repository.

So, I'm wondering, what happens if I conceptually have two repositories: one current as of half a week ago, and one current as of right now. Can I usefully tease apart some semblance of activity on each branch between "then" and "now"? I think that I can. I know the position of blead's head back then, and I know the position of blead's head right now, so any commit that is both a parent of "head now" and a child of "head then" is somewhere on the web of commits assignable to the branch blead. (Of course, if the web is particularly tangled with merges and branching, it might also qualify by the same means as being on maint-5.10, both then and now. But I can choose an iteration order on branches, and report each commit exactly once, at the first position where I try it.) And then, for each branch "now", seek out all commits that are a parent of it and are not yet reported. And then, for each branch "then", do likewise. And finally, seek out any orphan commits.
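To make the four passes concrete, here is a minimal sketch in Python. It runs on a toy commit graph rather than a real repository: the `parents` mapping, the commit names, the `topic` branch and the helper names `ancestors` and `assign_commits` are all invented for illustration. It also takes the description literally, so the second and third passes sweep up old history as well; a real summary would presumably trim that.

```python
def ancestors(parents, tip):
    """Return tip plus every commit reachable from it via parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def assign_commits(parents, heads_then, heads_now):
    """Map each commit to (branch, phase); the first claim on a commit wins."""
    assigned = {}

    def claim(branch, phase, commits):
        for commit in sorted(commits):
            assigned.setdefault(commit, (branch, phase))

    # Pass 1: commits that are a parent of "head now" and a child of "head then".
    for branch, now in heads_now.items():
        then = heads_then.get(branch)
        if then is None:
            continue  # branch did not exist "then"
        between = {c for c in ancestors(parents, now)
                   if c != then and then in ancestors(parents, c)}
        claim(branch, "between", between)

    # Pass 2: for each branch "now", every parent not yet reported.
    for branch, now in heads_now.items():
        claim(branch, "now", ancestors(parents, now))

    # Pass 3: likewise for each branch "then".
    for branch, then in heads_then.items():
        claim(branch, "then", ancestors(parents, then))

    # Pass 4: whatever is left is an orphan.
    claim(None, "orphan", set(parents))
    return assigned


# Toy history (hypothetical): commit -> list of parent commits.
parents = {
    "a": [], "b": ["a"], "c": ["b"], "d": ["c"],  # blead moved from b to d
    "m1": ["a"], "m2": ["m1"],                    # maint-5.10 moved from m1 to m2
    "x": ["b"],                                   # a branch that only exists "now"
    "o": [],                                      # reachable from no head at all
}
heads_then = {"blead": "b", "maint-5.10": "m1"}
heads_now = {"blead": "d", "maint-5.10": "m2", "topic": "x"}

result = assign_commits(parents, heads_then, heads_now)
for commit, (branch, phase) in sorted(result.items()):
    print(commit, branch, phase)
```

The iteration order over branches is simply the order of the dictionaries, which stands in for the "choose an iteration order" step; swapping the order changes which branch claims a shared commit.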

But I wonder: do I actually need two repositories? I know that when git itself pulls from the remote, it only transfers the commit objects it doesn't yet have. So is there a way either to do a "dry run" first to get the full list of commit objects, or to determine when commit objects were added to the repository, for example by the time stamp of the file on disk? That way I only need one repository, which feels cleaner.

Or is this already a solved problem? git shortlog isn't the solution, because it groups by author, whereas I want grouping first by branch, and then by time.

This post brought to you by the campaign to give SuperCollider programming the recognition it deserves.

  • Git is all about the commits. Commits have parentage. Commits are shared. And some commits have human-friendly names. Tags are one kind of name, and are semi-static. Branches are a different kind of name, but generally move over time. Also, branches are repo-local, whereas tags are generally shared.

    I think if you define carefully what you want to know, there are lots of options on "git log" that will give you what you want. But in a proper repo, branches are generally rather ephemeral (provided you a

    --
    • Randal L. Schwartz
    • Stonehenge
  • I find it helps to not talk about the "branch" master but rather the "head" master. When you separate the fact that heads can cause branching from their essence, you can think of more ways to use git.

    So, make a new head called "last-summary." Start it at commit x, presumably the current location of master.

    In two weeks, get the output of `git log last-summary..master` and you will see all the changes that are between the two named commits. With that summary generated, you can now move last-summary to the

    --
    rjbs
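As a rough illustration of the two-dot range in the comment above, the sketch below models `last-summary..master` on a toy commit graph in Python: the commits reachable from master, minus those reachable from last-summary. The graph, the commit names and the helper `commits_in_range` are invented for illustration, and advancing the marker to master's current commit after each summary is an assumption about the intended workflow.

```python
def ancestors(parents, tip):
    """Return tip plus every commit reachable from it via parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def commits_in_range(parents, exclude_tip, include_tip):
    """Model of `exclude_tip..include_tip`: reachable from the second, not the first."""
    return ancestors(parents, include_tip) - ancestors(parents, exclude_tip)


# Toy linear history (hypothetical): x <- y <- z <- w
parents = {"x": [], "y": ["x"], "z": ["y"], "w": ["z"]}
heads = {"last-summary": "x", "master": "z"}

# First summary: everything on master since the marker.
first_summary = sorted(commits_in_range(parents, heads["last-summary"], heads["master"]))

# Assumption: after reporting, the marker is moved up to master's current commit.
heads["last-summary"] = heads["master"]

# Later, master advances to w and the next summary only covers the new work.
heads["master"] = "w"
second_summary = sorted(commits_in_range(parents, heads["last-summary"], heads["master"]))

print(first_summary)
print(second_summary)
```

One marker per branch of interest would give the per-branch grouping the post asks for, at the cost of maintaining those extra heads.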
  • … I think you still want to look into git help reflog.