* democratized Version Control Systems (dVCS)
Sharing complete repositories is a simple concept, but it imposes
subtle constraints whose solution yields a form of democratized
version control.
* A look back at SVN
** Linear history is normal, all graphs are trees
In other words, any given commit can have many children, but only
one parent.
** Merging is painful and error prone
Most solutions to this problem involve writing careful commit logs
or recording merges in files so they can be traced later. Getting
this wrong can be bad, and as a result merging is avoided as much as
possible.
** Sharing changes consists of mailing patches
Obviously this was all workable, but it didn't exactly endear
itself to lazy people like myself. The existence and popularity of
CVSup, in spite of its being written in Modula-3, shows the value of
repository sharing.
* Constraints
** Repositories are collections of interwoven histories
So:
*** Linear history is a thing of the past
*** Merging must be easy
*** Sharing changes must be easy
* How git satisfies dVCS constraints
** History is no longer linear
Time is no longer a useful identifier when comparing the history of
disparate repositories, and thus can't be used for commit
identifiers.
** git uses SHA hashes to identify repository objects
The SHA hash is what makes sharing file data reliable. It is
computed from the file contents alone, and tree objects map file
names to the resulting hashes.
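As a rough illustration of content-based identifiers, the sketch
below computes the id git gives a blob: a SHA-1 over a short
"blob <size>" header plus the raw bytes. The function name
git_blob_id is made up for this example.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object id git assigns to a blob with this content."""
    # Git hashes a short header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

# Identical content always yields the identical id, whatever the file is called.
readme_id = git_blob_id(b"hello world\n")
copy_id = git_blob_id(b"hello world\n")
print(readme_id)
print(readme_id == copy_id)
```

The file name never enters the hash, which is why two repositories
holding the same file contents agree on its id without coordinating.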
** Merging is elevated to a first class operation
Git makes merging easy(ier). It will probably never be trivial, but
git at least automates the grunt work of tracking down common
ancestors to reduce conflicts and ease merging.
** Branching is trivial and encouraged
Creating a branch is just creating another ref pointing to an
existing commit. It's very fast and efficient. It's very easy to
move things between branches, and they are encouraged for any
non-trivial work. It doesn't even mess up your history graph a lot
of the time, and when it does you can often alter it so it does not.
*** What is the object store?
**** blobs
Blobs hold raw file contents as uninterpreted binary data.
**** trees
Trees point to blobs or other trees.
**** commits
Git commits reference a tree object and their parent commits, along
with metadata: message, author, committer, and so forth.
**** tags
Tag objects contain a commit id and an optional message and
cryptographic signature. If neither is present, a tag is merely a
ref that points directly at the commit.
*** All objects are identified by SHA hashes.
The unit of history is the commit, which can be identified solely by
its contents. The hash is easy to compute and is well distributed,
which helps when building a hash table.
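To make the object store concrete, here is a toy content-addressed
store covering blobs, trees and commits (tags are left out). It is a
sketch, not git's implementation: the class ObjectStore and its
methods are invented for illustration, and the tree and commit
payloads use a simplified text encoding rather than git's real
layout. Only the "<type> <size>" header before hashing follows git.

```python
import hashlib

class ObjectStore:
    """A toy content-addressed store: every object is filed under its SHA-1."""

    def __init__(self):
        self.objects = {}  # object id (hex) -> (type, payload bytes)

    def put(self, kind: str, payload: bytes) -> str:
        # Same "<type> <size>\0" header convention git uses before hashing.
        header = kind.encode("ascii") + b" " + str(len(payload)).encode("ascii") + b"\0"
        oid = hashlib.sha1(header + payload).hexdigest()
        self.objects[oid] = (kind, payload)
        return oid

    def put_blob(self, data: bytes) -> str:
        return self.put("blob", data)

    def put_tree(self, entries: dict) -> str:
        # entries maps a name to (mode, object id); sorted so the id is stable.
        # Simplified text encoding (an assumption of this sketch).
        lines = [f"{mode} {name} {oid}" for name, (mode, oid) in sorted(entries.items())]
        return self.put("tree", "\n".join(lines).encode("utf-8"))

    def put_commit(self, tree: str, parents: list, author: str, message: str) -> str:
        lines = [f"tree {tree}"] + [f"parent {p}" for p in parents]
        lines += [f"author {author}", f"committer {author}", "", message]
        return self.put("commit", "\n".join(lines).encode("utf-8"))

store = ObjectStore()
blob = store.put_blob(b"hello world\n")
tree = store.put_tree({"README": ("100644", blob)})
root = store.put_commit(tree, [], "A. Hacker <hacker@example.com>", "initial import")
child = store.put_commit(tree, [root], "A. Hacker <hacker@example.com>", "second commit")
print(blob, tree, root, child, sep="\n")
```

Both commits point at the same tree, so the unchanged README is
stored exactly once.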
*** History is immediately verifiable (barring hash collisions)
*** Some measure of security comes for free
Each commit id is computed over the ids of its parents, so it
effectively vouches for all of the history before it. Verifying a
repository becomes trivial given only a valid commit id.
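The chaining effect can be shown with a toy hash chain. This is only
a sketch: commit_id and build_history are hypothetical helpers, and
the payload format is invented rather than git's commit format.

```python
import hashlib

def commit_id(content: str, parent_ids: list) -> str:
    """Toy commit id: a SHA-1 over the commit's content and its parents' ids."""
    payload = "\n".join(["parents " + " ".join(parent_ids), content])
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()

def build_history(contents: list) -> list:
    """Return the ids of a linear history, oldest first."""
    ids = []
    for content in contents:
        parents = ids[-1:]  # empty for the root commit
        ids.append(commit_id(content, parents))
    return ids

original = build_history(["add README", "fix typo", "add tests"])
tampered = build_history(["add README (altered)", "fix typo", "add tests"])

# Changing the oldest commit changes every id that follows it.
print([a == b for a, b in zip(original, tampered)])
```

Anyone holding the original tip id would notice the altered history,
because no later id survives the change.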
** merge commits
In git, a commit can have many parents, as opposed to SVN where a
commit can have only one parent. A merge commit also has its own
tree, which records any conflict resolutions.
** SHA hashes are a pain to type
Git has a concept of `refs' which are typically symbolic references
to commits. At the end of the day, every ref ends up as a SHA hash.
*** SHA hashes can typically be shortened to a few characters
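A minimal sketch of how an abbreviated id could be expanded: accept
a prefix only when exactly one known object starts with it. The
helper resolve_prefix and the second id in the list are made up for
illustration.

```python
def resolve_prefix(prefix: str, object_ids: list) -> str:
    """Expand an abbreviated id, refusing anything missing or ambiguous."""
    matches = [oid for oid in object_ids if oid.startswith(prefix.lower())]
    if not matches:
        raise KeyError(f"no object matches {prefix!r}")
    if len(matches) > 1:
        raise ValueError(f"{prefix!r} is ambiguous: {len(matches)} candidates")
    return matches[0]

known = [
    "3b18e512dba79e4c8300dd08aeb37f8e728b8dad",
    "3b0fa2d1c4e5f60718293a4b5c6d7e8f90a1b2c3",  # made-up id for illustration
    "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391",
]
print(resolve_prefix("e69d", known))
print(resolve_prefix("3b18", known))
```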
*** tags are fixed refs
Tags have optional descriptions and GPG signatures.
*** branches and HEAD
Branches are moving refs and always reference their tips. HEAD is a
pointer to the tip of the current branch.
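The relationship between tags, branches and HEAD can be modelled
with a few dictionaries. The Refs class below is a hypothetical toy,
not git's storage format, and the ids "c1" and "c2" are placeholders
for real hashes.

```python
class Refs:
    """Toy ref store: branches and tags are names mapped to commit ids."""

    def __init__(self):
        self.branches = {}
        self.tags = {}
        self.head = None  # name of the current branch

    def create_branch(self, name: str, commit_id: str) -> None:
        # Creating a branch only records a new name for an existing commit.
        self.branches[name] = commit_id

    def create_tag(self, name: str, commit_id: str) -> None:
        if name in self.tags:
            raise ValueError(f"tag {name!r} already exists")  # tags stay fixed
        self.tags[name] = commit_id

    def checkout(self, branch: str) -> None:
        if branch not in self.branches:
            raise KeyError(branch)
        self.head = branch

    def commit(self, new_commit_id: str) -> None:
        # Committing moves the current branch to the new tip; HEAD follows.
        self.branches[self.head] = new_commit_id

    def resolve(self, name: str) -> str:
        if name == "HEAD":
            return self.branches[self.head]
        if name in self.branches:
            return self.branches[name]
        return self.tags[name]

refs = Refs()
refs.create_branch("master", "c1")
refs.checkout("master")
refs.create_tag("v1.0", "c1")
refs.create_branch("topic", refs.resolve("HEAD"))
refs.checkout("topic")
refs.commit("c2")
print(refs.resolve("HEAD"), refs.resolve("master"), refs.resolve("v1.0"))
```

After the commit on topic, HEAD and topic have moved while master
and the tag still name the older commit.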
*** $ref^ and $ref~$n
You can follow parents by using caret or tilde notation. The parents
of a merge commit are numbered in the order they are listed in the
commit object.
# ^ is the parent, ^^ is the parent's parent, and so on
e.g. HEAD^ (the next most-recent commit on the current branch)
# ~2 is shorthand for ^^
e.g. HEAD~2 (the third most-recent commit on the current branch)
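The notation can be worked through on a small made-up history. The
sketch below is not git's parser: rev_parse, PARENTS and the
single-letter commit names are assumptions for illustration, and
only the ^, ^n and ~n suffixes are handled.

```python
import re

# Toy history: commit name -> ordered list of parents (first parent first).
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "X": ["A"],
    "M": ["C", "X"],  # a merge commit with two parents
}

def rev_parse(expr: str, refs: dict) -> str:
    """Resolve expressions such as HEAD^, HEAD^2 or HEAD~2 in the toy history."""
    match = re.fullmatch(r"([A-Za-z_]+)((?:[\^~]\d*)*)", expr)
    if match is None:
        raise ValueError(f"cannot parse {expr!r}")
    name, suffix = match.groups()
    commit = refs.get(name, name)
    for op, digits in re.findall(r"([\^~])(\d*)", suffix):
        n = int(digits) if digits else 1
        if op == "^":
            if n:  # ^0 names the commit itself
                commit = PARENTS[commit][n - 1]  # ^n selects the n-th parent
        else:
            for _ in range(n):  # ~n follows the first parent n times
                commit = PARENTS[commit][0]
    return commit

refs = {"HEAD": "M"}
for expr in ["HEAD^", "HEAD^^", "HEAD~2", "HEAD^2", "HEAD^2^"]:
    print(expr, "->", rev_parse(expr, refs))
```

Here HEAD^^ and HEAD~2 land on the same commit, while HEAD^2 steps
onto the branch that was merged in.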
** Sharing commits
*** Remotes
*** Implicit read-only "vendor" branches.
*** Push and Pull
*** Example
* Merge strategies
*** Fast forward
When the merge target is an ancestor of the other branch, this just
moves the target branch to the other branch's tip.
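A fast forward can be sketched as an ancestry check followed by a
ref update. The helpers is_ancestor and try_fast_forward and the toy
history are assumptions for illustration, not git's code.

```python
def is_ancestor(parents: dict, ancestor: str, descendant: str) -> bool:
    """True if `ancestor` is reachable from `descendant` by following parents."""
    stack, seen = [descendant], set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return False

def try_fast_forward(parents: dict, branches: dict, target: str, other: str) -> bool:
    """Move `target` to `other`'s tip when no real merge is needed."""
    if is_ancestor(parents, branches[target], branches[other]):
        branches[target] = branches[other]
        return True
    return False  # histories diverged; a real merge commit is required

parents = {"A": [], "B": ["A"], "C": ["B"], "X": ["A"]}
branches = {"master": "A", "topic": "C", "side": "X"}

print(try_fast_forward(parents, branches, "master", "topic"), branches["master"])
print(try_fast_forward(parents, branches, "master", "side"), branches["master"])
```

The first call succeeds because master sat on an ancestor of topic.
The second does not, since by then master and side have diverged.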
*** Recursive
Used when more than one common ancestor exists. Builds the merge
base revision by recursively merging common ancestors.
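The situation the recursive strategy handles can be seen by
computing merge bases on a toy criss-cross history. This sketch only
finds the bases; it does not perform the recursive merge itself. The
function names and the graph are made up for illustration.

```python
def ancestors(parents: dict, commit: str) -> set:
    """All commits reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen

def merge_bases(parents: dict, ours: str, theirs: str) -> set:
    """Common ancestors that are not ancestors of another common ancestor."""
    common = ancestors(parents, ours) & ancestors(parents, theirs)
    return {
        c for c in common
        if not any(c != other and c in ancestors(parents, other) for other in common)
    }

# Criss-cross: D and E each merge B and C, then F and G continue separately.
criss_cross = {
    "A": [], "B": ["A"], "C": ["A"],
    "D": ["B", "C"], "E": ["C", "B"],
    "F": ["D"], "G": ["E"],
}
print(sorted(merge_bases(criss_cross, "F", "G")))
```

Merging F and G finds two equally good bases, B and C, which is the
case where the bases are first merged with each other to build a
single base revision.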
*** And others
See git-merge(1)
* A brief note on the index
The index stores the tree object of the commit-to-be.
# adding to the index cache: git add
# removing: git rm --cached
** git reset
Can be used to reset the index, or certain files in the index, to a
given commit, which is HEAD by default.
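The three operations above can be mimicked with a dictionary from
path to blob id. The Index class is a hypothetical toy, the blob ids
are placeholders, and reset here only restores from HEAD rather than
from an arbitrary commit.

```python
class Index:
    """Toy staging area: path -> blob id for the commit being prepared."""

    def __init__(self, head_tree: dict):
        self.head_tree = dict(head_tree)  # snapshot recorded by HEAD
        self.entries = dict(head_tree)    # starts out identical to HEAD

    def add(self, path: str, blob_id: str) -> None:
        self.entries[path] = blob_id      # like `git add`

    def rm_cached(self, path: str) -> None:
        del self.entries[path]            # like `git rm --cached`

    def reset(self, *paths: str) -> None:
        # Like `git reset [paths]`: restore the named entries from HEAD,
        # or every entry when no path is given.
        targets = paths or (set(self.entries) | set(self.head_tree))
        for path in targets:
            if path in self.head_tree:
                self.entries[path] = self.head_tree[path]
            else:
                self.entries.pop(path, None)

index = Index({"README": "b1"})
index.add("main.c", "b2")
index.rm_cached("README")
index.reset("README")
print(index.entries)
index.reset()
print(index.entries)
```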
* How dVCS democratizes version control
* My seekrit agenda
I am a lazy programmer, and the more people who use git the easier
my life is. I use git because...
* Additional Resources
# Git - SVN Crash Course
<http://git.or.cz/course/svn.html>
# Git User's Manual
<http://www.kernel.org/pub/software/scm/git/docs/user-manual.html>
# Extensive Man Pages