$Id: README,v 1.1.1.1 1999/02/02 23:29:39 shmit Exp $

				    TICRA
				    -----
						Brian Cully <shmit@rcn.com>

Contents:
	0 - Introduction
	1 - Installation Instructions
		1.1 - Building the Package
		1.2 - Configuring
			1.2.1 - Server Configuration
				1.2.1.1 - Hostlist
				1.2.1.2 - Server.conf
			1.2.2 - Client Configuration
				1.2.2.1 - Disklist
				1.2.2.2 - Dumptypes
				1.2.2.3 - Client.conf
	2 - Common Problems
	3 - How it works
		3.1 - Gathering Estimates
		3.2 - Gathering the Dumps
			3.2.1 - The Master
			3.2.2 - The Slaves
				3.2.2.1 - Run-dump
		3.3 - Putting It on to the Tape
			3.3.1 - Writing the Dump
		3.4 - Sending out the Reports
	4 - What Still Needs to be Done
	Appendix A - Example Configuration Files
	Appendix B - The Tape Format
	Appendix C - Design Decisions

Part 0: Introduction
--------------------
TICRA is a backup system. Wow. Toot toot toot.

Part 1: Installation Instructions
---------------------------------
Installation consists of three parts: building the package, installing
the binaries and sample data files, and configuring the client and
server.

Before you build the package you should know whether the target machine
is a client or a server, as the build process is a bit different
depending on the target.

1.1: Building the Package
-------------------------
Before you build the package, you should poke through the Makefile and
set it up for your environment. In particular, you should verify that
the binaries are installed into the proper place, and that the OSLIBS
match the platform upon which you are trying to install.

You should also poke through config.h and make sure things are set up as
you would like there. Of particular interest is the PATH_RSH macro,
which you should set to point to either rsh or something equivalent
(like ssh).

Once that's set up, you can build the binaries. As stated above, you
build differently depending on whether you're installing the client or
server. If you're installing the server, you type:

% make server

If you're installing the client, you type:

% make client

Of course, if you don't care you can just type:

% make

and both will be built.

Watch the build process for errors and correct them if you can.

Once everything is compiled properly, install the binaries and example
files. To do this for the server you type:

% make install-server

To install the client, you type:

% make install-client

Finally, to install the example files:

% make install-examples

If you want to install everything at once you can just:

% make install

1.2: Configuring
----------------
Now that everything is installed you have to do the configuration. The
process is very different depending on whether you are installing a
client or a server, so they're covered in different sections.

All configuration files have fields separated on word boundaries, ignore
blank lines, and ignore everything after a `#'.

The one thing that must be taken care of for both client and server is
the setup of the backup account in the password file. The backup
account must be named `ticra' and must use the shell `smrsh' which was
installed into BINDIR as specified in the Makefile (probably
/usr/local/ticra/bin).
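
For illustration, a password file entry for the backup account might
look something like the following, assuming the default BINDIR; the
UID, GID, and home directory shown here are only placeholders, so use
whatever is appropriate for your system:

ticra:*:6000:6000:TICRA backup:/usr/local/ticra:/usr/local/ticra/bin/smrsh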

1.2.1: Server Configuration
---------------------------
To configure the server you must edit two files: server.conf and
hostlist. Hostlist contains a list of hosts to which the server will
connect to make backups. Server.conf contains the rest of the server
configuration.

1.2.1.1: Hostlist
-----------------
The hostlist file is simply a list of hosts to which the backup operator
should connect. It consists of one entry per line, each of which is a
hostname.

The backup operator must be able to connect to each host in the hostlist
in order to back it up. The connection process uses rsh, or whatever was
specified in the config.h file, so you must make sure that the backup
operator can connect to the target host /before/ adding it to the
hostlist file; otherwise an error will occur during the backup.
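
A quick way to check, assuming plain rsh and a client named
`client.example.com' (substitute your own hostname), is to run
something like the following as the backup operator and make sure the
connection itself succeeds; what the remote smrsh will actually let
you run depends on how it has been set up on the client:

% rsh -l ticra client.example.com true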

1.2.1.2: Server.conf
--------------------
The server.conf file contains the real configuration information for the
server. The variables that can be set and their functions are specified
below:

manager:	The e-mail address of the backup operator. This address
		gets the reports generated by the reporter process.

logdir:		The directory that contains the info and error logs.

infolog:	The name of the log that contains informational output
		generated by the dumper process. It can be set to `-' to
		log to standard output instead of a file.

errorlog:	The name of the log that contains error output generated
		by the dumper process. It can be set to `-' to log to
		standard error instead of a file.

timeout:	How many seconds to wait when opening a connection
		before giving up.

hostlist:	The filename that contains the hostlist.

spooldir:	The directory that backups will be stored in before they
		are written to tape.

spoolsize:	The maximum amount of data that can be written to the
		spooldir, in megabytes.

tapedev:	The name of the tape device.

tapesize:	The size of the tape, in megabytes.

labelstr:	The label on the tape must start with this string in
		order for data to be written to it.
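
A complete example server.conf can be found in Appendix A.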

1.2.2: Client Configuration
---------------------------
The client end of things has three files that must be configured
before it can be backed up. These three files are `disklist',
`dumptypes', and `client.conf'.

Disklist contains a list of disks to be backed up, what kind of
backup will be performed on the disk, whether or not compression
is enabled, and finally, what type of authentication should be used
on the disk.

Dumptypes describes what each kind of backup is, and client.conf
specifies what port will be used for unauthenticated backups.

Because TICRA uses rsh to communicate between the client and server,
you must make sure that the backup server can initiate a connection
to the client over rsh (or the equivalent that was specified in
config.h).

1.2.2.1: Disklist
-----------------
The disklist contains one entry per line, in the following format:

filesystem	dumptype	compression	auth-type

These fields are described below:

filesystem:	The name of the device or directory to be backed up.

dumptype:	The name of the type of dump to use, the exact
		meaning of which is set in the `dumptypes' file.

compression:	`uncompressed' is the only type currently supported.

auth-type:	`noauth' is the only type currently supported.
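
For example, the following entry (taken from the example disklist in
Appendix A) backs up /tmp with the `tar' dump type, uncompressed and
unauthenticated:

/tmp		tar		uncompressed	noauth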

1.2.2.2: Dumptypes
------------------
The dumptypes file contains the definitions for the dump types that are
referenced in the disklist. It consists of one entry per line, in the
following format:

dumpname	dumpline	estimateline	regexp

These fields mean:

dumpname:	The name of the dump type; this is what's used in
		`disklist'.

dumpline:	The command line that's used to dump this type
		onto standard output. Simple substitution is
		performed as follows:
			%l - Dump level.
			%v - Filesystem name (from disklist).
			%% - %

estimateline:	The command line that's used to get a size
		estimate for a filesystem. Substitution is
		performed as in dumpline.

regexp:		The regular expression that extracts the dump's size,
		in 1K blocks, from the estimate output.
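
To make the format concrete, a hypothetical entry might look like the
one below. The dumpline is the same one used in Appendix A; the
estimateline and regexp are purely illustrative placeholders (and, as
Part 4 notes, estimates aren't actually gathered yet), so adapt them
to whatever your dump program prints:

dump	"dump -%luf - %v"	"dump -%lS %v"	"([0-9]+)"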

1.2.2.3: Client.conf
--------------------
The client.conf file specifies what port should be used for
unauthenticated backups.

The fields are:

port:		The port that will be used for unauthenticated
		backups. Must be a number.

Part 2: Common Problems
-----------------------
No one uses this, how can there be common problems?

Part 3: How it Works
--------------------
3.1: Gathering Estimates
------------------------
The server machine first reads its hostlist and connects to each
host sequentially over an RSH pipe, asking it for a list of disks
to back up, what type of authentication to use (either `noauth' or
`kerbV'), and an estimate of how much space the dump will take up.
It uses this information to put together a schedule of which disks
to dump and at what level to dump them. If the authentication type
is `noauth' then the dumper asks which port it should connect to
for the dump.

3.2: Gathering the Dumps
------------------------
3.2.1: The Master
-----------------
Once the estimates have been gathered and the schedule has been
worked out, the master dumper process then identifies itself as
such via setproctitle(3) (not available on all platforms).

The master then forks and execs the taper process, opening pipes
to it in the process. It saves the open file descriptors for use
with the slaves.

In order to do the actual dumping, the master then forks once for
each entry in the hostlist; each of these processes will be referred
to as a slave.

Once all the slaves have finished, the master tells the taper that
there is no further data to be written to the tape and waits for
the taper to finish writing everything to the tape. When the taper
is done, the master finishes off by running the report generating
script, `report'.
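
To make the fork/exec plumbing a little more concrete, here is a
minimal sketch of the pattern described above.  It is not the actual
TICRA source: the path to the taper binary, the single control pipe,
and the hard-coded host count are all stand-ins.

#include <sys/types.h>
#include <sys/wait.h>
#include <err.h>
#include <unistd.h>

int
main(void)
{
	int	to_taper[2];		/* master -> taper control pipe */
	pid_t	taper;
	int	i, nhosts = 2;		/* would come from the hostlist */

	if (pipe(to_taper) == -1)
		err(1, "pipe");

	switch (taper = fork()) {
	case -1:
		err(1, "fork");
	case 0:
		/* Child: become the taper, reading requests on stdin. */
		close(to_taper[1]);
		dup2(to_taper[0], STDIN_FILENO);
		execl("/usr/local/ticra/bin/taper", "taper", (char *)NULL);
		err(1, "execl");
	}
	close(to_taper[0]);

	/* Fork one slave per host; each inherits the pipe's write end. */
	for (i = 0; i < nhosts; i++)
		switch (fork()) {
		case -1:
			err(1, "fork");
		case 0:
			/* Slave: connect to its host and dump its disks. */
			_exit(0);
		}

	/* Wait for the slaves to finish... */
	for (i = 0; i < nhosts; i++)
		wait(NULL);

	/* ...then close the pipe so the taper sees EOF, and reap it. */
	close(to_taper[1]);
	waitpid(taper, NULL, 0);
	return (0);
}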


3.2.2: The Slaves
-----------------
The slaves do all the real work with regard to client communication.
They first open up an RSH pipe to their client, which is used as a
control session. The RSH pipe execs the run-dump process and the
two (dumper and run-dump) exchange control information and error
messages over the pipe.

The slave tells the run-dump process on the client to open up a
socket on the port specified in the client's client.conf file (which
was given to the server during the estimation gathering process).

The slave then goes through each disk it knows about, iteratively
requesting a dump from the run-dump process on the client. For each disk
it opens a new connection to the client over the negotiated port
and waits for data to start coming down the socket (from now on,
we'll refer to this as the `data connection').

Normally, the slave will just put the data from the data connection
into a file in the spooling directory (as specified in the server.conf
file) in the format:

<hostname>:<diskname>

where <hostname> is the name specified in the hostlist file on the
server and <diskname> is the name specified in the disklist file
on the client.

Once the dump is completely written to the spooling directory, the
slave tells the taper to write that file to tape (using the pipe
passed down from the master).

If, however, the estimate for the disk says it will be too big for
the spooling directory (also specified in the server.conf file),
then the slave will instead dump the data directly to the taper.

When all disks are finished, the slave tells run-dump that it's
done and closes the RSH control connection.
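
In the normal, spooled case the heart of what a slave does per disk
is just a copy loop from the data connection into the spool file.
The function below is only an illustration of that loop; the name is
made up, and the real slave also handles errors more carefully, talks
to the taper afterwards, and has the dump-straight-to-taper path.

#include <sys/types.h>
#include <err.h>
#include <fcntl.h>
#include <unistd.h>

/* Copy a dump from the data connection `sock' into the spool file `path'. */
void
spool_dump(int sock, const char *path)
{
	char	buf[8192];
	ssize_t	n;
	int	fd;

	if ((fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600)) == -1)
		err(1, "%s", path);
	while ((n = read(sock, buf, sizeof(buf))) > 0)
		if (write(fd, buf, (size_t)n) != n)
			err(1, "write: %s", path);
	if (n == -1)
		err(1, "read");
	if (close(fd) == -1)
		err(1, "close: %s", path);
}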

3.2.2.1: Run-dump
-----------------
Run-dump reads through the disklist and the client.conf file. It
responds to requests from the server to execute a dump and open a
connection on a given port (specified in the client.conf file and
passed over to the server as part of the estimate process).

When a dump is requested for a disk, run-dump executes the process
for the dump type specified in the disklist. Run-dump does very
crude syntax substitution on the command line, allowing it to run
arbitrary commands to do the dump. This means that the only real
difference between the types `dump' and `tar' is that the type
`dump' is capable of calculating estimates and using leveled dumps,
whereas tar is not.

The only other thing run-dump responds to is a port command. This
tells run-dump to open up a socket on a given port so that the
server can connect to it. This is the channel that is used to send
the dump over to the server.

For obvious reasons, the port command needs to come before the dump
command.
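
The %l/%v/%% substitution is simple enough to sketch in a few lines
of C.  This is only an illustration of the documented rules, not the
code run-dump actually contains, and the function name is invented.

#include <stdio.h>

/*
 * Expand %l (dump level), %v (filesystem name), and %% in `fmt' into
 * `buf', which is `len' bytes long (len must be at least 1).  For
 * example, with fmt = "dump -%luf - %v", level = 0 and volume =
 * "/var", buf ends up holding "dump -0uf - /var".
 */
void
expand_dumpline(char *buf, size_t len, const char *fmt, int level,
    const char *volume)
{
	size_t	used = 0;

	if (len == 0)
		return;
	while (*fmt != '\0' && used + 1 < len) {
		if (*fmt != '%') {
			buf[used++] = *fmt++;
			continue;
		}
		fmt++;				/* skip the `%' */
		switch (*fmt) {
		case 'l':
			used += snprintf(buf + used, len - used, "%d", level);
			break;
		case 'v':
			used += snprintf(buf + used, len - used, "%s", volume);
			break;
		case '%':
			buf[used++] = '%';
			break;
		case '\0':
			continue;		/* stray `%' at end of line */
		}
		fmt++;
	}
	if (used >= len)
		used = len - 1;
	buf[used] = '\0';
}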

3.3: Putting It on to the Tape
------------------------------
Once the taper process is created, it rewinds the tape and checks
the label against the one in the server.conf file; if the label
doesn't match, it sends an error and dies.

If the tape's label matches, however, the taper will wait on the
control pipes to be told either to dump something to tape or to
quit.

The taper can be told to either write a file to the tape or write
a stream of data. In either event, it first writes a file header
describing the data and then writes the data.

When the taper is done writing to the tape, and the master dumper
has told the taper to quit, the taper will write an end of tape
marker; in the future this will be used to store more than one
period's worth of dumps on a single tape. The taper then exits.

3.3.1: Writing the Dump
-----------------------
It is expected that writing to tape will be slower than grabbing
the dump from the target machine (although this is certainly not
always the case!). Moreover, there will be multiple dumper processes
all writing to the spooling disk, but only one taper. Because of
this, the taper has to keep a queue of things to write to tape
while it is busy actually writing data to the tape.

To accomplish this, when the taper first receives a request to
write something to tape, it forks off a child to do the actual
writing, while the parent sits on the control connection waiting
for more requests and adding them to the queue.

When the child dies (after it's done writing to tape, or on an
error), the parent forks off another child to deal with the next
item in the queue. This goes on until the master dumper says there
aren't going to be any more things added to the queue.

When the master dumper says everything is finished, the taper stops
forking off children and goes through the rest of the queue
iteratively.
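
As a sketch of that final drain-the-queue phase, forking one child
per queue item looks roughly like the following.  The queue contents
and the printf standing in for the actual tape write are placeholders;
during the earlier, concurrent phase the parent additionally keeps
listening on its control pipe while a child runs.

#include <sys/types.h>
#include <sys/wait.h>
#include <err.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	/* Placeholder queue of spool files still waiting to be taped. */
	const char	*queue[] = { "host1:disk1", "host2:disk2", NULL };
	pid_t		 pid;
	int		 i;

	for (i = 0; queue[i] != NULL; i++) {
		switch (pid = fork()) {
		case -1:
			err(1, "fork");
		case 0:
			/* Child: write one spool file to tape, then exit. */
			printf("writing %s to tape\n", queue[i]);
			_exit(0);
		default:
			/* Parent: reap the child before starting the next. */
			waitpid(pid, NULL, 0);
		}
	}
	return (0);
}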

3.4: Sending out the Reports
----------------------------
The reporter doesn't do much of anything fancy right now; it just
mails the error and info logs to the manager and renames the log
files to contain a date stamp.

Part 4: What Still Needs to be Done
-----------------------------------
Check out the file TODO in this directory for a list of things that
still need to be accomplished.

In particular, this document lies in many places. Notably, the
estimate process doesn't actually grab an estimate, there is no
scheduling being done, backup levels aren't being used (so, currently,
there is no difference between tar and dump), and there's no facility
to dump straight to tape (so you'd better make sure that none of
your dumps are bigger than the spooling directory).

Appendix A: Example Configuration Files
---------------------------------------
			      +-----------+
			      |server.conf|
			      +-----------+
# To whom mail should go.
manager		shmit@erols.com

# Where to log informational and error messages.
logdir		/usr/local/ticra/var/log
infolog		-
errorlog	-

# How many seconds to wait on a connection before giving up.
timeout		10

# List of machines to backup.
hostlist	/usr/local/ticra/libdata/ticra/hostlist

# Where dumps will be spooled before being dumped to tape.
spooldir	/var/holding
spoolsize	50

# The tape device on which to dump.
tapedev		/dev/nrst0
tapesize	50
labelstr	TEST00
				  +--------+
				  |hostlist|
				  +--------+
localhost
				+-----------+
				|client.conf|
				+-----------+
# Port to use for non-authenticated dumps.
port	31337

# Syntax lines for dump types. Substitution rules are:
#	%l - dump level
#	%v - volume name
#	%% - %
dump	"dump -%luf - %v"
tar	"tar -clf - %v"
				  +--------+
				  |disklist|
				  +--------+
# Filesystem	dumptype	compression	auth-type
#---------------------------------------------------------
/tmp		tar		uncompressed	noauth
/var		dump		uncompressed	noauth

Appendix B: The Tape Format
---------------------------
Physical Start of Tape
TAPE LABEL:	8192 bytes
Tape EOF
START OF TAPE:	8192 bytes
Tape EOF
Repeat for each file on tape:
	FILE HEADER: 8192 bytes
		<hostname>:<diskname> <date>
		<hostname>:	Machine from which the disk came.
		<diskname>:	Name of disk on said machine.
		<date>:		Date of dump in DDMMYYYY format
	Tape EOF
STOP OF TAPE:	8192 bytes
Tape EOF
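
For example, a dump of /tmp from the host `localhost' (the host and
disk configured in Appendix A) taken on February 2nd, 1999 would have
a file header beginning with the line:

localhost:/tmp 02021999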

Appendix C: Design Decisions
----------------------------
This package has been designed primarily to back up data in a Very
Large Organization, although it should work even at small companies
without any hassle.

When I thought `Very Large Organization', I saw an organization
that necessarily had many different departments, all of whom want
as much control as possible over how to back up their data.  After
all, if the Head of Finance wants to change /var/db/account to be
backed up with tar instead of dump, he shouldn't have to call over
to the backup operator to do it.

Moreover, I didn't want the backup operator to have to spend his
days toiling with lots of different hosts and departments trying
to get things done their way; he should be able to spend as little
time as possible dealing with other people and the lack of reasonable
communication, which only complicates and confuses the matter.

Keep in mind that I work backups, and I'm lazy, so I wanted to
make things as easy as possible for me. However, I think you'll
find that my decisions don't make life any harder for anyone, but
make it more convenient (all true lazybones try to make life
easier for everyone).

To that end, I've tried to make as little as possible configurable
on the server end of things, instead pushing the responsibility
over to the client end.

I came from an AMANDA background, so I was fairly biased when I
started writing this software. I kept many of the ideas of AMANDA
but thought that a few things needed changing.

In particular, I wanted to separate the disks to be backed up from
the hosts to be backed up, and keep the disks in a separate file
on the client so the client managers could configure which disks
are backed up and the type of backup that is used.

I also wanted to have the client define how the dump was done,
instead of the server; this simplifies things in a larger environment
for a number of reasons:

	* Across a diverse network with many different types of
	  clients and file systems you can't be guaranteed that a
	  program called `dump' or `tar' will exist or will do the
	  right thing. Instead of having the backup operator know
	  these details that are largely unimportant to him, I opted
	  to have the person that runs the client in question fill
	  in the blanks.

	* The size of the server.conf file would become rather
	  unwieldy if it had to contain entries for every possible
	  type of dump that would be used in an organization.

	* There is no need for the server to know what kind of data
	  is in the backup stream.

The other main goal was to have software that actually worked. My
experience with AMANDA was mostly loathsome, mainly due to what I
considered really poor design decisions, like the use of UDP,
embedded data flow information, and the poor use of RSH. I've
endeavored to fix these design problems with what I consider to be
better solutions.

Apropos of that, I also endeavored to make the code clean and used
a style that moves in that direction. If you're interested in it
you should read style(9) on any *BSD system. I also used ANSI
prototypes and function prefixes. I feel they're easy to read, and
the main argument I see against them (namely, compatibility) I
don't find valid with this software (I use so many POSIXisms that
if you have them, you'll have a compiler that can handle ANSI C).