[CrackMonkey] Recovery

Nick Moffitt nick at zork.net
Thu Nov 30 11:56:38 PST 2000


----- Forwarded message from glen mccready <gkm at petting-zoo.net> -----
Forwarded-by: Nev Dull <nev at sleepycat.com>
Forwarded-by: "A. Chase Turner" <acturner at uswest.net>
From: owner-sage-members at usenix.org On Behalf Of Mark R. Lindsey
Sent: Friday, July 23, 1999 22:47
To: sage-members at usenix.org

A couple of weeks ago, I ceded the Senior System Administerial Throne
to someone else in my company. It was a friendly move, to another 
group within the same company.

I'd like to relate something I've realized since that change. It's
something personal to other admins like me, something about the strain
of system administration.

There's a sign in our Network Operations Center:

	DSS Online Goal of Network Operation:
		To be the invisible bringers of computing joy

I arrived at this goal practically; a system that works well is one that
is `invisible' -- it's just consistent enough that it becomes part of
the scenery, and users don't even think about it. When users don't think
about the system, they're not complaining. And when they're not
complaining, they're usually not cancelling their service. (DSS Online is
a 12-employee ISP.)

For me, it became an artform. How can I test and upgrade sendmail without
dropping a single mail connection? How can we reboot a router such that 
nobody notices? How can I correct that user's problem, even though the
user doesn't know he has a problem? And so on.

We set up a system, `paulRevere', which had the job of watching for
trouble. paulRevere performs hundreds of health probes every hour,
and pages us with a hair trigger.

With all of the effort, failures still occurred; and, in most cases,
I took them personally. It's hard not to do so; after all, when that
PRI failed for the sixth Thursday in a row, it was my job to ensure
that it was reported and repaired. Or when that customer started
spamming at 2am, it was my job to ensure that they stopped, and
that their mess was cleaned up.

Thus, at the same time, my role of Senior System Administrator became a
job, and an artform, and a sentence.

When a major part of your job is to think of solutions to problems, 
it's hard to take `time off'. Those problems are going to be there,
plaguing your users the whole time. And that's just wrong, according
to my goals. 

...

I haven't solved it. Somebody else is grappling with it now. My means
of escape was to find another project.

So how do *you* people handle it when you can't just switch jobs? 
(I'm going to be very disappointed if the answer is to lower one's 
standards.)

datasys.net!mark


----- End forwarded message -----

-- 
CrackMonkey.Org - Non-sequitur arguments and ad-hominem personal attacks
Pigdog.Org      - The Online Handbook for Bad People of the Future
 
                You are not entitled to your opinions.




