comparison dcc.html.in @ 0:c7f6b056b673

First import of vendor version
author Peter Gervai <grin@grin.hu>
date Tue, 10 Mar 2009 13:49:58 +0100
-1:000000000000 0:c7f6b056b673
1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
2 <HTML>
3 <HEAD>
4 <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
5 <TITLE>dcc.0.8</TITLE>
6 <META http-equiv="Content-Style-Type" content="text/css">
7 <STYLE type="text/css">
8 BODY {background-color:white; color:black}
9 ADDRESS {font-size:smaller}
10 IMG.logo {width:6em; vertical-align:middle}
11 </STYLE>
12 </HEAD>
13 <BODY>
14 <PRE>
15 <!-- Manpage converted by man2html 3.0.1 -->
16 <B><A HREF="dcc.html">DCC(8)</A></B> Distributed Checksum Clearinghouse <B><A HREF="dcc.html">DCC(8)</A></B>
17
18
19 </PRE>
20 <H2><A NAME="NAME">NAME</A></H2><PRE>
21 <B>DCC</B> -- Distributed Checksum Clearinghouse
22
23
24 </PRE>
25 <H2><A NAME="DESCRIPTION">DESCRIPTION</A></H2><PRE>
26 The Distributed Checksum Clearinghouse or <B>DCC</B> is a cooperative, distrib-
27 uted system intended to detect "bulk" mail or mail sent to many people.
28 It allows individuals receiving a single mail message to determine that
29 many other people have received essentially identical copies of the mes-
30 sage and so reject or discard the message.
31
32 Source for the server, client, and utilities is available at Rhyolite
33 Software, LLC, http://www.rhyolite.com/dcc/ It is free for organizations
34 that do not sell spam or virus filtering services.
35
36 <A NAME="How-the-DCC-Is-Used"><B>How the DCC Is Used</B></A>
37 The DCC can be viewed as a tool for end users to enforce their right to
38 "opt-in" to streams of bulk mail by refusing bulk mail except from
39 sources in a "whitelist." Whitelists are the responsibility of DCC
40 clients, since only they know which bulk mail they solicited.
41
42 False positives or mail marked as bulk by a DCC server that is not bulk
43 occur only when a recipient of a message reports it to a DCC server as
44 having been received many times or when the "fuzzy" checksums of differ-
45 ing messages are the same. The fuzzy checksums ignore aspects of mes-
46 sages in order to compute identical checksums for substantially identical
47 messages. The fuzzy checksums are designed to ignore only differences
48 that do not affect meanings. So in practice, you do not need to worry
49 about DCC false positive indications of "bulk," but not all bulk mail is
50 unsolicited bulk mail or spam. You must either use whitelists to distin-
51 guish solicited from unsolicited bulk mail or only use DCC indications of
52 "bulk" as part of a scoring system such as SpamAssassin. Besides unso-
53 licited bulk email or spam, bulk messages include legitimate mail such as
54 order confirmations from merchants, legitimate mailing lists, and empty
55 or test messages.
56
57 A DCC server estimates the number of copies of a message by counting
58 checksums reported by DCC clients. Each client must decide which bulk
59 messages are unsolicited and what degree of "bulkiness" is objectionable.
60 Client DCC software marks, rejects, or discards mail that is bulk accord-
61 ing to local thresholds on target addresses from DCC servers and unso-
62 licited according to local whitelists.
63
64 DCC servers are usually configured to receive reports from as many tar-
65 gets as possible, including sources that cannot be trusted to not exag-
66 gerate the number of copies of a message they see. A user of a DCC
67 client angry about receiving a message could report it with 1,000,000
68 separate DCC reports or with a single report claiming 1,000,000 targets.
69 An unprincipled user could subscribe a "spam trap" to mailing lists such
70 as those of the IETF or CERT. Such abuses of the system are not
71 problems, because much legitimate mail is "bulk." You cannot reject bulk
72 mail unless you have a whitelist of sources of legitimate bulk mail.
73
74 DCC can also be used by an Internet service provider to detect bulk mail
75 coming from its own customers. In such circumstances, the DCC client
76 might be configured to only log bulk mail from unexpected (not
77 whitelisted) customers.
78
79 <A NAME="What-the-DCC-Is"><B>What the DCC Is</B></A>
80 A DCC server accumulates counts of cryptographic checksums of messages
81 but not the messages themselves. It exchanges reports of frequently seen
82 checksums with other servers. DCC clients send reports of checksums
83 related to incoming mail to a nearby DCC server running <B><A HREF="dccd.html">dccd(8)</A></B>. Each
84 report from a client includes the number of recipients for the message.
85 A DCC server accumulates the reports and responds to clients with the
86 current total number of recipients for each checksum. The client adds an
87 SMTP header to incoming mail containing the total counts. It then dis-
88 cards or rejects mail that is not whitelisted and has counts that exceed
89 local thresholds.
90
91 A special value of the number of addressees is "MANY" and means it is
92 certain that this message was bulk and might be unsolicited, perhaps
93 because it came from a locally blacklisted source or was addressed to an
94 invalid address or "spam trap." The special value "MANY" is merely the
95 largest value that fits in the fixed-size field containing the count of
96 addressees. That "infinity" accumulated total can be reached with mil-
97 lions of independent reports as well as with one or two.
98
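The accumulation of target counts described above can be sketched in a few
lines of Python. This is an illustration only: the class name, the checksum
strings, and the numeric value chosen for MANY are assumptions, not the field
width or encoding actually used by dccd.

```python
# Hypothetical sketch of per-checksum target counting with a saturating
# "MANY" value. MANY is an assumed placeholder, not the value used by dccd.
MANY = 2**24 - 1  # assumed: largest value of a fixed-size count field


class ChecksumCounts:
    def __init__(self):
        self.totals = {}

    def report(self, checksum, targets):
        """Add a client report and return the new accumulated total."""
        total = self.totals.get(checksum, 0) + targets
        # Saturate: once the total reaches MANY it stays there.
        total = min(total, MANY)
        self.totals[checksum] = total
        return total


counts = ChecksumCounts()
counts.report("body:abc", 3)          # a message seen by 3 recipients
counts.report("body:abc", 2)          # another report of the same checksum
counts.report("body:spamtrap", MANY)  # a single report claiming MANY targets
```

Because the total saturates, one report claiming MANY targets and millions of
small reports end at the same value, which is why a count of MANY says "bulk"
but nothing about whether the mail was solicited.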
99 DCC servers <I>flood</I> or send reports of checksums of bulk mail to neighbor-
100 ing servers.
101
102 To keep a server's database of checksums from growing without bound,
103 checksums are forgotten when they become old. Checksums of bulk mail are
104 kept longer. See <B><A HREF="dbclean.html">dbclean(8)</A></B>.
105
106 DCC clients pick the nearest working DCC server using a small shared or
107 memory mapped file, <I>@prefix@/map</I>. It contains server names, port num-
108 bers, passwords, recent performance measures, and so forth. This file
109 allows clients to use quick retransmission timeouts and to waste little
110 time on servers that have temporarily stopped working or become unreach-
111 able. The utility program <B><A HREF="cdcc.html">cdcc(8)</A></B> is used to maintain this file as well
112 as to check the health of servers.
113
114 <A NAME="X-DCC-Headers"><B>X-DCC Headers</B></A>
115 The DCC software includes several programs used by clients. <B><A HREF="dccm.html">Dccm(8)</A></B> uses
116 the sendmail "milter" interface to query a DCC server, add header lines
117 to incoming mail, and reject mail whose total checksum counts are high.
118 Dccm is intended to be run with SMTP servers using sendmail.
119
120 <B><A HREF="dccproc.html">Dccproc(8)</A></B> adds header lines to mail presented by file name or <I>stdin</I>, but
121 relies on other programs such as procmail to deal with mail with large
122 counts. <B><A HREF="dccsight.html">Dccsight(8)</A></B> is similar but deals with previously computed check-
123 sums.
124
125 <B><A HREF="dccifd.html">Dccifd(8)</A></B> is similar to dccproc but is not run separately for each mail
126 message and so is far more efficient. It receives mail messages via a
127 socket somewhat like dccm, but with a simpler protocol that can be used
128 by Perl scripts or other programs.
129
130 DCC SMTP header lines are of one of the forms:
131
132 X-DCC-brand-Metrics: client server-ID; bulk cknm1=count cknm2=count ...
133 X-DCC-brand-Metrics: client; whitelist
134 where
135 <I>whitelist</I> appears if the global or per-user <I>whiteclnt</I> file marks the
136 message as good.
137 <I>brand</I> is the "brand name" of the DCC server, such as "RHYOLITE".
138 <I>client</I> is the name or IP address of the DCC client that added the
139 header line to the SMTP message.
140 <I>server-ID</I> is the numeric ID of the DCC server that the DCC client con-
141 tacted.
142 <I>bulk</I> is present if one or more checksum counts exceeded the DCC
143 client's thresholds to make the message "bulky."
144 <I>bulk</I> <I>rep</I> is present if the DCC reputation of the IP address of the
145 sender is bad.
146 <I>cknm1</I>,<I>cknm2</I>,... are types of checksums:
147 <I>IP</I> address of SMTP client
148 <I>env</I><B>_</B><I>From</I> SMTP envelope value
149 <I>From</I> SMTP header line
150 <I>Message-ID</I> SMTP header line
151 <I>Received</I> last Received: header line in the SMTP message
152 <I>substitute</I> SMTP header line chosen by the DCC client, pre-
153 fixed with the name of the header
154 <I>Body</I> SMTP body ignoring white-space
155 <I>Fuz1</I> filtered or "fuzzy" body checksum
156 <I>Fuz2</I> another filtered or "fuzzy" body checksum
157 <I>rep</I> DCC reputation of the mail sender or the esti-
158 mated probability that the message is bulk.
159 Counts for <I>IP</I>, <I>env</I><B>_</B><I>From</I>, <I>From</I>, <I>Message-Id</I>, <I>Received</I>, and
160 <I>substitute</I> checksums are omitted by the DCC client if the
161 server says it has no information. Counts for <I>Fuz1</I> and <I>Fuz2</I>
162 are omitted if the message body is empty or contains too lit-
163 tle of the right kind of information for the checksum to be
164 computed.
165 <I>count</I> is the total number of recipients of messages with that check-
166 sum reported directly or indirectly to the DCC server. The
167 special count "MANY" means that DCC clients have claimed that
168 the message is directed at millions of recipients. "MANY"
169 implies the message is definitely bulk, but not necessarily
170 unsolicited. The special counts "OK" and "OK2" mean the
171 checksum has been marked "good" or "half-good" by DCC servers.
172
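A program that receives mail already marked by a DCC client might split the
header line into the fields listed above. The following Python sketch assumes
the two header forms exactly as documented here; the function name, the sample
host name, and the sample server-ID are made up for illustration, and real
headers may contain variations this sketch does not handle.

```python
def parse_dcc_header(line):
    """Split one X-DCC-brand-Metrics header line into its documented parts."""
    name, _, value = line.partition(":")
    # The header name has the form X-DCC-brand-Metrics.
    brand = name.strip()[len("X-DCC-"):-len("-Metrics")]
    source, _, rest = value.partition(";")
    source_words = source.split()
    result = {
        "brand": brand,
        "client": source_words[0],
        "server_id": int(source_words[1]) if len(source_words) == 2 else None,
        "whitelist": False,
        "bulk": False,
        "rep": False,
        "counts": {},
    }
    for word in rest.split():
        if "=" in word:
            cknm, _, count = word.partition("=")
            # Counts are integers or special values such as MANY, OK, OK2.
            result["counts"][cknm] = int(count) if count.isdigit() else count
        elif word in ("whitelist", "bulk", "rep"):
            result[word] = True
    return result


# Sample lines invented for this example.
bulky = parse_dcc_header(
    "X-DCC-RHYOLITE-Metrics: mail.example.com 101; bulk Body=12 Fuz1=MANY")
white = parse_dcc_header(
    "X-DCC-RHYOLITE-Metrics: mail.example.com; whitelist")
```

A filter such as a procmail recipe or a scoring system would then act on
the "bulk" flag or on individual counts rather than on the raw header text.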
173 <A NAME="Mailing-lists"><B>Mailing lists</B></A>
174 Legitimate mailing list traffic differs from spam only in being solicited
175 by recipients. Each client should have a private whitelist.
176
177 DCC whitelists can also mark mail as unsolicited bulk using blacklist
178 entries for commonly forged values such as "From: user@public.com".
179
180 <A NAME="White-and-Blacklists"><B>White and Blacklists</B></A>
181 DCC server and client whitelist files share a common format. Server
182 files are always named <I>whitelist</I> and one is required to be in the DCC
183 home directory with the other server files. Client whitelist files are
184 named <I>whiteclnt</I> in the DCC home directory or a subdirectory specified
185 with the <B>-U</B> option for <B><A HREF="dccm.html">dccm(8)</A></B>. They specify mail that should not be
186 reported to a DCC server or that is always unsolicited and almost cer-
187 tainly bulk.
188
189 A DCC whitelist file contains blank lines, comments starting with "#",
190 and lines of the following forms:
191 <I>include</I> <I>file</I>
192 Copies the contents of <I>file</I> into the whitelist. It can occur
193 only in the main whitelist or whiteclnt file and not in an
194 included file. The file name should be absolute or relative to
195 the DCC home directory.
196
197 <I>count</I> <I>value</I>
198 lines specify checksums that should be white- or blacklisted.
199 <I>count</I> <I>env</I><B>_</B><I>From</I> <I>821-path</I>
200 <I>count</I> <I>env</I><B>_</B><I>To</I> <I>dest-mailbox</I>
201 <I>count</I> <I>From</I> <I>822-mailbox</I>
202 <I>count</I> <I>Message-ID</I> <I>&lt;string&gt;</I>
203 <I>count</I> <I>Received</I> <I>string</I>
204 <I>count</I> <I>Substitute</I> <I>header</I> <I>string</I>
205 <I>count</I> <I>Hex</I> <I>ctype</I> <I>cksum</I>
206 <I>count</I> <I>ip</I> <I>IP-address</I>
207
208 <I>MANY</I> <I>value</I>
209 indicates that millions of targets have received messages
210 with the header, IP address, or checksum <I>value</I>.
211 <I>OK</I> <I>value</I>
212 <I>OK2</I> <I>value</I>
213 say that messages with the header, IP address, or
214 checksum <I>value</I> are OK and should not be reported to DCC servers
215 or be greylisted. <I>OK2</I> says that the message is "half
216 OK." Two <I>OK2</I> checksums associated with a message are
217 equivalent to one <I>OK</I>.
218 A DCC server never shares or <I>floods</I> reports containing
219 checksums marked in its whitelist with OK or OK2 to other
220 servers. A DCC client does not report or ask its server
221 about messages with a checksum marked OK or OK2 in the
222 client whitelist. This is intended to allow a DCC client
223 to keep private mail so private that even its checksums
224 are not disclosed.
225 <I>MX</I> <I>IP-address-or-hostname</I>
226 <I>MXDCC</I> <I>IP-address-or-hostname</I>
227 mark an address or block of addresses of trusted mail
228 relays including MX servers, smart hosts, and bastion or
229 DMZ relays. The DCC clients <B><A HREF="dccm.html">dccm(8)</A></B>, <B><A HREF="dccifd.html">dccifd(8)</A></B>, and
230 <B><A HREF="dccproc.html">dccproc(8)</A></B> parse and skip initial Received: headers added
231 by listed MX servers to determine the external sources of
232 mail messages. Unsolicited bulk mail that has been for-
233 warded through listed addresses is discarded by <B><A HREF="dccm.html">dccm(8)</A></B>
234 and <B><A HREF="dccifd.html">dccifd(8)</A></B> as if with <B>-a</B> <I>DISCARD</I> instead of rejected.
235 <I>MXDCC</I> marks addresses that are MX servers that run DCC
236 clients. The checksums for a mail message that has been
237 forwarded through an address listed as MXDCC are queried
238 instead of reported.
239 <I>SUBMIT</I> <I>IP-address-or-hostname</I>
240 marks an IP address or block of addresses of SMTP submission
241 clients such as web browsers that cannot tolerate 4yz
242 temporary rejections but that cannot be trusted to not
243 send spam. Since they are local addresses, DCC Reputa-
244 tions are not computed for them.
245
246 <I>value</I> in <I>count</I> <I>value</I> lines can be
247 <I>dest-mailbox</I>
248 is an RFC 821 address or a local user name.
249 <I>821-path</I>
250 is an RFC 821 address.
251 <I>822-mailbox</I>
252 is an RFC 822 address with optional name.
253 <I>Substitute</I> <I>header</I>
254 is the name of an SMTP header such as "Sender" or the
255 name of one of two SMTP envelope values, "HELO," or
256 "Mail_Host" for the resolved host name from the <I>821-path</I>
257 in the message.
258 <I>Hex</I> <I>ctype</I> <I>cksum</I>
259 starts with the string <I>Hex</I> followed by a checksum type, and
260 a string of four hexadecimal numbers obtained from a DCC
261 log file or the <B><A HREF="dccproc.html">dccproc(8)</A></B> command using <B>-CQ</B>. The check-
262 sum type is <I>body</I>, <I>Fuz1</I>, or <I>Fuz2</I> or one of the preceding
263 checksum types such as <I>env</I><B>_</B><I>From</I>.
264 <I>IP-address</I>
265 is a host name, IPv4 or IPv6 address, or a block of IP
266 addresses in the standard xxx/mm form with mm limited for
267 server whitelists to 16 for IPv4 or 112 for IPv6. There
268 can be at most 64 CIDR blocks in a client <I>whiteclnt</I> file.
269 A host name is converted to IP addresses with DNS,
270 <I>/etc/hosts</I> or other mechanisms and one checksum for each
271 address is added to the whitelist.
272
273 <I>option</I> <I>setting</I>
274 can only be in a DCC client <I>whiteclnt</I> file used by <B><A HREF="dccifd.html">dccifd(8)</A></B>,
275 <B><A HREF="dccm.html">dccm(8)</A></B> or <B><A HREF="dccproc.html">dccproc(8)</A></B>. Settings in per-user whiteclnt files
276 override settings in the global file. <I>Setting</I> can be any of the
277 following:
278 <I>option</I> <I>log-all</I>
279 to log all mail messages.
280 <I>option</I> <I>log-normal</I>
281 to log only messages that meet the logging thresholds.
282 <I>option</I> <I>log-subdirectory-day</I>
283 <I>option</I> <I>log-subdirectory-hour</I>
284 <I>option</I> <I>log-subdirectory-minute</I>
285 creates log files containing mail messages in subdirecto-
286 ries of the form <I>JJJ</I>, <I>JJJ/HH</I>, or <I>JJJ/HH/MM</I> where <I>JJJ</I> is the
287 current julian day, <I>HH</I> is the current hour, and <I>MM</I> is the
288 current minute. See also the <B>-l</B> <I>logdir</I> option for <B><A HREF="dccm.html">dccm(8)</A></B>,
289 <B><A HREF="dccifd.html">dccifd(8)</A></B>, and <B><A HREF="dccproc.html">dccproc(8)</A></B>.
290 <I>option</I> <I>dcc-on</I>
291 <I>option</I> <I>dcc-off</I>
292 Control DCC filtering. See the discussion of <B>-W</B> for
293 <B><A HREF="dccm.html">dccm(8)</A></B> and <B><A HREF="dccifd.html">dccifd(8)</A></B>.
294 <I>option</I> <I>greylist-on</I>
295 <I>option</I> <I>greylist-off</I>
296 to control greylisting. Greylisting for other recipients
297 in the same SMTP transaction can still cause greylist tem-
298 porary rejections. <I>greylist-off</I> in the main whiteclnt
299 file.
300 <I>option</I> <I>greylist-log-on</I>
301 <I>option</I> <I>greylist-log-off</I>
302 to control logging of greylisted mail messages.
303 <I>option</I> <I>DCC-rep-off</I>
304 <I>option</I> <I>DCC-rep-on</I>
305 to honor or ignore DCC Reputations computed by the DCC
306 server.
307 <I>option</I> <I>DNSBL1-off</I>
308 <I>option</I> <I>DNSBL1-on</I>
309 <I>option</I> <I>DNSBL2-off</I>
310 <I>option</I> <I>DNSBL2-on</I>
311 <I>option</I> <I>DNSBL3-off</I>
312 <I>option</I> <I>DNSBL3-on</I>
313 honor or ignore results of DNS blacklist checks configured
314 with <B>-B</B> for <B><A HREF="dccm.html">dccm(8)</A></B>, <B><A HREF="dccifd.html">dccifd(8)</A></B>, and <B><A HREF="dccproc.html">dccproc(8)</A></B>.
315 <I>option</I> <I>MTA-first</I>
316 <I>option</I> <I>MTA-last</I>
317 consider MTA determinations of spam or not-spam first so
318 they can be overridden by <I>whiteclnt</I> files, or last so that
319 they can override <I>whiteclnt</I> <I>files.</I>
320 <I>option</I> <I>forced-discard-ok</I>
321 <I>option</I> <I>no-forced-discard</I>
322 control whether <B><A HREF="dccm.html">dccm(8)</A></B> and <B><A HREF="dccifd.html">dccifd(8)</A></B> are allowed to dis-
323 card a message for one mailbox for which it is spam when it
324 is not spam and must be delivered to another mailbox. This
325 can happen if a mail message is addressed to two or more
326 mailboxes with differing whitelists. Discarding can be
327 undesirable because false positives are not communicated to
328 mail senders. To avoid discarding, <B><A HREF="dccm.html">dccm(8)</A></B> and <B><A HREF="dccifd.html">dccifd(8)</A></B>
329 running in proxy mode temporarily reject SMTP envelope <I>Rcpt</I>
330 <I>To</I> values that involve differing <I>whiteclnt</I> files.
331 <I>option</I> <I>threshold</I> <I>type,rej-thold</I>
332 has the same effects as <B>-c</B> <I>type,rej-thold</I> for <B><A HREF="dccproc.html">dccproc(8)</A></B> or
333 <B>-t</B> <I>type,rej-thold</I> for <B><A HREF="dccm.html">dccm(8)</A></B> and <B><A HREF="dccifd.html">dccifd(8)</A></B>. It is useful
334 only in per-user whiteclnt files to override the global DCC
335 checksum thresholds.
336 <I>option</I> <I>spam-trap-accept</I>
337 <I>option</I> <I>spam-trap-reject</I>
338 say that mail should be reported to the DCC server as
339 extremely bulk or with target counts of <I>MANY</I>. Greylisting,
340 DNS blacklist (DNSBL), and other checks are turned off.
341 <I>Spam-trap-accept</I> tells the MTA to accept the message while
342 <I>spam-trap-reject</I> tells the MTA to reject the message. Use
343 <I>Spam-trap-accept</I> for spam traps that should not be dis-
344 closed. <I>Spam-trap-reject</I> can be used on <I>catch-all</I> mail-
345 boxes that might receive legitimate mail by typographical
346 errors and that senders should be told about.
347
348 In the absence of explicit settings, the default in the main
349 whiteclnt file is equivalent to
350 <I>option</I> <I>log-normal</I>
351 <I>option</I> <I>dcc-on</I>
352 <I>option</I> <I>greylist-on</I>
353 <I>option</I> <I>greylist-log-on</I>
354 <I>option</I> <I>DCC-rep-off</I>
355 <I>option</I> <I>DNSBL1-off</I>
356 <I>option</I> <I>DNSBL2-off</I>
357 <I>option</I> <I>DNSBL3-off</I>
358 <I>option</I> <I>MTA-last</I>
359 <I>option</I> <I>no-forced-discard</I>
360 The defaults for individual recipient <I>whiteclnt</I> files are the
361 same except as changed by explicit settings in the main file.
362
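The line forms of a whitelist file described above can be told apart with a
small classifier. The Python sketch below is only a reading aid: the function
name and the sample lines are invented, matching keywords without regard to
case is an assumption, and only whole-line comments are recognized. It does
not validate values or reproduce the parsing done by the DCC programs.

```python
# Keywords that can start a "count value" line, as listed above.
COUNTS = {"MANY", "OK", "OK2", "MX", "MXDCC", "SUBMIT"}


def classify(line):
    """Return (kind, fields) for one whitelist line."""
    text = line.strip()
    if not text:
        return ("blank", [])
    if text.startswith("#"):
        return ("comment", [text])
    words = text.split(None, 2)
    keyword = words[0]
    if keyword.lower() == "include":
        return ("include", words[1:])
    if keyword.lower() == "option":
        return ("option", words[1:])
    if keyword.upper() in COUNTS:
        # words[1] is the value type (env_From, ip, Hex, ...) followed by
        # the value; MX, MXDCC and SUBMIT take only an address or host name.
        return ("count", [keyword.upper()] + words[1:])
    return ("unknown", [text])


# Sample whiteclnt lines invented for this example.
sample = [
    "# solicited mailing lists",
    "OK env_From listmaster@example.org",
    "MX 192.0.2.10",
    "option log-all",
    "include whitecommon",
]
kinds = [classify(line) for line in sample]
```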
363 Checksums of the IP address of the SMTP client sending a mail message are
364 practically unforgeable, because it is impractical for an SMTP client to
365 "spoof" its address or pretend to use some other IP address. That would
366 make the IP address of the sender useful for whitelisting, except that
367 the IP address of the SMTP client is often not available to users of
368 <B><A HREF="dccproc.html">dccproc(8)</A></B>. In addition, legitimate mail relays make whitelist entries
369 for IP addresses of little use. For example, the IP address from which a
370 message arrived might be that of a local relay instead of the home
371 address of a whitelisted mailing list.
372
373 Envelope and header <I>From</I> values can be forged, so whitelist entries for
374 their checksums are not entirely reliable.
375
376 Checksums of <I>env</I><B>_</B><I>To</I> values are never sent to DCC servers. They are valid
377 in only <I>whiteclnt</I> files and used only by <B><A HREF="dccm.html">dccm(8)</A></B>, <B><A HREF="dccifd.html">dccifd(8)</A></B>, and
378 <B><A HREF="dccproc.html">dccproc(8)</A></B> when the envelope <I>Rcpt</I> <I>To</I> value is known.
379
380 <A NAME="Greylists"><B>Greylists</B></A>
381 The DCC server, <B><A HREF="dccd.html">dccd(8)</A></B>, can be used to maintain a greylist database for
382 some DCC clients including <B><A HREF="dccm.html">dccm(8)</A></B> and <B><A HREF="dccifd.html">dccifd(8)</A></B>. Greylisting involves
383 temporarily refusing mail from unfamiliar SMTP clients and is unrelated
384 to filtering with a Distributed Checksum Clearinghouse.
385 See http://projects.puremagic.com/greylisting/
386
387 <A NAME="Privacy"><B>Privacy</B></A>
388 Because sending mail is a less private act than receiving it, and because
389 sending bulk mail is usually not private at all and cannot be very pri-
390 vate, the DCC tries first to protect the privacy of mail recipients, and
391 second the privacy of senders of mail that is not bulk.
392
393 DCC clients necessarily disclose some information about mail they have
394 received. The DCC database contains checksums of mail bodies, header
395 lines, and source addresses. While it contains significantly less infor-
396 mation than is available by "snooping" on Internet links, it is important
397 that the DCC database be treated as containing sensitive information and
398 to not put the most private information in the DCC database. Given the
399 contents of a message, one might determine whether that message has been
400 received by a system that subscribes to the DCC. Guesses about the
401 sender and addressee of a message can also be validated if the checksums
402 of the message have been sent to a DCC server.
403
404 Because the DCC is distributed, organizations can operate their own DCC
405 servers, and configure them to share or "flood" only the checksums of
406 bulk mail that is not in local whitelists.
407
408 DCC clients should not report the checksums of messages known to be pri-
409 vate to a DCC server. For example, checksums of messages local to a sys-
410 tem or that are otherwise known a priori to not be unsolicited bulk
411 should not be sent to a remote DCC server. This can be accomplished by
412 adding entries for the sender to the client's local whitelist file.
413 Client whitelist files can also include entries for email recipients
414 whose mail should not be reported to a DCC server.
415
416 <A NAME="Security"><B>Security</B></A>
417 Whenever considering security, one must first consider the risks. The
418 worst DCC security problems are unauthorized commands to a DCC service,
419 denial of the DCC service, and corruption of DCC data. The worst that
420 can be done with remote commands to a DCC server is to turn it off or
421 otherwise cause it to stop responding. The DCC is designed to fail
422 gracefully, so that a denial of service attack would at worst allow
423 delivery of mail that would otherwise be rejected. Corruption of DCC
424 data might at worst cause mail that is already somewhat "bulk" by virtue
425 of being received by two or more people to appear to have higher recipient
426 numbers. Since DCC users <I>must</I> whitelist all sources of legitimate bulk
427 mail, this is also not a concern. Such security risks should be
428 addressed, but only with defenses that don't cost more than the possible
429 damage from an attack.
430
431 The DCC must contend with senders of unsolicited bulk mail who resort to
432 unlawful actions to express their displeasure at having their advertising
433 blocked. Because the DCC protocol is based on UDP, an unhappy advertiser
434 could try to flood a DCC server with packets supposedly from subscribers
435 or non-subscribers. DCC servers defend against that attack by rate-lim-
436 iting requests from anonymous users.
437
438 Also because of the use of UDP, clients must be protected against forged
439 answers to their queries. Otherwise an unsolicited bulk mail advertiser
440 could send a stream of "not spam" answers to an SMTP client while simul-
441 taneously sending mail that would otherwise be rejected. This is not a
442 problem for authenticated clients of the DCC because they share a secret
443 with the DCC. Unauthenticated, anonymous DCC clients do not share any
444 secrets with the DCC, except for unique and unpredictable bits in each
445 query or report sent to the DCC. Therefore, DCC servers cryptographi-
446 cally sign answers to unauthenticated clients with bits from the corre-
447 sponding queries. This protects against attackers that do not have
448 access to the stream of packets from the DCC client.
449
450 The passwords or shared secrets used in the DCC client and server pro-
451 grams are "cleartext" for several reasons. In any shared secret authen-
452 tication system, at least one party must know the secret or keep the
453 secret in cleartext. You could encrypt the secrets in a file, but
454 because they are used by programs, you would need a cleartext copy of the
455 key to decrypt the file somewhere in the system, making such a scheme
456 more expensive but no more secure than a file of cleartext passwords.
457 Asymmetric systems such as that used in UNIX allow one party to not know
458 the secrets, but they must be and are designed to be computationally
459 expensive when used in applications like the DCC that involve thousands
460 or more authentication checks per second. Moreover, because of "dictio-
461 nary attacks," asymmetric systems are now little more secure than keeping
462 passwords in cleartext. An adversary can compare the hash values of com-
463 binations of common words with /etc/passwd hash values to look for bad
464 passwords. Worse, by the nature of a client/server protocol like that
465 used in the DCC, clients must have the cleartext password. Since it is
466 among the more numerous and much less secure clients that adversaries
467 would seek files of DCC passwords, it would be a waste to complicate the
468 DCC server with an asymmetric system.
469
470 The DCC protocol is vulnerable to dictionary attacks to recover pass-
471 words. An adversary could capture some DCC packets, and then check to
472 see if any of the 100,000 to 1,000,000 passwords in so-called "cracker
473 dictionaries" applied to a packet generated the same signature. This is
474 a concern only if DCC passwords are poorly chosen, such as any combina-
475 tion of words in an English dictionary. There are ways to prevent this
476 vulnerability regardless of how badly passwords are chosen, but they are
477 computationally expensive and require additional network round trips.
478 Since DCC passwords are created and typed into files once and do not need
479 to be remembered by people, it is cheaper and quite easy to simply choose
480 good passwords that are not in dictionaries.
481
482 <A NAME="Reliability"><B>Reliability</B></A>
483 It is better to fail to filter unsolicited bulk mail than to fail to
484 deliver legitimate mail, so DCC clients fail in the direction of assuming
485 that mail is legitimate or even whitelisted.
486
487 A DCC client sends a report or other request and waits for an answer. If
488 no answer arrives within a reasonable time, the client retransmits.
489 There are many things that might result in the client not receiving an
490 answer, but the most important is packet loss. If the client's request
491 does not reach the server, it is easy and harmless for the client to
492 retransmit. If the client's request reached the server but the server's
493 response was lost, a retransmission to the same server would be misunder-
494 stood as a new report of another copy of the same message unless it is
495 detected as a retransmission by the server. The DCC protocol includes
496 transaction identifiers for this purpose. If the client retransmitted
497 to a second server, the retransmission would be misunderstood by the sec-
498 ond server as a new report of the same message.
499
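The role of transaction identifiers in the paragraph above can be illustrated
with a toy server in Python. The class, the key made of client name and
transaction ID, and the checksum string are assumptions made for this sketch;
the real DCC packet format and server bookkeeping are not shown.

```python
class ReportServer:
    """Toy server that counts each (client, transaction ID) pair once."""

    def __init__(self):
        self.totals = {}
        self.answers = {}  # (client, transaction ID) -> answer already sent

    def handle(self, client, transaction_id, checksum, targets):
        key = (client, transaction_id)
        if key in self.answers:
            # A retransmission: repeat the earlier answer, count nothing.
            return self.answers[key]
        total = self.totals.get(checksum, 0) + targets
        self.totals[checksum] = total
        self.answers[key] = total
        return total


server = ReportServer()
first = server.handle("client-a", 7, "fuz1:1234", 1)  # original request
retry = server.handle("client-a", 7, "fuz1:1234", 1)  # answer was lost
later = server.handle("client-a", 8, "fuz1:1234", 1)  # a genuinely new copy
```

A second server started from scratch has no record of transaction 7, so the
same retransmission sent there would be counted as a new report, as the text
above warns.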
500 Each request from a client includes a timestamp to aid the client in mea-
501 suring the round trip time to the server and to let the client pick the
502 closest server. Clients monitor the speed of all of the servers they
503 know including those they are not currently using, and use the quickest.
504
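Choosing the quickest server from measured round trip times might look like
the following Python sketch. The record layout, the host names, and the
timings are hypothetical; the actual contents of the shared map file and the
way the DCC clients smooth their measurements are not described here.

```python
def pick_server(servers):
    """Return the name of the working server with the lowest RTT, or None."""
    working = [s for s in servers if s["working"]]
    if not working:
        return None
    return min(working, key=lambda s: s["rtt"])["name"]


# Hypothetical measurements, in seconds, for three known servers.
known = [
    {"name": "dcc1.example.com", "rtt": 0.120, "working": True},
    {"name": "dcc2.example.com", "rtt": 0.035, "working": False},
    {"name": "dcc3.example.com", "rtt": 0.060, "working": True},
]
best = pick_server(known)
```

Here dcc2.example.com has the lowest round trip time but is skipped because it
is marked as not working, so the client would use dcc3.example.com.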
505 <A NAME="Client-and-Server-IDs"><B>Client and Server-IDs</B></A>
506 Servers and clients use numbers or IDs to identify themselves. ID 1 is
507 reserved for anonymous, unauthenticated clients. All other IDs are asso-
508 ciated with a pair of passwords in the <I>ids</I> file, the current and next or
509 previous and current passwords. Clients include their client IDs in
510 their messages. When they are not using the anonymous ID, they sign
511 their messages to servers with the first password associated with their
512 client-ID. Servers treat messages with signatures that match neither of
513 the passwords for the client-ID in their own <I>ids</I> file as if the client
514 had used the anonymous ID.
515
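The rule that a request signed with neither password is treated as anonymous
can be sketched in Python as follows. HMAC-SHA256 is used only as a stand-in
for a keyed signature and is not a description of the DCC packet signature;
the function names, the client-ID 1001, and the passwords are invented for
the example.

```python
import hashlib
import hmac

ANONYMOUS_ID = 1  # reserved for unauthenticated clients, per the text above


def sign(password, message):
    # Stand-in keyed signature, assumed for illustration only.
    return hmac.new(password.encode(), message, hashlib.sha256).digest()


def effective_client_id(ids, client_id, message, signature):
    """Return client_id if either password matches, else the anonymous ID."""
    for password in ids.get(client_id, ()):
        if hmac.compare_digest(sign(password, message), signature):
            return client_id
    return ANONYMOUS_ID


# Hypothetical ids table: client-ID -> pair of passwords.
ids = {1001: ("current-secret", "next-secret")}
request = b"report of checksums"
good = effective_client_id(ids, 1001, request, sign("current-secret", request))
bad = effective_client_id(ids, 1001, request, sign("guess", request))
```

Accepting either password of the pair lets an operator change a password on
the server and on its clients at different times without locking anyone out.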
516 Each server has a unique <I>server-ID</I> less than 32768. Servers use their
517 IDs to identify checksums that they <I>flood</I> to other servers. Each server
518 expects local clients sending administrative commands to use the server's
519 ID and sign administrative commands with the associated password.
520
521 Server-IDs must be unique among all systems that share reports by
522 "flooding." All servers must be told of the IDs of all other servers whose
523 reports can be received in the local <I>@prefix@/flod</I> file described in
524 <B><A HREF="dccd.html">dccd(8)</A></B>. However, server-IDs can be mapped during flooding between inde-
525 pendent DCC organizations.
526
527 <I>Passwd-IDs</I> are server-IDs that should not be assigned to servers. They
528 appear in the often publicly readable <I>@prefix@/flod</I> and specify passwords
529 in the private <I>@prefix@/ids</I> file for the inter-server flooding protocol.
530
531 The client identified by a <I>client-ID</I> might be a single computer with a
532 single IP address, a single but multi-homed computer, or many computers.
533 Client-IDs are not used to identify checksum reports, but the organiza-
534 tion operating the client. A client-ID need only be unique among clients
535 using a single server. A single client can use different client-IDs for
536 different servers, each client-ID authenticated with a separate password.
537
538 An obscure but important part of all of this is that the inter-server
539 flooding algorithm depends on server-IDs and timestamps attached to
540 reports of checksums. The inter-server flooding mechanism requires coop-
541 erating DCC servers to maintain reasonable clocks ticking in UTC.
542 Clients include timestamps in their requests, but as long as their time-
543 stamps are unlikely to be repeated, they need not be very accurate.
544
545 <A NAME="Installation-Considerations"><B>Installation Considerations</B></A>
546 DCC clients on a computer share information about which servers are cur-
547 rently working and their speeds in a shared memory segment. This segment
548 also contains server host names, IP addresses, and the passwords needed
549 to authenticate known clients to servers. That generally requires that
550 <B><A HREF="dccm.html">dccm(8)</A></B>, <B><A HREF="dccproc.html">dccproc(8)</A></B>, <B><A HREF="dccifd.html">dccifd(8)</A></B>, and <B><A HREF="cdcc.html">cdcc(8)</A></B> execute with a UID that can
551 write to the DCC home directory and its files. The sendmail interface,
552 dccm, is a daemon that can be started by an "rc" or other script already
553 running with the correct UID. Two others, dccproc and cdcc, need to be
554 set-UID because they are used by end users. They relinquish set-UID
555 privileges when not needed.
556
557 Files that contain cleartext passwords including the shared file used by
558 clients must be readable only by "owner."
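
One way to audit this requirement is sketched below in Python. It is an
illustration only, not a tool shipped with the DCC. The helper name
readable_only_by_owner is hypothetical, and the check is deliberately
stricter than the text requires: it rejects any group or other
permission bit, not just read access.

```python
import os
import stat


def readable_only_by_owner(path):
    """True if group and others have no permission bits on path.

    Hypothetical helper for illustration; not part of the DCC distribution.
    """
    # stat.filemode() renders the mode as ten characters, e.g. "-rw-------".
    # The last six characters are the group and other permission flags.
    mode_text = stat.filemode(os.stat(path).st_mode)
    return mode_text[4:] == "------"
```

A file with mode 0600 passes this check, while a file with mode 0644
does not.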
559
560 The data files required by a DCC can be in a single "home" directory,
561 <I>@prefix@</I>. Distinct DCC servers can run on a single computer, provided
562 they use distinct UDP port numbers and home directories. It is possible
563 and convenient for the DCC clients using a server on the same computer to
564 use the same home directory as the server.
565
566 The DCC source distribution includes sample control files. They should
567 be modified appropriately and then copied to the DCC home directory.
568 Files that contain cleartext passwords must not be publicly readable.
569
570 The DCC source includes "feature" m4 files to configure sendmail to use
571 <B><A HREF="dccm.html">dccm(8)</A></B> to check a DCC server about incoming mail.
572
573 See also the <A HREF="INSTALL.html">INSTALL.html</A> file.
574
575 <A NAME="Client-Installation"><B>Client Installation</B></A>
576 Installing a DCC client starts with obtaining or compiling program bina-
577 ries for the client server data control tool, <B><A HREF="cdcc.html">cdcc(8)</A></B>. Installing the
578 sendmail DCC interface, <B><A HREF="dccm.html">dccm(8)</A></B>, or <B><A HREF="dccproc.html">dccproc(8)</A></B>, the general or
579 <B>procmail(1)</B> interface, is the main part of the client installation. Con-
580 necting the DCC to sendmail with dccm is most powerful, but requires
581 administrative control of the system running sendmail.
582
583 As noted above, cdcc and dccproc should be set-UID to a suitable UID.
584 Root or 0 is thought to be safe for both, because they are careful to
585 release privileges except when they need them to read or write files in
586 the DCC home directory. A DCC home directory, <I>@prefix@</I>, should be cre-
587 ated. It must be owned and writable by the UID to which cdcc is set.
588
589 After the DCC client programs have been obtained, contact the operator(s)
590 of the chosen DCC server(s) to obtain each server's hostname, port num-
591 ber, and a <I>client-ID</I> and corresponding password. No client-IDs or pass-
592 words are needed to use DCC servers that allow anonymous clients. Use the
593 <I>load</I> or <I>add</I> commands of cdcc to create a <I>map</I> file in the DCC home direc-
594 tory. It is usually necessary to create a client whitelist file of the
595 format described above. To accommodate users sharing a computer but not
596 ideas about what is solicited bulk mail, the client whitelist file can be
597 any valid path name and need not be in the DCC home directory.
598
599 If dccm is chosen, arrange to start it with suitable arguments before
600 sendmail is started. See the <I>homedir/dcc</I><B>_</B><I>conf</I> file and the <I>misc/rcDCC</I>
601 script in the DCC source. The procmail DCC interface, <B><A HREF="dccproc.html">dccproc(8)</A></B>, can
602 be run manually or by a <B>procmailrc(5)</B> rule.
603
604 <A NAME="Server-Installation"><B>Server Installation</B></A>
605 The DCC server, <B><A HREF="dccd.html">dccd(8)</A></B>, also requires that the DCC home directory exist.
606 It does not use the client shared or memory mapped file of server
607 addresses, but it requires other files. One is the <I>@prefix@/ids</I> file of
608 client-IDs, server-IDs, and corresponding passwords. Another is a <I>flod</I>
609 file of peers that send and receive floods of reports of checksums with
610 large counts. Both files are described in <B><A HREF="dccd.html">dccd(8)</A></B>.
611
612 The server daemon should be started when the system is rebooted, probably
613 before sendmail. See the <I>misc/rcDCC</I> and <I>misc/start-dccd</I> files in the DCC
614 source.
615
616 The database should be cleaned regularly with <B><A HREF="dbclean.html">dbclean(8)</A></B>, such as by run-
617 ning the crontab job that is in the misc directory.
618
619
620 </PRE>
621 <H2><A NAME="SEE-ALSO">SEE ALSO</A></H2><PRE>
622 <B><A HREF="cdcc.html">cdcc(8)</A></B>, <B><A HREF="dbclean.html">dbclean(8)</A></B>, <B><A HREF="dcc.html">dcc(8)</A></B>, <B><A HREF="dccd.html">dccd(8)</A></B>, <B><A HREF="dccifd.html">dccifd(8)</A></B>, <B><A HREF="dccm.html">dccm(8)</A></B>, <B><A HREF="dccproc.html">dccproc(8)</A></B>,
623 <B><A HREF="dblist.html">dblist(8)</A></B>, <B><A HREF="dccsight.html">dccsight(8)</A></B>, <B>sendmail(8)</B>.
624
625
626 </PRE>
627 <H2><A NAME="HISTORY">HISTORY</A></H2><PRE>
628 Distributed Checksum Clearinghouses are based on an idea of Paul Vixie
629 with code designed and written at Rhyolite Software starting in 2000.
630 This document describes version 1.3.103.
631
632 February 26, 2009
633 </PRE>
634 <HR>
635 <ADDRESS>
636 Man(1) output converted with
637 <a href="http://www.oac.uci.edu/indiv/ehood/man2html.html">man2html</a>
638 modified for the DCC $Date 2001/04/29 03:22:18 $
639 <BR>
640 <A HREF="http://www.dcc-servers.net/dcc/">
641 <IMG SRC="http://logos.dcc-servers.net/border.png"
642 class=logo ALT="DCC logo">
643 </A>
644 <A HREF="http://validator.w3.org/check?uri=referer">
645 <IMG class=logo ALT="Valid HTML 4.01 Strict"
646 SRC="http://www.w3.org/Icons/valid-html401">
647 </A>
648 </ADDRESS>
649 </BODY>
650 </HTML>