comparison dcc.8.in @ 0:c7f6b056b673
First import of vendor version
author | Peter Gervai <grin@grin.hu> |
---|---|
date | Tue, 10 Mar 2009 13:49:58 +0100 |
parents | |
children |
comparison
-1:000000000000 | 0:c7f6b056b673 |
---|---|
1 .\" Copyright (c) 2008 by Rhyolite Software, LLC | |
2 .\" | |
3 .\" This agreement is not applicable to any entity which sells anti-spam | |
4 .\" solutions to others or provides an anti-spam solution as part of a | |
5 .\" security solution sold to other entities, or to a private network | |
6 .\" which employs the DCC or uses data provided by operation of the DCC | |
7 .\" but does not provide corresponding data to other users. | |
8 .\" | |
9 .\" Permission to use, copy, modify, and distribute this software without | |
10 .\" changes for any purpose with or without fee is hereby granted, provided | |
11 .\" that the above copyright notice and this permission notice appear in all | |
12 .\" copies and any distributed versions or copies are either unchanged | |
13 .\" or not called anything similar to "DCC" or "Distributed Checksum | |
14 .\" Clearinghouse". | |
15 .\" | |
16 .\" Parties not eligible to receive a license under this agreement can | |
17 .\" obtain a commercial license to use DCC by contacting Rhyolite Software | |
18 .\" at sales@rhyolite.com. | |
19 .\" | |
20 .\" A commercial license would be for Distributed Checksum and Reputation | |
21 .\" Clearinghouse software. That software includes additional features. This | |
22 .\" free license for Distributed ChecksumClearinghouse Software does not in any | |
23 .\" way grant permision to use Distributed Checksum and Reputation Clearinghouse | |
24 .\" software | |
25 .\" | |
26 .\" THE SOFTWARE IS PROVIDED "AS IS" AND RHYOLITE SOFTWARE, LLC DISCLAIMS ALL | |
27 .\" WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES | |
28 .\" OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL RHYOLITE SOFTWARE, LLC | |
29 .\" BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES | |
30 .\" OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, | |
31 .\" WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, | |
32 .\" ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS | |
33 .\" SOFTWARE. | |
34 .\" | |
35 .\" Rhyolite Software DCC 1.3.103-1.112 $Revision$ | |
36 .\" | |
37 .Dd February 26, 2009 | |
38 .ds volume-ds-DCC Distributed Checksum Clearinghouse | |
39 .Dt DCC 8 DCC | |
40 .Os " " | |
41 .Sh NAME | |
42 .Nm DCC | |
43 .Nd Distributed Checksum Clearinghouse | |
44 .Sh DESCRIPTION | |
45 The Distributed Checksum Clearinghouse or | |
46 .Nm | |
47 is a cooperative, distributed | |
48 system intended to detect "bulk" mail or mail sent to many people. | |
49 It allows individuals receiving a single mail message to determine | |
50 that many | |
51 other people have received essentially identical copies of the message | |
52 and so reject or discard the message. | |
53 .Pp | |
54 Source for the server, client, and utilities | |
55 is available at Rhyolite Software, LLC, http://www.rhyolite.com/dcc/ | |
56 It is free for organizations that do not sell spam or virus filtering | |
57 services. | |
58 .Ss How the DCC Is Used | |
59 The DCC can be viewed as a tool for end users to enforce their | |
60 right to "opt-in" to streams of bulk mail | |
61 by refusing bulk mail except from sources in a "whitelist." | |
62 Whitelists are the responsibility of DCC clients, | |
63 since only they know which bulk mail they solicited. | |
64 .Pp | |
65 False positives or mail marked as bulk by a DCC server that | |
66 is not bulk occur only when a recipient of a message reports it | |
67 to a DCC server as having been received many times | |
68 or when the "fuzzy" checksums of differing messages are the same. | |
69 The fuzzy checksums ignore aspects of messages in order to compute | |
70 identical checksums for substantially identical messages. | |
71 The fuzzy checksums are designed to ignore only | |
72 differences that do not affect meanings. | |
73 So in practice, you do not need to worry about DCC false positive indications | |
74 of "bulk," but not all bulk mail is unsolicited bulk mail or spam. | |
75 You must either use whitelists to distinguish solicited from unsolicited bulk | |
76 mail | |
77 or only use DCC indications of "bulk" as part of a scoring system such | |
78 as SpamAssassin. | |
79 Besides unsolicited bulk email or spam, | |
80 bulk messages include legitimate mail such as | |
81 order confirmations from merchants, | |
82 legitimate mailing lists, | |
83 and empty or test messages. | |
84 .Pp | |
85 A DCC server estimates the number of copies of a | |
86 message by counting checksums reported by DCC clients. | |
87 Each client must decide which | |
88 bulk messages are unsolicited and what degree of "bulkiness" is objectionable. | |
89 Client DCC software marks, rejects, or discards mail that is bulk | |
90 according to local thresholds on target addresses from DCC servers | |
91 and unsolicited according to local whitelists. | |
92 .Pp | |
93 DCC servers are usually configured to receive reports from as many targets | |
94 as possible, including sources that cannot be trusted to not exaggerate the | |
95 number of copies of a message they see. | |
96 A user of a DCC client angry about receiving a message could report it with | |
97 1,000,000 separate DCC reports | |
98 or with a single report claiming 1,000,000 targets. | |
99 An unprincipled user could subscribe a "spam trap" to mailing lists | |
100 such as those of the IETF or CERT. | |
101 Such abuses of the system are not problems, | |
102 because much legitimate mail is "bulk." | |
103 You cannot reject bulk mail unless you have a whitelist of sources | |
104 of legitimate bulk mail. | |
105 .Pp | |
106 DCC can also be used by an Internet service provider to detect bulk | |
107 mail coming from its own customers. | |
108 In such circumstances, the DCC client might be configured to only log | |
109 bulk mail from unexpected (not whitelisted) customers. | |
110 .Ss What the DCC Is | |
111 A DCC server accumulates counts of cryptographic checksums of | |
112 messages but not the messages themselves. | |
113 It exchanges reports of frequently seen checksums with other servers. | |
114 DCC clients send reports of checksums related to incoming mail to | |
115 a nearby DCC server running | |
116 .Xr dccd 8 . | |
117 Each report from a client includes the number of recipients for the message. | |
118 A DCC server accumulates the reports and responds to clients with | |
119 the current total number of recipients for each checksum. | |
120 The client adds an SMTP header to incoming mail containing the total | |
121 counts. | |
122 It then discards or rejects mail that is not whitelisted and has | |
123 counts that exceed local thresholds. | |
124 .Pp | |
125 A special value of the number of addressees is "MANY" and means | |
126 it is certain that this message was bulk and might be unsolicited, | |
127 perhaps because it came from a locally blacklisted source or was | |
128 addressed to an invalid address or "spam trap." | |
129 The special value "MANY" is merely the largest value | |
130 that fits in the fixed-size field containing the count of addressees. | |
131 That "infinity" accumulated total can be reached with millions of | |
132 independent reports as well as with one or two. | |
133 .Pp | |
134 DCC servers | |
135 .Em flood | |
136 or send | |
137 reports of checksums of bulk mail to neighboring servers. | |
138 .Pp | |
139 To keep a server's database of checksums from growing without bound, | |
140 checksums are forgotten when they become old. | |
141 Checksums of bulk mail are kept longer. | |
142 See | |
143 .Xr dbclean 8 . | |
144 .Pp | |
145 DCC clients pick the nearest working DCC server using a small shared | |
146 or memory mapped file, | |
147 .Pa @prefix@/map . | |
148 It contains server names, port numbers, passwords, recent performance | |
149 measures, and so forth. | |
150 This file allows clients to use quick retransmission timeouts | |
151 and to waste little time on servers that have temporarily | |
152 stopped working or become unreachable. | |
153 The utility program | |
154 .Xr cdcc 8 | |
155 is used to maintain this file as well as to check the health of servers. | |
156 .Ss X-DCC Headers | |
157 The DCC software includes several programs used by clients. | |
158 .Xr Dccm 8 | |
159 uses the sendmail "milter" interface to query a DCC server, | |
160 add header lines to incoming mail, | |
161 and reject mail whose total checksum counts are high. | |
162 Dccm is intended to be run with SMTP servers using sendmail. | |
163 .Pp | |
164 .Xr Dccproc 8 | |
165 adds header lines to mail presented by file name or | |
166 .Pa stdin , | |
167 but relies on other programs | |
168 such as procmail to deal with mail with large counts. | |
169 .Xr Dccsight 8 | |
170 is similar but deals with previously computed checksums. | |
171 .Pp | |
172 .Xr Dccifd 8 | |
173 is similar to dccproc but is not run separately for each mail message | |
174 and so is far more efficient. | |
175 It receives mail messages via a socket somewhat like dccm, | |
176 but with a simpler protocol that can be used by Perl scripts | |
177 or other programs. | |
178 .Pp | |
179 DCC SMTP header lines are of one of the forms: | |
180 .Bd -literal -offset 2n | |
181 X-DCC-brand-Metrics: client server-ID; bulk cknm1=count cknm2=count ... | |
182 X-DCC-brand-Metrics: client; whitelist | |
183 .Ed | |
184 where | |
185 .Bl -hang -offset 3n -compact | |
186 .It Em whitelist | |
187 appears if the global or per-user | |
188 .Pa whiteclnt | |
189 file marks the message as good. | |
190 .It Em brand | |
191 is the "brand name" of the DCC server, such as "RHYOLITE". | |
192 .It Em client | |
193 is the name or IP address of the DCC client that added the | |
194 header line to the SMTP message. | |
195 .It Em server-ID | |
196 is the numeric ID of the DCC server that the DCC client contacted. | |
197 .It Em bulk | |
198 is present if one or more checksum counts exceeded the DCC client's | |
199 thresholds to make the message "bulky." | |
200 .It Em bulk rep | |
201 is present if the DCC reputation of the IP address of the sender is bad. | |
202 .It Em cknm1 , Ns Em cknm2 , Ns ... | |
203 are types of checksums: | |
204 .Bl -hang -offset 2n -width "Message-IDx" -compact | |
205 .It Em IP | |
206 address of SMTP client | |
207 .It Em env_From | |
208 SMTP envelope value | |
209 .It Em From | |
210 SMTP header line | |
211 .It Em Message-ID | |
212 SMTP header line | |
213 .It Em Received | |
214 last Received: header line in the SMTP message | |
215 .It Em substitute | |
216 SMTP header line chosen by the DCC client, prefixed with the name of | |
217 the header | |
218 .It Em Body | |
219 SMTP body ignoring white-space | |
220 .It Em Fuz1 | |
221 filtered or "fuzzy" body checksum | |
222 .It Em Fuz2 | |
223 another filtered or "fuzzy" body checksum | |
224 .It Em rep | |
225 DCC reputation of the mail sender or the estimated | |
226 probability that the message is bulk. | |
227 .El | |
228 Counts for | |
229 .Em IP , env_From , From , | |
230 .Em Message-Id , Received , | |
231 and | |
232 .Em substitute | |
233 checksums are omitted by the DCC client if the server | |
234 says it has no information. | |
235 Counts for | |
236 .Em Fuz1 | |
237 and | |
238 .Em Fuz2 | |
239 are omitted if the message body is empty or | |
240 contains too little of the right kind of information | |
241 for the checksum to be computed. | |
242 .It Em count | |
243 is the total number of recipients of messages with that | |
244 checksum reported directly or indirectly to the DCC server. | |
245 The special count "MANY" means that DCC clients have claimed that | |
246 the message is directed at millions of recipients. | |
247 "MANY" implies the message is definitely bulk, but not necessarily unsolicited. | |
248 The special counts "OK" and "OK2" mean the checksum has been | |
249 marked "good" or "half-good" by DCC servers. | |
250 .El | |
251 .Pp | |
252 .Ss Mailing lists | |
253 Legitimate mailing list traffic differs from spam only in being solicited | |
254 by recipients. | |
255 Each client should have a private whitelist. | |
256 .Pp | |
257 DCC whitelists can also mark mail as unsolicited bulk using | |
258 blacklist entries for commonly forged values such as "From: user@public.com". | |
259 .Ss White and Blacklists | |
260 DCC server and client whitelist files share a common format. | |
261 Server files are always named | |
262 .Pa whitelist | |
263 and one is required to be in the DCC home directory | |
264 with the other server files. | |
265 Client whitelist files are | |
266 named | |
267 .Pa whiteclnt | |
268 in the DCC home directory or a subdirectory specified with the | |
269 .Fl U | |
270 option for | |
271 .Xr dccm 8 . | |
272 They specify mail that should not be reported to a DCC server or that is | |
273 always unsolicited and almost certainly bulk. | |
274 .Pp | |
275 A DCC whitelist file contains blank lines, comments starting | |
276 with "#", | |
277 and lines of the following forms: | |
278 .Bl -tag -offset 2n -width 4n -compact | |
279 .It Ar include file | |
280 Copies the contents of | |
281 .Ar file | |
282 into the whitelist. | |
283 It can occur only in the main whitelist or whiteclnt file and not in an | |
284 included file. | |
285 The file name should be absolute or relative to the DCC home directory. | |
286 .Pp | |
287 .It Ar count Em value | |
288 lines specify checksums that should be white- or blacklisted. | |
289 .Bl -inset -offset 2n -compact | |
290 .It Ar count Em env_From Ar 821-path | |
291 .It Ar count Em env_To Ar dest-mailbox | |
292 .It Ar count Em From Ar 822-mailbox | |
293 .It Ar count Em Message-ID Ar <string> | |
294 .It Ar count Em Received Ar string | |
295 .It Ar count Em Substitute Ar header string | |
296 .It Ar count Ar Hex ctype cksum | |
297 .It Ar count Em ip Ar IP-address | |
298 .El | |
299 .Pp | |
300 .Bl -tag -offset 2n -width 4n -compact | |
301 .It Ar MANY Em value | |
302 indicates that millions of targets have received messages with | |
303 the header, IP address, or checksum | |
304 .Em value . | |
305 .It Ar OK Em value | |
306 .It Ar OK2 Em value | |
307 say that messages with | |
308 the header, IP address, or checksum | |
309 .Em value | |
310 are OK and should not be reported to DCC servers | |
311 or be greylisted. | |
312 .Ar OK2 | |
313 says that the message is "half OK." | |
314 Two | |
315 .Ar OK2 | |
316 checksums associated with a message are equivalent to one | |
317 .Ar OK . | |
318 .br | |
319 A DCC server never shares or | |
320 .Em floods | |
321 reports containing checksums | |
322 marked in its whitelist with OK or OK2 to other servers. | |
323 A DCC client does not report or ask its server about messages | |
324 with a checksum marked OK or OK2 in the client whitelist. | |
325 This is intended to allow a DCC client to keep private mail | |
326 so private that even its checksums are not disclosed. | |
327 .It Ar MX Em IP-address-or-hostname | |
328 .It Ar MXDCC Em IP-address-or-hostname | |
329 mark an address or block of addresses of trusted mail relays including | |
330 MX servers, smart hosts, and bastion or DMZ relays. | |
331 The DCC clients | |
332 .Xr dccm 8 , | |
333 .Xr dccifd 8 , | |
334 and | |
335 .Xr dccproc 8 | |
336 parse and skip initial Received: headers added by listed MX servers to | |
337 determine the external sources of mail messages. | |
338 Unsolicited bulk mail that has been forwarded through listed addresses | |
339 is discarded by | |
340 .Xr dccm 8 | |
341 and | |
342 .Xr dccifd 8 | |
343 as if with | |
344 .Fl a Ar DISCARD | |
345 instead of rejected. | |
346 .Ar MXDCC | |
347 marks addresses that are MX servers that run DCC clients. | |
348 The checksums for a mail message that has been forwarded through | |
349 an address listed as MXDCC | |
350 are queried instead of reported. | |
351 .It Ar SUBMIT Em IP-address-or-hostname | |
352 marks an IP address or block of addresses of SMTP submission clients | |
353 such as web browsers | |
354 that cannot tolerate 4yz temporary rejections | |
355 but that cannot be trusted to not send spam. | |
356 Since they are local addresses, DCC Reputations are not computed for them. | |
357 .El | |
358 .Pp | |
359 .Ar value | |
360 in | |
361 .Ar count Em value | |
362 lines can be | |
363 .Bl -tag -offset 2n -width 4n -compact | |
364 .It Ar dest-mailbox | |
365 is an RFC\ 821 address or a local user name. | |
366 .It Ar 821-path | |
367 is an RFC\ 821 address. | |
368 .It Ar 822-mailbox | |
369 is an RFC\ 822 address with optional name. | |
370 .It Em Substitute Ar header | |
371 is the name of an SMTP header such as "Sender" or | |
372 the name of one of two SMTP envelope values, "HELO," or | |
373 "Mail_Host" for the resolved host name from the | |
374 .Ar 821-path | |
375 in | |
376 the message. | |
377 .It Ar Hex ctype cksum | |
378 starts with the string | |
379 .Em Hex | |
380 followed by a checksum type, and | |
381 a string of four hexadecimal numbers obtained from a DCC log file | |
382 or the | |
383 .Xr dccproc 8 | |
384 command using | |
385 .Fl CQ . | |
386 The checksum type is | |
387 .Em body , Fuz1 , | |
388 or | |
389 .Em Fuz2 | |
390 or one of the preceding checksum types such as | |
391 .Em env_From . | |
392 .It Ar IP-address | |
393 is a host name, IPv4 or IPv6 address, or a block | |
394 of IP addresses in the standard xxx/mm form with | |
395 mm limited for server whitelists to 16 for IPv4 or 112 for IPv6. | |
396 There can be at most 64 CIDR blocks in a client | |
397 .Pa whiteclnt | |
398 file. | |
399 A host name is converted to IP addresses with DNS, | |
400 .Pa /etc/hosts | |
401 or other mechanisms | |
402 and one checksum for each address is added to the whitelist. | |
403 .El | |
404 .Pp | |
405 .It Ar option setting | |
406 can only be in a DCC client | |
407 .Pa whiteclnt | |
408 file used by | |
409 .Xr dccifd 8 , | |
410 .Xr dccm 8 | |
411 or | |
412 .Xr dccproc 8 . | |
413 Settings in per-user whiteclnt files override settings | |
414 in the global file. | |
415 .Ar Setting | |
416 can be any of the following: | |
417 .Bl -tag -offset 2n -width 2n -compact | |
418 .It Ar option log-all | |
419 to log all mail messages. | |
420 .It Ar option log-normal | |
421 to log only messages that meet the logging thresholds. | |
422 .It Ar option log-subdirectory-day | |
423 .It Ar option log-subdirectory-hour | |
424 .It Ar option log-subdirectory-minute | |
425 creates log files containing mail messages in subdirectories | |
426 of the form | |
427 .Ar JJJ , | |
428 .Ar JJJ/HH , | |
429 or | |
430 .Ar JJJ/HH/MM | |
431 where | |
432 .Ar JJJ | |
433 is the current Julian day, | |
434 .Ar HH | |
435 is the current hour, and | |
436 .Ar MM | |
437 is the current minute. | |
438 See also the | |
439 .Fl l Ar logdir | |
440 option for | |
441 .Xr dccm 8 , | |
442 .Xr dccifd 8 , | |
443 and | |
444 .Xr dccproc 8 . | |
445 .It Ar option dcc-on | |
446 .It Ar option dcc-off | |
447 Control DCC filtering. | |
448 See the discussion of | |
449 .Fl W | |
450 for | |
451 .Xr dccm 8 | |
452 and | |
453 .Xr dccifd 8 . | |
454 .It Ar option greylist-on | |
455 .It Ar option greylist-off | |
456 to control greylisting. | |
457 Greylisting for other recipients in the same SMTP transaction | |
458 can still cause greylist temporary rejections. | |
459 .Ar greylist-off | |
460 in the main whiteclnt file. | |
461 .It Ar option greylist-log-on | |
462 .It Ar option greylist-log-off | |
463 to control logging of greylisted mail messages. | |
464 .It Ar option DCC-rep-off | |
465 .It Ar option DCC-rep-on | |
466 to honor or ignore DCC Reputations computed by the DCC server. | |
467 .It Ar option DNSBL1-off | |
468 .It Ar option DNSBL1-on | |
469 .It Ar option DNSBL2-off | |
470 .It Ar option DNSBL2-on | |
471 .It Ar option DNSBL3-off | |
472 .It Ar option DNSBL3-on | |
473 honor or ignore results of DNS blacklist checks configured with | |
474 .Fl B | |
475 for | |
476 .Xr dccm 8 , | |
477 .Xr dccifd 8 , | |
478 and | |
479 .Xr dccproc 8 . | |
480 .It Ar option MTA-first | |
481 .It Ar option MTA-last | |
482 consider MTA determinations of spam or not-spam first so they can be overridden | |
483 by | |
484 .Pa whiteclnt | |
485 files, or last so that they can override | |
486 .Pa whiteclnt files. | |
487 .It Ar option forced-discard-ok | |
488 .It Ar option no-forced-discard | |
489 control whether | |
490 .Xr dccm 8 | |
491 and | |
492 .Xr dccifd 8 | |
493 are allowed to discard a message for one mailbox for which | |
494 it is spam when it is not spam and must be delivered to another mailbox. | |
495 This can happen if a mail message is addressed to two or more mailboxes with | |
496 differing whitelists. | |
497 Discarding can be undesirable because false positives are not communicated | |
498 to mail senders. | |
499 To avoid discarding, | |
500 .Xr dccm 8 | |
501 and | |
502 .Xr dccifd 8 | |
503 running in proxy mode temporarily reject SMTP envelope | |
504 .Em Rcpt To | |
505 values that involve differing | |
506 .Pa whiteclnt | |
507 files. | |
508 .It Ar option threshold type,rej-thold | |
509 has the same effects as | |
510 .Fl c Ar type,rej-thold | |
511 for | |
512 .Xr dccproc 8 | |
513 or | |
514 .Fl t Ar type,rej-thold | |
515 for | |
516 .Xr dccm 8 | |
517 and | |
518 .Xr dccifd 8 . | |
519 It is useful only in per-user whiteclnt files to override the global | |
520 DCC checksum thresholds. | |
521 .It Ar option spam-trap-accept | |
522 .It Ar option spam-trap-reject | |
523 say that mail should be reported to the DCC server as extremely | |
524 bulk or with target counts of | |
525 .Ar MANY . | |
526 Greylisting, DNS blacklist (DNSBL), and other checks are turned off. | |
527 .Ar Spam-trap-accept | |
528 tells the MTA to accept the message while | |
529 .Ar spam-trap-reject | |
530 tells the MTA to reject the message. | |
531 Use | |
532 .Ar Spam-trap-accept | |
533 for spam traps that should not be disclosed. | |
534 .Ar Spam-trap-reject | |
535 can be used on | |
536 .Em catch-all | |
537 mailboxes that might receive legitimate mail by typographical errors | |
538 and that senders should be told about. | |
539 .El | |
540 .Pp | |
541 In the absence of explicit settings, | |
542 the default in the main whiteclnt file is equivalent to | |
543 .Bl -hang -offset 4n -width 4n -compact | |
544 .It Ar option log-normal | |
545 .It Ar option dcc-on | |
546 .It Ar option greylist-on | |
547 .It Ar option greylist-log-on | |
548 .It Ar option DCC-rep-off | |
549 .It Ar option DNSBL1-off | |
550 .It Ar option DNSBL2-off | |
551 .It Ar option DNSBL3-off | |
552 .It Ar option MTA-last | |
553 .It Ar option no-forced-discard | |
554 .El | |
555 The defaults for individual recipient | |
556 .Pa whiteclnt | |
557 files are the same except as changed by explicit settings | |
558 in the main file. | |
559 .El | |
560 .Pp | |
561 Checksums of the IP address of the SMTP client sending a mail message | |
562 are practically unforgeable, because it is impractical for | |
563 an SMTP client to "spoof" its address or pretend to use some other IP address. | |
564 That would make the IP address of the sender useful for whitelisting, | |
565 except that the IP address of the SMTP client | |
566 is often not available to users of | |
567 .Xr dccproc 8 . | |
568 In addition, legitimate mail relays make whitelist entries for IP | |
569 addresses of little use. | |
570 For example, | |
571 the IP address from which a message arrived might be that of a | |
572 local relay instead of the home address of a whitelisted mailing list. | |
573 .Pp | |
574 Envelope and header | |
575 .Ar From | |
576 values can be forged, | |
577 so whitelist entries for their checksums are not entirely reliable. | |
578 .Pp | |
579 Checksums of | |
580 .Ar env_To | |
581 values are never sent to DCC servers. | |
582 They are valid in only | |
583 .Pa whiteclnt | |
584 files | |
585 and used only by | |
586 .Xr dccm 8 , | |
587 .Xr dccifd 8 , | |
588 and | |
589 .Xr dccproc 8 | |
590 when the envelope | |
591 .Em Rcpt To | |
592 value is known. | |
593 .Ss Greylists | |
594 The DCC server, | |
595 .Xr dccd 8 , | |
596 can be used to maintain a greylist database for some DCC clients | |
597 including | |
598 .Xr dccm 8 | |
599 and | |
600 .Xr dccifd 8 . | |
601 Greylisting involves temporarily refusing mail from unfamiliar | |
602 SMTP clients and is unrelated to filtering with a | |
603 Distributed Checksum Clearinghouse. | |
604 .br | |
605 See http://projects.puremagic.com/greylisting/ | |
606 .Ss Privacy | |
607 Because sending mail is a less private act than receiving it, | |
608 and because sending bulk mail is usually not private at all | |
609 and cannot be very private, | |
610 the DCC tries first to protect the privacy of mail recipients, | |
611 and second the privacy of senders of mail that is not bulk. | |
612 .Pp | |
613 DCC clients necessarily disclose some information about mail they have | |
614 received. | |
615 The DCC database contains checksums of mail bodies, | |
616 header lines, and source addresses. | |
617 While it contains significantly less information than is | |
618 available by "snooping" on Internet links, | |
619 it is important that the DCC database be treated as containing | |
620 sensitive information and to not put the most private information | |
621 in the DCC database. | |
622 Given the contents of a message, one might determine | |
623 whether that message has been received | |
624 by a system that subscribes to the DCC. | |
625 Guesses about the sender and addressee of a message can also be | |
626 validated if the checksums of the message have been sent to a DCC server. | |
627 .Pp | |
628 Because the DCC is distributed, | |
629 organizations can operate their own DCC servers, and configure | |
630 them to share or "flood" only the checksums of bulk mail that is not | |
631 in local whitelists. | |
632 .Pp | |
633 DCC clients should not report the checksums of messages known to be | |
634 private to a DCC server. | |
635 For example, checksums of messages local to | |
636 a system or that are otherwise known a priori to not be unsolicited bulk | |
637 should not be sent to a remote DCC server. | |
638 This can be accomplished by adding entries for the sender to the | |
639 client's local whitelist file. | |
640 Client whitelist files can also include entries for email recipients | |
641 whose mail should not be reported to a DCC server. | |
642 .Ss Security | |
643 Whenever considering security, | |
644 one must first consider the risks. | |
645 The worst DCC security problems are | |
646 unauthorized commands to a DCC service, | |
647 denial of the DCC service, | |
648 and corruption of DCC data. | |
649 The worst that can be done with remote commands to a DCC server is | |
650 to turn it off or otherwise cause it to stop responding. | |
651 The DCC is designed to fail gracefully, | |
652 so that a denial of service attack | |
653 would at worst allow delivery of mail that would otherwise be rejected. | |
654 Corruption of DCC data might at worst cause mail that is already | |
655 somewhat "bulk" by virtue of being received by two or more people | |
656 to appear to have higher recipient numbers. | |
657 Since DCC users | |
658 .Em must | |
659 whitelist all sources of legitimate bulk mail, | |
660 this is also not a concern. | |
661 Such security risks should be addressed, | |
662 but only with defenses that don't cost more than the possible damage from | |
663 an attack. | |
664 .Pp | |
665 The DCC must contend with senders of unsolicited bulk mail who | |
666 resort to unlawful actions | |
667 to express their displeasure at having their advertising blocked. | |
668 Because the DCC protocol is based | |
669 on UDP, an unhappy advertiser could try to | |
670 flood a DCC server with | |
671 packets supposedly from subscribers or non-subscribers. | |
672 DCC servers defend against that attack by rate-limiting requests | |
673 from anonymous users. | |
674 .Pp | |
675 Also because of the use of UDP, clients must be protected | |
676 against forged answers to their queries. | |
677 Otherwise an unsolicited bulk mail advertiser could send | |
678 a stream of "not spam" answers to an SMTP | |
679 client while simultaneously sending mail that would otherwise be | |
680 rejected. | |
681 This is not a problem for authenticated clients of the | |
682 DCC because they share a secret with the DCC. | |
683 Unauthenticated, anonymous DCC | |
684 clients do not share any secrets with the DCC, except for unique and | |
685 unpredictable bits in each query or report sent to the DCC. | |
686 Therefore, DCC servers cryptographically sign answers to | |
687 unauthenticated clients with bits from the corresponding queries. | |
688 This protects against attackers that do not | |
689 have access to the stream of packets from the DCC client. | |
690 .Pp | |
691 The passwords or shared secrets used in the DCC client and server programs | |
692 are "cleartext" for several reasons. | |
693 In any shared secret authentication system, | |
694 at least one party must know the secret or keep the secret in cleartext. | |
695 You could encrypt the secrets in a file, but because they are used | |
696 by programs, you would need a cleartext copy of the key to decrypt | |
697 the file somewhere in the system, making such a scheme more expensive | |
698 but no more secure than a file of cleartext passwords. | |
699 Asymmetric systems such as that used in UNIX allow one party to not | |
700 know the secrets, but they must be and are | |
701 designed to be computationally expensive when used in applications | |
702 like the DCC that involve thousands or more authentication checks per second. | |
703 Moreover, because of "dictionary attacks," | |
704 asymmetric systems are now little more secure than | |
705 keeping passwords in cleartext. | |
706 An adversary can compare the hash values of combinations of common words | |
707 with /etc/passwd hash values to look for bad passwords. | |
708 Worse, by the nature of a client/server protocol like that used in | |
709 the DCC, clients must have the cleartext password. | |
710 Since it is among the more numerous and much less secure clients | |
711 that adversaries would seek files of DCC passwords, | |
712 it would be a waste to complicate the DCC server with an asymmetric | |
713 system. | |
714 .Pp | |
715 The DCC protocol is vulnerable to dictionary attacks to recover passwords. | |
716 An adversary could capture some DCC packets, and then check to see | |
717 if any of the 100,000 to 1,000,000 passwords in so called | |
718 "cracker dictionaries" | |
719 applied to a packet generated the same signature. | |
720 This is a concern only if DCC passwords are poorly chosen, such | |
721 as any combination of words in an English dictionary. | |
722 There are ways to prevent this vulnerability regardless of | |
723 how badly passwords are chosen, but they are computationally expensive | |
724 and require additional network round trips. | |
725 Since DCC passwords are created and typed into files once | |
726 and do not need to be remembered by people, | |
727 it is cheaper and quite easy to simply choose good passwords | |
728 that are not in dictionaries. | |
729 .Ss Reliability | |
730 It is better to fail to filter unsolicited bulk mail than to fail | |
731 to deliver legitimate mail, so DCC clients fail in the direction of | |
732 assuming that mail is legitimate or even whitelisted. | |
733 .Pp | |
734 A DCC client sends a report or other request and waits for an answer. | |
735 If no answer arrives within a reasonable time, | |
736 the client retransmits. | |
737 There are many things that | |
738 might result in the client not receiving an answer, | |
739 but the most important is packet loss. | |
740 If the client's request does not reach the server, | |
741 it is easy and harmless for the client to retransmit. | |
742 If the client's request reached the server but the server's response was lost, | |
743 a retransmission to the same server would be misunderstood as | |
744 a new report of another copy of the same message unless it is detected | |
745 as a retransmission by the server. | |
746 The DCC protocol includes transaction identifiers for this purpose. | |
747 If the client retransmitted to a second server, | |
748 the retransmission would be misunderstood by the second server as | |
749 a new report of the same message. | |
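.Pp
The effect of transaction identifiers can be sketched in Python as follows.
This is an illustration only, not the dccd implementation: the ReportServer
class, its handle method, and the use of a plain dictionary keyed by
client-ID and transaction identifier are all assumptions.

```python
from collections import Counter


class ReportServer:
    """Toy server that counts reports and recognizes retransmissions."""

    def __init__(self):
        self.counts = Counter()   # checksum -> number of reports seen
        self.answers = {}         # (client_id, transaction_id) -> answer sent

    def handle(self, client_id, transaction_id, checksum):
        key = (client_id, transaction_id)
        if key in self.answers:
            # Same transaction seen before: repeat the old answer
            # without counting the message a second time.
            return self.answers[key]
        self.counts[checksum] += 1
        answer = self.counts[checksum]
        self.answers[key] = answer
        return answer


first = ReportServer()
second = ReportServer()

a1 = first.handle(7, 1001, "abc")    # original request
a2 = first.handle(7, 1001, "abc")    # retransmission to the same server
b1 = second.handle(7, 1001, "abc")   # retransmission to a different server
```

The first server counts the message once, but the second server has never
seen the transaction and counts it as a new report.
A real server would also expire old entries rather than remember every
transaction indefinitely.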
750 .Pp | |
751 Each request from a client includes a timestamp to aid the client in | |
752 measuring the round trip time to the server and to let the client pick | |
753 the closest server. | |
754 Clients monitor the speed of all of the servers they know including | |
755 those they are not currently using, | |
756 and use the quickest. | |
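.Pp
One way a client might track server speed is sketched below in Python.
The smoothing weight, the server names, and the functions smooth, record,
and quickest are assumptions made for illustration; the manual does not
describe how the real clients average their measurements.

```python
def smooth(old, sample, weight=0.125):
    # Exponentially weighted average of round trip times (assumed scheme).
    if old is None:
        return sample
    return (1 - weight) * old + weight * sample


def quickest(rtts):
    # Pick the server with the lowest smoothed round trip time, if any.
    known = {name: rtt for name, rtt in rtts.items() if rtt is not None}
    return min(known, key=known.get) if known else None


# Hypothetical servers; None means no answer has been measured yet.
rtts = {"dcc1.example.com": None, "dcc2.example.com": None}


def record(server, sent_at, received_at):
    # The timestamp carried in the request lets the client compute
    # how long the answer took to come back.
    rtts[server] = smooth(rtts[server], received_at - sent_at)


record("dcc1.example.com", 10.000, 10.080)
record("dcc2.example.com", 10.000, 10.030)
best = quickest(rtts)
```

Measuring servers that are not currently in use lets the client switch as
soon as another server becomes faster.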
757 .Ss Client and Server-IDs | |
758 Servers and clients use numbers or IDs to identify themselves. | |
759 ID 1 is reserved for anonymous, unauthenticated clients. | |
760 All other IDs are associated with a pair of passwords in the | |
761 .Pa ids | |
762 file, the | |
763 current and next or previous and current passwords. | |
764 Clients include their client-IDs in their messages. | |
765 When they are not using the anonymous ID, | |
766 they sign their messages to servers with the first password | |
767 associated with their client-ID. | |
768 Servers treat messages with signatures that match neither of the passwords | |
769 for the client-ID in their own | |
770 .Pa ids | |
771 file as if the client had used the anonymous ID. | |
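.Pp
The rule that a bad signature demotes a client to the anonymous ID can be
sketched in Python as follows.
The sketch rests on assumptions: HMAC-SHA-256 stands in for the real DCC
signature, the ids file is modeled as a dictionary, and the names sign and
effective_client_id are hypothetical.

```python
import hashlib
import hmac

ANONYMOUS_ID = 1  # reserved for anonymous, unauthenticated clients


def sign(password: str, message: bytes) -> bytes:
    # Stand-in signature (assumption), not the DCC wire format.
    return hmac.new(password.encode(), message, hashlib.sha256).digest()


def effective_client_id(ids, client_id, message, signature):
    # Return the ID the server should treat the request as coming from.
    if client_id == ANONYMOUS_ID:
        return ANONYMOUS_ID
    # Accept a signature made with either password of the pair.
    for password in ids.get(client_id, ()):
        if hmac.compare_digest(sign(password, message), signature):
            return client_id
    # Unknown ID or no matching password: treat the client as anonymous.
    return ANONYMOUS_ID


# Hypothetical in-memory view of an ids file: ID -> pair of passwords.
ids = {1234: ("current-pw", "next-pw")}
msg = b"example request"

ok_current = effective_client_id(ids, 1234, msg, sign("current-pw", msg))
ok_next = effective_client_id(ids, 1234, msg, sign("next-pw", msg))
bad = effective_client_id(ids, 1234, msg, sign("wrong-pw", msg))
```

Accepting both passwords lets an operator change a password on the server
before every client has been updated.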
772 .Pp | |
773 Each server has a unique | |
774 .Em server-ID | |
775 less than 32768. | |
776 Servers use their IDs to identify checksums that they | |
777 .Em flood | |
778 to other servers. | |
779 Each server expects local clients sending administrative | |
780 commands to use the server's ID and sign administrative commands | |
781 with the associated password. | |
782 .Pp | |
783 Server-IDs must be unique among all systems that share reports | |
784 by "flooding." | |
785 All servers must be told of the IDs of all other servers whose | |
786 reports can be received in the local | |
787 .Pa @prefix@/flod | |
788 file described in | |
789 .Xr dccd 8 . | |
790 However, server-IDs can be mapped during flooding between | |
791 independent DCC organizations. | |
792 .Pp | |
793 .Em Passwd-IDs | |
794 are server-IDs that should not be assigned to servers. | |
795 They appear in the often publicly readable | |
796 .Pa @prefix@/flod | |
797 and specify passwords in the private | |
798 .Pa @prefix@/ids | |
799 file for the inter-server flooding protocol. | |
800 .Pp | |
801 The client identified by a | |
802 .Em client-ID | |
803 might be a single computer with a | |
804 single IP address, a single but multi-homed computer, or many computers. | |
805 Client-IDs are not used to identify checksum reports, but | |
806 the organization operating the client. | |
807 A client-ID need only be unique among clients using a single server. | |
808 A single client can use different client-IDs for different servers, | |
809 each client-ID authenticated with a separate password. | |
810 .Pp | |
811 An obscure but important part of all of this is that the | |
812 inter-server flooding algorithm | |
813 depends on server-IDs and timestamps attached to reports of checksums. | |
814 The inter-server flooding mechanism | |
815 requires cooperating DCC servers to maintain reasonable clocks | |
816 ticking in UTC. | |
817 Clients include timestamps in their requests, but as long as their | |
818 timestamps are unlikely to be repeated, they need not be very accurate. | |
819 .Ss Installation Considerations | |
820 DCC clients on a computer share information about which servers | |
821 are currently working and their speeds in a shared memory segment. | |
822 This segment also contains server host names, IP addresses, and | |
823 the passwords needed to authenticate known clients to servers. | |
824 That generally requires that | |
825 .Xr dccm 8 , | |
826 .Xr dccproc 8 , | |
827 .Xr dccifd 8 , | |
828 and | |
829 .Xr cdcc 8 | |
830 execute with a UID that | |
831 can write to the DCC home directory and its files. | |
832 The sendmail interface, dccm, | |
833 is a daemon that can be started by an "rc" or other script already | |
834 running with the correct UID. | |
835 The other two, dccproc and cdcc, need to be set-UID because they are | |
836 used by end users. | |
837 They relinquish set-UID privileges when not needed. | |
838 .Pp | |
839 Files that contain cleartext passwords, including the shared file used by clients, | |
840 must be readable only by "owner." | |
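.Pp
A quick way to check and enforce the owner-only rule is sketched below in
Python for a POSIX system.
The helper names owner_only and restrict_to_owner are hypothetical, and the
example works on a placeholder file in a temporary directory rather than a
real DCC home directory.

```python
import os
import stat
import tempfile


def owner_only(path):
    # True if neither group nor others have any access to the file.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


def restrict_to_owner(path):
    # Equivalent to chmod 600: read and write for the owner only.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


with tempfile.TemporaryDirectory() as home:
    ids_path = os.path.join(home, "ids")
    with open(ids_path, "w") as f:
        f.write("# placeholder, not the real ids file format\n")
    os.chmod(ids_path, 0o644)          # deliberately too permissive
    before = owner_only(ids_path)
    restrict_to_owner(ids_path)
    after = owner_only(ids_path)
```

The same check can be applied to any file that holds cleartext passwords,
including the shared file used by clients.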
841 .Pp | |
842 The data files required by a DCC can be in a single "home" directory, | |
843 .Pa @prefix@ . | |
844 Distinct DCC servers can run on a single computer, provided they use | |
845 distinct UDP port numbers and home directories. | |
846 It is possible and convenient for the DCC clients using a server | |
847 on the same computer to use the same home directory as the server. | |
848 .Pp | |
849 The DCC source distribution includes sample control files. | |
850 They should be modified appropriately and then copied to the DCC | |
851 home directory. | |
852 Files that contain cleartext passwords must not be publicly readable. | |
853 .Pp | |
854 The DCC source includes "feature" m4 files to configure | |
855 sendmail to use | |
856 .Xr dccm 8 | |
857 to check a DCC server about incoming mail. | |
858 .Pp | |
859 See also the INSTALL.html file. | |
860 .Ss Client Installation | |
861 Installing a DCC client starts with obtaining or compiling program binaries | |
862 for the client server data control tool, | |
863 .Xr cdcc 8 . | |
864 Installing the sendmail DCC interface, | |
865 .Xr dccm 8 , | |
866 or | |
867 .Xr dccproc 8 , | |
868 the general or | |
869 .Xr procmail 1 | |
870 interface | |
871 is the main part of the client installation. | |
872 Connecting the DCC to sendmail with dccm is most powerful, | |
873 but requires administrative control of the system running sendmail. | |
874 .Pp | |
875 As noted above, cdcc and dccproc should be | |
876 set-UID to a suitable UID. | |
877 Root or 0 is thought to be safe for both, because they are | |
878 careful to release privileges except when they need them to | |
879 read or write files in the DCC home directory. | |
880 A DCC home directory, | |
881 .Pa @prefix@ , | |
882 should be created. | |
883 It must be owned and writable by the UID to which cdcc is set. | |
884 .Pp | |
885 After the DCC client programs have been obtained, | |
886 contact the operator(s) of the chosen DCC server(s) | |
887 to obtain | |
888 each server's | |
889 hostname, | |
890 port number, | |
891 and a | |
892 .Em client-ID | |
893 and corresponding password. | |
894 No client-IDs or passwords are needed to use | |
895 DCC servers that allow anonymous clients. | |
896 Use the | |
897 .Em load | |
898 or | |
899 .Em add | |
900 commands | |
901 of cdcc to create a | |
902 .Pa map | |
903 file in the DCC home directory. | |
904 It is usually necessary to create a client whitelist file of | |
905 the format described above. | |
906 To accommodate users sharing a computer but not ideas about what | |
907 is solicited bulk mail, | |
908 the client whitelist file can be any valid path name | |
909 and need not be in the DCC home directory. | |
910 .Pp | |
911 If dccm is chosen, | |
912 arrange to start it with suitable arguments | |
913 before sendmail is started. | |
914 See the | |
915 .Pa homedir/dcc_conf | |
916 file and the | |
917 .Pa misc/rcDCC | |
918 script in the DCC source. | |
919 The procmail DCC interface, | |
920 .Xr dccproc 8 , | |
921 can be run manually or by a | |
922 .Xr procmailrc 5 | |
923 rule. | |
924 .Ss Server Installation | |
925 The DCC server, | |
926 .Xr dccd 8 , | |
927 also requires that the DCC home directory exist. | |
928 It does not use the client shared or memory mapped file of server | |
929 addresses, | |
930 but it requires other files. | |
931 One is the | |
932 .Pa @prefix@/ids | |
933 file of client-IDs, server-IDs, and corresponding passwords. | |
934 Another is a | |
935 .Pa flod | |
936 file of peers that send and receive floods of reports of checksums | |
937 with large counts. | |
938 Both files are described | |
939 in | |
940 .Xr dccd 8 . | |
941 .Pp | |
942 The server daemon should be started when the system is rebooted, | |
943 probably before sendmail. | |
944 See the | |
945 .Pa misc/rcDCC | |
946 and | |
947 .Pa misc/start-dccd | |
948 files in the DCC source. | |
949 .Pp | |
950 The database should be cleaned regularly with | |
951 .Xr dbclean 8 , | |
952 such as by running the crontab job that is in the misc directory. | |
953 .Sh SEE ALSO | |
954 .Xr cdcc 8 , | |
955 .Xr dbclean 8 , | |
956 .Xr dcc 8 , | |
957 .Xr dccd 8 , | |
958 .Xr dccifd 8 , | |
959 .Xr dccm 8 , | |
960 .Xr dccproc 8 , | |
961 .Xr dblist 8 , | |
962 .Xr dccsight 8 , | |
963 .Xr sendmail 8 . | |
964 .Sh HISTORY | |
965 Distributed Checksum Clearinghouses are based on an idea of Paul Vixie | |
966 with code designed and written at Rhyolite Software starting in 2000. | |
967 This document describes version 1.3.103. |