.\" Copyright (c) 2008 by Rhyolite Software, LLC
.\"
.\" This agreement is not applicable to any entity which sells anti-spam
.\" solutions to others or provides an anti-spam solution as part of a
.\" security solution sold to other entities, or to a private network
.\" which employs the DCC or uses data provided by operation of the DCC
.\" but does not provide corresponding data to other users.
.\"
.\" Permission to use, copy, modify, and distribute this software without
.\" changes for any purpose with or without fee is hereby granted, provided
.\" that the above copyright notice and this permission notice appear in all
.\" copies and any distributed versions or copies are either unchanged
.\" or not called anything similar to "DCC" or "Distributed Checksum
.\" Clearinghouse".
.\"
.\" Parties not eligible to receive a license under this agreement can
.\" obtain a commercial license to use DCC by contacting Rhyolite Software
.\" at sales@rhyolite.com.
.\"
.\" A commercial license would be for Distributed Checksum and Reputation
.\" Clearinghouse software. That software includes additional features. This
.\" free license for Distributed ChecksumClearinghouse Software does not in any
.\" way grant permision to use Distributed Checksum and Reputation Clearinghouse
.\" software
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS" AND RHYOLITE SOFTWARE, LLC DISCLAIMS ALL
.\" WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES
.\" OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL RHYOLITE SOFTWARE, LLC
.\" BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES
.\" OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
.\" WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
.\" ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS
.\" SOFTWARE.
.\"
.\" Rhyolite Software DCC 1.3.103-1.112 $Revision$
.\"
.Dd February 26, 2009
.ds volume-ds-DCC Distributed Checksum Clearinghouse
.Dt DCC 8 DCC
.Os " "
.Sh NAME
.Nm DCC
.Nd Distributed Checksum Clearinghouse
.Sh DESCRIPTION
The Distributed Checksum Clearinghouse or
.Nm
is a cooperative, distributed
system intended to detect "bulk" mail or mail sent to many people.
It allows individuals receiving a single mail message to determine
that many
other people have received essentially identical copies of the message
and so reject or discard the message.
.Pp
Source for the server, client, and utilities
is available at Rhyolite Software, LLC, http://www.rhyolite.com/dcc/
It is free for organizations that do not sell spam or virus filtering
services.
.Ss How the DCC Is Used
The DCC can be viewed as a tool for end users to enforce their
right to "opt-in" to streams of bulk mail
by refusing bulk mail except from sources in a "whitelist."
Whitelists are the responsibility of DCC clients,
since only they know which bulk mail they solicited.
.Pp
False positives or mail marked as bulk by a DCC server that
is not bulk occur only when a recipient of a message reports it
to a DCC server as having been received many times
or when the "fuzzy" checksums of differing messages are the same.
The fuzzy checksums ignore aspects of messages in order to compute
identical checksums for substantially identical messages.
The fuzzy checksums are designed to ignore only
differences that do not affect meanings.
So in practice, you do not need to worry about DCC false positive indications
of "bulk," but not all bulk mail is unsolicited bulk mail or spam.
You must either use whitelists to distinguish solicited from unsolicited bulk
mail
or only use DCC indications of "bulk" as part of a scoring system such
as SpamAssassin.
Besides unsolicited bulk email or spam,
bulk messages include legitimate mail such as
order confirmations from merchants,
legitimate mailing lists,
and empty or test messages.
.Pp
A DCC server estimates the number of copies of a
message by counting checksums reported by DCC clients.
Each client must decide which
bulk messages are unsolicited and what degree of "bulkiness" is objectionable.
Client DCC software marks, rejects, or discards mail that is bulk
according to local thresholds on target addresses from DCC servers
and unsolicited according to local whitelists.
.Pp
DCC servers are usually configured to receive reports from as many targets
as possible, including sources that cannot be trusted to not exaggerate the
number of copies of a message they see.
A user of a DCC client angry about receiving a message could report it with
1,000,000 separate DCC reports
or with a single report claiming 1,000,000 targets.
An unprincipled user could subscribe a "spam trap" to mailing lists
such as those of the IETF or CERT.
Such abuses of the system are not problems,
because much legitimate mail is "bulk."
You cannot reject bulk mail unless you have a whitelist of sources
of legitimate bulk mail.
.Pp
DCC can also be used by an Internet service provider to detect bulk
mail coming from its own customers.
In such circumstances, the DCC client might be configured to only log
bulk mail from unexpected (not whitelisted) customers.
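.Pp
The client-side decision described above can be sketched in a few lines of Python.
This is only an illustration: the names
is_bulk
and
disposition ,
the dictionary of counts, and the threshold of 50 are assumptions made for the example and are not part of the DCC software.

```python
MANY = "MANY"  # symbolic stand-in for the server's saturated count


def is_bulk(counts, threshold):
    """Return True if any checksum count exceeds the local threshold."""
    for count in counts.values():
        if count == MANY:
            return True
        if isinstance(count, int) and count > threshold:
            return True
    return False


def disposition(counts, threshold, whitelisted):
    """Decide what to do with one message.

    Whitelisted (solicited) mail is always accepted, however bulky.
    Other mail is rejected only when the server totals make it bulk.
    """
    if whitelisted:
        return "accept"
    return "reject" if is_bulk(counts, threshold) else "accept"


if __name__ == "__main__":
    counts = {"Body": 3, "Fuz1": MANY}
    print(disposition(counts, 50, whitelisted=False))
    print(disposition(counts, 50, whitelisted=True))
```

The sketch makes the point of this section concrete: the server only supplies counts, while the threshold and the whitelist are local policy.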
.Ss What the DCC Is
A DCC server accumulates counts of cryptographic checksums of
messages but not the messages themselves.
It exchanges reports of frequently seen checksums with other servers.
DCC clients send reports of checksums related to incoming mail to
a nearby DCC server running
.Xr dccd 8 .
Each report from a client includes the number of recipients for the message.
A DCC server accumulates the reports and responds to clients with
the current total number of recipients for each checksum.
The client adds an SMTP header to incoming mail containing the total
counts.
It then discards or rejects mail that is not whitelisted and has
counts that exceed local thresholds.
.Pp
A special value of the number of addressees is "MANY" and means
it is certain that this message was bulk and might be unsolicited,
perhaps because it came from a locally blacklisted source or was
addressed to an invalid address or "spam trap."
The special value "MANY" is merely the largest value
that fits in the fixed-size field containing the count of addressees.
That "infinity" accumulated total can be reached with millions of
independent reports as well as with one or two.
.Pp
DCC servers
.Em flood
or send
reports of checksums of bulk mail to neighboring servers.
.Pp
To keep a server's database of checksums from growing without bound,
checksums are forgotten when they become old.
Checksums of bulk mail are kept longer.
See
.Xr dbclean 8 .
.Pp
DCC clients pick the nearest working DCC server using a small shared
or memory mapped file,
.Pa @prefix@/map .
It contains server names, port numbers, passwords, recent performance
measures, and so forth.
This file allows clients to use quick retransmission timeouts
and to waste little time on servers that have temporarily
stopped working or become unreachable.
The utility program
.Xr cdcc 8
is used to maintain this file as well as to check the health of servers.
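.Pp
The accumulation of recipient counts, including the saturating "MANY" value, can be illustrated with a small Python sketch.
The class name
CountStore ,
its method, and the numeric ceiling are assumptions chosen for the example; this page does not state the width of the real count field.

```python
MANY = 0xFFFFFF  # hypothetical ceiling; the real field width is not given here


class CountStore:
    """Toy model of a server that keeps totals per checksum, not messages."""

    def __init__(self):
        self.totals = {}  # checksum -> accumulated number of recipients

    def report(self, checksums, targets):
        """Add `targets` recipients to each checksum and return the totals."""
        answer = {}
        for checksum in checksums:
            total = self.totals.get(checksum, 0) + targets
            total = min(total, MANY)  # totals saturate at MANY
            self.totals[checksum] = total
            answer[checksum] = total
        return answer


if __name__ == "__main__":
    store = CountStore()
    print(store.report(["body-ck", "fuz1-ck"], 3))
    print(store.report(["body-ck"], MANY))  # one report can reach "infinity"
```

Because the total saturates, a single report claiming the maximum has the same effect as millions of independent reports, which is the behavior the text describes.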
.Ss X-DCC Headers
The DCC software includes several programs used by clients.
.Xr Dccm 8
uses the sendmail "milter" interface to query a DCC server,
add header lines to incoming mail,
and reject mail whose total checksum counts are high.
Dccm is intended to be run with SMTP servers using sendmail.
.Pp
.Xr Dccproc 8
adds header lines to mail presented by file name or
.Pa stdin ,
but relies on other programs
such as procmail to deal with mail with large counts.
.Xr Dccsight 8
is similar but deals with previously computed checksums.
.Pp
.Xr Dccifd 8
is similar to dccproc but is not run separately for each mail message
and so is far more efficient.
It receives mail messages via a socket somewhat like dccm,
but with a simpler protocol that can be used by Perl scripts
or other programs.
.Pp
DCC SMTP header lines are of one of the forms:
.Bd -literal -offset 2n
X-DCC-brand-Metrics: client server-ID; bulk cknm1=count cknm2=count ...
X-DCC-brand-Metrics: client; whitelist
.Ed
where
.Bl -hang -offset 3n -compact
.It Em whitelist
appears if the global or per-user
.Pa whiteclnt
file marks the message as good.
.It Em brand
is the "brand name" of the DCC server, such as "RHYOLITE".
.It Em client
is the name or IP address of the DCC client that added the
header line to the SMTP message.
.It Em server-ID
is the numeric ID of the DCC server that the DCC client contacted.
.It Em bulk
is present if one or more checksum counts exceeded the DCC client's
thresholds to make the message "bulky."
.It Em bulk rep
is present if the DCC reputation of the IP address of the sender is bad.
.It Em cknm1 , Ns Em cknm2 , Ns ...
are types of checksums:
.Bl -hang -offset 2n -width "Message-IDx" -compact
.It Em IP
address of SMTP client
.It Em env_From
SMTP envelope value
.It Em From
SMTP header line
.It Em Message-ID
SMTP header line
.It Em Received
last Received: header line in the SMTP message
.It Em substitute
SMTP header line chosen by the DCC client, prefixed with the name of
the header
.It Em Body
SMTP body ignoring white-space
.It Em Fuz1
filtered or "fuzzy" body checksum
.It Em Fuz2
another filtered or "fuzzy" body checksum
.It Em rep
DCC reputation of the mail sender or the estimated
probability that the message is bulk.
.El
Counts for
.Em IP , env_From , From ,
.Em Message-ID , Received ,
and
.Em substitute
checksums are omitted by the DCC client if the server
says it has no information.
Counts for
.Em Fuz1
and
.Em Fuz2
are omitted if the message body is empty or
contains too little of the right kind of information
for the checksum to be computed.
.It Em count
is the total number of recipients of messages with that
checksum reported directly or indirectly to the DCC server.
The special count "MANY" means that DCC clients have claimed that
the message is directed at millions of recipients.
"MANY" implies the message is definitely bulk, but not necessarily unsolicited.
The special counts "OK" and "OK2" mean the checksum has been
marked "good" or "half-good" by DCC servers.
.El
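.Pp
A downstream filter might read the two header forms above with a parser along the following lines.
This is a hedged sketch in Python rather than code from the DCC distribution: the function name
parse_metrics ,
the result layout, and the sample host name and server-ID are invented for illustration, and the
.Em substitute
checksum, which carries a header-name prefix, is not treated specially.

```python
import re

# Matches "X-DCC-<brand>-Metrics: <rest>" as described in the text above.
HEADER_RE = re.compile(r"^X-DCC-(?P<brand>.+)-Metrics:\s*(?P<rest>.*)$")


def parse_metrics(line):
    """Split one X-DCC-brand-Metrics header into its documented parts."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError("not an X-DCC metrics header")
    who, _, body = match.group("rest").partition(";")
    who_parts = who.split()
    result = {
        "brand": match.group("brand"),
        "client": who_parts[0] if who_parts else None,
        "server_id": int(who_parts[1]) if len(who_parts) > 1 else None,
        "flags": [],   # words such as "bulk", "rep", or "whitelist"
        "counts": {},  # checksum name -> int, or "MANY", "OK", "OK2"
    }
    for token in body.split():
        name, sep, value = token.partition("=")
        if not sep:
            result["flags"].append(name)
        else:
            result["counts"][name] = int(value) if value.isdigit() else value
    return result


if __name__ == "__main__":
    sample = ("X-DCC-RHYOLITE-Metrics: mail.example.com 101; "
              "bulk Body=2 Fuz1=MANY Fuz2=MANY")
    print(parse_metrics(sample))
```

Numeric counts are converted to integers, while the special counts are kept as strings so that a caller can tell "MANY" from an ordinary total.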
.Pp
.Ss Mailing lists
Legitimate mailing list traffic differs from spam only in being solicited
by recipients.
Each client should have a private whitelist.
.Pp
DCC whitelists can also mark mail as unsolicited bulk using
blacklist entries for commonly forged values such as "From: user@public.com".
.Ss White and Blacklists
DCC server and client whitelist files share a common format.
Server files are always named
.Pa whitelist
and one is required to be in the DCC home directory
with the other server files.
Client whitelist files are
named
.Pa whiteclnt
in the DCC home directory or a subdirectory specified with the
.Fl U
option for
.Xr dccm 8 .
They specify mail that should not be reported to a DCC server or that is
always unsolicited and almost certainly bulk.
.Pp
A DCC whitelist file contains blank lines, comments starting
with "#",
and lines of the following forms:
.Bl -tag -offset 2n -width 4n -compact
.It Ar include file
Copies the contents of
.Ar file
into the whitelist.
It can occur only in the main whitelist or whiteclnt file and not in an
included file.
The file name should be absolute or relative to the DCC home directory.
.Pp
.It Ar count Em value
lines specify checksums that should be white- or blacklisted.
.Bl -inset -offset 2n -compact
.It Ar count Em env_From Ar 821-path
.It Ar count Em env_To Ar dest-mailbox
.It Ar count Em From Ar 822-mailbox
.It Ar count Em Message-ID Ar <string>
.It Ar count Em Received Ar string
.It Ar count Em Substitute Ar header string
.It Ar count Ar Hex ctype cksum
.It Ar count Em ip Ar IP-address
.El
.Pp
.Bl -tag -offset 2n -width 4n -compact
.It Ar MANY Em value
indicates that millions of targets have received messages with
the header, IP address, or checksum
.Em value .
.It Ar OK Em value
.It Ar OK2 Em value
say that messages with
the header, IP address, or checksum
.Em value
are OK and should not be reported to DCC servers
or be greylisted.
.Ar OK2
says that the message is "half OK."
Two
.Ar OK2
checksums associated with a message are equivalent to one
.Ar OK .
.br
A DCC server never shares or
.Em floods
reports containing checksums
marked in its whitelist with OK or OK2 to other servers.
A DCC client does not report or ask its server about messages
with a checksum marked OK or OK2 in the client whitelist.
This is intended to allow a DCC client to keep private mail
so private that even its checksums are not disclosed.
.It Ar MX Em IP-address-or-hostname
.It Ar MXDCC Em IP-address-or-hostname
mark an address or block of addresses of trusted mail relays including
MX servers, smart hosts, and bastion or DMZ relays.
The DCC clients
.Xr dccm 8 ,
.Xr dccifd 8 ,
and
.Xr dccproc 8
parse and skip initial Received: headers added by listed MX servers to
determine the external sources of mail messages.
Unsolicited bulk mail that has been forwarded through listed addresses
is discarded by
.Xr dccm 8
and
.Xr dccifd 8
as if with
.Fl a Ar DISCARD
instead of rejected.
.Ar MXDCC
marks addresses that are MX servers that run DCC clients.
The checksums for a mail message that has been forwarded through
an address listed as MXDCC
are queried instead of reported.
.It Ar SUBMIT Em IP-address-or-hostname
marks an IP address or block of addresses of SMTP submission clients
such as web browsers
that cannot tolerate 4yz temporary rejections
but that cannot be trusted to not send spam.
Since they are local addresses, DCC Reputations are not computed for them.
.El
.Pp
.Ar value
in
.Ar count Em value
lines can be
.Bl -tag -offset 2n -width 4n -compact
.It Ar dest-mailbox
is an RFC\ 821 address or a local user name.
.It Ar 821-path
is an RFC\ 821 address.
.It Ar 822-mailbox
is an RFC\ 822 address with optional name.
.It Em Substitute Ar header
is the name of an SMTP header such as "Sender" or
the name of one of two SMTP envelope values, "HELO," or
"Mail_Host" for the resolved host name from the
.Ar 821-path
in
the message.
.It Ar Hex ctype cksum
starts with the string
.Em Hex
followed by a checksum type and
a string of four hexadecimal numbers obtained from a DCC log file
or the
.Xr dccproc 8
command using
.Fl CQ .
The checksum type is
.Em body , Fuz1 ,
or
.Em Fuz2
or one of the preceding checksum types such as
.Em env_From .
.It Ar IP-address
is a host name, IPv4 or IPv6 address, or a block
of IP addresses in the standard xxx/mm form with
mm limited for server whitelists to 16 for IPv4 or 112 for IPv6.
There can be at most 64 CIDR blocks in a client
.Pa whiteclnt
file.
A host name is converted to IP addresses with DNS,
.Pa /etc/hosts
or other mechanisms
and one checksum for each address is added to the whitelist.
.El
.Pp
.It Ar option setting
can only be in a DCC client
.Pa whiteclnt
file used by
.Xr dccifd 8 ,
.Xr dccm 8
or
.Xr dccproc 8 .
Settings in per-user whiteclnt files override settings
in the global file.
.Ar Setting
can be any of the following:
.Bl -tag -offset 2n -width 2n -compact
.It Ar option log-all
to log all mail messages.
.It Ar option log-normal
to log only messages that meet the logging thresholds.
.It Ar option log-subdirectory-day
.It Ar option log-subdirectory-hour
.It Ar option log-subdirectory-minute
creates log files containing mail messages in subdirectories
of the form
.Ar JJJ ,
.Ar JJJ/HH ,
or
.Ar JJJ/HH/MM
where
.Ar JJJ
is the current julian day,
.Ar HH
is the current hour, and
.Ar MM
is the current minute.
See also the
.Fl l Ar logdir
option for
.Xr dccm 8 ,
.Xr dccifd 8 ,
and
.Xr dccproc 8 .
.It Ar option dcc-on
.It Ar option dcc-off
Control DCC filtering.
See the discussion of
.Fl W
for
.Xr dccm 8
and
.Xr dccifd 8 .
.It Ar option greylist-on
.It Ar option greylist-off
to control greylisting.
Greylisting for other recipients in the same SMTP transaction
can still cause greylist temporary rejections.
.Ar greylist-off
in the main whiteclnt file.
.It Ar option greylist-log-on
.It Ar option greylist-log-off
to control logging of greylisted mail messages.
.It Ar option DCC-rep-off
.It Ar option DCC-rep-on
to honor or ignore DCC Reputations computed by the DCC server.
.It Ar option DNSBL1-off
.It Ar option DNSBL1-on
.It Ar option DNSBL2-off
.It Ar option DNSBL2-on
.It Ar option DNSBL3-off
.It Ar option DNSBL3-on
honor or ignore results of DNS blacklist checks configured with
.Fl B
for
.Xr dccm 8 ,
.Xr dccifd 8 ,
and
.Xr dccproc 8 .
.It Ar option MTA-first
.It Ar option MTA-last
consider MTA determinations of spam or not-spam first so they can be overridden
by
.Pa whiteclnt
files, or last so that they can override
.Pa whiteclnt
files.
.It Ar option forced-discard-ok
.It Ar option no-forced-discard
control whether
.Xr dccm 8
and
.Xr dccifd 8
are allowed to discard a message for one mailbox for which
it is spam when it is not spam and must be delivered to another mailbox.
This can happen if a mail message is addressed to two or more mailboxes with
differing whitelists.
Discarding can be undesirable because false positives are not communicated
to mail senders.
To avoid discarding,
.Xr dccm 8
and
.Xr dccifd 8
running in proxy mode temporarily reject SMTP envelope
.Em Rcpt To
values that involve differing
.Pa whiteclnt
files.
.It Ar option threshold type,rej-thold
has the same effects as
.Fl c Ar type,rej-thold
for
.Xr dccproc 8
or
.Fl t Ar type,rej-thold
for
.Xr dccm 8
and
.Xr dccifd 8 .
It is useful only in per-user whiteclnt files to override the global
DCC checksum thresholds.
.It Ar option spam-trap-accept
.It Ar option spam-trap-reject
say that mail should be reported to the DCC server as extremely
bulk or with target counts of
.Ar MANY .
Greylisting, DNS blacklist (DNSBL), and other checks are turned off.
.Ar Spam-trap-accept
tells the MTA to accept the message while
.Ar spam-trap-reject
tells the MTA to reject the message.
Use
.Ar Spam-trap-accept
for spam traps that should not be disclosed.
.Ar Spam-trap-reject
can be used on
.Em catch-all
mailboxes that might receive legitimate mail by typographical errors
and that senders should be told about.
.El
.Pp
In the absence of explicit settings,
the default in the main whiteclnt file is equivalent to
.Bl -hang -offset 4n -width 4n -compact
.It Ar option log-normal
.It Ar option dcc-on
.It Ar option greylist-on
.It Ar option greylist-log-on
.It Ar option DCC-rep-off
.It Ar option DNSBL1-off
.It Ar option DNSBL2-off
.It Ar option DNSBL3-off
.It Ar option MTA-last
.It Ar option no-forced-discard
.El
The defaults for individual recipient
.Pa whiteclnt
files are the same except as changed by explicit settings
in the main file.
.El
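.Pp
To make the line forms above concrete, the following Python sketch classifies whitelist lines and applies the rule that two OK2 entries are equivalent to one OK.
It is an illustration only: the function names, the case-insensitive matching of keywords, and the sample entries are assumptions, and the sketch does not model what happens when a message matches both OK and MANY entries.

```python
# Leading keywords of "count value" style lines named in the text above.
ENTRY_KEYWORDS = {"MANY", "OK", "OK2", "MX", "MXDCC", "SUBMIT"}


def classify(line):
    """Return (kind, fields) for one line, or None for blanks and comments."""
    text = line.strip()
    if not text or text.startswith("#"):
        return None
    words = text.split()
    head = words[0]
    if head.lower() == "include":
        return ("include", words[1:])
    if head.lower() == "option":
        return ("option", words[1:])
    if head.upper() in ENTRY_KEYWORDS:
        return ("entry", [head.upper()] + words[1:])
    raise ValueError(f"unrecognized whitelist line: {line!r}")


def whitelisted(matched_counts):
    """One OK entry, or two OK2 entries, marks a message as good."""
    return "OK" in matched_counts or matched_counts.count("OK2") >= 2


if __name__ == "__main__":
    sample = [
        "# hypothetical client whitelist",
        "option log-all",
        "OK2 env_From list@example.com",
        "MANY From user@public.com",
    ]
    for entry in sample:
        print(classify(entry))
    print(whitelisted(["OK2", "OK2"]))
```

A real whitelist reader would also have to follow
.Ar include
lines and convert each value to its checksum, which this sketch leaves out.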
.Pp
Checksums of the IP address of the SMTP client sending a mail message
are practically unforgeable, because it is impractical for
an SMTP client to "spoof" its address or pretend to use some other IP address.
That would make the IP address of the sender useful for whitelisting,
except that the IP address of the SMTP client
is often not available to users of
.Xr dccproc 8 .
In addition, legitimate mail relays make whitelist entries for IP
addresses of little use.
For example,
the IP address from which a message arrived might be that of a
local relay instead of the home address of a whitelisted mailing list.
.Pp
Envelope and header
.Ar From
values can be forged,
so whitelist entries for their checksums are not entirely reliable.
.Pp
Checksums of
.Ar env_To
values are never sent to DCC servers.
They are valid only in
.Pa whiteclnt
files
and used only by
.Xr dccm 8 ,
.Xr dccifd 8 ,
and
.Xr dccproc 8
when the envelope
.Em Rcpt To
value is known.
.Ss Greylists
The DCC server,
.Xr dccd 8 ,
can be used to maintain a greylist database for some DCC clients
including
.Xr dccm 8
and
.Xr dccifd 8 .
Greylisting involves temporarily refusing mail from unfamiliar
SMTP clients and is unrelated to filtering with a
Distributed Checksum Clearinghouse.
.br
See http://projects.puremagic.com/greylisting/
.Ss Privacy
Because sending mail is a less private act than receiving it,
and because sending bulk mail is usually not private at all
and cannot be very private,
the DCC tries first to protect the privacy of mail recipients,
and second the privacy of senders of mail that is not bulk.
.Pp
DCC clients necessarily disclose some information about mail they have
received.
The DCC database contains checksums of mail bodies,
header lines, and source addresses.
While it contains significantly less information than is
available by "snooping" on Internet links,
it is important that the DCC database be treated as containing
sensitive information and to not put the most private information
in the DCC database.
Given the contents of a message, one might determine
whether that message has been received
by a system that subscribes to the DCC.
Guesses about the sender and addressee of a message can also be
validated if the checksums of the message have been sent to a DCC server.
.Pp
Because the DCC is distributed,
organizations can operate their own DCC servers, and configure
them to share or "flood" only the checksums of bulk mail that is not
in local whitelists.
.Pp
DCC clients should not report the checksums of messages known to be
private to a DCC server.
For example, checksums of messages local to
a system or that are otherwise known a priori to not be unsolicited bulk
should not be sent to a remote DCC server.
This can be accomplished by adding entries for the sender to the
client's local whitelist file.
Client whitelist files can also include entries for email recipients
whose mail should not be reported to a DCC server.
.Ss Security
Whenever considering security,
one must first consider the risks.
The worst DCC security problems are
unauthorized commands to a DCC service,
denial of the DCC service,
and corruption of DCC data.
The worst that can be done with remote commands to a DCC server is
to turn it off or otherwise cause it to stop responding.
The DCC is designed to fail gracefully,
so that a denial of service attack
would at worst allow delivery of mail that would otherwise be rejected.
Corruption of DCC data might at worst cause mail that is already
somewhat "bulk" by virtue of being received by two or more people
to appear to have higher recipient numbers.
Since DCC users
.Em must
whitelist all sources of legitimate bulk mail,
this is also not a concern.
Such security risks should be addressed,
but only with defenses that don't cost more than the possible damage from
an attack.
.Pp
The DCC must contend with senders of unsolicited bulk mail who
resort to unlawful actions
to express their displeasure at having their advertising blocked.
Because the DCC protocol is based
on UDP, an unhappy advertiser could try to
flood a DCC server with
packets supposedly from subscribers or non-subscribers.
DCC servers defend against that attack by rate-limiting requests
from anonymous users.
.Pp
Also because of the use of UDP, clients must be protected
against forged answers to their queries.
Otherwise an unsolicited bulk mail advertiser could send
a stream of "not spam" answers to an SMTP
client while simultaneously sending mail that would otherwise be
rejected.
This is not a problem for authenticated clients of the
DCC because they share a secret with the DCC.
Unauthenticated, anonymous DCC
clients do not share any secrets with the DCC, except for unique and
unpredictable bits in each query or report sent to the DCC.
Therefore, DCC servers cryptographically sign answers to
unauthenticated clients with bits from the corresponding queries.
This protects against attackers that do not
have access to the stream of packets from the DCC client.
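.Pp
The idea of signing an answer with unpredictable bits from the query can be sketched as follows.
This Python fragment is not the DCC wire format: it substitutes HMAC-SHA-256 for whatever signature the protocol really uses, and the 16-byte nonce, the packet layout, and the function names are all assumptions made for the example.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 16  # assumed size of the unpredictable bits in each query
TAG_LEN = 32    # length of an HMAC-SHA-256 tag


def make_query(payload: bytes):
    """Client side: prepend fresh random bits to the query."""
    nonce = secrets.token_bytes(NONCE_LEN)
    return nonce, nonce + payload


def sign_answer(query: bytes, answer: bytes) -> bytes:
    """Server side: sign the answer with the bits taken from the query."""
    nonce = query[:NONCE_LEN]
    tag = hmac.new(nonce, answer, hashlib.sha256).digest()
    return answer + tag


def check_answer(nonce: bytes, signed: bytes):
    """Client side: return the answer if the tag verifies, else None."""
    answer, tag = signed[:-TAG_LEN], signed[-TAG_LEN:]
    expected = hmac.new(nonce, answer, hashlib.sha256).digest()
    return answer if hmac.compare_digest(tag, expected) else None


if __name__ == "__main__":
    nonce, query = make_query(b"checksums")
    print(check_answer(nonce, sign_answer(query, b"Body=3")))
```

An attacker who cannot see the query does not know the nonce and so cannot produce a tag that the client will accept, which matches the limited guarantee stated above.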
.Pp
The passwords or shared secrets used in the DCC client and server programs
are "cleartext" for several reasons.
In any shared secret authentication system,
at least one party must know the secret or keep the secret in cleartext.
You could encrypt the secrets in a file, but because they are used
by programs, you would need a cleartext copy of the key to decrypt
the file somewhere in the system, making such a scheme more expensive
but no more secure than a file of cleartext passwords.
Asymmetric systems such as that used in UNIX allow one party to not
know the secrets, but they must be and are
designed to be computationally expensive when used in applications
like the DCC that involve thousands or more authentication checks per second.
Moreover, because of "dictionary attacks,"
asymmetric systems are now little more secure than
keeping passwords in cleartext.
An adversary can compare the hash values of combinations of common words
with /etc/passwd hash values to look for bad passwords.
Worse, by the nature of a client/server protocol like that used in
the DCC, clients must have the cleartext password.
Since it is among the more numerous and much less secure clients
that adversaries would seek files of DCC passwords,
it would be a waste to complicate the DCC server with an asymmetric
system.
.Pp
The DCC protocol is vulnerable to dictionary attacks to recover passwords.
An adversary could capture some DCC packets, and then check to see
if any of the 100,000 to 1,000,000 passwords in so-called
"cracker dictionaries"
applied to a packet generated the same signature.
This is a concern only if DCC passwords are poorly chosen, such
as any combination of words in an English dictionary.
There are ways to prevent this vulnerability regardless of
how badly passwords are chosen, but they are computationally expensive
and require additional network round trips.
Since DCC passwords are created and typed into files once
and do not need to be remembered by people,
it is cheaper and quite easy to simply choose good passwords
that are not in dictionaries.
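.Pp
One simple way to obtain a password that is not in any dictionary is to generate it at random.
The sketch below uses Python's standard
secrets
module; the function name, the default length of 20, and the alphanumeric alphabet are assumptions, so check the documentation of the
.Pa ids
file for any limits on length or characters before using such a value.

```python
import secrets
import string

# Letters and digits only, to avoid characters a file format might treat specially.
ALPHABET = string.ascii_letters + string.digits


def make_password(length: int = 20) -> str:
    """Return a random password drawn uniformly from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(make_password())
```

Twenty characters from a 62-symbol alphabet give roughly 119 bits of entropy, far beyond the reach of the word lists mentioned above.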
729 .Ss Reliability |
|
730 It is better to fail to filter unsolicited bulk mail than to fail |
|
731 to deliver legitimate mail, so DCC clients fail in the direction of |
|
732 assuming that mail is legitimate or even whitelisted. |
|
733 .Pp |
|
734 A DCC client sends a report or other request and waits for an answer. |
|
735 If no answer arrives within a reasonable time, |
|
736 the client retransmits. |
|
737 There are many things that |
|
738 might result in the client not receiving an answer, |
|
739 but the most important is packet loss. |
|
740 If the client's request does not reach the server, |
|
741 it is easy and harmless for the client to retransmit. |
|
742 If the client's request reached the server but the server's response was lost, |
|
743 a retransmission to the same server would be misunderstood as |
|
744 a new report of another copy of the same message unless it is detected |
|
745 as a retransmission by the server. |
|
The DCC protocol includes transaction identifiers for this purpose.
|
747 If the client retransmitted to a second server, |
|
748 the retransmission would be misunderstood by the second server as |
|
749 a new report of the same message. |
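.Pp
The effect of transaction identifiers can be illustrated with a toy model.
This is only a sketch under stated assumptions: the class, its fields,
and the use of a (client-ID, transaction identifier) pair as the lookup key
are hypothetical and do not describe the data structures or wire format of
.Xr dccd 8 .
.Bd -literal
```python
class ReportCounter:
    """Toy server that counts reports per checksum, ignoring retransmissions."""

    def __init__(self):
        self.counts = {}     # checksum -> number of distinct reports
        self.answered = {}   # (client_id, transaction_id) -> cached answer

    def handle(self, client_id, transaction_id, checksum):
        key = (client_id, transaction_id)
        if key in self.answered:
            # Retransmission: repeat the earlier answer, do not count again.
            return self.answered[key]
        self.counts[checksum] = self.counts.get(checksum, 0) + 1
        answer = self.counts[checksum]
        self.answered[key] = answer
        return answer
```
.Ed
A second ReportCounter instance has no record of the first transaction,
so the same request sent to it is counted as a new report,
which is the situation described above for retransmissions to a second server.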
|
750 .Pp |
|
751 Each request from a client includes a timestamp to aid the client in |
|
752 measuring the round trip time to the server and to let the client pick |
|
753 the closest server. |
|
754 Clients monitor the speed of all of the servers they know including |
|
755 those they are not currently using, |
|
756 and use the quickest. |
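.Pp
Server selection by speed might be sketched as follows.
The smoothing weight of 0.125 is borrowed from common TCP practice and
is an assumption, as are the function names;
the actual DCC clients may measure and compare servers differently.
.Bd -literal
```python
def update_rtt(old_rtt, sample, weight=0.125):
    """Blend a new round trip sample (seconds) into a running estimate."""
    if old_rtt is None:
        return sample
    return (1 - weight) * old_rtt + weight * sample


def quickest(rtts):
    """Return the server name with the smallest known estimate, or None."""
    known = {name: rtt for name, rtt in rtts.items() if rtt is not None}
    if not known:
        return None
    return min(known, key=known.get)
```
.Ed
Servers that have never answered carry an estimate of None and are skipped,
so a client falls back to them only when they start answering and prove faster.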
|
757 .Ss Client and Server-IDs |
|
758 Servers and clients use numbers or IDs to identify themselves. |
|
759 ID 1 is reserved for anonymous, unauthenticated clients. |
|
760 All other IDs are associated with a pair of passwords in the |
|
761 .Pa ids |
|
762 file, the |
|
763 current and next or previous and current passwords. |
|
Clients include their client-IDs in their messages.
|
765 When they are not using the anonymous ID, |
|
766 they sign their messages to servers with the first password |
|
767 associated with their client-ID. |
|
768 Servers treat messages with signatures that match neither of the passwords |
|
769 for the client-ID in their own |
|
770 .Pa ids |
|
771 file as if the client had used the anonymous ID. |
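.Pp
The fallback to the anonymous ID can be sketched as below.
HMAC-SHA256 is used here purely as a stand-in signature and is an assumption,
not the algorithm of the DCC protocol;
the dictionary standing in for the
.Pa ids
file and the function names are likewise hypothetical.
.Bd -literal
```python
import hashlib
import hmac

ANONYMOUS_ID = 1  # reserved for unauthenticated clients, per the text above


def sign(message, password):
    """Stand-in signature; HMAC-SHA256 is an assumption, not the DCC one."""
    return hmac.new(password.encode(), message, hashlib.sha256).digest()


def effective_id(ids, client_id, message, signature):
    """Return client_id if either of its passwords verifies, else anonymous."""
    for password in ids.get(client_id, ()):
        if hmac.compare_digest(sign(message, password), signature):
            return client_id
    return ANONYMOUS_ID
```
.Ed
Accepting either password of the pair lets an operator change a password
on the server before every client has switched to the new one.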
|
772 .Pp |
|
773 Each server has a unique |
|
774 .Em server-ID |
|
775 less than 32768. |
|
776 Servers use their IDs to identify checksums that they |
|
777 .Em flood |
|
778 to other servers. |
|
779 Each server expects local clients sending administrative |
|
780 commands to use the server's ID and sign administrative commands |
|
781 with the associated password. |
|
782 .Pp |
|
783 Server-IDs must be unique among all systems that share reports |
|
784 by "flooding." |
|
All servers must be told of the IDs of all other servers whose
|
786 reports can be received in the local |
|
787 .Pa @prefix@/flod |
|
788 file described in |
|
789 .Xr dccd 8 . |
|
790 However, server-IDs can be mapped during flooding between |
|
791 independent DCC organizations. |
|
792 .Pp |
|
793 .Em Passwd-IDs |
|
794 are server-IDs that should not be assigned to servers. |
|
795 They appear in the often publicly readable |
|
796 .Pa @prefix@/flod |
|
file and specify passwords in the private
|
798 .Pa @prefix@/ids |
|
file for the inter-server flooding protocol.
|
800 .Pp |
|
801 The client identified by a |
|
802 .Em client-ID |
|
803 might be a single computer with a |
|
804 single IP address, a single but multi-homed computer, or many computers. |
|
805 Client-IDs are not used to identify checksum reports, but |
|
806 the organization operating the client. |
|
807 A client-ID need only be unique among clients using a single server. |
|
808 A single client can use different client-IDs for different servers, |
|
809 each client-ID authenticated with a separate password. |
|
810 .Pp |
|
811 An obscure but important part of all of this is that the |
|
812 inter-server flooding algorithm |
|
813 depends on server-IDs and timestamps attached to reports of checksums. |
|
814 The inter-server flooding mechanism |
|
815 requires cooperating DCC servers to maintain reasonable clocks |
|
816 ticking in UTC. |
|
817 Clients include timestamps in their requests, but as long as their |
|
818 timestamps are unlikely to be repeated, they need not be very accurate. |
|
819 .Ss Installation Considerations |
|
820 DCC clients on a computer share information about which servers |
|
821 are currently working and their speeds in a shared memory segment. |
|
822 This segment also contains server host names, IP addresses, and |
|
823 the passwords needed to authenticate known clients to servers. |
|
824 That generally requires that |
|
825 .Xr dccm 8 , |
|
826 .Xr dccproc 8 , |
|
827 .Xr dccifd 8 , |
|
828 and |
|
829 .Xr cdcc 8 |
|
execute with a UID that
|
831 can write to the DCC home directory and its files. |
|
832 The sendmail interface, dccm, |
|
833 is a daemon that can be started by an "rc" or other script already |
|
834 running with the correct UID. |
|
The other two, dccproc and cdcc, need to be set-UID because they are
|
836 used by end users. |
|
837 They relinquish set-UID privileges when not needed. |
|
838 .Pp |
|
839 Files that contain cleartext passwords including the shared file used by clients |
|
840 must be readable only by "owner." |
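.Pp
One way to check or enforce the owner-only rule on a POSIX system
is sketched below.
The function names are hypothetical, and which files to check depends on the
installation; nothing here is part of the DCC distribution.
.Bd -literal
```python
import os
import stat


def owner_only(path):
    """True if group and others have no permission bits set on path."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def restrict(path):
    """Limit path to read and write by its owner (mode 0600)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```
.Ed
A periodic job could call owner_only() on each file that holds cleartext
passwords and report any file for which it returns False.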
|
841 .Pp |
|
842 The data files required by a DCC can be in a single "home" directory, |
|
843 .Pa @prefix@ . |
|
844 Distinct DCC servers can run on a single computer, provided they use |
|
845 distinct UDP port numbers and home directories. |
|
846 It is possible and convenient for the DCC clients using a server |
|
847 on the same computer to use the same home directory as the server. |
|
848 .Pp |
|
849 The DCC source distribution includes sample control files. |
|
850 They should be modified appropriately and then copied to the DCC |
|
851 home directory. |
|
852 Files that contain cleartext passwords must not be publicly readable. |
|
853 .Pp |
|
854 The DCC source includes "feature" m4 files to configure |
|
855 sendmail to use |
|
856 .Xr dccm 8 |
|
857 to check a DCC server about incoming mail. |
|
858 .Pp |
|
859 See also the INSTALL.html file. |
|
860 .Ss Client Installation |
|
861 Installing a DCC client starts with obtaining or compiling program binaries |
|
862 for the client server data control tool, |
|
863 .Xr cdcc 8 . |
|
864 Installing the sendmail DCC interface, |
|
865 .Xr dccm 8 , |
|
866 or |
|
867 .Xr dccproc 8 , |
|
868 the general or |
|
869 .Xr procmail 1 |
|
870 interface |
|
871 is the main part of the client installation. |
|
872 Connecting the DCC to sendmail with dccm is most powerful, |
|
873 but requires administrative control of the system running sendmail. |
|
874 .Pp |
|
875 As noted above, cdcc and dccproc should be |
|
876 set-UID to a suitable UID. |
|
877 Root or 0 is thought to be safe for both, because they are |
|
878 careful to release privileges except when they need them to |
|
879 read or write files in the DCC home directory. |
|
880 A DCC home directory, |
|
.Pa @prefix@ ,
|
882 should be created. |
|
883 It must be owned and writable by the UID to which cdcc is set. |
|
884 .Pp |
|
885 After the DCC client programs have been obtained, |
|
886 contact the operator(s) of the chosen DCC server(s) |
|
887 to obtain |
|
888 each server's |
|
889 hostname, |
|
890 port number, |
|
891 and a |
|
892 .Em client-ID |
|
893 and corresponding password. |
|
No client-IDs or passwords are needed to use
|
895 DCC servers that allow anonymous clients. |
|
896 Use the |
|
897 .Em load |
|
898 or |
|
899 .Em add |
|
900 commands |
|
901 of cdcc to create a |
|
902 .Pa map |
|
903 file in the DCC home directory. |
|
904 It is usually necessary to create a client whitelist file of |
|
905 the format described above. |
|
906 To accommodate users sharing a computer but not ideas about what |
|
907 is solicited bulk mail, |
|
908 the client whitelist file can be any valid path name |
|
909 and need not be in the DCC home directory. |
|
910 .Pp |
|
911 If dccm is chosen, |
|
912 arrange to start it with suitable arguments |
|
913 before sendmail is started. |
|
914 See the |
|
915 .Pa homedir/dcc_conf |
|
916 file and the |
|
917 .Pa misc/rcDCC |
|
918 script in the DCC source. |
|
The procmail DCC interface,
|
920 .Xr dccproc 8 , |
|
921 can be run manually or by a |
|
922 .Xr procmailrc 5 |
|
923 rule. |
|
924 .Ss Server Installation |
|
925 The DCC server, |
|
926 .Xr dccd 8 , |
|
927 also requires that the DCC home directory exist. |
|
928 It does not use the client shared or memory mapped file of server |
|
929 addresses, |
|
930 but it requires other files. |
|
931 One is the |
|
932 .Pa @prefix@/ids |
|
933 file of client-IDs, server-IDs, and corresponding passwords. |
|
934 Another is a |
|
935 .Pa flod |
|
936 file of peers that send and receive floods of reports of checksums |
|
937 with large counts. |
|
938 Both files are described |
|
939 in |
|
940 .Xr dccd 8 . |
|
941 .Pp |
|
942 The server daemon should be started when the system is rebooted, |
|
943 probably before sendmail. |
|
944 See the |
|
945 .Pa misc/rcDCC |
|
946 and |
|
947 .Pa misc/start-dccd |
|
948 files in the DCC source. |
|
949 .Pp |
|
950 The database should be cleaned regularly with |
|
.Xr dbclean 8 ,
|
952 such as by running the crontab job that is in the misc directory. |
|
953 .Sh SEE ALSO |
|
954 .Xr cdcc 8 , |
|
955 .Xr dbclean 8 , |
|
956 .Xr dcc 8 , |
|
957 .Xr dccd 8 , |
|
958 .Xr dccifd 8 , |
|
959 .Xr dccm 8 , |
|
960 .Xr dccproc 8 , |
|
961 .Xr dblist 8 , |
|
962 .Xr dccsight 8 , |
|
963 .Xr sendmail 8 . |
|
964 .Sh HISTORY |
|
965 Distributed Checksum Clearinghouses are based on an idea of Paul Vixie |
|
966 with code designed and written at Rhyolite Software starting in 2000. |
|
967 This document describes version 1.3.103. |