Mercurial > notdcc
comparison dcc.html.in @ 0:c7f6b056b673
First import of vendor version
author | Peter Gervai <grin@grin.hu> |
---|---|
date | Tue, 10 Mar 2009 13:49:58 +0100 |
parents | |
children |
-1:000000000000 | 0:c7f6b056b673 |
---|---|
1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"> | |
2 <HTML> | |
3 <HEAD> | |
4 <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1"> | |
5 <TITLE>dcc.0.8</TITLE> | |
6 <META http-equiv="Content-Style-Type" content="text/css"> | |
7 <STYLE type="text/css"> | |
8 BODY {background-color:white; color:black} | |
9 ADDRESS {font-size:smaller} | |
10 IMG.logo {width:6em; vertical-align:middle} | |
11 </STYLE> | |
12 </HEAD> | |
13 <BODY> | |
14 <PRE> | |
15 <!-- Manpage converted by man2html 3.0.1 --> | |
16 <B><A HREF="dcc.html">DCC(8)</A></B> Distributed Checksum Clearinghouse <B><A HREF="dcc.html">DCC(8)</A></B> | |
17 | |
18 | |
19 </PRE> | |
20 <H2><A NAME="NAME">NAME</A></H2><PRE> | |
21 <B>DCC</B> -- Distributed Checksum Clearinghouse | |
22 | |
23 | |
24 </PRE> | |
25 <H2><A NAME="DESCRIPTION">DESCRIPTION</A></H2><PRE> | |
26 The Distributed Checksum Clearinghouse or <B>DCC</B> is a cooperative, distrib- | |
27 uted system intended to detect "bulk" mail or mail sent to many people. | |
28 It allows individuals receiving a single mail message to determine that | |
29 many other people have received essentially identical copies of the mes- | |
30 sage and so reject or discard the message. | |
31 | |
32 Source for the server, client, and utilities is available at Rhyolite | |
33 Software, LLC, http://www.rhyolite.com/dcc/. It is free for organizations | |
34 that do not sell spam or virus filtering services. | |
35 | |
36 <A NAME="How-the-DCC-Is-Used"><B>How the DCC Is Used</B></A> | |
37 The DCC can be viewed as a tool for end users to enforce their right to | |
38 "opt-in" to streams of bulk mail by refusing bulk mail except from | |
39 sources in a "whitelist." Whitelists are the responsibility of DCC | |
40 clients, since only they know which bulk mail they solicited. | |
41 | |
42 False positives, or mail marked as bulk by a DCC server that is not bulk, | |
43 occur only when a recipient of a message reports it to a DCC server as | |
44 having been received many times or when the "fuzzy" checksums of differ- | |
45 ing messages are the same. The fuzzy checksums ignore aspects of mes- | |
46 sages in order to compute identical checksums for substantially identical | |
47 messages. The fuzzy checksums are designed to ignore only differences | |
48 that do not affect meanings. So in practice, you do not need to worry | |
49 about DCC false positive indications of "bulk," but not all bulk mail is | |
50 unsolicited bulk mail or spam. You must either use whitelists to distin- | |
51 guish solicited from unsolicited bulk mail or only use DCC indications of | |
52 "bulk" as part of a scoring system such as SpamAssassin. Besides unso- | |
53 licited bulk email or spam, bulk messages include legitimate mail such as | |
54 order confirmations from merchants, legitimate mailing lists, and empty | |
55 or test messages. | |
56 | |
57 A DCC server estimates the number of copies of a message by counting | |
58 checksums reported by DCC clients. Each client must decide which bulk | |
59 messages are unsolicited and what degree of "bulkiness" is objectionable. | |
60 Client DCC software marks, rejects, or discards mail that is bulk accord- | |
61 ing to local thresholds on target addresses from DCC servers and unso- | |
62 licited according to local whitelists. | |
63 | |
64 DCC servers are usually configured to receive reports from as many tar- | |
65 gets as possible, including sources that cannot be trusted to not exag- | |
66 gerate the number of copies of a message they see. A user of a DCC | |
67 client angry about receiving a message could report it with 1,000,000 | |
68 separate DCC reports or with a single report claiming 1,000,000 targets. | |
69 An unprincipled user could subscribe a "spam trap" to mailing lists such | |
70 as those of the IETF or CERT. Such abuses of the system are not | |
71 problems, because much legitimate mail is "bulk." You cannot reject bulk | |
72 mail unless you have a whitelist of sources of legitimate bulk mail. | |
73 | |
74 DCC can also be used by an Internet service provider to detect bulk mail | |
75 coming from its own customers. In such circumstances, the DCC client | |
76 might be configured to only log bulk mail from unexpected (not | |
77 whitelisted) customers. | |
78 | |
79 <A NAME="What-the-DCC-Is"><B>What the DCC Is</B></A> | |
80 A DCC server accumulates counts of cryptographic checksums of messages | |
81 but not the messages themselves. It exchanges reports of frequently seen | |
82 checksums with other servers. DCC clients send reports of checksums | |
83 related to incoming mail to a nearby DCC server running <B><A HREF="dccd.html">dccd(8)</A></B>. Each | |
84 report from a client includes the number of recipients for the message. | |
85 A DCC server accumulates the reports and responds to clients with the | |
86 current total number of recipients for each checksum. The client adds an | |
87 SMTP header to incoming mail containing the total counts. It then dis- | |
88 cards or rejects mail that is not whitelisted and has counts that exceed | |
89 local thresholds. | |
90 | |
91 A special value of the number of addressees is "MANY" and means it is | |
92 certain that this message was bulk and might be unsolicited, perhaps | |
93 because it came from a locally blacklisted source or was addressed to an | |
94 invalid address or "spam trap." The special value "MANY" is merely the | |
95 largest value that fits in the fixed-size field containing the count of | |
96 addressees. That "infinity" accumulated total can be reached with mil- | |
97 lions of independent reports as well as with one or two. | |
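The saturating behavior of "MANY" can be sketched in a few lines of Python. This is only an illustration, not the DCC implementation: the 24-bit field width and the names `MANY`, `add_report`, and `show` are assumptions made for the example, since the text above says only that the field has a fixed size.

```python
# Hypothetical field width: the man page says only that the count field is
# fixed-size, so 24 bits is an assumption chosen for this sketch.
MANY = (1 << 24) - 1


def add_report(total, targets):
    """Add one client report to a running total, saturating at MANY."""
    return min(MANY, total + min(targets, MANY))


def show(count):
    """Render a count the way a header might, using "MANY" for the ceiling."""
    return "MANY" if count >= MANY else str(count)
```

Because the total saturates, a single report claiming "MANY" targets and millions of small reports end at the same value, and further reports cannot push the total past it.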
98 | |
99 DCC servers <I>flood</I> or send reports of checksums of bulk mail to neighbor- | |
100 ing servers. | |
101 | |
102 To keep a server's database of checksums from growing without bound, | |
103 checksums are forgotten when they become old. Checksums of bulk mail are | |
104 kept longer. See <B><A HREF="dbclean.html">dbclean(8)</A></B>. | |
105 | |
106 DCC clients pick the nearest working DCC server using a small shared or | |
107 memory mapped file, <I>@prefix@/map</I>. It contains server names, port num- | |
108 bers, passwords, recent performance measures, and so forth. This file | |
109 allows clients to use quick retransmission timeouts and to waste little | |
110 time on servers that have temporarily stopped working or become unreach- | |
111 able. The utility program <B><A HREF="cdcc.html">cdcc(8)</A></B> is used to maintain this file as well | |
112 as to check the health of servers. | |
113 | |
114 <A NAME="X-DCC-Headers"><B>X-DCC Headers</B></A> | |
115 The DCC software includes several programs used by clients. <B><A HREF="dccm.html">Dccm(8)</A></B> uses | |
116 the sendmail "milter" interface to query a DCC server, add header lines | |
117 to incoming mail, and reject mail whose total checksum counts are high. | |
118 Dccm is intended to be run with SMTP servers using sendmail. | |
119 | |
120 <B><A HREF="dccproc.html">Dccproc(8)</A></B> adds header lines to mail presented by file name or <I>stdin</I>, but | |
121 relies on other programs such as procmail to deal with mail with large | |
122 counts. <B><A HREF="dccsight.html">Dccsight(8)</A></B> is similar but deals with previously computed check- | |
123 sums. | |
124 | |
125 <B><A HREF="dccifd.html">Dccifd(8)</A></B> is similar to dccproc but is not run separately for each mail | |
126 message and so is far more efficient. It receives mail messages via a | |
127 socket somewhat like dccm, but with a simpler protocol that can be used | |
128 by Perl scripts or other programs. | |
129 | |
130 DCC SMTP header lines are of one of the forms: | |
131 | |
132 X-DCC-brand-Metrics: client server-ID; bulk cknm1=count cknm2=count ... | |
133 X-DCC-brand-Metrics: client; whitelist | |
134 where | |
135 <I>whitelist</I> appears if the global or per-user <I>whiteclnt</I> file marks the | |
136 message as good. | |
137 <I>brand</I> is the "brand name" of the DCC server, such as "RHYOLITE". | |
138 <I>client</I> is the name or IP address of the DCC client that added the | |
139 header line to the SMTP message. | |
140 <I>server-ID</I> is the numeric ID of the DCC server that the DCC client con- | |
141 tacted. | |
142 <I>bulk</I> is present if one or more checksum counts exceeded the DCC | |
143 client's thresholds to make the message "bulky." | |
144 <I>bulk</I> <I>rep</I> is present if the DCC reputation of the IP address of the | |
145 sender is bad. | |
146 <I>cknm1</I>,<I>cknm2</I>,... are types of checksums: | |
147 <I>IP</I> address of SMTP client | |
148 <I>env</I><B>_</B><I>From</I> SMTP envelope value | |
149 <I>From</I> SMTP header line | |
150 <I>Message-ID</I> SMTP header line | |
151 <I>Received</I> last Received: header line in the SMTP message | |
152 <I>substitute</I> SMTP header line chosen by the DCC client, pre- | |
153 fixed with the name of the header | |
154 <I>Body</I> SMTP body ignoring white-space | |
155 <I>Fuz1</I> filtered or "fuzzy" body checksum | |
156 <I>Fuz2</I> another filtered or "fuzzy" body checksum | |
157 <I>rep</I> DCC reputation of the mail sender or the esti- | |
158 mated probability that the message is bulk. | |
159 Counts for <I>IP</I>, <I>env</I><B>_</B><I>From</I>, <I>From</I>, <I>Message-ID</I>, <I>Received</I>, and | |
160 <I>substitute</I> checksums are omitted by the DCC client if the | |
161 server says it has no information. Counts for <I>Fuz1</I> and <I>Fuz2</I> | |
162 are omitted if the message body is empty or contains too lit- | |
163 tle of the right kind of information for the checksum to be | |
164 computed. | |
165 <I>count</I> is the total number of recipients of messages with that check- | |
166 sum reported directly or indirectly to the DCC server. The | |
167 special count "MANY" means that DCC clients have claimed that | |
168 the message is directed at millions of recipients. "MANY" | |
169 implies the message is definitely bulk, but not necessarily | |
170 unsolicited. The special counts "OK" and "OK2" mean the | |
171 checksum has been marked "good" or "half-good" by DCC servers. | |
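A program that post-processes mail might split such a header line as in the following Python sketch. It assumes only the two forms shown above; the function name `parse_dcc_header` and the layout of the returned dictionary are hypothetical, and real headers may carry details this sketch does not handle.

```python
import re

# Matches the header name and captures the brand, e.g. "RHYOLITE".
HEADER_NAME = re.compile(r"X-DCC-(.+)-Metrics", re.IGNORECASE)


def parse_dcc_header(line):
    """Split one X-DCC-brand-Metrics header line into its parts."""
    name, sep, value = line.partition(":")
    match = HEADER_NAME.fullmatch(name.strip())
    if not sep or match is None:
        raise ValueError("not an X-DCC metrics header: %r" % line)
    head, _, rest = value.partition(";")
    head_fields = head.split()
    if not head_fields:
        raise ValueError("missing client name: %r" % line)
    parsed = {
        "brand": match.group(1),
        "client": head_fields[0],
        # The server-ID is absent in the "client; whitelist" form.
        "server_id": int(head_fields[1]) if len(head_fields) > 1 else None,
        "flags": [],    # bare words such as "bulk", "rep", "whitelist"
        "counts": {},   # checksum type -> count
    }
    for token in rest.split():
        if "=" in token:
            cktype, _, count = token.partition("=")
            # Special counts such as MANY, OK, and OK2 are kept as strings.
            parsed["counts"][cktype] = int(count) if count.isdigit() else count
        else:
            parsed["flags"].append(token)
    return parsed
```

For example, a line such as `X-DCC-RHYOLITE-Metrics: mail.example.com 101; bulk Body=12 Fuz1=MANY` (an invented sample) yields the brand, client, server-ID, the flag `bulk`, and a count for each checksum type.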
172 | |
173 <A NAME="Mailing-lists"><B>Mailing lists</B></A> | |
174 Legitimate mailing list traffic differs from spam only in being solicited | |
175 by recipients. Each client should have a private whitelist. | |
176 | |
177 DCC whitelists can also mark mail as unsolicited bulk using blacklist | |
178 entries for commonly forged values such as "From: user@public.com". | |
179 | |
180 <A NAME="White-and-Blacklists"><B>White and Blacklists</B></A> | |
181 DCC server and client whitelist files share a common format. Server | |
182 files are always named <I>whitelist</I> and one is required to be in the DCC | |
183 home directory with the other server files. Client whitelist files are | |
184 named <I>whiteclnt</I> in the DCC home directory or a subdirectory specified | |
185 with the <B>-U</B> option for <B><A HREF="dccm.html">dccm(8)</A></B>. They specify mail that should not be | |
186 reported to a DCC server or that is always unsolicited and almost cer- | |
187 tainly bulk. | |
188 | |
189 A DCC whitelist file contains blank lines, comments starting with "#", | |
190 and lines of the following forms: | |
191 <I>include</I> <I>file</I> | |
192 Copies the contents of <I>file</I> into the whitelist. It can occur | |
193 only in the main whitelist or whiteclnt file and not in an | |
194 included file. The file name should be absolute or relative to | |
195 the DCC home directory. | |
196 | |
197 <I>count</I> <I>value</I> | |
198 lines specify checksums that should be white- or blacklisted. | |
199 <I>count</I> <I>env</I><B>_</B><I>From</I> <I>821-path</I> | |
200 <I>count</I> <I>env</I><B>_</B><I>To</I> <I>dest-mailbox</I> | |
201 <I>count</I> <I>From</I> <I>822-mailbox</I> | |
202 <I>count</I> <I>Message-ID</I> <I><string></I> | |
203 <I>count</I> <I>Received</I> <I>string</I> | |
204 <I>count</I> <I>Substitute</I> <I>header</I> <I>string</I> | |
205 <I>count</I> <I>Hex</I> <I>ctype</I> <I>cksum</I> | |
206 <I>count</I> <I>ip</I> <I>IP-address</I> | |
207 | |
208 <I>MANY</I> <I>value</I> | |
209 indicates that millions of targets have received messages | |
210 with the header, IP address, or checksum <I>value</I>. | |
211 <I>OK</I> <I>value</I> | |
212 <I>OK2</I> <I>value</I> | |
213 say that messages with the header, IP address, or | |
214 checksum <I>value</I> are OK and should not be reported to DCC servers | |
215 or be greylisted. <I>OK2</I> says that the message is "half | |
216 OK." Two <I>OK2</I> checksums associated with a message are | |
217 equivalent to one <I>OK</I>. | |
218 A DCC server never shares or <I>floods</I> reports containing | |
219 checksums marked in its whitelist with OK or OK2 to other | |
220 servers. A DCC client does not report or ask its server | |
221 about messages with a checksum marked OK or OK2 in the | |
222 client whitelist. This is intended to allow a DCC client | |
223 to keep private mail so private that even its checksums | |
224 are not disclosed. | |
225 <I>MX</I> <I>IP-address-or-hostname</I> | |
226 <I>MXDCC</I> <I>IP-address-or-hostname</I> | |
227 mark an address or block of addresses of trusted mail | |
228 relays including MX servers, smart hosts, and bastion or | |
229 DMZ relays. The DCC clients <B><A HREF="dccm.html">dccm(8)</A></B>, <B><A HREF="dccifd.html">dccifd(8)</A></B>, and | |
230 <B><A HREF="dccproc.html">dccproc(8)</A></B> parse and skip initial Received: headers added | |
231 by listed MX servers to determine the external sources of | |
232 mail messages. Unsolicited bulk mail that has been for- | |
233 warded through listed addresses is discarded by <B><A HREF="dccm.html">dccm(8)</A></B> | |
234 and <B><A HREF="dccifd.html">dccifd(8)</A></B> as if with <B>-a</B> <I>DISCARD</I> instead of rejected. | |
235 <I>MXDCC</I> marks addresses that are MX servers that run DCC | |
236 clients. The checksums for a mail message that has been | |
237 forwarded through an address listed as MXDCC are queried | |
238 instead of reported. | |
239 <I>SUBMIT</I> <I>IP-address-or-hostname</I> | |
240 marks an IP address or block of addresses of SMTP submission | |
241 clients such as web browsers that cannot tolerate 4yz | |
242 temporary rejections but that cannot be trusted to not | |
243 send spam. Since they are local addresses, DCC Reputa- | |
244 tions are not computed for them. | |
245 | |
246 <I>value</I> in <I>count</I> <I>value</I> lines can be | |
247 <I>dest-mailbox</I> | |
248 is an RFC 821 address or a local user name. | |
249 <I>821-path</I> | |
250 is an RFC 821 address. | |
251 <I>822-mailbox</I> | |
252 is an RFC 822 address with optional name. | |
253 <I>Substitute</I> <I>header</I> | |
254 is the name of an SMTP header such as "Sender" or the | |
255 name of one of two SMTP envelope values, "HELO," or | |
256 "Mail_Host" for the resolved host name from the <I>821-path</I> | |
257 in the message. | |
258 <I>Hex</I> <I>ctype</I> <I>cksum</I> | |
259 starts with the string <I>Hex</I> followed by a checksum type, and | |
260 a string of four hexadecimal numbers obtained from a DCC | |
261 log file or the <B><A HREF="dccproc.html">dccproc(8)</A></B> command using <B>-CQ</B>. The check- | |
262 sum type is <I>body</I>, <I>Fuz1</I>, or <I>Fuz2</I> or one of the preceding | |
263 checksum types such as <I>env</I><B>_</B><I>From</I>. | |
264 <I>IP-address</I> | |
265 is a host name, IPv4 or IPv6 address, or a block of IP | |
266 addresses in the standard xxx/mm form with mm limited for | |
267 server whitelists to 16 for IPv4 or 112 for IPv6. There | |
268 can be at most 64 CIDR blocks in a client <I>whiteclnt</I> file. | |
269 A host name is converted to IP addresses with DNS, | |
270 <I>/etc/hosts</I> or other mechanisms and one checksum for each | |
271 address is added to the whitelist. | |
272 | |
273 <I>option</I> <I>setting</I> | |
274 can only be in a DCC client <I>whiteclnt</I> file used by <B><A HREF="dccifd.html">dccifd(8)</A></B>, | |
275 <B><A HREF="dccm.html">dccm(8)</A></B> or <B><A HREF="dccproc.html">dccproc(8)</A></B>. Settings in per-user whiteclnt files | |
276 override settings in the global file. <I>Setting</I> can be any of the | |
277 following: | |
278 <I>option</I> <I>log-all</I> | |
279 to log all mail messages. | |
280 <I>option</I> <I>log-normal</I> | |
281 to log only messages that meet the logging thresholds. | |
282 <I>option</I> <I>log-subdirectory-day</I> | |
283 <I>option</I> <I>log-subdirectory-hour</I> | |
284 <I>option</I> <I>log-subdirectory-minute</I> | |
285 creates log files containing mail messages in subdirecto- | |
286 ries of the form <I>JJJ</I>, <I>JJJ/HH</I>, or <I>JJJ/HH/MM</I> where <I>JJJ</I> is the | |
287 current julian day, <I>HH</I> is the current hour, and <I>MM</I> is the | |
288 current minute. See also the <B>-l</B> <I>logdir</I> option for <B><A HREF="dccm.html">dccm(8)</A></B>, | |
289 <B><A HREF="dccifd.html">dccifd(8)</A></B>, and <B><A HREF="dccproc.html">dccproc(8)</A></B>. | |
290 <I>option</I> <I>dcc-on</I> | |
291 <I>option</I> <I>dcc-off</I> | |
292 Control DCC filtering. See the discussion of <B>-W</B> for | |
293 <B><A HREF="dccm.html">dccm(8)</A></B> and <B><A HREF="dccifd.html">dccifd(8)</A></B>. | |
294 <I>option</I> <I>greylist-on</I> | |
295 <I>option</I> <I>greylist-off</I> | |
296 to control greylisting. Greylisting for other recipients | |
297 in the same SMTP transaction can still cause greylist tem- | |
298 porary rejections. <I>greylist-off</I> in the main whiteclnt | |
299 file. | |
300 <I>option</I> <I>greylist-log-on</I> | |
301 <I>option</I> <I>greylist-log-off</I> | |
302 to control logging of greylisted mail messages. | |
303 <I>option</I> <I>DCC-rep-off</I> | |
304 <I>option</I> <I>DCC-rep-on</I> | |
305 to honor or ignore DCC Reputations computed by the DCC | |
306 server. | |
307 <I>option</I> <I>DNSBL1-off</I> | |
308 <I>option</I> <I>DNSBL1-on</I> | |
309 <I>option</I> <I>DNSBL2-off</I> | |
310 <I>option</I> <I>DNSBL2-on</I> | |
311 <I>option</I> <I>DNSBL3-off</I> | |
312 <I>option</I> <I>DNSBL3-on</I> | |
313 honor or ignore results of DNS blacklist checks configured | |
314 with <B>-B</B> for <B><A HREF="dccm.html">dccm(8)</A></B>, <B><A HREF="dccifd.html">dccifd(8)</A></B>, and <B><A HREF="dccproc.html">dccproc(8)</A></B>. | |
315 <I>option</I> <I>MTA-first</I> | |
316 <I>option</I> <I>MTA-last</I> | |
317 consider MTA determinations of spam or not-spam first so | |
318 they can be overridden by <I>whiteclnt</I> files, or last so that | |
319 they can override <I>whiteclnt</I> <I>files.</I> | |
320 <I>option</I> <I>forced-discard-ok</I> | |
321 <I>option</I> <I>no-forced-discard</I> | |
322 control whether <B><A HREF="dccm.html">dccm(8)</A></B> and <B><A HREF="dccifd.html">dccifd(8)</A></B> are allowed to dis- | |
323 card a message for one mailbox for which it is spam when it | |
324 is not spam and must be delivered to another mailbox. This | |
325 can happen if a mail message is addressed to two or more | |
326 mailboxes with differing whitelists. Discarding can be | |
327 undesirable because false positives are not communicated to | |
328 mail senders. To avoid discarding, <B><A HREF="dccm.html">dccm(8)</A></B> and <B><A HREF="dccifd.html">dccifd(8)</A></B> | |
329 running in proxy mode temporarily reject SMTP envelope <I>Rcpt</I> | |
330 <I>To</I> values that involve differing <I>whiteclnt</I> files. | |
331 <I>option</I> <I>threshold</I> <I>type,rej-thold</I> | |
332 has the same effects as <B>-c</B> <I>type,rej-thold</I> for <B><A HREF="dccproc.html">dccproc(8)</A></B> or | |
333 <B>-t</B> <I>type,rej-thold</I> for <B><A HREF="dccm.html">dccm(8)</A></B> and <B><A HREF="dccifd.html">dccifd(8)</A></B>. It is useful | |
334 only in per-user whiteclnt files to override the global DCC | |
335 checksum thresholds. | |
336 <I>option</I> <I>spam-trap-accept</I> | |
337 <I>option</I> <I>spam-trap-reject</I> | |
338 say that mail should be reported to the DCC server as | |
339 extremely bulk or with target counts of <I>MANY</I>. Greylisting, | |
340 DNS blacklist (DNSBL), and other checks are turned off. | |
341 <I>Spam-trap-accept</I> tells the MTA to accept the message while | |
342 <I>spam-trap-reject</I> tells the MTA to reject the message. Use | |
343 <I>Spam-trap-accept</I> for spam traps that should not be dis- | |
344 closed. <I>Spam-trap-reject</I> can be used on <I>catch-all</I> mail- | |
345 boxes that might receive legitimate mail by typographical | |
346 errors and that senders should be told about. | |
347 | |
348 In the absence of explicit settings, the default in the main | |
349 whiteclnt file is equivalent to | |
350 <I>option</I> <I>log-normal</I> | |
351 <I>option</I> <I>dcc-on</I> | |
352 <I>option</I> <I>greylist-on</I> | |
353 <I>option</I> <I>greylist-log-on</I> | |
354 <I>option</I> <I>DCC-rep-off</I> | |
355 <I>option</I> <I>DNSBL1-off</I> | |
356 <I>option</I> <I>DNSBL2-off</I> | |
357 <I>option</I> <I>DNSBL3-off</I> | |
358 <I>option</I> <I>MTA-last</I> | |
359 <I>option</I> <I>no-forced-discard</I> | |
360 The defaults for individual recipient <I>whiteclnt</I> files are the | |
361 same except as changed by explicit settings in the main file. | |
362 | |
363 Checksums of the IP address of the SMTP client sending a mail message are | |
364 practically unforgeable, because it is impractical for an SMTP client to | |
365 "spoof" its address or pretend to use some other IP address. That would | |
366 make the IP address of the sender useful for whitelisting, except that | |
367 the IP address of the SMTP client is often not available to users of | |
368 <B><A HREF="dccproc.html">dccproc(8)</A></B>. In addition, legitimate mail relays make whitelist entries | |
369 for IP addresses of little use. For example, the IP address from which a | |
370 message arrived might be that of a local relay instead of the home | |
371 address of a whitelisted mailing list. | |
372 | |
373 Envelope and header <I>From</I> values can be forged, so whitelist entries for | |
374 their checksums are not entirely reliable. | |
375 | |
376 Checksums of <I>env</I><B>_</B><I>To</I> values are never sent to DCC servers. They are valid | |
377 in only <I>whiteclnt</I> files and used only by <B><A HREF="dccm.html">dccm(8)</A></B>, <B><A HREF="dccifd.html">dccifd(8)</A></B>, and | |
378 <B><A HREF="dccproc.html">dccproc(8)</A></B> when the envelope <I>Rcpt</I> <I>To</I> value is known. | |
379 | |
380 <A NAME="Greylists"><B>Greylists</B></A> | |
381 The DCC server, <B><A HREF="dccd.html">dccd(8)</A></B>, can be used to maintain a greylist database for | |
382 some DCC clients including <B><A HREF="dccm.html">dccm(8)</A></B> and <B><A HREF="dccifd.html">dccifd(8)</A></B>. Greylisting involves | |
383 temporarily refusing mail from unfamiliar SMTP clients and is unrelated | |
384 to filtering with a Distributed Checksum Clearinghouse. | |
385 See http://projects.puremagic.com/greylisting/ | |
386 | |
387 <A NAME="Privacy"><B>Privacy</B></A> | |
388 Because sending mail is a less private act than receiving it, and because | |
389 sending bulk mail is usually not private at all and cannot be very pri- | |
390 vate, the DCC tries first to protect the privacy of mail recipients, and | |
391 second the privacy of senders of mail that is not bulk. | |
392 | |
393 DCC clients necessarily disclose some information about mail they have | |
394 received. The DCC database contains checksums of mail bodies, header | |
395 lines, and source addresses. While it contains significantly less infor- | |
396 mation than is available by "snooping" on Internet links, it is important | |
397 that the DCC database be treated as containing sensitive information and | |
398 to not put the most private information in the DCC database. Given the | |
399 contents of a message, one might determine whether that message has been | |
400 received by a system that subscribes to the DCC. Guesses about the | |
401 sender and addressee of a message can also be validated if the checksums | |
402 of the message have been sent to a DCC server. | |
403 | |
404 Because the DCC is distributed, organizations can operate their own DCC | |
405 servers, and configure them to share or "flood" only the checksums of | |
406 bulk mail that is not in local whitelists. | |
407 | |
408 DCC clients should not report the checksums of messages known to be pri- | |
409 vate to a DCC server. For example, checksums of messages local to a sys- | |
410 tem or that are otherwise known a priori to not be unsolicited bulk | |
411 should not be sent to a remote DCC server. This can be accomplished by | |
412 adding entries for the sender to the client's local whitelist file. | |
413 Client whitelist files can also include entries for email recipients | |
414 whose mail should not be reported to a DCC server. | |
415 | |
416 <A NAME="Security"><B>Security</B></A> | |
417 Whenever considering security, one must first consider the risks. The | |
418 worst DCC security problems are unauthorized commands to a DCC service, | |
419 denial of the DCC service, and corruption of DCC data. The worst that | |
420 can be done with remote commands to a DCC server is to turn it off or | |
421 otherwise cause it to stop responding. The DCC is designed to fail | |
422 gracefully, so that a denial of service attack would at worst allow | |
423 delivery of mail that would otherwise be rejected. Corruption of DCC | |
424 data might at worst cause mail that is already somewhat "bulk" by virtue | |
425 of being received by two or more people to appear to have higher recipient | |
426 numbers. Since DCC users <I>must</I> whitelist all sources of legitimate bulk | |
427 mail, this is also not a concern. Such security risks should be | |
428 addressed, but only with defenses that don't cost more than the possible | |
429 damage from an attack. | |
430 | |
431 The DCC must contend with senders of unsolicited bulk mail who resort to | |
432 unlawful actions to express their displeasure at having their advertising | |
433 blocked. Because the DCC protocol is based on UDP, an unhappy advertiser | |
434 could try to flood a DCC server with packets supposedly from subscribers | |
435 or non-subscribers. DCC servers defend against that attack by rate-lim- | |
436 iting requests from anonymous users. | |
437 | |
438 Also because of the use of UDP, clients must be protected against forged | |
439 answers to their queries. Otherwise an unsolicited bulk mail advertiser | |
440 could send a stream of "not spam" answers to an SMTP client while simul- | |
441 taneously sending mail that would otherwise be rejected. This is not a | |
442 problem for authenticated clients of the DCC because they share a secret | |
443 with the DCC. Unauthenticated, anonymous DCC clients do not share any | |
444 secrets with the DCC, except for unique and unpredictable bits in each | |
445 query or report sent to the DCC. Therefore, DCC servers cryptographi- | |
446 cally sign answers to unauthenticated clients with bits from the corre- | |
447 sponding queries. This protects against attackers that do not have | |
448 access to the stream of packets from the DCC client. | |
449 | |
450 The passwords or shared secrets used in the DCC client and server pro- | |
451 grams are "cleartext" for several reasons. In any shared secret authen- | |
452 tication system, at least one party must know the secret or keep the | |
453 secret in cleartext. You could encrypt the secrets in a file, but | |
454 because they are used by programs, you would need a cleartext copy of the | |
455 key to decrypt the file somewhere in the system, making such a scheme | |
456 more expensive but no more secure than a file of cleartext passwords. | |
457 Asymmetric systems such as that used in UNIX allow one party to not know | |
458 the secrets, but they must be and are designed to be computationally | |
459 expensive when used in applications like the DCC that involve thousands | |
460 or more authentication checks per second. Moreover, because of "dictio- | |
461 nary attacks," asymmetric systems are now little more secure than keeping | |
462 passwords in cleartext. An adversary can compare the hash values of com- | |
463 binations of common words with /etc/passwd hash values to look for bad | |
464 passwords. Worse, by the nature of a client/server protocol like that | |
465 used in the DCC, clients must have the cleartext password. Since it is | |
466 among the more numerous and much less secure clients that adversaries | |
467 would seek files of DCC passwords, it would be a waste to complicate the | |
468 DCC server with an asymmetric system. | |
469 | |
470 The DCC protocol is vulnerable to dictionary attacks to recover pass- | |
471 words. An adversary could capture some DCC packets, and then check to | |
472 see if any of the 100,000 to 1,000,000 passwords in so-called "cracker | |
473 dictionaries" applied to a packet generated the same signature. This is | |
474 a concern only if DCC passwords are poorly chosen, such as any combina- | |
475 tion of words in an English dictionary. There are ways to prevent this | |
476 vulnerability regardless of how badly passwords are chosen, but they are | |
477 computationally expensive and require additional network round trips. | |
478 Since DCC passwords are created and typed into files once and do not need | |
479 to be remembered by people, it is cheaper and quite easy to simply choose | |
480 good passwords that are not in dictionaries. | |
481 | |
482 <A NAME="Reliability"><B>Reliability</B></A> | |
483 It is better to fail to filter unsolicited bulk mail than to fail to | |
484 deliver legitimate mail, so DCC clients fail in the direction of assuming | |
485 that mail is legitimate or even whitelisted. | |
486 | |
487 A DCC client sends a report or other request and waits for an answer. If | |
488 no answer arrives within a reasonable time, the client retransmits. | |
489 There are many things that might result in the client not receiving an | |
490 answer, but the most important is packet loss. If the client's request | |
491 does not reach the server, it is easy and harmless for the client to | |
492 retransmit. If the client's request reached the server but the server's | |
493 response was lost, a retransmission to the same server would be misunder- | |
494 stood as a new report of another copy of the same message unless it is | |
495 detected as a retransmission by the server. The DCC protocol includes | |
496 transaction identifiers for this purpose. If the client retransmitted | |
497 to a second server, the retransmission would be misunderstood by the sec- | |
498 ond server as a new report of the same message. | |
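The difference between retransmitting to the same server and to a second server can be illustrated with a small Python sketch. It is not the DCC protocol: the class `ReportCounter`, its method names, and the unbounded cache of answers are assumptions made to keep the example short.

```python
class ReportCounter:
    """Toy server that counts recipients per checksum and detects
    retransmissions by (client ID, transaction ID)."""

    def __init__(self):
        self.totals = {}    # checksum -> total recipients reported
        self.answers = {}   # (client_id, transaction_id) -> cached answer

    def report(self, client_id, transaction_id, checksum, targets):
        key = (client_id, transaction_id)
        if key in self.answers:
            # Retransmission: repeat the earlier answer without counting again.
            return self.answers[key]
        total = self.totals.get(checksum, 0) + targets
        self.totals[checksum] = total
        self.answers[key] = total
        return total
```

A repeated request with the same transaction identifier leaves the total unchanged on the server that already saw it, while a second `ReportCounter` has no record of the transaction and counts the report as new.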
499 | |
500 Each request from a client includes a timestamp to aid the client in mea- | |
501 suring the round trip time to the server and to let the client pick the | |
502 closest server. Clients monitor the speed of all of the servers they | |
503 know including those they are not currently using, and use the quickest. | |
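Choosing the quickest server from measured round trip times might look like the Python sketch below. The function names and the use of `None` for a server that is not answering are assumptions for illustration; the actual client keeps its measurements in the map file described earlier.

```python
def round_trip(request_timestamp, answer_arrival):
    """Round trip time in seconds, from the timestamp sent in the request
    and the local time at which the answer arrived."""
    return answer_arrival - request_timestamp


def pick_server(rtts):
    """Return the name of the server with the smallest round trip time.

    rtts maps a server name to its latest round trip time in seconds,
    or to None when the server is not currently answering.
    """
    working = {name: rtt for name, rtt in rtts.items() if rtt is not None}
    if not working:
        return None
    return min(working, key=working.get)
```

Servers that are known but not in use stay in the table, so a faster server can be picked up as soon as its measurements improve.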
504 | |
505 <A NAME="Client-and-Server-IDs"><B>Client and Server-IDs</B></A> | |
506 Servers and clients use numbers or IDs to identify themselves. ID 1 is | |
507 reserved for anonymous, unauthenticated clients. All other IDs are asso- | |
508 ciated with a pair of passwords in the <I>ids</I> file, the current and next or | |
509 previous and current passwords. Clients include their client IDs in | |
510 their messages. When they are not using the anonymous ID, they sign | |
511 their messages to servers with the first password associated with their | |
512 client-ID. Servers treat messages with signatures that match neither of | |
513 the passwords for the client-ID in their own <I>ids</I> file as if the client | |
514 had used the anonymous ID. | |
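
The fallback to the anonymous ID can be illustrated with the sketch below. HMAC-SHA256 is a stand-in chosen for the example and is not claimed to be the DCC signature algorithm. The dictionary layout standing in for the ids file and the function names are likewise assumptions.

```python
import hashlib
import hmac

ANONYMOUS_ID = 1  # reserved for anonymous, unauthenticated clients


def sign(message, password):
    # Stand-in signature; the real DCC signature scheme is not shown here.
    return hmac.new(password, message, hashlib.sha256).digest()


def effective_client_id(ids, client_id, message, signature):
    """Return client_id if the signature matches either password, else 1."""
    for password in ids.get(client_id, ()):
        if hmac.compare_digest(sign(message, password), signature):
            return client_id
    return ANONYMOUS_ID


# Hypothetical in-memory view of an ids file: ID -> pair of passwords.
ids = {1001: (b"current-secret", b"next-secret")}
msg = b"report checksum abc"
ok = effective_client_id(ids, 1001, msg, sign(msg, b"current-secret"))
rolled = effective_client_id(ids, 1001, msg, sign(msg, b"next-secret"))
bad = effective_client_id(ids, 1001, msg, sign(msg, b"wrong"))
```

Accepting either password of the pair is what lets a password be changed on the server and its clients at different times without breaking authentication in between.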
515 | |
516 Each server has a unique <I>server-ID</I> less than 32768. Servers use their | |
517 IDs to identify checksums that they <I>flood</I> to other servers. Each server | |
518 expects local clients sending administrative commands to use the server's | |
519 ID and sign administrative commands with the associated password. | |
520 | |
521 Server-IDs must be unique among all systems that share reports by "flood- | |
522 ing." All servers must be told of the IDs of all other servers whose | |
523 reports can be received in the local <I>@prefix@/flod</I> file described in | |
524 <B><A HREF="dccd.html">dccd(8)</A></B>. However, server-IDs can be mapped during flooding between inde- | |
525 pendent DCC organizations. | |
526 | |
527 <I>Passwd-IDs</I> are server-IDs that should not be assigned to servers. They | |
528 appear in the often publicly readable <I>@prefix@/flod</I> and specify passwords | |
529 in the private <I>@prefix@/ids</I> file for the inter-server flooding protocol. | |
530 | |
531 The client identified by a <I>client-ID</I> might be a single computer with a | |
532 single IP address, a single but multi-homed computer, or many computers. | |
533 Client-IDs are not used to identify checksum reports, but the organiza- | |
534 tion operating the client. A client-ID need only be unique among clients | |
535 using a single server. A single client can use different client-IDs for | |
536 different servers, each client-ID authenticated with a separate password. | |
537 | |
538 An obscure but important part of all of this is that the inter-server | |
539 flooding algorithm depends on server-IDs and timestamps attached to | |
540 reports of checksums. The inter-server flooding mechanism requires coop- | |
541 erating DCC servers to maintain reasonable clocks ticking in UTC. | |
542 Clients include timestamps in their requests, but as long as their time- | |
543 stamps are unlikely to be repeated, they need not be very accurate. | |
544 | |
545 <A NAME="Installation-Considerations"><B>Installation Considerations</B></A> | |
546 DCC clients on a computer share information about which servers are cur- | |
547 rently working and their speeds in a shared memory segment. This segment | |
548 also contains server host names, IP addresses, and the passwords needed | |
549 to authenticate known clients to servers. That generally requires that | |
550 <B><A HREF="dccm.html">dccm(8)</A></B>, <B><A HREF="dccproc.html">dccproc(8)</A></B>, <B><A HREF="dccifd.html">dccifd(8)</A></B>, and <B><A HREF="cdcc.html">cdcc(8)</A></B> execute with a UID that can | |
551 write to the DCC home directory and its files. The sendmail interface, | |
552 dccm, is a daemon that can be started by an "rc" or other script already | |
553 running with the correct UID. The other two, dccproc and cdcc, need to be | |
554 set-UID because they are used by end users. They relinquish set-UID | |
555 privileges when not needed. | |
556 | |
557 Files that contain cleartext passwords including the shared file used by | |
558 clients must be readable only by "owner." | |
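
A quick way to check the "owner only" requirement is sketched below, assuming a POSIX system. The function name owner_only is hypothetical, and the path passed to it would be whichever password-bearing file is being checked.

```python
import os
import stat


def owner_only(path):
    """True if neither group nor other has any permission bit on path."""
    mode = os.stat(path).st_mode
    # filemode() renders the mode like "-rw-------"; the characters from
    # index 4 onward describe the group and other permissions.
    return stat.filemode(mode)[4:] == "------"
```

A file with mode 0600 passes this check, while one with mode 0644 does not.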
559 | |
560 The data files required by a DCC can be in a single "home" directory, | |
561 <I>@prefix@</I>. Distinct DCC servers can run on a single computer, provided | |
562 they use distinct UDP port numbers and home directories. It is possible | |
563 and convenient for the DCC clients using a server on the same computer to | |
564 use the same home directory as the server. | |
565 | |
566 The DCC source distribution includes sample control files. They should | |
567 be modified appropriately and then copied to the DCC home directory. | |
568 Files that contain cleartext passwords must not be publicly readable. | |
569 | |
570 The DCC source includes "feature" m4 files to configure sendmail to use | |
571 <B><A HREF="dccm.html">dccm(8)</A></B> to check a DCC server about incoming mail. | |
572 | |
573 See also the <A HREF="INSTALL.html">INSTALL.html</A> file. | |
574 | |
575 <A NAME="Client-Installation"><B>Client Installation</B></A> | |
576 Installing a DCC client starts with obtaining or compiling program bina- | |
577 ries for the client server data control tool, <B><A HREF="cdcc.html">cdcc(8)</A></B>. Installing the | |
578 sendmail DCC interface, <B><A HREF="dccm.html">dccm(8)</A></B>, or <B><A HREF="dccproc.html">dccproc(8)</A></B>, the general or | |
579 <B>procmail(1)</B> interface is the main part of the client installation. Con- | |
580 necting the DCC to sendmail with dccm is most powerful, but requires | |
581 administrative control of the system running sendmail. | |
582 | |
583 As noted above, cdcc and dccproc should be set-UID to a suitable UID. | |
584 Root or 0 is thought to be safe for both, because they are careful to | |
585 release privileges except when they need them to read or write files in | |
586 the DCC home directory. A DCC home directory, <I>@prefix@</I>, should be cre- | |
587 ated. It must be owned and writable by the UID to which cdcc is set. | |
588 | |
589 After the DCC client programs have been obtained, contact the operator(s) | |
590 of the chosen DCC server(s) to obtain each server's hostname, port num- | |
591 ber, and a <I>client-ID</I> and corresponding password. No client-IDs or pass- | |
592 words are needed to use DCC servers that allow anonymous clients. Use the | |
593 <I>load</I> or <I>add</I> commands of cdcc to create a <I>map</I> file in the DCC home direc- | |
594 tory. It is usually necessary to create a client whitelist file of the | |
595 format described above. To accommodate users sharing a computer but not | |
596 ideas about what is solicited bulk mail, the client whitelist file can be | |
597 any valid path name and need not be in the DCC home directory. | |
598 | |
599 If dccm is chosen, arrange to start it with suitable arguments before | |
600 sendmail is started. See the <I>homedir/dcc</I><B>_</B><I>conf</I> file and the <I>misc/rcDCC</I> | |
601 script in the DCC source. The procmail DCC interface, <B><A HREF="dccproc.html">dccproc(8)</A></B>, can | |
602 be run manually or by a <B>procmailrc(5)</B> rule. | |
603 | |
604 <A NAME="Server-Installation"><B>Server Installation</B></A> | |
605 The DCC server, <B><A HREF="dccd.html">dccd(8)</A></B>, also requires that the DCC home directory exist. | |
606 It does not use the client shared or memory mapped file of server | |
607 addresses, but it requires other files. One is the <I>@prefix@/ids</I> file of | |
608 client-IDs, server-IDs, and corresponding passwords. Another is a <I>flod</I> | |
609 file of peers that send and receive floods of reports of checksums with | |
610 large counts. Both files are described in <B><A HREF="dccd.html">dccd(8)</A></B>. | |
611 | |
612 The server daemon should be started when the system is rebooted, probably | |
613 before sendmail. See the <I>misc/rcDCC</I> and <I>misc/start-dccd</I> files in the DCC | |
614 source. | |
615 | |
616 The database should be cleaned regularly with <B><A HREF="dbclean.html">dbclean(8)</A></B>, such as by run- | |
617 ning the crontab job that is in the misc directory. | |
618 | |
619 | |
620 </PRE> | |
621 <H2><A NAME="SEE-ALSO">SEE ALSO</A></H2><PRE> | |
622 <B><A HREF="cdcc.html">cdcc(8)</A></B>, <B><A HREF="dbclean.html">dbclean(8)</A></B>, <B><A HREF="dcc.html">dcc(8)</A></B>, <B><A HREF="dccd.html">dccd(8)</A></B>, <B><A HREF="dccifd.html">dccifd(8)</A></B>, <B><A HREF="dccm.html">dccm(8)</A></B>, <B><A HREF="dccproc.html">dccproc(8)</A></B>, | |
623 <B><A HREF="dblist.html">dblist(8)</A></B>, <B><A HREF="dccsight.html">dccsight(8)</A></B>, <B>sendmail(8)</B>. | |
624 | |
625 | |
626 </PRE> | |
627 <H2><A NAME="HISTORY">HISTORY</A></H2><PRE> | |
628 Distributed Checksum Clearinghouses are based on an idea of Paul Vixie | |
629 with code designed and written at Rhyolite Software starting in 2000. | |
630 This document describes version 1.3.103. | |
631 | |
632 February 26, 2009 | |
633 </PRE> | |
634 <HR> | |
635 <ADDRESS> | |
636 Man(1) output converted with | |
637 <a href="http://www.oac.uci.edu/indiv/ehood/man2html.html">man2html</a> | |
638 modified for the DCC $Date 2001/04/29 03:22:18 $ | |
639 <BR> | |
640 <A HREF="http://www.dcc-servers.net/dcc/"> | |
641 <IMG SRC="http://logos.dcc-servers.net/border.png" | |
642 class=logo ALT="DCC logo"> | |
643 </A> | |
644 <A HREF="http://validator.w3.org/check?uri=referer"> | |
645 <IMG class=logo ALT="Valid HTML 4.01 Strict" | |
646 SRC="http://www.w3.org/Icons/valid-html401"> | |
647 </A> | |
648 </ADDRESS> | |
649 </BODY> | |
650 </HTML> |