Mercurial > notdcc
diff dcc.html.in @ 0:c7f6b056b673
First import of vendor version
author | Peter Gervai <grin@grin.hu> |
---|---|
date | Tue, 10 Mar 2009 13:49:58 +0100 |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/dcc.html.in Tue Mar 10 13:49:58 2009 +0100 @@ -0,0 +1,650 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"> +<HTML> +<HEAD> + <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1"> + <TITLE>dcc.0.8</TITLE> + <META http-equiv="Content-Style-Type" content="text/css"> + <STYLE type="text/css"> + BODY {background-color:white; color:black} + ADDRESS {font-size:smaller} + IMG.logo {width:6em; vertical-align:middle} + </STYLE> +</HEAD> +<BODY> +<PRE> +<!-- Manpage converted by man2html 3.0.1 --> +<B><A HREF="dcc.html">DCC(8)</A></B> Distributed Checksum Clearinghouse <B><A HREF="dcc.html">DCC(8)</A></B> + + +</PRE> +<H2><A NAME="NAME">NAME</A></H2><PRE> + <B>DCC</B> -- Distributed Checksum Clearinghouse + + +</PRE> +<H2><A NAME="DESCRIPTION">DESCRIPTION</A></H2><PRE> + The Distributed Checksum Clearinghouse or <B>DCC</B> is a cooperative, distrib- + uted system intended to detect "bulk" mail or mail sent to many people. + It allows individuals receiving a single mail message to determine that + many other people have received essentially identical copies of the mes- + sage and so reject or discard the message. + + Source for the server, client, and utilities is available at Rhyolite + Software, LLC, http://www.rhyolite.com/dcc/ It is free for organizations + that do not sell spam or virus filtering services. + + <A NAME="How-the-DCC-Is-Used"><B>How the DCC Is Used</B></A> + The DCC can be viewed as a tool for end users to enforce their right to + "opt-in" to streams of bulk mail by refusing bulk mail except from + sources in a "whitelist." Whitelists are the responsibility of DCC + clients, since only they know which bulk mail they solicited. + + False positives or mail marked as bulk by a DCC server that is not bulk + occur only when a recipient of a message reports it to a DCC server as + having been received many times or when the "fuzzy" checksums of differ- + ing messages are the same. 
The fuzzy checksums ignore aspects of mes- + sages in order to compute identical checksums for substantially identical + messages. The fuzzy checksums are designed to ignore only differences + that do not affect meanings. So in practice, you do not need to worry + about DCC false positive indications of "bulk," but not all bulk mail is + unsolicited bulk mail or spam. You must either use whitelists to distin- + guish solicited from unsolicited bulk mail or only use DCC indications of + "bulk" as part of a scoring system such as SpamAssassin. Besides unso- + licited bulk email or spam, bulk messages include legitimate mail such as + order confirmations from merchants, legitimate mailing lists, and empty + or test messages. + + A DCC server estimates the number of copies of a message by counting check- + sums reported by DCC clients. Each client must decide which bulk mes- + sages are unsolicited and what degree of "bulkiness" is objectionable. + Client DCC software marks, rejects, or discards mail that is bulk accord- + ing to local thresholds on target addresses from DCC servers and unso- + licited according to local whitelists. + + DCC servers are usually configured to receive reports from as many tar- + gets as possible, including sources that cannot be trusted to not exag- + gerate the number of copies of a message they see. A user of a DCC + client angry about receiving a message could report it with 1,000,000 + separate DCC reports or with a single report claiming 1,000,000 targets. + An unprincipled user could subscribe a "spam trap" to mailing lists such + as those of the IETF or CERT. Such abuses of the system are not prob- + lems, because much legitimate mail is "bulk." You cannot reject bulk + mail unless you have a whitelist of sources of legitimate bulk mail. + + DCC can also be used by an Internet service provider to detect bulk mail + coming from its own customers. 
In such circumstances, the DCC client + might be configured to only log bulk mail from unexpected (not + whitelisted) customers. + + <A NAME="What-the-DCC-Is"><B>What the DCC Is</B></A> + A DCC server accumulates counts of cryptographic checksums of messages + but not the messages themselves. It exchanges reports of frequently seen + checksums with other servers. DCC clients send reports of checksums + related to incoming mail to a nearby DCC server running <B><A HREF="dccd.html">dccd(8)</A></B>. Each + report from a client includes the number of recipients for the message. + A DCC server accumulates the reports and responds to clients with the cur- + rent total number of recipients for each checksum. The client adds an + SMTP header to incoming mail containing the total counts. It then dis- + cards or rejects mail that is not whitelisted and has counts that exceed + local thresholds. + + A special value of the number of addressees is "MANY" and means it is + certain that this message was bulk and might be unsolicited, perhaps + because it came from a locally blacklisted source or was addressed to an + invalid address or "spam trap." The special value "MANY" is merely the + largest value that fits in the fixed sized field containing the count of + addressees. That "infinity" accumulated total can be reached with mil- + lions of independent reports as well as with one or two. + + DCC servers <I>flood</I> or send reports of checksums of bulk mail to neighbor- + ing servers. + + To keep a server's database of checksums from growing without bound, + checksums are forgotten when they become old. Checksums of bulk mail are + kept longer. See <B><A HREF="dbclean.html">dbclean(8)</A></B>. + + DCC clients pick the nearest working DCC server using a small shared or + memory mapped file, <I>@prefix@/map</I>. It contains server names, port num- + bers, passwords, recent performance measures, and so forth. 
This file + allows clients to use quick retransmission timeouts and to waste little + time on servers that have temporarily stopped working or become unreach- + able. The utility program <B><A HREF="cdcc.html">cdcc(8)</A></B> is used to maintain this file as well + as to check the health of servers. + + <A NAME="X-DCC-Headers"><B>X-DCC Headers</B></A> + The DCC software includes several programs used by clients. <B><A HREF="dccm.html">Dccm(8)</A></B> uses + the sendmail "milter" interface to query a DCC server, add header lines + to incoming mail, and reject mail whose total checksum counts are high. + Dccm is intended to be run with SMTP servers using sendmail. + + <B><A HREF="dccproc.html">Dccproc(8)</A></B> adds header lines to mail presented by file name or <I>stdin</I>, but + relies on other programs such as procmail to deal with mail with large + counts. <B><A HREF="dccsight.html">Dccsight(8)</A></B> is similar but deals with previously computed check- + sums. + + <B><A HREF="dccifd.html">Dccifd(8)</A></B> is similar to dccproc but is not run separately for each mail + message and so is far more efficient. It receives mail messages via a + socket somewhat like dccm, but with a simpler protocol that can be used + by Perl scripts or other programs. + + DCC SMTP header lines are of one of the forms: + + X-DCC-brand-Metrics: client server-ID; bulk cknm1=count cknm2=count ... + X-DCC-brand-Metrics: client; whitelist + where + <I>whitelist</I> appears if the global or per-user <I>whiteclnt</I> file marks the + message as good. + <I>brand</I> is the "brand name" of the DCC server, such as "RHYOLITE". + <I>client</I> is the name or IP address of the DCC client that added the + header line to the SMTP message. + <I>server-ID</I> is the numeric ID of the DCC server that the DCC client con- + tacted. + <I>bulk</I> is present if one or more checksum counts exceeded the DCC + client's thresholds to make the message "bulky." 
+ <I>bulk</I> <I>rep</I> is present if the DCC reputation of the IP address of the + sender is bad. + <I>cknm1</I>,<I>cknm2</I>,... are types of checksums: + <I>IP</I> address of SMTP client + <I>env</I><B>_</B><I>From</I> SMTP envelope value + <I>From</I> SMTP header line + <I>Message-ID</I> SMTP header line + <I>Received</I> last Received: header line in the SMTP message + <I>substitute</I> SMTP header line chosen by the DCC client, pre- + fixed with the name of the header + <I>Body</I> SMTP body ignoring white-space + <I>Fuz1</I> filtered or "fuzzy" body checksum + <I>Fuz2</I> another filtered or "fuzzy" body checksum + <I>rep</I> DCC reputation of the mail sender or the esti- + mated probability that the message is bulk. + Counts for <I>IP</I>, <I>env</I><B>_</B><I>From</I>, <I>From</I>, <I>Message-ID</I>, <I>Received</I>, and + <I>substitute</I> checksums are omitted by the DCC client if the + server says it has no information. Counts for <I>Fuz1</I> and <I>Fuz2</I> + are omitted if the message body is empty or contains too lit- + tle of the right kind of information for the checksum to be + computed. + <I>count</I> is the total number of recipients of messages with that check- + sum reported directly or indirectly to the DCC server. The + special count "MANY" means that DCC clients have claimed that + the message is directed at millions of recipients. "MANY" + implies the message is definitely bulk, but not necessarily + unsolicited. The special counts "OK" and "OK2" mean the + checksum has been marked "good" or "half-good" by DCC servers. + + <A NAME="Mailing-lists"><B>Mailing lists</B></A> + Legitimate mailing list traffic differs from spam only in being solicited + by recipients. Each client should have a private whitelist. + + DCC whitelists can also mark mail as unsolicited bulk using blacklist + entries for commonly forged values such as "From: user@public.com". 
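The two X-DCC header forms described above can be picked apart mechanically. The following is a minimal sketch in Python, not part of the DCC distribution: the function name, the returned dictionary layout, and the sample host name and server-ID are all assumptions made for illustration. It treats any word without "=" after the semicolon as a flag (such as bulk, rep, or whitelist) and does not try to interpret substitute checksums, whose exact spelling is not given here.

```python
import re

# Matches "X-DCC-brand-Metrics: rest"; the brand may itself contain hyphens.
HEADER_RE = re.compile(r"^X-DCC-(.*)-Metrics:\s*(.*)$")


def parse_dcc_header(line):
    """Split one X-DCC header line into its documented parts.

    Hypothetical helper: returns None if the line is not an X-DCC header.
    """
    match = HEADER_RE.match(line.strip())
    if match is None:
        return None
    brand, rest = match.groups()
    # "client server-ID" comes before the semicolon, flags and counts after it.
    ident, _, body = rest.partition(";")
    ident_fields = ident.split()
    flags = []    # bare words such as "bulk", "rep", or "whitelist"
    counts = {}   # checksum name -> int, or a special count such as "MANY"
    for token in body.split():
        name, sep, value = token.partition("=")
        if not sep:
            flags.append(token)
        elif value.isdigit():
            counts[name] = int(value)
        else:
            counts[name] = value  # "MANY", "OK", "OK2", or anything unknown
    return {
        "brand": brand,
        "client": ident_fields[0] if ident_fields else None,
        "server_id": ident_fields[1] if len(ident_fields) == 2 else None,
        "flags": flags,
        "counts": counts,
    }


if __name__ == "__main__":
    # Made-up header; the host name and server-ID are placeholders.
    sample = "X-DCC-RHYOLITE-Metrics: mail.example.com 101; bulk Body=MANY Fuz1=12 Fuz2=3"
    print(parse_dcc_header(sample))
```

Special counts are kept as strings so that a caller such as a procmail or scoring rule can decide for itself how to weigh "MANY" against a numeric threshold.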
+ + <A NAME="White-and-Blacklists"><B>White and Blacklists</B></A> + DCC server and client whitelist files share a common format. Server + files are always named <I>whitelist</I> and one is required to be in the DCC + home directory with the other server files. Client whitelist files are + named <I>whiteclnt</I> in the DCC home directory or a subdirectory specified + with the <B>-U</B> option for <B><A HREF="dccm.html">dccm(8)</A></B>. They specify mail that should not be + reported to a DCC server or that is always unsolicited and almost cer- + tainly bulk. + + A DCC whitelist file contains blank lines, comments starting with "#", + and lines of the following forms: + <I>include</I> <I>file</I> + Copies the contents of <I>file</I> into the whitelist. It can occur + only in the main whitelist or whiteclnt file and not in an + included file. The file name should be absolute or relative to + the DCC home directory. + + <I>count</I> <I>value</I> + lines specify checksums that should be white- or blacklisted. + <I>count</I> <I>env</I><B>_</B><I>From</I> <I>821-path</I> + <I>count</I> <I>env</I><B>_</B><I>To</I> <I>dest-mailbox</I> + <I>count</I> <I>From</I> <I>822-mailbox</I> + <I>count</I> <I>Message-ID</I> <I><string></I> + <I>count</I> <I>Received</I> <I>string</I> + <I>count</I> <I>Substitute</I> <I>header</I> <I>string</I> + <I>count</I> <I>Hex</I> <I>ctype</I> <I>cksum</I> + <I>count</I> <I>ip</I> <I>IP-address</I> + + <I>MANY</I> <I>value</I> + indicates that millions of targets have received messages + with the header, IP address, or checksum <I>value</I>. + <I>OK</I> <I>value</I> + <I>OK2</I> <I>value</I> + say that messages with the header, IP address, or check- + sum <I>value</I> are OK and should not be reported to DCC servers + or be greylisted. <I>OK2</I> says that the message is "half + OK." Two <I>OK2</I> checksums associated with a message are + equivalent to one <I>OK</I>. 
+ A DCC server never shares or <I>floods</I> reports containing + checksums marked in its whitelist with OK or OK2 to other + servers. A DCC client does not report or ask its server + about messages with a checksum marked OK or OK2 in the + client whitelist. This is intended to allow a DCC client + to keep private mail so private that even its checksums + are not disclosed. + <I>MX</I> <I>IP-address-or-hostname</I> + <I>MXDCC</I> <I>IP-address-or-hostname</I> + mark an address or block of addresses of trusted mail + relays including MX servers, smart hosts, and bastion or + DMZ relays. The DCC clients <B><A HREF="dccm.html">dccm(8)</A></B>, <B><A HREF="dccifd.html">dccifd(8)</A></B>, and + <B><A HREF="dccproc.html">dccproc(8)</A></B> parse and skip initial Received: headers added + by listed MX servers to determine the external sources of + mail messages. Unsolicited bulk mail that has been for- + warded through listed addresses is discarded by <B><A HREF="dccm.html">dccm(8)</A></B> + and <B><A HREF="dccifd.html">dccifd(8)</A></B> as if with <B>-a</B> <I>DISCARD</I> instead of rejected. + <I>MXDCC</I> marks addresses that are MX servers that run DCC + clients. The checksums for a mail message that has been + forwarded through an address listed as MXDCC are queried + instead of reported. + <I>SUBMIT</I> <I>IP-address-or-hostname</I> + marks an IP address or block of addresses of SMTP submission + clients such as web browsers that cannot tolerate 4yz + temporary rejections but that cannot be trusted to not + send spam. Since they are local addresses, DCC Reputa- + tions are not computed for them. + + <I>value</I> in <I>count</I> <I>value</I> lines can be + <I>dest-mailbox</I> + is an RFC 821 address or a local user name. + <I>821-path</I> + is an RFC 821 address. + <I>822-mailbox</I> + is an RFC 822 address with optional name. 
+ <I>Substitute</I> <I>header</I> + is the name of an SMTP header such as "Sender" or the + name of one of two SMTP envelope values, "HELO," or + "Mail_Host" for the resolved host name from the <I>821-path</I> + in the message. + <I>Hex</I> <I>ctype</I> <I>cksum</I> + starts with the string <I>Hex</I> followed by a checksum type, and + a string of four hexadecimal numbers obtained from a DCC + log file or the <B><A HREF="dccproc.html">dccproc(8)</A></B> command using <B>-CQ</B>. The check- + sum type is <I>body</I>, <I>Fuz1</I>, or <I>Fuz2</I> or one of the preceding + checksum types such as <I>env</I><B>_</B><I>From</I>. + <I>IP-address</I> + is a host name, IPv4 or IPv6 address, or a block of IP + addresses in the standard xxx/mm form with mm limited for + server whitelists to 16 for IPv4 or 112 for IPv6. There + can be at most 64 CIDR blocks in a client <I>whiteclnt</I> file. + A host name is converted to IP addresses with DNS, + <I>/etc/hosts</I> or other mechanisms and one checksum for each + address is added to the whitelist. + + <I>option</I> <I>setting</I> + can only be in a DCC client <I>whiteclnt</I> file used by <B><A HREF="dccifd.html">dccifd(8)</A></B>, + <B><A HREF="dccm.html">dccm(8)</A></B> or <B><A HREF="dccproc.html">dccproc(8)</A></B>. Settings in per-user whiteclnt files + override settings in the global file. <I>Setting</I> can be any of the + following: + <I>option</I> <I>log-all</I> + to log all mail messages. + <I>option</I> <I>log-normal</I> + to log only messages that meet the logging thresholds. + <I>option</I> <I>log-subdirectory-day</I> + <I>option</I> <I>log-subdirectory-hour</I> + <I>option</I> <I>log-subdirectory-minute</I> + creates log files containing mail messages in subdirecto- + ries of the form <I>JJJ</I>, <I>JJJ/HH</I>, or <I>JJJ/HH/MM</I> where <I>JJJ</I> is the + current julian day, <I>HH</I> is the current hour, and <I>MM</I> is the + current minute. 
See also the <B>-l</B> <I>logdir</I> option for <B><A HREF="dccm.html">dccm(8)</A></B>, + <B><A HREF="dccifd.html">dccifd(8)</A></B>, and <B><A HREF="dccproc.html">dccproc(8)</A></B>. + <I>option</I> <I>dcc-on</I> + <I>option</I> <I>dcc-off</I> + Control DCC filtering. See the discussion of <B>-W</B> for + <B><A HREF="dccm.html">dccm(8)</A></B> and <B><A HREF="dccifd.html">dccifd(8)</A></B>. + <I>option</I> <I>greylist-on</I> + <I>option</I> <I>greylist-off</I> + to control greylisting. Greylisting for other recipients + in the same SMTP transaction can still cause greylist tem- + porary rejections. <I>greylist-off</I> in the main whiteclnt + file. + <I>option</I> <I>greylist-log-on</I> + <I>option</I> <I>greylist-log-off</I> + to control logging of greylisted mail messages. + <I>option</I> <I>DCC-rep-off</I> + <I>option</I> <I>DCC-rep-on</I> + to honor or ignore DCC Reputations computed by the DCC + server. + <I>option</I> <I>DNSBL1-off</I> + <I>option</I> <I>DNSBL1-on</I> + <I>option</I> <I>DNSBL2-off</I> + <I>option</I> <I>DNSBL2-on</I> + <I>option</I> <I>DNSBL3-off</I> + <I>option</I> <I>DNSBL3-on</I> + honor or ignore results of DNS blacklist checks configured + with <B>-B</B> for <B><A HREF="dccm.html">dccm(8)</A></B>, <B><A HREF="dccifd.html">dccifd(8)</A></B>, and <B><A HREF="dccproc.html">dccproc(8)</A></B>. + <I>option</I> <I>MTA-first</I> + <I>option</I> <I>MTA-last</I> + consider MTA determinations of spam or not-spam first so + they can be overridden by <I>whiteclnt</I> files, or last so that + they can override <I>whiteclnt</I> <I>files.</I> + <I>option</I> <I>forced-discard-ok</I> + <I>option</I> <I>no-forced-discard</I> + control whether <B><A HREF="dccm.html">dccm(8)</A></B> and <B><A HREF="dccifd.html">dccifd(8)</A></B> are allowed to dis- + card a message for one mailbox for which it is spam when it + is not spam and must be delivered to another mailbox. 
This + can happen if a mail message is addressed to two or more + mailboxes with differing whitelists. Discarding can be + undesirable because false positives are not communicated to + mail senders. To avoid discarding, <B><A HREF="dccm.html">dccm(8)</A></B> and <B><A HREF="dccifd.html">dccifd(8)</A></B> + running in proxy mode temporarily reject SMTP envelope <I>Rcpt</I> + <I>To</I> values that involve differing <I>whiteclnt</I> files. + <I>option</I> <I>threshold</I> <I>type,rej-thold</I> + has the same effects as <B>-c</B> <I>type,rej-thold</I> for <B><A HREF="dccproc.html">dccproc(8)</A></B> or + <B>-t</B> <I>type,rej-thold</I> for <B><A HREF="dccm.html">dccm(8)</A></B> and <B><A HREF="dccifd.html">dccifd(8)</A></B>. It is useful + only in per-user whiteclnt files to override the global DCC + checksum thresholds. + <I>option</I> <I>spam-trap-accept</I> + <I>option</I> <I>spam-trap-reject</I> + say that mail should be reported to the DCC server as + extremely bulk or with target counts of <I>MANY</I>. Greylisting, + DNS blacklist (DNSBL), and other checks are turned off. + <I>Spam-trap-accept</I> tells the MTA to accept the message while + <I>spam-trap-reject</I> tells the MTA to reject the message. Use + <I>Spam-trap-accept</I> for spam traps that should not be dis- + closed. <I>Spam-trap-reject</I> can be used on <I>catch-all</I> mail- + boxes that might receive legitimate mail by typographical + errors and that senders should be told about. 
+ + In the absence of explicit settings, the default in the main + whiteclnt file is equivalent to + <I>option</I> <I>log-normal</I> + <I>option</I> <I>dcc-on</I> + <I>option</I> <I>greylist-on</I> + <I>option</I> <I>greylist-log-on</I> + <I>option</I> <I>DCC-rep-off</I> + <I>option</I> <I>DNSBL1-off</I> + <I>option</I> <I>DNSBL2-off</I> + <I>option</I> <I>DNSBL3-off</I> + <I>option</I> <I>MTA-last</I> + <I>option</I> <I>no-forced-discard</I> + The defaults for individual recipient <I>whiteclnt</I> files are the + same except as changed by explicit settings in the main file. + + Checksums of the IP address of the SMTP client sending a mail message are + practically unforgeable, because it is impractical for an SMTP client to + "spoof" its address or pretend to use some other IP address. That would + make the IP address of the sender useful for whitelisting, except that + the IP address of the SMTP client is often not available to users of + <B><A HREF="dccproc.html">dccproc(8)</A></B>. In addition, legitimate mail relays make whitelist entries + for IP addresses of little use. For example, the IP address from which a + message arrived might be that of a local relay instead of the home + address of a whitelisted mailing list. + + Envelope and header <I>From</I> values can be forged, so whitelist entries for + their checksums are not entirely reliable. + + Checksums of <I>env</I><B>_</B><I>To</I> values are never sent to DCC servers. They are valid + in only <I>whiteclnt</I> files and used only by <B><A HREF="dccm.html">dccm(8)</A></B>, <B><A HREF="dccifd.html">dccifd(8)</A></B>, and + <B><A HREF="dccproc.html">dccproc(8)</A></B> when the envelope <I>Rcpt</I> <I>To</I> value is known. + + <A NAME="Greylists"><B>Greylists</B></A> + The DCC server, <B><A HREF="dccd.html">dccd(8)</A></B>, can be used to maintain a greylist database for + some DCC clients including <B><A HREF="dccm.html">dccm(8)</A></B> and <B><A HREF="dccifd.html">dccifd(8)</A></B>. 
Greylisting involves + temporarily refusing mail from unfamiliar SMTP clients and is unrelated + to filtering with a Distributed Checksum Clearinghouse. + See http://projects.puremagic.com/greylisting/ + + <A NAME="Privacy"><B>Privacy</B></A> + Because sending mail is a less private act than receiving it, and because + sending bulk mail is usually not private at all and cannot be very pri- + vate, the DCC tries first to protect the privacy of mail recipients, and + second the privacy of senders of mail that is not bulk. + + DCC clients necessarily disclose some information about mail they have + received. The DCC database contains checksums of mail bodies, header + lines, and source addresses. While it contains significantly less infor- + mation than is available by "snooping" on Internet links, it is important + that the DCC database be treated as containing sensitive information and + to not put the most private information in the DCC database. Given the + contents of a message, one might determine whether that message has been + received by a system that subscribes to the DCC. Guesses about the + sender and addressee of a message can also be validated if the checksums + of the message have been sent to a DCC server. + + Because the DCC is distributed, organizations can operate their own DCC + servers, and configure them to share or "flood" only the checksums of + bulk mail that is not in local whitelists. + + DCC clients should not report the checksums of messages known to be pri- + vate to a DCC server. For example, checksums of messages local to a sys- + tem or that are otherwise known a priori to not be unsolicited bulk + should not be sent to a remote DCC server. This can be accomplished by + adding entries for the sender to the client's local whitelist file. + Client whitelist files can also include entries for email recipients + whose mail should not be reported to a DCC server. 
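The whitelist entries mentioned above use the "count value", include, and option line forms listed under White and Blacklists. The following is a minimal sketch in Python of reading such a file. It is an illustration only, not the parser used by the DCC programs: the function name, the tuple layout, and every address in the sample are made up, and details such as case sensitivity, MX, MXDCC, and SUBMIT lines are deliberately left out.

```python
# Keywords that start a "count value" line, as listed in the whitelist format.
COUNT_WORDS = ("MANY", "OK", "OK2")

# Hypothetical whiteclnt contents; every address below is a placeholder.
SAMPLE_WHITECLNT = """\
# mail that should not be reported to a DCC server
OK   env_From  listmaster@lists.example.org
OK   ip        192.0.2.25
OK   env_To    postmaster
# a commonly forged sender treated as bulk
MANY From      user@public.com
option log-normal
"""


def parse_whitelist(text):
    """Classify whitelist lines; blank lines and "#" comments are skipped."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        keyword = parts[0]
        rest = parts[1] if len(parts) == 2 else ""
        if keyword in COUNT_WORDS:
            # "count type value", for example "OK env_From address"
            sub = rest.split(None, 1)
            if len(sub) == 2:
                entries.append(("count", keyword, sub[0], sub[1]))
                continue
        elif keyword in ("include", "option") and rest:
            entries.append((keyword, rest))
            continue
        entries.append(("unrecognized", line))
    return entries


if __name__ == "__main__":
    for entry in parse_whitelist(SAMPLE_WHITECLNT):
        print(entry)
```

Unrecognized lines are reported rather than silently dropped, so a typo in a privacy-related entry does not go unnoticed.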
+ + <A NAME="Security"><B>Security</B></A> + Whenever considering security, one must first consider the risks. The + worst DCC security problems are unauthorized commands to a DCC service, + denial of the DCC service, and corruption of DCC data. The worst that + can be done with remote commands to a DCC server is to turn it off or + otherwise cause it to stop responding. The DCC is designed to fail + gracefully, so that a denial of service attack would at worst allow + delivery of mail that would otherwise be rejected. Corruption of DCC + data might at worst cause mail that is already somewhat "bulk" by virtue + of being received by two or more people to appear to have higher recipient + numbers. Since DCC users <I>must</I> whitelist all sources of legitimate bulk + mail, this is also not a concern. Such security risks should be + addressed, but only with defenses that don't cost more than the possible + damage from an attack. + + The DCC must contend with senders of unsolicited bulk mail who resort to + unlawful actions to express their displeasure at having their advertising + blocked. Because the DCC protocol is based on UDP, an unhappy advertiser + could try to flood a DCC server with packets supposedly from subscribers + or non-subscribers. DCC servers defend against that attack by rate-lim- + iting requests from anonymous users. + + Also because of the use of UDP, clients must be protected against forged + answers to their queries. Otherwise an unsolicited bulk mail advertiser + could send a stream of "not spam" answers to an SMTP client while simul- + taneously sending mail that would otherwise be rejected. This is not a + problem for authenticated clients of the DCC because they share a secret + with the DCC. Unauthenticated, anonymous DCC clients do not share any + secrets with the DCC, except for unique and unpredictable bits in each + query or report sent to the DCC. 
Therefore, DCC servers cryptographi- + cally sign answers to unauthenticated clients with bits from the corre- + sponding queries. This protects against attackers that do not have + access to the stream of packets from the DCC client. + + The passwords or shared secrets used in the DCC client and server pro- + grams are "cleartext" for several reasons. In any shared secret authen- + tication system, at least one party must know the secret or keep the + secret in cleartext. You could encrypt the secrets in a file, but + because they are used by programs, you would need a cleartext copy of the + key to decrypt the file somewhere in the system, making such a scheme + more expensive but no more secure than a file of cleartext passwords. + Asymmetric systems such as that used in UNIX allow one party to not know + the secrets, but they must be and are designed to be computationally + expensive when used in applications like the DCC that involve thousands + or more authentication checks per second. Moreover, because of "dictio- + nary attacks," asymmetric systems are now little more secure than keeping + passwords in cleartext. An adversary can compare the hash values of com- + binations of common words with /etc/passwd hash values to look for bad + passwords. Worse, by the nature of a client/server protocol like that + used in the DCC, clients must have the cleartext password. Since it is + among the more numerous and much less secure clients that adversaries + would seek files of DCC passwords, it would be a waste to complicate the + DCC server with an asymmetric system. + + The DCC protocol is vulnerable to dictionary attacks to recover pass- + words. An adversary could capture some DCC packets, and then check to + see if any of the 100,000 to 1,000,000 passwords in so called "cracker + dictionaries" applied to a packet generated the same signature. 
This is + a concern only if DCC passwords are poorly chosen, such as any combina- + tion of words in an English dictionary. There are ways to prevent this + vulnerability regardless of how badly passwords are chosen, but they are + computationally expensive and require additional network round trips. + Since DCC passwords are created and typed into files once and do not need + to be remembered by people, it is cheaper and quite easy to simply choose + good passwords that are not in dictionaries. + + <A NAME="Reliability"><B>Reliability</B></A> + It is better to fail to filter unsolicited bulk mail than to fail to + deliver legitimate mail, so DCC clients fail in the direction of assuming + that mail is legitimate or even whitelisted. + + A DCC client sends a report or other request and waits for an answer. If + no answer arrives within a reasonable time, the client retransmits. + There are many things that might result in the client not receiving an + answer, but the most important is packet loss. If the client's request + does not reach the server, it is easy and harmless for the client to + retransmit. If the client's request reached the server but the server's + response was lost, a retransmission to the same server would be misunder- + stood as a new report of another copy of the same message unless it is + detected as a retransmission by the server. The DCC protocol includes + transaction identifiers for this purpose. If the client retransmitted + to a second server, the retransmission would be misunderstood by the sec- + ond server as a new report of the same message. + + Each request from a client includes a timestamp to aid the client in mea- + suring the round trip time to the server and to let the client pick the + closest server. Clients monitor the speed of all of the servers they + know including those they are not currently using, and use the quickest. 
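The role of transaction identifiers in the retransmission case above can be illustrated with a small model. This is a minimal sketch in Python of the general idea only. It does not reproduce the DCC wire protocol or the dccd(8) database, and the class name, method name, and key layout are assumptions. A report that repeats an already answered (client, transaction identifier) pair gets the earlier answer back and is not counted again.

```python
class ChecksumCounter:
    """Toy model of a server that counts recipients per checksum."""

    def __init__(self):
        self.totals = {}    # checksum -> accumulated recipient count
        self.answers = {}   # (client, transaction id) -> answer already sent

    def report(self, client, transaction_id, checksum, recipients):
        """Record a report and return the total for the checksum.

        A retransmission of an answered request returns the stored answer
        without adding its recipients a second time.
        """
        key = (client, transaction_id)
        if key in self.answers:
            return self.answers[key]
        total = self.totals.get(checksum, 0) + recipients
        self.totals[checksum] = total
        self.answers[key] = total
        return total


if __name__ == "__main__":
    server = ChecksumCounter()
    print(server.report("client-a", 1, "abc123", 5))  # first report
    print(server.report("client-a", 1, "abc123", 5))  # retransmission, same answer
    print(server.report("client-a", 2, "abc123", 5))  # genuinely new report
```

The model also shows why retransmitting to a second server is different: a second ChecksumCounter instance has no record of the transaction and would count the report as new. A real server would also have to expire old entries from the answer cache, which this sketch never does.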
+ + <A NAME="Client-and-Server-IDs"><B>Client and Server-IDs</B></A> + Servers and clients use numbers or IDs to identify themselves. ID 1 is + reserved for anonymous, unauthenticated clients. All other IDs are asso- + ciated with a pair of passwords in the <I>ids</I> file, the current and next or + previous and current passwords. Clients included their client IDs in + their messages. When they are not using the anonymous ID, they sign + their messages to servers with the first password associated with their + client-ID. Servers treat messages with signatures that match neither of + the passwords for the client-ID in their own <I>ids</I> file as if the client + had used the anonymous ID. + + Each server has a unique <I>server-ID</I> less than 32768. Servers use their + IDs to identify checksums that they <I>flood</I> to other servers. Each server + expects local clients sending administrative commands to use the server's + ID and sign administrative commands with the associated password. + + Server-IDs must be unique among all systems that share reports by "flood- + ing." All servers must be told of the IDs all other servers whose + reports can be received in the local <I>@prefix@/flod</I> file described in + <B><A HREF="dccd.html">dccd(8)</A></B>. However, server-IDs can be mapped during flooding between inde- + pendent DCC organizations. + + <I>Passwd-IDs</I> are server-IDs that should not be assigned to servers. They + appear in the often publicly readable <I>@prefix@/flod</I> and specify passwords + in the private <I>@prefix@/ids</I> file for the inter-server flooding protocol + + The client identified by a <I>client-ID</I> might be a single computer with a + single IP address, a single but multi-homed computer, or many computers. + Client-IDs are not used to identify checksum reports, but the organiza- + tion operating the client. A client-ID need only be unique among clients + using a single server. 
A single client can use different client-IDs for + different servers, each client-ID authenticated with a separate password. + + An obscure but important part of all of this is that the inter-server + flooding algorithm depends on server-IDs and timestamps attached to + reports of checksums. The inter-server flooding mechanism requires coop- + erating DCC servers to maintain reasonable clocks ticking in UTC. + Clients include timestamps in their requests, but as long as their time- + stamps are unlikely to be repeated, they need not be very accurate. + + <A NAME="Installation-Considerations"><B>Installation Considerations</B></A> + DCC clients on a computer share information about which servers are cur- + rently working and their speeds in a shared memory segment. This segment + also contains server host names, IP addresses, and the passwords needed + to authenticate known clients to servers. That generally requires that + <B><A HREF="dccm.html">dccm(8)</A></B>, <B><A HREF="dccproc.html">dccproc(8)</A></B>, <B><A HREF="dccifd.html">dccifd(8)</A></B>, and <B><A HREF="cdcc.html">cdcc(8)</A></B> execute with an UID that can + write to the DCC home directory and its files. The sendmail interface, + dccm, is a daemon that can be started by an "rc" or other script already + running with the correct UID. The other two, dccproc and cdcc need to be + set-UID because they are used by end users. They relinquish set-UID + privileges when not needed. + + Files that contain cleartext passwords including the shared file used by + clients must be readable only by "owner." + + The data files required by a DCC can be in a single "home" directory, + <I>@prefix@</I>. Distinct DCC servers can run on a single computer, provided + they use distinct UDP port numbers and home directories. It is possible + and convenient for the DCC clients using a server on the same computer to + use the same home directory as the server. + + The DCC source distribution includes sample control files. 
They should + be modified appropriately and then copied to the DCC home directory. + Files that contain cleartext passwords must not be publicly readable. + + The DCC source includes "feature" m4 files to configure sendmail to use + <B><A HREF="dccm.html">dccm(8)</A></B> to query a DCC server about incoming mail. + + See also the <A HREF="INSTALL.html">INSTALL.html</A> file. + + <A NAME="Client-Installation"><B>Client Installation</B></A> + Installing a DCC client starts with obtaining or compiling program bina- + ries for the client server data control tool, <B><A HREF="cdcc.html">cdcc(8)</A></B>. Installing the + sendmail DCC interface, <B><A HREF="dccm.html">dccm(8)</A></B>, or <B><A HREF="dccproc.html">dccproc(8)</A></B>, the general or + <B>procmail(1)</B> interface, is the main part of the client installation. Con- + necting the DCC to sendmail with dccm is the most powerful option, but requires + administrative control of the system running sendmail. + + As noted above, cdcc and dccproc should be set-UID to a suitable UID. + Root or 0 is thought to be safe for both, because they are careful to + release privileges except when they need them to read or write files in + the DCC home directory. A DCC home directory, <I>@prefix@</I>, should be cre- + ated. It must be owned and writable by the UID to which cdcc is set. + + After the DCC client programs have been obtained, contact the operator(s) + of the chosen DCC server(s) to obtain each server's hostname, port num- + ber, and a <I>client-ID</I> and corresponding password. No client-IDs or pass- + words are needed to use DCC servers that allow anonymous clients. Use the + <I>load</I> or <I>add</I> commands of cdcc to create a <I>map</I> file in the DCC home direc- + tory. It is usually necessary to create a client whitelist file of the + format described above. 
To accommodate users who share a computer but + disagree about which bulk mail is solicited, the client whitelist file can be + any valid path name and need not be in the DCC home directory. + + If dccm is chosen, arrange to start it with suitable arguments before + sendmail is started. See the <I>homedir/dcc</I><B>_</B><I>conf</I> file and the <I>misc/rcDCC</I> + script in the DCC source. The procmail DCC interface, <B><A HREF="dccproc.html">dccproc(8)</A></B>, can + be run manually or by a <B>procmailrc(5)</B> rule. + + <A NAME="Server-Installation"><B>Server Installation</B></A> + The DCC server, <B><A HREF="dccd.html">dccd(8)</A></B>, also requires that the DCC home directory exist. + It does not use the client shared or memory-mapped file of server + addresses, but it requires other files. One is the <I>@prefix@/ids</I> file of + client-IDs, server-IDs, and corresponding passwords. Another is a <I>flod</I> + file of peers that send and receive floods of reports of checksums with + large counts. Both files are described in <B><A HREF="dccd.html">dccd(8)</A></B>. + + The server daemon should be started when the system is rebooted, probably + before sendmail. See the <I>misc/rcDCC</I> and <I>misc/start-dccd</I> files in the DCC + source. + + The database should be cleaned regularly with <B><A HREF="dbclean.html">dbclean(8)</A></B>, for example by + running the crontab job that is in the misc directory. + + +</PRE> +<H2><A NAME="SEE-ALSO">SEE ALSO</A></H2><PRE> + <B><A HREF="cdcc.html">cdcc(8)</A></B>, <B><A HREF="dbclean.html">dbclean(8)</A></B>, <B><A HREF="dcc.html">dcc(8)</A></B>, <B><A HREF="dccd.html">dccd(8)</A></B>, <B><A HREF="dccifd.html">dccifd(8)</A></B>, <B><A HREF="dccm.html">dccm(8)</A></B>, <B><A HREF="dccproc.html">dccproc(8)</A></B>, + <B><A HREF="dblist.html">dblist(8)</A></B>, <B><A HREF="dccsight.html">dccsight(8)</A></B>, <B>sendmail(8)</B>. 
+ + +</PRE> +<H2><A NAME="HISTORY">HISTORY</A></H2><PRE> + Distributed Checksum Clearinghouses are based on an idea of Paul Vixie + with code designed and written at Rhyolite Software starting in 2000. + This document describes version 1.3.103. + + February 26, 2009 +</PRE> +<HR> +<ADDRESS> +Man(1) output converted with +<a href="http://www.oac.uci.edu/indiv/ehood/man2html.html">man2html</a> +modified for the DCC $Date 2001/04/29 03:22:18 $ +<BR> +<A HREF="http://www.dcc-servers.net/dcc/"> + <IMG SRC="http://logos.dcc-servers.net/border.png" + class=logo ALT="DCC logo"> + </A> +<A HREF="http://validator.w3.org/check?uri=referer"> + <IMG class=logo ALT="Valid HTML 4.01 Strict" + SRC="http://www.w3.org/Icons/valid-html401"> + </A> +</ADDRESS> +</BODY> +</HTML>