Mercurial > notdcc
diff dcc.8.in @ 0:c7f6b056b673
First import of vendor version
author | Peter Gervai <grin@grin.hu> |
---|---|
date | Tue, 10 Mar 2009 13:49:58 +0100 |
parents | |
children |
line wrap: on
line diff
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/dcc.8.in Tue Mar 10 13:49:58 2009 +0100 @@ -0,0 +1,967 @@ +.\" Copyright (c) 2008 by Rhyolite Software, LLC +.\" +.\" This agreement is not applicable to any entity which sells anti-spam +.\" solutions to others or provides an anti-spam solution as part of a +.\" security solution sold to other entities, or to a private network +.\" which employs the DCC or uses data provided by operation of the DCC +.\" but does not provide corresponding data to other users. +.\" +.\" Permission to use, copy, modify, and distribute this software without +.\" changes for any purpose with or without fee is hereby granted, provided +.\" that the above copyright notice and this permission notice appear in all +.\" copies and any distributed versions or copies are either unchanged +.\" or not called anything similar to "DCC" or "Distributed Checksum +.\" Clearinghouse". +.\" +.\" Parties not eligible to receive a license under this agreement can +.\" obtain a commercial license to use DCC by contacting Rhyolite Software +.\" at sales@rhyolite.com. +.\" +.\" A commercial license would be for Distributed Checksum and Reputation +.\" Clearinghouse software. That software includes additional features. This +.\" free license for Distributed ChecksumClearinghouse Software does not in any +.\" way grant permision to use Distributed Checksum and Reputation Clearinghouse +.\" software +.\" +.\" THE SOFTWARE IS PROVIDED "AS IS" AND RHYOLITE SOFTWARE, LLC DISCLAIMS ALL +.\" WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES +.\" OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL RHYOLITE SOFTWARE, LLC +.\" BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES +.\" OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, +.\" WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, +.\" ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS +.\" SOFTWARE. +.\" +.\" Rhyolite Software DCC 1.3.103-1.112 $Revision$ +.\" +.Dd February 26, 2009 +.ds volume-ds-DCC Distributed Checksum Clearinghouse +.Dt DCC 8 DCC +.Os " " +.Sh NAME +.Nm DCC +.Nd Distributed Checksum Clearinghouse +.Sh DESCRIPTION +The Distributed Checksum Clearinghouse or +.Nm +is a cooperative, distributed +system intended to detect "bulk" mail or mail sent to many people. +It allows individuals receiving a single mail message to determine +that many +other people have received essentially identical copies of the message +and so reject or discard the message. +.Pp +Source for the server, client, and utilities +is available at Rhyolite Software, LLC, http://www.rhyolite.com/dcc/ +It is free for organizations that do not sell spam or virus filtering +services. +.Ss How the DCC Is Used +The DCC can be viewed as a tool for end users to enforce their +right to "opt-in" to streams of bulk mail +by refusing bulk mail except from sources in a "whitelist." +Whitelists are the responsibility of DCC clients, +since only they know which bulk mail they solicited. +.Pp +False positives or mail marked as bulk by a DCC server that +is not bulk occur only when a recipient of a message reports it +to a DCC server as having been received many times +or when the "fuzzy" checksums of differing messages are the same. +The fuzzy checksums ignore aspects of messages in order to compute +identical checksums for substantially identical messages. +The fuzzy checksums are designed to ignore only +differences that do not affect meanings. +So in practice, you do not need to worry about DCC false positive indications +of "bulk," but not all bulk mail is unsolicited bulk mail or spam. +You must either use whitelists to distinguish solicited from unsolicited bulk +mail +or only use DCC indications of "bulk" as part of a scoring system such +as SpamAssassin. +Besides unsolicited bulk email or spam, +bulk messages include legitimate mail such as +order confirmations from merchants, +legitimate mailing lists, +and empty or test messages. +.Pp +A DCC server estimates the number copies of a +message by counting checksums reported by DCC clients. +Each client must decide which +bulk messages are unsolicited and what degree of "bulkiness" is objectionable. +Client DCC software marks, rejects, or discards mail that is bulk +according to local thresholds on target addresses from DCC servers +and unsolicited according to local whitelists. +.Pp +DCC servers are usually configured to receive reports from as many targets +as possible, including sources that cannot be trusted to not exaggerate the +number of copies of a message they see. +A user of a DCC client angry about receiving a message could report it with +1,000,000 separate DCC reports +or with a single report claiming 1,000,000 targets. +An unprincipled user could subscribe a "spam trap" to mailing lists +such as those of the IETF or CERT. +Such abuses of the system area not problems, +because much legitimate mail is "bulk." +You cannot reject bulk mail unless you have a whitelist of sources +of legitimate bulk mail. +.Pp +DCC can also be used by an Internet service provider to detect bulk +mail coming from its own customers. +In such circumstances, the DCC client might be configured to only log +bulk mail from unexpected (not whitelisted) customers. +.Ss What the DCC Is +A DCC server accumulates counts of cryptographic checksums of +messages but not the messages themselves. +It exchanges reports of frequently seen checksums with other servers. +DCC clients send reports of checksums related to incoming mail to +a nearby DCC server running +.Xr dccd 8 . +Each report from a client includes the number of recipients for the message. +A DCC server accumulates the reports and responds to clients the +the current total number of recipients for each checksum. +The client adds an SMTP header to incoming mail containing the total +counts. +It then discards or rejects mail that is not whitelisted and has +counts that exceed local thresholds. +.Pp +A special value of the number of addressees is "MANY" and means +it is certain that this message was bulk and might be unsolicited, +perhaps because it came from a locally blacklisted source or was +addressed to an invalid address or "spam trap." +The special value "MANY" is merely the largest value +that fits in the fixed sized field containing the count of addressees. +That "infinity" accumulated total can be reached with millions of +independent reports as well as with one or two. +.Pp +DCC servers +.Em flood +or send +reports of checksums of bulk mail to neighboring servers. +.Pp +To keep a server's database of checksums from growing without bound, +checksums are forgotten when they become old. +Checksums of bulk mail are kept longer. +See +.Xr dbclean 8 . +.Pp +DCC clients pick the nearest working DCC server using a small shared +or memory mapped file, +.Pa @prefix@/map . +It contains server names, port numbers, passwords, recent performance +measures, and so forth. +This file allows clients to use quick retransmission timeouts +and to waste little time on servers that have temporarily +stopped working or become unreachable. +The utility program +.Xr cdcc 8 +is used to maintain this file as well as to check the health of servers. +.Ss X-DCC Headers +The DCC software includes several programs used by clients. +.Xr Dccm 8 +uses the sendmail "milter" interface to query a DCC server, +add header lines to incoming mail, +and reject mail whose total checksum counts are high. +Dccm is intended to be run with SMTP servers using sendmail. +.Pp +.Xr Dccproc 8 +adds header lines to mail presented by file name or +.Pa stdin , +but relies on other programs +such as procmail to deal with mail with large counts. +.Xr Dccsight 8 +is similar but deals with previously computed checksums. +.Pp +.Xr Dccifd 8 +is similar to dccproc but is not run separately for each mail message +and so is far more efficient. +It receives mail messages via a socket somewhat like dccm, +but with a simpler protocol that can be used by Perl scripts +or other programs. +.Pp +DCC SMTP header lines are of one of the forms: +.Bd -literal -offset 2n +X-DCC-brand-Metrics: client server-ID; bulk cknm1=count cknm2=count ... +X-DCC-brand-Metrics: client; whitelist +.Ed +where +.Bl -hang -offset 3n -compact +.It Em whitelist +appears if the global or per-user +.Pa whiteclnt +file marks the message as good. +.It Em brand +is the "brand name" of the DCC server, such as "RHYOLITE". +.It Em client +is the name or IP address of the DCC client that added the +header line to the SMTP message. +.It Em server-ID +is the numeric ID of the DCC server that the DCC client contacted. +.It Em bulk +is present if one or more checksum counts exceeded the DCC client's +thresholds to make the message "bulky." +.It Em bulk rep +is present if the DCC reputation of the IP address of the sender is bad. +.It Em cknm1 , Ns Em cknm2 , Ns ... +are types of checksums: +.Bl -hang -offset 2n -width "Message-IDx" -compact +.It Em IP +address of SMTP client +.It Em env_From +SMTP envelope value +.It Em From +SMTP header line +.It Em Message-ID +SMTP header line +.It Em Received +last Received: header line in the SMTP message +.It Em substitute +SMTP header line chosen by the DCC client, prefixed with the name of +the header +.It Em Body +SMTP body ignoring white-space +.It Em Fuz1 +filtered or "fuzzy" body checksum +.It Em Fuz2 +another filtered or "fuzzy" body checksum +.It Em rep +DCC reputation of the mail sender or the estimated +probability that the message is bulk. +.El +Counts for +.Em IP , env_From , From , +.Em Message-Id , Received , +and +.Em substitute +checksums are omitted by the DCC client if the server +says it has no information. +Counts for +.Em Fuz1 +and +.Em Fuz2 +are omitted if the message body is empty or +contains too little of the right kind of information +for the checksum to be computed. +.It Em count +is the total number of recipients of messages with that +checksum reported directly or indirectly to the DCC server. +The special count "MANY" means that DCC client have claimed that +the message is directed at millions of recipients. +"MANY" imples the message is definitely bulk, but not necessarily unsolicited. +The special counts "OK" and "OK2" mean the checksum has been +marked "good" or "half-good" by DCC servers. +.El +.Pp +.Ss Mailing lists +Legitimate mailing list traffic differs from spam only in being solicited +by recipients. +Each client should have a private whitelist. +.Pp +DCC whitelists can also mark mail as unsolicited bulk using +blacklist entries for commonly forged values such as "From: user@public.com". +.Ss White and Blacklists +DCC server and client whitelist files share a common format. +Server files are always named +.Pa whitelist +and one is required to be in the DCC home directory +with the other server files. +Client whitelist files are +named +.Pa whiteclnt +in the DCC home directory or a subdirectory specified with the +.Fl U +option for +.Xr dccm 8 . +They specify mail that should not be reported to a DCC server or that is +always unsolicited and almost certainly bulk. +.Pp +A DCC whitelist file contains blank lines, comments starting +with "#", +and lines of the following forms: +.Bl -tag -offset 2n -width 4n -compact +.It Ar include file +Copies the contents of +.Ar file +into the whitelist. +It can occur only in the main whitelist or whiteclnt file and not in an +included file. +The file name should be absolute or relative to the DCC home directory. +.Pp +.It Ar count Em value +lines specify checksums that should be white- or blacklisted. +.Bl -inset -offset 2n -compact +.It Ar count Em env_From Ar 821-path +.It Ar count Em env_To Ar dest-mailbox +.It Ar count Em From Ar 822-mailbox +.It Ar count Em Message-ID Ar <string> +.It Ar count Em Received Ar string +.It Ar count Em Substitute Ar header string +.It Ar count Ar Hex ctype cksum +.It Ar count Em ip Ar IP-address +.El +.Pp +.Bl -tag -offset 2n -width 4n -compact +.It Ar MANY Em value +indicates that millions of targets have received messages with +the header, IP address, or checksum +.Em value . +.It Ar OK Em value +.It Ar OK2 Em value +say that messages with +the header, IP address, or checksum +.Em value +are OK and should not reported to DCC servers +or be greylisted. +.Ar OK2 +says that the message is "half OK." +Two +.Ar OK2 +checksums associated with a message are equivalent to one +.Ar OK . +.br +A DCC server never shares or +.Em floods +reports containing checksums +marked in its whitelist with OK or OK2 to other servers. +A DCC client does not report or ask its server about messages +with a checksum marked OK or OK2 in the client whitelist. +This is intended to allow a DCC client to keep private mail +so private that even its checksums are not disclosed. +.It Ar MX Em IP-address-or-hostname +.It Ar MXDCC Em IP-address-or-hostname +mark an address or block of addresses of trust mail relays including +MX servers, smart hosts, and bastion or DMZ relays. +The DCC clients +.Xr dccm 8 , +.Xr dccifd 8 , +and +.Xr dccproc 8 +parse and skip initial Received: headers added by listed MX servers to +determine the external sources of mail messages. +Unsolicited bulk mail that has been forwarded through listed addresses +is discarded by +.Xr dccm 8 +and +.Xr dccifd 8 +as if with +.Fl a Ar DISCARD +instead of rejected. +.Ar MXDCC +marks addresses that are MX servers that run DCC clients. +The checksums for a mail message that has been forwarded through +an address listed as MXDCC +queried instead of reported. +.It Ar SUBMIT Em IP-address-or-hostname +marks an IP address or block addresses of SMTP submission clients +such as web browsers +that cannot tolerate 4yz temporary rejections +but that cannot be trusted to not send spam. +Since they are local addresses, DCC Reputations are not computed for them. +.El +.Pp +.Ar value +in +.Ar count Em value +lines can be +.Bl -tag -offset 2n -width 4n -compact +.It Ar dest-mailbox +is an RFC\ 821 address or a local user name. +.It Ar 821-path +is an RFC\ 821 address. +.It Ar 822-mailbox +is an RFC\ 822 address with optional name. +.It Em Substitute Ar header +is the name of an SMTP header such as "Sender" or +the name of one of two SMTP envlope values, "HELO," or +"Mail_Host" for the resolved host name from the +.Ar 821-path +in +the message. +.It Ar Hex ctype cksum +starts with the string +.Em Hex +followed a checksum type, and +a string of four hexadecimal numbers obtained from a DCC log file +or the +.Xr dccproc 8 +command using +.Fl CQ . +The checksum type is +.Em body , Fuz1 , +or +.Em Fuz2 +or one of the preceding checksum types such as +.Em env_From . +.It Ar IP-address +is a host name, IPv4 or IPv6 address, or a block +of IP addresses in the standard xxx/mm from with +mm limited for server whitelists to 16 for IPv4 or 112 for IPv6. +There can be at most 64 CIDR blocks in a client +.Pa whiteclnt +file. +A host name is converted to IP addresses with DNS, +.Pa /etc/hosts +or other mechanisms +and one checksum for each addresses added to the whitelist. +.El +.Pp +.It Ar option setting +can only be in a DCC client +.Pa whiteclnt +file used by +.Xr dccifd 8 , +.Xr dccm 8 +or +.Xr dccproc 8 . +Settings in per-user whiteclnt files override settings +in the global file. +.Ar Setting +can be any of the following: +.Bl -tag -offset 2n -width 2n -compact +.It Ar option log-all +to log all mail messages. +.It Ar option log-normal +to log only messages that meet the logging thresholds. +.It Ar option log-subdirectory-day +.It Ar option log-subdirectory-hour +.It Ar option log-subdirectory-minute +creates log files containing mail messages in subdirectories +of the form +.Ar JJJ , +.Ar JJJ/HH , +or +.Ar JJJ/HH/MM +where +.Ar JJJ +is the current julian day, +.Ar HH +is the current hour, and +.Ar MM +is the current minute. +See also the +.Fl l Ar logdir +option for +.Xr dccm 8 , +.Xr dccifd 8 , +and +.Xr dccproc 8 . +.It Ar option dcc-on +.It Ar option dcc-off +Control DCC filtering. +See the discussion of +.Fl W +for +.Xr dccm 8 +and +.Xr dccifd 8 . +.It Ar option greylist-on +.It Ar option greylist-off +to control greylisting. +Greylisting for other recipients in the same SMTP transaction +can still cause greylist temporary rejections. +.Ar greylist-off +in the main whiteclnt file. +.It Ar option greylist-log-on +.It Ar option greylist-log-off +to control logging of greylisted mail messages. +.It Ar option DCC-rep-off +.It Ar option DCC-rep-on +to honor or ignore DCC Reputations computed by the DCC server. +.It Ar option DNSBL1-off +.It Ar option DNSBL1-on +.It Ar option DNSBL2-off +.It Ar option DNSBL2-on +.It Ar option DNSBL3-off +.It Ar option DNSBL3-on +honor or ignore results of DNS blacklist checks configured with +.Fl B +for +.Xr dccm 8 , +.Xr dccifd 8 , +and +.Xr dccproc 8 . +.It Ar option MTA-first +.It Ar option MTA-last +consider MTA determinations of spam or not-spam first so they can be overridden +by +.Pa whiteclnt +files, or last so that they can override +.Pa whiteclnt files. +.It Ar option forced-discard-ok +.It Ar option no-forced-discard +control whether +.Xr dccm 8 +and +.Xr dccifd 8 +are allowed to discard a message for one mailbox for which +it is spam when it is not spam and must be delivered to another mailbox. +This can happen if a mail message is addressed to two or more mailboxes with +differing whitelists. +Discarding can be undesirable because false positives are not communicated +to mail senders. +To avoid discarding, +.Xr dccm 8 +and +.Xr dccifd 8 +running in proxy mode temporarily reject SMTP envelope +.Em Rcpt To +values that involve differing +.Pa whiteclnt +files. +.It Ar option threshold type,rej-thold +has the same effects as +.Fl c Ar type,rej-thold +for +.Xr dccproc 8 +or +.Fl t Ar type,rej-thold +for +.Xr dccm 8 +and +.Xr dccifd 8 . +It is useful only in per-user whiteclnt files to override the global +DCC checksum thresholds. +.It Ar option spam-trap-accept +.It Ar option spam-trap-reject +say that mail should be reported to the DCC server as extremely +bulk or with target counts of +.Ar MANY . +Greylisting, DNS blacklist (DNSBL), and other checks are turned off. +.Ar Spam-trap-accept +tells the MTA to accept the message while +.Ar spam-trap-reject +tells the MTA to reject the message. +Use +.Ar Spam-trap-accept +for spam traps that should not be disclosed. +.Ar Spam-trap-reject +can be used on +.Em catch-all +mailboxes that might receive legitimate mail by typographical errors +and that senders should be told about. +.El +.Pp +In the absence of explicit settings, +the default in the main whiteclnt file is equivalent to +.Bl -hang -offset 4n -width 4n -compact +.It Ar option log-normal +.It Ar option dcc-on +.It Ar option greylist-on +.It Ar option greylist-log-on +.It Ar option DCC-rep-off +.It Ar option DNSBL1-off +.It Ar option DNSBL2-off +.It Ar option DNSBL3-off +.It Ar MTA-last +.It Ar option no-forced-discard +.El +The defaults for individual recipient +.Pa whiteclnt +files are the same except as change by explicit settings +in the main file. +.El +.Pp +Checksums of the IP address of the SMTP client sending a mail message +are practically unforgeable, because it is impractical for +an SMTP client to "spoof" its address or pretend to use some other IP address. +That would make the IP address of the sender useful for whitelisting, +except that the IP address of the SMTP client +is often not available to users of +.Xr dccproc 8 . +In addition, legitimate mail relays make whitelist entries for IP +addresses of little use. +For example, +the IP address from which a message arrived might be that of a +local relay instead of the home address of a whitelisted mailing list. +.Pp +Envelope and header +.Ar From +values can be forged, +so whitelist entries for their checksums are not entirely reliable. +.Pp +Checksums of +.Ar env_To +values are never sent to DCC servers. +They are valid in only +.Pa whiteclnt +files +and used only by +.Xr dccm 8 , +.Xr dccifd 8 , +and +.Xr dccproc 8 +when the envelope +.Em Rcpt To +value is known. +.Ss Greylists +The DCC server, +.Xr dccd 8 , +can be used to maintain a greylist database for some DCC clients +including +.Xr dccm 8 +and +.Xr dccifd 8 . +Greylisting involves temporarily refusing mail from unfamiliar +SMTP clients and is unrelated to filtering with a +Distributed Checksum Clearinghouse. +.br +See http://projects.puremagic.com/greylisting/ +.Ss Privacy +Because sending mail is a less private act than receiving it, +and because sending bulk mail is usually not private at all +and cannot be very private, +the DCC tries first to protect the privacy of mail recipients, +and second the privacy of senders of mail that is not bulk. +.Pp +DCC clients necessarily disclose some information about mail they have +received. +The DCC database contains checksums of mail bodies, +header lines, and source addresses. +While it contains significantly less information than is +available by "snooping" on Internet links, +it is important that the DCC database be treated as containing +sensitive information and to not put the most private information +in the DCC database. +Given the contents of a message, one might determine +whether that message has been received +by a system that subscribes to the DCC. +Guesses about the sender and addressee of a message can also be +validated if the checksums of the message have been sent to a DCC server. +.Pp +Because the DCC is distributed, +organizations can operate their own DCC servers, and configure +them to share or "flood" only the checksums of bulk mail that is not +in local whitelists. +.Pp +DCC clients should not report the checksums of messages known to be +private to a DCC server. +For example, checksums of messages local to +a system or that are otherwise known a priori to not be unsolicited bulk +should not be sent to a remote DCC server. +This can accomplished by adding entries for the sender to the +client's local whitelist file. +Client whitelist files can also include entries for email recipients +whose mail should not be reported to a DCC server. +.Ss Security +Whenever considering security, +one must first consider the risks. +The worst DCC security problems are +unauthorized commands to a DCC service, +denial of the DCC service, +and corruption of DCC data. +The worst that can be done with remote commands to a DCC server is +to turn it off or otherwise cause it to stop responding. +The DCC is designed to fail gracefully, +so that a denial of service attack +would at worst allow delivery of mail that would otherwise be rejected. +Corruption of DCC data might at worst cause mail that is already +somewhat "bulk" by virtue of being received by two or more people +to appear have higher recipient numbers. +Since DCC users +.Em must +whitelist all sources of legitimate bulk mail, +this is also not a concern. +Such security risks should be addressed, +but only with defenses that don't cost more than the possible damage from +an attack. +.Pp +The DCC must contend with senders of unsolicited bulk mail who +resort to unlawful actions +to express their displeasure at having their advertising blocked. +Because the DCC protocol is based +on UDP, an unhappy advertiser could try to +flood a DCC server with +packets supposedly from subscribers or non-subscribers. +DCC servers defend against that attack by rate-limiting requests +from anonymous users. +.Pp +Also because of the use of UDP, clients must be protected +against forged answers to their queries. +Otherwise an unsolicited bulk mail advertiser could send +a stream of "not spam" answers to an SMTP +client while simultaneously sending mail that would otherwise be +rejected. +This is not a problem for authenticated clients of the +DCC because they share a secret with the DCC. +Unauthenticated, anonymous DCC +clients do not share any secrets with the DCC, except for unique and +unpredictable bits in each query or report sent to the DCC. +Therefore, DCC servers cryptographically sign answers to +unauthenticated clients with bits from the corresponding queries. +This protects against attackers that do not +have access to the stream of packets from the DCC client. +.Pp +The passwords or shared secrets used in the DCC client and server programs +are "cleartext" for several reasons. +In any shared secret authentication system, +at least one party must know the secret or keep the secret in cleartext. +You could encrypt the secrets in a file, but because they are used +by programs, you would need a cleartext copy of the key to decrypt +the file somewhere in the system, making such a scheme more expensive +but no more secure than a file of cleartext passwords. +Asymmetric systems such as that used in UNIX allow one party to not +know the secrets, but they must be and are +designed to be computationally expensive when used in applications +like the DCC that involve thousands or more authentication checks per second. +Moreover, because of "dictionary attacks," +asymmetric systems are now little more secure than +keeping passwords in cleartext. +An adversary can compare the hash values of combinations of common words +with /etc/passwd hash values to look for bad passwords. +Worse, by the nature of a client/server protocol like that used in +the DCC, clients must have the cleartext password. +Since it is among the more numerous and much less secure clients +that adversaries would seek files of DCC passwords, +it would be a waste to complicate the DCC server with an asymmetric +system. +.Pp +The DCC protocol is vulnerable to dictionary attacks to recover passwords. +An adversary could capture some DCC packets, and then check to see +if any of the 100,000 to 1,000,000 passwords in so called +"cracker dictionaries" +applied to a packet generated the same signature. +This is a concern only if DCC passwords are poorly chosen, such +as any combination of words in an English dictionary. +There are ways to prevent this vulnerability regardless of +how badly passwords are chosen, but they are computationally expensive +and require additional network round trips. +Since DCC passwords are created and typed into files once +and do not need to be remembered by people, +it is cheaper and quite easy to simply choose good passwords +that are not in dictionaries. +.Ss Reliability +It is better to fail to filter unsolicited bulk mail than to fail +to deliver legitimate mail, so DCC clients fail in the direction of +assuming that mail is legitimate or even whitelisted. +.Pp +A DCC client sends a report or other request and waits for an answer. +If no answer arrives within a reasonable time, +the client retransmits. +There are many things that +might result in the client not receiving an answer, +but the most important is packet loss. +If the client's request does not reach the server, +it is easy and harmless for the client to retransmit. +If the client's request reached the server but the server's response was lost, +a retransmission to the same server would be misunderstood as +a new report of another copy of the same message unless it is detected +as a retransmission by the server. +The DCC protocol includes transactions identifiers for this purpose. +If the client retransmitted to a second server, +the retransmission would be misunderstood by the second server as +a new report of the same message. +.Pp +Each request from a client includes a timestamp to aid the client in +measuring the round trip time to the server and to let the client pick +the closest server. +Clients monitor the speed of all of the servers they know including +those they are not currently using, +and use the quickest. +.Ss Client and Server-IDs +Servers and clients use numbers or IDs to identify themselves. +ID 1 is reserved for anonymous, unauthenticated clients. +All other IDs are associated with a pair of passwords in the +.Pa ids +file, the +current and next or previous and current passwords. +Clients included their client IDs in their messages. +When they are not using the anonymous ID, +they sign their messages to servers with the first password +associated with their client-ID. +Servers treat messages with signatures that match neither of the passwords +for the client-ID in their own +.Pa ids +file as if the client had used the anonymous ID. +.Pp +Each server has a unique +.Em server-ID +less than 32768. +Servers use their IDs to identify checksums that they +.Em flood +to other servers. +Each server expects local clients sending administrative +commands to use the server's ID and sign administrative commands +with the associated password. +.Pp +Server-IDs must be unique among all systems that share reports +by "flooding." +All servers must be told of the IDs all other servers whose +reports can be received in the local +.Pa @prefix@/flod +file described in +.Xr dccd 8 . +However, server-IDs can be mapped during flooding between +independent DCC organizations. +.Pp +.Em Passwd-IDs +are server-IDs that should not be assigned to servers. +They appear in the often publicly readable +.Pa @prefix@/flod +and specify passwords in the private +.Pa @prefix@/ids +file for the inter-server flooding protocol +.Pp +The client identified by a +.Em client-ID +might be a single computer with a +single IP address, a single but multi-homed computer, or many computers. +Client-IDs are not used to identify checksum reports, but +the organization operating the client. +A client-ID need only be unique among clients using a single server. +A single client can use different client-IDs for different servers, +each client-ID authenticated with a separate password. +.Pp +An obscure but important part of all of this is that the +inter-server flooding algorithm +depends on server-IDs and timestamps attached to reports of checksums. +The inter-server flooding mechanism +requires cooperating DCC servers to maintain reasonable clocks +ticking in UTC. +Clients include timestamps in their requests, but as long as their +timestamps are unlikely to be repeated, they need not be very accurate. +.Ss Installation Considerations +DCC clients on a computer share information about which servers +are currently working and their speeds in a shared memory segment. +This segment also contains server host names, IP addresses, and +the passwords needed to authenticate known clients to servers. +That generally requires that +.Xr dccm 8 , +.Xr dccproc 8 , +.Xr dccifd 8 , +and +.Xr cdcc 8 +execute with an UID that +can write to the DCC home directory and its files. +The sendmail interface, dccm, +is a daemon that can be started by an "rc" or other script already +running with the correct UID. +The other two, dccproc and cdcc need to be set-UID because they are +used by end users. +They relinquish set-UID privileges when not needed. +.Pp +Files that contain cleartext passwords including the shared file used by clients +must be readable only by "owner." +.Pp +The data files required by a DCC can be in a single "home" directory, +.Pa @prefix@ . +Distinct DCC servers can run on a single computer, provided they use +distinct UDP port numbers and home directories. +It is possible and convenient for the DCC clients using a server +on the same computer to use the same home directory as the server. +.Pp +The DCC source distribution includes sample control files. +They should be modified appropriately and then copied to the DCC +home directory. +Files that contain cleartext passwords must not be publicly readable. +.Pp +The DCC source includes "feature" m4 files to configure +sendmail to use +.Xr dccm 8 +to check a DCC server about incoming mail. +.Pp +See also the INSTALL.html file. +.Ss Client Installation +Installing a DCC client starts with obtaining or compiling program binaries +for the client server data control tool, +.Xr cdcc 8 . +Installing the sendmail DCC interface, +.Xr dccm 8 , +or +.Xr dccproc 8 , +the general or +.Xr procmail 1 +interface +is the main part of the client installation. +Connecting the DCC to sendmail with dccm is most powerful, +but requires administrative control of the system running sendmail. +.Pp +As noted above, cdcc and dccproc should be +set-UID to a suitable UID. +Root or 0 is thought to be safe for both, because they are +careful to release privileges except when they need them to +read or write files in the DCC home directory. +A DCC home directory, +.Pa @prefix@ +should be created. +It must be owned and writable by the UID to which cdcc is set. +.Pp +After the DCC client programs have been obtained, +contact the operator(s) of the chosen DCC server(s) +to obtain +each server's +hostname, +port number, +and a +.Em client-ID +and corresponding password. +No client-IDs or passwords are needed touse +DCC servers that allow anonymous clients. +Use the +.Em load +or +.Em add +commands +of cdcc to create a +.Pa map +file in the DCC home directory. +It is usually necessary to create a client whitelist file of +the format described above. +To accommodate users sharing a computer but not ideas about what +is solicited bulk mail, +the client whitelist file can be any valid path name +and need not be in the DCC home directory. +.Pp +If dccm is chosen, +arrange to start it with suitable arguments +before sendmail is started. +See the +.Pa homedir/dcc_conf +file and the +.Pa misc/rcDCC +script in the DCC source. +The procmail DCCM interface, +.Xr dccproc 8 , +can be run manually or by a +.Xr procmailrc 5 +rule. +.Ss Server Installation +The DCC server, +.Xr dccd 8 , +also requires that the DCC home directory exist. +It does not use the client shared or memory mapped file of server +addresses, +but it requires other files. +One is the +.Pa @prefix@/ids +file of client-IDs, server-IDs, and corresponding passwords. +Another is a +.Pa flod +file of peers that send and receive floods of reports of checksums +with large counts. +Both files are described +in +.Xr dccd 8 . +.Pp +The server daemon should be started when the system is rebooted, +probably before sendmail. +See the +.Pa misc/rcDCC +and +.Pa misc/start-dccd +files in the DCC source. +.Pp +The database should be cleaned regularly with +.Xr dbclean 8 +such as by running the crontab job that is in the misc directory. +.Sh SEE ALSO +.Xr cdcc 8 , +.Xr dbclean 8 , +.Xr dcc 8 , +.Xr dccd 8 , +.Xr dccifd 8 , +.Xr dccm 8 , +.Xr dccproc 8 , +.Xr dblist 8 , +.Xr dccsight 8 , +.Xr sendmail 8 . +.Sh HISTORY +Distributed Checksum Clearinghouses are based on an idea of Paul Vixie +with code designed and written at Rhyolite Software starting in 2000. +This document describes version 1.3.103.