notdcc: dcc.0 comparison

comparison dcc.0 @ 0:c7f6b056b673

First import of vendor version

author	Peter Gervai <grin@grin.hu>
date	Tue, 10 Mar 2009 13:49:58 +0100
parents
children

--1:000000000000
+:c7f6b056b673
+DCC(8)                Distributed Checksum Clearinghouse                DCC(8)
+NAME
+DCC -- Distributed Checksum Clearinghouse
+DESCRIPTION
+The Distributed Checksum Clearinghouse or DCC is a cooperative,
+distributed system intended to detect "bulk" mail or mail sent to many people.
+It allows individuals receiving a single mail message to determine that
+many other people have received essentially identical copies of the mes-
+sage and so reject or discard the message.
+Source for the server, client, and utilities is available at Rhyolite
+Software, LLC, http://www.rhyolite.com/dcc/.  It is free for organizations
+that do not sell spam or virus filtering services.
+How the DCC Is Used
+The DCC can be viewed as a tool for end users to enforce their right to
+"opt-in" to streams of bulk mail by refusing bulk mail except from
+sources in a "whitelist."  Whitelists are the responsibility of DCC
+clients, since only they know which bulk mail they solicited.
+False positives or mail marked as bulk by a DCC server that is not bulk
+occur only when a recipient of a message reports it to a DCC server as
+having been received many times or when the "fuzzy" checksums of differ-
+ing messages are the same.  The fuzzy checksums ignore aspects of mes-
+sages in order to compute identical checksums for substantially identical
+messages.  The fuzzy checksums are designed to ignore only differences
+that do not affect meanings.  So in practice, you do not need to worry
+about DCC false positive indications of "bulk," but not all bulk mail is
+unsolicited bulk mail or spam.  You must either use whitelists to distin-
+guish solicited from unsolicited bulk mail or only use DCC indications of
+"bulk" as part of a scoring system such as SpamAssassin.  Besides unso-
+licited bulk email or spam, bulk messages include legitimate mail such as
+order confirmations from merchants, legitimate mailing lists, and empty
+or test messages.
+A DCC server estimates the number of copies of a message by counting
+checksums reported by DCC clients.  Each client must decide which bulk
+messages are unsolicited and what degree of "bulkiness" is objectionable.
+Client DCC software marks, rejects, or discards mail that is bulk accord-
+ing to local thresholds on target addresses from DCC servers and unso-
+licited according to local whitelists.
+DCC servers are usually configured to receive reports from as many tar-
+gets as possible, including sources that cannot be trusted to not exag-
+gerate the number of copies of a message they see.  A user of a DCC
+client angry about receiving a message could report it with 1,000,000
+separate DCC reports or with a single report claiming 1,000,000 targets.
+An unprincipled user could subscribe a "spam trap" to mailing lists such
+as those of the IETF or CERT.  Such abuses of the system are not
+problems, because much legitimate mail is "bulk."  You cannot reject bulk
+mail unless you have a whitelist of sources of legitimate bulk mail.
+DCC can also be used by an Internet service provider to detect bulk mail
+coming from its own customers.  In such circumstances, the DCC client
+might be configured to only log bulk mail from unexpected (not
+whitelisted) customers.
+What the DCC Is
+A DCC server accumulates counts of cryptographic checksums of messages
+but not the messages themselves.  It exchanges reports of frequently seen
+checksums with other servers.  DCC clients send reports of checksums
+related to incoming mail to a nearby DCC server running dccd(8).  Each
+report from a client includes the number of recipients for the message.
+A DCC server accumulates the reports and responds to clients with the
+current total number of recipients for each checksum.  The client adds an
+SMTP header to incoming mail containing the total counts.  It then dis-
+cards or rejects mail that is not whitelisted and has counts that exceed
+local thresholds.
+A special value of the number of addressees is "MANY" and means it is
+certain that this message was bulk and might be unsolicited, perhaps
+because it came from a locally blacklisted source or was addressed to an
+invalid address or "spam trap."  The special value "MANY" is merely the
+largest value that fits in the fixed-size field containing the count of
+addressees.  That "infinity" accumulated total can be reached with mil-
+lions of independent reports as well as with one or two.
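The accumulation described above can be sketched in a few lines of Python. This is an illustration only, not the dccd(8) implementation: the class name, the in-memory dictionary, and the 24-bit width assumed for the count field are all hypothetical. The text says only that "MANY" is the largest value that fits in a fixed-size field.

```python
# Hypothetical sketch of a server totalling recipient counts per checksum.
# MANY is ASSUMED to be the maximum of a 24-bit field; the real field width
# is not stated in this document.
MANY = (1 << 24) - 1


class ChecksumCounter:
    def __init__(self):
        self.totals = {}  # checksum -> accumulated recipient count

    def report(self, checksum, recipients):
        """Add one client report and return the new total, saturating at MANY."""
        total = min(self.totals.get(checksum, 0) + recipients, MANY)
        self.totals[checksum] = total
        return total


def is_bulk(total, threshold):
    """A client treats a message as bulk when the total exceeds its threshold."""
    return total > threshold
```

Because the total saturates, a single report claiming MANY targets and millions of small reports end at the same value, which is why the text calls the result "infinity."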
+DCC servers flood or send reports of checksums of bulk mail to
+neighboring servers.
+To keep a server's database of checksums from growing without bound,
+checksums are forgotten when they become old.  Checksums of bulk mail are
+kept longer.  See dbclean(8).
+DCC clients pick the nearest working DCC server using a small shared or
+memory mapped file, /var/dcc/map.  It contains server names, port
+numbers, passwords, recent performance measures, and so forth.  This file
+allows clients to use quick retransmission timeouts and to waste little
+time on servers that have temporarily stopped working or become unreach-
+able.  The utility program cdcc(8) is used to maintain this file as well
+as to check the health of servers.
+X-DCC Headers
+The DCC software includes several programs used by clients.  Dccm(8) uses
+the sendmail "milter" interface to query a DCC server, add header lines
+to incoming mail, and reject mail whose total checksum counts are high.
+Dccm is intended to be run with SMTP servers using sendmail.
+Dccproc(8) adds header lines to mail presented by file name or stdin, but
+relies on other programs such as procmail to deal with mail with large
+counts.  Dccsight(8) is similar but deals with previously computed check-
+sums.
+Dccifd(8) is similar to dccproc but is not run separately for each mail
+message and so is far more efficient.  It receives mail messages via a
+socket somewhat like dccm, but with a simpler protocol that can be used
+by Perl scripts or other programs.
+DCC SMTP header lines are of one of the forms:
+X-DCC-brand-Metrics: client server-ID; bulk cknm1=count cknm2=count ...
+X-DCC-brand-Metrics: client; whitelist
+where
+whitelist appears if the global or per-user whiteclnt file marks the
+        message as good.
+brand   is the "brand name" of the DCC server, such as "RHYOLITE".
+client  is the name or IP address of the DCC client that added the
+        header line to the SMTP message.
+server-ID is the numeric ID of the DCC server that the DCC client
+        contacted.
+bulk    is present if one or more checksum counts exceeded the DCC
+        client's thresholds to make the message "bulky."
+bulk rep is present if the DCC reputation of the IP address of the
+        sender is bad.
+cknm1,cknm2,... are types of checksums:
+        IP           address of SMTP client
+        env_From     SMTP envelope value
+        From         SMTP header line
+        Message-ID   SMTP header line
+        Received     last Received: header line in the SMTP message
+        substitute   SMTP header line chosen by the DCC client,
+                     prefixed with the name of the header
+        Body         SMTP body ignoring white-space
+        Fuz1         filtered or "fuzzy" body checksum
+        Fuz2         another filtered or "fuzzy" body checksum
+        rep          DCC reputation of the mail sender or the
+                     estimated probability that the message is bulk.
+        Counts for IP, env_From, From, Message-ID, Received, and
+        substitute checksums are omitted by the DCC client if the
+        server says it has no information.  Counts for Fuz1 and Fuz2
+        are omitted if the message body is empty or contains too
+        little of the right kind of information for the checksum to
+        be computed.
+count   is the total number of recipients of messages with that
+        checksum reported directly or indirectly to the DCC server.
+        The special count "MANY" means that DCC clients have claimed
+        that the message is directed at millions of recipients.
+        "MANY" implies the message is definitely bulk, but not
+        necessarily unsolicited.  The special counts "OK" and "OK2"
+        mean the checksum has been marked "good" or "half-good" by
+        DCC servers.
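A downstream filter often needs to pull these fields back out of the header. The following Python sketch parses the two documented forms. It is a hypothetical helper, not part of the DCC distribution; the function name and the returned dictionary layout are assumptions, and it treats every token without "=" after the semicolon as a flag word such as bulk, rep, or whitelist.

```python
import re

# Matches "X-DCC-<brand>-Metrics: <rest>"; the brand is whatever sits between
# the fixed prefix and "-Metrics:".
HEADER_RE = re.compile(r"^X-DCC-(?P<brand>[^:\s]+)-Metrics:\s*(?P<rest>.*)$")


def parse_dcc_header(line):
    """Split one X-DCC-brand-Metrics line into its documented parts."""
    m = HEADER_RE.match(line.strip())
    if m is None:
        raise ValueError("not an X-DCC-brand-Metrics header")
    before, _, after = m.group("rest").partition(";")
    ids = before.split()
    result = {
        "brand": m.group("brand"),
        "client": ids[0] if ids else None,
        "server_id": int(ids[1]) if len(ids) > 1 else None,
        "flags": [],
        "counts": {},
    }
    for token in after.split():
        if "=" in token:
            name, _, value = token.partition("=")
            # Numeric counts become ints; MANY, OK and OK2 stay as strings.
            result["counts"][name] = int(value) if value.isdigit() else value
        else:
            result["flags"].append(token)
    return result
```

For example, `parse_dcc_header("X-DCC-RHYOLITE-Metrics: mail.example.com 101; bulk Body=3 Fuz1=MANY")` yields the brand "RHYOLITE", server ID 101, the flag "bulk", and a count of 3 for Body; the host name here is made up for illustration.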
+Mailing lists
+Legitimate mailing list traffic differs from spam only in being solicited
+by recipients.  Each client should have a private whitelist.
+DCC whitelists can also mark mail as unsolicited bulk using blacklist
+entries for commonly forged values such as "From: user@public.com".
+White and Blacklists
+DCC server and client whitelist files share a common format.  Server
+files are always named whitelist and one is required to be in the DCC
+home directory with the other server files.  Client whitelist files are
+named whiteclnt in the DCC home directory or a subdirectory specified
+with the -U option for dccm(8).  They specify mail that should not be
+reported to a DCC server or that is always unsolicited and almost
+certainly bulk.
+A DCC whitelist file contains blank lines, comments starting with "#",
+and lines of the following forms:
+include file
+        Copies the contents of file into the whitelist.  It can occur
+        only in the main whitelist or whiteclnt file and not in an
+        included file.  The file name should be absolute or relative to
+        the DCC home directory.
+count value
+        lines specify checksums that should be white- or blacklisted.
+        count env_From 821-path
+        count env_To dest-mailbox
+        count From 822-mailbox
+        count Message-ID <string>
+        count Received string
+        count Substitute header string
+        count Hex ctype cksum
+        count ip IP-address
+        MANY value
+                indicates that millions of targets have received messages
+                with the header, IP address, or checksum value.
+        OK value
+        OK2 value
+                say that messages with the header, IP address, or
+                checksum value are OK and should not be reported to DCC
+                servers or be greylisted.  OK2 says that the message is
+                "half OK."  Two OK2 checksums associated with a message
+                are equivalent to one OK.
+                A DCC server never shares or floods reports containing
+                checksums marked in its whitelist with OK or OK2 to other
+                servers.  A DCC client does not report or ask its server
+                about messages with a checksum marked OK or OK2 in the
+                client whitelist.  This is intended to allow a DCC client
+                to keep private mail so private that even its checksums
+                are not disclosed.
+        MX IP-address-or-hostname
+        MXDCC IP-address-or-hostname
+                mark an address or block of addresses of trusted mail
+                relays including MX servers, smart hosts, and bastion or
+                DMZ relays.  The DCC clients dccm(8), dccifd(8), and
+                dccproc(8) parse and skip initial Received: headers added
+                by listed MX servers to determine the external sources of
+                mail messages.  Unsolicited bulk mail that has been
+                forwarded through listed addresses is discarded by dccm(8)
+                and dccifd(8) as if with -a DISCARD instead of rejected.
+                MXDCC marks addresses that are MX servers that run DCC
+                clients.  The checksums for a mail message that has been
+                forwarded through an address listed as MXDCC are queried
+                instead of reported.
+        SUBMIT IP-address-or-hostname
+                marks an IP address or block of addresses of SMTP
+                submission clients such as web browsers that cannot
+                tolerate 4yz temporary rejections but that cannot be
+                trusted to not send spam.  Since they are local addresses,
+                DCC Reputations are not computed for them.
+        value in count value lines can be
+        dest-mailbox
+                is an RFC 821 address or a local user name.
+        821-path
+                is an RFC 821 address.
+        822-mailbox
+                is an RFC 822 address with optional name.
+        Substitute header
+                is the name of an SMTP header such as "Sender" or the
+                name of one of two SMTP envelope values, "HELO," or
+                "Mail_Host" for the resolved host name from the 821-path
+                in the message.
+        Hex ctype cksum
+                starts with the string Hex followed by a checksum type,
+                and a string of four hexadecimal numbers obtained from a
+                DCC log file or the dccproc(8) command using -CQ.  The
+                checksum type is body, Fuz1, or Fuz2 or one of the
+                preceding checksum types such as env_From.
+        IP-address
+                is a host name, IPv4 or IPv6 address, or a block of IP
+                addresses in the standard xxx/mm form with mm limited for
+                server whitelists to 16 for IPv4 or 112 for IPv6.  There
+                can be at most 64 CIDR blocks in a client whiteclnt file.
+                A host name is converted to IP addresses with DNS,
+                /etc/hosts or other mechanisms and one checksum for each
+                address is added to the whitelist.
+option setting
+        can only be in a DCC client whiteclnt file used by dccifd(8),
+        dccm(8) or dccproc(8).  Settings in per-user whiteclnt files
+        override settings in the global file.  Setting can be any of the
+        following:
+        option log-all
+                to log all mail messages.
+        option log-normal
+                to log only messages that meet the logging thresholds.
+        option log-subdirectory-day
+        option log-subdirectory-hour
+        option log-subdirectory-minute
+                creates log files containing mail messages in
+                subdirectories of the form JJJ, JJJ/HH, or JJJ/HH/MM where
+                JJJ is the current julian day, HH is the current hour, and
+                MM is the current minute.  See also the -l logdir option
+                for dccm(8), dccifd(8), and dccproc(8).
+        option dcc-on
+        option dcc-off
+                Control DCC filtering.  See the discussion of -W for
+                dccm(8) and dccifd(8).
+        option greylist-on
+        option greylist-off
+                to control greylisting.  Greylisting for other recipients
+                in the same SMTP transaction can still cause greylist
+                temporary rejections.  greylist-off in the main whiteclnt
+                file.
+        option greylist-log-on
+        option greylist-log-off
+                to control logging of greylisted mail messages.
+        option DCC-rep-off
+        option DCC-rep-on
+                to honor or ignore DCC Reputations computed by the DCC
+                server.
+        option DNSBL1-off
+        option DNSBL1-on
+        option DNSBL2-off
+        option DNSBL2-on
+        option DNSBL3-off
+        option DNSBL3-on
+                honor or ignore results of DNS blacklist checks configured
+                with -B for dccm(8), dccifd(8), and dccproc(8).
+        option MTA-first
+        option MTA-last
+                consider MTA determinations of spam or not-spam first so
+                they can be overridden by whiteclnt files, or last so that
+                they can override whiteclnt files.
+        option forced-discard-ok
+        option no-forced-discard
+                control whether dccm(8) and dccifd(8) are allowed to
+                discard a message for one mailbox for which it is spam
+                when it is not spam and must be delivered to another
+                mailbox.  This can happen if a mail message is addressed
+                to two or more mailboxes with differing whitelists.
+                Discarding can be undesirable because false positives are
+                not communicated to mail senders.  To avoid discarding,
+                dccm(8) and dccifd(8) running in proxy mode temporarily
+                reject SMTP envelope Rcpt To values that involve differing
+                whiteclnt files.
+        option threshold type,rej-thold
+                has the same effects as -c type,rej-thold for dccproc(8)
+                or -t type,rej-thold for dccm(8) and dccifd(8).  It is
+                useful only in per-user whiteclnt files to override the
+                global DCC checksum thresholds.
+        option spam-trap-accept
+        option spam-trap-reject
+                say that mail should be reported to the DCC server as
+                extremely bulk or with target counts of MANY.  Greylisting,
+                DNS blacklist (DNSBL), and other checks are turned off.
+                Spam-trap-accept tells the MTA to accept the message while
+                spam-trap-reject tells the MTA to reject the message.  Use
+                Spam-trap-accept for spam traps that should not be
+                disclosed.  Spam-trap-reject can be used on catch-all
+                mailboxes that might receive legitimate mail by
+                typographical errors and that senders should be told about.
+        In the absence of explicit settings, the default in the main
+        whiteclnt file is equivalent to
+                option log-normal
+                option dcc-on
+                option greylist-on
+                option greylist-log-on
+                option DCC-rep-off
+                option DNSBL1-off
+                option DNSBL2-off
+                option DNSBL3-off
+                option MTA-last
+                option no-forced-discard
+        The defaults for individual recipient whiteclnt files are the
+        same except as changed by explicit settings in the main file.
+Checksums of the IP address of the SMTP client sending a mail message are
+practically unforgeable, because it is impractical for an SMTP client to
+"spoof" its address or pretend to use some other IP address.  That would
+make the IP address of the sender useful for whitelisting, except that
+the IP address of the SMTP client is often not available to users of
+dccproc(8).  In addition, legitimate mail relays make whitelist entries
+for IP addresses of little use.  For example, the IP address from which a
+message arrived might be that of a local relay instead of the home
+address of a whitelisted mailing list.
+Envelope and header From values can be forged, so whitelist entries for
+their checksums are not entirely reliable.
+Checksums of env_To values are never sent to DCC servers.  They are valid
+only in whiteclnt files and used only by dccm(8), dccifd(8), and
+dccproc(8) when the envelope Rcpt To value is known.
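The rule that two OK2 entries equal one OK is easy to get wrong, so here is a rough Python sketch of it. It is not the DCC client's parser: it only understands simple "count type value" lines with single-token values, ignores include, option, Hex, and Substitute lines, matches values exactly, and assumes (this is not stated in the text) that a whitelist match takes precedence over a MANY entry. The function names are hypothetical.

```python
def parse_whitelist(text):
    """Return {(type, value): count} for simple 'count type value' lines."""
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0] in ("OK", "OK2", "MANY"):
            count, ctype, value = parts
            entries[(ctype, value)] = count
    return entries


def classify(entries, message_values):
    """message_values is an iterable of (type, value) pairs from one message."""
    ok_halves = 0
    blacklisted = False
    for key in message_values:
        count = entries.get(key)
        if count == "OK":
            ok_halves += 2      # one OK is worth two halves
        elif count == "OK2":
            ok_halves += 1      # "half OK"
        elif count == "MANY":
            blacklisted = True
    if ok_halves >= 2:
        return "whitelisted"    # ASSUMPTION: OK outranks MANY in this sketch
    return "blacklisted" if blacklisted else "unlisted"
```

With entries `OK2 From a@example.com` and `OK2 ip 192.0.2.7`, a message matching only one of them stays unlisted, while a message matching both is whitelisted.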
+Greylists
+The DCC server, dccd(8), can be used to maintain a greylist database for
+some DCC clients including dccm(8) and dccifd(8).  Greylisting involves
+temporarily refusing mail from unfamiliar SMTP clients and is unrelated
+to filtering with a Distributed Checksum Clearinghouse.
+See http://projects.puremagic.com/greylisting/
+Privacy
+Because sending mail is a less private act than receiving it, and because
+sending bulk mail is usually not private at all and cannot be very pri-
+vate, the DCC tries first to protect the privacy of mail recipients, and
+second the privacy of senders of mail that is not bulk.
+DCC clients necessarily disclose some information about mail they have
+received.  The DCC database contains checksums of mail bodies, header
+lines, and source addresses.  While it contains significantly less infor-
+mation than is available by "snooping" on Internet links, it is important
+that the DCC database be treated as containing sensitive information and
+to not put the most private information in the DCC database.  Given the
+contents of a message, one might determine whether that message has been
+received by a system that subscribes to the DCC.  Guesses about the
+sender and addressee of a message can also be validated if the checksums
+of the message have been sent to a DCC server.
+Because the DCC is distributed, organizations can operate their own DCC
+servers, and configure them to share or "flood" only the checksums of
+bulk mail that is not in local whitelists.
+DCC clients should not report the checksums of messages known to be pri-
+vate to a DCC server.  For example, checksums of messages local to a sys-
+tem or that are otherwise known a priori to not be unsolicited bulk
+should not be sent to a remote DCC server.  This can be accomplished by
+adding entries for the sender to the client's local whitelist file.
+Client whitelist files can also include entries for email recipients
+whose mail should not be reported to a DCC server.
+Security
+Whenever considering security, one must first consider the risks.  The
+worst DCC security problems are unauthorized commands to a DCC service,
+denial of the DCC service, and corruption of DCC data.  The worst that
+can be done with remote commands to a DCC server is to turn it off or
+otherwise cause it to stop responding.  The DCC is designed to fail
+gracefully, so that a denial of service attack would at worst allow
+delivery of mail that would otherwise be rejected.  Corruption of DCC
+data might at worst cause mail that is already somewhat "bulk" by virtue
+of being received by two or more people to appear to have higher recipient
+numbers.  Since DCC users must whitelist all sources of legitimate bulk
+mail, this is also not a concern.  Such security risks should be
+addressed, but only with defenses that don't cost more than the possible
+damage from an attack.
+The DCC must contend with senders of unsolicited bulk mail who resort to
+unlawful actions to express their displeasure at having their advertising
+blocked.  Because the DCC protocol is based on UDP, an unhappy advertiser
+could try to flood a DCC server with packets supposedly from subscribers
+or non-subscribers.  DCC servers defend against that attack by rate-lim-
+iting requests from anonymous users.
+Also because of the use of UDP, clients must be protected against forged
+answers to their queries.  Otherwise an unsolicited bulk mail advertiser
+could send a stream of "not spam" answers to an SMTP client while simul-
+taneously sending mail that would otherwise be rejected.  This is not a
+problem for authenticated clients of the DCC because they share a secret
+with the DCC.  Unauthenticated, anonymous DCC clients do not share any
+secrets with the DCC, except for unique and unpredictable bits in each
+query or report sent to the DCC.  Therefore, DCC servers cryptographi-
+cally sign answers to unauthenticated clients with bits from the corre-
+sponding queries.  This protects against attackers that do not have
+access to the stream of packets from the DCC client.
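The idea of signing an answer with bits from the query can be illustrated as follows. This Python sketch is not the DCC wire protocol: the use of HMAC-SHA256, the 16-byte nonce, and the dictionary-shaped query are all assumptions made for the example. It shows only why an attacker who cannot see the client's packets cannot forge an acceptable answer.

```python
import hashlib
import hmac
import secrets


def new_query(payload):
    """Client side: attach unpredictable bits to an anonymous query."""
    return {"nonce": secrets.token_bytes(16), "payload": payload}


def sign_answer(query, answer):
    """Server side: key the signature with the bits taken from the query."""
    return hmac.new(query["nonce"], answer, hashlib.sha256).digest()


def answer_is_genuine(query, answer, signature):
    """Client side: accept only answers signed with this query's bits."""
    expected = hmac.new(query["nonce"], answer, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)
```

A forged "not spam" answer would have to carry a signature keyed by bits the forger never saw, so the client rejects it. An attacker who can read the client's packets learns the bits, which matches the limitation stated above.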
+The passwords or shared secrets used in the DCC client and server pro-
+grams are "cleartext" for several reasons.  In any shared secret authen-
+tication system, at least one party must know the secret or keep the
+secret in cleartext.  You could encrypt the secrets in a file, but
+because they are used by programs, you would need a cleartext copy of the
+key to decrypt the file somewhere in the system, making such a scheme
+more expensive but no more secure than a file of cleartext passwords.
+Asymmetric systems such as that used in UNIX allow one party to not know
+the secrets, but they must be and are designed to be computationally
+expensive when used in applications like the DCC that involve thousands
+or more authentication checks per second.  Moreover, because of "dictio-
+nary attacks," asymmetric systems are now little more secure than keeping
+passwords in cleartext.  An adversary can compare the hash values of com-
+binations of common words with /etc/passwd hash values to look for bad
+passwords.  Worse, by the nature of a client/server protocol like that
+used in the DCC, clients must have the cleartext password.  Since it is
+among the more numerous and much less secure clients that adversaries
+would seek files of DCC passwords, it would be a waste to complicate the
+DCC server with an asymmetric system.
+The DCC protocol is vulnerable to dictionary attacks to recover pass-
+words.  An adversary could capture some DCC packets, and then check to
+see if any of the 100,000 to 1,000,000 passwords in so called "cracker
+dictionaries" applied to a packet generated the same signature.  This is
+a concern only if DCC passwords are poorly chosen, such as any combina-
+tion of words in an English dictionary.  There are ways to prevent this
+vulnerability regardless of how badly passwords are chosen, but they are
+computationally expensive and require additional network round trips.
+Since DCC passwords are created and typed into files once and do not need
+to be remembered by people, it is cheaper and quite easy to simply choose
+good passwords that are not in dictionaries.
+Reliability
+It is better to fail to filter unsolicited bulk mail than to fail to
+deliver legitimate mail, so DCC clients fail in the direction of assuming
+that mail is legitimate or even whitelisted.
+A DCC client sends a report or other request and waits for an answer.  If
+no answer arrives within a reasonable time, the client retransmits.
+There are many things that might result in the client not receiving an
+answer, but the most important is packet loss.  If the client's request
+does not reach the server, it is easy and harmless for the client to
+retransmit.  If the client's request reached the server but the server's
+response was lost, a retransmission to the same server would be misunder-
+stood as a new report of another copy of the same message unless it is
+detected as a retransmission by the server.  The DCC protocol includes
+transaction identifiers for this purpose.  If the client retransmitted
+to a second server, the retransmission would be misunderstood by the sec-
+ond server as a new report of the same message.
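The effect of transaction identifiers can be sketched like this. The Python below is a simplified model, not dccd(8): the class, the in-memory tables, and the idea of caching the previous answer per (client, transaction) pair are assumptions used to show why a retransmission to the same server is harmless while one to a different server is counted again.

```python
class ReportServer:
    """Toy server that counts recipients and recognises retransmissions."""

    def __init__(self):
        self.totals = {}    # checksum -> total recipients
        self.answered = {}  # (client_id, transaction_id) -> cached answer

    def handle(self, client_id, transaction_id, checksum, recipients):
        key = (client_id, transaction_id)
        if key in self.answered:
            # A retransmission: repeat the earlier answer, count nothing new.
            return self.answered[key]
        total = self.totals.get(checksum, 0) + recipients
        self.totals[checksum] = total
        self.answered[key] = total
        return total
```

Sending the same transaction twice to one `ReportServer` leaves the total unchanged, but a second, independent `ReportServer` has no record of the transaction and counts the report as new.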
+Each request from a client includes a timestamp to aid the client in mea-
+suring the round trip time to the server and to let the client pick the
+closest server.  Clients monitor the speed of all of the servers they
+know including those they are not currently using, and use the quickest.
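Server selection by measured speed might look like the following Python sketch. The function name and the convention of recording None for a server that is not answering are assumptions for illustration; the real clients keep these measurements in the shared map file.

```python
def pick_server(rtts):
    """Choose the quickest working server.

    rtts maps a server name to its last measured round trip time in seconds,
    or to None for a server that is currently not answering.
    """
    working = {name: rtt for name, rtt in rtts.items() if rtt is not None}
    if not working:
        return None
    return min(working, key=working.get)
```

For instance, `pick_server({"a": 0.120, "b": 0.035, "c": None})` skips the silent server and selects the one with the shortest round trip.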
+Client and Server-IDs
+Servers and clients use numbers or IDs to identify themselves.  ID 1 is
+reserved for anonymous, unauthenticated clients.  All other IDs are
+associated with a pair of passwords in the ids file, the current and next or
+previous and current passwords.  Clients include their client IDs in
+their messages.  When they are not using the anonymous ID, they sign
+their messages to servers with the first password associated with their
+client-ID.  Servers treat messages with signatures that match neither of
+the passwords for the client-ID in their own ids file as if the client
+had used the anonymous ID.
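The fallback to the anonymous ID can be sketched as below. This Python example is hypothetical: HMAC-SHA256 stands in for whatever signature the DCC protocol actually uses, and the `ids` dictionary is a stand-in for the ids file, mapping each client-ID to its pair of passwords.

```python
import hashlib
import hmac

ANONYMOUS_ID = 1  # reserved for unauthenticated clients, per the text above


def sign(password, message):
    """Stand-in signature; the real DCC algorithm is not described here."""
    return hmac.new(password, message, hashlib.sha256).digest()


def effective_client_id(ids, client_id, message, signature):
    """Return client_id if the signature matches either of its two passwords,
    otherwise fall back to the anonymous ID."""
    for password in ids.get(client_id, ()):
        if hmac.compare_digest(sign(password, message), signature):
            return client_id
    return ANONYMOUS_ID
```

Accepting either password of the pair lets an operator roll a password forward without a window in which correctly configured clients are demoted to anonymous.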
+Each server has a unique server-ID less than 32768.  Servers use their
+IDs to identify checksums that they flood to other servers.  Each server
+expects local clients sending administrative commands to use the server's
+ID and sign administrative commands with the associated password.
+Server-IDs must be unique among all systems that share reports by
+"flooding."  All servers must be told of the IDs of all other servers whose
+reports can be received in the local /var/dcc/flod file described in
+dccd(8).  However, server-IDs can be mapped during flooding between
+independent DCC organizations.
+Passwd-IDs are server-IDs that should not be assigned to servers.  They
+appear in the often publicly readable /var/dcc/flod and specify passwords
+in the private /var/dcc/ids file for the inter-server flooding protocol.
+The client identified by a client-ID might be a single computer with a
+single IP address, a single but multi-homed computer, or many computers.
+Client-IDs are not used to identify checksum reports, but the organiza-
+tion operating the client.  A client-ID need only be unique among clients
+using a single server.  A single client can use different client-IDs for
+different servers, each client-ID authenticated with a separate password.
+An obscure but important part of all of this is that the inter-server
+flooding algorithm depends on server-IDs and timestamps attached to
+reports of checksums.  The inter-server flooding mechanism requires coop-
+erating DCC servers to maintain reasonable clocks ticking in UTC.
+Clients include timestamps in their requests, but as long as their time-
+stamps are unlikely to be repeated, they need not be very accurate.
+Installation Considerations
+DCC clients on a computer share information about which servers are cur-
+rently working and their speeds in a shared memory segment.  This segment
+also contains server host names, IP addresses, and the passwords needed
+to authenticate known clients to servers.  That generally requires that
+dccm(8), dccproc(8), dccifd(8), and cdcc(8) execute with a UID that can
+write to the DCC home directory and its files.  The sendmail interface,
+dccm, is a daemon that can be started by an "rc" or other script already
+running with the correct UID.  The other two, dccproc and cdcc, need to be
+set-UID because they are used by end users.  They relinquish set-UID
+privileges when not needed.
+Files that contain cleartext passwords including the shared file used by
+clients must be readable only by "owner."
+The data files required by a DCC can be in a single "home" directory,
+/var/dcc.  Distinct DCC servers can run on a single computer, provided
+they use distinct UDP port numbers and home directories.  It is possible
+and convenient for the DCC clients using a server on the same computer to
+use the same home directory as the server.
+The DCC source distribution includes sample control files.  They should
+be modified appropriately and then copied to the DCC home directory.
+Files that contain cleartext passwords must not be publicly readable.
+The DCC source includes "feature" m4 files to configure sendmail to use
+dccm(8) to check a DCC server about incoming mail.
+See also the INSTALL.html file.
+Client Installation
+Installing a DCC client starts with obtaining or compiling program bina-
+ries for the client server data control tool, cdcc(8).  Installing the
+sendmail DCC interface, dccm(8), or dccproc(8), the general or
+procmail(1) interface is the main part of the client installation.  Con-
+necting the DCC to sendmail with dccm is most powerful, but requires
+administrative control of the system running sendmail.
+As noted above, cdcc and dccproc should be set-UID to a suitable UID.
+Root or 0 is thought to be safe for both, because they are careful to
+release privileges except when they need them to read or write files in
+the DCC home directory.  A DCC home directory, /var/dcc, should be
+created.  It must be owned and writable by the UID to which cdcc is set.
+After the DCC client programs have been obtained, contact the operator(s)
+of the chosen DCC server(s) to obtain each server's hostname, port
+number, and a client-ID and corresponding password.  No client-IDs or
+passwords are needed to use DCC servers that allow anonymous clients.  Use the
+load or add commands of cdcc to create a map file in the DCC home
+directory.  It is usually necessary to create a client whitelist file of the
+format described above.  To accommodate users sharing a computer but not
+ideas about what is solicited bulk mail, the client whitelist file can be
+any valid path name and need not be in the DCC home directory.
+If dccm is chosen, arrange to start it with suitable arguments before
+sendmail is started.  See the homedir/dcc_conf file and the misc/rcDCC
+script in the DCC source.  The procmail DCCM interface, dccproc(8), can
+be run manually or by a procmailrc(5) rule.
+Server Installation
+The DCC server, dccd(8), also requires that the DCC home directory exist.
+It does not use the client shared or memory mapped file of server
+addresses, but it requires other files.  One is the /var/dcc/ids file of
+client-IDs, server-IDs, and corresponding passwords.  Another is a flod
+file of peers that send and receive floods of reports of checksums with
+large counts.  Both files are described in dccd(8).
+The server daemon should be started when the system is rebooted, probably
+before sendmail.  See the misc/rcDCC and misc/start-dccd files in the DCC
+source.
+The database should be cleaned regularly with dbclean(8) such as by run-
+ning the crontab job that is in the misc directory.
+SEE ALSO
+cdcc(8), dbclean(8), dcc(8), dccd(8), dccifd(8), dccm(8), dccproc(8),
+dblist(8), dccsight(8), sendmail(8).
+HISTORY
+Distributed Checksum Clearinghouses are based on an idea of Paul Vixie
+with code designed and written at Rhyolite Software starting in 2000.
+This document describes version 1.3.103.
+February 26, 2009
