diff dccproc.8.in @ 0:c7f6b056b673

First import of vendor version
author Peter Gervai <grin@grin.hu>
date Tue, 10 Mar 2009 13:49:58 +0100
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/dccproc.8.in	Tue Mar 10 13:49:58 2009 +0100
@@ -0,0 +1,780 @@
+.\" Copyright (c) 2008 by Rhyolite Software, LLC
+.\"
+.\" This agreement is not applicable to any entity which sells anti-spam
+.\" solutions to others or provides an anti-spam solution as part of a
+.\" security solution sold to other entities, or to a private network
+.\" which employs the DCC or uses data provided by operation of the DCC
+.\" but does not provide corresponding data to other users.
+.\"
+.\" Permission to use, copy, modify, and distribute this software without
+.\" changes for any purpose with or without fee is hereby granted, provided
+.\" that the above copyright notice and this permission notice appear in all
+.\" copies and any distributed versions or copies are either unchanged
+.\" or not called anything similar to "DCC" or "Distributed Checksum
+.\" Clearinghouse".
+.\"
+.\" Parties not eligible to receive a license under this agreement can
+.\" obtain a commercial license to use DCC by contacting Rhyolite Software
+.\" at sales@rhyolite.com.
+.\"
+.\" A commercial license would be for Distributed Checksum and Reputation
+.\" Clearinghouse software.  That software includes additional features.  This
+.\" free license for Distributed Checksum Clearinghouse Software does not in any
+.\" way grant permission to use Distributed Checksum and Reputation Clearinghouse
+.\" software.
+.\"
+.\" THE SOFTWARE IS PROVIDED "AS IS" AND RHYOLITE SOFTWARE, LLC DISCLAIMS ALL
+.\" WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES
+.\" OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL RHYOLITE SOFTWARE, LLC
+.\" BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES
+.\" OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
+.\" WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
+.\" ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS
+.\" SOFTWARE.
+.\"
+.\" Rhyolite Software DCC 1.3.103-1.120 $Revision$
+.\"
+.Dd February 26, 2009
+.ds volume-ds-DCC Distributed Checksum Clearinghouse
+.Dt dccproc 8 DCC
+.Os " "
+.Sh NAME
+.Nm dccproc
+.Nd Distributed Checksum Clearinghouse Procmail Interface
+.Sh SYNOPSIS
+.Nm dccproc
+.Bk -words
+.Op Fl VdAQCHER
+.Op Fl h Ar homedir
+.Op Fl m Ar map
+.Op Fl w Ar whiteclnt
+.Op Fl T Ar tmpdir
+.Op Fl a Ar IP-address
+.Op Fl f Ar env_from
+.Op Fl t Ar targets
+.Op Fl x Ar exitcode
+.br
+.Oo
+.Fl c Xo
+.Sm off
+.Ar type,
+.Op Ar log-thold,
+.Ar rej-thold
+.Sm on
+.Xc
+.Oc
+.Oo
+.Fl g Xo
+.Sm off
+.Op Ar not-
+.Ar type
+.Sm on
+.Xc
+.Oc
+.Op Fl S Ar header
+.br
+.Op Fl i Ar infile
+.Op Fl o Ar outfile
+.Op Fl l Ar logdir
+.Op Fl B Ar dnsbl-option
+.Op Fl L Ar ltype,facility.level
+.Ek
+.Sh DESCRIPTION
+.Nm Dccproc
+copies a complete SMTP message from standard input or a file
+to standard output or another file.
+As it copies the message,
+it computes the DCC checksums for the message,
+reports them to a DCC server, and adds
+a header line to the message.
+Another program such as
+.Xr procmail 1
+can use the added header line to filter mail.
+Dccproc does not support any thresholds of its own,
+because equivalent effects can be achieved with regular expressions
+and you can apply dccproc several times using different DCC servers
+and then score mail based on what all of the DCC servers say.
+.Pp
+Error messages are sent to stderr as well as the system log.
+Connect stderr and stdout to the same file to see errors in context,
+but direct stderr to /dev/null to keep DCC error messages out of the mail.
+The
+.Fl i
+option can also be used to separate the error messages.
+.Pp
+.Nm Dccproc
+sends reports of checksums related to mail received by DCC clients
+and queries about the total number of reports of particular checksums.
+A DCC server receives no
+mail, address, headers, or other information,
+but only cryptographically secure checksums of such information.
+A DCC server cannot determine the text or other information that corresponds
+to the checksums it receives.
+It only acts as a clearinghouse of counts of checksums computed by clients.
+.Pp
+For the sake of privacy for even the checksums of private mail,
+the checksums of senders of purely internal mail or other
+mail that is known to not be unsolicited bulk can be listed in a whitelist
+to not be reported to the DCC server.
+.Pp
+When
+.Xr sendmail 8
+is used,
+.Xr dccm 8
+is a better DCC interface.
+.Xr Dccifd 8
+is more efficient than
+.Nm
+because it is a daemon, but that has costs in complexity.
+See
+.Xr dccsight 8
+for a way to use previously computed checksums.
+.Ss OPTIONS
+The following options are available:
+.Bl -tag -width 3n
+.It Fl V
+displays the version of the DCC
+.Xr procmail 1
+interface.
+.It Fl d
+enables debugging output from the DCC client software.
+Additional
+.Fl d
+options increase the number of messages.
+One causes error messages to be sent to STDERR as well as the system log.
+.It Fl A
+adds to existing X-DCC headers (if any)
+of the brand of the current server
+instead of
+replacing existing headers.
+.It Fl Q
+only queries the DCC server about the checksums of messages
+instead of reporting and then querying.
+This is useful when
+.Nm
+is used to filter mail that has already been reported to a DCC
+server by another DCC client such as
+.Xr dccm 8 .
+No single mail message should be reported to a DCC
+server more than once per recipient.
+.Pp
+It is better to use
+.Em MXDCC
+lines in the
+.Fl w Ar whiteclnt
+file for your MX mail servers that use DCC than
+.Fl Q .
+.It Fl C
+outputs only the X-DCC header
+and the checksums for the message.
+.It Fl H
+outputs only the X-DCC header.
+.It Fl E
+adds lines to the start of the log file turned on with
+.Fl l
+and
+.Fl c
+describing what might have been the envelope of the message.
+The information for the inferred envelope comes from arguments including
+.Fl a
+and headers in the message when
+.Fl R
+is used.
+No lines are generated for which no information is available,
+such as the envelope recipient.
+.It Fl R
+says the first Received lines have the standard
+"helo\ (name\ [address])..."
+format and the address is that of the SMTP client
+that would otherwise be provided with
+.Fl a .
+The
+.Fl a
+option should be used
+if the local SMTP server adds a Received line with some other format
+or does not add a Received line.
+Received headers specifying IP addresses marked
+.Em MX
+or
+.Em MXDCC
+in the
+.Fl w Ar whiteclnt
+file are skipped.
+.It Fl h Ar homedir
+overrides the default DCC home directory,
+.Pa @prefix@ .
+.It Fl m Ar map
+specifies a name or path of the memory mapped parameter file instead
+of the default
+.Pa map
+in the DCC home directory.
+It should be created with the
+.Ic new map
+operation of the
+.Xr cdcc 8
+command.
+.It Fl w Ar whiteclnt
+specifies an optional file containing SMTP client IP addresses and
+SMTP headers
+of mail that does not need X-DCC headers and whose checksums should not
+be reported to the DCC server.
+It can also contain checksums of spam.
+If the pathname is not absolute, it is relative to the DCC home directory.
+Thus, individual users with private whitelists usually specify them
+with absolute paths.
+Common whitelists shared by users must be in the DCC home directory or
+one of its subdirectories and owned by the set-UID user of
+.Nm dccproc .
+It is useful to
+.Ar include
+a common or system-wide whitelist in private lists.
+.Pp
+Because the contents of the
+.Ar whiteclnt
+file are used frequently, a companion file is automatically
+created and maintained.
+It has the same pathname but with an added suffix of
+.Ar .dccw .
+It contains a memory mapped hash table of the main file.
+.Pp
+.Ar Option
+lines can be used to modify many aspects of
+.Nm
+filtering,
+as described in the main
+.Xr dcc 8
+man page.
+For example, an
+.Ar option spam-trap-accept
+line turns off DCC filtering and reports the message as spam.
+.It Fl T Ar tmpdir
+changes the default directory for temporary files from the system default.
+The system default is
+.Pa /tmp .
+.It Fl a Ar IP-address
+specifies the IP address (not the host name) of
+the immediately previous SMTP client.
+It is often not available.
+.Fl a Ar 0.0.0.0
+is ignored.
+The
+.Fl a
+option should be used
+instead of
+.Fl R
+if the local SMTP server adds a Received line with some other format
+or does not add a Received line.
+.It Fl f Ar env_from
+specifies the RFC\ 821 envelope "Mail\ From" value with which the
+message arrived.
+It is often not available.
+If
+.Fl f
+is not present, the contents of the first Return-Path: or UNIX style
+From_ header is used.
+The
+.Ar env_from
+string is often but need not be bracketed with "<>".
+.It Fl t Ar targets
+specifies the number of addressees of the message if other than 1.
+The string
+.Ar many
+instead of a number asserts that there were too many addressees
+and that the message is unsolicited bulk email.
+.It Fl x Ar exitcode
+specifies the code or status with which
+.Nm
+exits if the
+.Fl c
+thresholds are reached or the
+.Fl w Ar whiteclnt
+file blacklists the message.
+.Pp
+The default value is EX_NOUSER.
+EX_NOUSER is 67 on many systems.
+Use 0 to always exit successfully.
+.It Fl c Xo
+.Sm off
+.Ar type,
+.Op Ar log-thold,
+.Ar rej-thold
+.Sm on
+.Xc
+sets logging and "spam" thresholds for checksum
+.Ar type .
+The checksum types are
+.Ar IP ,
+.Ar env_From ,
+.Ar From ,
+.Ar Message-ID ,
+.Ar substitute ,
+.Ar Received ,
+.Ar Body ,
+.Ar Fuz1 ,
+.Ar Fuz2 ,
+.Ar rep-total ,
+and
+.Ar rep .
+The first six,
+.Ar IP
+through
+.Ar substitute ,
+have no effect except when a local DCC server configured with
+.Fl K
+is used.
+The
+.Ar substitute
+thresholds apply to the first substitute heading encountered in the mail
+message.
+The string
+.Ar ALL
+sets thresholds for all types, but is unlikely to be useful except for
+setting logging thresholds.
+The string
+.Ar CMN
+specifies the commonly used checksums
+.Ar Body ,
+.Ar Fuz1 ,
+and
+.Ar Fuz2 .
+.Ar Rej-thold
+and
+.Ar log-thold
+must be numbers, the string
+.Ar NEVER ,
+or the string
+.Ar MANY
+indicating millions of targets.
+Counts from the DCC server as large as the threshold for any single type
+are taken as sufficient evidence
+that the message should be logged or rejected.
+.Pp
+.Ar Log-thold
+is the threshold at which messages are logged.
+It can be handy to log messages at a lower threshold to find
+solicited bulk mail sources such as mailing lists.
+If no logging threshold is set,
+only rejected mail and messages with complicated combinations of white
+and blacklisting are logged.
+Messages that reach at least one of their rejection thresholds are
+logged regardless of logging thresholds.
+.Pp
+.Ar Rej-thold
+is the threshold at which messages are considered "bulk,"
+and so should be rejected or discarded if not whitelisted.
+.Pp
+DCC Reputation thresholds in the commercial version
+of the DCC are controlled by thresholds on checksum types
+.Ar rep
+and
+.Ar rep-total .
+Messages from an IP address that the DCC database says has sent
+more than
+.Fl c Ar rep-total,log-thold
+messages are logged.
+A DCC Reputation is computed for messages received
+from IP addresses that
+have sent more than
+.Fl c Ar rep-total,log-thold
+messages.
+The DCC Reputation of an IP address is the percentage of its messages
+that have been detected as bulk
+or as having at least 10 recipients.
+The defaults are equivalent to
+.Fl c Ar rep,never
+and
+.Fl c Ar rep-total,never,20 .
+.Pp
+Bad DCC Reputations do not reject mail unless enabled by an
+.Ar option DCC-rep-on
+line in a
+.Pa whiteclnt
+file.
+.Pp
+The checksums of locally whitelisted messages are not checked with
+the DCC server and so only the number of targets of the current copy of
+a whitelisted message are compared against the thresholds.
+.Pp
+The default is
+.Ar ALL,NEVER ,
+so that nothing is discarded, rejected, or logged.
+A common choice is
+.Ar CMN,25,50
+to reject or discard
+mail with common bodies except as overridden by
+the whitelist of the DCC server, the sendmail
+.Em ${dcc_isspam}
+and
+.Em ${dcc_notspam}
+macros,
+.Fl g ,
+and
+.Fl w .
+.It Fl g Xo
+.Sm off
+.Op Ar not-
+.Ar type
+.Sm on
+.Xc
+indicates that whitelisted,
+.Ar OK
+or
+.Ar OK2 ,
+counts from the DCC server for a type of checksum are to be believed.
+They should be ignored if prefixed with
+.Ar not- .
+.Ar Type
+is one of the same set of strings as for
+.Fl c .
+Only
+.Ar IP ,
+.Ar env_From ,
+and
+.Ar From
+are likely choices.
+By default all three are honored,
+and hence the need for
+.Ar not- .
+.It Fl S Ar hdr
+adds to the list of substitute or locally chosen headers that
+are checked with the
+.Fl w Ar whiteclnt
+file and sent to the DCC server.
+The checksum of the last header of type
+.Ar hdr
+found in the message is checked.
+As many as 6 different substitute headers can be specified, but only
+the checksum of the first of the 6 will be sent to the DCC server.
+.It Fl i Ar infile
+specifies an input file for the entire message
+instead of standard input.
+If not absolute, the pathname is interpreted relative to the
+directory in which
+.Nm
+was started.
+.It Fl o Ar outfile
+specifies an output file for the entire message including headers
+instead of standard output.
+If not absolute, the pathname is interpreted relative to the
+directory in which
+.Nm
+was started.
+.It Fl l Ar logdir
+specifies a directory for copies of messages whose
+checksum target counts exceed
+.Fl c
+thresholds.
+The format of each file is affected by
+.Fl E .
+.Pp
+See the FILES section below concerning the contents of the files.
+See also the
+.Ar option log-subdirectory-{day,hour,minute}
+lines in
+.Pa whiteclnt
+files described in
+.Xr dcc 8 .
+.Pp
+The directory is relative to the DCC home directory if it is not absolute.
+.It Fl B Ar dnsbl-option
+enables DNS blacklist checks of the SMTP client IP address, SMTP envelope
+Mail_From sender domain name, and of host names in URLs in the message body.
+Body URL blacklisting has too many false positives to use on
+abuse mailboxes.
+It is less effective than greylisting with
+.Xr dccm 8
+or
+.Xr dccifd 8
+but can be useful in situations where
+greylisting cannot be used.
+.Pp
+.Ar Dnsbl-option
+is either one of the
+.Fl B Ar set:option
+forms or
+.Bd -literal -compact -offset 4n
+.Fl B Xo
+.Sm off
+.Ar domain Oo Ar ,IPaddr
+.Op Ar /xx Op Ar ,bltype Oc
+.Sm on
+.Xc
+.Ed
+.Ar Domain
+is a DNS blacklist domain such as example.com
+that will be searched.
+.Ar IPaddr Ns Op Ar /xx
+is the string "any",
+an IP address in the DNS blacklist
+that indicates that the mail message
+should be rejected,
+or a CIDR block covering results from the DNS blacklist.
+"127.0.0.2" is assumed if
+.Ar IPaddr
+is absent.
+IPv6 addresses can be specified with the usual colon (:) notation.
+Names can be used instead of numeric addresses.
+The type of DNS blacklist
+is specified by
+.Ar bltype
+as
+.Ar name ,
+.Ar IPv4 ,
+or
+.Ar IPv6 .
+Given an envelope sender domain name or a domain name in a URL of
+spam.domain.org
+and a blacklist of type
+.Ar name ,
+spam.domain.org.example.com will be tried.
+Blacklist types of
+.Ar IPv4
+and
+.Ar IPv6
+require that the domain name in a URL or sender address
+be resolved into an IPv4 or IPv6
+address.
+The address is then written as a reversed string of decimal
+octets to check the DNS blacklist, as in 2.0.0.127.example.com.
+.Pp
+More than one blacklist can be specified and blacklists can be grouped.
+All searching within a group is stopped at the first positive result.
+.Pp
+Unlike
+.Xr dccm 8
+and
+.Xr dccifd 8 ,
+no
+.Ar option\ DNSBL-on
+line is required in the
+.Pa whiteclnt
+file.
+A
+.Fl B
+argument is sufficient to show that DNSBL filtering is wanted by the
+.Nm
+user.
+.Bl -tag -width 3n
+.It Fl B Ar set:no-client
+says that SMTP client IP addresses and reverse DNS domain names should
+not be checked in the following blacklists.
+.br
+.Fl B Ar set:client
+restores the default for the following blacklists.
+.It Fl B Ar set:no-mail_host
+says that SMTP envelope Mail_From sender domain names should
+not be checked in the following blacklists.
+.Fl B Ar set:mail_host
+restores the default.
+.It Fl B Ar set:no-URL
+says that URLs in the message body should not be checked
+in the following blacklists.
+.Fl B Ar set:URL
+restores the default.
+.It Fl B Ar set:no-MX
+says MX servers of sender Mail_From domain names and host names in URLs
+should not be checked in the following blacklists.
+.br
+.Fl B Ar set:MX
+restores the default.
+.It Fl B Ar set:no-NS
+says DNS servers of sender Mail_From domain names and host names in URLs
+should not be checked in the following blacklists.
+.Fl B Ar set:NS
+restores the default.
+.It Fl B Ar set:defaults
+is equivalent to all of
+.Fl B Ar set:no-temp-fail
+.Fl B Ar set:client
+.br
+.Fl B Ar set:mail_host
+.Fl B Ar set:URL
+.Fl B Ar set:MX
+and
+.Fl B Ar set:NS
+.It Fl B Ar set:group=X
+adds later DNS blacklists specified with
+.Bd -literal -compact -offset 4n
+.Fl B Xo
+.Sm off
+.Ar domain Oo Ar ,IPaddr
+.Op Ar /xx Op Ar ,bltype Oc
+.Sm on
+.Xc
+.Ed
+to group 1, 2, or 3.
+.It Fl B Ar set:debug=X
+sets the DNS blacklist logging level.
+.It Fl B Ar set:msg-secs=S
+limits
+.Nm
+to
+.Ar S
+seconds total for checking all DNS blacklists.
+The default is 25.
+.It Fl B Ar set:URL-secs=S
+limits
+.Nm
+to at most
+.Ar S
+seconds resolving and checking any single URL.
+The default is 11.
+Some spam contains dozens of URLs, and
+some "spamvertised" URLs contain host names that need minutes to
+resolve.
+Busy mail systems cannot afford to spend minutes checking each incoming
+mail message.
+.El
+.It Fl L Ar ltype,facility.level
+specifies how messages should be logged.
+.Ar Ltype
+must be
+.Ar error ,
+.Ar info ,
+or
+.Ar off
+to indicate which of the two types of messages is being controlled or
+to turn off all
+.Xr syslog 3
+messages from
+.Nm .
+.Ar Level
+must be a
+.Xr syslog 3
+level among
+.Ar EMERG ,
+.Ar ALERT ,
+.Ar CRIT ,
+.Ar ERR ,
+.Ar WARNING ,
+.Ar NOTICE ,
+.Ar INFO ,
+and
+.Ar DEBUG .
+.Ar Facility
+must be among
+.Ar AUTH ,
+.Ar AUTHPRIV ,
+.Ar CRON ,
+.Ar DAEMON ,
+.Ar FTP ,
+.Ar KERN ,
+.Ar LPR ,
+.Ar MAIL ,
+.Ar NEWS ,
+.Ar USER ,
+.Ar UUCP ,
+and
+.Ar LOCAL0
+through
+.Ar LOCAL7 .
+The default is equivalent to
+.Dl Fl L Ar info,MAIL.NOTICE  Fl L Ar error,MAIL.ERR
+.El
+.Pp
+.Nm
+exits with 0 on success and with the
+.Fl x
+value if the
+.Fl c
+thresholds are reached or the
+.Fl w Ar whiteclnt
+file blacklists the message.
+If at all possible,
+the input mail message is output to standard output or the
+.Fl o Ar outfile
+despite errors.
+If possible, error messages are put into the system log instead of
+being mixed with the output mail message.
+The exit status is zero for errors so that the mail message
+will not be rejected.
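+.Pp
+As a sketch of how a shell script might act on the exit status,
+the following fragment uses
+.Fl x Ar 1
+so that any non-zero status means that a
+.Fl c
+threshold was reached or the message was blacklisted.
+The path
+.Pa /tmp/msg
+is a hypothetical file containing one complete message,
+and the installation path of
+.Nm
+is assumed.
+Without
+.Fl Q
+the message is reported to the DCC server,
+so this should not be repeated for the same message.
+.Bd -literal -offset 4n
+if /usr/local/bin/dccproc -x 1 -c CMN,25,50 -i /tmp/msg -o /dev/null
+then
+    echo "below thresholds, or an error occurred"
+else
+    echo "thresholds reached or blacklisted"
+fi
+.Ed
+The first branch also covers errors,
+because the exit status is zero for errors.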
+.Pp
+If
+.Nm
+is run more than 500 times in fewer than 5000 seconds,
+.Nm
+tries to start
+.Xr Dccifd 8 .
+The attempt is made at most once per hour.
+Dccifd is significantly more efficient than
+.Nm .
+With luck, mechanisms such as SpamAssassin will notice when dccifd is
+running and switch to dccifd.
+.Sh FILES
+.Bl -tag -width whiteclnt -compact
+.It Pa @prefix@
+DCC home directory in which other files are found.
+.It Pa map
+memory mapped file in the DCC home directory
+of information concerning DCC servers.
+.It Pa whiteclnt
+contains the client whitelist in
+the format described in
+.Xr dcc 8 .
+.It Pa whiteclnt.dccw
+is a memory mapped hash table corresponding to the
+.Pa whiteclnt
+file.
+.It Pa tmpdir
+contains temporary files created and deleted as
+.Nm
+processes the message.
+.It Pa logdir
+is an optional directory specified with
+.Fl l
+and containing marked mail.
+Each file in the directory contains one message, at least one of whose
+checksums reached one of its
+.Fl c
+thresholds.
+The entire body of the SMTP message including its header
+is followed by the checksums for the message.
+.El
+.Sh EXAMPLES
+The following
+.Xr procmailrc 5
+rule adds an X-DCC header to passing mail:
+.Bd -literal -offset 4n
+:0 f
+| /usr/local/bin/dccproc -ERw whiteclnt
+.Ed
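+.Pp
+As a sketch of the
+.Fl Q
+and
+.Fl H
+options described above,
+the following command queries the DCC server about a saved message
+without reporting it and prints only the X-DCC header.
+The path
+.Pa /tmp/msg
+is a hypothetical file containing one complete message,
+and the installation path of
+.Nm
+is assumed to match the other examples.
+.Bd -literal -offset 4n
+/usr/local/bin/dccproc -QH -i /tmp/msg
+.Ed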
+.Pp
+This
+.Xr procmailrc 5
+recipe rejects mail with total counts of 10 or larger for
+the commonly used checksums:
+.Bd -literal -offset 4n
+:0 fW
+| /usr/local/bin/dccproc -ERw whiteclnt -ccmn,10
+:0 e
+{
+    EXITCODE=67
+    :0
+    /dev/null
+}
+.Ed
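+.Pp
+Because
+.Nm
+adds a header line, filtering can also be done with a regular expression
+in a later recipe instead of with
+.Fl c
+thresholds.
+The following sketch assumes that the added header has the form
+"X-DCC-brand-Metrics:" followed by counts such as "Body=many".
+That format is not specified on this page and should be checked against
+the headers actually produced at your site.
+The folder name
+.Pa bulk
+is hypothetical.
+.Bd -literal -offset 4n
+:0 f
+| /usr/local/bin/dccproc -ERw whiteclnt
+:0:
+* ^X-DCC-[^:]*-Metrics:.*(Body|Fuz1|Fuz2)=many
+bulk
+.Ed
+The second recipe files matching mail in the folder instead of rejecting it,
+which can be a safer choice while the whitelist is still being tuned.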
+.Sh SEE ALSO
+.Xr cdcc 8 ,
+.Xr dcc 8 ,
+.Xr dbclean 8 ,
+.Xr dccd 8 ,
+.Xr dblist 8 ,
+.Xr dccifd 8 ,
+.Xr dccm 8 ,
+.Xr dccsight 8 ,
+.Xr mail 1 ,
+.Xr procmail 1 .
+.Sh HISTORY
+Distributed Checksum Clearinghouses are based on an idea of Paul Vixie.
+Implementation of
+.Nm
+was started at Rhyolite Software in 2000.
+This document describes version 1.3.103.
+.Sh BUGS
+.Nm
+uses
+.Fl c
+where
+.Xr dccm 8
+uses
+.Fl t .