view dccproc.8.in @ 4:d329bb5c36d0

Changes making it compile the new upstream release
author Peter Gervai <grin@grin.hu>
date Tue, 10 Mar 2009 14:57:12 +0100
parents c7f6b056b673

.\" Copyright (c) 2008 by Rhyolite Software, LLC
.\"
.\" This agreement is not applicable to any entity which sells anti-spam
.\" solutions to others or provides an anti-spam solution as part of a
.\" security solution sold to other entities, or to a private network
.\" which employs the DCC or uses data provided by operation of the DCC
.\" but does not provide corresponding data to other users.
.\"
.\" Permission to use, copy, modify, and distribute this software without
.\" changes for any purpose with or without fee is hereby granted, provided
.\" that the above copyright notice and this permission notice appear in all
.\" copies and any distributed versions or copies are either unchanged
.\" or not called anything similar to "DCC" or "Distributed Checksum
.\" Clearinghouse".
.\"
.\" Parties not eligible to receive a license under this agreement can
.\" obtain a commercial license to use DCC by contacting Rhyolite Software
.\" at sales@rhyolite.com.
.\"
.\" A commercial license would be for Distributed Checksum and Reputation
.\" Clearinghouse software.  That software includes additional features.  This
.\" free license for Distributed ChecksumClearinghouse Software does not in any
.\" way grant permision to use Distributed Checksum and Reputation Clearinghouse
.\" software
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS" AND RHYOLITE SOFTWARE, LLC DISCLAIMS ALL
.\" WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES
.\" OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL RHYOLITE SOFTWARE, LLC
.\" BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES
.\" OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
.\" WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
.\" ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS
.\" SOFTWARE.
.\"
.\" Rhyolite Software DCC 1.3.103-1.120 $Revision$
.\"
.Dd February 26, 2009
.ds volume-ds-DCC Distributed Checksum Clearinghouse
.Dt dccproc 8 DCC
.Os " "
.Sh NAME
.Nm dccproc
.Nd Distributed Checksum Clearinghouse Procmail Interface
.Sh SYNOPSIS
.Nm dccproc
.Bk -words
.Op Fl VdAQCHER
.Op Fl h Ar homedir
.Op Fl m Ar map
.Op Fl w Ar whiteclnt
.Op Fl T Ar tmpdir
.Op Fl a Ar IP-address
.Op Fl f Ar env_from
.Op Fl t Ar targets
.Op Fl x Ar exitcode
.br
.Oo
.Fl c Xo
.Sm off
.Ar type,
.Op Ar log-thold,
.Ar rej-thold
.Sm on
.Xc
.Oc
.Oo
.Fl g Xo
.Sm off
.Op Ar not-
.Ar type
.Sm on
.Xc
.Oc
.Op Fl S Ar header
.br
.Op Fl i Ar infile
.Op Fl o Ar outfile
.Op Fl l Ar logdir
.Op Fl B Ar dnsbl-option
.Op Fl L Ar ltype,facility.level
.Ek
.Sh DESCRIPTION
.Nm Dccproc
copies a complete SMTP message from standard input or a file
to standard output or another file.
As it copies the message,
it computes the DCC checksums for the message,
reports them to a DCC server, and adds
a header line to the message.
Another program such as
.Xr procmail 1
can use the added header line to filter mail.
Dccproc does not support any thresholds of its own,
because equivalent effects can be achieved with regular expressions
and you can apply dccproc several times using different DCC servers
and then score mail based on what all of the DCC servers say.
.Pp
Error messages are sent to stderr as well as the system log.
Connect stderr and stdout to the same file to see errors in context,
but direct stderr to /dev/null to keep DCC error messages out of the mail.
The
.Fl i
option can also be used to separate the error messages.
.Pp
.Nm Dccproc
sends reports of checksums related to mail received by DCC clients
and queries about the total number of reports of particular checksums.
A DCC server receives no
mail, address, headers, or other information,
but only cryptographically secure checksums of such information.
A DCC server cannot determine the text or other information that corresponds
to the checksums it receives.
It only acts as a clearinghouse of counts of checksums computed by clients.
.Pp
For the sake of privacy for even the checksums of private mail,
the checksums of senders of purely internal mail or other
mail that is known not to be unsolicited bulk can be listed in a whitelist
so that they are not reported to the DCC server.
.Pp
When
.Xr sendmail 8
is used,
.Xr dccm 8
is a better DCC interface.
.Xr Dccifd 8
is more efficient than
.Nm
because it is a daemon, but that has costs in complexity.
See
.Xr dccsight 8
for a way to use previously computed checksums.
.Ss OPTIONS
The following options are available:
.Bl -tag -width 3n
.It Fl V
displays the version of the DCC
.Xr procmail 1
interface.
.It Fl d
enables debugging output from the DCC client software.
Additional
.Fl d
options increase the number of messages.
One causes error messages to be sent to STDERR as well as the system log.
.It Fl A
adds to existing X-DCC headers (if any)
of the brand of the current server
instead of
replacing existing headers.
.It Fl Q
only queries the DCC server about the checksums of messages
instead of reporting and then querying.
This is useful when
.Nm
is used to filter mail that has already been reported to a DCC
server by another DCC client such as
.Xr dccm 8 .
No single mail message should be reported to a DCC
server more than once per recipient.
.Pp
It is better to use
.Em MXDCC
lines in the
.Fl w Ar whiteclnt
file for your MX mail servers that use DCC than
.Fl Q .
.It Fl C
outputs only the X-DCC header
and the checksums for the message.
.It Fl H
outputs only the X-DCC header.
.It Fl E
adds lines to the start of the log file turned on with
.Fl l
and
.Fl c
describing what might have been the envelope of the message.
The information for the inferred envelope comes from arguments including
.Fl a
and headers in the message when
.Fl R
is used.
No lines are generated for which no information is available,
such as the envelope recipient.
.It Fl R
says the first Received lines have the standard
"helo\ (name\ [address])..."
format and the address is that of the SMTP client
that would otherwise be provided with
.Fl a .
The
.Fl a
option should be used
if the local SMTP server adds a Received line with some other format
or does not add a Received line.
Received headers specifying IP addresses marked
.Em MX
or
.Em MXDCC
in the
.Fl w Ar whiteclnt
file are skipped.
.It Fl h Ar homedir
overrides the default DCC home directory,
.Pa @prefix@ .
.It Fl m Ar map
specifies a name or path of the memory mapped parameter file instead
of the default
.Pa map
in the DCC home directory.
It should be created with the
.Ic new map
operation of the
.Xr cdcc 8
command.
.It Fl w Ar whiteclnt
specifies an optional file containing SMTP client IP addresses and
SMTP headers
of mail that does not need X-DCC headers and whose checksums should not
be reported to the DCC server.
It can also contain checksums of spam.
If the pathname is not absolute, it is relative to the DCC home directory.
Thus, individual users with private whitelists usually specify them
with absolute paths.
Common whitelists shared by users must be in the DCC home directory or
one of its subdirectories and owned by the set-UID user of
.Nm dccproc .
It is useful to
.Ar include
a common or system-wide whitelist in private lists.
.Pp
Because the contents of the
.Ar whiteclnt
file are used frequently, a companion file is automatically
created and maintained.
It has the same pathname but with an added suffix of
.Ar .dccw .
It contains a memory mapped hash table of the main file.
.Pp
.Ar Option
lines can be used to modify many aspects of
.Nm
filtering,
as described in the main
.Xr dcc 8
man page.
For example, an
.Ar option spam-trap-accept
line turns off DCC filtering and reports the message as spam.
.It Fl T Ar tmpdir
changes the default directory for temporary files from the system default.
The system default is
.Pa /tmp .
.It Fl a Ar IP-address
specifies the IP address (not the host name) of
the immediately previous SMTP client.
It is often not available.
.Fl a Ar 0.0.0.0
is ignored.
The
.Fl a
option should be used
instead of
.Fl R
if the local SMTP server adds a Received line with some other format
or does not add a Received line.
.It Fl f Ar env_from
specifies the RFC\ 821 envelope "Mail\ From" value with which the
message arrived.
It is often not available.
If
.Fl f
is not present, the contents of the first Return-Path: or UNIX style
From_ header is used.
The
.Ar env_from
string is often but need not be bracketed with "<>".
.It Fl t Ar targets
specifies the number of addressees of the message if other than 1.
The string
.Ar many
instead of a number asserts that there were too many addressees
and that the message is unsolicited bulk email.
.It Fl x Ar exitcode
specifies the code or status with which
.Nm
exits if the
.Fl c
thresholds are reached or the
.Fl w Ar whiteclnt
file blacklists the message.
.Pp
The default value is EX_NOUSER.
EX_NOUSER is 67 on many systems.
Use 0 to always exit successfully.
.It Fl c Xo
.Sm off
.Ar type,
.Op Ar log-thold,
.Ar rej-thold
.Sm on
.Xc
sets logging and "spam" thresholds for checksum
.Ar type .
The checksum types are
.Ar IP ,
.Ar env_From ,
.Ar From ,
.Ar Message-ID ,
.Ar substitute ,
.Ar Received ,
.Ar Body ,
.Ar Fuz1 ,
.Ar Fuz2 ,
.Ar rep-total ,
and
.Ar rep .
The first six,
.Ar IP
through
.Ar substitute ,
have no effect except when a local DCC server configured with
.Fl K
is used.
The
.Ar substitute
thresholds apply to the first substitute heading encountered in the mail
message.
The string
.Ar ALL
sets thresholds for all types, but is unlikely to be useful except for
setting logging thresholds.
The string
.Ar CMN
specifies the commonly used checksums
.Ar Body ,
.Ar Fuz1 ,
and
.Ar Fuz2 .
.Ar Rej-thold
and
.Ar log-thold
must be numbers, the string
.Ar NEVER ,
or the string
.Ar MANY
indicating millions of targets.
Counts from the DCC server as large as the threshold for any single type
are taken as sufficient evidence
that the message should be logged or rejected.
.Pp
.Ar Log-thold
is the threshold at which messages are logged.
It can be handy to log messages at a lower threshold to find
solicited bulk mail sources such as mailing lists.
If no logging threshold is set,
only rejected mail and messages with complicated combinations of white
and blacklisting are logged.
Messages that reach at least one of their rejection thresholds are
logged regardless of logging thresholds.
.Pp
.Ar Rej-thold
is the threshold at which messages are considered "bulk,"
and so should be rejected or discarded if not whitelisted.
.Pp
DCC Reputation thresholds in the commercial version
of the DCC are controlled by thresholds on checksum types
.Ar rep
and
.Ar rep-total .
Messages from an IP address that the DCC database says has sent
more than
.Fl c Ar rep-total,log-thold
messages are logged.
A DCC Reputation is computed for messages received
from IP addresses that
have sent more than
.Fl c Ar rep-total,log-thold
messages.
The DCC Reputation of an IP address is the percentage of its messages
that have been detected as bulk
or having at least 10 recipients.
The defaults are equivalent to
.Fl c Ar rep,never
and
.Fl c Ar rep-total,never,20 .
.Pp
Bad DCC Reputations do not reject mail unless enabled by an
.Ar option DCC-rep-on
line in a
.Pa whiteclnt
file.
.Pp
The checksums of locally whitelisted messages are not checked with
the DCC server and so only the number of targets of the current copy of
a whitelisted message is compared against the thresholds.
.Pp
The default is
.Ar ALL,NEVER ,
so that nothing is discarded, rejected, or logged.
A common choice is
.Ar CMN,25,50
to reject or discard
mail with common bodies except as overridden by
the whitelist of the DCC server, the sendmail
.Em ${dcc_isspam}
and
.Em ${dcc_notspam}
macros,
.Fl g ,
and
.Fl w .
.It Fl g Xo
.Sm off
.Op Ar not-
.Ar type
.Sm on
.Xc
indicates that whitelisted,
.Ar OK
or
.Ar OK2 ,
counts from the DCC server for a type of checksum are to be believed.
They should be ignored if prefixed with
.Ar not- .
.Ar Type
is one of the same set of strings as for
.Fl c .
Only
.Ar IP ,
.Ar env_From ,
and
.Ar From
are likely choices.
By default all three are honored,
and hence the need for
.Ar not- .
.It Fl S Ar hdr
adds to the list of substitute or locally chosen headers that
are checked with the
.Fl w Ar whiteclnt
file and sent to the DCC server.
The checksum of the last header of type
.Ar hdr
found in the message is checked.
As many as 6 different substitute headers can be specified, but only
the checksum of the first of the 6 will be sent to the DCC server.
.It Fl i Ar infile
specifies an input file for the entire message
instead of standard input.
If not absolute, the pathname is interpreted relative to the
directory in which
.Nm
was started.
.It Fl o Ar outfile
specifies an output file for the entire message including headers
instead of standard output.
If not absolute, the pathname is interpreted relative to the
directory in which
.Nm
was started.
.It Fl l Ar logdir
specifies a directory for copies of messages whose
checksum target counts exceed
.Fl c
thresholds.
The format of each file is affected by
.Fl E .
.Pp
See the FILES section below concerning the contents of the files.
See also the
.Ar option log-subdirectory-{day,hour,minute}
lines in
.Pa whiteclnt
files described in
.Xr dcc 8 .
.Pp
The directory is relative to the DCC home directory if it is not absolute.
.It Fl B Ar dnsbl-option
enables DNS blacklist checks of the SMTP client IP address, SMTP envelope
Mail_From sender domain name, and of host names in URLs in the message body.
Body URL blacklisting has too many false positives to use on
abuse mailboxes.
It is less effective than greylisting with
.Xr dccm 8
or
.Xr dccifd 8
but can be useful in situations where
greylisting cannot be used.
.Pp
.Ar Dnsbl-option
is either one of the
.Fl B Ar set:option
forms or
.Bd -literal -compact -offset 4n
.Fl B Xo
.Sm off
.Ar domain Oo Ar ,IPaddr
.Op Ar /xx Op Ar ,bltype Oc
.Sm on
.Xc
.Ed
.Ar Domain
is a DNS blacklist domain such as example.com
that will be searched.
.Ar IPaddr Ns Op Ar /xx
is the string "any",
an IP address in the DNS blacklist
that indicates that the mail message
should be rejected,
or a CIDR block covering results from the DNS blacklist.
"127.0.0.2" is assumed if
.Ar IPaddr
is absent.
IPv6 addresses can be specified with the usual colon (:) notation.
Names can be used instead of numeric addresses.
The type of DNS blacklist
is specified by
.Ar bltype
as
.Ar name ,
.Ar IPv4 ,
or
.Ar IPv6 .
Given an envelope sender domain name or a domain name in a URL of
spam.domain.org
and a blacklist of type
.Ar name ,
spam.domain.org.example.com will be tried.
Blacklist types of
.Ar IPv4
and
.Ar IPv6
require that the domain name in a URL or sender address
be resolved into an IPv4 or IPv6
address.
The address is then written as a reversed string of decimal
octets to check the DNS blacklist, as in 2.0.0.127.example.com.
.Pp
More than one blacklist can be specified and blacklists can be grouped.
All searching within a group is stopped at the first positive result.
.Pp
Unlike
.Xr dccm 8
and
.Xr dccifd 8 ,
no
.Ar option\ DNSBL-on
line is required in the
.Pa whiteclnt
file.
A
.Fl B
argument is sufficient to show that DNSBL filtering is wanted by the
.Nm
user.
.Bl -tag -width 3n
.It Fl B Ar set:no-client
says that SMTP client IP addresses and reverse DNS domain names should
not be checked in the following blacklists.
.br
.Fl B Ar set:client
restores the default for the following blacklists.
.It Fl B Ar set:no-mail_host
says that SMTP envelope Mail_From sender domain names should
not be checked in the following blacklists.
.Fl B Ar set:mail_host
restores the default.
.It Fl B Ar set:no-URL
says that URLs in the message body should not be checked in the
following blacklists.
.Fl B Ar set:URL
restores the default.
.It Fl B Ar set:no-MX
says MX servers of sender Mail_From domain names and host names in URLs
should not be checked in the following blacklists.
.br
.Fl B Ar set:MX
restores the default.
.It Fl B Ar set:no-NS
says DNS servers of sender Mail_From domain names and host names in URLs
should not be checked in the following blacklists.
.Fl B Ar set:NS
restores the default.
.It Fl B Ar set:defaults
is equivalent to all of
.Fl B Ar set:no-temp-fail
.Fl B Ar set:client
.br
.Fl B Ar set:mail_host
.Fl B Ar set:URL
.Fl B Ar set:MX
and
.Fl B Ar set:NS
.It Fl B Ar set:group=X
adds later DNS blacklists specified with
.Bd -literal -compact -offset 4n
.Fl B Xo
.Sm off
.Ar domain Oo Ar ,IPaddr
.Op Ar /xx Op Ar ,bltype Oc
.Sm on
.Xc
.Ed
to group 1, 2, or 3.
.It Fl B Ar set:debug=X
sets the DNS blacklist logging level.
.It Fl B Ar set:msg-secs=S
limits
.Nm
to
.Ar S
seconds total for checking all DNS blacklists.
The default is 25.
.It Fl B Ar set:URL-secs=S
limits
.Nm
to at most
.Ar S
seconds resolving and checking any single URL.
The default is 11.
Some spam contains dozens of URLs, and
some "spamvertised" URLs contain host names that need minutes to
resolve.
Busy mail systems cannot afford to spend minutes checking each incoming
mail message.
.El
.It Fl L Ar ltype,facility.level
specifies how messages should be logged.
.Ar Ltype
must be
.Ar error ,
.Ar info ,
or
.Ar off
to indicate which of the two types of messages is being controlled or
to turn off all
.Xr syslog 3
messages from
.Nm .
.Ar Level
must be a
.Xr syslog 3
level among
.Ar EMERG ,
.Ar ALERT ,
.Ar CRIT , ERR ,
.Ar WARNING ,
.Ar NOTICE ,
.Ar INFO ,
and
.Ar DEBUG .
.Ar Facility
must be among
.Ar AUTH ,
.Ar AUTHPRIV ,
.Ar CRON ,
.Ar DAEMON ,
.Ar FTP ,
.Ar KERN ,
.Ar LPR ,
.Ar MAIL ,
.Ar NEWS ,
.Ar USER ,
.Ar UUCP ,
and
.Ar LOCAL0
through
.Ar LOCAL7 .
The default is equivalent to
.Dl Fl L Ar info,MAIL.NOTICE  Fl L Ar error,MAIL.ERR
.El
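.Pp
As an illustrative sketch rather than a recommended configuration,
the threshold, logging, and exit code options above
might be combined in a
.Xr procmailrc 5
recipe such as the following.
The installation path and the
.Pa log
directory name are assumptions.
.Bd -literal -offset 4n
:0 fW
| /usr/local/bin/dccproc -ERw whiteclnt -c CMN,25,50 -l log -x 0
.Ed
.Pp
With these arguments, messages whose
.Ar Body ,
.Ar Fuz1 ,
or
.Ar Fuz2
counts reach 25 are copied to the
.Pa log
directory in the DCC home directory,
counts of 50 or more are treated as bulk,
and
.Fl x Ar 0
keeps the exit status zero so that a later recipe can decide
what to do with the message.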
.Pp
.Nm
exits with 0 on success and with the
.Fl x
value if the
.Fl c
thresholds are reached or the
.Fl w Ar whiteclnt
file blacklists the message.
If at all possible,
the input mail message is output to standard output or the
.Fl o Ar outfile
despite errors.
If possible, error messages are put into the system log instead of
being mixed with the output mail message.
The exit status is zero for errors so that the mail message
will not be rejected.
.Pp
If
.Nm
is run more than 500 times in fewer than 5000 seconds,
.Nm
tries to start
.Xr dccifd 8 .
The attempt is made at most once per hour.
Dccifd is significantly more efficient than
.Nm .
With luck, mechanisms such as SpamAssassin will notice when dccifd is
running and switch to dccifd.
.Sh FILES
.Bl -tag -width whiteclnt -compact
.It Pa @prefix@
DCC home directory in which other files are found.
.It Pa map
memory mapped file in the DCC home directory
of information concerning DCC servers.
.It Pa whiteclnt
contains the client whitelist in
the format described in
.Xr dcc 8 .
.It Pa whiteclnt.dccw
is a memory mapped hash table corresponding to the
.Pa whiteclnt
file.
.It Pa tmpdir
contains temporary files created and deleted as
.Nm
processes the message.
.It Pa logdir
is an optional directory specified with
.Fl l
and containing marked mail.
Each file in the directory contains one message, at least one of whose
checksums reached one of its
.Fl c
thresholds.
The entire body of the SMTP message including its header
is followed by the checksums for the message.
.El
.Sh EXAMPLES
The following
.Xr procmailrc 5
rule adds an X-DCC header to passing mail:
.Bd -literal -offset 4n
:0 f
| /usr/local/bin/dccproc -ERw whiteclnt
.Ed
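.Pp
Because
.Nm
has no thresholds of its own by default,
a later recipe can match the added header with a regular expression.
The following is a minimal sketch that assumes the header has the
X-DCC-brand-Metrics form described in
.Xr dcc 8
and reports counts as name=count pairs.
The folder name
.Pa bulk-folder
is hypothetical.
.Bd -literal -offset 4n
:0 f
| /usr/local/bin/dccproc -ERw whiteclnt

:0:
* ^X-DCC-[^:]*-Metrics:.*(Body|Fuz1|Fuz2)=many
bulk-folder
.Ed
.Pp
The second recipe files a message in
.Pa bulk-folder
when any of the commonly used checksums has a count of "many".
A numeric count could be matched in the same way with a different
regular expression.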
.Pp
This
.Xr procmailrc 5
recipe rejects mail with total counts of 10 or larger for
the commonly used checksums:
.Bd -literal -offset 4n
:0 fW
| /usr/local/bin/dccproc -ERw whiteclnt -ccmn,10
:0 e
{
    EXITCODE=67
    :0
    /dev/null
}
.Ed
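.Pp
DNS blacklist checks can be enabled from the same kind of recipe.
The following sketch uses example.com as a stand-in for a real
DNS blacklist domain, turns off checks of URLs in the message body,
and limits all DNS blacklist checking to 10 seconds per message.
The particular values are assumptions chosen for illustration.
.Bd -literal -offset 4n
:0 f
| /usr/local/bin/dccproc -ERw whiteclnt -B set:no-URL \e
    -B set:msg-secs=10 -B example.com,127.0.0.2,IPv4
.Ed
.Pp
The
.Fl B Ar set:no-URL
and
.Fl B Ar set:msg-secs=10
arguments come before the blacklist itself because
.Ar set:
options apply to the blacklists that follow them.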
.Sh SEE ALSO
.Xr cdcc 8 ,
.Xr dcc 8 ,
.Xr dbclean 8 ,
.Xr dccd 8 ,
.Xr dblist 8 ,
.Xr dccifd 8 ,
.Xr dccm 8 ,
.Xr dccsight 8 ,
.Xr mail 1 ,
.Xr procmail 1 .
.Sh HISTORY
Distributed Checksum Clearinghouses are based on an idea of Paul Vixie.
Implementation of
.Nm
was started at Rhyolite Software in 2000.
This document describes version 1.3.103.
.Sh BUGS
.Nm
uses
.Fl c
where
.Xr dccm 8
uses
.Fl t .