comparison dccproc.0 @ 0:c7f6b056b673

First import of vendor version
author Peter Gervai <grin@grin.hu>
date Tue, 10 Mar 2009 13:49:58 +0100
-1:000000000000 0:c7f6b056b673
1 dccproc(8) Distributed Checksum Clearinghouse dccproc(8)
2
3 NAME
4 dccproc -- Distributed Checksum Clearinghouse Procmail Interface
5
6 SYNOPSIS
7 dccproc [-VdAQCHER] [-h homedir] [-m map] [-w whiteclnt] [-T tmpdir]
8 [-a IP-address] [-f env_from] [-t targets] [-x exitcode]
9 [-c type,[log-thold,]rej-thold] [-g [not-]type] [-S header]
10 [-i infile] [-o outfile] [-l logdir] [-B dnsbl-option]
11 [-L ltype,facility.level]
12
13 DESCRIPTION
14 Dccproc copies a complete SMTP message from standard input or a file to
15 standard output or another file. As it copies the message, it computes
16 the DCC checksums for the message, reports them to a DCC server, and adds
17 a header line to the message. Another program such as procmail(1) can
18 use the added header line to filter mail. Dccproc does not support any
19 thresholds of its own, because equivalent effects can be achieved with
20 regular expressions and you can apply dccproc several times using
21 different DCC servers and then score mail based on what all of the DCC servers
22 say.
23
24 Error messages are sent to stderr as well as the system log. Connect
25 stderr and stdout to the same file to see errors in context, but direct
26 stderr to /dev/null to keep DCC error messages out of the mail. The -i
27 option can also be used to separate the error messages.
28
29 Dccproc sends reports of checksums related to mail received by DCC
30 clients and queries about the total number of reports of particular
31 checksums. A DCC server receives no mail, address, headers, or other
32 information, but only cryptographically secure checksums of such informa-
33 tion. A DCC server cannot determine the text or other information that
34 corresponds to the checksums it receives. It only acts as a clearing-
35 house of counts of checksums computed by clients.
36
37 For the sake of privacy for even the checksums of private mail, the
38 checksums of senders of purely internal mail or other mail that is known
39 not to be unsolicited bulk can be listed in a whitelist so that they are
40 not reported to the DCC server.
41
42 When sendmail(8) is used, dccm(8) is a better DCC interface. Dccifd(8)
43 is more efficient than dccproc because it is a daemon, but that has costs
44 in complexity. See dccsight(8) for a way to use previously computed
45 checksums.
46
47 OPTIONS
48 The following options are available:
49
50 -V displays the version of the DCC procmail(1) interface.
51
52 -d enables debugging output from the DCC client software. Additional
53 -d options increase the number of messages. One causes error
54 messages to be sent to STDERR as well as the system log.
55
56 -A adds to existing X-DCC headers (if any) of the brand of the current
57 server instead of replacing existing headers.
58
59 -Q only queries the DCC server about the checksums of messages instead
60 of reporting and then querying. This is useful when dccproc is used
61 to filter mail that has already been reported to a DCC server by
62 another DCC client such as dccm(8). No single mail message should
63 be reported to a DCC server more than once per recipient.
64
65 It is better to use MXDCC lines in the -w whiteclnt file for your MX
66 mail servers that use DCC than -Q.
67
68 -C outputs only the X-DCC header and the checksums for the message.
69
70 -H outputs only the X-DCC header.
71
72 -E adds lines to the start of the log file turned on with -l and -c
73 describing what might have been the envelope of the message. The
74 information for the inferred envelope comes from arguments including
75 -a and headers in the message when -R is used. No lines are
76 generated for which no information is available, such as the envelope
77 recipient.
78
79 -R says the first Received lines have the standard
80 "helo (name [address])..." format and the address is that of the
81 SMTP client that would otherwise be provided with -a. The -a option
82 should be used if the local SMTP server adds a Received line with
83 some other format or does not add a Received line. Received headers
84 specifying IP addresses marked MX or MXDCC in the -w whiteclnt file
85 are skipped.
86
87 -h homedir
88 overrides the default DCC home directory, /var/dcc.
89
90 -m map
91 specifies a name or path of the memory mapped parameter file instead
92 of the default map in the DCC home directory. It should be created
93 with the new map operation of the cdcc(8) command.
94
95 -w whiteclnt
96 specifies an optional file containing SMTP client IP addresses and
97 SMTP headers of mail that do not need X-DCC headers and whose
98 checksums should not be reported to the DCC server. It can also contain
99 checksums of spam. If the pathname is not absolute, it is relative
100 to the DCC home directory. Thus, individual users with private
101 whitelists usually specify them with absolute paths. Common
102 whitelists shared by users must be in the DCC home directory or one
103 of its subdirectories and owned by the set-UID user of dccproc. It
104 is useful to include a common or system-wide whitelist in private
105 lists.
106
107 Because the contents of the whiteclnt file are used frequently, a
108 companion file is automatically created and maintained. It has the
109 same pathname but with an added suffix of .dccw. It contains a
110 memory mapped hash table of the main file.
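
The pathname rule for -w and its companion file can be sketched in shell. This is only an illustration of the rule as described above, not code taken from dccproc. The function names dcc_whiteclnt_path and dcc_whiteclnt_hash and the DCC_HOMEDIR variable are hypothetical; /var/dcc is the documented default home directory.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the DCC distribution.
# DCC_HOMEDIR stands in for the DCC home directory (documented default).
DCC_HOMEDIR=/var/dcc

# Print the pathname that a -w argument refers to:
# absolute paths are used as given, others are relative to the home directory.
dcc_whiteclnt_path() {
    case "$1" in
        /*) printf '%s\n' "$1" ;;
        *)  printf '%s\n' "$DCC_HOMEDIR/$1" ;;
    esac
}

# Print the pathname of the companion hash table: same path plus ".dccw".
dcc_whiteclnt_hash() {
    printf '%s.dccw\n' "$(dcc_whiteclnt_path "$1")"
}
```

For example, dcc_whiteclnt_hash whiteclnt prints /var/dcc/whiteclnt.dccw, which matches the entry in the FILES section.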
111
112 Option lines can be used to modify many aspects of dccproc
113 filtering, as described in the main dcc(8) man page. For example, an
114 option spam-trap-accept line turns off DCC filtering and reports the
115 message as spam.
116
117 -T tmpdir
118 changes the default directory for temporary files from the system
119 default. The system default is /tmp.
120
121 -a IP-address
122 specifies the IP address (not the host name) of the immediately
123 previous SMTP client. It is often not available. -a 0.0.0.0 is
124 ignored. The -a option should be used instead of -R if the
125 local SMTP server adds a Received line with some other format or
126 does not add a Received line.
127
128 -f env_from
129 specifies the RFC 821 envelope "Mail From" value with which the
130 message arrived. It is often not available. If -f is not present, the
131 contents of the first Return-Path: or UNIX style From_ header are
132 used. The env_from string is often but need not be bracketed with
133 "<>".
134
135 -t targets
136 specifies the number of addressees of the message if other than 1.
137 The string many instead of a number asserts that there were too many
138 addressees and that the message is unsolicited bulk email.
139
140 -x exitcode
141 specifies the code or status with which dccproc exits if the -c
142 thresholds are reached or the -w whiteclnt file blacklists the
143 message.
144
145 The default value is EX_NOUSER. EX_NOUSER is 67 on many systems.
146 Use 0 to always exit successfully.
147
148 -c type,[log-thold,]rej-thold
149 sets logging and "spam" thresholds for checksum type. The checksum
150 types are IP, env_From, From, Message-ID, substitute, Received,
151 Body, Fuz1, Fuz2, rep-total, and rep. The first six, IP through
152 substitute, have no effect except when a local DCC server configured
153 with -K is used. The substitute thresholds apply to the first
154 substitute heading encountered in the mail message. The string ALL
155 sets thresholds for all types, but is unlikely to be useful except
156 for setting logging thresholds. The string CMN specifies the
157 commonly used checksums Body, Fuz1, and Fuz2. Rej-thold and log-thold
158 must be numbers, the string NEVER, or the string MANY indicating
159 millions of targets. Counts from the DCC server as large as the
160 threshold for any single type are taken as sufficient evidence that
161 the message should be logged or rejected.
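
The shape of a -c argument, with its optional middle field, can be illustrated with a small shell sketch. The function name parse_dcc_threshold is hypothetical, and the code shows only how the documented type,[log-thold,]rej-thold form splits into fields. It does not check that the type or the thresholds are values dccproc would accept.

```shell
#!/bin/sh
# Hypothetical helper, not part of the DCC distribution.
# Splits a -c argument of the form type,[log-thold,]rej-thold.
# With two fields, the second one is the rejection threshold.
parse_dcc_threshold() {
    old_ifs=$IFS
    IFS=,
    set -- $1            # unquoted on purpose: split the argument on commas
    IFS=$old_ifs
    case $# in
        2) printf 'type=%s log= rej=%s\n' "$1" "$2" ;;
        3) printf 'type=%s log=%s rej=%s\n' "$1" "$2" "$3" ;;
        *) echo "expected type,[log-thold,]rej-thold" >&2
           return 1 ;;
    esac
}
```

Called as parse_dcc_threshold CMN,25,50, the sketch reports a logging threshold of 25 and a rejection threshold of 50. Called with ALL,NEVER, it leaves the logging threshold empty.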
162
163 Log-thold is the threshold at which messages are logged. It can be
164 handy to log messages at a lower threshold to find solicited bulk
165 mail sources such as mailing lists. If no logging threshold is set,
166 only rejected mail and messages with complicated combinations of
167 white and blacklisting are logged. Messages that reach at least one
168 of their rejection thresholds are logged regardless of logging
169 thresholds.
170
171 Rej-thold is the threshold at which messages are considered "bulk,"
172 and so should be rejected or discarded if not whitelisted.
173
174 DCC Reputation thresholds in the commercial version of the DCC are
175 controlled by thresholds on checksum types rep and rep-total.
176 Messages from an IP address that the DCC database says has sent more
177 than -c rep-total,log-thold messages are logged. A DCC Reputation
178 is computed for messages received from IP addresses that have sent
179 more than -c rep-total,log-thold messages. The DCC Reputation of an
180 IP address is the percentage of its messages that have been detected
181 as bulk or having at least 10 recipients. The defaults are
182 equivalent to -c rep,never and -c rep-total,never,20.
183
184 Bad DCC Reputations do not reject mail unless enabled by an option
185 DCC-rep-on line in a whiteclnt file.
186
187 The checksums of locally whitelisted messages are not checked with
188 the DCC server and so only the number of targets of the current copy
189 of a whitelisted message is compared against the thresholds.
190
191 The default is ALL,NEVER, so that nothing is discarded, rejected, or
192 logged. A common choice is CMN,25,50 to reject or discard mail with
193 common bodies except as overridden by the whitelist of the DCC
194 server, the sendmail ${dcc_isspam} and ${dcc_notspam} macros, and
195 -g, and -w.
196
197 -g [not-]type
198 indicates that whitelisted, OK or OK2, counts from the DCC server
199 for a type of checksum are to be believed. They should be ignored
200 if prefixed with not-. Type is one of the same set of strings as
201 for -c. Only IP, env_From, and From are likely choices. By default
202 all three are honored, and hence the need for not-.
203
204 -S hdr
205 adds to the list of substitute or locally chosen headers that are
206 checked with the -w whiteclnt file and sent to the DCC server. The
207 checksum of the last header of type hdr found in the message is
208 checked. As many as 6 different substitute headers can be
209 specified, but only the checksum of the first of the 6 will be sent to
210 the DCC server.
211
212 -i infile
213 specifies an input file for the entire message instead of standard
214 input. If not absolute, the pathname is interpreted relative to the
215 directory in which dccproc was started.
216
217 -o outfile
218 specifies an output file for the entire message including headers
219 instead of standard output. If not absolute, the pathname is
220 interpreted relative to the directory in which dccproc was started.
221
222 -l logdir
223 specifies a directory for copies of messages whose checksum target
224 counts exceed -c thresholds. The format of each file is affected by
225 -E.
226
227 See the FILES section below concerning the contents of the files.
228 See also the option log-subdirectory-{day,hour,minute} lines in
229 whiteclnt files described in dcc(8).
230
231 The directory is relative to the DCC home directory if it is not
232 absolute.
233
234 -B dnsbl-option
235 enables DNS blacklist checks of the SMTP client IP address, SMTP
236 envelope Mail_From sender domain name, and of host names in URLs in
237 the message body. Body URL blacklisting has too many false
238 positives to use on abuse mailboxes. It is less effective than
239 greylisting with dccm(8) or dccifd(8) but can be useful in
240 situations where greylisting cannot be used.
241
242 Dnsbl-option is either one of the -B set:option forms or
243 -B domain[,IPaddr[/xx[,bltype]]]
244 Domain is a DNS blacklist domain such as example.com that will be
245 searched. IPaddr[/xx] is the string "any", an IP address in the DNS
246 blacklist that indicates that the mail message should be rejected,
247 or a CIDR block covering results from the DNS blacklist.
248 "127.0.0.2" is assumed if IPaddr is absent. IPv6 addresses can be
249 specified with the usual colon (:) notation. Names can be used
250 instead of numeric addresses. The type of DNS blacklist is
251 specified by bltype as name, IPv4, or IPv6. Given an envelope sender
252 domain name or a domain name in a URL of spam.domain.org and a
253 blacklist of type name, spam.domain.org.example.com will be tried.
254 Blacklist types of IPv4 and IPv6 require that the domain name in a
255 URL or sender address be resolved into an IPv4 or IPv6 address. The
256 address is then written as a reversed string of decimal octets to
257 check the DNS blacklist, as in 2.0.0.127.example.com.
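
The two kinds of lookup names described above can be sketched in shell. The function names dnsbl_name_query and dnsbl_ipv4_query are hypothetical and the code is not taken from dccproc. It only builds the names for the name and IPv4 blacklist types; IPv6 and the DNS lookup itself are left out.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the DCC distribution.

# Blacklist of type "name": append the blacklist domain to the host name.
# $1 = host name from the envelope sender or a URL, $2 = blacklist domain
dnsbl_name_query() {
    printf '%s.%s\n' "$1" "$2"
}

# Blacklist of type "IPv4": reverse the decimal octets of the address,
# then append the blacklist domain.
# $1 = dotted-quad IPv4 address, $2 = blacklist domain
dnsbl_ipv4_query() {
    addr=$1
    zone=$2
    old_ifs=$IFS
    IFS=.
    set -- $addr         # unquoted on purpose: split the address on dots
    IFS=$old_ifs
    if [ $# -ne 4 ]; then
        echo "not a dotted-quad IPv4 address: $addr" >&2
        return 1
    fi
    printf '%s.%s.%s.%s.%s\n' "$4" "$3" "$2" "$1" "$zone"
}
```

With the values used in the text, dnsbl_name_query spam.domain.org example.com prints spam.domain.org.example.com, and dnsbl_ipv4_query 127.0.0.2 example.com prints 2.0.0.127.example.com.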
258
259 More than one blacklist can be specified and blacklists can be
260 grouped. All searching within a group is stopped at the first posi-
261 tive result.
262
263 Unlike dccm(8) and dccifd(8), no option DNSBL-on line is required in
264 the whiteclnt file. A -B argument is sufficient to show that DNSBL
265 filtering is wanted by the dccproc user.
266
267 -B set:no-client
268 says that SMTP client IP addresses and reverse DNS domain names
269 should not be checked in the following blacklists.
270 -B set:client restores the default for the following
271 blacklists.
272
273 -B set:no-mail_host
274 says that SMTP envelope Mail_From sender domain names should
275 not be checked in the following blacklists. -B set:mail_host
276 restores the default.
277
278 -B set:no-URL
279 says that URLs in the message body should not be checked in the
280 following blacklists. -B set:URL restores the default.
281
282 -B set:no-MX
283 says MX servers of sender Mail_From domain names and host names
284 in URLs should not be checked in the following blacklists.
285 -B set:MX restores the default.
286
287 -B set:no-NS
288 says DNS servers of sender Mail_From domain names and host
289 names in URLs should not be checked in the following
290 blacklists. -B set:NS restores the default.
291
292 -B set:defaults
293 is equivalent to all of -B set:no-temp-fail -B set:client
294 -B set:mail_host -B set:URL -B set:MX and -B set:NS.
295
296 -B set:group=X
297 adds later DNS blacklists specified with
298 -B domain[,IPaddr[/xx[,bltype]]]
299 to group 1, 2, or 3.
300
301 -B set:debug=X
302 sets the DNS blacklist logging level.
303
304 -B set:msg-secs=S
305 limits dccproc to S seconds total for checking all DNS
306 blacklists. The default is 25.
307
308 -B set:URL-secs=S
309 limits dccproc to at most S seconds resolving and checking any
310 single URL. The default is 11. Some spam contains dozens of
311 URLs and some "spamvertised" URLs contain host names that
312 need minutes to resolve. Busy mail systems cannot afford to
313 spend minutes checking each incoming mail message.
314
315 -L ltype,facility.level
316 specifies how messages should be logged. Ltype must be error, info,
317 or off to indicate which of the two types of messages is being
318 controlled or to turn off all syslog(3) messages from dccproc. Level
319 must be a syslog(3) level among EMERG, ALERT, CRIT, ERR, WARNING,
320 NOTICE, INFO, and DEBUG. Facility must be among AUTH, AUTHPRIV,
321 CRON, DAEMON, FTP, KERN, LPR, MAIL, NEWS, USER, UUCP, and LOCAL0
322 through LOCAL7. The default is equivalent to
323 -L info,MAIL.NOTICE -L error,MAIL.ERR
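
As a rough illustration of the ltype,facility.level form, the following shell sketch splits such an argument into its three parts. The function name parse_dcc_logspec is hypothetical. The sketch handles only the full three-part form, so whether off may be given without a facility and level is an open assumption it does not settle, and it does not validate the facility or level names.

```shell
#!/bin/sh
# Hypothetical helper, not part of the DCC distribution.
# Splits a -L argument of the form ltype,facility.level.
parse_dcc_logspec() {
    spec=$1
    case $spec in
        *,*.*) ;;
        *) echo "expected ltype,facility.level" >&2
           return 1 ;;
    esac
    ltype=${spec%%,*}        # text before the first comma
    rest=${spec#*,}          # text after the first comma
    case $ltype in
        error|info|off) ;;
        *) echo "ltype must be error, info, or off" >&2
           return 1 ;;
    esac
    facility=${rest%%.*}     # text before the first dot
    level=${rest#*.}         # text after the first dot
    printf 'ltype=%s facility=%s level=%s\n' "$ltype" "$facility" "$level"
}
```

Applied to the two default settings, the sketch yields facility MAIL with level NOTICE for info messages and facility MAIL with level ERR for error messages.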
324
325 dccproc exits with 0 on success and with the -x value if the -c
326 thresholds are reached or the -w whiteclnt file blacklists the message. If at
327 all possible, the input mail message is output to standard output or the
328 -o outfile despite errors. If possible, error messages are put into the
329 system log instead of being mixed with the output mail message. The exit
330 status is zero for errors so that the mail message will not be rejected.
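
A calling script might interpret the exit status along the following lines. This is a minimal sketch under stated assumptions: the function name dcc_verdict and its three result words are hypothetical, and 67 is used as the fallback only because the text gives it as the value of EX_NOUSER on many systems.

```shell
#!/bin/sh
# Hypothetical helper, not part of the DCC distribution.
# $1 = exit status of dccproc
# $2 = value that was given to -x (assumed to be 67 if omitted)
dcc_verdict() {
    status=$1
    reject_code=${2:-67}
    if [ "$status" -eq 0 ]; then
        # Success, or an error that dccproc chose not to reject on.
        echo pass
    elif [ "$status" -eq "$reject_code" ]; then
        # A -c threshold was reached or the whiteclnt file blacklisted it.
        echo bulk
    else
        echo unexpected
    fi
}

# Possible use (requires dccproc; the file names are examples only):
#   dccproc -ERw whiteclnt -c CMN,25,50 -i msg.in -o msg.out
#   verdict=$(dcc_verdict $?)
```

Because errors also produce a zero status, a result of pass means only that the message was not marked as bulk, not that the DCC server was actually consulted.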
331
332 If dccproc is run more than 500 times in fewer than 5000 seconds, dccproc
333 tries to start dccifd(8). The attempt is made at most once per hour.
334 Dccifd is significantly more efficient than dccproc. With luck,
335 mechanisms such as SpamAssassin will notice when dccifd is running and switch
336 to dccifd.
337
338 FILES
339 /var/dcc DCC home directory in which other files are found.
340 map memory mapped file in the DCC home directory of information
341 concerning DCC servers.
342 whiteclnt contains the client whitelist in the format described in
343 dcc(8).
344 whiteclnt.dccw
345 is a memory mapped hash table corresponding to the whiteclnt
346 file.
347 tmpdir contains temporary files created and deleted as dccproc
348 processes the message.
349 logdir is an optional directory specified with -l and containing
350 marked mail. Each file in the directory contains one message,
351 at least one of whose checksums reached one of its -c
352 thresholds. The entire body of the SMTP message including its
353 header is followed by the checksums for the message.
354
355 EXAMPLES
356 The following procmailrc(5) rule adds an X-DCC header to passing mail:
357
358 :0 f
359 | /usr/local/bin/dccproc -ERw whiteclnt
360
361 This procmailrc(5) recipe rejects mail with total counts of 10 or larger
362 for the commonly used checksums:
363
364 :0 fW
365 | /usr/local/bin/dccproc -ERw whiteclnt -ccmn,10
366 :0 e
367 {
368 EXITCODE=67
369 :0
370 /dev/null
371 }
372
373 SEE ALSO
374 cdcc(8), dcc(8), dbclean(8), dccd(8), dblist(8), dccifd(8), dccm(8),
375 dccsight(8), mail(1), procmail(1).
376
377 HISTORY
378 Distributed Checksum Clearinghouses are based on an idea of Paul Vixie.
379 Implementation of dccproc was started at Rhyolite Software in 2000. This
380 document describes version 1.3.103.
381
382 BUGS
383 dccproc uses -c where dccm(8) uses -t.
384
385 February 26, 2009