| author | Reiner Steib | 2006-04-20 20:14:50 +0000 |
|---|---|---|
| committer | Reiner Steib | 2006-04-20 20:14:50 +0000 |
| commit | 93f86ee0b157d3f328ebd407b11abd6002a4b130 (patch) | |
| tree | 5df26e098cb03ba01d3b54dd14438eb7ce5bc80f | |
| parent | 5a02d811ed1b04c4f877e1bfd8607750dc02cf22 (diff) | |
2006-04-20 Reiner Steib <Reiner.Steib@gmx.de>
* gnus.texi (Spam Statistics Package): Fix typo in @pxref.
(Splitting mail using spam-stat): Fix @xref.
2006-04-20 Chong Yidong <cyd@stupidchicken.com>
* gnus.texi (Spam Package): Major revision of the text. Previouly
this node was "Filtering Spam Using The Spam ELisp Package".
| -rw-r--r-- | man/ChangeLog | 12 | ||||
| -rw-r--r-- | man/gnus.texi | 613 |
2 files changed, 328 insertions, 297 deletions
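
For orientation, the setup that the revised Spam Package text in this commit describes can be sketched as a short init-file fragment. This is only an illustration assembled from names that appear in the patch (`spam-initialize`, the `spam-use-*` variables, `spam-split-group`, `nnimap-split-fancy`); the particular choice of back ends and the file name `~/.gnus.el` are assumptions made for the example, not something the commit prescribes.

```elisp
;; Illustrative ~/.gnus.el fragment; not part of this commit.

;; Autoload spam.el and install its hooks, as the revised text requires.
(spam-initialize)

;; Enable the spam back ends that spam-split should consult
;; (which back ends to enable is an assumption for this sketch).
(setq spam-use-regex-headers t
      spam-use-blackholes t)

;; Must be an unqualified group name; "spam" is the documented default.
(setq spam-split-group "spam")

;; Fancy split: mail detected as spam goes to spam-split-group,
;; everything else falls through to the "mail" group.
(setq nnimap-split-fancy
      '(| (: spam-split)
          "mail"))
```

Fancy mail splitting itself must already be enabled for the mail back end in use, as the manual's Fancy Mail Splitting node explains; that part is omitted here.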
diff --git a/man/ChangeLog b/man/ChangeLog
index 8f5e306d290..100920e311a 100644
--- a/man/ChangeLog
+++ b/man/ChangeLog
| @@ -1,3 +1,13 @@ | |||
| 1 | 2006-04-20 Reiner Steib <Reiner.Steib@gmx.de> | ||
| 2 | |||
| 3 | * gnus.texi (Spam Statistics Package): Fix typo in @pxref. | ||
| 4 | (Splitting mail using spam-stat): Fix @xref. | ||
| 5 | |||
| 6 | 2006-04-20 Chong Yidong <cyd@stupidchicken.com> | ||
| 7 | |||
| 8 | * gnus.texi (Spam Package): Major revision of the text. Previouly | ||
| 9 | this node was "Filtering Spam Using The Spam ELisp Package". | ||
| 10 | |||
| 1 | 2006-04-20 Carsten Dominik <dominik@science.uva.nl> | 11 | 2006-04-20 Carsten Dominik <dominik@science.uva.nl> |
| 2 | 12 | ||
| 3 | * org.texi: (Time stamps): Better explanation of the purpose of | 13 | * org.texi: (Time stamps): Better explanation of the purpose of |
| @@ -8,7 +18,7 @@ | |||
| 8 | 2006-04-18 J.D. Smith <jdsmith@as.arizona.edu> | 18 | 2006-04-18 J.D. Smith <jdsmith@as.arizona.edu> |
| 9 | 19 | ||
| 10 | * misc.texi (Shell Ring): Added notes on saved input when | 20 | * misc.texi (Shell Ring): Added notes on saved input when |
| 11 | navigating off the end of the history list. | 21 | navigating off the end of the history list. |
| 12 | 22 | ||
| 13 | 2006-04-18 Chong Yidong <cyd@mit.edu> | 23 | 2006-04-18 Chong Yidong <cyd@mit.edu> |
| 14 | 24 | ||
diff --git a/man/gnus.texi b/man/gnus.texi
index 75e6243ba5e..2f1a7322dc0 100644
--- a/man/gnus.texi
+++ b/man/gnus.texi
| @@ -799,7 +799,8 @@ Various | |||
| 799 | * Moderation:: What to do if you're a moderator. | 799 | * Moderation:: What to do if you're a moderator. |
| 800 | * Image Enhancements:: Modern versions of Emacs/XEmacs can display images. | 800 | * Image Enhancements:: Modern versions of Emacs/XEmacs can display images. |
| 801 | * Fuzzy Matching:: What's the big fuzz? | 801 | * Fuzzy Matching:: What's the big fuzz? |
| 802 | * Thwarting Email Spam:: A how-to on avoiding unsolicited commercial email. | 802 | * Thwarting Email Spam:: Simple ways to avoid unsolicited commercial email. |
| 803 | * Spam Package:: A package for filtering and processing spam. | ||
| 803 | * Other modes:: Interaction with other modes. | 804 | * Other modes:: Interaction with other modes. |
| 804 | * Various Various:: Things that are really various. | 805 | * Various Various:: Things that are really various. |
| 805 | 806 | ||
| @@ -818,7 +819,8 @@ Image Enhancements | |||
| 818 | 819 | ||
| 819 | * X-Face:: Display a funky, teensy black-and-white image. | 820 | * X-Face:: Display a funky, teensy black-and-white image. |
| 820 | * Face:: Display a funkier, teensier colored image. | 821 | * Face:: Display a funkier, teensier colored image. |
| 821 | * Smileys:: Show all those happy faces the way they were meant to be shown. | 822 | * Smileys:: Show all those happy faces the way they were |
| 823 | meant to be shown. | ||
| 822 | * Picons:: How to display pictures of what you're reading. | 824 | * Picons:: How to display pictures of what you're reading. |
| 823 | * XVarious:: Other XEmacsy Gnusey variables. | 825 | * XVarious:: Other XEmacsy Gnusey variables. |
| 824 | 826 | ||
| @@ -828,28 +830,19 @@ Thwarting Email Spam | |||
| 828 | * Anti-Spam Basics:: Simple steps to reduce the amount of spam. | 830 | * Anti-Spam Basics:: Simple steps to reduce the amount of spam. |
| 829 | * SpamAssassin:: How to use external anti-spam tools. | 831 | * SpamAssassin:: How to use external anti-spam tools. |
| 830 | * Hashcash:: Reduce spam by burning CPU time. | 832 | * Hashcash:: Reduce spam by burning CPU time. |
| 831 | * Filtering Spam Using The Spam ELisp Package:: | ||
| 832 | * Filtering Spam Using Statistics with spam-stat:: | ||
| 833 | 833 | ||
| 834 | Filtering Spam Using The Spam ELisp Package | 834 | Spam Package |
| 835 | 835 | ||
| 836 | * Spam ELisp Package Sequence of Events:: | 836 | * Spam Package Introduction:: |
| 837 | * Spam ELisp Package Filtering of Incoming Mail:: | 837 | * Filtering Incoming Mail:: |
| 838 | * Spam ELisp Package Global Variables:: | 838 | * Detecting Spam in Groups:: |
| 839 | * Spam ELisp Package Configuration Examples:: | 839 | * Spam and Ham Processors:: |
| 840 | * Blacklists and Whitelists:: | 840 | * Spam Package Configuration Examples:: |
| 841 | * BBDB Whitelists:: | 841 | * Spam Back Ends:: |
| 842 | * Gmane Spam Reporting:: | 842 | * Extending the Spam package:: |
| 843 | * Anti-spam Hashcash Payments:: | 843 | * Spam Statistics Package:: |
| 844 | * Blackholes:: | ||
| 845 | * Regular Expressions Header Matching:: | ||
| 846 | * Bogofilter:: | ||
| 847 | * ifile spam filtering:: | ||
| 848 | * spam-stat spam filtering:: | ||
| 849 | * SpamOracle:: | ||
| 850 | * Extending the Spam ELisp package:: | ||
| 851 | 844 | ||
| 852 | Filtering Spam Using Statistics with spam-stat | 845 | Spam Statistics Package |
| 853 | 846 | ||
| 854 | * Creating a spam-stat dictionary:: | 847 | * Creating a spam-stat dictionary:: |
| 855 | * Splitting mail using spam-stat:: | 848 | * Splitting mail using spam-stat:: |
| @@ -20797,7 +20790,8 @@ four days, Gnus will decay the scores four times, for instance. | |||
| 20797 | * Fetching a Group:: Starting Gnus just to read a group. | 20790 | * Fetching a Group:: Starting Gnus just to read a group. |
| 20798 | * Image Enhancements:: Modern versions of Emacs/XEmacs can display images. | 20791 | * Image Enhancements:: Modern versions of Emacs/XEmacs can display images. |
| 20799 | * Fuzzy Matching:: What's the big fuzz? | 20792 | * Fuzzy Matching:: What's the big fuzz? |
| 20800 | * Thwarting Email Spam:: A how-to on avoiding unsolicited commercial email. | 20793 | * Thwarting Email Spam:: Simple ways to avoid unsolicited commercial email. |
| 20794 | * Spam Package:: A package for filtering and processing spam. | ||
| 20801 | * Other modes:: Interaction with other modes. | 20795 | * Other modes:: Interaction with other modes. |
| 20802 | * Various Various:: Things that are really various. | 20796 | * Various Various:: Things that are really various. |
| 20803 | @end menu | 20797 | @end menu |
| @@ -22479,8 +22473,6 @@ This is annoying. Here's what you can do about it. | |||
| 22479 | * Anti-Spam Basics:: Simple steps to reduce the amount of spam. | 22473 | * Anti-Spam Basics:: Simple steps to reduce the amount of spam. |
| 22480 | * SpamAssassin:: How to use external anti-spam tools. | 22474 | * SpamAssassin:: How to use external anti-spam tools. |
| 22481 | * Hashcash:: Reduce spam by burning CPU time. | 22475 | * Hashcash:: Reduce spam by burning CPU time. |
| 22482 | * Filtering Spam Using The Spam ELisp Package:: | ||
| 22483 | * Filtering Spam Using Statistics with spam-stat:: | ||
| 22484 | @end menu | 22476 | @end menu |
| 22485 | 22477 | ||
| 22486 | @node The problem of spam | 22478 | @node The problem of spam |
| @@ -22796,41 +22788,107 @@ hashcash cookies, it is expected that this is performed by your hand | |||
| 22796 | customized mail filtering scripts. Improvements in this area would be | 22788 | customized mail filtering scripts. Improvements in this area would be |
| 22797 | a useful contribution, however. | 22789 | a useful contribution, however. |
| 22798 | 22790 | ||
| 22799 | @node Filtering Spam Using The Spam ELisp Package | 22791 | @node Spam Package |
| 22800 | @subsection Filtering Spam Using The Spam ELisp Package | 22792 | @section Spam Package |
| 22793 | @cindex spam filtering | ||
| 22794 | @cindex spam | ||
| 22795 | |||
| 22796 | The Spam package provides Gnus with a centralized mechanism for | ||
| 22797 | detecting and filtering spam. It filters new mail, and processes | ||
| 22798 | messages according to whether they are spam or ham. (@dfn{Ham} is the | ||
| 22799 | name used throughout this manual to indicate non-spam messages.) | ||
| 22800 | |||
| 22801 | @menu | ||
| 22802 | * Spam Package Introduction:: | ||
| 22803 | * Filtering Incoming Mail:: | ||
| 22804 | * Detecting Spam in Groups:: | ||
| 22805 | * Spam and Ham Processors:: | ||
| 22806 | * Spam Package Configuration Examples:: | ||
| 22807 | * Spam Back Ends:: | ||
| 22808 | * Extending the Spam package:: | ||
| 22809 | * Spam Statistics Package:: | ||
| 22810 | @end menu | ||
| 22811 | |||
| 22812 | @node Spam Package Introduction | ||
| 22813 | @subsection Spam Package Introduction | ||
| 22801 | @cindex spam filtering | 22814 | @cindex spam filtering |
| 22815 | @cindex spam filtering sequence of events | ||
| 22802 | @cindex spam | 22816 | @cindex spam |
| 22803 | 22817 | ||
| 22804 | The idea behind @file{spam.el} is to have a control center for spam detection | 22818 | You must read this section to understand how the Spam package works. |
| 22805 | and filtering in Gnus. To that end, @file{spam.el} does two things: it | 22819 | Do not skip, speed-read, or glance through this section. |
| 22806 | filters new mail, and it analyzes mail known to be spam or ham. | ||
| 22807 | @dfn{Ham} is the name used throughout @file{spam.el} to indicate | ||
| 22808 | non-spam messages. | ||
| 22809 | 22820 | ||
| 22810 | @cindex spam-initialize | 22821 | @cindex spam-initialize |
| 22811 | First of all, you @strong{must} run the function | 22822 | @vindex spam-use-stat |
| 22812 | @code{spam-initialize} to autoload @code{spam.el} and to install the | 22823 | To use the Spam package, you @strong{must} first run the function |
| 22813 | @code{spam.el} hooks. There is one exception: if you use the | 22824 | @code{spam-initialize}: |
| 22814 | @code{spam-use-stat} (@pxref{spam-stat spam filtering}) setting, you | ||
| 22815 | should turn it on before @code{spam-initialize}: | ||
| 22816 | 22825 | ||
| 22817 | @example | 22826 | @example |
| 22818 | (setq spam-use-stat t) ;; if needed | ||
| 22819 | (spam-initialize) | 22827 | (spam-initialize) |
| 22820 | @end example | 22828 | @end example |
| 22821 | 22829 | ||
| 22822 | So, what happens when you load @file{spam.el}? | 22830 | This autoloads @code{spam.el} and installs the various hooks necessary |
| 22823 | 22831 | to let the Spam package do its job. In order to make use of the Spam | |
| 22824 | First, some hooks will get installed by @code{spam-initialize}. There | 22832 | package, you have to set up certain group parameters and variables, |
| 22825 | are some hooks for @code{spam-stat} so it can save its databases, and | 22833 | which we will describe below. All of the variables controlling the |
| 22826 | there are hooks so interesting things will happen when you enter and | 22834 | Spam package can be found in the @samp{spam} customization group. |
| 22827 | leave a group. More on the sequence of events later (@pxref{Spam | 22835 | |
| 22828 | ELisp Package Sequence of Events}). | 22836 | There are two ``contact points'' between the Spam package and the rest |
| 22829 | 22837 | of Gnus: checking new mail for spam, and leaving a group. | |
| 22830 | You get the following keyboard commands: | 22838 | |
| 22839 | Checking new mail for spam is done in one of two ways: while splitting | ||
| 22840 | incoming mail, or when you enter a group. | ||
| 22841 | |||
| 22842 | The first way, checking for spam while splitting incoming mail, is | ||
| 22843 | suited to mail back ends such as @code{nnml} or @code{nnimap}, where | ||
| 22844 | new mail appears in a single spool file. The Spam package processes | ||
| 22845 | incoming mail, and sends mail considered to be spam to a designated | ||
| 22846 | ``spam'' group. @xref{Filtering Incoming Mail}. | ||
| 22847 | |||
| 22848 | The second way is suited to back ends such as @code{nntp}, which have | ||
| 22849 | no incoming mail spool, or back ends where the server is in charge of | ||
| 22850 | splitting incoming mail. In this case, when you enter a Gnus group, | ||
| 22851 | the unseen or unread messages in that group are checked for spam. | ||
| 22852 | Detected spam messages are marked as spam. @xref{Detecting Spam in | ||
| 22853 | Groups}. | ||
| 22854 | |||
| 22855 | @cindex spam back ends | ||
| 22856 | In either case, you have to tell the Spam package what method to use | ||
| 22857 | to detect spam messages. There are several methods, or @dfn{spam back | ||
| 22858 | ends} (not to be confused with Gnus back ends!) to choose from: spam | ||
| 22859 | ``blacklists'' and ``whitelists'', dictionary-based filters, and so | ||
| 22860 | forth. @xref{Spam Back Ends}. | ||
| 22861 | |||
| 22862 | In the Gnus summary buffer, messages that have been identified as spam | ||
| 22863 | always appear with a @samp{$} symbol. | ||
| 22864 | |||
| 22865 | The Spam package divides Gnus groups into three categories: ham | ||
| 22866 | groups, spam groups, and unclassified groups. You should mark each of | ||
| 22867 | the groups you subscribe to as either a ham group or a spam group, | ||
| 22868 | using the @code{spam-contents} group parameter (@pxref{Group | ||
| 22869 | Parameters}). Spam groups have a special property: when you enter a | ||
| 22870 | spam group, all unseen articles are marked as spam. Thus, mail split | ||
| 22871 | into a spam group is automatically marked as spam. | ||
| 22872 | |||
| 22873 | Identifying spam messages is only half of the Spam package's job. The | ||
| 22874 | second half comes into play whenever you exit a group buffer. At this | ||
| 22875 | point, the Spam package does several things: | ||
| 22876 | |||
| 22877 | First, it calls @dfn{spam and ham processors} to process the articles | ||
| 22878 | according to whether they are spam or ham. There is a pair of spam | ||
| 22879 | and ham processors associated with each spam back end, and what the | ||
| 22880 | processors do depends on the back end. At present, the main role of | ||
| 22881 | spam and ham processors is for dictionary-based spam filters: they add | ||
| 22882 | the contents of the messages in the group to the filter's dictionary, | ||
| 22883 | to improve its ability to detect future spam. The @code{spam-process} | ||
| 22884 | group parameter specifies what spam processors to use. @xref{Spam and | ||
| 22885 | Ham Processors}. | ||
| 22886 | |||
| 22887 | If the spam filter failed to mark a spam message, you can mark it | ||
| 22888 | yourself, so that the message is processed as spam when you exit the | ||
| 22889 | group: | ||
| 22831 | 22890 | ||
| 22832 | @table @kbd | 22891 | @table @kbd |
| 22833 | |||
| 22834 | @item M-d | 22892 | @item M-d |
| 22835 | @itemx M s x | 22893 | @itemx M s x |
| 22836 | @itemx S x | 22894 | @itemx S x |
| @@ -22838,189 +22896,103 @@ You get the following keyboard commands: | |||
| 22838 | @kindex S x | 22896 | @kindex S x |
| 22839 | @kindex M s x | 22897 | @kindex M s x |
| 22840 | @findex gnus-summary-mark-as-spam | 22898 | @findex gnus-summary-mark-as-spam |
| 22841 | @code{gnus-summary-mark-as-spam}. | 22899 | @findex gnus-summary-mark-as-spam |
| 22842 | 22900 | Mark current article as spam, showing it with the @samp{$} mark | |
| 22843 | Mark current article as spam, showing it with the @samp{$} mark. | 22901 | (@code{gnus-summary-mark-as-spam}). |
| 22844 | Whenever you see a spam article, make sure to mark its summary line | ||
| 22845 | with @kbd{M-d} before leaving the group. This is done automatically | ||
| 22846 | for unread articles in @emph{spam} groups. | ||
| 22847 | |||
| 22848 | @item M s t | ||
| 22849 | @itemx S t | ||
| 22850 | @kindex M s t | ||
| 22851 | @kindex S t | ||
| 22852 | @findex spam-bogofilter-score | ||
| 22853 | @code{spam-bogofilter-score}. | ||
| 22854 | |||
| 22855 | You must have Bogofilter installed for that command to work properly. | ||
| 22856 | |||
| 22857 | @xref{Bogofilter}. | ||
| 22858 | |||
| 22859 | @end table | 22902 | @end table |
| 22860 | 22903 | ||
| 22861 | Also, when you load @file{spam.el}, you will be able to customize its | 22904 | @noindent |
| 22862 | variables. Try @code{customize-group} on the @samp{spam} variable | 22905 | Similarly, you can unmark an article if it has been erroneously marked |
| 22863 | group. | 22906 | as spam. @xref{Setting Marks}. |
| 22864 | |||
| 22865 | @menu | ||
| 22866 | * Spam ELisp Package Sequence of Events:: | ||
| 22867 | * Spam ELisp Package Filtering of Incoming Mail:: | ||
| 22868 | * Spam ELisp Package Global Variables:: | ||
| 22869 | * Spam ELisp Package Configuration Examples:: | ||
| 22870 | * Blacklists and Whitelists:: | ||
| 22871 | * BBDB Whitelists:: | ||
| 22872 | * Gmane Spam Reporting:: | ||
| 22873 | * Anti-spam Hashcash Payments:: | ||
| 22874 | * Blackholes:: | ||
| 22875 | * Regular Expressions Header Matching:: | ||
| 22876 | * Bogofilter:: | ||
| 22877 | * ifile spam filtering:: | ||
| 22878 | * spam-stat spam filtering:: | ||
| 22879 | * SpamOracle:: | ||
| 22880 | * Extending the Spam ELisp package:: | ||
| 22881 | @end menu | ||
| 22882 | |||
| 22883 | @node Spam ELisp Package Sequence of Events | ||
| 22884 | @subsubsection Spam ELisp Package Sequence of Events | ||
| 22885 | @cindex spam filtering | ||
| 22886 | @cindex spam filtering sequence of events | ||
| 22887 | @cindex spam | ||
| 22888 | |||
| 22889 | You must read this section to understand how @code{spam.el} works. | ||
| 22890 | Do not skip, speed-read, or glance through this section. | ||
| 22891 | |||
| 22892 | There are two @emph{contact points}, if you will, between | ||
| 22893 | @code{spam.el} and the rest of Gnus: checking new mail for spam, and | ||
| 22894 | leaving a group. | ||
| 22895 | |||
| 22896 | Getting new mail is done in one of two ways. You can either split | ||
| 22897 | your incoming mail or you can classify new articles as ham or spam | ||
| 22898 | when you enter the group. | ||
| 22899 | |||
| 22900 | Splitting incoming mail is better suited to mail backends such as | ||
| 22901 | @code{nnml} or @code{nnimap} where new mail appears in a single file | ||
| 22902 | called a @dfn{Spool File}. See @xref{Spam ELisp Package Filtering of | ||
| 22903 | Incoming Mail}. | ||
| 22904 | |||
| 22905 | For backends such as @code{nntp} there is no incoming mail spool, so | ||
| 22906 | an alternate mechanism must be used. This may also happen for | ||
| 22907 | backends where the server is in charge of splitting incoming mail, and | ||
| 22908 | Gnus does not do further splitting. The @code{spam-autodetect} and | ||
| 22909 | @code{spam-autodetect-methods} group parameters (accessible with | ||
| 22910 | @kbd{G c} and @kbd{G p} as usual), and the corresponding variables | ||
| 22911 | @code{gnus-spam-autodetect-methods} and | ||
| 22912 | @code{gnus-spam-autodetect-methods} (accessible with @kbd{M-x | ||
| 22913 | customize-variable} as usual). | ||
| 22914 | |||
| 22915 | When @code{spam-autodetect} is used, it hooks into the process of | ||
| 22916 | entering a group. Thus, entering a group with unseen or unread | ||
| 22917 | articles becomes the substitute for checking incoming mail. Whether | ||
| 22918 | only unseen articles or all unread articles will be processed is | ||
| 22919 | determined by the @code{spam-autodetect-recheck-messages}. When set | ||
| 22920 | to @code{t}, unread messages will be rechecked. | ||
| 22921 | |||
| 22922 | @code{spam-autodetect} grants the user at once more and less control | ||
| 22923 | of spam filtering. The user will have more control over each group's | ||
| 22924 | spam methods, so for instance the @samp{ding} group may have | ||
| 22925 | @code{spam-use-BBDB} as the autodetection method, while the | ||
| 22926 | @samp{suspect} group may have the @code{spam-use-blacklist} and | ||
| 22927 | @code{spam-use-bogofilter} methods enabled. Every article detected to | ||
| 22928 | be spam will be marked with the spam mark @samp{$} and processed on | ||
| 22929 | exit from the group as normal spam. The user has less control over | ||
| 22930 | the @emph{sequence} of checks, as he might with @code{spam-split}. | ||
| 22931 | |||
| 22932 | When the newly split mail goes into groups, or messages are | ||
| 22933 | autodetected to be ham or spam, those groups must be exited (after | ||
| 22934 | entering, if needed) for further spam processing to happen. It | ||
| 22935 | matters whether the group is considered a ham group, a spam group, or | ||
| 22936 | is unclassified, based on its @code{spam-content} parameter | ||
| 22937 | (@pxref{Spam ELisp Package Global Variables}). Spam groups have the | ||
| 22938 | additional characteristic that, when entered, any unseen or unread | ||
| 22939 | articles (depending on the @code{spam-mark-only-unseen-as-spam} | ||
| 22940 | variable) will be marked as spam. Thus, mail split into a spam group | ||
| 22941 | gets automatically marked as spam when you enter the group. | ||
| 22942 | |||
| 22943 | So, when you exit a group, the @code{spam-processors} are applied, if | ||
| 22944 | any are set, and the processed mail is moved to the | ||
| 22945 | @code{ham-process-destination} or the @code{spam-process-destination} | ||
| 22946 | depending on the article's classification. If the | ||
| 22947 | @code{ham-process-destination} or the @code{spam-process-destination}, | ||
| 22948 | whichever is appropriate, are @code{nil}, the article is left in the | ||
| 22949 | current group. | ||
| 22950 | |||
| 22951 | If a spam is found in any group (this can be changed to only non-spam | ||
| 22952 | groups with @code{spam-move-spam-nonspam-groups-only}), it is | ||
| 22953 | processed by the active @code{spam-processors} (@pxref{Spam ELisp | ||
| 22954 | Package Global Variables}) when the group is exited. Furthermore, the | ||
| 22955 | spam is moved to the @code{spam-process-destination} (@pxref{Spam | ||
| 22956 | ELisp Package Global Variables}) for further training or deletion. | ||
| 22957 | You have to load the @code{gnus-registry.el} package and enable the | ||
| 22958 | @code{spam-log-to-registry} variable if you want spam to be processed | ||
| 22959 | no more than once. Thus, spam is detected and processed everywhere, | ||
| 22960 | which is what most people want. If the | ||
| 22961 | @code{spam-process-destination} is @code{nil}, the spam is marked as | ||
| 22962 | expired, which is usually the right thing to do. | ||
| 22963 | |||
| 22964 | If spam can not be moved---because of a read-only backend such as | ||
| 22965 | @acronym{NNTP}, for example, it will be copied. | ||
| 22966 | 22907 | ||
| 22967 | If a ham mail is found in a ham group, as determined by the | 22908 | Normally, a ham message found in a non-ham group is not processed as |
| 22968 | @code{ham-marks} parameter, it is processed as ham by the active ham | 22909 | ham---the rationale is that it should be moved into a ham group for |
| 22969 | @code{spam-processor} when the group is exited. With the variables | 22910 | further processing (see below). However, you can force these articles |
| 22911 | to be processed as ham by setting | ||
| 22970 | @code{spam-process-ham-in-spam-groups} and | 22912 | @code{spam-process-ham-in-spam-groups} and |
| 22971 | @code{spam-process-ham-in-nonham-groups} the behavior can be further | 22913 | @code{spam-process-ham-in-nonham-groups}. |
| 22972 | altered so ham found anywhere can be processed. You have to load the | ||
| 22973 | @code{gnus-registry.el} package and enable the | ||
| 22974 | @code{spam-log-to-registry} variable if you want ham to be processed | ||
| 22975 | no more than once. Thus, ham is detected and processed only when | ||
| 22976 | necessary, which is what most people want. More on this in | ||
| 22977 | @xref{Spam ELisp Package Configuration Examples}. | ||
| 22978 | 22914 | ||
| 22979 | If ham can not be moved---because of a read-only backend such as | 22915 | @vindex gnus-ham-process-destinations |
| 22980 | @acronym{NNTP}, for example, it will be copied. | 22916 | @vindex gnus-spam-process-destinations |
| 22917 | The second thing that the Spam package does when you exit a group is | ||
| 22918 | to move ham articles out of spam groups, and spam articles out of ham | ||
| 22919 | groups. Ham in a spam group is moved to the group specified by the | ||
| 22920 | variable @code{gnus-ham-process-destinations}, or the group parameter | ||
| 22921 | @code{ham-process-destination}. Spam in a ham group is moved to the | ||
| 22922 | group specified by the variable @code{gnus-spam-process-destinations}, | ||
| 22923 | or the group parameter @code{spam-process-destination}. If these | ||
| 22924 | variables are not set, the articles are left in their current group. | ||
| 22925 | If an article cannot not be moved (e.g., with a read-only backend such | ||
| 22926 | as @acronym{NNTP}), it is copied. | ||
| 22927 | |||
| 22928 | If an article is moved to another group, it is processed again when | ||
| 22929 | you visit the new group. Normally, this is not a problem, but if you | ||
| 22930 | want each article to be processed only once, load the | ||
| 22931 | @code{gnus-registry.el} package and set the variable | ||
| 22932 | @code{spam-log-to-registry} to @code{t}. @xref{Spam Package | ||
| 22933 | Configuration Examples}. | ||
| 22934 | |||
| 22935 | Normally, spam groups ignore @code{gnus-spam-process-destinations}. | ||
| 22936 | However, if you set @code{spam-move-spam-nonspam-groups-only} to | ||
| 22937 | @code{nil}, spam will also be moved out of spam groups, depending on | ||
| 22938 | the @code{spam-process-destination} parameter. | ||
| 22939 | |||
| 22940 | The final thing the Spam package does is to mark spam articles as | ||
| 22941 | expired, which is usually the right thing to do. | ||
| 22981 | 22942 | ||
| 22982 | If all this seems confusing, don't worry. Soon it will be as natural | 22943 | If all this seems confusing, don't worry. Soon it will be as natural |
| 22983 | as typing Lisp one-liners on a neural interface@dots{} err, sorry, that's | 22944 | as typing Lisp one-liners on a neural interface@dots{} err, sorry, that's |
| 22984 | 50 years in the future yet. Just trust us, it's not so bad. | 22945 | 50 years in the future yet. Just trust us, it's not so bad. |
| 22985 | 22946 | ||
| 22986 | @node Spam ELisp Package Filtering of Incoming Mail | 22947 | @node Filtering Incoming Mail |
| 22987 | @subsubsection Spam ELisp Package Filtering of Incoming Mail | 22948 | @subsection Filtering Incoming Mail |
| 22988 | @cindex spam filtering | 22949 | @cindex spam filtering |
| 22989 | @cindex spam filtering incoming mail | 22950 | @cindex spam filtering incoming mail |
| 22990 | @cindex spam | 22951 | @cindex spam |
| 22991 | 22952 | ||
| 22992 | To use the @file{spam.el} facilities for incoming mail filtering, you | 22953 | To use the Spam package to filter incoming mail, you must first set up |
| 22993 | must add the following to your fancy split list | 22954 | fancy mail splitting. @xref{Fancy Mail Splitting}. The Spam package |
| 22994 | @code{nnmail-split-fancy} or @code{nnimap-split-fancy}: | 22955 | defines a special splitting function that you can add to your fancy |
| 22956 | split variable (either @code{nnmail-split-fancy} or | ||
| 22957 | @code{nnimap-split-fancy}, depending on your mail back end): | ||
| 22995 | 22958 | ||
| 22996 | @example | 22959 | @example |
| 22997 | (: spam-split) | 22960 | (: spam-split) |
| 22998 | @end example | 22961 | @end example |
| 22999 | 22962 | ||
| 23000 | Note that the fancy split may be called @code{nnmail-split-fancy} or | 22963 | @vindex spam-split-group |
| 23001 | @code{nnimap-split-fancy}, depending on whether you use the nnmail or | 22964 | @noindent |
| 23002 | nnimap back ends to retrieve your mail. | 22965 | The @code{spam-split} function scans incoming mail according to your |
| 23003 | 22966 | chosen spam back end(s), and sends messages identified as spam to a | |
| 23004 | Also, @code{spam-split} will not modify incoming mail in any way. | 22967 | spam group. By default, the spam group is a group named @samp{spam}, |
| 23005 | 22968 | but you can change this by customizing @code{spam-split-group}. Make | |
| 23006 | The @code{spam-split} function will process incoming mail and send the | 22969 | sure the contents of @code{spam-split-group} are an unqualified group |
| 23007 | mail considered to be spam into the group name given by the variable | 22970 | name. For instance, in an @code{nnimap} server @samp{your-server}, |
| 23008 | @code{spam-split-group}. By default that group name is @samp{spam}, | 22971 | the value @samp{spam} means @samp{nnimap+your-server:spam}. The value |
| 23009 | but you can customize @code{spam-split-group}. Make sure the contents | 22972 | @samp{nnimap+server:spam} is therefore wrong---it gives the group |
| 23010 | of @code{spam-split-group} are an @emph{unqualified} group name, for | 22973 | @samp{nnimap+your-server:nnimap+server:spam}. |
| 23011 | instance in an @code{nnimap} server @samp{your-server} the value | 22974 | |
| 23012 | @samp{spam} will turn out to be @samp{nnimap+your-server:spam}. The | 22975 | @code{spam-split} does not modify the contents of messages in any way. |
| 23013 | value @samp{nnimap+server:spam}, therefore, is wrong and will | ||
| 23014 | actually give you the group | ||
| 23015 | @samp{nnimap+your-server:nnimap+server:spam} which may or may not | ||
| 23016 | work depending on your server's tolerance for strange group names. | ||
| 23017 | |||
| 23018 | You can also give @code{spam-split} a parameter, | ||
| 23019 | e.g. @code{spam-use-regex-headers} or @code{"maybe-spam"}. Why is | ||
| 23020 | this useful? | ||
| 23021 | 22976 | ||
| 23022 | Take these split rules (with @code{spam-use-regex-headers} and | 22977 | @vindex nnimap-split-download-body |
| 23023 | @code{spam-use-blackholes} set): | 22978 | Note for IMAP users: if you use the @code{spam-check-bogofilter}, |
| 22979 | @code{spam-check-ifile}, and @code{spam-check-stat} spam back ends, | ||
| 22980 | you should also set set the variable @code{nnimap-split-download-body} | ||
| 22981 | to @code{t}. These spam back ends are most useful when they can | ||
| 22982 | ``scan'' the full message body. By default, the nnimap back end only | ||
| 22983 | retrieves the message headers; @code{nnimap-split-download-body} tells | ||
| 22984 | it to retrieve the message bodies as well. We don't set this by | ||
| 22985 | default because it will slow @acronym{IMAP} down, and that is not an | ||
| 22986 | appropriate decision to make on behalf of the user. @xref{Splitting | ||
| 22987 | in IMAP}. | ||
| 22988 | |||
| 22989 | You have to specify one or more spam back ends for @code{spam-split} | ||
| 22990 | to use, by setting the @code{spam-use-*} variables. @xref{Spam Back | ||
| 22991 | Ends}. Normally, @code{spam-split} simply uses all the spam back ends | ||
| 22992 | you enabled in this way. However, you can tell @code{spam-split} to | ||
| 22993 | use only some of them. Why this is useful? Suppose you are using the | ||
| 22994 | @code{spam-use-regex-headers} and @code{spam-use-blackholes} spam back | ||
| 22995 | ends, and the following split rule: | ||
| 23024 | 22996 | ||
| 23025 | @example | 22997 | @example |
| 23026 | nnimap-split-fancy '(| | 22998 | nnimap-split-fancy '(| |
| @@ -23030,21 +23002,23 @@ Take these split rules (with @code{spam-use-regex-headers} and | |||
| 23030 | "mail") | 23002 | "mail") |
| 23031 | @end example | 23003 | @end example |
| 23032 | 23004 | ||
| 23033 | Now, the problem is that you want all ding messages to make it to the | 23005 | @noindent |
| 23034 | ding folder. But that will let obvious spam (for example, spam | 23006 | The problem is that you want all ding messages to make it to the ding |
| 23035 | detected by SpamAssassin, and @code{spam-use-regex-headers}) through, | 23007 | folder. But that will let obvious spam (for example, spam detected by |
| 23036 | when it's sent to the ding list. On the other hand, some messages to | 23008 | SpamAssassin, and @code{spam-use-regex-headers}) through, when it's |
| 23037 | the ding list are from a mail server in the blackhole list, so the | 23009 | sent to the ding list. On the other hand, some messages to the ding |
| 23038 | invocation of @code{spam-split} can't be before the ding rule. | 23010 | list are from a mail server in the blackhole list, so the invocation |
| 23039 | 23011 | of @code{spam-split} can't be before the ding rule. | |
| 23040 | You can let SpamAssassin headers supersede ding rules, but all other | 23012 | |
| 23041 | @code{spam-split} rules (including a second invocation of the | 23013 | The solution is to let SpamAssassin headers supersede ding rules, and |
| 23042 | regex-headers check) will be after the ding rule: | 23014 | perform the other @code{spam-split} rules (including a second |
| 23015 | invocation of the regex-headers check) after the ding rule. This is | ||
| 23016 | done by passing a parameter to @code{spam-split}: | ||
| 23043 | 23017 | ||
| 23044 | @example | 23018 | @example |
| 23045 | nnimap-split-fancy | 23019 | nnimap-split-fancy |
| 23046 | '(| | 23020 | '(| |
| 23047 | ;; @r{all spam detected by @code{spam-use-regex-headers} goes to @samp{regex-spam}} | 23021 | ;; @r{spam detected by @code{spam-use-regex-headers} goes to @samp{regex-spam}} |
| 23048 | (: spam-split "regex-spam" 'spam-use-regex-headers) | 23022 | (: spam-split "regex-spam" 'spam-use-regex-headers) |
| 23049 | (any "ding" "ding") | 23023 | (any "ding" "ding") |
| 23050 | ;; @r{all other spam detected by spam-split goes to @code{spam-split-group}} | 23024 | ;; @r{all other spam detected by spam-split goes to @code{spam-split-group}} |
| @@ -23053,58 +23027,68 @@ nnimap-split-fancy | |||
| 23053 | "mail") | 23027 | "mail") |
| 23054 | @end example | 23028 | @end example |
| 23055 | 23029 | ||
| 23030 | @noindent | ||
| 23056 | This lets you invoke specific @code{spam-split} checks depending on | 23031 | This lets you invoke specific @code{spam-split} checks depending on |
| 23057 | your particular needs, and to target the results of those checks to a | 23032 | your particular needs, and target the results of those checks to a |
| 23058 | particular spam group. You don't have to throw all mail into all the | 23033 | particular spam group. You don't have to throw all mail into all the |
| 23059 | spam tests. Another reason why this is nice is that messages to | 23034 | spam tests. Another reason why this is nice is that messages to |
| 23060 | mailing lists you have rules for don't have to have resource-intensive | 23035 | mailing lists you have rules for don't have to have resource-intensive |
| 23061 | blackhole checks performed on them. You could also specify different | 23036 | blackhole checks performed on them. You could also specify different |
| 23062 | spam checks for your nnmail split vs. your nnimap split. Go crazy. | 23037 | spam checks for your nnmail split vs. your nnimap split. Go crazy. |
| 23063 | 23038 | ||
| 23064 | You should still have specific checks such as | 23039 | You should set the @code{spam-use-*} variables for whatever spam back |
| 23065 | @code{spam-use-regex-headers} set to @code{t}, even if you | 23040 | ends you intend to use. The reason is that when loading |
| 23066 | specifically invoke @code{spam-split} with the check. The reason is | 23041 | @file{spam.el}, some conditional loading is done depending on what |
| 23067 | that when loading @file{spam.el}, some conditional loading is done | 23042 | @code{spam-use-xyz} variables you have set. @xref{Spam Back Ends}. |
| 23068 | depending on what @code{spam-use-xyz} variables you have set. This | 23043 | |
| 23069 | is usually not critical, though. | 23044 | @c @emph{TODO: spam.el needs to provide a uniform way of training all the |
| 23070 | 23045 | @c statistical databases. Some have that functionality built-in, others | |
| 23071 | @emph{Note for IMAP users} | 23046 | @c don't.} |
| 23072 | 23047 | ||
| 23073 | The boolean variable @code{nnimap-split-download-body} needs to be | 23048 | @node Detecting Spam in Groups |
| 23074 | set, if you want to split based on the whole message instead of just | 23049 | @subsection Detecting Spam in Groups |
| 23075 | the headers. By default, the nnimap back end will only retrieve the | 23050 | |
| 23076 | message headers. If you use @code{spam-check-bogofilter}, | 23051 | To detect spam when visiting a group, set the group's |
| 23077 | @code{spam-check-ifile}, or @code{spam-check-stat} (the splitters that | 23052 | @code{spam-autodetect} and @code{spam-autodetect-methods} group |
| 23078 | can benefit from the full message body), you should set this variable. | 23053 | parameters. These are accessible with @kbd{G c} or @kbd{G p}, as |
| 23079 | It is not set by default because it will slow @acronym{IMAP} down, and | 23054 | usual (@pxref{Group Parameters}). |
| 23080 | that is not an appropriate decision to make on behalf of the user. | 23055 | |
| 23081 | 23056 | You should set the @code{spam-use-*} variables for whatever spam back | |
| 23082 | @xref{Splitting in IMAP}. | 23057 | ends you intend to use. The reason is that when loading |
| 23083 | 23058 | @file{spam.el}, some conditional loading is done depending on what | |
| 23084 | @emph{TODO: spam.el needs to provide a uniform way of training all the | 23059 | @code{spam-use-xyz} variables you have set. |
| 23085 | statistical databases. Some have that functionality built-in, others | 23060 | |
| 23086 | don't.} | 23061 | By default, only unseen articles are processed for spam. You can |
| 23087 | 23062 | force Gnus to recheck all messages in the group by setting the | |
| 23088 | @node Spam ELisp Package Global Variables | 23063 | variable @code{spam-autodetect-recheck-messages} to @code{t}. |
| 23089 | @subsubsection Spam ELisp Package Global Variables | 23064 | |
| 23065 | If you use the @code{spam-autodetect} method of checking for spam, you | ||
| 23066 | can specify different spam detection methods for different groups. | ||
| 23067 | For instance, the @samp{ding} group may have @code{spam-use-BBDB} as | ||
| 23068 | the autodetection method, while the @samp{suspect} group may have the | ||
| 23069 | @code{spam-use-blacklist} and @code{spam-use-bogofilter} methods | ||
| 23070 | enabled. Unlike with @code{spam-split}, you don't have any control | ||
| 23071 | over the @emph{sequence} of checks, but this is probably unimportant. | ||
| 23072 | |||
| 23073 | @node Spam and Ham Processors | ||
| 23074 | @subsection Spam and Ham Processors | ||
| 23090 | @cindex spam filtering | 23075 | @cindex spam filtering |
| 23091 | @cindex spam filtering variables | 23076 | @cindex spam filtering variables |
| 23092 | @cindex spam variables | 23077 | @cindex spam variables |
| 23093 | @cindex spam | 23078 | @cindex spam |
| 23094 | 23079 | ||
| 23095 | @vindex gnus-spam-process-newsgroups | 23080 | @vindex gnus-spam-process-newsgroups |
| 23096 | The concepts of ham processors and spam processors are very important. | 23081 | Spam and ham processors specify special actions to take when you exit |
| 23097 | Ham processors and spam processors for a group can be set with the | 23082 | a group buffer. Spam processors act on spam messages, and ham |
| 23098 | @code{spam-process} group parameter, or the | 23083 | processors on ham messages. At present, the main role of these |
| 23099 | @code{gnus-spam-process-newsgroups} variable. Ham processors take | 23084 | processors is to update the dictionaries of dictionary-based spam back |
| 23100 | mail known to be non-spam (@emph{ham}) and process it in some way so | 23085 | ends such as Bogofilter (@pxref{Bogofilter}) and the Spam Statistics |
| 23101 | that later similar mail will also be considered non-spam. Spam | 23086 | package (@pxref{Spam Statistics Filtering}). |
| 23102 | processors take mail known to be spam and process it so similar spam | 23087 | |
| 23103 | will be detected later. | 23088 | The spam and ham processors that apply to each group are determined by |
| 23104 | 23089 | the group's @code{spam-process} group parameter. If this group | |
| 23105 | The format of the spam or ham processor entry used to be a symbol, | 23090 | parameter is not defined, they are determined by the variable |
| 23106 | but now it is a @sc{cons} cell. See the individual spam processor entries | 23091 | @code{gnus-spam-process-newsgroups}. |
| 23107 | for more information. | ||
| 23108 | 23092 | ||
| 23109 | @vindex gnus-spam-newsgroup-contents | 23093 | @vindex gnus-spam-newsgroup-contents |
| 23110 | Gnus learns from the spam you get. You have to collect your spam in | 23094 | Gnus learns from the spam you get. You have to collect your spam in |
| @@ -23258,8 +23242,8 @@ When autodetecting spam, this variable tells @code{spam.el} whether | |||
| 23258 | only unseen articles or all unread articles should be checked for | 23242 | only unseen articles or all unread articles should be checked for |
| 23259 | spam. It is recommended that you leave it off. | 23243 | spam. It is recommended that you leave it off. |
| 23260 | 23244 | ||
| 23261 | @node Spam ELisp Package Configuration Examples | 23245 | @node Spam Package Configuration Examples |
| 23262 | @subsubsection Spam ELisp Package Configuration Examples | 23246 | @subsection Spam Package Configuration Examples |
| 23263 | @cindex spam filtering | 23247 | @cindex spam filtering |
| 23264 | @cindex spam filtering configuration examples | 23248 | @cindex spam filtering configuration examples |
| 23265 | @cindex spam configuration examples | 23249 | @cindex spam configuration examples |
| @@ -23384,11 +23368,11 @@ bogofilter or DCC). | |||
| 23384 | 23368 | ||
| 23385 | Because of the @code{gnus-group-spam-classification-spam} entry, all | 23369 | Because of the @code{gnus-group-spam-classification-spam} entry, all |
| 23386 | messages are marked as spam (with @code{$}). When I find a false | 23370 | messages are marked as spam (with @code{$}). When I find a false |
| 23387 | positive, I mark the message with some other ham mark (@code{ham-marks}, | 23371 | positive, I mark the message with some other ham mark |
| 23388 | @ref{Spam ELisp Package Global Variables}). On group exit, those | 23372 | (@code{ham-marks}, @ref{Spam and Ham Processors}). On group exit, |
| 23389 | messages are copied to both groups, @samp{INBOX} (where I want to have | 23373 | those messages are copied to both groups, @samp{INBOX} (where I want |
| 23390 | the article) and @samp{training.ham} (for training bogofilter) and | 23374 | to have the article) and @samp{training.ham} (for training bogofilter) |
| 23391 | deleted from the @samp{spam.detected} folder. | 23375 | and deleted from the @samp{spam.detected} folder. |
| 23392 | 23376 | ||
| 23393 | The @code{gnus-article-sort-by-chars} entry simplifies detection of | 23377 | The @code{gnus-article-sort-by-chars} entry simplifies detection of |
| 23394 | false positives for me. I receive lots of worms (sweN, @dots{}), that all | 23378 | false positives for me. I receive lots of worms (sweN, @dots{}), that all |
| @@ -23424,6 +23408,29 @@ through my local news server (leafnode). I.e. the article numbers are | |||
| 23424 | not the same as on news.gmane.org, thus @code{spam-report.el} has to check | 23408 | not the same as on news.gmane.org, thus @code{spam-report.el} has to check |
| 23425 | the @code{X-Report-Spam} header to find the correct number. | 23409 | the @code{X-Report-Spam} header to find the correct number. |
| 23426 | 23410 | ||
| 23411 | @node Spam Back Ends | ||
| 23412 | @subsection Spam Back Ends | ||
| 23413 | @cindex spam back ends | ||
| 23414 | |||
| 23415 | The spam package offers a variety of back ends for detecting spam. | ||
| 23416 | Each back end defines a set of methods for detecting spam | ||
| 23417 | (@pxref{Filtering Incoming Mail}, @pxref{Detecting Spam in Groups}), | ||
| 23418 | and a pair of spam and ham processors (@pxref{Spam and Ham | ||
| 23419 | Processors}). | ||
| 23420 | |||
| 23421 | @menu | ||
| 23422 | * Blacklists and Whitelists:: | ||
| 23423 | * BBDB Whitelists:: | ||
| 23424 | * Gmane Spam Reporting:: | ||
| 23425 | * Anti-spam Hashcash Payments:: | ||
| 23426 | * Blackholes:: | ||
| 23427 | * Regular Expressions Header Matching:: | ||
| 23428 | * Bogofilter:: | ||
| 23429 | * ifile spam filtering:: | ||
| 23430 | * Spam Statistics Filtering:: | ||
| 23431 | * SpamOracle:: | ||
| 23432 | @end menu | ||
| 23433 | |||
| 23427 | @node Blacklists and Whitelists | 23434 | @node Blacklists and Whitelists |
| 23428 | @subsubsection Blacklists and Whitelists | 23435 | @subsubsection Blacklists and Whitelists |
| 23429 | @cindex spam filtering | 23436 | @cindex spam filtering |
| @@ -23728,6 +23735,15 @@ You should not enable this if you use @code{spam-use-bogofilter-headers}. | |||
| 23728 | 23735 | ||
| 23729 | @end defvar | 23736 | @end defvar |
| 23730 | 23737 | ||
| 23738 | @table @kbd | ||
| 23739 | @item M s t | ||
| 23740 | @itemx S t | ||
| 23741 | @kindex M s t | ||
| 23742 | @kindex S t | ||
| 23743 | @findex spam-bogofilter-score | ||
| 23744 | Get the Bogofilter spamicity score (@code{spam-bogofilter-score}). | ||
| 23745 | @end table | ||
| 23746 | |||
| 23731 | @defvar spam-use-bogofilter-headers | 23747 | @defvar spam-use-bogofilter-headers |
| 23732 | 23748 | ||
| 23733 | Set this variable if you want @code{spam-split} to use Eric Raymond's | 23749 | Set this variable if you want @code{spam-split} to use Eric Raymond's |
| @@ -23829,20 +23845,21 @@ purpose. A ham and a spam processor are provided, plus the | |||
| 23829 | should be used. The 1.2.1 version of ifile was used to test this | 23845 | should be used. The 1.2.1 version of ifile was used to test this |
| 23830 | functionality. | 23846 | functionality. |
| 23831 | 23847 | ||
| 23832 | @node spam-stat spam filtering | 23848 | @node Spam Statistics Filtering |
| 23833 | @subsubsection spam-stat spam filtering | 23849 | @subsubsection Spam Statistics Filtering |
| 23834 | @cindex spam filtering | 23850 | @cindex spam filtering |
| 23835 | @cindex spam-stat, spam filtering | 23851 | @cindex spam-stat, spam filtering |
| 23836 | @cindex spam-stat | 23852 | @cindex spam-stat |
| 23837 | @cindex spam | 23853 | @cindex spam |
| 23838 | 23854 | ||
| 23839 | @xref{Filtering Spam Using Statistics with spam-stat}. | 23855 | This back end uses the Spam Statistics Emacs Lisp package to perform |
| 23856 | statistics-based filtering (@pxref{Spam Statistics Package}). Before | ||
| 23857 | using this, you may want to perform some additional steps to | ||
| 23858 | initialize your Spam Statistics dictionary. @xref{Creating a | ||
| 23859 | spam-stat dictionary}. | ||
| 23840 | 23860 | ||
| 23841 | @defvar spam-use-stat | 23861 | @defvar spam-use-stat |
| 23842 | 23862 | ||
| 23843 | Enable this variable if you want @code{spam-split} to use | ||
| 23844 | spam-stat.el, an Emacs Lisp statistical analyzer. | ||
| 23845 | |||
| 23846 | @end defvar | 23863 | @end defvar |
| 23847 | 23864 | ||
| 23848 | @defvar gnus-group-spam-exit-processor-stat | 23865 | @defvar gnus-group-spam-exit-processor-stat |
| @@ -23902,18 +23919,17 @@ One possibility is to run SpamOracle as a @code{:prescript} from the | |||
| 23902 | @xref{Mail Source Specifiers}, (@pxref{SpamAssassin}). This method has | 23919 | @xref{Mail Source Specifiers}, (@pxref{SpamAssassin}). This method has |
| 23903 | the advantage that the user can see the @emph{X-Spam} headers. | 23920 | the advantage that the user can see the @emph{X-Spam} headers. |
| 23904 | 23921 | ||
| 23905 | The easiest method is to make @file{spam.el} (@pxref{Filtering Spam | 23922 | The easiest method is to make @file{spam.el} (@pxref{Spam Package}) |
| 23906 | Using The Spam ELisp Package}) call SpamOracle. | 23923 | call SpamOracle. |
| 23907 | 23924 | ||
| 23908 | @vindex spam-use-spamoracle | 23925 | @vindex spam-use-spamoracle |
| 23909 | To enable SpamOracle usage by @file{spam.el}, set the variable | 23926 | To enable SpamOracle usage by @file{spam.el}, set the variable |
| 23910 | @code{spam-use-spamoracle} to @code{t} and configure the | 23927 | @code{spam-use-spamoracle} to @code{t} and configure the |
| 23911 | @code{nnmail-split-fancy} or @code{nnimap-split-fancy} as described in | 23928 | @code{nnmail-split-fancy} or @code{nnimap-split-fancy}. @xref{Spam |
| 23912 | the section @xref{Filtering Spam Using The Spam ELisp Package}. In | 23929 | Package}. In this example the @samp{INBOX} of an nnimap server is |
| 23913 | this example the @samp{INBOX} of an nnimap server is filtered using | 23930 | filtered using SpamOracle. Mails recognized as spam mails will be |
| 23914 | SpamOracle. Mails recognized as spam mails will be moved to | 23931 | moved to @code{spam-split-group}, @samp{Junk} in this case. Ham |
| 23915 | @code{spam-split-group}, @samp{Junk} in this case. Ham messages stay | 23932 | messages stay in @samp{INBOX}: |
| 23916 | in @samp{INBOX}: | ||
| 23917 | 23933 | ||
| 23918 | @example | 23934 | @example |
| 23919 | (setq spam-use-spamoracle t | 23935 | (setq spam-use-spamoracle t |
| @@ -23945,14 +23961,14 @@ database to live somewhere special, set | |||
| 23945 | 23961 | ||
| 23946 | SpamOracle employs a statistical algorithm to determine whether a | 23962 | SpamOracle employs a statistical algorithm to determine whether a |
| 23947 | message is spam or ham. In order to get good results, meaning few | 23963 | message is spam or ham. In order to get good results, meaning few |
| 23948 | false hits or misses, SpamOracle needs training. SpamOracle learns the | 23964 | false hits or misses, SpamOracle needs training. SpamOracle learns |
| 23949 | characteristics of your spam mails. Using the @emph{add} mode | 23965 | the characteristics of your spam mails. Using the @emph{add} mode |
| 23950 | (training mode) one has to feed good (ham) and spam mails to | 23966 | (training mode) one has to feed good (ham) and spam mails to |
| 23951 | SpamOracle. This can be done by pressing @kbd{|} in the Summary buffer | 23967 | SpamOracle. This can be done by pressing @kbd{|} in the Summary |
| 23952 | and pipe the mail to a SpamOracle process or using @file{spam.el}'s | 23968 | buffer and pipe the mail to a SpamOracle process or using |
| 23953 | spam- and ham-processors, which is much more convenient. For a | 23969 | @file{spam.el}'s spam- and ham-processors, which is much more |
| 23954 | detailed description of spam- and ham-processors, @xref{Filtering Spam | 23970 | convenient. For a detailed description of spam- and ham-processors, |
| 23955 | Using The Spam ELisp Package}. | 23971 | @xref{Spam Package}. |
| 23956 | 23972 | ||
| 23957 | @defvar gnus-group-spam-exit-processor-spamoracle | 23973 | @defvar gnus-group-spam-exit-processor-spamoracle |
| 23958 | Add this symbol to a group's @code{spam-process} parameter by | 23974 | Add this symbol to a group's @code{spam-process} parameter by |
| @@ -24001,8 +24017,8 @@ the user marks some messages as spam messages, these messages will be | |||
| 24001 | processed by SpamOracle. The processor sends the messages to | 24017 | processed by SpamOracle. The processor sends the messages to |
| 24002 | SpamOracle as new samples for spam. | 24018 | SpamOracle as new samples for spam. |
| 24003 | 24019 | ||
| 24004 | @node Extending the Spam ELisp package | 24020 | @node Extending the Spam package |
| 24005 | @subsubsection Extending the Spam ELisp package | 24021 | @subsection Extending the Spam package |
| 24006 | @cindex spam filtering | 24022 | @cindex spam filtering |
| 24007 | @cindex spam elisp package, extending | 24023 | @cindex spam elisp package, extending |
| 24008 | @cindex extending the spam elisp package | 24024 | @cindex extending the spam elisp package |
| @@ -24109,9 +24125,8 @@ to the @code{spam-autodetect-methods} group parameter in | |||
| 24109 | 24125 | ||
| 24110 | @end enumerate | 24126 | @end enumerate |
| 24111 | 24127 | ||
| 24112 | 24128 | @node Spam Statistics Package | |
| 24113 | @node Filtering Spam Using Statistics with spam-stat | 24129 | @subsection Spam Statistics Package |
| 24114 | @subsection Filtering Spam Using Statistics with spam-stat | ||
| 24115 | @cindex Paul Graham | 24130 | @cindex Paul Graham |
| 24116 | @cindex Graham, Paul | 24131 | @cindex Graham, Paul |
| 24117 | @cindex naive Bayesian spam filtering | 24132 | @cindex naive Bayesian spam filtering |
| @@ -24138,7 +24153,11 @@ non-spam mail. Use the 15 most conspicuous words, compute the total | |||
| 24138 | probability of the mail being spam. If this probability is higher | 24153 | probability of the mail being spam. If this probability is higher |
| 24139 | than a certain threshold, the mail is considered to be spam. | 24154 | than a certain threshold, the mail is considered to be spam. |
| 24140 | 24155 | ||
| 24141 | Gnus supports this kind of filtering. But it needs some setting up. | 24156 | The Spam Statistics package adds support to Gnus for this kind of |
| 24157 | filtering. It can be used as one of the back ends of the Spam package | ||
| 24158 | (@pxref{Spam Package}), or by itself. | ||
| 24159 | |||
| 24160 | Before using the Spam Statistics package, you need to set it up. | ||
| 24142 | First, you need two collections of your mail, one with spam, one with | 24161 | First, you need two collections of your mail, one with spam, one with |
| 24143 | non-spam. Then you need to create a dictionary using these two | 24162 | non-spam. Then you need to create a dictionary using these two |
| 24144 | collections, and save it. And last but not least, you need to use | 24163 | collections, and save it. And last but not least, you need to use |
| @@ -24224,8 +24243,10 @@ The filename used to store the dictionary. This defaults to | |||
| 24224 | @node Splitting mail using spam-stat | 24243 | @node Splitting mail using spam-stat |
| 24225 | @subsubsection Splitting mail using spam-stat | 24244 | @subsubsection Splitting mail using spam-stat |
| 24226 | 24245 | ||
| 24227 | In order to use @code{spam-stat} to split your mail, you need to add the | 24246 | This section describes how to use the Spam statistics |
| 24228 | following to your @file{~/.gnus.el} file: | 24247 | @emph{independently} of the @ref{Spam Package}. |
| 24248 | |||
| 24249 | First, add the following to your @file{~/.gnus.el} file: | ||
| 24229 | 24250 | ||
| 24230 | @lisp | 24251 | @lisp |
| 24231 | (require 'spam-stat) | 24252 | (require 'spam-stat) |