Discussion:
Filter on Scandinavian characters in subject
Jostein Berntsen
2018-09-14 13:40:20 UTC
Hi,

when I use recipes like these to filter messages with Scandinavian
characters (æ,ø,å) in Subject it fails to work. My locale is
nb_NO.UTF-8. Is there a recipe that can be used to match these cases?

:0
* ^Subject:.*lån
innboks/IN-spam/

:0
* Subject:.*belønning
innboks/IN-spam/



Jostein

____________________________________________________________
procmail mailing list -- ***@lists.rwth-aachen.de Procmail homepage: http://www.procmail.org/
To unsubscribe send an email to procmail-***@lists.rwth-aachen.de
https://lists.rwth
Andreas Schamanek
2018-09-14 15:34:04 UTC
Post by Jostein Berntsen
when I use recipes like these to filter messages with Scandinavian
characters (æ,ø,å) in Subject it fails to work. My locale is
nb_NO.UTF-8. Is there a recipe that can be used to match these cases?
:0
* ^Subject:.*lån
innboks/IN-spam/
A proper message must have such characters encoded. Look at the source
of messages. You will see something like (for "lån")

=?UTF-8?B?bMOlbg==?=

When you match against this (mind the ? characters and escape them as \?) it
should work.
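
To see where that string comes from, here is a minimal Python sketch. Python is used purely for illustration and is not part of the setup discussed in this thread; the helper name encode_word is hypothetical. It builds the Base64 ("B") form of an RFC 2047 encoded-word by hand.

```python
import base64


def encode_word(text: str, charset: str = "UTF-8") -> str:
    """Build an RFC 2047 'B' (Base64) encoded-word for the given text."""
    # Encode the text to bytes in the chosen charset, then Base64 those bytes.
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return f"=?{charset}?B?{payload}?="


print(encode_word("lån"))  # =?UTF-8?B?bMOlbg==?=
```

Note that the same word sent in another charset, or with Q encoding instead of Base64, produces a different string, so a pattern written against the raw header only catches one particular encoding.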

--
-- Andreas

:-)
Jostein Berntsen
2018-09-14 15:57:13 UTC
Post by Jostein Berntsen
when I use recipes like these to filter messages with Scandinavian
characters (æ,ø,å) in Subject it fails to work. My locale is
nb_NO.UTF-8. Is there a recipe that can be used to match these cases?
:0
* ^Subject:.*lån
innboks/IN-spam/
A proper message must have such characters encoded. Look at the source of
messages. You will see something like (for "lån")
=?UTF-8?B?bMOlbg==?=
Post by Jostein Berntsen
When you match against this (mind the ? characters and escape them as \?) it should
work.
Thanks. I solved it doing this:

:0 h
* ^Subject:.*=\?
SUBJECT=| formail -cXSubject: | perl -MEncode -ne 'print encode("UTF8",decode("MIME-Header",$_))'

:0 hE
SUBJECT=| formail -cXSubject:

:0
* SUBJECT ?? ^Subject:.*lån
innboks/IN-spam/
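
The effect of the decoding step can be tried outside procmail. The following is a rough Python sketch, in which the standard email.header module stands in for the perl Encode call above; the names decode_subject, subject_matches and samples are hypothetical and only serve the illustration. It decodes a few differently encoded Subject lines and applies the same pattern as the last recipe.

```python
import re
from email.header import decode_header, make_header


def decode_subject(raw: str) -> str:
    """Decode any RFC 2047 encoded-words in a header line to plain text."""
    return str(make_header(decode_header(raw)))


def subject_matches(raw: str) -> bool:
    """Apply the pattern from the recipe to the decoded header line."""
    return re.search(r"^Subject:.*lån", decode_subject(raw)) is not None


# The same word in three encodings, plus a subject that should not match.
samples = [
    "Subject: =?UTF-8?B?bMOlbg==?=",
    "Subject: =?UTF-8?Q?l=C3=A5n?=",
    "Subject: =?ISO-8859-1?Q?l=E5n?=",
    "Subject: plain ASCII text",
]

for raw in samples:
    print(raw, "->", subject_matches(raw))
```

The first three samples all decode to the same text and match, which is why testing the decoded variable needs only one condition regardless of how the sender encoded the header.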


Something for the manual maybe?


Jostein

@lbutlr
2018-09-19 20:07:17 UTC
Post by Jostein Berntsen
Post by Jostein Berntsen
when I use recipes like these to filter messages with Scandinavian
characters (æ,ø,å) in Subject it fails to work. My locale is
nb_NO.UTF-8. Is there a recipe that can be used to match these cases?
:0
* ^Subject:.*lån
innboks/IN-spam/
A proper message must have such characters encoded. Look at the source of
messages. You will see something like (for "lån")
=?UTF-8?B?bMOlbg==?=
Post by Jostein Berntsen
When you match against this (mind the ? characters and escape them as \?) it should
work.
:0 h
* ^Subject:.*=\?
SUBJECT=| formail -cXSubject: | perl -MEncode -ne 'print encode("UTF8",decode("MIME-Header",$_))'
By rewriting the message to include UTF-8 characters in the headers you have just made your message invalid as the mail headers can only contain 7-BIT ASCII and anything else must be encoded.

However, it's your mail, do as you will. You *will* have issues if you try to do something else with those messages, ever. Like, for example, import them into a different client. Or put them on an IMAP server.
Post by Jostein Berntsen
Something for the manual maybe?
No. Andreas gave you the right solution: match against the encoded text in the subject:

:0
* ^Subject:.*=\?UTF-8\?B\?bMOlbg

{ do stuff }

Or, save your UTF-8 decoded subject into a variable like UTFSUB=| formail…



--
Space Directive 723: Terraformers are expressly forbidden from
recreating Swindon.
Andreas Schamanek
2018-09-21 11:05:32 UTC
Post by @lbutlr
Post by Jostein Berntsen
:0 h
* ^Subject:.*=\?
SUBJECT=| formail -cXSubject: | perl -MEncode -ne 'print encode("UTF8",decode("MIME-Header",$_))'
By rewriting the message to include UTF-8 characters in the headers
you have just made your message invalid ...
This is not rewriting a message, it is assigning a variable. It's
exactly what you later suggested yourself, and I agree that it is the more versatile
Post by @lbutlr
Or, save your UTF-8 decoded subject into a variable like UTFSUB=| formail…
--
-- Andreas

:-)
Jostein Berntsen
2018-09-24 17:58:35 UTC
Post by @lbutlr
Post by Jostein Berntsen
:0 h
* ^Subject:.*=\?
SUBJECT=| formail -cXSubject: | perl -MEncode -ne 'print encode("UTF8",decode("MIME-Header",$_))'
By rewriting the message to include UTF-8 characters in the headers you
have just made your message invalid ...
This is not rewriting a message, it is assigning a variable. It's exactly
what you later suggested yourself, and I agree that it is the more versatile
So my approach is a good one after all? :)


Jostein
Post by @lbutlr
Or, save your UTF-8 decoded subject into a variable like UTFSUB=| formail…
--
-- Andreas
:-)
Ruud H.G. van Tol
2018-09-24 18:32:09 UTC
Post by Jostein Berntsen
Post by @lbutlr
Post by Jostein Berntsen
:0 h
* ^Subject:.*=\?
SUBJECT=| formail -cXSubject: | perl -MEncode -ne 'print encode("UTF8",decode("MIME-Header",$_))'
By rewriting the message to include UTF-8 characters in the headers you
have just made your message invalid ...
This is not rewriting a message, it is assigning a variable. It's exactly
what you later suggested yourself, and I agree that it is the more versatile
So my approach is a good one after all? :)
Yes. See also procmailex.

-- Ruud