guile-email discussion
From: Ricardo Wurmus <rekado@elephly.net>
To: guile-email@systemreboot.net
Subject: [guile-email] parse-email-headers returns just “fields”
Date: Tue, 21 Apr 2020 14:24:16 +0200
Message-ID: <87k129kowf.fsf@elephly.net>

Hi,

I’m currently trying to parse the debbugs bug log files directly.  They
contain emails (and other information), so I cut out the email text and
feed it to parse-email.  In some cases the emails don’t seem to have a
content-transfer-encoding header, which leads to an error when trying to
decode the body.

So instead of using parse-email directly I use

  (email->headers+body
   (string->bytevector content "utf-8"))

and run parse-email-headers over the first value, add a dummy
content-transfer-encoding header with value 'binary if it’s missing, and
then parse the body.
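
A minimal sketch of the "add a dummy header" step, assuming the parsed
headers are an alist keyed by symbols.  The helper name
ensure-content-transfer-encoding is hypothetical and not part of
guile-email; only core Guile procedures are used:

```scheme
;; Sketch only: HEADERS is assumed to be an alist keyed by symbols,
;; e.g. ((subject . "...") (content-type . ...)).
(define (ensure-content-transfer-encoding headers)
  "Return HEADERS unchanged if it already has a
content-transfer-encoding entry; otherwise return a new alist with a
dummy entry whose value is the symbol binary."
  (if (assq 'content-transfer-encoding headers)
      headers
      (acons 'content-transfer-encoding 'binary headers)))
```

The result can then be handed to whatever parses the body.  Note that
this assumes the input really is an alist, so it does not cover the case
where the header parser returns something else.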

Now I noticed two odd things:

* I sometimes need to discard the first two lines of the raw email to
  get the headers to be fully parsed
* in some cases the result of parse-email-headers is a literal “fields”,
  not an alist.

Attached is one of these emails.

-- 
Ricardo



[-- Attachment #2: mail.txt --]
[-- Type: text/plain, Size: 4531 bytes --]

From drew.adams@oracle.com Sat Sep 13 09:41:10 2008
X-Spam-Checker-Version: SpamAssassin 3.2.3-bugs.debian.org_2005_01_02
	(2007-08-08) on rzlab.ucr.edu
X-Spam-Level: 
X-Spam-Status: No, score=-6.7 required=4.0 tests=AWL,BAYES_00,
	RCVD_IN_DNSWL_MED,UNPARSEABLE_RELAY autolearn=ham
	version=3.2.3-bugs.debian.org_2005_01_02
Received: (at submit) by emacsbugs.donarmstrong.com; 13 Sep 2008 16:41:10 +0000
Received: from fencepost.gnu.org (fencepost.gnu.org [140.186.70.10])
	by rzlab.ucr.edu (8.13.8/8.13.8/Debian-3) with ESMTP id m8DGf5S7011950
	for <submit@emacsbugs.donarmstrong.com>; Sat, 13 Sep 2008 09:41:07 -0700
Received: from mail.gnu.org ([199.232.76.166]:57008 helo=mx10.gnu.org)
	by fencepost.gnu.org with esmtp (Exim 4.67)
	(envelope-from <drew.adams@oracle.com>)
	id 1KeY9S-0000Yg-Bg
	for emacs-pretest-bug@gnu.org; Sat, 13 Sep 2008 12:39:14 -0400
Received: from Debian-exim by monty-python.gnu.org with spam-scanned (Exim 4.60)
	(envelope-from <drew.adams@oracle.com>)
	id 1KeYBB-0004EU-3I
	for emacs-pretest-bug@gnu.org; Sat, 13 Sep 2008 12:41:04 -0400
Received: from agminet01.oracle.com ([141.146.126.228]:17202)
	by monty-python.gnu.org with esmtps (TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32)
	(Exim 4.60)
	(envelope-from <drew.adams@oracle.com>)
	id 1KeYBA-0004Cs-4Y
	for emacs-pretest-bug@gnu.org; Sat, 13 Sep 2008 12:41:00 -0400
Received: from agmgw1.us.oracle.com (agmgw1.us.oracle.com [152.68.180.212])
	by agminet01.oracle.com (Switch-3.2.4/Switch-3.1.7) with ESMTP id m8DGeoWf024917
	for <emacs-pretest-bug@gnu.org>; Sat, 13 Sep 2008 11:40:50 -0500
Received: from acsmt701.oracle.com (acsmt701.oracle.com [141.146.40.71])
	by agmgw1.us.oracle.com (Switch-3.2.0/Switch-3.2.0) with ESMTP id m8DGensx018593
	for <emacs-pretest-bug@gnu.org>; Sat, 13 Sep 2008 10:40:50 -0600
Received: from dradamslap1 (/24.23.165.218)
	by default (Oracle Beehive Gateway v4.0)
	with ESMTP ; Sat, 13 Sep 2008 09:40:49 -0700
From: "Drew Adams" <drew.adams@oracle.com>
To: <emacs-pretest-bug@gnu.org>
Subject: 23.0.60; incorrect code for filesets-get-filelist
Date: Sat, 13 Sep 2008 09:40:59 -0700
Message-ID: <002901c915bf$811df210$0200a8c0@us.oracle.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Office Outlook 11
Thread-Index: AckVv4Ci1hEH5okgSQ2EUXtBuurWgg==
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3350
X-Brightmail-Tracker: AAAAAQAAAAI=
X-Brightmail-Tracker: AAAAAQAAAAI=
X-Whitelist: TRUE
X-Whitelist: TRUE
X-detected-operating-system: by monty-python.gnu.org: GNU/Linux 2.4-2.6

The part that treats a :tree of the code defining
`filesets-get-filelist' is not correct and never could have been
correct. And it does not correspond to the (correct) code from the
filesets author.  One wonders if the GNU Emacs code was ever tested.
 
This is the `case' clause that treats :tree in the definition
of `filesets-get-filelist':
 
((:tree)
 (let ((dir  (nth 0 entry))
       (patt (nth 1 entry)))
   (filesets-directory-files dir patt ':files t)))
 
But `entry' here is a complete fileset, which is of the form
("my-fs" (:tree "/some/directory" "^.+\.suffix$"))
 
The above code thus tries to use "my-fs" as the directory, whereas it
should use "/some/directory".
 
This is the (correct) code in the latest version from the author
(http://members.a1.net/t.link/CompEmacsFilesets.html). (The comment is
from the author.)
 
((:tree)
 ;;well, the way trees are handled is a mess +++
 (let* ((dirpatt (if (consp (nth 1 entry))
                     (filesets-entry-get-tree entry)
                   entry))
        (dir     (nth 0 dirpatt))
        (patt    (nth 1 dirpatt)))
   (filesets-list-dir dir patt ':files t)))
 
However, I think the following would be sufficient:
 
((:tree)
 (let* ((dirpatt (filesets-entry-get-tree entry))
        (dir  (nth 0 dirpatt))
        (patt (nth 1 dirpatt)))
   (filesets-directory-files dir patt ':files t)))
 
I don't see why the author's more complex treatment would ever be
needed, since in order for the :tree clause of the `case' to be
reached (consp (nth 1 entry)) must be a cons, AFAICT.
 
At any rate, either the author's code or what I suggest immediately
above is needed. There is no way that the current GNU Emacs code can
work with a :tree fileset.


In GNU Emacs 23.0.60.1 (i386-mingw-nt5.1.2600)
 of 2008-09-03 on LENNART-69DE564
Windowing system distributor `Microsoft Corp.', version 5.1.2600
configured using `configure --with-gcc (3.4) --no-opt --cflags -Ic:/g/include
-fno-crossjumping'
 




