From: Ricardo Wurmus <rekado@elephly.net>
To: guile-email@systemreboot.net
Subject: [guile-email] parse-email-headers returns just “fields”
Date: Tue, 21 Apr 2020 14:24:16 +0200
Message-ID: <87k129kowf.fsf@elephly.net>

[-- Attachment #1: Type: text/plain, Size: 896 bytes --]

Hi,

I’m currently trying to parse the debbugs bug log files directly. They contain emails (and other information), so I cut out the email text and feed it to parse-email.

In some cases the emails don’t seem to have a content-transfer-encoding header, which leads to an error when trying to decode the body. So instead of using parse-email directly I use

    (email->headers+body (string->bytevector content "utf-8"))

and run parse-email-headers over the first value, add a dummy content-transfer-encoding header with value 'binary if it’s missing and then parse the body.

Now I noticed two odd things:

* I sometimes need to discard the first two lines of the raw email to get the headers to be fully parsed

* in some cases the result of parse-email-headers is a literal “fields”, not an alist.

Attached is one of these emails.
--
Ricardo

[-- Attachment #2: mail.txt --]
[-- Type: text/plain, Size: 4531 bytes --]

From drew.adams@oracle.com Sat Sep 13 09:41:10 2008
X-Spam-Checker-Version: SpamAssassin 3.2.3-bugs.debian.org_2005_01_02
    (2007-08-08) on rzlab.ucr.edu
X-Spam-Level:
X-Spam-Status: No, score=-6.7 required=4.0 tests=AWL,BAYES_00,
    RCVD_IN_DNSWL_MED,UNPARSEABLE_RELAY autolearn=ham
    version=3.2.3-bugs.debian.org_2005_01_02
Received: (at submit) by emacsbugs.donarmstrong.com; 13 Sep 2008 16:41:10 +0000
Received: from fencepost.gnu.org (fencepost.gnu.org [140.186.70.10])
    by rzlab.ucr.edu (8.13.8/8.13.8/Debian-3) with ESMTP id m8DGf5S7011950
    for <submit@emacsbugs.donarmstrong.com>; Sat, 13 Sep 2008 09:41:07 -0700
Received: from mail.gnu.org ([199.232.76.166]:57008 helo=mx10.gnu.org)
    by fencepost.gnu.org with esmtp (Exim 4.67)
    (envelope-from <drew.adams@oracle.com>)
    id 1KeY9S-0000Yg-Bg
    for emacs-pretest-bug@gnu.org; Sat, 13 Sep 2008 12:39:14 -0400
Received: from Debian-exim by monty-python.gnu.org with spam-scanned (Exim 4.60)
    (envelope-from <drew.adams@oracle.com>)
    id 1KeYBB-0004EU-3I
    for emacs-pretest-bug@gnu.org; Sat, 13 Sep 2008 12:41:04 -0400
Received: from agminet01.oracle.com ([141.146.126.228]:17202)
    by monty-python.gnu.org with esmtps (TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32)
    (Exim 4.60)
    (envelope-from <drew.adams@oracle.com>)
    id 1KeYBA-0004Cs-4Y
    for emacs-pretest-bug@gnu.org; Sat, 13 Sep 2008 12:41:00 -0400
Received: from agmgw1.us.oracle.com (agmgw1.us.oracle.com [152.68.180.212])
    by agminet01.oracle.com (Switch-3.2.4/Switch-3.1.7) with ESMTP id m8DGeoWf024917
    for <emacs-pretest-bug@gnu.org>; Sat, 13 Sep 2008 11:40:50 -0500
Received: from acsmt701.oracle.com (acsmt701.oracle.com [141.146.40.71])
    by agmgw1.us.oracle.com (Switch-3.2.0/Switch-3.2.0) with ESMTP id m8DGensx018593
    for <emacs-pretest-bug@gnu.org>; Sat, 13 Sep 2008 10:40:50 -0600
Received: from dradamslap1 (/24.23.165.218)
    by default (Oracle Beehive Gateway v4.0)
    with ESMTP ; Sat, 13 Sep 2008 09:40:49 -0700
From: "Drew Adams" <drew.adams@oracle.com>
To: <emacs-pretest-bug@gnu.org>
Subject: 23.0.60; incorrect code for filesets-get-filelist
Date: Sat, 13 Sep 2008 09:40:59 -0700
Message-ID: <002901c915bf$811df210$0200a8c0@us.oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Office Outlook 11
Thread-Index: AckVv4Ci1hEH5okgSQ2EUXtBuurWgg==
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3350
X-Brightmail-Tracker: AAAAAQAAAAI=
X-Brightmail-Tracker: AAAAAQAAAAI=
X-Whitelist: TRUE
X-Whitelist: TRUE
X-detected-operating-system: by monty-python.gnu.org: GNU/Linux 2.4-2.6

The part that treats a :tree of the code defining `filesets-get-filelist' is not correct and never could have been correct. And it does not correspond to the (correct) code from the filesets author. One wonders if the GNU Emacs code was ever tested.

This is the `case' clause that treats :tree in the definition of `filesets-get-filelist':

    ((:tree)
     (let ((dir (nth 0 entry))
           (patt (nth 1 entry)))
       (filesets-directory-files dir patt ':files t)))

But `entry' here is a complete fileset, which is of the form

    ("my-fs" (:tree "/some/directory" "^.+\.suffix$"))

The above code thus tries to use "my-fs" as the directory, whereas it should use "/some/directory".

This is the (correct) code in the latest version from the author (http://members.a1.net/t.link/CompEmacsFilesets.html). (The comment is from the author.)

    ((:tree)
     ;;well, the way trees are handled is a mess +++
     (let* ((dirpatt (if (consp (nth 1 entry))
                         (filesets-entry-get-tree entry)
                       entry))
            (dir (nth 0 dirpatt))
            (patt (nth 1 dirpatt)))
       (filesets-list-dir dir patt ':files t)))

However, I think the following would be sufficient:

    ((:tree)
     (let* ((dirpatt (filesets-entry-get-tree entry))
            (dir (nth 0 dirpatt))
            (patt (nth 1 dirpatt)))
       (filesets-directory-files dir patt ':files t)))

I don't see why the author's more complex treatment would ever be needed, since in order for the :tree clause of the `case' to be reached (consp (nth 1 entry)) must be a cons, AFAICT.

At any rate, either the author's code or what I suggest immediately above is needed. There is no way that the current GNU Emacs code can work with a :tree fileset.

In GNU Emacs 23.0.60.1 (i386-mingw-nt5.1.2600)
 of 2008-09-03 on LENNART-69DE564
Windowing system distributor `Microsoft Corp.', version 5.1.2600
configured using `configure --with-gcc (3.4) --no-opt --cflags -Ic:/g/include -fno-crossjumping'

[-- Attachment #3: Type: text/plain, Size: 110 bytes --]

--
guile-email mailing list
guile-email@systemreboot.net
https://lists.systemreboot.net/listinfo/guile-email
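
The workaround described in the message (adding a dummy content-transfer-encoding header when none is present) could be sketched roughly as follows in plain Guile. This is only an illustration, not code from guile-email: the helper name `ensure-content-transfer-encoding` is hypothetical, and it assumes the parsed headers are an alist keyed by symbols, as the message implies. It signals an error for a non-alist result (such as the literal “fields” mentioned above) instead of trying to repair it.

```scheme
;; Sketch only: plain Guile, no guile-email import is needed for this helper.
;; The name ensure-content-transfer-encoding is hypothetical.
(define (ensure-content-transfer-encoding headers)
  "Return HEADERS with a content-transfer-encoding entry, adding a dummy
'binary value when none is present.  HEADERS is assumed to be an alist
keyed by symbols."
  (cond
   ;; Anything that is not a list (e.g. the symbol `fields') is rejected.
   ((not (list? headers))
    (error "expected an alist of headers, got" headers))
   ;; The header is already there: return the alist unchanged.
   ((assq 'content-transfer-encoding headers) headers)
   ;; Otherwise prepend the dummy entry.
   (else (acons 'content-transfer-encoding 'binary headers))))

;; Example with hand-written alists standing in for parsed headers.
(define without-cte '((subject . "23.0.60; incorrect code")))
(define patched (ensure-content-transfer-encoding without-cte))
(display (assq-ref patched 'content-transfer-encoding))
(newline)
```

In the workflow from the message, the value returned by parse-email-headers would presumably be passed through such a helper before the body is parsed; failing loudly on a non-alist value makes the second oddity easier to spot than a later decoding error would.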