guile-email discussion
* [guile-email] more decoding errors
@ 2019-09-27 21:41 Christopher Baines
  2019-09-28  8:12 ` Arun Isaac
  0 siblings, 1 reply; 4+ messages in thread
From: Christopher Baines @ 2019-09-27 21:41 UTC
  To: guile-email


Hey,

So I've found another case of tricky decodings. I've been looking at the
first email in this mbox file [1]. The From header is:

  From: =?UTF-8?Q?Ludovic_Court=E8s?= <ludo@gnu.org>

1: https://lists.gnu.org/archive/mbox/guix-commits/2016-01

This seems to trip up guile-email within the decode-mime-encoded-word
function: the bytevector returned by q-encoding-decode cannot be
converted by bytevector->string.
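
To make the failure concrete, here is a minimal sketch in Python rather than Guile. It is only an analogue and does not call any guile-email function: the standard `quopri` module stands in for q-encoding-decode, and a strict `bytes.decode` stands in for bytevector->string.

```python
import quopri

# Payload of the encoded word from the From header
# (the text between "?Q?" and "?=").
payload = b"Ludovic_Court=E8s"

# header=True applies the RFC 2047 "Q" rules, so "_" becomes a space
# and "=E8" becomes the single byte 0xE8.
raw = quopri.decodestring(payload, header=True)

# Strict UTF-8 decoding rejects 0xE8 here: it announces a multi-byte
# sequence, but the next byte ("s") is not a continuation byte.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False
```

The Q-decoding step itself succeeds; it is only the charset conversion afterwards that fails.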

I'm still not exactly sure why, some experimenting shows that the
following decodes to what I guess is the intended text:

  Ludovic_Court=C3=A8s
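
A quick way to check this variant, again as a Python analogue and not guile-email code: Q-decode the payload with the standard `quopri` module, then decode the resulting bytes strictly as UTF-8.

```python
import quopri

# The variant with the two-byte UTF-8 sequence C3 A8.
payload = b"Ludovic_Court=C3=A8s"

# Q-decode ("_" becomes a space), then decode strictly as UTF-8.
name = quopri.decodestring(payload, header=True).decode("utf-8")

# U+00E8 (e with grave accent) is C3 A8 in UTF-8.
e_grave_utf8 = "\u00e8".encode("utf-8")
```

Here the strict decode succeeds, which matches the two bytes being the UTF-8 form of U+00E8.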

Similarly to the recent changes, specifying 'substitute as the third
argument to bytevector->string at least avoids crashing in this case.
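
For comparison, a sketch of the same idea in Python. This is an assumption-laden analogue: `errors="replace"` is Python's closest counterpart to a substituting conversion strategy, and the exact replacement character Guile emits may differ.

```python
# Bytes as they come out of Q-decoding the header from the mbox file.
raw = b"Ludovic Court\xe8s"

# Instead of raising, replace each undecodable byte with U+FFFD.
text = raw.decode("utf-8", errors="replace")
```

The result keeps the rest of the name readable, at the cost of losing the accented character.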

Any thoughts?

Chris

-- 
guile-email mailing list
guile-email@systemreboot.net
https://lists.systemreboot.net/listinfo/guile-email

* Re: [guile-email] more decoding errors
  2019-09-27 21:41 [guile-email] more decoding errors Christopher Baines
@ 2019-09-28  8:12 ` Arun Isaac
  2019-09-28  8:12   ` Arun Isaac
  2019-09-28  9:34   ` Christopher Baines
  0 siblings, 2 replies; 4+ messages in thread
From: Arun Isaac @ 2019-09-28  8:12 UTC
  To: Christopher Baines, guile-email



Hi,

Thanks for the bug report!

>   From: =?UTF-8?Q?Ludovic_Court=E8s?= <ludo@gnu.org>

This MIME encoded word is incorrectly encoded. UTF-8 is declared as the
encoding, while actually ISO-8859-1 has been used.
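
As an illustration of the mismatch (a Python sketch, not guile-email code), decoding the same Q-decoded bytes with the charset that was actually used recovers the name, because 0xE8 is U+00E8 in ISO-8859-1:

```python
# Bytes obtained by Q-decoding "Ludovic_Court=E8s".
raw = b"Ludovic Court\xe8s"

# Decoding with the charset that was actually used, not the declared one.
as_latin1 = raw.decode("iso-8859-1")
```

A decoder cannot know this in general, since the header only declares UTF-8; this merely shows why the declared charset is the wrong one.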

> I'm still not exactly sure why, some experimenting shows that the
> following decodes to what I guess is the intended text:
>
>   Ludovic_Court=C3=A8s

Yes, this is the correct encoding.

> Similarly to the recent changes, specifying 'substitute as the third
> argument to bytevector->string at least avoids crashing in this case.

Indeed, the substitute conversion strategy is the only correct way to
proceed. I've made this change and pushed to master.

https://git.systemreboot.net/guile-email/commit/?id=9d82904011516530b6ef1bcd53cef220db485e7a

I'm ok with having another bugfix release, but I think we should wait a
week or two in case more bugs are found. WDYT?

Regards,
Arun.


* Re: [guile-email] more decoding errors
  2019-09-28  8:12 ` Arun Isaac
  2019-09-28  8:12   ` Arun Isaac
@ 2019-09-28  9:34   ` Christopher Baines
  1 sibling, 0 replies; 4+ messages in thread
From: Christopher Baines @ 2019-09-28  9:34 UTC
  To: Arun Isaac; +Cc: guile-email


Arun Isaac <arunisaac@systemreboot.net> writes:

> Thanks for the bug report!
>
>>   From: =?UTF-8?Q?Ludovic_Court=E8s?= <ludo@gnu.org>
>
> This MIME encoded word is incorrectly encoded. UTF-8 is declared as the
> encoding, while actually ISO-8859-1 has been used.

Right.. OK :)

>> Similarly to the recent changes, specifying 'substitute as the third
>> argument to bytevector->string at least avoids crashing in this case.
>
> Indeed, the substitute conversion strategy is the only correct way to
> proceed. I've made this change and pushed to master.
>
> https://git.systemreboot.net/guile-email/commit/?id=9d82904011516530b6ef1bcd53cef220db485e7a
>
> I'm ok with having another bugfix release, but I think we should wait a
> week or two in case more bugs are found. WDYT?

Thanks Arun :)

So I've tried to parse all the mbox files for the guix-commits mailing
list with this change, and I haven't found any new problems.

But yeah, I'm in no rush to get this fixed. I was looking ahead at
loading old mbox files into the Guix Data Service to load information
about old revisions, but I'm not ready to do that on the processing side
in the Guix Data Service.

Thanks again,

Chris
