From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mail.systemreboot.net (mugam.systemreboot.net [139.59.75.54])
	by localhost (mpop-1.4.9) with POP3 for ;
	Mon, 25 May 2020 20:35:16 +0530
Return-path:
Envelope-to: arunisaac@systemreboot.net
Delivery-date: Mon, 25 May 2020 20:23:02 +0530
Received: from localhost.localdomain ([127.0.0.1] helo=[192.168.2.12])
	by systemreboot.net with esmtp (Exim 4.93) (envelope-from )
	id 1jdETO-000p5F-C6 for arunisaac@systemreboot.net;
	Mon, 25 May 2020 20:23:02 +0530
Received: from [192.168.2.1] (helo=steel)
	by systemreboot.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384
	(Exim 4.93) (envelope-from ) id 1jdETL-000p57-Sh;
	Mon, 25 May 2020 20:22:59 +0530
From: Arun Isaac
To: Ricardo Wurmus
Cc: guile-email@systemreboot.net
In-Reply-To: <87y2ppezni.fsf@elephly.net>
References: <87a726gqsx.fsf@elephly.net> <87y2ppezni.fsf@elephly.net>
Date: Mon, 25 May 2020 20:22:54 +0530
Message-ID:
MIME-Version: 1.0
Subject: Re: [guile-email] slow parse of email with huge attachment
X-BeenThere: guile-email@systemreboot.net
X-Mailman-Version: 2.1.33
Precedence: list
List-Id: guile-email discussion mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Content-Type: multipart/mixed; boundary="===============4809062567954243785=="
Errors-To: guile-email-bounces@systemreboot.net

--===============4809062567954243785==
Content-Type: multipart/signed; boundary="=-=-=";
	micalg=pgp-sha256; protocol="application/pgp-signature"

--=-=-=
Content-Type: text/plain

Hi,

I've made some improvements. I got the following snippet down from
around 16s to 6s.

--8<---------------cut here---------------start------------->8---
(statprof
 (lambda ()
   (map parse-email
        (call-with-input-file "large-base64-attachment.mbox"
          mbox->emails))))
--8<---------------cut here---------------end--------------->8---

Specifically, I made the following improvements to guile-email.

- I rewrote the base64 decoder from scratch to be a little faster.
  Earlier, I was using the decoder I had copied from Guix.
- I eliminated several unnecessary bytevector<->string conversions.
- I rewrote read-bytes-till in (email utils) to process multiple bytes
  at a time, instead of byte by byte.

There is still scope for improvement, but do test and let me know if
this serves your purpose for now. Also, I am curious to see your new
benchmark of parse-email to compare with the 8s you reported earlier.

Cheers!

--=-=-=
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQEzBAEBCAAdFiEEf3MDQ/Lwnzx3v3nTLiXui2GAK7MFAl7L28YACgkQLiXui2GA
K7NkxAgAtP9y38g1efzdDow/94i3OkGGUQ5Vj60nFsNBD46x3/1UDUNGm5HoK0iH
xed2DlXVE1+dDZs9SwRIEQJZY3u7IK0RMkNECAqYEMWeogtkQYslZon10afKiFyY
S9citNAQMmWec0KFN8q0J586UMweO0g3ipu/p1t7HpjpG7bntL4k5oeNNSIJJp9x
2PpRlasshON3gMfrszI0AQsTzOlVhOhPFFpAtYd/ovEpuqTh+2RaT2/eDF7iEUwU
3ZjvXMm1p0eqrqSwNfqLuLU9ffOOboGjLqoXbbXS41eoEIBl+e9kZKmcYJBiizNA
IKwTKIMxTrlTudPkP6fuS3OQTZXdLg==
=7vtG
-----END PGP SIGNATURE-----
--=-=-=--

--===============4809062567954243785==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

-- 
guile-email mailing list
guile-email@systemreboot.net
https://lists.systemreboot.net/listinfo/guile-email

--===============4809062567954243785==--