[Imap-protocol] QRESYNC & Long Command Lines

From:	Michael M Slusarz <slusarz@curecanti.org>
To:	imap-protocol@u.washington.edu
Date:	Fri, 08 Jun 2018 12:34:50 -0000
Message-ID:	20130218002432.Horde.oHbZYiST0vPMI_hKIDlLvg1@bigworm.curecanti.org

There seems to be a conflict between the recommendations in RFC 2683  
[3.2.1.5] and the QRESYNC extension to the SELECT/EXAMINE command.

A user recently reported an issue involving an overly long UID list  
passed to the FETCH command.  Although not using caching themselves,  
while analyzing the issue I quickly realized that there could  
potentially be issues in passing a lengthy cached UID list to  
SELECT/EXAMINE.

This particular user's mailbox contained 6-digit UIDs that were not  
sequential.  In this hypothetical:

- 1500 cached messages
- non-sequential UIDs
- All UIDs are 6 digits

...the UID string alone would be approximately 10500 characters (1500  
UIDs * 6 characters + 1499 ',' characters).  This doesn't factor in  
the rest of the command - or even the optional 4th sequence match  
parameter.  Obviously, the 1000 character limit for outgoing client  
IMAP commands suggested in RFC 2683 has pretty much been obsoleted.
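
To illustrate the arithmetic above, here is a minimal Python sketch. The function name is made up for this example, and the 1000-character figure is the RFC 2683 suggestion mentioned above.

```python
def uid_list_length(count: int, digits: int) -> int:
    """Length of a comma-separated list of `count` UIDs, each `digits` characters long."""
    if count <= 0:
        return 0
    # `count` UIDs need `count - 1` separating commas.
    return count * digits + (count - 1)


length = uid_list_length(1500, 6)
print(length)         # 10499
print(length > 1000)  # True: well beyond the suggested limit
```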

The concern is that I know at least some IMAP servers have a hard ~8KB  
limit on any command.  This limit would render QRESYNC useless on  
mailboxes over a certain size - or at least diminish the usefulness of  
the SELECT/EXAMINE QRESYNC parameters.  You could just do a basic  
SELECT/EXAMINE and then issue a series of

   tag1 UID FETCH "known-uids" (FLAGS) (CHANGEDSINCE "mod-sequence-value" VANISHED)

commands, but that sort of defeats one of the main benefits of QRESYNC  
- synchronization in a single command.  Not to mention that this  
workaround is far from obvious from reading the RFC and runs  
directly counter to one of the main reasons for the extension as  
identified in the abstract ("gives an IMAP client the ability to  
quickly resynchronize any previously opened mailbox as part of the  
SELECT command, without the need for server-side state or **additional  
client round-trips**.")
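
A rough sketch of the "series of UID FETCH commands" workaround described above, in Python. The function name, the `A<n>` tag scheme and the 1000-character default are assumptions for illustration only. The length check ignores the trailing CRLF.

```python
def chunked_uid_fetch(uids, modseq, max_len=1000):
    """Split a UID list into several UID FETCH commands, each at most max_len characters."""
    commands = []
    batch = []

    def render(items):
        # The tag is derived from how many commands have been emitted so far.
        tag = f"A{len(commands) + 1}"
        return (f"{tag} UID FETCH {','.join(items)} "
                f"(FLAGS) (CHANGEDSINCE {modseq} VANISHED)")

    for uid in uids:
        candidate = batch + [str(uid)]
        if batch and len(render(candidate)) > max_len:
            # Adding this UID would overflow: flush the current batch first.
            commands.append(render(batch))
            batch = [str(uid)]
        else:
            batch = candidate
    if batch:
        commands.append(render(batch))
    return commands


# 1500 non-sequential 6-digit UIDs, as in the hypothetical above.
example = chunked_uid_fetch(range(100000, 100000 + 1500 * 7, 7), modseq=12345)
print(len(example), "commands instead of one")
```

Each of these commands costs an extra round-trip, which is exactly the overhead the single-command QRESYNC form is meant to avoid.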

Is this a valid concern, at least with current IMAP server implementations?

michael

From:	tss@iki.fi
To:	imap-protocol@localhost
Date:	Fri, 08 Jun 2018 12:34:50 -0000
Message-ID:	8100D6D4-66F1-40C4-A9B2-6A03F8CCC175@iki.fi

On 18.2.2013, at 9.24, Michael M Slusarz <slusarz@curecanti.org> wrote:

> There seems to be a conflict between the recommendations in RFC 2683 [3.2.1.5] and the QRESYNC extension to the SELECT/EXAMINE command.
> 
> A user recently reported an issue involving an overly long UID list passed to the FETCH command.  Although not using caching themselves, while analyzing the issue I quickly realized that there could potentially be issues in passing a lengthy cached UID list to SELECT/EXAMINE.
> 
> This particular user's mailbox contained 6-digit UIDs that were not sequential.  In this hypothetical:
> 
> - 1500 cached messages
> - non-sequential UIDs
> - All UIDs are 6 digits
> 
> ...the UID string alone would be approximately 10500 characters (1500 UIDs * 6 characters + 1499 ',' characters).  This doesn't factor in the rest of the command - or even the optional 4th sequence match parameter.  Obviously, the 1000 character limit for outgoing client IMAP commands suggested in RFC 2683 has pretty much been obsoleted.

I don't think the idea was to provide all the known UIDs. The first UID range is about giving FLAGS replies only those ones listed. I think it's basically always 1:<last UID you know of>. Or if you happen to be doing something more special where you're caching only a partial state of the mailbox, you could decide to create for example max. 1000 bytes of UID string and then start merging the UID ranges. You'll get a few bytes of more data but it shouldn't matter that much.

For the known sequence ranges the idea is to provide some kind of small snapshots around the mailbox, again not everything.
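
One possible reading of the "cap the UID string, then start merging ranges" idea, sketched in Python. The function names and the choice to close the smallest gap first are assumptions made for this sketch, not something the message prescribes.

```python
def to_ranges(uids):
    """Collapse UIDs into sorted inclusive [start, end] runs."""
    ranges = []
    for uid in sorted(set(uids)):
        if ranges and uid == ranges[-1][1] + 1:
            ranges[-1][1] = uid
        else:
            ranges.append([uid, uid])
    return ranges


def render(ranges):
    """Format runs as an IMAP-style sequence set such as 1:3,10."""
    return ",".join(str(a) if a == b else f"{a}:{b}" for a, b in ranges)


def bounded_uid_set(uids, max_len=1000):
    """Merge neighbouring ranges until the rendered set fits in max_len."""
    ranges = to_ranges(uids)
    while len(ranges) > 1 and len(render(ranges)) > max_len:
        # Close the smallest gap first, so few unknown UIDs get pulled in.
        i = min(range(len(ranges) - 1),
                key=lambda k: ranges[k + 1][0] - ranges[k][1])
        ranges[i][1] = ranges[i + 1][1]
        del ranges[i + 1]
    return render(ranges)


print(bounded_uid_set([1, 2, 3, 10]))  # 1:3,10
```

The merged set is a superset of the cached UIDs, so the server may report flags for a few messages the client does not hold; that is the "few bytes of more data" trade-off.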

From:	jkt@flaska.net
To:	imap-protocol@localhost
Date:	Fri, 08 Jun 2018 12:34:50 -0000
Message-ID:	2deda4fd-081b-4b1c-9702-269c3b6bca25@flaska.net

On Monday, 18 February 2013 08:24:32 CEST, Michael M Slusarz wrote:
> The concern is that I know at least some IMAP servers have a 
> hard ~8KB  limit on any command.  This limit would render 
> QRESYNC useless on  mailboxes over a certain size - or at least 
> diminish the usefulness of  the SELECT/EXAMINE QRESYNC 
> parameters.

My client doesn't use the known-uids argument (we always build and maintain the full seq-UID mapping), so I have not spent much time reasoning about that argument. However, Trojita sends a list of known seq-uid pairs. The list is built similarly to the example in RFC5162 by halving the intervals between the included UIDs, starting in the middle of the mailbox (i.e. if the mailbox contains 1000 messages, the first UID belongs to the message #500, the second one to #750, the third one to #875 etc). It's a rather crude heuristic assuming that the probability of an expunge lowers proportionally to how long the message has been present in the mailbox.
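
The halving heuristic described above could look roughly like this in Python. This is only a sketch of the idea, not Trojita's actual code. The function names are hypothetical, and the output layout (known sequence numbers, a space, then the matching UIDs, all in parentheses) follows my reading of the RFC 5162 example.

```python
def halving_sequence_numbers(n):
    """Pick sequence numbers by repeatedly halving the distance to the end."""
    picks = []
    pos = 0
    step = n // 2
    while step > 0:
        pos += step
        picks.append(pos)
        step = (n - pos) // 2
    return picks


def seq_match_data(uids_by_seq):
    """Build the fourth QRESYNC argument from a full seq -> UID list.

    uids_by_seq[0] is the UID of message #1, uids_by_seq[1] of #2, and so on.
    """
    picks = halving_sequence_numbers(len(uids_by_seq))
    seqs = ",".join(str(s) for s in picks)
    uids = ",".join(str(uids_by_seq[s - 1]) for s in picks)
    return f"({seqs} {uids})"


print(halving_sequence_numbers(1000)[:3])  # [500, 750, 875]
```

For a 1000-message mailbox this yields 10 pairs, in line with the O(log2 n) estimate.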

I believe that clients should always send the fourth argument to prevent a pathological case of QRESYNC where it is in fact less efficient than a session without QRESYNC, see the example on page 6 of the RFC. My guess (not based on any evidence) is that it's reasonable to assume that servers do not maintain a list of expunges indefinitely and that it's plausible that the client will every now and then sync using a HIGHESTMODSEQ which the server no longer remembers. If that is the case and the currently assigned UIDs are sparse, QRESYNC will have to return more data than a simple UID SEARCH ALL, so in my opinion it makes sense to send O(log2 n) extra numbers in each sync. Your expected usage pattern might be different, though.

Now, about the third argument, the known-uids -- Timo's suggestion is a good one. What to do here depends on how you handle the FLAGS. If you cache them in a persistent location, it probably makes sense to always send a big enough range of UIDs so that the IMAP server can notify you upon sync time about any changes, eliminating the need to explicitly FETCH FLAGS later on (and to keep track of which messages have fresh flags and which are stale). If you, however, always show just a subset of messages to the user, it might be reasonable to get rid of this client-side caching and to not cache the UIDs for messages which are never going to be shown.

CONDSTORE and QRESYNC are life savers for clients that maintain a fully synchronized view of the whole mailbox (speaking about FLAGS and UIDs now, *not* envelopes and other immutable data!) such as Trojita. If your client is optimized towards a different goal, especially when you do not cache flags between sessions and your knowledge of the UID-seq mapping is sparse, it seems to me that these extensions do not provide much benefit over simply doing a sync with the facilities from baseline RFC3501. I might be wrong, as always, so please feel free to correct me.

With kind regards,
Jan

-- 
Trojitá, a fast Qt IMAP e-mail client -- http://trojita.flaska.net/

From:	slusarz@curecanti.org
To:	imap-protocol@localhost
Date:	Fri, 08 Jun 2018 12:34:50 -0000
Message-ID:	20130302210212.Horde.q66x7X5K6O_eEUdg-T5_zg1@bigworm.curecanti.org

Quoting Timo Sirainen <tss@iki.fi>:

> On 18.2.2013, at 9.24, Michael M Slusarz <slusarz@curecanti.org> wrote:
>
>> There seems to be a conflict between the recommendations in RFC  
>> 2683 [3.2.1.5] and the QRESYNC extension to the SELECT/EXAMINE  
>> command.
>>
>> A user recently reported an issue involving an overly long UID list  
>> passed to the FETCH command.  Although not using caching  
>> themselves, while analyzing the issue I quickly realized that there  
>> could potentially be issues in passing a lengthy cached UID list to  
>> SELECT/EXAMINE.
>>
>> This particular user's mailbox contained 6-digit UIDs that were not  
>> sequential.  In this hypothetical:
>>
>> - 1500 cached messages
>> - non-sequential UIDs
>> - All UIDs are 6 digits
>>
>> ...the UID string alone would be approximately 10500 characters  
>> (1500 UIDs * 6 characters + 1499 ',' characters).  This doesn't  
>> factor in the rest of the command - or even the optional 4th  
>> sequence match parameter.  Obviously, the 1000 character limit for  
>> outgoing client IMAP commands suggested in RFC 2683 has pretty much  
>> been obsoleted.
>
> I don't think the idea was to provide all the known UIDs. The first  
> UID range is about giving FLAGS replies only those ones listed. I  
> think it's basically always 1:<last UID you know of>. Or if you  
> happen to be doing something more special where you're caching only  
> a partial state of the mailbox, you could decide to create for  
> example max. 1000 bytes of UID string and then start merging the UID  
> ranges. You'll get a few bytes of more data but it shouldn't matter  
> that much.

Our client NEVER caches the entire mailbox - only tiny view windows  
(100 messages/slice).  I have a trash mailbox with 100,000+ messages.   
I am never going to load that entire mailbox into memory/cache.  At  
best, I will open the mailbox (loading, say, 100 messages in the  
slice), do a search to find what I need (loading up to another 100  
unique messages), and maybe move around the search results a bit  
(possibly loading another 100-200 messages as I do a bit of  
scrolling).  So I only have 400 messages in my cache.  I most  
certainly do not want flag changes for the entire mailbox.

But with the command size limitations I do have to make some  
compromises for long UID lists.  So I picked an arbitrary sequence-set  
length which, when exceeded, makes us fall back to a <first  
known UID>:<last known UID> range.  I guess I could get fancier and  
try to optimize this more -- e.g. cut out UID "holes" over a certain  
length -- but that seems like a waste of time since I've never actually  
heard of a SELECT/EXAMINE failing with our software due to a long UID  
string in the 4 years since we implemented QRESYNC.  So this is more of  
a theoretical exercise than anything else.
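
The fallback described above amounts to something like the following Python sketch. The function name and the 1000-character default threshold are placeholders, since the message only says the cutoff is arbitrary.

```python
def known_uids_argument(uids, max_len=1000):
    """Return an exact UID set if it is short enough, else first:last."""
    ordered = sorted(set(uids))
    if not ordered:
        return ""

    # Build the exact set, collapsing consecutive UIDs into start:end runs.
    parts = []
    start = prev = ordered[0]
    for uid in ordered[1:]:
        if uid == prev + 1:
            prev = uid
            continue
        parts.append(str(start) if start == prev else f"{start}:{prev}")
        start = prev = uid
    parts.append(str(start) if start == prev else f"{start}:{prev}")
    exact = ",".join(parts)

    if len(exact) <= max_len:
        return exact
    # Too long: fall back to <first known UID>:<last known UID>.
    return f"{ordered[0]}:{ordered[-1]}"


print(known_uids_argument([5, 6, 7, 20]))  # 5:7,20
```

The fallback range covers UIDs that were never cached, so the server may send flag updates for messages the client has to ignore.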

michael

From:	slusarz@curecanti.org
To:	imap-protocol@localhost
Date:	Fri, 08 Jun 2018 12:34:50 -0000
Message-ID:	20130302212805.Horde.Q2lND55F8fxpy0sVKrcluQ8@bigworm.curecanti.org

Quoting Jan Kundrát <jkt@flaska.net>:

> CONDSTORE and QRESYNC are life savers for clients that maintain a  
> fully synchronized view of the whole mailbox (speaking about FLAGS  
> and UIDs now, *not* envelopes and other immutable data!) such as  
> Trojita. If your client is optimized towards a different goal,  
> especially when you do not cache flags between sessions and your  
> knowledge of the UID-seq mapping is sparse, it seems to me that  
> these extensions do not provide much benefit over simply doing a  
> sync with the facilities from baseline RFC3501. I might be wrong, as  
> always, so please feel free to correct me.

I would partially disagree with this statement.  I think that  
CONDSTORE/QRESYNC is MORE important for newer, more disconnected  
clients (think mobile) than for more traditional clients.

Traditional flag syncing, e.g. UID FETCH <known UID list> (FLAGS), can  
potentially be an expensive operation bandwidth-wise for a mobile  
client.  Couple that with the fact that a wireless connection can cut  
in/out frequently, and these costs can add up.

Say a mobile client is caching 200 messages.  Let's say this client is  
polling for new mail every 5 minutes (push notifications/IDLE may not  
always be available since the connection may not be constant).  An  
average FETCH response might be about 50 bytes.  Assume that no flag  
changes take place in the mailbox at all.

That's 120,000 bytes of additional network traffic an hour (200  
messages * 50 bytes * 12 polls; I'm going to assume CONDSTORE/QRESYNC  
command syncing traffic is negligible) that has to be parsed to  
determine that nothing has changed in the mailbox.  120,000 bytes for  
a desktop computer over an hour on a decent network connection is  
minimal.  On a mobile device, having to parse this additional traffic  
might cost 1-2 minutes of battery life.  (Under this scenario, that  
would be ~85 MB of extra traffic a month.  Considering that some  
mobile plans are limited to 1-2 GB of traffic a month, that is a  
substantial chunk of traffic being used.)
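
The figures above can be checked with a few lines of Python. The variable names are mine, and a 30-day month is assumed.

```python
cached_messages = 200
bytes_per_fetch_response = 50   # rough average from the estimate above
polls_per_hour = 60 // 5        # one poll every 5 minutes

bytes_per_poll = cached_messages * bytes_per_fetch_response
bytes_per_hour = bytes_per_poll * polls_per_hour
bytes_per_month = bytes_per_hour * 24 * 30  # assuming a 30-day month

print(bytes_per_poll)         # 10000
print(bytes_per_hour)         # 120000
print(bytes_per_month / 1e6)  # 86.4 (MB), i.e. roughly the ~85 MB quoted
```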

And this is just CONDSTORE.  VANISHED responses w/QRESYNC eliminate  
the need to do UID synchronization to check for deleted messages.

All this being said... I don't think any of the decent Android e-mail  
clients (e.g. stock client, K-9 Mail) implement CONDSTORE, let alone  
QRESYNC.  Don't know about the iOS client.  If this is indeed true,  
they are missing out on at least some resource savings.

michael
