E-mail headers
From: Michael M Slusarz <slusarz@curecanti.org>
To: imap-protocol@u.washington.edu
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: 20121115215854.Horde.zz6B0O0tmt3ylHiEXzXhQQ4@bigworm.curecanti.org
All,

Moving this over from the Dovecot list, where this thread has been  
off/on (http://markmail.org/thread/z7ctwle2go6zafas).

Background: I am a disconnected client author (webmail).  Several  
years ago, I was in the process of adding extensions like QRESYNC and  
LANGUAGE to the code.  Staring at IMAP data over the course of several  
months, I was depressed/alarmed at the amount of state re-creation  
that needed to be done every time we connected to the server which,  
for a non-persistent webmail backend, was pretty much for every  
action.  We needed to do a NAMESPACE, CAPABILITY, (since there was no  
guarantee that we are connecting to the same backend IMAP server, it  
was/is not possible to cache these values on the client side), ENABLE  
QRESYNC, and LANGUAGE call on every user action.

imapproxy (http://imapproxy.org/) has been around for a while and  
helped with the overhead dealing with establishing the connection  
between web backend and IMAP server.  However, what would be useful is  
that if we could somehow be guaranteed that the imapproxy connection  
was restored, rather than being newly created, then we could avoid  
having to send all the initialization commands and could reliably  
cache the NAMESPACE and CAPABILITY information.  So I went and hacked  
in the XIMAPPROXY code to imapproxy (see, e.g.,  
http://squirrelmail.svn.sourceforge.net/viewvc/squirrelmail/trunk/imap_proxy/README?revision=14250), performance increases were significant, and life was  
ok.

But this is far from an ideal solution.  Some drawbacks:
   - You are limited to a 1 -> 1 backend to imapproxy server mapping,  
since there is no way to track which connection is being reused
   - It requires a separate service to be maintained.
   - It is specific to this proxy server.

Earlier this year, Timo mentioned that (paraphrasing) imapproxy was  
worthless since Dovecot was plenty fast in creating network  
connections.  While this eliminated one of the benefits of using an  
imap proxy it doesn't eliminate the state-restoring optimizations it  
hackishly provides.  Discussion ensued.  It was eventually determined  
that some sort of standardized state storage method would be a useful  
IMAP addition with the suggestion that maybe a more formalized draft  
would be useful to provoke yet more discussion.

I finally got off my rear and did just that.  Below find a  
proof-of-concept draft of a proposed SUSPEND extension, complete with  
example use cases:

https://raw.github.com/slusarz/horde-sandbox/master/imap-suspend-draft/draft-imap-suspend-00.txt

A rough Dovecot server implementation for this kind of feature has  
already been created  
(http://dovecot.markmail.org/thread/qp45yod5ukqf3jfn).

Comments/criticisms requested.

michael
E-mail headers
From: blong@google.com
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: CABa8R6tHP2My0k2LqT1RzHoLQZA+X_jwUMU0cydm2sAwo8f4Fg@mail.gmail.com
At the very least, I think you need to specify exactly what state is
maintained.

I also think ideas like a server which supports suspend can't issue
capability at login is non-starter.  Just because the server supports this
suspend, doesn't mean all the clients will, and it makes no sense to punish
the other clients for this.

Also, wouldn't something like the RECONNECT proposal from lemonade make
more sense, actually connecting to the exact state?

Can't we just pipeline all of this?

And frankly, the amount of state needed to transfer on reconnect for actual
folder state can be high (though, not as bad with CONDSTORE/QRESYNC).

Brandon


E-mail headers
From: imap@tlinx.org
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: 50B54704.9020805@tlinx.org
Michael M Slusarz wrote:
>
> Background: I am a disconnected client author (webmail).... I was 
> depressed/alarmed at the amount of state re-creation ...need.. 
> NAMESPACE, CAPABILITY, (since there was no guarantee that we are 
> connecting to the same backend IMAP server, it was/is not possible to 
> cache these values on the client side), ENABLE QRESYNC, and LANGUAGE 
> call on every user action.
---
Just ignore me if I'm too clueless for this to be of any help, but if I was
doing what you are doing, I'd cache the client TCP connections to the IMAP
server until I timed their session-out from my webserver.

I.e. would set cookie upon connection to their imap server that contains
a 'handle' that would indicate something in my 'IMAP-TCP connection cache',
and route their requests accordingly.

Not only are starting TCP connections "expensive", but you have an added 
expense
of IMAP state... if HTTP(S), believes in using PIPELINING to enable lower
server loads and faster response time, why not a webmail->imap gateway?

Set the "session-timeout to ~5-15 minutes" of inactivity like many things
do and you can recycle their imap connection, but if they reconnect w/the
same cookie, it seems you could simply route their commands to the already
open connection to their imap server -- and there would be no overhead
of looking for capabilities, or resetting state...etc, as the TCP conn.
would guarantee it's the same server.

It would all be transparent to the user -- if they reconnect to you, you
look in your cache, and see if the TCP connection is still valid (or if it,
possibly, has been closed from the other end).  If it has been, the safest
thing to do would be to give a message that the remote IMAP connection
timed-out and they need to re-connect.
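
A minimal sketch of the connection cache described above, in Python. The names `ImapConnectionCache`, `connect`, and `is_alive` are hypothetical, and the 600-second default is just one value from the suggested 5-15 minute range. Unlike the suggestion above, this sketch opens a fresh connection when the cached one has expired or died, and reports `reused=False` so the caller knows it has to redo its initialization.

```python
import time
from typing import Callable, Dict, Tuple


class ImapConnectionCache:
    """Maps a session handle (e.g. a cookie value) to an open IMAP connection."""

    def __init__(self,
                 connect: Callable[[], object],
                 is_alive: Callable[[object], bool],
                 idle_timeout: float = 600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._connect = connect          # hypothetical factory for new connections
        self._is_alive = is_alive        # hypothetical liveness check
        self._idle_timeout = idle_timeout
        self._clock = clock
        self._entries: Dict[str, Tuple[object, float]] = {}

    def get(self, handle: str) -> Tuple[object, bool]:
        """Return (connection, reused) for the given session handle."""
        now = self._clock()
        entry = self._entries.get(handle)
        if entry is not None:
            conn, last_used = entry
            if now - last_used <= self._idle_timeout and self._is_alive(conn):
                # Same TCP connection, so same server and same IMAP state.
                self._entries[handle] = (conn, now)
                return conn, True
            # Idle too long or closed from the other end: drop it.
            del self._entries[handle]
        conn = self._connect()
        self._entries[handle] = (conn, now)
        return conn, False
```

When `reused` is true, the caller can skip CAPABILITY, NAMESPACE, and the other setup commands; otherwise it has to run them again.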

Sorry, if this sounds too clueless for words, and I am missing something
obvious about why this can't be done -- but it sounds like rolling your
own connection cache would be the way to go...(but probably am missing
the forest for the trees....)...

-l
E-mail headers
From: slusarz@curecanti.org
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: 20121116143137.Horde.7Kb3aEW5DhM6CnuA2hAx4Q8@bigworm.curecanti.org
Quoting Brandon Long <blong@google.com>:

> At the very least, I think you need to specify exactly what state is
> maintained.

Point taken: this isn't explicitly defined in the draft.  State as  
intended by the draft means that if a connection is RESUMEd, it would  
be impossible to tell the difference in the authenticated state from  
when the SUSPEND command was issued.  In other words, the server  
connection state would be identical to what it was at the time the  
SUSPEND command (that created the suspend token) was issued.

Maybe a better way to describe it is that SUSPEND/RESUME is intended to  
allow a server to save the current IMAP configuration between sessions.

> I also think ideas like a server which supports suspend can't issue
> capability at login is non-starter.  Just because the server supports this
> suspend, doesn't mean all the clients will, and it makes no sense to punish
> the other clients for this.

One of the decisions I made when creating the draft was when to  
initiate resumption of the saved state.  For some reason, I was  
thinking that somehow being in the authenticated state, versus the not  
authenticated state, made a difference when it came to security  
concerns.  Looking at this decision now, that is obviously not the case.

If a client is going to use a non-secure authentication method, then  
it doesn't really matter *when* the server token is sent to the server  
- it's going to be insecure.  So requiring the RESUME command to be  
sent after authentication doesn't address that concern.

The RESUME command SHOULD be sent before authentication occurs, if  
possible.  (We can't restrict the command to only the  
not-authenticated state since PREAUTH connections are dumped directly  
into the authenticated state, where they should be able to RESUME if  
necessary.)  Allowing RESUME to be sent before authentication would  
have several advantages:

   - It addresses your concern.  If a client passes a valid suspend  
token before authentication, the server will know at the time of  
recreating the state that the CAPABILITY command, or any other  
untagged initialization response it wants to send, does not need to be  
sent.  Non-SUSPEND clients, and SUSPEND clients that have yet to  
receive a token, would continue to receive these untagged responses.

   - From a server implementer's POV this may be a more efficient way  
to resume.  Not familiar with how any particular server works, but it  
seems that a server could leverage the fact that it is re-using  
session state instead of having to create a new session and then  
immediately toss this session out when a RESUME command is immediately  
issued after authentication.
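
From the client side, the effect can be sketched roughly as follows, in Python. The function name `init_commands` and its parameters are hypothetical, and the sketch deliberately leaves out the RESUME command itself, whose syntax is defined by the draft. It only shows which setup commands (the ones named at the start of this thread) still have to be sent depending on whether the resume succeeded.

```python
from typing import List


def init_commands(resumed: bool, got_capability: bool, language: str) -> List[str]:
    """Return the setup commands still needed after login (tags omitted).

    resumed:        True if the server accepted a previously issued suspend token.
    got_capability: True if the server already sent CAPABILITY on its own.
    language:       language tag to request via LANGUAGE (illustrative value).
    """
    if resumed:
        # Configuration state was restored; nothing to re-create.
        return []
    commands: List[str] = []
    if not got_capability:
        commands.append("CAPABILITY")
    commands += ["NAMESPACE", "ENABLE QRESYNC", f"LANGUAGE {language}"]
    return commands
```

For a client that reconnects on every user action, the empty list in the resumed case is the whole point of the proposal.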

> Also, wouldn't something like the RECONNECT proposal from lemonade make
> more sense, actually connecting to the exact state?

I assume you would be talking about this:

http://tools.ietf.org/html/draft-ietf-lemonade-reconnect-07

I don't claim to be an expert but I *think* this is what eventually  
became QRESYNC.  The reconnect draft was focused on reconnecting to a  
server with the intention of grabbing all changes to a particular  
mailbox from a previous connection.

Conversely, the SUSPEND draft doesn't care at all about the state of  
ANY mailbox.  It only seeks to restore the configuration so that the  
client doesn't unnecessarily spend time having to recreate these  
commands (which have no bearing on information eventually presented to  
the end-user) every time it connects.

The IMAP landscape has changed immensely since that draft was first  
proposed (2004).  Back then there was no need to affirmatively enable  
any IMAP features.  Since then, the following commands have been  
defined in extensions that the client must proactively enable  
before they can be used:

COMPRESS=DEFLATE
ENABLE (CONDSTORE/QRESYNC)
LANGUAGE
COMPARATOR
CONVERSIONS
saved CONTEXTs
NOTIFY

Use of these advanced features becomes prohibitively expensive to a  
disconnected client if they have to be configured every time that  
client reconnects.

I tried to avoid listing the exact extensions that comprise server  
configuration state in an attempt to keep the draft short and to allow  
for future expandability.  But such a discussion/list might be  
necessary for clarity.

> Can't we just pipeline all of this?

You can't pipeline all necessary commands. For example, if the server  
doesn't send CAPABILITY on authentication, that needs to be grabbed  
before any other commands can be sent.

And other commands may rely on pre-existing commands.  LANGUAGE should  
be sent first since human-readable messages should be shown in that  
language immediately.  If you pipeline that command with other  
commands, there is no guarantee that the LANGUAGE command is handled  
first.
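
For readers unfamiliar with the mechanics, here is a minimal Python sketch of what pipelining means on the wire: several tagged commands go out in a single write, and tagged completions are matched back by tag in whatever order they arrive. The helper names `build_pipeline` and `collect_completions` and the tag format are hypothetical. The sketch makes no claim about the order in which a server processes the commands, which is the concern raised above.

```python
from typing import Dict, Iterable, List, Tuple


def build_pipeline(commands: List[str], prefix: str = "A") -> Tuple[bytes, List[str]]:
    """Tag each command and join them into one buffer for a single write."""
    tags = [f"{prefix}{i:03d}" for i in range(1, len(commands) + 1)]
    wire = "".join(f"{tag} {cmd}\r\n" for tag, cmd in zip(tags, commands))
    return wire.encode("ascii"), tags


def collect_completions(lines: Iterable[str], tags: List[str]) -> Dict[str, str]:
    """Map each tag to its completion status (OK/NO/BAD); untagged lines are skipped."""
    pending = set(tags)
    done: Dict[str, str] = {}
    for line in lines:
        parts = line.split(" ", 2)
        if len(parts) >= 2 and parts[0] in pending:
            done[parts[0]] = parts[1].upper()
            pending.discard(parts[0])
        if not pending:
            break  # every pipelined command has completed
    return done
```

A client would write the bytes from `build_pipeline` to the socket once, then feed response lines to `collect_completions` until every tag is accounted for.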

> And frankly, the amount of state needed to transfer on reconnect for actual
> folder state can be high (though, not as bad with CONDSTORE/QRESYNC).

I don't understand why folder (i.e. mailbox) state needs to be saved?   
As mentioned in the draft, the net result of a successful RESUME is  
that the client is left in the authenticated state.  A server does not  
need to keep state information for mailboxes when in this mode.

Thanks for the comments.  I will update the draft to, at a minimum,  
reflect the RESUME usage discussed above.

michael
E-mail headers
From: blong@google.com
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: CABa8R6tf84Zzz_6Q3jj9VkEf+4UPiwDxCYQdCjxF5qKiowZLrw@mail.gmail.com
I believe he's dealing with an environment running PHP scripts, with even
support for running as CGIs, which are exec'd on every access.

Which isn't to say that some mechanism wouldn't be possible to do this
(fork a daemon that holds the connections and pass them to the new server
or something) but it gets complicated.

It would be much simpler to just have a long running server serve the
webpages directly instead of through CGI, but that's what he's got.

Brandon


E-mail headers
From: brong@fastmail.fm
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: 1354096326.12905.140661159138037.73D46EAB@webmail.messagingengine.com
On Wed, Nov 28, 2012, at 12:04 AM, L Walsh wrote:
> Michael M Slusarz wrote:
> >
> > Background: I am a disconnected client author (webmail).... I was 
> > depressed/alarmed at the amount of state re-creation ...need.. 
> > NAMESPACE, CAPABILITY, (since there was no guarantee that we are 
> > connecting to the same backend IMAP server, it was/is not possible to 
> > cache these values on the client side), ENABLE QRESYNC, and LANGUAGE 
> > call on every user action.
> ---
> Just ignore me if I'm too clueless for this to be of any help, but if I was
> doing what you are doing, I'd cache the client TCP connections to the IMAP
> server until I timed their session-out from my webserver.
> 
> I.e. would set cookie upon connection to their imap server that contains
> a 'handle' that would indicate something in my 'IMAP-TCP connection cache',
> and route their requests accordingly.
> 
> Not only are starting TCP connections "expensive", but you have an added 
> expense
> of IMAP state... if HTTP(S), believes in using PIPELINING to enable lower
> server loads and faster response time, why not a webmail->imap gateway?

Indeed this.  Our webmail runs in mod_perl2, so it's persistent, but there's
no guarantee that you'll get the same process on the next web hit.  We have
a thing called 'imappool', which is evil, but works very well.  Every
connection has associated state.  It actually passes the unix descriptor
backwards and forwards between processes.  It's super-fast.

We actually send a command 'XDUMMY' to the backend now, which just gets a
BAD response - but at least it confirms that the connection is still there
so that if a backend goes away we get all the same sort of failure rather
than random failures.
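
A rough Python sketch of this kind of liveness probe. The function name `connection_is_alive` and the `send_line`/`read_line` callables are hypothetical stand-ins for whatever I/O layer is in use; only the idea of sending an unrecognized command and waiting for its tagged reply comes from the description above.

```python
from typing import Callable


def connection_is_alive(send_line: Callable[[str], None],
                        read_line: Callable[[], str],
                        tag: str = "X001") -> bool:
    """Send an unknown command; any tagged reply to it proves the backend is there.

    read_line is assumed to return an empty string on end-of-stream.
    """
    try:
        send_line(f"{tag} XDUMMY")
        while True:
            line = read_line()
            if not line:
                return False  # connection closed by the other end
            if line.startswith(tag + " "):
                # Typically "X001 BAD ...": the status does not matter here.
                return True
    except OSError:
        return False  # write or read failed: treat the connection as gone
```

Running the probe before reusing a pooled connection turns a dead backend into one predictable failure at a known point.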

The plan is to replace XDUMMY with something that actually gets a new
SESSIONID from the backend.  We create a new SESSIONID every login and
it gets logged along with every change.  That way we can track changes
back to their source login - but tracking them to the exact web hit would
be even nicer.

Point is - it can be done.  I see that someone who wants to build a PHP
app which can be uploaded to j-random-webhost without having control over
the host configuration makes things hard though... no good answer there
except that IMAP isn't a great match for that configuration.

Bron.
-- 
  Bron Gondwana
  brong@fastmail.fm
E-mail headers
From: jkt@flaska.net
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: e122fa25-9110-4b36-855d-0e7e273c5805@flaska.net
Hi Michael,
I've read your draft, it's an interesting extension. However, it seems to me that the whole point here is to save a few roundtrips by skipping the process of activating/configuring various optional features. I'll discuss each extension separately.

> COMPRESS=DEFLATE

I was wondering if this one actually provides any benefit for a webmail client. But you're right that it indeed has an overhead and requires a full roundtrip to set up. However, please note that your extension also requires a full roundtrip, so you aren't any better here.

> ENABLE (CONDSTORE/QRESYNC)
> LANGUAGE
> COMPARATOR

It looks to me that you can easily pipeline all of these and that you do not risk anything by doing so. Yes, I'm aware of the wording of the ENABLE RFC which sounds like one really MUST check its return code, but a subsequent thread on this list indicated that this was not the desired outcome and that it is completely legal to pipeline ENABLE QRESYNC with SELECT ... QRESYNC.

As for LANGUAGE -- how often do you expect to hit an error condition which is not described by an appropriate response code? I don't think that blocking for its result would be a good design choice.

And finally, what IMAP servers support the LANGUAGE extension?

> CONVERSIONS
> saved CONTEXTs
> NOTIFY

Are you actually aware of a single IMAP server supporting any of these (besides CONTEXT=SEARCH, which again can easily be pipelined without any race conditions, and is specific to a mailbox state anyway, which is outside of scope of your extensions)?

In general, all of the items which you included as an example look like easily pipelineable items. Have you tried to use pipelining for these? What was the total time spent waiting for their completion in that case? What would be the best theoretical time which you could get by RESUME?

With kind regards,
Jan
E-mail headers
From: slusarz@curecanti.org
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: 20121127170037.Horde.bWDt0BZaqY7WIcdZNayF-Q6@bigworm.curecanti.org
Quoting Brandon Long <blong@google.com>:

> I believe he's dealing with an environment running php scripts with even
> support as running as CGIs which are exec'd on every access.
>
> Which isn't to say that some mechanism wouldn't be possible to do this
> (fork a daemon that holds the connections and pass them to the new server
> or something) but it gets complicated.

Yeah: this is exactly what imapproxy provides  
(http://www.imapproxy.org/).  Way back at the beginning of the thread  
I laid out the reason why this solution is insufficient or, at least,  
undesirable.

> It would be much simpler to just have a long running server serve the
> webpages directly instead of through CGI, but that's what he's got.

Agreed.  If I was in charge of creating a closed system webmail  
architecture, this would be an obvious decision. (Mail store  
interaction would probably not be done via pure IMAP either).  But  
stock HTTP and IMAP servers are what most admins have to work with, so  
we desire to distribute a software solution to these people to ensure  
the largest possible audience.

> On Tue, Nov 27, 2012 at 3:04 PM, L Walsh <imap@tlinx.org> wrote:
>
>> Just ignore me if I'm too clueless for this to be of any help, but if I was
>> doing what you are doing, I'd cache the client TCP connections to the IMAP
>> server until I timed their session-out from my webserver.
>>
>> I.e. would set cookie upon connection to their imap server that contains
>> a 'handle' that would indicate something in my 'IMAP-TCP connection cache',
>> and route their requests accordingly.
>>
>> Not only are starting TCP connections "expensive", but you have an added
>> expense
>> of IMAP state... if HTTP(S), believes in using PIPELINING to enable lower
>> server loads and faster response time, why not a webmail->imap gateway?

At least one author of an IMAP server has stated that the TCP  
connection is no longer expensive and is, in fact, negligible when  
compared to running even a single IMAP command.

>> Set the "session-timeout to ~5-15 minutes" of inactivity like many things
>> do and you can recycle their imap connection, but if they reconnect w/the
>> same cookie, it seems you could simply route their commands to the already
>> open connection to their imap server -- and there would be no overhead
>> of looking for capabilities, or resetting state...etc, as the TCP conn.
>> would guarantee it's the same server.
>>
>> It would all be transparent to the user -- if they reconnect to you, you
>> look in your cache, and see if the TCP connection is still valid (or if it,
>> possibly, has been closed from the other end).  If it has been, the safest
>> thing to do would be to give a message that the remote IMAP connection
>> timed-out and they need to re-connect.

Yup - you've pretty much (re-)invented imapproxy.

michael
E-mail headers
From: slusarz@curecanti.org
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: 20121121155417.Horde.ZeW7JqTPNxTAI-hTtrAT-Q9@bigworm.curecanti.org
Jan,

Thanks for the input.  My responses are below.

Quoting Jan Kundrát <jkt@flaska.net>:

> Hi Michael,
> I've read your draft, it's an interesting extension. However, it  
> seems to me that the whole point here is to save a few roundtrips by  
> skipping the process of activating/configuring various optional  
> features. I'll discuss each extension separately.

I would strongly disagree with this statement.  As written, the draft  
is only minimally concerned with saving on network round-trips.

E.g. a webmail implementation: for any even reasonably sized setup,  
interaction between the webmail backend and the IMAP server will  
almost certainly be done through a private network.  Such a setup has  
the added benefit that the IMAP connection does not need any sort of  
security [TLS] overhead.  I have assumed in the draft that the  
client/server round-trip is negligible or, at the very least, not the  
bottleneck in the IMAP interaction.

However, client/server round-trip *is* most likely an issue for a  
whole category of disconnected-like clients: those running on mobile  
hardware.  Pipelining in this environment in no way guarantees that a  
server can or will return the response in the same network packet.  In  
other words, this draft becomes *more* important the more that  
client/server round-trip time becomes the bottleneck/limiting factor,  
whether pipelined or not.

Sidebar: I'm not a huge fan in general of pipelining as a performance
optimization, since it is not always a feasible option for clients.  For
example, a client may use an OO-library to connect to the IMAP server.
This library may not provide a reasonable (or any) way of allowing multiple  
commands to be sent at once via the API.  For example, to start  
compression, enable QRESYNC, and set the language, it is more than  
reasonable to expect this kind of pseudocode:

$result = $imap->useCompression(true);
// Check for success
$imap->useQresync(true);
// Check for success
$imap->setLanguage([LANGUAGE]);
// Check for success

In any OO IMAP interface the order of IMAP commands to allow for  
efficient pipelining, or the fact that pipelining even exists, should  
obviously not be a part of the API.  Thus pipelining is fairly useless  
in the real world as a way to guarantee an increase in performance.

There are other, more important reasons why a mechanism to restore  
configuration is useful:

- It prevents the need to re-parse the CAPABILITY list.  Note that  
parsing the CAPABILITY list involves *MUCH* more than just the actual  
string tokenization of the list, although this alone may not be a  
trivial task (see below).

A client may, depending on the capabilities returned, need to perform  
various internal initialization tasks.  For example - if  
CONDSTORE/QRESYNC is listed, a client may have to then parse a  
separate configuration file to grab the details of the local cache  
where it is storing this information, and then connect to this cache,  
etc.  Or if language is listed, a client might have to parse a local  
list of language availability to determine if it can/should change the  
language.

And CAPABILITY parsing is more than just determining what capabilities  
are listed.  It is also determining which capabilities SHOULD not be  
listed.  Just today, Cyrus was fixed due to a bug that our code was  
triggering: APPENDing binary data via a literal8 caused Cyrus to  
immediately terminate the connection with a BYE response.  Our code is  
smart enough to catch this broken behavior by removing BINARY  
appending from the list of available capabilities.  But without a way  
to ensure that every subsequent connection is a continuation of the  
current session, we have to do this detection EVERY SINGLE TIME.  This  
is potentially a huge performance hit, since we may be appending MBs  
of data to the server before the BYE response can be returned (e.g.  
appending a sent-mail message containing attachments).

- As mentioned above, sending an initialization command to the server  
may take quite a bit of work on the client side to prepare.  It's not  
as easy as hardcoding 'ENABLE QRESYNC' in client code - it may take  
quite a bit of CPU cycles to get to that point in a given client.

Another example: a client keeps all of its imap initialization code in  
a separate dynamically-loadable module.  If the session is  
successfully resumed, this module does not need to be  
loaded/interpreted/run.

- From the server side, it may be much more expensive to initiate an  
IMAP session as compared with resuming one.  This draft allows the  
server to optimize if possible.  I believe Timo's post indicates that  
resuming in Dovecot is more efficient than creating a new session.

- Even when pipelining commands, they still need to be sent, the  
incoming command needs to be tokenized (server), the command is  
performed (server), the response sent back, any untagged responses are  
tokenized (client), the untagged responses are interpreted (client),  
the tagged response is tokenized (client), and the tagged response is  
processed (client).  None of this is "free".  Pipelining eliminates  
none of this.
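
The CAPABILITY post-processing described in the first point above can be sketched in Python as follows. The function names `parse_capabilities` and `usable_capabilities` and the `known_broken` parameter are hypothetical. The sketch covers only tokenizing the untagged response and filtering out capabilities the client has found to misbehave, not the detection step itself.

```python
from typing import Iterable, Set


def parse_capabilities(line: str) -> Set[str]:
    """Parse an untagged '* CAPABILITY ...' line into a set of upper-cased atoms."""
    parts = line.strip().split()
    if len(parts) < 2 or parts[0] != "*" or parts[1].upper() != "CAPABILITY":
        raise ValueError("not a CAPABILITY response")
    return {atom.upper() for atom in parts[2:]}


def usable_capabilities(line: str, known_broken: Iterable[str]) -> Set[str]:
    """Advertised capabilities minus those observed to be broken on this server."""
    return parse_capabilities(line) - {cap.upper() for cap in known_broken}
```

Without some guarantee that a later connection continues the same session, the contents of `known_broken` would have to be rediscovered on every connection, which is the cost described above.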

>> COMPRESS=DEFLATE
>
> I was wondering if this one actually provides any benefit for a  
> webmail client. But you're right that it indeed has an overhead and  
> requires a full roundtrip to set up. However, please note that your  
> extension also requires a full roundtrip, so you aren't any better  
> here.

First a point of clarification: the draft is not specific to webmail  
clients.  It is intended for any disconnected client that may have  
need to initiate multiple IMAP connections during the client's lifetime.

Granted, it is extremely useful for webmail clients due to the extremely  
disconnected nature of the connections, but it would also be highly  
useful for clients on any device that does not have a constant (or  
reliable) network connection to the server.  e.g. smartmobile clients;  
ActiveSync polling.

Whether or not COMPRESS is beneficial to a webmail implementation is  
beyond the scope of this discussion.

Second, you are partially right.  A successful restoration of the  
configuration state does require a round-trip to the server.  But a  
RESUME command sent before initialization is an example of a command  
that CAN be easily pipelined with an authentication command.  And the  
full round-trip is offset somewhat by the fact that upon a successful  
RESUME, the CAPABILITY string will not be sent back to the client if  
the server normally does this automatically on authentication.  And if  
the server doesn't normally return CAPABILITY information, then this  
is a complete win (RESUME/tagged OK vs. CAPABILITY command/CAPABILITY  
untagged response/tagged OK).
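
A minimal sketch of what pipelining RESUME with the authentication command could look like on the client side. RESUME is the command proposed in the draft, not an existing IMAP command; the token and credentials are the sample values from the example later in this message, and the helper name `pipeline` is hypothetical.

```python
def pipeline(commands, prefix="A"):
    """Tag each command (A1, A2, ...) and join them into one write buffer."""
    lines = []
    for number, command in enumerate(commands, start=1):
        lines.append(f"{prefix}{number} {command}\r\n")
    return "".join(lines).encode("ascii")


# Both commands go out in a single write; the client reads the two
# tagged responses afterwards instead of waiting between commands.
batch = pipeline(["RESUME c3RhdGUgdG9rZW4=", "LOGIN joe passwd"])
print(batch)
```

Whether the responses also come back in a single network packet depends on the server and the network, so this only removes the client-side wait between the two commands.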

>> ENABLE (CONDSTORE/QRESYNC)
>> LANGUAGE
>> COMPARATOR
>
> It looks to me that you can easily pipeline all of these and that  
> you do not risk anything by doing so. Yes, I'm aware of the wording  
> of the ENABLE RFC which sounds like one really MUST check its return  
> code, but a subsequent thread on this list indicated that this was  
> not the desired outcome and that it is completely legal to pipeline  
> ENABLE QRESYNC with SELECT ... QRESYNC.

I would argue that the language of the RFC still controls despite what  
an e-mail on this list says.  A client shouldn't be punished for  
interpreting it that way either.

> As of the LANGUAGE -- how often do you expect to hit an error  
> condition which is not described by an appropriate response code? I  
> don't think that blocking for its result would be a good design  
> choice.

That could be your decision as a client author.  I would vehemently disagree.

> And finally, what IMAP servers support the LANGUAGE extension?

Why does this matter?  RFC 5255 is a Standards Track extension.  A  
year from now, every IMAP server and 200 new ones may support it.

>> CONVERSIONS
>> saved CONTEXTs
>> NOTIFY
>
> Are you actually aware of a single IMAP server supporting any of  
> these (besides CONTEXT=SEARCH, which again can easily be pipelined  
> without any race conditions, and is specific to a mailbox state  
> anyway, which is outside of scope of your extensions)?

Again, why does this matter?  All of these are Standards Track  
extensions (your argument might hold a bit more water if these were  
Experimental documents).

And what about future extensions?  Those obviously aren't supported by  
ANY server yet.

A given client may not support any of these extensions.  This client  
could make the decision that SUSPEND/RESUME is pointless.  That  
doesn't mean the SUSPEND feature is pointless since another client may  
support ALL of these extensions.

> In general, all of the items which you included as an example look  
> like easily pipelineable items. Have you tried to use pipelining for  
> these? What was the total time spent waiting for their completion in  
> that case? What would be the best theoretical time which you could  
> get by RESTORE?

It would be impossible to determine benchmarks since there is no  
defined protocol yet.  And, as mentioned above, any given  
client/server interaction may provide different results based on their  
own internal optimizations and extension support.

About the only thing you could do is look at network traffic savings.   
The following is an example of the possible savings given a moderate  
use of IMAP configuration state (this is a more real-world example of  
Examples 1 & 2 in the draft):

Initial session:

[User authenticated]
A1 CAPABILITY
* CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE  
SORT SORT=DISPLAY THREAD=REFERENCES THREAD=REFS THREAD=ORDEREDSUBJECT  
MULTIAPPEND UNSELECT CHILDREN NAMESPACE UIDPLUS LIST-EXTENDED  
I18NLEVEL=1 CONDSTORE QRESYNC ESEARCH ESORT SEARCHRES WITHIN  
CONTEXT=SEARCH LIST-STATUS SPECIAL-USE ACL RIGHTS=texk
[This is the CAPABILITY list from Dovecot 2.1.10]
A1 OK Capability completed.
A2 ENABLE QRESYNC
* ENABLED QRESYNC
A2 OK Enabled.
A3 LANGUAGE DE
* LANGUAGE (DE)
* NAMESPACE (("" "/")) (("Other Users/" "/" "TRANSLATION" ("Andere  
Ben&APw-tzer/"))) (("Public Folders/" "/" "TRANSLATION" ("Gemeinsame  
Postf&AM8-cher/")))
A3 Sprachwechsel durch LANGUAGE-Befehl ausgefuehrt
[...]
A20 SUSPEND
* SUSPEND c3RhdGUgdG9rZW4=
* BYE Server logging out.
A20 OK Logout completed.

Additional network data required by SUSPEND commands: 29 bytes (LOGOUT  
vs. SUSPEND; SUSPEND untagged response)
[However, this command will only normally be run the FIRST time the  
session is accessed, so this is a one-time only hit]
Additional round-trips required: 0

Subsequent sessions:

A1 RESUME c3RhdGUgdG9rZW4=
A1 OK
A2 LOGIN joe passwd
A2 OK [RESUME c3RhdGUgdG9rZW4=] LOGIN completed and configuration restored.
[...]

Additional network data required by RESUME commands: 61 bytes (RESUME  
command, RESUME response code)
Additional round-trips required: 1
Network data saved by RESUME: ~650 bytes
Round-trips saved: 3
Server parsed commands saved: 3
Client issued commands saved: 3
Untagged responses that do not need to be re-parsed: 4


In this example, the one-time addition of 29 bytes of network traffic  
(1 additional untagged response parse) results in the savings of 2  
round-trips, ~600 bytes of network traffic, and 3 additional commands  
that need to be parsed on the client/server side.  And remember this  
doesn't factor in any initialization code that needs to be run within  
the server/client to perform these commands.
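
The relationship between the per-session figures and the summary can be checked with a short calculation. This is only a sketch of one reading of the numbers above: it assumes the summary figures are net values, i.e. what RESUME saves minus what RESUME itself costs on each subsequent session. The one-time 29 bytes for SUSPEND are left out because they are not paid per session.

```python
def net_per_resume(gross_bytes_saved, resume_overhead_bytes,
                   round_trips_saved, round_trips_added):
    """Return (net bytes saved, net round-trips saved) for one resumed session."""
    return (gross_bytes_saved - resume_overhead_bytes,
            round_trips_saved - round_trips_added)


# Figures taken from the example: ~650 bytes and 3 round-trips saved,
# 61 bytes and 1 round-trip added by RESUME.
net_bytes, net_round_trips = net_per_resume(650, 61, 3, 1)
print(net_bytes, net_round_trips)  # 589 2
```

Under that reading, 589 bytes rounds to the ~600 bytes quoted, and the net round-trip saving is 2.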

To me, that is substantial savings, especially when the connection may  
be re-established every 10 seconds.

Hope this response identifies the reason and necessity of the  
proposal.  Thanks again for the constructive input.

michael
E-mail headers
From: imap@tlinx.org
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: 50B55DE5.5080307@tlinx.org
Michael M Slusarz wrote:
> Yeah: this is exactly what imapproxy provides 
> (http://www.imapproxy.org/).  Way back at the beginning of the thread 
> I laid out the reason why this solution is insufficient or, at least, 
> undesirable. 
...
>
> Yup - you've pretty much (re-)invented imapproxy.
----
    But the reasons you gave for imapproxy not working were its
auto-restart of a session (which I suggested against), and keeping what
connection you need in a cookie -- which would allow specifying exactly
which connection you want -- thus overcoming your other concern about
only allowing for a 1:1 backend.

i.e. it solves your critical problems and leaves a bit of undesirable
complexity -- maintenance of a separate service (not if you include your
modified version in your disconnected client, no?) and the last reason,
which was it being specific to some proxy server (??which? the one you
would ship -- i.e. the modified imapproxy?)....

So... I guess I'm seeing that because you are running a "scriptish"
frontend, you don't want to include a binary?  erk.  Dunno about php, but
you could do it in perl.  Seems like what you might need is a
php-extension (i.e. a binary-lib that could be loaded and handle some of
this)...

That's if you wanted to do it yourself and not bother trying to extend
the IMAP protocol...

I dunno, seems like a php-extension-lib (I have never used php, so don't
know how difficult that would be, but....) might be the most
straightforward way to get to where you want to go...??

But you're much more knowledgeable about all the reasons you want to do
it the way you are doing it -- and from my own experience, sometimes I
have lots of un-voiced/unwritten-down reasons for wanting to do something
a particular way that I just haven't communicated very well (even to
myself!) ;-)...

Anyway, sorry I'm recovering ground you already thought of...

Sounds complicated...ick.  Good luck! ;-)
E-mail headers
From: blong@google.com
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: CABa8R6sJfnA=HYk-fmFfpXki91cne_LeqFyuhDebMGeDdsdj7g@mail.gmail.com
On Wed, Nov 21, 2012 at 2:54 PM, Michael M Slusarz <slusarz@curecanti.org> wrote:

> Jan,
>
> Thanks for the input.  My responses are below.
>
>
> Quoting Jan Kundrát <jkt@flaska.net>:
>
>> Hi Michael,
>> I've read your draft, it's an interesting extension. However, it seems to
>> me that the whole point here is to save a few roundtrips by skipping the
>> process of activating/configuring various optional features. I'll discuss
>> each extension separately.
>>
>
> I would strongly disagree with this statement.  As written, the draft is
> only minimally concerned with saving on network round-trips.
>

Yet, that and bytes are what you quote from your example.


> E.g. a webmail implementation: for any even reasonably sized setup,
> interaction between the webmail backend and the IMAP server will almost
> certainly be done through a private network.  Such a setup has the added
> benefit that the IMAP connection does not need any sort of security [TLS]
> overhead.  I have assumed in the draft that the client/server round-trip is
> negligible or, in the very least, not the bottleneck in the IMAP
> interaction.
>
> However, client/server round-trip *is* most likely an issue for a whole
> category of disconnected-like clients: those running on mobile hardware.
>  Pipelining in this environment in no way guarantees that a server can or
> will return the response in the same network packet.  In other words, this
> draft becomes *more* important the more that client/server round-trip time
> becomes the bottleneck/limiting factor, whether pipelined or not.
>
> Sidebar: I'm not a huge fan in general of pipelining as a performance
> since it is not always a feasible option for clients.  For example, a
> client may use an OO-library to connect to the IMAP server.  This library
> may not provide a reasonable (or any) way of allowing multiple commands to
> be sent at once via the API.  For example, to start compression, enable
> QRESYNC, and set the language, it is more than reasonable to expect this
> kind of pseudocode:
>
> $result = $imap->useCompression(true);
> // Check for success
> $imap->useQresync(true);
> // Check for success
> $imap->setLanguage([LANGUAGE]);
> // Check for success
>
> In any OO IMAP interface the order of IMAP commands to allow for efficient
> pipelining, or the fact that pipelining even exists, should obviously not
> be a part of the API.  Thus pipelining is fairly useless in the real world
> as a way to guarantee an increase in performance.
>

That's not a true statement.  It's useless if you're forced to use a bad
client API that you aren't willing to work around?  Is that bad client API
going to support adding a new command?


> There are other, more important reasons why a mechanism to restore
> configuration is useful:
>
> - It prevents the need to re-parse the CAPABILITY list.  Note that parsing
> the CAPABILITY list involves *MUCH* more than just the actual string
> tokenization of the list, although this alone may not be a trivial task
> (see below).
>
> A client may, depending on the capabilities returned, need to perform
> various internal initialization tasks.  For example - if CONDSTORE/QRESYNC
> is listed, a client may have to then parse a separate configuration file to
> grab the details of the local cache where it is storing this information,
> and then connect to this cache, etc.  Or if language is listed, a client
> might have to parse a local list of language availability to determine if
> it can/should change the language.
>

And... it would have had to do that anyways.  If it's resuming the
connection, it has to either do that initialization or it has to cache that
initialization, both of which it can do just as easily on non-resume if
we're talking about a new connection every 10s.

Parse a local list of language availability?  So, you want to avoid reading
a local config file ... on the hope that the client doesn't need to know
that language information anyways just to display stuff to the user which
doesn't come from the server?


> And CAPABILITY parsing is more than just determining what capabilities are
> listed.  It is also determining which capabilities SHOULD not be listed.
>  Just today, Cyrus was fixed due to a bug that our code was triggering:
> APPENDing binary data via a literal8 caused Cyrus to immediately terminate
> the connection with a BYE response.  Our code is smart enough to catch this
> broken behavior by removing BINARY appending from the list of available
> capabilities.  But without a way to ensure that every subsequent connection
> is a continuation of the current session, we have to do this detection
> EVERY SINGLE TIME.  This is potentially a huge performance hit, since we
> may be appending MBs of data to the server before the BYE response can be
> returned (e.g. appending a sent-mail message containing attachments).
>

So, you worked around the Cyrus bug by determining the connection exhibits
the bug and then never using it for that server again?  Or, you could just
not use BINARY ever, or issue an ID command to know if the remote server
has the bug.  You could even pipeline it!  Or you could store the known bad
server information somewhere in your app server.


> - As mentioned above, sending an initialization command to the server may
> take quite a bit of work on the client side to prepare.  It's not as easy
> as hardcoding 'ENABLE QRESYNC' in client code - it may take quite a bit of
> CPU cycles to get to that point in a given client.
>

But this doesn't change anything about the having to do that.  Regardless
of whether you're resuming a session or not, you still have to do that work.


> Another example: a client keeps all of its imap initialization code in a
> separate dynamically-loadable module.  If the session is successfully
> resumed, this module does not need to be loaded/interpreted/run.
>

What kind of clients are we talking about?  I'm just completely failing to
think this is an issue.


> - From the server side, it may be much more expensive to initiate an IMAP
> session as compared with resuming one.  This draft allows the server to
> optimize if possible.  I believe Timo's post indicates that resuming in
> Dovecot is more efficient than creating a new session.
>

Resuming into the middle of a selected folder seems cheap.  Resuming the
status of N commands seems like optimizing in the small.  I would think it
would cost more in terms of either caching the data in memory or stashing
the data to disk/reading it back than the overhead of parsing a couple
commands and maybe allocating a data structure.


> - Even when pipelining commands, they still need to be sent, the incoming
> command needs to be tokenized (server), the command is performed (server),
> the response sent back, any untagged responses are tokenized (client), the
> untagged responses are interpreted (client), the tagged response is
> tokenized (client), and the tagged response is processed (client).  None of
> this is "free".  Pipelining eliminates none of this.


Tokenization is what clients and servers do, and this takes a trivial
amount of time and cpu in reasonable languages.


>>> COMPRESS=DEFLATE
>>>
>>
>> I was wondering if this one actually provides any benefit for a webmail
>> client. But you're right that it indeed has an overhead and requires a full
>> roundtrip to set up. However, please note that your extension also requires
>> a full roundtrip, so you aren't any better here.
>>
>
> First a point of clarification: the draft is not specific to webmail
> clients.  It is intended for any disconnected client that may have need to
> initiate multiple IMAP connections during the client's lifetime.
>
> Granted, it is extremely useful for webmail clients due to extreme
> disconnected nature of the connections, but it would also be highly useful
> for clients on any device that does not have a constant (or reliable)
> network connection to the server.  e.g. smartmobile clients; ActiveSync
> polling.
>
> Whether or not COMPRESS is beneficial to a webmail implementation is
> beyond the scope of this discussion.
>
> Second, you are partially right.  A successful restoration of the
> configuration state does require a round-trip to the server.  But a RESUME
> command sent before initialization is an example of a command that CAN be
> easily pipelined with an authentication command.  And the full round-trip
> is offset somewhat by the fact that upon a successful RESUME, the
> CAPABILITY string will not be sent-back to the client if the server
> normally does this automatically on authentication.  And if the server
> doesn't normally return CAPABILITY information, then this is a complete win
> (RESUME/tagged OK vs. CAPABILITY command/CAPABILITY untagged
> response/tagged OK).


And what does "resuming" a COMPRESS=DEFLATE do?  I assume we're not talking
about trying to keep the old dictionary or anything like that, right?
 We're actually talking about just the equivalent of starting it again.


>>> ENABLE (CONDSTORE/QRESYNC)
>>> LANGUAGE
>>> COMPARATOR
>>>
>>
>> It looks to me that you can easily pipeline all of these and that you do
>> not risk anything by doing so. Yes, I'm aware of the wording of the ENABLE
>> RFC which sounds like one really MUST check its return code, but a
>> subsequent thread on this list indicated that this was not the desired
>> outcome and that it is completely legal to pipeline ENABLE QRESYNC with
>> SELECT ... QRESYNC.
>>
>
> I would argue that the language of the RFC still controls despite what an
> e-mail on this list says.  A client shouldn't be punished for interpreting
> it that way either.
>
>
>> As of the LANGUAGE -- how often do you expect to hit an error condition
>> which is not described by an appropriate response code? I don't think that
>> blocking for its result would be a good design choice.
>>
>
> That could be your decision as a client author.  I would vehemently
> disagree.


Or just re-issue the command to get the response translated.

>> And finally, what IMAP servers support the LANGUAGE extension?
>>
>
> Why does this matter?  RFC 5255 is a Standards Track extension.  A year
> from now, every IMAP server and 200 new ones may support it.


True.  But that also goes to my point that you need to specify which
extensions and what information would need to be resumed, you can't just
say "any which apply", what if the server author doesn't think the way you
do and one of these doesn't get resumed?

>>> CONVERSIONS
>>> saved CONTEXTs
>>> NOTIFY
>>>
>>
>> Are you actually aware of a single IMAP server supporting any of these
>> (besides CONTEXT=SEARCH, which again can easily be pipelined without any
>> race conditions, and is specific to a mailbox state anyway, which is
>> outside of scope of your extensions)?
>>
>
> Again, why does this matter?  All of these are Standards Track extensions
> (your argument might hold a bit more water if these were Experimental
> documents).
>
> And what about future extensions?  Those obviously aren't supported by ANY
> server yet.
>
> A given client may not support any of these extensions.  This client could
> make the decision that SUSPEND/RESUME is pointless.  That doesn't mean the
> SUSPEND feature is pointless since another client may support ALL of these
> extensions.
>
>
>> In general, all of the items which you included as an example look like
>> easily pipelineable items. Have you tried to use pipelining for these? What
>> was the total time spent waiting for their completion in that case? What
>> would be the best theoretical time which you could get by RESTORE?
>>
>
> It would be impossible to determine benchmarks since there is no defined
> protocol yet.  And, as mentioned above, any given client/server interaction
> may provide different results based on their own internal optimizations and
> extension support.
>

But without an actual proof that this is useful for some combination of
client/server, why would we adopt yet another extension that no one will
implement?


> About the only thing you could do is look at network traffic savings.  The
> following is an example of the possible savings giving a moderate use of
> IMAP configuration state (this is a more real-world example of Examples 1 &
> 2 in the draft):
>
> Initial session:
>
> [User authenticated]
> A1 CAPABILITY
> * CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE
> SORT SORT=DISPLAY THREAD=REFERENCES THREAD=REFS THREAD=ORDEREDSUBJECT
> MULTIAPPEND UNSELECT CHILDREN NAMESPACE UIDPLUS LIST-EXTENDED I18NLEVEL=1
> CONDSTORE QRESYNC ESEARCH ESORT SEARCHRES WITHIN CONTEXT=SEARCH LIST-STATUS
> SPECIAL-USE ACL RIGHTS=texk
> [This is the CAPABILITY list from Dovecot 2.1.10]
> A1 OK Capability completed.
> A2 ENABLE QRESYNC
> * ENABLED QRESYNC
> A2 OK Enabled.
> A3 LANGUAGE DE
> * LANGUAGE (DE)
> * NAMESPACE (("" "/")) (("Other Users/" "/" "TRANSLATION" ("Andere
> Ben&APw-tzer/"))) (("Public Folders/" "/" "TRANSLATION" ("Gemeinsame
> Postf&AM8-cher/")))
> A3 Sprachwechsel durch LANGUAGE-Befehl ausgefuehrt
> [...]
> A20 SUSPEND
> * SUSPEND c3RhdGUgdG9rZW4=
> * BYE Server logging out.
> A20 OK Logout completed.
>
> Additional network data required by SUSPEND commands: 29 bytes (LOGOUT vs.
> SUSPEND; SUSPEND untagged response)
> [However, this command will only normally be run the FIRST time the
> session is accessed, so this is a one-time only hit]
> Additional round-trips required: 0
>
> Subsequent sessions:
>
> A1 RESUME c3RhdGUgdG9rZW4=
> A1 OK
> A2 LOGIN joe passwd
> A2 OK [RESUME c3RhdGUgdG9rZW4=] LOGIN completed and configuration restored.
> [...]
>
> Additional network data required by RESUME commands: 61 bytes (RESUME
> command, RESUME response code)
> Additional round-trips required: 1
> Network data saved by RESUME: ~650 bytes
> Round-trips saved: 3
> Server parsed commands saved: 3
> Client issued commands saved: 3
> Untagged responses that do not need to be re-parsed: 4
>
>
> In this example, the one-time addition of 29 bytes of network traffic (1
> additional untagged response parse) results in the savings of 2
> round-trips, ~600 bytes of network traffic, and 3 additional commands that
> need to be parsed on the client/server side.  And remember this doesn't
> factor in any initialization code that needs to be run within the
> server/client to perform these commands.
>
> To me, that is substantial savings, especially when the connection may be
> re-established every 10 seconds.
>

Have you considered not re-establishing a connection every 10s?  This is a
connected protocol, not http.

Brandon
E-mail headers
From: jkt@flaska.net
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: 68035860-387d-43a9-8bb7-00744a7868b9@flaska.net
Michael,
I've read your response a few times and I have to admit that most of the points you raise sound very, very alien to me. I realize that you're coming from a different background than me (different programming language, environment, users and use cases), so I'll be happy to be proven wrong -- I like learning and challenging my own views.

On Wednesday, 21 November 2012 23:54:17 CEST, Michael M Slusarz wrote:
> I would strongly disagree with this statement.  As written, the 
> draft  is only minimally concerned with saving on network 
> round-trips.

That's quite different from what I've understood from your draft -- I'd suggest making the motivation clearer, then. But point understood, and I've now purged the "let's save roundtrips" from my understanding of the draft :). OK.

> $result = $imap->useCompression(true);
> // Check for success
> $imap->useQresync(true);
> // Check for success
> $imap->setLanguage([LANGUAGE]);
> // Check for success

It is pretty obvious that if you use synchronous primitives for enabling individual sub-features in a serialized fashion, your performance will be limited by the round trip times. To put it more bluntly, you cannot have code like the one shown above and expect good performance.
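
A rough model of that limit, as a sketch only. It assumes server processing time is negligible and that all pipelined responses arrive within a single round trip, which is not guaranteed on every network; the 100 ms round-trip time is an illustrative value, not a measurement.

```python
def setup_latency_ms(n_commands, rtt_ms, pipelined):
    """Estimate time spent waiting on the network for n setup commands."""
    # Serialized calls wait one full round trip per command;
    # a pipelined batch waits roughly one round trip in total.
    round_trips = 1 if pipelined else n_commands
    return round_trips * rtt_ms


# Three setup calls (compression, QRESYNC, language) at an assumed 100 ms RTT.
print(setup_latency_ms(3, 100, pipelined=False))  # 300
print(setup_latency_ms(3, 100, pipelined=True))   # 100
```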

Coming from that background, I see that it is tempting to replace this endless row of synchronous calls, each enabling a single optional feature, with a quick way to side-step this process by quickly jumping into a pre-negotiated state where everything which was enabled before is enabled now as well. However, my point is that clients already exist proving that the same efficiency can be achieved with the existing facilities. You're right that this requires abolishing the serial, synchronized code, but IMAP is not particularly friendly with synchronous APIs.

> In any OO IMAP interface the order of IMAP commands to allow 
> for  efficient pipelining, or the fact that pipelining even 
> exists, should  obviously not be a part of the API.  Thus 
> pipelining is fairly useless  in the real world as a way to 
> guarantee an increase in performance.

Wrong. I have a client (written in an object-oriented language using object-oriented paradigms) which uses pipelining, and uses it successfully, I'd say. The trick was to have the library implemented in an asynchronous way. For example, in my client the API between the IMAP-specific parts and the GUI is built on the model-view-controller design pattern; appropriate callbacks are in place to update the attached views when the requested data arrive from the network. It works beautifully in the (native application, not HTML) GUI, and another party uses this IMAP library with exactly the same asynchronous API underneath in a batch tool for stuffing incoming messages into a CRM/ERP database. Yes, the programming is different than if the library provided synchronous calls.
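
To illustrate the general idea of a callback-based API that allows pipelining, here is a minimal sketch. It is not the code of the client described above; the class `TagDispatcher` and its methods are hypothetical names, and real response parsing is reduced to splitting off the tag and the status word.

```python
class TagDispatcher:
    """Queue commands without blocking and route tagged replies to callbacks."""

    def __init__(self):
        self._next = 1
        self._pending = {}

    def send(self, command, on_done):
        """Return the wire line for `command` and remember its callback."""
        tag = f"A{self._next}"
        self._next += 1
        self._pending[tag] = on_done
        return f"{tag} {command}\r\n"

    def feed(self, line):
        """Handle one response line; return True if it completed a command."""
        tag, _, rest = line.partition(" ")
        callback = self._pending.pop(tag, None)
        if callback is None:
            return False  # untagged response or unknown tag
        status = rest.split(" ", 1)[0]
        callback(status, rest)
        return True


results = []
dispatcher = TagDispatcher()
wire = dispatcher.send("ENABLE QRESYNC", lambda s, t: results.append(("enable", s)))
wire += dispatcher.send("LANGUAGE DE", lambda s, t: results.append(("language", s)))
dispatcher.feed("* ENABLED QRESYNC")
dispatcher.feed("A1 OK Enabled.")
dispatcher.feed("A2 OK done")
print(results)
```

The caller never waits between `send` calls, so the order in which commands are issued stays an internal detail of the library rather than part of its API.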

> A client may, depending on the capabilities returned, need to 
> perform  various internal initialization tasks.  For example - 
> if  CONDSTORE/QRESYNC is listed, a client may have to then parse 
> a  separate configuration file to grab the details of the local 
> cache  where it is storing this information, and then connect to 
> this cache,  etc.

So you want to keep the cache information (among other things) inside some serialized client-side state storage. What prevents you from simply checking the capabilities against the previously recorded state and restoring the state when the capabilities match exactly? You can do that now, without waiting for this extension. Yes, it's ugly, but if your initialization is expensive...
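
A minimal sketch of that check, assuming the client keeps its initialized state in some local store (a plain dict here). The names `capability_fingerprint` and `restore_or_init` are hypothetical, and the comparison is deliberately an exact match of the CAPABILITY line, as suggested above.

```python
import hashlib


def capability_fingerprint(capability_line):
    """Hash the exact CAPABILITY line so it can be used as a short cache key."""
    return hashlib.sha256(capability_line.strip().encode("utf-8")).hexdigest()


def restore_or_init(cache, capability_line, initialize):
    """Reuse cached state when the capabilities match exactly, else initialize."""
    key = capability_fingerprint(capability_line)
    if key in cache:
        return cache[key], True
    state = initialize(capability_line)
    cache[key] = state
    return state, False


def expensive_init(capability_line):
    # Stand-in for the costly client-side setup discussed in the thread.
    return {"qresync": "QRESYNC" in capability_line.split()}


cache = {}
caps = "IMAP4rev1 ENABLE CONDSTORE QRESYNC"
first = restore_or_init(cache, caps, expensive_init)
second = restore_or_init(cache, caps, expensive_init)
print(first, second)
```

The second call returns the stored state without running the initialization again; any change in the advertised capabilities produces a different key and forces a fresh initialization.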

> - Even when pipelining commands, they still need to be sent, 
> the  incoming command needs to be tokenized (server), the 
> command is  performed (server), the response sent back, any 
> untagged responses are  tokenized (client), the untagged 
> responses are interpreted (client),  the tagged response is 
> tokenized (client), and the tagged response is  processed 
> (client).  None of this is "free".  Pipelining eliminates  none 
> of this.

Using the numbers you posted later on, we're speaking about parsing roughly 600 bytes of a well-structured text. For me, it's hard to believe that this has any measurable impact.

> I would argue that the language of the RFC still controls 
> despite what  an e-mail on this list says.  A client shouldn't 
> be punished for  interpreting it that way either.

The RFC is a specification crafted by humans. It has errors, and all subsequent revisions will still have errors. (See the errata for a list of those which are known already.) If you choose to block and not pipeline ENABLE QRESYNC and SELECT ... QRESYNC, you hurt your users. (Also note that the clarification given on this list was by the original authors of the RFC.)

>> As of the LANGUAGE -- how often do you expect to hit an error  
>> condition which is not described by an appropriate response 
>> code? I  don't think that blocking for its result would be a 
>> good design  choice.
>
> That could be your decision as a client author.  I would 
> vehemently disagree.
>
>> And finally, what IMAP servers support the LANGUAGE extension?
>
> Why does this matter?  RFC 5255 is a Standards Track extension. 
>  A  year from now, every IMAP server and 200 new ones may 
> support it.

I stand by my reasoning. In order for the block to be actually useful, you'll have to talk to a server which:

1) actually implements LANGUAGE,
2) executes all commands in parallel OR has the LANGUAGE command implemented in such a slow way that it enables parallel processing for it,
3) returns a failure for one of the first commands which you send *and* does not return an appropriate response code.

But it's your client, do whatever you want to do :). I'm merely saying that adding an extension driven by the desire to eliminate issues like this is not something I support.

> It would be impossible to determine benchmarks since there is 
> no  defined protocol yet.  And, as mentioned above, any given  
> client/server interaction may provide different results based on 
> their  own internal optimizations and extension support.

Right. Well, based on how my client works, I don't expect any significant performance gains obtained through this proposal.

I'm not the standards committee, but having decent numbers saying "see, this RESUME extension cuts 40% out of the 1300ms required to establish an IMAP session" is something which moves the discussion from the current, very vague stage of "this is good -- nope, this is worthless" into a stage where we can actually discuss what merits it really brings. As you're proposing the extension, you should IMHO provide these numbers.

> Additional network data required by RESUME commands: 61 bytes 
> (RESUME  command, RESUME response code)
> Additional round-trips required: 1
> Network data saved by RESUME: ~650 bytes
> Round-trips saved: 3
> Server parsed commands saved: 3
> Client issued commands saved: 3
> Untagged responses that do not need to be re-parsed: 4
>
> In this example, the one-time addition of 29 bytes of network 
> traffic  (1 additional untagged response parse) results in the 
> savings of 2  round-trips, ~600 bytes of network traffic, and 3 
> additional commands  that need to be parsed on the client/server 
> side.  And remember this  doesn't factor in any initialization 
> code that needs to be run within  the server/client to perform 
> these commands.
>
> To me, that is substantial savings, especially when the 
> connection may  be re-established every 10 seconds.

I disagree with your analysis for the following reasons:

1) You don't take the initial CAPABILITY into account, but you re-request CAPABILITY after login. (You need the initial capability to see whether the server supports RESUME at all.) This will change the numbers quite a lot.
2) The sample token which Timo showed on the other list was way longer than base64("state token") you use. Just saying.
3) Saving 600 bytes of transmitted data per connection is noise compared to what an actual session typically transfers.
4) You could save even more bytes by converting IMAP to a binary protocol. That possibility in itself is, however, no reason to do so.
5) You're taking advantage of eliminating NAMESPACE, but so far have ignored LIST and STATUS, even though a typical client will need them as well. When the LIST responses come into account, savings of 600 bytes start looking more and more like noise -- not to mention the mailbox synchronization or data transfers.

As usual, I'd love to be shown where my reasoning has flaws.

With kind regards,
Jan
E-mail headers
From: slusarz@curecanti.org
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: 20121121183606.Horde.8DDdRoJ261AbQsAiMPfS4w1@bigworm.curecanti.org
Quoting Brandon Long <blong@google.com>:

> On Wed, Nov 21, 2012 at 2:54 PM, Michael M Slusarz  
> <slusarz@curecanti.org> wrote:
>
>> Quoting Jan Kundrát <jkt@flaska.net>:
>>
>>  Hi Michael,
>>> I've read your draft, it's an interesting extension. However, it seems to
>>> me that the whole point here is to save a few roundtrips by skipping the
>>> process of activating/configuring various optional features. I'll discuss
>>> each extension separately.
>>>
>>
>> I would strongly disagree with this statement.  As written, the draft is
>> only minimally concerned with saving on network round-trips.
>>
>
> Yet, that and bytes are what you quote from your example.

That would be a selective reading of my example.  I spent the entire  
previous paragraph before introducing the example stating that other  
methods of benchmarking would be desirable, but impractical/impossible  
at this time.

I agree that byte counting isn't particularly useful as a benchmark,  
although it does provide some context (for example: a protocol  
addition that ADDED net additional bytes on-the-wire would tend to  
indicate the design/theory is flawed).

And I also explicitly indicated the net amount of  
commands/round-trips/additional IMAP protocol items that were saved,  
which is a more useful benchmark.

>> Sidebar: I'm not a huge fan in general of pipelining as a performance technique
>> since it is not always a feasible option for clients.  For example, a
>> client may use an OO-library to connect to the IMAP server.  This library
>> may not provide a reasonable (or any) way of allowing multiple commands to
>> be sent at once via the API.  For example, to start compression, enable
>> QRESYNC, and set the language, it is more than reasonable to expect this
>> kind of pseudocode:
>>
>> $result = $imap->useCompression(true);
>> // Check for success
>> $imap->useQresync(true);
>> // Check for success
>> $imap->setLanguage([LANGUAGE]);
>> // Check for success
>>
>> In any OO IMAP interface the order of IMAP commands to allow for efficient
>> pipelining, or the fact that pipelining even exists, should obviously not
>> be a part of the API.  Thus pipelining is fairly useless in the real world
>> as a way to guarantee an increase in performance.
>>
> That's not a true statement.  It's useless if you're forced to use a bad
> client API that you aren't willing to work around?  Is that bad client API
> going to support adding a new command?

How do you suggest writing an IMAP client library API in which the  
user of the API doesn't need to know ANYTHING about IMAP?  Any API  
that requires the client author to know about pipelining or other IMAP  
protocol details is worthless.  In fact, I'd go so far as to say that a  
useful mail client library API should allow interaction with both an  
IMAP and POP server using the same commands, albeit with the  
expectation that some of the more advanced commands - e.g. ACLs -  
would necessarily need to be null actions when using a POP3 backend.

SUSPEND is a general solution and easy to implement (IMHO) and would  
allow performance gains without understanding the more esoteric  
details of pipelining.  The simple fact that we are discussing which  
commands would be appropriate for pipelining highlights the latter.

>> There are other, more important reasons why a mechanism to restore
>> configuration is useful:
>>
>> - It prevents the need to re-parse the CAPABILITY list.  Note that parsing
>> the CAPABILITY list involves *MUCH* more than just the actual string
>> tokenization of the list, although this alone may not be a trivial task
>> (see below).
>>
>> A client may, depending on the capabilities returned, need to perform
>> various internal initialization tasks.  For example - if CONDSTORE/QRESYNC
>> is listed, a client may have to then parse a separate configuration file to
>> grab the details of the local cache where it is storing this information,
>> and then connect to this cache, etc.  Or if language is listed, a client
>> might have to parse a local list of language availability to determine if
>> it can/should change the language.
>>
>
> And... it would have had to do that anyways.  If it's resuming the
> connection, it has to either do that initialization or it has to cache that
> initialization, both of which it can do just as easily on non-resume if
> we're talking about a new connection every 10s.

You've proven my point exactly.  For you, this kind of initialization  
may be trivial.  For another client, this isn't.  You can't make  
assumptions (incorrectly, in your case) about how a client does work  
or should work depending on how you do/would do things.

> Parse a local list of language availability?  So, you want to avoid reading
> a local config file ... on the hope that the client doesn't need to know
> that language information anyways just to display stuff to the user which
> doesn't come from the server?

More assumptions about client behavior.  I'll agree that my initial  
example may not be a tremendously useful/practical example, but it is  
a useful analogy for other initialization tasks that may occur.

>> And CAPABILITY parsing is more than just determining what capabilities are
>> listed.  It is also determining which capabilities SHOULD not be listed.
>>  Just today, Cyrus was fixed due to a bug that our code was triggering:
>> APPENDing binary data via a literal8 caused Cyrus to immediately terminate
>> the connection with a BYE response.  Our code is smart enough to catch this
>> broken behavior by removing BINARY appending from the list of available
>> capabilities.  But without a way to ensure that every subsequent connection
>> is a continuation of the current session, we have to do this detection
>> EVERY SINGLE TIME.  This is potentially a huge performance hit, since we
>> may be appending MBs of data to the server before the BYE response can be
>> returned (e.g. appending a sent-mail message containing attachments).
>>
>
> So, you worked around the cyrus bug by determining the connection exhibits
> the bug and then never using it for that server again?  Or, you could just
> not use BINARY ever, or issue an ID command to know if the remote server
> has the bug. You could even pipeline it!  Or you could store the known bad
> server information somewhere in your app server.

* Without a way of determining if the IMAP server we connect to is the  
same IMAP server we previously connected to when we determined BINARY  
literal8's were broken, there's nothing we can do except try all over  
again.  You obviously can't assume that the IMAP server is the same.   
Several of my clients use IMAP load-balancing, and all backend IMAP  
servers may not be running the same IMAP software/version.

* Ack!  You didn't just say to use the ID command did you?  RFC 2971 [3]:

    Implementations MUST NOT make operational changes based on the data
    sent as part of the ID command or response.  The ID command is for
    human consumption only, and is not to be used in improving the
    performance of clients or servers.

    This includes, but is not limited to, the following:

       [...] Clients MUST NOT attempt to work
       around server bugs based on the ID response.

* The Cyrus break apparently only happened recently.  Hypothetically:  
even if using ID information (BAD!), how does that help all of the  
admins that have installed previous versions of our software where the  
ID sniff does not catch the issue?

* Not use BINARY ever?  How do you send null characters?  And because  
*1* version of *1* server is broken, EVERY other server that has ever,  
or will ever, support BINARY has to be ignored?  That's a bummer.

* (Getting a bit off topic...) If a server supports BINARY - or at  
least if it claims to support BINARY - it is a big time win to just  
send all literals as literal8's.  That way you don't have to scan the  
data stream for nulls, which is potentially an expensive operation  
when the APPENDed data is 10's of MBs in size (users love sending 10  
camera pics in outgoing e-mails for some reason...).  So sending  
literal8's is a significant performance improvement.  If a server  
reports that it supports BINARY, who are we to argue?

* For the record, Cyrus is not the only one that has a broken BINARY  
literal8 implementation (that I know of). For fun, try this on a  
UW-IMAP BINARY capable server:

A1 APPEND INBOX ~{1}
A1 BAD Missing literal in APPEND
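
The literal8 point above can be sketched as follows. This is a minimal illustration, not code from any client discussed in this thread: the function name `literal_prefix` and the `skip_scan` flag are hypothetical, and the only protocol detail relied on is the RFC 3516 literal8 form `~{size}` versus the ordinary literal form `{size}`.

```python
def literal_prefix(data: bytes, server_has_binary: bool, skip_scan: bool = False) -> bytes:
    """Return the literal size prefix to use when APPENDing `data`.

    skip_scan=True models the "send everything as literal8" approach:
    if the server advertises BINARY, the payload is never scanned for NULs.
    """
    size = len(data)
    if server_has_binary and skip_scan:
        # Trust the advertised capability; no pass over the payload.
        return b"~{%d}" % size
    if b"\x00" in data:
        # A NUL byte cannot be sent in an ordinary literal.
        if not server_has_binary:
            raise ValueError("payload contains NUL but server lacks BINARY")
        return b"~{%d}" % size
    return b"{%d}" % size
```

With `skip_scan=True` the cost is independent of payload size, which is the saving described above for multi-megabyte APPENDs; the trade-off is that a server with a broken BINARY implementation is only discovered when the command fails.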

>> - As mentioned above, sending an initialization command to the server may
>> take quite a bit of work on the client side to prepare.  It's not as easy
>> as hardcoding 'ENABLE QRESYNC' in client code - it may take quite a bit of
>> CPU cycles to get to that point in a given client.
>>
>
> But this doesn't change anything about the having to do that.  Regardless
> of whether you're resuming a session or not, you still have to do that work.

Back to a client assumption.  It is *much* cheaper for us to resume  
our session than to reinitialize.  It may not be true with your  
implementation but that's irrelevant.

>> Another example: a client keeps all of its imap initialization code in a
>> separate dynamically-loadable module.  If the session is successfully
>> resumed, this module does not need to be loaded/interpreted/run.
>>
>
> What kind of clients are we talking about?  I'm just completely failing to
> think this is an issue.

Ours.  We keep initialization code in a completely separate class  
(PHP).  That class is never loaded if we don't need to re-initialize  
(this currently happens when using the current XIMAPPROXY feature  
discussed in the original thread e-mail).

>> - From the server side, it may be much more expensive to initiate an IMAP
>> session as compared with resuming one.  This draft allows the server to
>> optimize if possible.  I believe Timo's post indicates that resuming in
>> Dovecot is more efficient than creating a new session.
>>
>
> Resuming into the middle of a selected folder seems cheap.  Resuming the
> status of N commands seems  optimizing in the small.  I would think it
> would cost more in terms of either caching the data in memory or stashing
> the data to disk/reading it back than the overhead of parsing a couple
> commands and maybe allocating a data structure.

I am not a server author so I can't speak to this - one of the reasons  
I started this thread was to get feedback on just this issue.

However, I believe Timo has indicated that it is potentially a  
performance win (although he is discussing in a slightly different  
context):

http://markmail.org/message/qp45yod5ukqf3jfn

>> - Even when pipelining commands, they still need to be sent, the incoming
>> command needs to be tokenized (server), the command is performed (server),
>> the response sent back, any untagged responses are tokenized (client), the
>> untagged responses are interpreted (client), the tagged response is
>> tokenized (client), and the tagged response is processed (client).  None of
>> this is "free".  Pipelining eliminates none of this.
>
>
> Tokenization is what clients and servers do, and this takes a trivial
> amount of time and cpu in reasonable languages.

Yes and no.  The higher-level language you get, the less of a chance  
you get to optimize this.  PHP, which I am stuck with, will not be as  
efficient as C at doing these kinds of actions so I can only do so much  
to improve tokenization speed.  I'm not claiming that this is vastly  
going to improve performance, but you also can't argue that it can't  
hurt.

Not to mention that I have seen many poorly-written tokenizers doing  
things like using regexps to parse IMAP responses.  For these clients,  
any reduction in the number of commands processed is a much bigger win.

So for well-written IMAP clients containing highly tuned tokenizers, I  
would agree this advantage is of dubious value.  But well-written IMAP  
clients are probably in the minority.

>>  COMPRESS=DEFLATE
>
> And what does "resuming" a COMPRESS=DEFLATE do?  I assume we're not talking
> about trying to keep the old dictionary or anything like that, right?
>  We're actually talking about just the equivalent of starting it again.

I would suggest the proper behavior would be to re-start the  
compression behavior after the tagged command containing the resume  
response code.  Whether the dictionary should be retained from a  
previous session would be a decision entirely up to the implementer.

>  And finally, what IMAP servers support the LANGUAGE extension?
>>>
>>
>> Why does this matter?  RFC 5255 is a Standards Track extension.  A year
>> from now, every IMAP server and 200 new ones may support it.
>
>
> True.  But that also goes to my point that you need to specify which
> extensions and what information would need to be resumed, you can't just
> say "any which apply", what if the server author doesn't think the way you
> do and one of these doesn't get resumed?

You may have missed this in a previous e-mail response of mine to  
another commenter: I now agree that this appears to be an unfortunate  
necessity.

I was trying to keep the draft as lean as possible.  But discussion of  
how this command affects current extensions - see, e.g., Section 4 in  
the draft MOVE extension - is necessary to avoid ambiguities.

> But without an actual proof that this is useful for some combination of
> client/server, why would we adopt yet another extension that no one will
> implement?

Whatever I come up with here will be implemented by me in the  
imapproxy server, whether standardized or not.  Timo has indicated an  
interest to explore the idea further for implementation in Dovecot  
since he may be implementing something like this internally for other  
reasons.

Obviously our project would implement client side. I would assume the  
other large PHP-based open source webmail options would be interested.

As opposed to some other hella complicated extensions (I'm looking at  
you CONVERT), this proposal only adds two commands (one is really a  
simple extension of LOGOUT) and a response code.  Thus, this should be  
something that could be added without having to go in and modify too  
much existing code (It was an explicit design decision to not add the  
RESUME information to the authentication command; this would save a  
round-trip, but add significant complexity and implementation concerns.)

> Have you considered not re-establishing a connection every 10s?  This is a
> connected protocol, not http.

From a webmail perspective: if you could tell me how to maintain a  
consistent IMAP connection using nothing more than current IMAP  
commands and an out-of-the box HTTP server, I would be ecstatic.   
That's what users demand our software works with, so that's what we  
need to code for.

I don't have the UI information in front of me right now, but a user  
initiating an action every 10 seconds, at least when managing messages  
in a mailbox (loading a message to read, deleting, copying/moving,  
reporting as spam), seems like a reasonable estimate for discussion  
purposes.  That's where I got the 10 second value from.

Thanks again for the additional input and review.

michael
E-mail headers
From: slusarz@curecanti.org
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: 20121126173224.Horde.BbqbGly8D0JG4aqG7fxoMw1@bigworm.curecanti.org
Quoting Jan Kundrát <jkt@flaska.net>:

> On Wednesday, 21 November 2012 23:54:17 CEST, Michael M Slusarz wrote:
>> I would strongly disagree with this statement.  As written, the  
>> draft  is only minimally concerned with saving on network  
>> round-trips.
>
> That's quite different from what I've understood from your draft --  
> I'd suggest making the motivation clearer, then. But point  
> understood, and I've now purged the "let's save roundtrips" from my  
> understanding of the draft :). OK.

No need to purge the understanding - saving roundtrips remains a  
useful goal.  It's just not the primary motivating factor behind the  
proposal.

>> $result = $imap->useCompression(true);
>> // Check for success
>> $imap->useQresync(true);
>> // Check for success
>> $imap->setLanguage([LANGUAGE]);
>> // Check for success
>
> It is pretty obvious that if you use synchronous primitives for  
> enabling individual sub-features in a serialized fashion, your  
> performance will be limited by the round trip times. To put it more  
> bluntly, you cannot have code like the one shown above and expect a  
> good performance.
>
> Coming from that background, I see that it is tempting to replace  
> this endless row of synchronous calls, each enabling a single  
> optional feature, with a quick way to side-step this process by  
> quickly jumping into a pre-negotiated state where everything which  
> was enabled before is enabled now as well. However, my point is that  
> clients already exist proving that the same efficiency can be  
> achieved with the existing facilities. You're right that this  
> requires abolishing the serial, synchronized code, but IMAP is not  
> particularly friendly with synchronous APIs.

I realize that the API argument is not my strongest one.  It becomes  
less strong considering that, yes: you could do all this configuration  
in a single API call - i.e., when creating the IMAP interaction  
object, you configure everything in there.

I still maintain that writing an API that requires advanced knowledge  
of IMAP is not that useful.  Things like QRESYNC and LIST-STATUS can  
be entirely abstracted so a client coder does not need to know  
anything about them to take advantage of.

>> A client may, depending on the capabilities returned, need to  
>> perform  various internal initialization tasks.  For example - if   
>> CONDSTORE/QRESYNC is listed, a client may have to then parse a   
>> separate configuration file to grab the details of the local cache   
>> where it is storing this information, and then connect to this  
>> cache,  etc.
>
> So you want to keep the cache information (among other things)  
> inside some serialized client-side state storage. What prevents you  
> from simply checking the capabilities against the previously  
> recorded state and restoring the state when the capabilities match  
> exactly? You can do that now, without waiting for this extension.  
> Yes, it's ugly, but if your initialization is expensive...

Because there's still no guarantee it's the same server/connection:  
that is the key to all of this.  A server can "look" the same but that  
doesn't prove anything.

What happens when the server is upgraded and UTF-8 searching now  
works?  The CAPABILITY string is exactly the same.  But UTF-8 has been  
marked as a bad charset so it will still not be available.  And what  
about those commands that have been determined to be broken previously  
in the session?  It is reasonable to expect the CAPABILITY string to  
be the same between point releases of an IMAP server, but the server  
may have fixed the bug that was causing bad command behavior.

>> - Even when pipelining commands, they still need to be sent, the   
>> incoming command needs to be tokenized (server), the command is   
>> performed (server), the response sent back, any untagged responses  
>> are  tokenized (client), the untagged responses are interpreted  
>> (client),  the tagged response is tokenized (client), and the  
>> tagged response is  processed (client).  None of this is "free".   
>> Pipelining eliminates  none of this.
>
> Using the numbers you posted later on, we're speaking about parsing  
> roughly 600 bytes of a well-structured text. For me, it's hard to  
> believe that this has any measurable impact.

You are incorrect.

I went ahead and set up some rough/quick benchmarking using current  
imapproxy behavior as a proxy for the SUSPEND behavior.  In this  
benchmark, the server and client are on the same machine so network  
latency is assumed to be non-existent.  The load on this machine is  
also non-existent (this test is the only active IMAP process; disk I/O  
is negligible).

Login without resuming session (connecting to a Dovecot 2.1 server)

C: 1 LOGIN [login credentials]
S: 1 OK User logged in
C: 2 CAPABILITY
S: * CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE  
IDLE SORT SORT=DISPLAY THREAD=REFERENCES THREAD=REFS  
THREAD=ORDEREDSUBJECT MULTIAPPEND UNSELECT CHILDREN NAMESPACE UIDPLUS  
LIST-EXTENDED I18NLEVEL=1 CONDSTORE QRESYNC ESEARCH ESORT SEARCHRES  
WITHIN CONTEXT=SEARCH LIST-STATUS SPECIAL-USE ACL RIGHTS=texk
S: 2 OK Capability completed.
C: 3 ENABLE QRESYNC
S: * ENABLED QRESYNC
S: 3 OK Enabled.

Average elapsed time: 0.087 seconds

Login with resuming session:

C: 1 LOGIN [login credentials]
S: * OK [XPROXYREUSE] IMAP connection reused by squirrelmail-imap_proxy
S: 1 OK User logged in

Average elapsed time: 0.039 seconds

Difference: 0.048 seconds (~120% improvement)

120% improvement in a very common example.  And a reminder that this  
is WITHOUT any network latency; latency would only increase the actual  
real-time difference between the benchmarks.
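
A minimal sketch of the arithmetic behind these figures, using only the two averages quoted above. The reading that "~120%" expresses the saved time relative to the resumed login time is an assumption; the function and key names are illustrative only.

```python
def compare_logins(baseline_s: float, resumed_s: float) -> dict:
    """Express the gap between two average login times in three common ways."""
    saved = baseline_s - resumed_s
    return {
        "saved_seconds": saved,                    # absolute difference
        "time_reduction": saved / baseline_s,      # share of the baseline removed
        "speedup_factor": baseline_s / resumed_s,  # how many times faster
        "gain_over_resumed": saved / resumed_s,    # saved time relative to resumed time
    }

result = compare_logins(0.087, 0.039)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions the resumed login removes a little over half of the baseline time, which is the same measurement as being roughly 2.2 times faster.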

Caveats:
* imapproxy doesn't require you to provide the token before the auth  
command, so that is admittedly not accounted for here.
* However, this RESUME could be easily pipelined with the  
authentication command, so you are not adding a round-trip.
* Additionally, RESUME shouldn't result in much additional server  
load/processing since it is doing nothing more than storing the token  
in the server's memory - the server isn't going to process that token  
until the authentication is complete.
* The above example is being routed through an additional proxy server  
so there are small performance penalties there.
* Someone will probably say my code sucks, and that parsing shouldn't  
take that long.  That could very well be true.  But I will note that I  
am running this example on a totally unloaded IMAP server with a  
single user.  The reality is that most IMAP servers are not running on  
a box that has 0.00 load.

So the gains for this very simple example are significant - initial  
login is twice as fast.  A potential savings of 0.10 seconds on a  
given connection could easily be possible: there would easily be this  
much time savings given network latency from a mobile device, for  
example.  Given the old Amazon 100ms = 1% study, the theory behind  
SUSPEND needs to at least be discussed.

For fun, I also took a look at the performance gains between a  
COPY/STORE/EXPUNGE vs. MOVE command.  Here I saw ~30% improvement  
(0.13 seconds vs. 0.10 seconds).  Granted, MOVE is being implemented  
to allow for atomicity of the move action, but it is a good comparison.

>> I would argue that the language of the RFC still controls despite  
>> what  an e-mail on this list says.  A client shouldn't be punished  
>> for  interpreting it that way either.
>
> The RFC is a specification crafted by humans. It has errors, and all  
> subsequent revisions will still have errors. (See the errata for a  
> list of those which are known already.) If you choose to block and  
> not pipeline ENABLE QRESYNC and SELECT ... QRESYNC, you hurt your  
> users. (Also note that the clarification given on this list was by  
> the original authors of the RFC.)

Yes, but you cited to an e-mail message that said this should be the  
case.  I hardly feel an IMAP implementer is going to take someone's  
opinion in an email as canon.

If this shows up as an errata to RFC 5161, I would tend to agree with  
you.  But it doesn't at this point.

>>> As of the LANGUAGE -- how often do you expect to hit an error   
>>> condition which is not described by an appropriate response code?  
>>> I  don't think that blocking for its result would be a good design  
>>>  choice.
>>
>> That could be your decision as a client author.  I would vehemently  
>> disagree.
>>
>>> And finally, what IMAP servers support the LANGUAGE extension?
>>
>> Why does this matter?  RFC 5255 is a Standards Track extension.  A   
>> year from now, every IMAP server and 200 new ones may support it.
>
> I stand by my reasoning. In order for the block to be actually  
> useful, you'll have to talk to a server which:
>
> 1) actually implements LANGUAGE,
> 2) executes all commands in parallel OR has the LANGUAGE command  
> implemented in such a slow way that it enables parallel processing  
> for it,
> 3) returns a failure for one of the first commands which you send  
> *and* does not return an appropriate response code.
>
> But it's your client, do whatever you want to do :). I'm merely  
> saying that adding an extension driven by the desire to eliminate  
> issues like this is not something I support.

See benchmarks above.  LANGUAGE response is a more complex response  
than for ENABLE, so the floor of performance increase is 120%.

>> It would be impossible to determine benchmarks since there is no   
>> defined protocol yet.  And, as mentioned above, any given   
>> client/server interaction may provide different results based on  
>> their  own internal optimizations and extension support.
>
> Right. Well, based on how my client works, I don't expect any  
> significant performance gains obtained through this proposal.

Sure - just like IDLE is completely useless for disconnected clients.   
That doesn't make SUSPEND not very useful for at least some clients.

> I'm not the standards committee, but having decent numbers saying  
> "see, this RESUME extension cuts 40% out of the 1300ms required to  
> establish an IMAP session" is something which moves the discussion  
> from the current, very vague stage of "this is good -- nope, this is  
> worthless" into a stage where we can actually discuss what merits it  
> really brings. As you're proposing the extension, you should IMHO  
> provide these numbers.

A MOVE saves 30% performance off equivalent commands.  SUSPEND, at  
least for a simple example, saves 120%.  (And see below re: NOTIFY  
about something that CAN'T practically be done with current  
disconnected clients).

> 1) You don't take the initial CAPABILITY into account, but you  
> re-request CAPABILITY after login. (You need the initial capability  
> to see whether the server supports RESUME at all.) This will change  
> the numbers quite a lot.

What's the point of including benchmarking of the initial CAPABILITY?   
Both clients need to do this, so there is no difference - it is no  
more expensive for a SUSPEND client than a non-SUSPEND client.

And one of the reasons that I designed the RESUME command as I did is  
precisely to address the second part of your comment: the need to  
potentially send CAPABILITY pre-login.  From a client implementer's  
standpoint, it is quite likely that you DON'T need this CAPABILITY so  
that is an additional advantage.

Let's assume that your client program has previously connected to a  
given IMAP server and executed a successful SUSPEND command.  The next  
time it connects to the same IMAP server, it has no way of knowing  
whether that server is identical pre-authentication.  However:

1. Since the previously connected server supports the SUSPEND command,  
and it is very likely (although not guaranteed) that the server hasn't  
changed in the time since the client last connected, it can be assumed  
to a high degree of probability that the server supports SUSPEND.
2. A client using SUSPEND information will know which authentication  
method was successful the first time it connected to the server.   
Following the logic in #1, it can be assumed that the server continues  
to support this authentication method.
3. RESUME command doesn't output any response that needs to be parsed  
before authentication can occur.

If #1 happens to not be true, this is irrelevant - a client will just  
do normal initialization when resuming (the RESUME command would  
generate a BAD tagged response, but a client SHOULD ignore this).

If #2 is not true, a client would have sent 2 unnecessary commands but  
otherwise, no harm done.

Suppose #1 or #2 is an incorrect assumption in, say, 1 out of 100 connections  
(which is probably a tremendously conservative example: in a large  
webmail installation with 10,000+ concurrent users, you are getting  
millions of connections a day on software that isn't being touched for  
several months).  Even at this rate, it still makes far more sense to  
make these assumptions and pay for an additional 2 round-trips 1% of  
the time.

So a client supporting RESUME will likely save ANOTHER entire  
round-trip, so the 100%+ gain listed above is again shown to be a  
conservative estimate.
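
The trade-off in the preceding paragraphs can be written as an expected value. This is a rough sketch under stated assumptions: a correct guess saves one round-trip (the pre-login CAPABILITY), a wrong guess costs two extra round-trips, and the 1-in-100 failure rate is the figure used above. The function name is hypothetical.

```python
def expected_round_trips_saved(p_wrong: float,
                               saved_when_right: float = 1.0,
                               extra_when_wrong: float = 2.0) -> float:
    """Average round-trips saved per connection when skipping pre-login CAPABILITY."""
    return (1.0 - p_wrong) * saved_when_right - p_wrong * extra_when_wrong

net = expected_round_trips_saved(0.01)
print(f"net round-trips saved per connection: {net:.2f}")

# Failure rate at which the optimistic guess stops paying off.
break_even = 1.0 / (1.0 + 2.0)
print(f"break-even failure rate: {break_even:.3f}")
```

With these inputs the guess would have to be wrong about one time in three before assuming SUSPEND support stopped being worthwhile.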

> 2) The sample token which Timo showed on the other list was way  
> longer than base64("state token") you use. Just saying.

Sure.  But as long as suspend tokens are not approaching 1000+ bytes,  
they should comfortably fit into an IP packet so this is irrelevant.

> 3) Saving 600 bytes of transmitted data per connection is noise  
> compared to what an actual session typically transfers.

A 50-100ms reduction in connection time is not noise.  Maybe it is for  
a single client connecting to a single server.  But it most certainly  
is not for large, distributed systems.  This kind of savings can be  
the difference between needing to add an additional server to the  
backend farm, which may cost a significant amount of money in  
hardware/installation/maintenance costs.

> 4) You could save even more bytes by converting IMAP to a binary  
> protocol. That possibility in itself is, however, no reason to do so.

I'm not looking to write IMAP 5. I'm looking at a relatively  
uncomplicated way to improve performance in IMAP 4.

> 5) You're taking an advantage of eliminating NAMESPACE, but so far  
> have ignored LIST and STATUS, even though a typical client will  
> need them as well. When the LIST responses come into account,  
> savings of 600 bytes starts looking more and more like noise -- not  
> mentioning the mailbox synchronization or data transfers.

Mailbox listing is a very touchy spot for disconnected clients.   
Historically, a disconnected client is pretty much stuck with listing  
the mailboxes once with the understanding that if another client  
changes the mailbox structure there's not much we can do about it  
without allowing a user to manually refresh the mailbox list (or  
possibly doing something like polling the mailbox list at a given time  
interval).

However, as Timo noted, SUSPEND potentially allows disconnected  
clients to take advantage of NOTIFY.  Which would be a gigantic gain.   
With the combination of the two, disconnected clients could  
potentially have the equivalent of QRESYNC for mailbox lists, which is  
a feature that doesn't currently exist.  No amount of pipelining is  
going to fix this.

Additionally, this behavior makes SUSPEND useful for connected clients  
if such client locally caches mailbox lists: a desktop client that  
opens a second or two faster due to the fact that LIST's don't need to  
be sent is a substantial UI improvement.

In other words, SUSPEND brings real-world performance improvements and  
provides multiple features that are not possible with current IMAP  
protocol/extensions.

Once again, thanks for the comments.

michael
E-mail headers
From: tss@iki.fi
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: E2597C33-4D66-4446-B7DD-08D4517A357E@iki.fi permalink / raw / eml / mbox
On 22.11.2012, at 3.36, Michael M Slusarz wrote:

> However, I believe Timo has indicated that it is potentially a performance win (although he is discussing in a slightly different context):

It is useful for Dovecot internally, so I'll implement the core functionality in any case. Whether it's useful to expose to IMAP clients is something I haven't spent much time thinking about yet. You said pipelining isn't an issue, so what's really the difference between:

C: a suspend
S: a OK [resume blah]

vs.

C: a cmd1
C: b cmd2
C: c cmd3
S: a ok 1
S: b ok 2
S: c ok 3

In both cases you send data to server, and you receive data from server. In pretty much all cases those replies will be exactly the same. You don't need to even parse it. Just remember the original replies and their byte count, and see if the same reply is received the next time you send the same commands.
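
The "remember the original replies and compare" idea could be sketched roughly as follows. This is only an illustration, not code from Dovecot or any client: the class name ReplyCache and its lookup method are made up, and the parse callback stands in for whatever response parser the client already has.

```python
class ReplyCache:
    """Remembers the raw reply to a fixed batch of commands and its parsed form."""

    def __init__(self):
        self._raw = None       # raw bytes of the last reply seen
        self._parsed = None    # result of parsing that reply
        self.parse_calls = 0   # how often the (expensive) parser actually ran

    def lookup(self, raw_reply: bytes, parse):
        # Compare the byte count first (cheap), then the bytes themselves.
        same = (
            self._raw is not None
            and len(raw_reply) == len(self._raw)
            and raw_reply == self._raw
        )
        if not same:
            # Reply changed (or first connection): parse and remember it.
            self.parse_calls += 1
            self._parsed = parse(raw_reply)
            self._raw = raw_reply
        return self._parsed
```

With this, the parser only runs when the server's reply to the same pipelined commands differs from the one seen on the previous connection.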
E-mail headers
From: blong@google.com
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: CABa8R6u62m0SkAoLa+K8MgQn8CXKxtv5Hnnt6O1muTZQnFqqoA@mail.gmail.com
On Wed, Nov 21, 2012 at 5:36 PM, Michael M Slusarz <slusarz@curecanti.org> wrote:

> Quoting Brandon Long <blong@google.com>:
>
>> Have you considered not re-establishing a connection every 10s?  This is a
>> connected protocol, not http.
>>
>
> From a webmail perspective: if you could tell me how to maintain a
> consistent IMAP connection using nothing more than current IMAP commands
> and an out-of-the box HTTP server, I would be ecstatic.  That's what users
> demand our software works with, so that's what we need to code for.
>
> I don't have the UI information in front of me right now, but a user
> initiating an action every 10 seconds, at least when managing messages in a
> mailbox (loading a message to read, deleting, copying/moving, reporting as
> spam), seems like a reasonable estimate for discussion purposes.  That's
> where I got the 10 second value from.
>

And this seems to imply, to me, that you'd be better off with the
resume/disconnected session stuff from p-imap/lemonade that I pointed to
before.

After all, you are likely to need to do at least a select of the same
folder for most of those new connections.

Brandon
E-mail headers
From: blong@google.com
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: CABa8R6vCJJ_oQqiMW2Ofqu1Se-hm+h8dYx7DN6qDVT4so2mDtg@mail.gmail.com
On Mon, Nov 26, 2012 at 4:32 PM, Michael M Slusarz <slusarz@curecanti.org> wrote:
>
>
> Because there's still no guarantee it's the same server/connection: that
> is the key to all of this.  A server can "look" the same but that doesn't
> prove anything.
>
> What happens when the server is upgraded and UTF-8 searching now works?
>  The CAPABILITY string is exactly the same.  But UTF-8 has been marked as a
> bad charset so it will still not be available.  And what about those
> commands that have been determined to be broken previously in the session?
>  It is reasonable to expect the CAPABILITY string to be the same between
> point releases of an IMAP server, but the server may have fixed the bug
> that was causing bad command behavior.


So, you deduce bugs in the server and then remember them... and this is
better than violating the ID spec?

Your adherence is commendable.

Brandon
E-mail headers
From: jkt@flaska.net
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: 441db95f-2452-4d0a-9688-ccb3df28fb3c@flaska.net
On Tuesday, 27 November 2012 01:32:24 CEST, Michael M Slusarz wrote:
> I still maintain that writing an API that requires advanced
> knowledge of IMAP is not that useful.  Things like QRESYNC and
> LIST-STATUS can be entirely abstracted so a client coder does
> not need to know anything about them to take advantage of.

Depends on who your "client coder" is. Yes, you can easily get your users a view of the mailbox and handle the whole IMAP complexity yourself, in your IMAP library -- I've done that and it works great. But it is important to realize that the user of your API is no longer an "IMAP client implementor" -- you are one, and you will have to deal with all of QRESYNC and LIST-STATUS inside your library which provides that nice IMAP-agnostic API to your users.

Clearly, a layer of abstraction is a good thing here.

> What happens when the server is upgraded and UTF-8 searching
> now works?  The CAPABILITY string is exactly the same.  But
> UTF-8 has been marked as a bad charset so it will still not be
> available.  And what about those commands that have been
> determined to be broken previously in the session?  It is
> reasonable to expect the CAPABILITY string to be the same
> between point releases of an IMAP server, but the server may
> have fixed the bug that was causing bad command behavior.

If I were in your situation, I would probably create a simple script which will test the functionality of your IMAP server and instruct your administrators to run it whenever they update their IMAP servers, or even make an infrastructure to execute the tests "once in a while". That way, you would not have to wait for servers to adopt RESUME.
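
A check script of the kind suggested above could look roughly like this sketch. It assumes Python's standard imaplib and a connection that is already logged in with a mailbox selected; the names utf8_search_works and run_checks are hypothetical and only serve the illustration.

```python
import imaplib


def utf8_search_works(conn) -> bool:
    """Return True if SEARCH with CHARSET UTF-8 completes with OK."""
    try:
        # imaplib's IMAP4.search(charset, *criteria) sends SEARCH CHARSET <charset> ...
        typ, _data = conn.search("UTF-8", "SUBJECT", "test")
    except imaplib.IMAP4.error:
        # imaplib raises IMAP4.error when the server answers BAD.
        return False
    return typ == "OK"


def run_checks(conn, checks):
    """Run each named check against the connection and collect the results."""
    return {name: check(conn) for name, check in checks.items()}
```

An administrator would run something like run_checks(conn, {"utf8_search": utf8_search_works}) after each server upgrade and feed the result into the client's configuration, instead of the client rediscovering breakage at runtime.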

With regards to the situation of talking to different server versions (thus exhibiting different bugs) behind the same DNS alias -- how often do you expect that to happen? Is that a configuration you have to support, and support efficiently, i.e. not falling back to the lowest common denominator among the servers' features?

> I went ahead and set up some rough/quick benchmarking using
> current imapproxy behavior as a proxy for the SUSPEND behavior.
> In this benchmark, the server and client are on the same
> machine so network latency is assumed to be non-existent.  The
> load on this machine is also non-existent (this test is the
> only active IMAP process; disk I/O is negligible).
>
> Login without resuming session (connecting to a Dovecot 2.1 server)
>
> C: 1 LOGIN [login credentials]
> S: 1 OK User logged in
> C: 2 CAPABILITY
> S: * CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID
> ENABLE IDLE SORT SORT=DISPLAY THREAD=REFERENCES THREAD=REFS
> THREAD=ORDEREDSUBJECT MULTIAPPEND UNSELECT CHILDREN NAMESPACE
> UIDPLUS LIST-EXTENDED I18NLEVEL=1 CONDSTORE QRESYNC ESEARCH
> ESORT SEARCHRES WITHIN CONTEXT=SEARCH LIST-STATUS SPECIAL-USE
> ACL RIGHTS=texk
> S: 2 OK Capability completed.
> C: 3 ENABLE QRESYNC
> S: * ENABLED QRESYNC
> S: 3 OK Enabled.
>
> Average elapsed time: 0.087 seconds
>
> Login with resuming session:
>
> C: 1 LOGIN [login credentials]
> S: * OK [XPROXYREUSE] IMAP connection reused by squirrelmail-imap_proxy
> S: 1 OK User logged in
>
> Average elapsed time: 0.039 seconds
>
> Difference: 0.048 seconds (~120% improvement)

Thanks for posting numbers. Now let's focus on possible issues with them. What are you measuring, exactly? Is that the total time your PHP script takes from the time it opens the TCP connection till parsing the last tagged OK? How much time is spent in your client's code which performs tokenization of the CAPABILITY response, i.e. one thing which you've already identified as a performance bottleneck? Can you make your CAPABILITY parsing faster? What about detailed traces showing *when* the time is really spent?

What happens when you put the imapproxy outside of the measurement setup and talk directly to your IMAP daemon?

I've done my own crude benchmark on my laptop with Dovecot 2.1.9 (Gentoo) using PAM (backed by /etc/shadow with some "pretty recent" hash) and the difference between running the following two commands:

a) time echo -en "2 LOGIN user password\r\n" | socat - TCP:localhost:imap

b) time echo -en "1 CAPABILITY\r\n2 LOGIN user password\r\n3 CAPABILITY\r\n4 ENABLE QRESYNC\r\n5 NAMESPACE\r\n" | socat - TCP:localhost:imap

...is indeed in the noise area -- for the first command which performs less work, bash's `time` builtin reports the following durations, in milliseconds:

a) [25, 82, 56, 78, 53, 74, 66, 52, 59, 53, 82, 74, 63, 61]

while for the other one, I get the following raw data:

b) [66, 62, 64, 59, 63, 63, 74, 65, 54, 37, 32, 40, 32, 53, 57, 33]

Which means that a's average is 62.7 ms with a standard deviation of 14.7, while in b's case, the average runtime was 53.4 ms with a standard deviation of 13.5. It's a long time since my stats class, but it looks like neither Dovecot nor the actual I/O performed over TCP is the bottleneck here.
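
For reference, the averages and deviations above can be reproduced from the raw samples with a few lines of Python. The variable and function names are chosen for this sketch; the quoted deviations match the population standard deviation (statistics.pstdev), which is what is used here.

```python
from statistics import mean, pstdev

# Raw `time` measurements in milliseconds, copied from the two runs above.
a_ms = [25, 82, 56, 78, 53, 74, 66, 52, 59, 53, 82, 74, 63, 61]
b_ms = [66, 62, 64, 59, 63, 63, 74, 65, 54, 37, 32, 40, 32, 53, 57, 33]


def summarize(samples):
    """Return (mean, population standard deviation), rounded to one decimal."""
    return round(mean(samples), 1), round(pstdev(samples), 1)


summary_a = summarize(a_ms)
summary_b = summarize(b_ms)
print("a:", summary_a)
print("b:", summary_b)
```

The two means differ by less than one standard deviation of either sample, which is what "in the noise area" refers to.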

Based on the above, I claim that when the following conditions are met:

1) one uses pipelining,
2) the client-side parser takes negligible time,

then the proposed extension will not save any measurable time.

Now, 1) is possible -- these commands can be pipelined. What about 2)? I took the liberty of adding this particular benchmark to my client's test suite [2]; the parsing takes 0.13 ms (that is 130 µs, not 0.13 s) when run on my laptop. The parser is pretty high-level C++ code using Qt's QByteArray with no optimization whatsoever aimed at reducing excess copying and whatnot. I'm sure that it can be optimized to take a fraction of that time; it's just that I cannot be bothered to optimize something which is not an issue. The actual data I'm parsing are visible in the test suite and represent real-world output from Dovecot here.
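
A comparable measurement can be sketched in Python. This is not the parser from the test suite in [2]; parse_capability is a deliberately naive stand-in written for this example, fed with the CAPABILITY line from the quoted Dovecot 2.1 trace.

```python
import timeit

# CAPABILITY response taken from the quoted benchmark trace.
CAPABILITY_LINE = (
    b"* CAPABILITY IMAP4rev1 LITERAL+ SASL-IR LOGIN-REFERRALS ID ENABLE IDLE "
    b"SORT SORT=DISPLAY THREAD=REFERENCES THREAD=REFS THREAD=ORDEREDSUBJECT "
    b"MULTIAPPEND UNSELECT CHILDREN NAMESPACE UIDPLUS LIST-EXTENDED "
    b"I18NLEVEL=1 CONDSTORE QRESYNC ESEARCH ESORT SEARCHRES WITHIN "
    b"CONTEXT=SEARCH LIST-STATUS SPECIAL-USE ACL RIGHTS=texk\r\n"
)


def parse_capability(line: bytes) -> frozenset:
    """Split an untagged CAPABILITY response into a set of capability atoms."""
    parts = line.strip().split()
    if parts[:2] != [b"*", b"CAPABILITY"]:
        raise ValueError("not an untagged CAPABILITY response")
    return frozenset(p.decode("ascii") for p in parts[2:])


caps = parse_capability(CAPABILITY_LINE)

# Average wall-clock time per parse over many repetitions.
runs = 10_000
per_call = timeit.timeit(lambda: parse_capability(CAPABILITY_LINE), number=runs) / runs
print(f"{per_call * 1e6:.1f} microseconds per parse")
```

The absolute figure depends on the machine and interpreter; the point is only to compare the parsing cost against the tens of milliseconds measured for a whole login.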

> Yes, but you cited to an e-mail message that said this should
> be the case.  I hardly feel an IMAP implementer is going to
> take someone's opinion in an email as canon.
>
> If this shows up as an errata to RFC 5161, I would tend to
> agree with you.  But it doesn't at this point.

Errata #1365 [1], "held for document update", submitted in March 2008. I have a new draft version of RFC5162-bis in my INBOX and I'll make sure this gets in if it isn't there already.

>> Right. Well, based on how my client works, I don't expect any
>> significant performance gains obtained through this proposal.
>
> Sure - just like IDLE is completely useless for disconnected
> clients.  That doesn't make SUSPEND not very useful for at
> least some clients.

(To clarify, a client which does not keep its connection active is not usually called a "disconnected client", AFAIK. That is usually meant to identify clients which often work without the network connection, but will happily use it when it's available.)

What I'm saying here is that I suspect that your expectation of performance savings is based on your particular client's implementation details which make it inefficient when talking to current IMAP servers. As an example, let's take the CAPABILITY parsing/handling.

Your first option is to suggest a replacement which eliminates it more or less altogether. My first option is to make your CAPABILITY handling fast enough so that RESUME is not needed.

> A MOVE saves 30% performance off equivalent commands.  SUSPEND,
> at least for a simple example, saves 120%.  (And see below re:
> NOTIFY about something that CAN'T practically be done with
> current disconnected clients).

I have two problems with this:

1) The motivation behind MOVE was not to improve performance. In addition, the 30% figure quoted in your mail does not come with any measurement results, doesn't clarify whether you used pipelining or not, and does not mention where that time was actually spent.

2) While I would welcome an extension of NOTIFY to notify me about events which have happened while I was offline, please note that there's nothing in your RESUME draft *and* the NOTIFY RFC actually mandating the server to remember the events since the last time. Remember, "will send updates as configured previously" is very different from "will do the same and also send updates on what has happened since that time".

>> 1) You don't take the initial CAPABILITY into account, but you
>> re-request CAPABILITY after login. (You need the initial
>> capability to see whether the server supports RESUME at all.)
>> This will change the numbers quite a lot.
>
> What's the point of including benchmarking of the initial
> CAPABILITY?  Both clients need to do this, so there is no
> difference - it is no more expensive for a SUSPEND client than
> a non-SUSPEND client.

If you are citing "improvement in speed by 30%", you have to base this 30% on something. The usual approach is to base it on the total duration.
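
To make the choice of base concrete, here is the arithmetic for the two login timings quoted earlier (0.087 s without resuming, 0.039 s with). The variable names are made up for this sketch; only the two input figures come from the thread.

```python
fresh_s = 0.087    # average login time without resuming (from the quoted benchmark)
resumed_s = 0.039  # average login time with a resumed session

saved_s = fresh_s - resumed_s

# Based on the total (original) duration: what share of the time went away?
time_reduction_pct = saved_s / fresh_s * 100

# Based on the faster run: how many more logins fit into the same time?
speedup_pct = (fresh_s / resumed_s - 1) * 100

print(f"saved {saved_s:.3f} s per login")
print(f"time reduction: {time_reduction_pct:.0f}% of the original duration")
print(f"throughput gain: {speedup_pct:.0f}%")
```

A figure around 120% only comes out when the saving is expressed relative to the resumed run; relative to the original total duration the same 0.048 s is roughly a 55% reduction.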

> And one of the reasons that I designed the RESUME command as I
> did is precisely to address the second part of your comment:
> the need to potentially send CAPABILITY pre-login.  From a
> client implementer's standpoint, it is quite likely that you
> DON'T need this CAPABILITY so that is an additional advantage.

If you want to be strict, you need to parse CAPABILITY at least once per connection to know that you can actually send RESUME.

> Let's assume that your client program has previously connected
> to a given IMAP server and executed a successful SUSPEND
> command.  The next time it connects to the same IMAP server, it
> has no way of knowing whether that server is identical
> pre-authentication.  However:
>
> 1. Since the previously connected server supports the SUSPEND
> command, and it is very likely (although not guaranteed) that
> the server hasn't changed in the time since the client last
> connected, it can be assumed to a high degree of probability
> that the server supports SUSPEND.

Following that reasoning, you can easily "blindly" send ENABLE QRESYNC and LANGUAGE as well. Please be consistent -- either you allow that for none of (RESUME, ENABLE ..., LANGUAGE), or you allow that for all these.

[...]

> So a client supporting RESUME will likely save ANOTHER entire
> round-trip, so the 100%+ gain listed above is again shown to be
> a conservative estimate.

If you're blindly sending RESUME, you can do the same with any other command shown so far (and including AUTHENTICATE). The risks are always the same (except what, maybe a kilobyte of wasted bandwidth in situations where "something has changed"? Who cares?)
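
"Blindly" sending a batch of commands and sorting out afterwards which ones the server accepted could be sketched like this. The helper names build_pipeline and tagged_status are hypothetical, and the sketch ignores literals and continuation requests, which a real client would have to handle.

```python
def build_pipeline(commands):
    """Tag each command and join them into one payload for a single write."""
    tagged = [(f"a{i}", cmd) for i, cmd in enumerate(commands, start=1)]
    payload = "".join(f"{tag} {cmd}\r\n" for tag, cmd in tagged).encode("ascii")
    return tagged, payload


def tagged_status(reply: bytes, tags):
    """Map each tag to the status word (OK, NO or BAD) of its completion line."""
    wanted = set(tags)
    status = {}
    for line in reply.decode("ascii", "replace").splitlines():
        parts = line.split(" ", 2)
        # Untagged lines start with "*" and are skipped here.
        if len(parts) >= 2 and parts[0] in wanted:
            status[parts[0]] = parts[1].upper()
    return status
```

If a command that was sent on spec comes back NO or BAD, the client drops the cached assumption and falls back to asking for CAPABILITY; the only cost is the wasted bytes.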

>> 3) Saving 600 bytes of transmitted data per connection is
>> noise compared to what an actual session typically transfers.
>
> A 50-100ms reduction in connection time is not noise.

We have not established yet that these 50-100ms cannot be addressed by improving your client's code.

> Mailbox listing is a very touchy spot for disconnected clients.
> Historically, a disconnected client is pretty much stuck with
> listing the mailboxes once with the understanding that if
> another client changes the mailbox structure there's not much
> we can do about it without allowing a user to manually refresh
> the mailbox list (or possibly doing something like polling the
> mailbox list at a given time interval).
>
> However, as Timo noted, SUSPEND potentially allows disconnected
> clients to take advantage of NOTIFY.

I've mentioned it above -- there's nothing in the provided draft which makes it possible to use NOTIFY across sessions. Yep, I agree that it would be cool if it was possible *and* if someone actually implemented NOTIFY. But that's orthogonal to SUSPEND/RESUME.

> Additionally, this behavior makes SUSPEND useful for connected
> clients if such client locally caches mailbox lists: a desktop
> client that opens a second or two faster due to the fact that
> LIST's don't need to be sent is a substantial UI improvement.
>
> In other words, SUSPEND brings real-world performance
> improvements and provides multiple features that are not
> possible with current IMAP protocol/extensions.

You've lost me here -- surely this benefit depends on a yet-unwritten extension to NOTIFY which makes it work across sessions, right?

With kind regards,
Jan

[1] http://www.rfc-editor.org/errata_search.php?eid=1365
[2] http://commits.kde.org/trojita/92ae247ab69121fc3e8c886fe8c0e2da3e1740f7
E-mail headers
From: slusarz@curecanti.org
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: 20121126152005.Horde.UGsoLB91esWQF9_aGi6UYQ1@bigworm.curecanti.org
Quoting Brandon Long <blong@google.com>:

> On Wed, Nov 21, 2012 at 5:36 PM, Michael M Slusarz  
> <slusarz@curecanti.org>wrote:
>
>> Quoting Brandon Long <blong@google.com>:
>>
>>> Have you considered not re-establishing a connection every 10s?  This is a
>>> connected protocol, not http.
>>>
>>
>> From a webmail perspective: if you could tell me how to maintain a
>> consistent IMAP connection using nothing more than current IMAP commands
>> and an out-of-the box HTTP server, I would be ecstatic.  That's what users
>> demand our software works with, so that's what we need to code for.
>>
>> I don't have the UI information in front of me right now, but a user
>> initiating an action every 10 seconds, at least when managing messages in a
>> mailbox (loading a message to read, deleting, copying/moving, reporting as
>> spam), seems like a reasonable estimate for discussion purposes.  That's
>> where I got the 10 second value from.
>>
>
> And this seems to imply, to me, that you'd be better off with the
> resume/disconnected session stuff from p-imap/lemonade that I pointed to
> before.
>
> After all, you are likely to need to do at least a select of the same
> folder for most of those new connections.

This is probably getting too client implementation specific... but in  
our dynamic client a large number of browser requests (XMLHttpRequest)  
are polling requests.  No need to select a mailbox.  So any solution  
that REQUIREs a selection of mailbox on resuming is worthless in a  
practical sense.

michael
E-mail headers
From: slusarz@curecanti.org
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: 20121126184814.Horde.pVkfF8iCAZNxKh-rRJmQ2g1@bigworm.curecanti.org
Quoting Brandon Long <blong@google.com>:

> On Mon, Nov 26, 2012 at 4:32 PM, Michael M Slusarz  
> <slusarz@curecanti.org>wrote:
>>
>>
>> Because there's still no guarantee it's the same server/connection: that
>> is the key to all of this.  A server can "look" the same but that doesn't
>> prove anything.
>>
>> What happens when the server is upgraded and UTF-8 searching now works?
>>  The CAPABILITY string is exactly the same.  But UTF-8 has been marked as a
>> bad charset so it will still not be available.  And what about those
>> commands that have been determined to be broken previously in the session?
>>  It is reasonable to expect the CAPABILITY string to be the same between
>> point releases of an IMAP server, but the server may have fixed the bug
>> that was causing bad command behavior.
>
> So, you deduce bugs in the server and then remember them... and this is
> better than violating the ID spec?

Yes.  In fact we are quite proud of this.  100% foolproof and works on  
every server past, present, and future.

Not sure how you are supposed to do this on IMAP servers that don't  
support ID.  Or don't send version information: Dovecot, for one,  
doesn't by default.  Or for installations that use a version of your  
software released before a particular IMAP server even breaks (full  
disclosure: the recent Cyrus break is weird because it sent a BYE and  
terminated instead of a failed command, so that wasn't previously  
scanned for, so we weren't catching this until recently; so we're not  
perfect either).

IMAP command sniffing = javascript browser sniffing.  The days of  
parsing a browser's User-Agent field are so 1999.

michael
E-mail headers
From: blong@google.com
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: CABa8R6tGF8Uj-MEec6vdHrhkYefhiU0QLGtVXafDwOXb3vnFmQ@mail.gmail.com
By polling, I assume you mean for calling STATUS?

And my point would be that it wouldn't cost anything on the server.

Imagine, instead of a "special case" resume, that we're instead treating
this as a client that often re-connects.  You'd treat IMAP the same way a
normal "connected" client would, and the server would just keep your
connection in the same connection state, possibly holding any results it
would send you.  You would just "reconnect/resume" and then issue the next
command.  You'd be immediately back in the selected state, but there
wouldn't be any cost associated with it (from the client side) except
perhaps having to update client state if the server gives you updates, but
that just results in faster updates of server state.

You could even mimic Outlook by having a separate "virtual connection" that
handles STATUS calls, and one "virtual connection" for actual folder
actions.

This is essentially trying to turn IMAP into a more HTTP like protocol.

It is more expensive for a server to offer this, though as long as the
client has to "request" a session, it would actually be cheaper for the
server to maintain that state than re-loading it between connections.

Brandon


On Mon, Nov 26, 2012 at 2:20 PM, Michael M Slusarz <slusarz@curecanti.org> wrote:

> Quoting Brandon Long <blong@google.com>:
>
>> On Wed, Nov 21, 2012 at 5:36 PM, Michael M Slusarz
>> <slusarz@curecanti.org> wrote:
>>
>>> Quoting Brandon Long <blong@google.com>:
>>>
>>>> Have you considered not re-establishing a connection every 10s?  This is a
>>>> connected protocol, not http.
>>>
>>> From a webmail perspective: if you could tell me how to maintain a
>>> consistent IMAP connection using nothing more than current IMAP commands
>>> and an out-of-the box HTTP server, I would be ecstatic.  That's what users
>>> demand our software works with, so that's what we need to code for.
>>>
>>> I don't have the UI information in front of me right now, but a user
>>> initiating an action every 10 seconds, at least when managing messages in a
>>> mailbox (loading a message to read, deleting, copying/moving, reporting as
>>> spam), seems like a reasonable estimate for discussion purposes.  That's
>>> where I got the 10 second value from.
>>
>> And this seems to imply, to me, that you'd be better off with the
>> resume/disconnected session stuff from p-imap/lemonade that I pointed to
>> before.
>>
>> After all, you are likely to need to do at least a select of the same
>> folder for most of those new connections.
>
> This is probably getting too client implementation specific... but in our
> dynamic client a large number of browser requests (XMLHttpRequest) are
> polling requests.  No need to select a mailbox.  So any solution that
> REQUIREs a selection of mailbox on resuming is worthless in a practical
> sense.
>
> michael
E-mail headers
From: brong@fastmail.fm
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: 1354010571.12520.140661158658949.22643EF6@webmail.messagingengine.com
On Tue, Nov 27, 2012, at 02:48 AM, Michael M Slusarz wrote:
> Quoting Brandon Long <blong@google.com>:
> Not sure how you are supposed to do this on IMAP servers that don't  
> support ID.  Or don't send version information: Dovecot, for one,  
> doesn't by default.  Or for installations that use a version of your  
> software released before a particular IMAP server even breaks (full  
> disclosure: the recent Cyrus break is weird because it sent a BYE and  
> terminated instead of a failed command, so that wasn't previously  
> scanned for so we weren't catching this until recently. so we're not  
> perfect either.).

I'm kinda embarrassed about this one!  I forgot to put the GUID calculation
call in the APPEND BINARY path, because nothing ever tested it and it
appeared nobody was using it, because we didn't have a single complaint or
failure with it until just a few weeks ago.

> IMAP command sniffing = javascript browser sniffing.  The days of  
> parsing a browser's User-Agent field are so 1999.

Version 9.x (like something) (like somethingelse) ...

Yeah, sniffing is fun.

The Cyrus XFER command now does version sniffing to work out which version
of cyrus.index the remote end supports, so it can downgrade the files before
transfer.  Kinda messy, but a whole lot better than not working at all!

Bron.
-- 
  Bron Gondwana
  brong@fastmail.fm
E-mail headers
From: Pidgeot18@verizon.net
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: 50B508E3.3050209@verizon.net
On 11/26/2012 7:48 PM, Michael M Slusarz wrote:
> IMAP command sniffing = javascript browser sniffing. The days of 
> parsing a browser's User-Agent field are so 1999.

As someone who has seen both ends of the User-Agent debate (as a web 
developer and as a browser implementer), I can honestly say that parsing 
User-Agents is actually a very powerful and useful technique, and even 
browser vendors won't have a problem with it... if you do it properly. I 
can also attest that User-Agent parsing happens to a very large degree 
in the modern web (Google, of all people, does it. And incorrectly, the 
last time I checked): try changing your desktop's browser UA to that of 
a mobile browser and notice how different the web looks. Not to mention 
all the feedback you see on things like the WHATWG mailing list 
complaining that the User-Agent doesn't send screen resolution...

-- 
Beware of bugs in the above code; I have only proved it correct, not tried it. -- Donald E. Knuth
E-mail headers
From: slusarz@curecanti.org
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: 20121126155721.Horde.a3j2EWFDFOo-xkaflrURBQ2@bigworm.curecanti.org
Quoting Brandon Long <blong@google.com>:

> By polling, I assume you mean for calling STATUS?

Well hopefully LIST-STATUS exists on the server, because that makes a  
huge performance difference. (For some reason, a large number of  
users will whine and complain incessantly unless the ability to poll  
all mailboxes is enabled. Why anybody wants to poll mailboxes other  
than those in which messages are being delivered is beyond me. But  
this is a feature that users demand, so we have to support it  
unfortunately.)
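
As an illustration of why LIST-STATUS matters for polling: with the extension (RFC 5819) one extended LIST returns a STATUS line per mailbox, instead of one STATUS round trip each. The sketch below shows such a command and a simplified parser for the untagged STATUS lines; parse_status_lines is a made-up helper that does not handle literals or unescape quoted names.

```python
import re

# One command for all mailboxes, in RFC 5819 syntax.
LIST_STATUS_COMMAND = 'LIST "" "*" RETURN (STATUS (MESSAGES UNSEEN))'

# Matches: * STATUS <mailbox> (<ITEM> <number> ...), mailbox quoted or atom.
_STATUS_RE = re.compile(r'^\* STATUS (?:"((?:[^"\\]|\\.)*)"|(\S+)) \(([^)]*)\)\s*$')


def parse_status_lines(lines):
    """Collect {mailbox: {ITEM: count}} from the untagged STATUS responses."""
    result = {}
    for line in lines:
        m = _STATUS_RE.match(line)
        if not m:
            continue  # LIST lines and the tagged completion are ignored
        name = m.group(1) if m.group(1) is not None else m.group(2)
        tokens = m.group(3).split()
        result[name] = {k.upper(): int(v) for k, v in zip(tokens[0::2], tokens[1::2])}
    return result
```

Fed with the lines of a single LIST response, this yields the counters for every mailbox the server reported on, which is the data a "poll all mailboxes" feature needs.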

> Imagine, instead of a "special case" resume, that we're instead treating
> this as a client that often re-connects.  You'd treat IMAP the same way a
> normal "connected" client would, and the server would just keep your
> connection in the same connection state, possibly holding any results it
> would send you.  You would just "reconnect/resume" and then issue the next
> command.  You'd be immediately back in the selected state, but there
> wouldn't be any cost associated with it (from the client side) except
> perhaps having to update client state if the server gives you updates, but
> that just results in faster updates of server state.

This would be great.  But implementation of this would be orders of  
magnitude more difficult than the simpler SUSPEND case and, at  
least in part, would be duplicating behavior of QRESYNC.

> You could even mimic Outlook by having a separate "virtual connection" that
> handles STATUS calls, and one "virtual connection" for actual folder
> actions.

This doesn't help for disconnected clients though.

> This is essentially trying to turn IMAP into a more HTTP like protocol.

Not necessarily a bad thing. It would be fantastic if there was an  
option to send "quick" commands that look like:

AUTHENTICATE user password FETCH <imap url>

But such a drastic change is no longer IMAP 4.  So not really worth  
discussing on this list.

> It is more expensive for a server to offer this, though as long as the
> client has to "request" a session, it would actually be cheaper for the
> server to maintain that state than re-loading it between connections.

What I get out of this is that you think QRESYNC was the result of an  
incorrect design decision and, instead, the proposal you previously  
linked to in this thread should have won out.

michael
E-mail headers
From: slusarz@curecanti.org
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: 20121127124811.Horde.VuMexjqX6lKpL2uX-qPgPw2@bigworm.curecanti.org
Quoting Bron Gondwana <brong@fastmail.fm>:

> On Tue, Nov 27, 2012, at 02:48 AM, Michael M Slusarz wrote:
>> (full
>> disclosure: the recent Cyrus break is weird because it sent a BYE and
>> terminated instead of a failed command, so that wasn't previously
>> scanned for so we weren't catching this until recently. so we're not
>> perfect either.).
>
> I'm kinda embarrassed about this one!  I forgot to put the GUID calculation
> call in the APPEND BINARY path, because nothing ever tested it and it
> appeared nobody was using it, because we didn't have a single complaint or
> failure with it until just a few weeks ago.

There's probably some deeper (sad) IMAP implementation story to be  
learned here.  RFC 3516 was published April 2003, and 9 years later  
we're the first one to ever use literal8's in an APPEND?

michael
E-mail headers
From: tss@iki.fi
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: A44C6C96-FA16-4C69-A5D2-030C253D716E@iki.fi
On 27.11.2012, at 12.02, Bron Gondwana wrote:

> On Tue, Nov 27, 2012, at 02:48 AM, Michael M Slusarz wrote:
>> Quoting Brandon Long <blong@google.com>:
>> Not sure how you are supposed to do this on IMAP servers that don't  
>> support ID.  Or don't send version information: Dovecot, for one,  
>> doesn't by default.  Or for installations that use a version of your  
>> software released before a particular IMAP server even breaks (full  
>> disclosure: the recent Cyrus break is weird because it sent a BYE and  
>> terminated instead of a failed command, so that wasn't previously  
>> scanned for so we weren't catching this until recently. so we're not  
>> perfect either.).
> 
> I'm kinda embarrassed about this one!  I forgot to put the GUID calculation
> call in the APPEND BINARY path, because nothing ever tested it and it
> appeared nobody was using it, because we didn't have a single complaint or
> failure with it until just a few weeks ago.

BTW, I've added tests for this and several other new extensions to imaptest. I haven't tried running them against anything other than Dovecot so far. The biggest missing test is NOTIFY.
E-mail headers
From: slusarz@curecanti.org
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: 20121127160455.Horde.mDRBAWWbJkueCNI0adKoQQ5@bigworm.curecanti.org
Quoting Joshua Cranmer <Pidgeot18@verizon.net>:

> On 11/26/2012 7:48 PM, Michael M Slusarz wrote:
>> IMAP command sniffing = javascript browser sniffing. The days of  
>> parsing a browser's User-Agent field are so 1999.
>
> As someone who has seen both ends of the User-Agent debate (as a web  
> developer and as a browser implementer), I can honestly say that  
> parsing User-Agents is actually a very powerful and useful  
> technique, and even browser vendors won't have a problem with it...  
> if you do it properly.

I don't have a problem with browser sniffing, per se.  A web  
framework, for example, only has the User-Agent string to go by, so by  
necessity it needs to use that.

The problem comes when other agents, or proxies, "steal" a user-agent  
string from another product in an effort to provide compatibility.   
For something more abstract, like mobile detection, this shouldn't be  
an issue.  But if you are trying to workaround a much more granular  
issue - does this browser support transparent PNGs? - problems arise.

So, IMHO, when you have access to the actual environment you are  
trying to test (i.e. DOM for javascript) there's no real excuse to not  
do feature sniffing properly.  It's not quite as clean in IMAP -  
sending MBs of data to an APPEND command only to find out it failed -  
but it is reliable.

michael
E-mail headers
From: tss@iki.fi
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: 56138D73-A10B-4D88-A641-4F97DEA6B340@iki.fi
On 27.11.2012, at 0.57, Michael M Slusarz wrote:

>> By polling, I assume you mean for calling STATUS?
> 
> Well hopefully LIST-STATUS exists on the server, because that makes a huge performance difference. (For some reason, a large number of users will whine and complain incessantly unless the ability to poll all mailboxes is enabled. Why anybody wants to poll mailboxes other than those in which messages are being delivered is beyond me. But this is a feature that users demand, so we have to support it unfortunately.)

If the server supports both NOTIFY and SUSPEND, you wouldn't need to do such polling. The server just sends the updates when restoring.
E-mail headers
From: blong@google.com
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: CABa8R6tOshnrGPGCdb0QToAx8xjWVWY+RBu0Uw8cZ2NH3=TvZw@mail.gmail.com
On Mon, Nov 26, 2012 at 2:57 PM, Michael M Slusarz <slusarz@curecanti.org> wrote:

> Quoting Brandon Long <blong@google.com>:
>
>> By polling, I assume you mean for calling STATUS?
>
> Well hopefully LIST-STATUS exists on the server, because that makes a huge
> performance difference. (For some reason, a large number of users will
> whine and complain incessantly unless the ability to poll all mailboxes is
> enabled. Why anybody wants to poll mailboxes other than those in which
> messages are being delivered is beyond me. But this is a feature that
> users demand, so we have to support it unfortunately.)


sure.


>> Imagine, instead of a "special case" resume, that we're instead treating
>> this as a client that often re-connects.  You'd treat IMAP the same way a
>> normal "connected" client would, and the server would just keep your
>> connection in the same connection state, possibly holding any results it
>> would send you.  You would just "reconnect/resume" and then issue the next
>> command.  You'd be immediately back in the selected state, but there
>> wouldn't be any cost associated with it (from the client side) except
>> perhaps having to update client state if the server gives you updates, but
>> that just results in faster updates of server state.
>
> This would be great.  But implementation of this would be orders of
> magnitude more difficult than the simpler SUSPEND case and, at least in
> part, would be duplicating behavior of QRESYNC.


More difficult than your SUSPEND, sure.  Easier than QRESYNC.  I don't
really see the duplication, however.

>> It is more expensive for a server to offer this, though as long as the
>> client has to "request" a session, it would actually be cheaper for the
>> server to maintain that state than re-loading it between connections.
>
> What I get out of this is that you think QRESYNC was the result of an
> incorrect design decision and, instead, the proposal you previously linked
> to in this thread should have won out.


I wasn't on the list at that time, so I have no idea what the debate looked
like.  I view them as different, however.  QRESYNC requires little extra
storage and can be persisted for any length of time.  I find it odd that it
mentions "mobile, frequently disconnected" as it's still a fairly extensive
negotiation between client and server.  It's certainly a win for any client
when re-connecting, regardless of how long ago that was, and fills a hole
in the CONDSTORE case.  So no, I don't think QRESYNC is an incorrect design
decision.  It's possible a "not imap" protocol could conceive of a better
way, in syntax or otherwise, of passing the needed information to re-sync
the client and server, but QRESYNC has to fit the task given it.

Also, for the mobile case, it's probably more of a case of "uncontrolled"
connection closing, which isn't as useful a case, because you don't know
which commands "finished" executing on the server and which didn't make it.

I view the advanced suspend/resume as more of a "client always
disconnects" case.  Mobile clients might be able to use it by frequently
disconnecting, depending on the exact nature of the disruptions that occur;
I'm not enough of a mobile expert to know.

I'd view this as more useful in the "likely to re-connect in ~1 minute"
case; the server probably wouldn't hold onto the data longer.  It would be
implemented by separating the connection data out of the connection so that
it can be kept longer than the connection and re-connected to, and then a
task to remove old ones after some timeout.
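
A minimal sketch of that idea, assuming a server written in Python: the per-session state lives in a store keyed by a resume token, outlives the TCP connection, and is reaped after a timeout. The class names, the token, the field names and the 60-second default are all hypothetical; nothing here comes from an existing server or from a published extension.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class SessionState:
    """State that outlives a single TCP connection (all fields hypothetical)."""
    user: str
    selected_mailbox: Optional[str] = None
    pending_updates: List[str] = field(default_factory=list)
    detached_at: Optional[float] = None  # None while a connection is attached


class SessionStore:
    """Keeps detached sessions for a short while so a client can resume them."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._sessions: Dict[str, SessionState] = {}

    def detach(self, token: str, state: SessionState) -> None:
        """Client disconnected: keep the state and start the expiry timer."""
        state.detached_at = self._clock()
        self._sessions[token] = state

    def resume(self, token: str) -> Optional[SessionState]:
        """Re-attach a saved session; None if it is unknown or has expired."""
        state = self._sessions.pop(token, None)
        if state is None or self._clock() - state.detached_at > self._ttl:
            return None
        state.detached_at = None
        return state

    def reap(self) -> int:
        """Periodic task: drop sessions detached for longer than the TTL."""
        now = self._clock()
        expired = [t for t, s in self._sessions.items()
                   if now - s.detached_at > self._ttl]
        for token in expired:
            del self._sessions[token]
        return len(expired)
```

A client that comes back within the TTL gets its selected mailbox and any held updates from resume(); anything older is gone and the client has to start from a normal login.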

I can't speak to which one would be more likely to be adopted; both seem
rather specific to a certain use case.  For instance, your entire use case
would be avoided if you had a persistent server to maintain your
connections for you, instead of being a CGI.

Brandon
E-mail headers
From: jkt@flaska.net
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: d087ec71-e6bc-47f0-9c49-222444681590@flaska.net
On Monday, 26 November 2012 23:57:21 CEST, Michael M Slusarz wrote:
>> This is essentially trying to turn IMAP into a more HTTP like protocol.
>
> Not necessarily a bad thing. It would be fantastic if there was 
> an  option to send "quick" commands that look like:
>
> AUTHENTICATE user password FETCH <imap url>

Just saying -- you could pipeline LOGIN/AUTHENTICATE, SELECT and FETCH already. The only downside is that you have wrong data in your socket's read queue iff the UIDVALIDITY has changed. And yes, the overhead of the initial capability checking, TLS negotiation and connection establishment is still there, of course.
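
A rough sketch of what such a pipelined request could look like from the client side, in Python. The tags, the helper names and the FETCH items are made up for illustration, and the quoting helper only handles plain ASCII without CR/LF. The second function covers the caveat above: the FETCH data is only usable if the UIDVALIDITY reported by SELECT matches the one the client cached.

```python
import re
from typing import List


def quote(s: str) -> str:
    """Quote a value as an IMAP quoted string (ASCII only, no CR/LF)."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_pipeline(user: str, password: str, mailbox: str, uid: int) -> bytes:
    """Build LOGIN, SELECT and UID FETCH as one write, sent without waiting."""
    commands = [
        f"a1 LOGIN {quote(user)} {quote(password)}",
        f"a2 SELECT {quote(mailbox)}",
        f"a3 UID FETCH {uid} (FLAGS BODY.PEEK[])",
    ]
    return "".join(c + "\r\n" for c in commands).encode("ascii")


_UIDVALIDITY = re.compile(r"^\* OK \[UIDVALIDITY (\d+)\]", re.IGNORECASE)


def fetch_is_trustworthy(response_lines: List[str], cached_uidvalidity: int) -> bool:
    """True if SELECT reported the same UIDVALIDITY the client has cached."""
    for line in response_lines:
        m = _UIDVALIDITY.match(line)
        if m:
            return int(m.group(1)) == cached_uidvalidity
    return False  # no UIDVALIDITY seen: treat the FETCH data as unusable
```

If fetch_is_trustworthy() returns False, the a3 response already sitting in the read queue has to be read and thrown away before the client resyncs.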

Cheers,
Jan
E-mail headers
From: blong@google.com
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: CABa8R6u6Z6994d1dSVXV0wMYNSJzmCSJ8Z_H8PhsBay9Ga=4nQ@mail.gmail.com
The sheer number of extensions for IMAP is kind of a problem in itself, as
is the combinatorial complexity that implies.

One mechanism to help with that, in terms of the chicken-egg problem, would
be if CAPABILITY were a two way street, like ID is.  I could then see what
capabilities clients have implemented to see if it's worth implementing them
on the server.

Not going to change CAPABILITY now, though we could standardize an ID
keyword for that, I suppose.

I.e.:

a ID ("client" "mail.app" "version" "1.2.3" "capability" "LITERAL+
CONDSTORE ETC")

Anyway, I imagine the reason no one did it is the question of what benefit
a client gets from supporting another path when the standard path is
supported everywhere.
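
For illustration, a small Python sketch of how a client might assemble the ID command shown above. The "capability" keyword is only the proposal from this message, not a standardized ID field, and the function names are invented for the example.

```python
from typing import Dict, Iterable


def quote(s: str) -> str:
    """Quote a value as an IMAP quoted string (ASCII only, no CR/LF)."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_id_command(tag: str, fields: Dict[str, str],
                     capabilities: Iterable[str]) -> str:
    """Build an ID command that also advertises client-side capabilities.

    The "capability" key is hypothetical: it is the keyword suggested in
    the message above, not something defined by the ID extension.
    """
    params = dict(fields)
    params["capability"] = " ".join(capabilities)
    body = " ".join(f"{quote(k)} {quote(v)}" for k, v in params.items())
    return f"{tag} ID ({body})"


if __name__ == "__main__":
    print(build_id_command(
        "a",
        {"client": "mail.app", "version": "1.2.3"},
        ["LITERAL+", "CONDSTORE"],
    ))
```

A server could then log the space-separated list per client to see which extensions are actually implemented in the field.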

Brandon


On Tue, Nov 27, 2012 at 11:48 AM, Michael M Slusarz
<slusarz@curecanti.org> wrote:

> Quoting Bron Gondwana <brong@fastmail.fm>:
>
>> On Tue, Nov 27, 2012, at 02:48 AM, Michael M Slusarz wrote:
>>
>>> (full
>>> disclosure: the recent Cyrus break is weird because it sent a BYE and
>>> terminated instead of a failed command, so that wasn't previously
>>> scanned for so we weren't catching this until recently. so we're not
>>> perfect either.).
>>
>> I'm kinda embarrassed about this one!  I forgot to put the GUID calculation
>> call in the APPEND BINARY path, because nothing ever tested it and it
>> appeared nobody was using it, because we didn't have a single complaint or
>> failure with it until just a few weeks ago.
>
> There's probably some deeper (sad) IMAP implementation story to be learned
> here.  RFC 3516 was published April 2003, and 9 years later we're the first
> one to ever use literal8's in an APPEND?
>
> michael
>
>
E-mail headers
From: brong@fastmail.fm
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: 1354010952.13681.140661158661065.2086C60B@webmail.messagingengine.com
On Tue, Nov 27, 2012, at 12:30 AM, Brandon Long wrote:
> On Mon, Nov 26, 2012 at 2:57 PM, Michael M Slusarz <slusarz@curecanti.org> wrote:
> > What I get out of this is that you think QRESYNC was the result of an
> > incorrect design decision and, instead, the proposal you previously linked
> > to in this thread should have won out.
> 
> I wasn't on the list at that time, so I have no idea what the debate looked
> like.  I view them as different, however.  QRESYNC requires little extra
> storage and can be persisted for any length of time.  I find it odd that it
> mentions "mobile, frequently disconnected" as it's still a fairly extensive
> negotiation between client and server.  It's certainly a win for any client
> when re-connecting, regardless of how long ago that was, and fills a hole
> in the CONDSTORE case.  So no, I don't think QRESYNC is an incorrect design
> decision.  It's possible a "not imap" protocol could conceive of a better
> way, in syntax or otherwise, of passing the needed information to re-sync
> the client and server, but QRESYNC has to fit the task given it.

The interesting part for QRESYNC is remembering "tombstones" for EXPUNGED
messages for a while to keep it cheap.

I like QRESYNC a lot more now that I've implemented the Cyrus 2.4+ replication
protocol on top of it.  It basically uses the same logic, but with full
information required to replicate the exact mailbox state at the far end.  It
means fast and cheap resync.

I'm really tempted to try to write it up as a more general protocol for
synchronising IMAP message stores from any vendor.  There's some capabilities
you need which can't be expressed over regular IMAP - for example appending
with a specific new UID rather than the next available.

I also wrote up a long thing to the Cyrus mailing list many years ago about
"UID Promotion" - basically if the two ends had ever had different messages
with the same UID, you need to give new UIDs to BOTH messages which are
higher than any ever seen by any client - generating an EXPUNGE and two
APPEND events at each end - to come entirely back into sync.
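
To make the "UID Promotion" rule concrete, here is a toy Python sketch. It is not how Cyrus implements it; the mailbox model (a dict from UID to a message GUID), the function name and the event strings are assumptions made for the example. It also ignores expunges entirely: a UID present on only one side is simply kept.

```python
from typing import Dict, List, Tuple

Mailbox = Dict[int, str]  # UID -> message identity (e.g. a GUID); toy model


def promote_conflicts(a: Mailbox, b: Mailbox,
                      next_uid: int) -> Tuple[Mailbox, List[str], int]:
    """Resolve UID conflicts between two replicas of one mailbox.

    next_uid must be higher than any UID either replica has ever handed
    out.  Returns the merged mailbox, the events each end would see, and
    the updated next_uid.
    """
    merged: Mailbox = {}
    events: List[str] = []
    for uid in sorted(set(a) | set(b)):
        in_a, in_b = a.get(uid), b.get(uid)
        if in_a is not None and in_b is not None and in_a != in_b:
            # Same UID, different messages: retire the UID and give
            # BOTH messages fresh UIDs above anything a client has seen.
            events.append(f"EXPUNGE {uid}")
            for guid in (in_a, in_b):
                merged[next_uid] = guid
                events.append(f"APPEND {next_uid}")
                next_uid += 1
        else:
            merged[uid] = in_a if in_a is not None else in_b
    return merged, events, next_uid
```

With a = {1: "g1", 2: "gA"}, b = {1: "g1", 2: "gB"} and next_uid = 3, UID 2 is expunged and the two messages reappear as UIDs 3 and 4, which is the one EXPUNGE plus two APPENDs described above.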

If anyone else is interested, I can write up something about how Cyrus
implements QRESYNC now, such that it can clean up old records, yet still
be efficient for clients most of the time.

Bron.

-- 
  Bron Gondwana
  brong@fastmail.fm
E-mail headers
From: tss@iki.fi
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: CD24D84B-783A-4E32-9C9B-9C8D262EC5C1@iki.fi
On 27.11.2012, at 12.09, Bron Gondwana wrote:

> I like QRESYNC a lot more now that I've implemented the Cyrus 2.4+ replication
> protocol on top of it.  It basically uses the same logic, but with full
> information required to replicate the exact mailbox state at the far end.  It
> means fast and cheap resync.

Dovecot is using the same underlying qresync features as well, although not qresync itself.

> I'm really tempted to try to write it up as a more general protocol for
> synchronising IMAP message stores from any vendor.  There's some capabilities
> you need which can't be expressed over regular IMAP - for example appending
> with a specific new UID rather than the next available.

I think that's the only one that is pretty much required, for both APPEND and COPY. Although exposing this to regular users could become troublesome. I know Dovecot doesn't handle it very nicely when you run out of UIDs. (I used to try to handle it a long time ago by giving new UIDs to messages, but I haven't tested it for ages now. And that wouldn't really work with the newer mailbox formats. So I guess simply failing to add any new messages until mailbox is deleted would work too. But then I'd need to support deleting INBOX..)

I'm also planning on adding support for syncing from a regular IMAP server to Dovecot, mainly for migration purposes. It would still need to be at least partially two-way sync. I've been wondering if I should try to sync the UIDs on the IMAP server side as well. It could be done with enough COPY+EXPUNGE commands for a single message. :)

> I also wrote up a long thing to the Cyrus mailing list many years ago about
> "UID Promotion" - basically if the two ends had ever had different messages
> with the same UID, you need to give new UIDs to BOTH messages which are
> higher than any ever seen by any client - generating an EXPUNGE and two
> APPEND events at each end - to come entirely back into sync.

I implemented this with "COPY <old uid> <new uid>" + "EXPUNGE <old uid>". Much simpler than my original plan to implement "CHANGEUID <old> <new>" :)
E-mail headers
From: slusarz@curecanti.org
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: 20121127133229.Horde.X3XMkGM8iZ5mPR4tbjUveA1@bigworm.curecanti.org
Quoting Bron Gondwana <brong@fastmail.fm>:

> The interesting part for QRESYNC is remembering "tombstones" for EXPUNGED
> messages for a while to keep it cheap.

[snip]

> If anyone else is interested, I can write up something about how Cyrus
> implements QRESYNC now, such that it can clean up old records, yet still
> be efficient for clients most of the time.

I have an interest in this logic, if just for the fact that this issue  
came up recently.

We provide ActiveSync mail syncing through our IMAP library.   
(Warning: I'm not an ActiveSync expert: this is my knowledge as  
provided to me by a guy who is).  To sync a remote ActiveSync client,  
it needs to be determined which messages have been expunged since the  
previous sync.  Fantastic if QRESYNC is available - we can just use  
VANISHED so there is no need to keep the UID list at the ActiveSync  
controller layer.

However, we were seeing transactions like the following (on a Dovecot server):

- HIGHESTMODSEQ as known by activesync client: 53000
- HIGHESTMODSEQ on IMAP server: 54000

a uid fetch 1:* UID (VANISHED CHANGEDSINCE 53000)
* VANISHED (EARLIER) 1:37308,37310:40788,40791:41032,41034:41083
a OK Fetch completed.

Yikes!  That's over 40,000 UIDs returned.  Sure enough, there seemed  
to be a tipping point where the "expected" VANISHED return - only  
those UIDs actually removed between 53000 and 54000 - was achieved:

a uid fetch 1:* UID (VANISHED CHANGEDSINCE 53881)
* VANISHED (EARLIER) 1:37308,37310:40788,40791:41032,41034:41083
a OK Fetch completed.
b uid fetch 1:* UID (VANISHED CHANGEDSINCE 53882)
* VANISHED (EARLIER) 37309,41029:41030,41047:41083
b OK Fetch completed.

Turns out that Dovecot purges old EXPUNGE records every so often from
the cache.  Discussion was made of ways of possibly improving this
behavior, and I believe the idea of tombstones/checkpoints came up.
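
To put numbers on the two responses above, a short Python sketch that expands the UID set of a VANISHED (EARLIER) line and counts it. The function names are made up for the example, and the parser only handles plain numbers and ranges (no "*").

```python
from typing import List, Tuple


def vanished_uid_set(line: str) -> str:
    """Extract the UID set from an untagged VANISHED (EARLIER) response."""
    prefix = "* VANISHED (EARLIER) "
    if not line.startswith(prefix):
        raise ValueError("not a VANISHED (EARLIER) response")
    return line[len(prefix):].strip()


def parse_uid_set(uid_set: str) -> List[Tuple[int, int]]:
    """Turn '1:3,7' into [(1, 3), (7, 7)]; '*' is not supported here."""
    ranges = []
    for part in uid_set.split(","):
        lo, _, hi = part.partition(":")
        start, end = int(lo), int(hi or lo)
        ranges.append((min(start, end), max(start, end)))
    return ranges


def count_uids(uid_set: str) -> int:
    """Number of individual UIDs named by a UID set."""
    return sum(hi - lo + 1 for lo, hi in parse_uid_set(uid_set))


if __name__ == "__main__":
    for line in (
        "* VANISHED (EARLIER) 1:37308,37310:40788,40791:41032,41034:41083",
        "* VANISHED (EARLIER) 37309,41029:41030,41047:41083",
    ):
        print(count_uids(vanished_uid_set(line)))
```

The first response names 41,079 UIDs; the second, one MODSEQ later, names 40.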

Note: I understand the above VANISHED call is not the most efficient.   
I told the ActiveSync guy that it would be good practice to also  
include the lowest/highest UID known at a given MODSEQ to make the  
VANISHED call more efficient.  However, this still leaves open the  
possibility of a large UID return range, especially if a user has an  
old message in their mailbox.  e.g. the above example could still  
possibly look like:

a uid fetch 1000:41100 UID (VANISHED CHANGEDSINCE 53800)
* VANISHED (EARLIER) 1001:41032,41034:41083
a OK Fetch completed.

This could be a limitation of ActiveSync, but the problem is that
this entire list of UIDs needs to be sent to the remote client.  And
if that remote client is a mobile device, you are causing a huge
amount of traffic to be pushed across a wireless connection, and this
huge amount of data needs to be processed on the remote device.  This
is potentially an expensive action: monetarily (bandwidth cost),
UI-wise (longer time to resync), and battery-wise if the mobile client
is using this.

As a client author I would settle knowing that at most only, say,  
1,000 spurious UIDs will ever be returned from a VANISHED command.   
The above behavior can be worked around if the list of UIDs is kept at  
the activesync connector level.  But that may not be desirable in any  
given implementation.

Wondering about your thinking on this as a server author.  How much more
storage does a tombstone regime require on the server?  Is there a way  
to optimize this - i.e. maybe tombstones aren't kept unless/until a  
client actually issues a VANISHED command in a mailbox.  What is a  
reasonable checkpoint/tombstone range?  Is this something that the  
CHECK command could potentially be useful for?

michael
E-mail headers
From: blong@google.com
To: imap-protocol@localhost
Date: Fri, 08 Jun 2018 12:34:49 -0000
Message-ID: CABa8R6vtOHH24idCZiu+yaudX7i67X4eoE822VwyqgMdaGectg@mail.gmail.com
On Tue, Nov 27, 2012 at 12:32 PM, Michael M Slusarz
<slusarz@curecanti.org> wrote:

> Quoting Bron Gondwana <brong@fastmail.fm>:
>
>> The interesting part for QRESYNC is remembering "tombstones" for EXPUNGED
>> messages for a while to keep it cheap.
>
> [snip]
>
>> If anyone else is interested, I can write up something about how Cyrus
>> implements QRESYNC now, such that it can clean up old records, yet still
>> be efficient for clients most of the time.
>
> I have an interest in this logic, if just for the fact that this issue
> came up recently.
>
> We provide ActiveSync mail syncing through our IMAP library.  (Warning:
> I'm not an ActiveSync expert: this is my knowledge as provided to me by a
> guy who is).  To sync a remote ActiveSync client, it needs to be determined
> which messages have been expunged since the previous sync.  Fantastic if
> QRESYNC is available - we can just use VANISHED so there is no need to keep
> the UID list at the ActiveSync controller layer.
>
> However, we were seeing transactions like the following (on a dovecot
> server):
>
> - HIGHESTMODSEQ as known by activesync client: 53000
> - HIGHESTMODSEQ on IMAP server: 54000
>
> a uid fetch 1:* UID (VANISHED CHANGEDSINCE 53000)
> * VANISHED (EARLIER) 1:37308,37310:40788,40791:41032,41034:41083
> a OK Fetch completed.
>
> Yikes!  That's over 40,000 UIDs returned.  Sure enough, there seemed to be
> a tipping point where the "expected" VANISHED return - only those UIDs
> actually removed between 53000 and 54000 - was achieved:
>
> a uid fetch 1:* UID (VANISHED CHANGEDSINCE 53881)
> * VANISHED (EARLIER) 1:37308,37310:40788,40791:41032,41034:41083
> a OK Fetch completed.
> b uid fetch 1:* UID (VANISHED CHANGEDSINCE 53882)
> * VANISHED (EARLIER) 37309,41029:41030,41047:41083
> b OK Fetch completed.
>
> Turns out that dovecot purges old EXPUNGE records every so often from the
> cache.  Discussion was made of ways of possibly improving this behavior,
> and I believe the idea of tombstones/checkpoints came up.
>
> Note: I understand the above VANISHED call is not the most efficient.  I
> told the ActiveSync guy that it would be good practice to also include the
> lowest/highest UID known at a given MODSEQ to make the VANISHED call more
> efficient.  However, this still leaves open the possibility of a large UID
> return range, especially if a user has an old message in their mailbox.
>  e.g. the above example could still possibly look like:
>
> a uid fetch 1000:41100 UID (VANISHED CHANGEDSINCE 53800)
> * VANISHED (EARLIER) 1001:41032,41034:41083
> a OK Fetch completed.
>
> This could be a limitation of ActiveSync, but the problem comes that this
> entire list of UIDs needs to be sent to the remote client.  And if that
> remote client is a mobile device, you are causing a huge amount of traffic
> to be pushed across a wireless connection, and this huge amount of data
> needs to be processed on the remote device.  This is potentially an
> expensive action, both monetary (bandwidth cost), UI-wise (longer time to
> resync), and battery wise if the mobile client is using this.
>
> As a client author I would settle knowing that at most only, say, 1,000
> spurious UIDs will ever be returned from a VANISHED command.  The above
> behavior can be worked around if the list of UIDs is kept at the activesync
> connector level.  But that may not be desirable in any given implementation.
>
> Wondering your theory behind this as a server author.  How much more
> storage does a tombstone regime require on the server?  Is there a way to
> optimize this - i.e. maybe tombstones aren't kept unless/until a client
> actually issues a VANISHED command in a mailbox.  What is a reasonable
> checkpoint/tombstone range?  Is this something that the CHECK command could
> potentially be useful for?


It also depends on the other limits of your backend.  We've been thinking
about a CONDSTORE/QRESYNC system that would be based on our existing
transaction logs, for example, because otherwise trying to shoe-horn the
full IMAP concepts into our existing web optimized backend would be
pretty expensive.  But since we don't keep all transactions around forever,
we'd have to have a fake floor to our MODSEQ numbers (i.e., even if a message
hasn't changed, it will get a new MODSEQ behind the head).  If your client
didn't sync frequently enough, or was too picky, it could require a
fallback to a full sync (well, full uid/flags sync, not a re-download,
hopefully).

If you design your backend for this stuff from the start, it's probably not
too bad, but how many IMAP servers have been written since QRESYNC was
published, and not just had it shoe-horned into an existing server?

Brandon