Re: XForms: Re: Big time lag using XCopyArea, and get_next_event() event backlog problem.

From: T.C. Zhao (tc_zhao@yahoo.com)
Date: Wed Apr 23 2003 - 13:26:13 EDT

    # To subscribers of the xforms list from "T.C. Zhao" <tc_zhao@yahoo.com> :

    Hi Jason,

    Your analysis is right on the money. As for solutions, my view is that the
    cleanest one is to turn off GraphicsExposures in the GC, which you control;
    you presumably know whether you want those events in *your* application.
    Apart from such automatically generated events, it is hard for a human to
    fill the queue so fast that processing ten events in a row would still
    leave a large backlog.

    Giving X events too high a priority by emptying the X event queue before
    calling fl_watch_io() can create problems for people who use IO callbacks (to
    monitor or drive external devices), and may cause other timing issues if many
    X events are generated by something like NoExpose. No matter how we look at
    this, we're only simulating multitasking/multithreading imperfectly, and we
    will run into difficulties in some situations that can only be solved by
    some kind of cooperation from the applications.

    One thing that may work is to adjust the queue priority dynamically. Say, if
    there are no IO callbacks (I need to check whether the timer and signal
    facilities hook into this; I can't quite remember off the top of my head), we
    can process X events N (say 25) times before calling fl_watch_io(), or even
    process X events until there are none left before calling fl_watch_io(). If
    we were to change how the events are processed, I think this is probably the
    only way that might work with minimal chance of breaking other applications.
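
A minimal sketch of the dynamic-priority idea above, written as a standalone C model rather than actual XForms code. The function name, its parameters, and the drain_fully flag are hypothetical; the batch of 10 reflects the ten-events-then-poll behaviour described in this thread, and 25 is the example value of N given above.

```c
#include <assert.h>

enum {
    DEFAULT_BATCH = 10, /* behaviour described in the thread: ten events, then poll I/O */
    NO_IO_BATCH   = 25  /* example value of N from the text above */
};

/*
 * Hypothetical helper, not part of XForms: decide how many pending X events
 * to process before the next fl_watch_io()-style poll.
 */
static int x_events_before_io_poll(int io_callbacks, int pending, int drain_fully)
{
    assert(io_callbacks >= 0 && pending >= 0);

    /* With I/O callbacks registered, keep the existing cadence. */
    int limit = DEFAULT_BATCH;

    /* Without them, X events can safely be given a higher priority. */
    if (io_callbacks == 0)
        limit = drain_fully ? pending : NO_IO_BATCH;

    return pending < limit ? pending : limit;
}
```

Whether timers and signals should count as IO callbacks here is left open, as noted above; in this model they would simply be added to io_callbacks.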

    -TC

    --- jac@casurgica.com wrote:
    > # To subscribers of the xforms list from jac@casurgica.com :
    >
    > > I suspect that you will find detailed answers hard to come by, for the
    > > simple reason that the author of the code (TC) and its maintainer for
    > > the last donkey's years (SPL) are the ones with the real, detailed
    > > knowledge about this end of the library. Your delvings have probably
    > > made you "the" expert.
    >
    > This is very unfortunate because in reality, I have absolutely no idea
    > what I'm talking about.
    >
    >
    > One last thing. Just as an "fyi" for people who are interested, here is
    > the email I sent to my boss last night which contains a half-coherent
    > summary of the stuff I found (btw, pardon the "some genius" comment, I was
    > pretty annoyed at that point):
    >
    > after 9 hours of digging through forms library code and all sorts of crap,
    > i found the cause of the problem, why it's happening, and why only
    > XCopyArea is doing it (XCopyPlane would do it, too). turns out it's a
    > combination of a few bugs in the forms library and some unfortunate
    > default settings in the [removed].
    >
    > if you look under GraphicsExpose and NoExpose events in the xlib reference
    > manual, it states:
    >
    > "If graphics_exposures is True in the GC used for the copy, either one
    > NoExpose event or one or more GraphicsExpose events will be generated for
    > every XCopyArea or XCopyPlane call made."
    >
    > so what is happening is, every time you call fl_check_forms, you follow it
    > immediately by a call to [removed], which calls XCopyArea. this
    > causes a NoExpose (GraphicsExpose events weren't the problem, they weren't
    > getting sent) event to be sent to the canvas. via a few other functions,
    > fl_check_forms eventually calls the function get_next_event(). some bad
    > logic in the function eventually leads to the event queue getting too full
    > and lagging the ui (it peaks around 320 events when it should only be
    > peaking around 2 or 3... and it only stops at 320 because it can't hold
    > any more). here's what get_next_event() does:
    >
    > - if there's at least one event in the queue, grab the first event.
    > - if this event is not destined for a form window (if it's going to say, a
    > canvas window), do some minor tweaking of the event then put it back on
    > the queue so that it can be handled by the next call to get_next_event().
    >
    > this is all good and well. now, get_next_event() also calls a function
    > fl_watch_io(). this function is called every 11th call to get_next_event()
    > just so it doesn't eat up too much cpu time. fl_watch_io() does a bunch of
    > socket stuff that i'm not too clear on. there is a comment in
    > get_next_event() that says fl_watch_io() shouldn't be called with xevents
    > in the queue because it will delay processing of the events. but this is
    > ok because the queue should be empty or only contain 1 or 2 events when
    > fl_watch_io() is called (and 1 or 2 events doesn't lag it that much).
    >
    > HOWEVER, some genius decided that on every 11th call to get_next_event(),
    > when fl_watch_io() is called, event processing should be completely
    > skipped! this means that for every 11 calls to get_next_event(), only 10
    > events are removed from the queue. now recall that since you are calling
    > one XCopyArea() per fl_check_forms(), you are adding 1 event to the queue
    > each time. so for every 11 events you add to the queue, only 10 are
    > removed. this builds up quickly and, not only does it overflow the queue,
    > but it leads to fl_watch_io() being called with over 300 events in the
    > queue, which *really* slows things down. by the way, an interesting thing
    > to note is that fl_check_forms() will ALWAYS return NULL on every 11th
    > call.
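
The arithmetic above (11 events added, only 10 removed) can be checked with a small standalone C model. This is a rough sketch, not XForms code: the function name and parameters are hypothetical, and the queue is assumed to simply stop growing once it reaches its capacity (320 in the report above).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of the loop described above: each iteration queues one
 * NoExpose event (one XCopyArea call) and then makes one get_next_event()
 * call. Every skip_period-th call only polls I/O and removes nothing.
 * A skip_period of 0 means event processing is never skipped.
 */
static size_t backlog_after(size_t calls, size_t skip_period, size_t capacity)
{
    size_t queued = 0;

    for (size_t call = 1; call <= calls; ++call) {
        /* XCopyArea adds one event unless the queue is already full. */
        if (queued < capacity)
            ++queued;

        /* Does this call skip event processing? */
        int skips = (skip_period != 0) && (call % skip_period == 0);

        if (!skips && queued > 0)
            --queued;
    }
    return queued;
}
```

In this model the backlog grows by one event per 11 calls, so backlog_after(110, 11, 320) is 10 and the queue is full after 3520 calls, while backlog_after(1000, 0, 320) stays at 0 when no call skips processing.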
    >
    > when you don't call XCopyArea, the only time events get added to the queue
    > is in response to mouse, keyboard, expose events and such. in this
    > case it's ok that only 10 out of 11 calls to get_next_event() actually
    > remove an event, because there are so few events in the queue that they
    > all get processed very quickly.
    >
    > this is why calling fl_check_forms() twice fixed the problem: for every
    > XCopyArea call (and the NoExpose event it queued), at least 1 event was
    > removed from the queue, so it never backed up. calling
    > fl_check_forms() twice doesn't work if you call XCopyArea() twice.
    >
    > i have many possible solutions to this but i narrowed them down to two
    > simple ones. i have tried them both and they both work perfectly
    > (in the test program, anyway) with no side effects:
    >
    > 1) make get_next_event() *not* skip the event processing every 11 events.
    > there is no reason for it to do so. but still make it call fl_watch_io()
    > every 11 events. this way, all the events get processed and fl_watch_io()
    > still gets called. about 5% of the time, fl_watch_io() is called with 1 or
    > 2 events in the queue, but this is ok and fl_watch_io() doesn't noticeably
    > hang.
    >
    > 2) make the [removed] GC have its graphics_exposures set to False so
    > that NoExpose events aren't generated. do this by modifying the part where
    > _gc gets initialized like so:
    >
    > XGCValues values;
    > values.graphics_exposures = False;
    > _gc = XCreateGC(_display, _window, GCGraphicsExposures, &values);
    >
    > instead of:
    >
    > _gc = XCreateGC(_display, _window, 0, NULL);
    >
    > both ways are good for different reasons, and i'd actually recommend doing
    > them both. we have the xforms 1.0 source and we can modify it however we
    > want, so we can fix it there and stop using 0.89. if we don't do way 1,
    > the possibility for this problem to occur is still there -- we've only
    > fixed one of the things that lead up to the queue overflow occurring. i
    > don't like that.
    >
    > another thing is, the bug is kind of "unfixable" in a way... but it's
    > weird because the only way to "fix" it would be to hack things in and
    > start doing risky things like ignoring events in weird places and such.
    > the reason is: fl_check_forms(), for the most part, only processes one
    > event each time it is called. so if you are explicitly generating more
    > than one event per fl_check_forms() call, you're kind of screwed. there's
    > no way around this except to use fl_do_forms() instead of
    > fl_check_forms(), which basically processes all the events in the queue
    > before returning.
    >
    > so, here are my suggestions:
    >
    > 1) make sure graphics_exposures is false in the [removed], and
    > 2) fix get_next_event() so it never skips an event, and
    > 3) use fl_do_forms() whenever possible, which i guess is tough for the way
    > [removed] and the [removed] work.
    >
    > jason
    >
    >
    >
    >
    > _________________________________________________
    > To unsubscribe, send the message "unsubscribe" to
    > xforms-request@bob.usuhs.mil or see
    > http://bob.usuhs.mil/mailserv/xforms.html
    > XForms Home Page: http://world.std.com/~xforms
    > List Archive: http://bob.usuhs.mil/mailserv/list-archives/
    > Development: http://savannah.nongnu.org/files/?group=xforms




    This archive was generated by hypermail 2b29 : Wed Apr 23 2003 - 13:27:37 EDT