XForms: VERY strange seg faults on fl_initialize()

Todd C. Zino (todd@lacemaker.com)
Wed, 10 Nov 1999 09:34:26 -0500

# To subscribers of the xforms list from "Todd C. Zino" <todd@lacemaker.com> :

The software we develop in-house at my university uses xforms for the
UNIX GUI. We are currently experiencing core dumps on the fl_initialize()
call within our API code. A bit of background is probably necessary due to
the specific nature of our software, but as I explain below, this problem
still occurs even when bypassing the libraries and trying xforms directly.

Firstly, the various xforms calls which handle file selection, dialogs,
etc. are invoked from within a shared library that forms the heart of our
architectural API. This shared library (let's call it libFOO.so for now)
statically links libforms.a in order for us to distribute it to various Linux
systems which do not have xforms installed.

Since both Java classes (through JNI) and a command-line utility are
indirectly calling fl_initialize() at any given time by way of a
FOO_Startup() API call we implement in libFOO.so, it is neither practical
nor necessary to pass any real 'command line arguments' into fl_initialize
for it to do its work properly. Currently we are building a dummy
**argv with "FOO" as its lone element and an argc of 1, and passing 0 for
the last two parameters, since there are no options and nothing of worth
to parse (this is how the xforms manual shows these parameters in cases
where the programmer wishes to get options/etc elsewhere).

With this setup so far, I have ONLY ever seen seg faults when the library
is invoked from our command-line utility (a file processing binary which
configures the settings for the Java application that sits atop our FOO
API), and only then on CERTAIN redhat 6.0 systems. I've consistently seen
this seg fault on 3 redhat 6.0 boxes, while the same code works fine on 2
others (and a redhat 5.2 one)! This code and the entire libraries have worked beautifully
when they work, and on several flavors of Linux and window managers (Kde,
gnome, afterstep, fvwm). We have a cross-platform port of this to Solaris
using the Solaris xforms which has also never crashed.

I have gdb'd all the way through our API to the actual fl_initialize() call
in our code before I get the crash. All of the parameters being passed in
appear to be initialized properly when tested. I've tried adding in a dummy
"-bw 5" as *argv[1] and upping argc to 2 just for kicks. Same result. Oh,
and I tried both .89 and .88 versions of the forms libraries, as well as
fresh reinstalls of the boxes which saw a crash.

Here's a snippet of the library code, the only place in our library that
calls fl_initialize. All API calls in our library must be preceded by the
FOO_Startup()...

char *tArgv[] = { "FOO", "-bw 9" };
int tArgc = 2;

fl_initialize(&tArgc,tArgv,"FOO",0,0);
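One thing that may be worth ruling out is the shape of the dummy argument vector itself. The sketch below builds one the way a shell would hand it to main(): writable strings, the option and its value as separate elements, and a terminating NULL after the last argument. Whether fl_initialize cares about any of these details is an assumption and not something confirmed here, and foo_build_dummy_argv is a hypothetical helper name, not part of xforms or of the FOO API.

```c
#include <assert.h>
#include <stddef.h>

/* Writable storage for the dummy arguments. String literals may live in
 * read-only memory, so arrays are used in case the callee edits argv. */
static char foo_arg0[] = "FOO";
static char foo_arg1[] = "-bw";
static char foo_arg2[] = "9";

/* Hypothetical helper: fills argv_out with a shell-style argument vector
 * and returns the argument count, or -1 if the array is too small.
 * The NULL terminator is stored but not counted, as with a real main(). */
int foo_build_dummy_argv(char **argv_out, size_t capacity)
{
    assert(argv_out != NULL);
    if (capacity < 4)
        return -1;
    argv_out[0] = foo_arg0;
    argv_out[1] = foo_arg1;   /* option name ...                  */
    argv_out[2] = foo_arg2;   /* ... and its value, kept separate */
    argv_out[3] = NULL;       /* argv[argc] is NULL               */
    return 3;
}
```

The result would then be passed along the lines of fl_initialize(&tArgc, tArgv, "FOO", 0, 0), with tArgc set to the returned count and tArgv pointing at an array of at least four elements.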

To even further attempt isolation of the problem, I finally bypassed our
API entirely and built a copy of our command line utility with direct
fl_initialize and REAL argc/argv parameters like so:

#include "forms.h"
#include "unistd.h"

int main (int argc, char **argv)
{
    if (argc > 1) {
        fl_initialize(&argc, argv, "Agent FOO", 0, 0);

        // a bit more code and normally the API invocation, commented out
        // ...
    }
}

It is still dying the same way during the fl_initialize(), and the
Segmentation Fault signal during xxgdb is now reported in buffered_vfprintf
(s=0x80761a8, format=0x806cc3c, "%s%s\n", args=0xbffff87c).

(I obviously cannot step into the actual fl_initialize() call since xforms
is not built nor distributed with debugging info to my knowledge).

To further give people some potentially useless insight, here's how the
binary is being built:

gcc -g -I/home/todd/xforms/FORMS -L/usr/X11R6/lib -lX11 -lXpm -lm
AgentFoo.cpp /home/tcz3/xforms/FORMS/libforms.a

(as an aside, if I try to just link the shared forms library and avoid the
X11 fun, the linker swears up and down that it cannot find fl_initialize.
This sort of problem was occurring constantly in the compiling/building of
our API libraries to where we had to temporarily resort to linking the xforms
lib about 3 times in the final call...otherwise various xforms functions
would be declared as 'missing' or 'unresolved').

So, to summarize, my questions are as follows:

a) Does anyone know what could possibly be the cause of these crashes,
first and foremost? Nothing is ever dying in the actual window allocation
once fl_initialize finishes successfully. This never happens when the
libraries are invoked from Java, nor does it happen on certain machines. It
is of course possible that the same 'wrong' thing is happening everywhere
but not being fatal elsewhere due to a more forgiving environment stack on
other machines or Java's native calls (the crashes do not correspond to
machines with smaller RAM either).

b) Based on the information I have given, could people suggest possible
workarounds, alternate initialization code or better parameters to pass in,
or a way to better see what is dying why and where? I have to be honest
that this entire debacle has sapped our confidence in xforms and brought up
the issue of replacing it with Gtk in our next release, but I'd like to
give it a final chance by consulting the experts/authors. And in any case,
it would be great for my peace of mind to finally discover the exact
conditions and causes under which this seg fault is occurring.

I look forward to hearing from you all...

+--------------------------------------------------------------+
| Todd C. Zino todd@lacemaker.com |
+--------------------------------------------------------------+