Cleaner crash after running for a while #225

Closed
opened 2025-05-11 12:13:59 +02:00 by esiqveland · 2 comments
esiqveland commented 2025-05-11 12:13:59 +02:00 (Migrated from github.com)

It seems like scheduling lots of lambdas to run with GLib.idleAdd(GLib.PRIORITY_DEFAULT_IDLE, ...) crashes after a while.

This crash occurs after about 10 minutes in the given sample code:

counter: 7820
counter: 7830
counter: 7840
counter: 7850
java.lang.IllegalStateException: Session is acquired by 1 clients
	at java.base/jdk.internal.foreign.MemorySessionImpl.alreadyAcquired(MemorySessionImpl.java:306)
	at java.base/jdk.internal.foreign.SharedSession.justClose(SharedSession.java:84)
	at java.base/jdk.internal.foreign.MemorySessionImpl.close(MemorySessionImpl.java:232)
	at java.base/jdk.internal.foreign.ArenaImpl.close(ArenaImpl.java:50)
	at io.github.jwharm.javagi.interop.Arenas.close_cb(Arenas.java:66)
	at org.gnome.gio.Application.run(Application.java:1146)
	at com.github.subsound.debug.DebugGtkMemoryIssue.main(DebugGtkMemoryIssue.java:27)
Unrecoverable uncaught exception encountered. The VM will now exit

The output above shows the crash after about 8k iterations.

The sample code frequently updates a GtkScale to simulate an updating progress bar.


public class DebugGtkMemoryIssue {
    public static void main(String[] args) {
        Application app = new Application("com.scale.example", ApplicationFlags.DEFAULT_FLAGS);
        app.onActivate(() -> onActivate(app));
        app.onShutdown(() -> {
            System.out.println("app.onShutdown: exit");
        });
        app.run(args);

    }

    private static void onActivate(Application app) {
        var cancelled = new AtomicBoolean(false);
        Button btn = Button.builder().setLabel("Cancel worker").build();
        btn.onClicked(() -> {
            System.out.println("btn.onClicked: cancel!");
            cancelled.set(!cancelled.get());
        });

        var scale1 = Scale.builder().setOrientation(Orientation.HORIZONTAL).build();
        scale1.setRange(0, 100);
        scale1.setShowFillLevel(true);
        scale1.setFillLevel(1);

        var counter = new AtomicInteger();
        var fill = new AtomicInteger();
        Thread.startVirtualThread(() -> {
            while (true) {
                if (cancelled.get()) {
                    System.out.println("worker: canceled after %d".formatted(counter.get()));
                    return;
                }
                var count = counter.addAndGet(1);
                if (count % 10 == 0) {
                    System.out.println("counter: %d".formatted(count));
                }
                int i = fill.addAndGet(1);
                var fillLevel = i % 100;

//                GLib.idleAddOnce(() -> {
//                    scale1.setFillLevel(fill.get());
//                    scale1.setShowFillLevel(true);
//                });
                GLib.idleAdd(GLib.PRIORITY_DEFAULT_IDLE, () -> {
                    scale1.setFillLevel(fillLevel);
                    // changing fill level does not always redraw the scale component,
                    // but scale.getAdjustment().emitValueChanged() forces a redraw:
                    scale1.getAdjustment().emitValueChanged();
                    return GLib.SOURCE_REMOVE;
                });

                try {
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    throw new RuntimeException(e);
                }
            }
        });

        var toolbar = new HeaderBar();
        // Pack everything together, and show the window
        Box box = Box.builder().setSpacing(10).setOrientation(Orientation.VERTICAL).build();
        box.append(toolbar);
        box.append(btn);
        box.append(scale1);

        var window = ApplicationWindow.builder()
                .setApplication(app)
                .setDefaultWidth(1000)
                .setDefaultHeight(700)
                .setContent(box)
                .build();

        window.present();
    }

}

jwharm commented 2025-05-14 20:15:18 +02:00 (Migrated from github.com)

I think I found what's causing the issue!

This is how the process works:

  • GLib.idleAdd calls g_idle_add_full for a SourceFunc parameter (a lambda). The SourceFunc has "notified" scope, i.e. GLib calls a "notify" callback parameter to clean up when the SourceFunc has completed.
  • Java-GI allocates an upcall stub for the SourceFunc in its own little Arena, and uses the "notify" parameter to close the Arena afterwards.
  • This process works really well, because the upcall stub's memory allocation is immediately released after the lambda returns.

So far, so good. Now what causes the exception?

  • The SourceFunc runs on the GLib main context, which is on a different thread from the one that calls g_idle_add_full.
  • Sometimes, the SourceFunc is executed very quickly, and it completes before g_idle_add_full returns.
  • In this case, the "notify" callback is called while the JVM is still waiting for g_idle_add_full to finish.
  • Now, Arena.close() is called while the allocated upcall stub for the SourceFunc is still being used in a native function call. This is not allowed according to the documentation: "IllegalStateException - if a segment associated with this arena is being accessed concurrently, e.g. by a downcall method handle"

I'm not sure yet how to fix this. I don't think there's a way to know, from the "notify" callback, if the call to g_idle_add_full has completed. I'll probably have to catch the exception, sleep for a while, and keep retrying to close the arena until it succeeds.
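The "catch, sleep, retry" idea could be sketched roughly as below. This is only a minimal illustration, not what Java-GI actually does: the class name RetryClose, the method closeWithRetry, and the maxAttempts and sleepMillis parameters are all hypothetical. The close action is passed as a Runnable so the sketch does not depend on java.lang.foreign; with a real arena it would be called as closeWithRetry(arena::close, ...).

```java
// Hypothetical helper, not part of Java-GI: retries a close action that may
// throw IllegalStateException while the resource is still in use elsewhere.
final class RetryClose {

    private RetryClose() {
    }

    /**
     * Runs closeAction until it completes without an IllegalStateException,
     * sleeping sleepMillis between attempts.
     *
     * @return the number of attempts that were needed
     * @throws IllegalStateException the last failure, if every attempt failed
     */
    static int closeWithRetry(Runnable closeAction, int maxAttempts, long sleepMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IllegalStateException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                closeAction.run();
                return attempt;
            } catch (IllegalStateException e) {
                // For Arena.close() this would be e.g. "Session is acquired by 1 clients".
                // Caveat: closing an already-closed arena also throws
                // IllegalStateException, and retrying cannot fix that case.
                last = e;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException ie) {
                    // Preserve the interrupt flag and give up early.
                    Thread.currentThread().interrupt();
                    throw last;
                }
            }
        }
        throw last;
    }
}
```

Note that the sleep blocks whichever thread runs the "notify" callback, so in practice the attempt count and delay would need to stay small.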

jwharm commented 2025-05-14 20:22:02 +02:00 (Migrated from github.com)

In the meantime, you can work around the issue if you use GLib.timeoutAddOnce() with a reasonably short timeout parameter. This will allow the call to g_timeout_add_full to return, before the callback function completes and triggers the IllegalStateException.

Even better, replace the entire while-loop with one call to GLib.timeoutAdd() that runs the entire content of your while-loop every 50 ms.
