SIGXCPU at first osc message

domenico · 2 2020 16:56

Hello,
I’m trying to set a simple parameter via osc. I have a vst2 juce plugin where I set up a Juce::AudioProcessorValueTreeState. The parameter is an AudioParameterBool and I add it to the value tree state.

When sending the correct osc messages to sushi the parameter is set correctly, however when looking for the cause of a mode-switch I found out that upon receiving the first message, a SIGXCPU is caught by the debugger.
Here’s the backtrace:

Thread 11 "sushi_b64" hit Catchpoint 1 (signal SIGXCPU), __GI___mmap64 (
    offset=<optimized out>, fd=-1, flags=16418, prot=0, len=67108864, 
    addr=<optimized out>) at ../sysdeps/unix/sysv/linux/mmap64.c:52
52	../sysdeps/unix/sysv/linux/mmap64.c: No such file or directory.
(gdb) bt
#0  __GI___mmap64 (offset=<optimized out>, fd=-1, flags=16418, prot=0, len=67108864, 
    addr=<optimized out>) at ../sysdeps/unix/sysv/linux/mmap64.c:52
#1  __GI___mmap64 (addr=<optimized out>, len=len@entry=67108864, prot=prot@entry=0, 
    flags=flags@entry=16418, fd=fd@entry=-1, offset=offset@entry=0)
    at ../sysdeps/unix/sysv/linux/mmap64.c:40
#2  0x0000007ff7716a78 in new_heap (size=135168, size@entry=2824, 
    top_pad=<optimized out>) at arena.c:476
#3  0x0000007ff7717548 in _int_new_arena (size=576) at arena.c:911
#4  arena_get2 (avoid_arena=0x0, size=576) at arena.c:911
#5  arena_get2 (size=576, avoid_arena=0x0) at arena.c:872
#6  0x0000007ff7719e2c in tcache_init () at malloc.c:2994
#7  0x0000007ff771abc8 in tcache_init () at malloc.c:3050
#8  __GI___libc_malloc (bytes=24) at malloc.c:3050
#9  0x0000007ff796f46c in operator new(unsigned long) () from /usr/lib/libstdc++.so.6
#10 0x0000007feeb55de0 in juce::ThreadLocalValue<bool>::get (this=0x5555809728)
    at /opt/elk/1.0/sysroots/aarch64-elk-linux/usr/include/c++/8.3.0/bits/atomic_base.h:707
#11 juce::ThreadLocalValue<bool>::operator= (newValue=<optimized out>, 
    this=0x5555809728)
    at /home/domenico/elkJUCE/modules/juce_core/threads/juce_ThreadLocalValue.h:88
#12 JuceVSTWrapper::setParameter (value=1, index=<optimized out>, this=0x55558092a0)
    at /home/domenico/elkJUCE/modules/juce_audio_plugin_client/VST/juce_VST_Wrapper.cpp:795
--Type <RET> for more, q to quit, c to continue without paging--ret
#13 JuceVSTWrapper::setParameterCB (vstInterface=<optimized out>, 
    index=<optimized out>, value=1)
    at /home/domenico/elkJUCE/modules/juce_audio_plugin_client/VST/juce_VST_Wrapper.cpp:802
#14 0x00000055555b74b4 in ?? ()
#15 0x00000055555b78ec in ?? ()
#16 0x00000055555ab90c in ?? ()
#17 0x0000005555646dc4 in ?? ()
#18 0x0000005555635eec in ?? ()
#19 0x0000007ff7e9c0c8 in ?? () from /usr/xenomai/lib/libcobalt.so.2
#20 0x0000007ff7da5824 in start_thread (arg=0x7ffffff406) at pthread_create.c:486
#21 0x0000007ff77717bc in thread_start ()

I removed every use of the parameter so I cannot understand why this happens. Also, it seems that it happens only the first time because the output of /proc/xenomai/sched/stat remains stable when receiving subsequent messages.
It seems simple but I cannot figure out the solution
Additionally, I cannot understand how to use gdb’s ignore with signals as opposed to breakpoints.

Gustav · 2 2020 22:38

Hi domenico. Am I reading your post correctly if you ran sushi with the --debug-mode-sw flag on an Elk board? In that case, if you were looking for the cause of the mode switch, well you’ve found it

Iirc, SIGXCPU is the signal xenomai throws for the debugger to catch when debugging mode switches. It is not something you’d see without mode switch debugging enabled.

What you’re seeing in the logs is a call to the vst2 api function setParameter() which sushi calls in the realtime audio thread just before it calls processReplacing(). I don’t know exactly what Juce does in its vst2 wrapper, but somehow you end up in a call to new(), which calls malloc(), which in turn needs to access the kernels memory map, which is a system call, which results in a mode switch.

Memory allocations is def the most common cause of mode switches in Elk from what we’ve seen. And to fix this you need to dig into why Juce::AudioProcessorValueTreeState needs to allocate here, it could be that it only needs to do this the first time, as you only saw mode switches then. Maybe you already know, happy hunting otherwise.

FYI, your issue has nothing to do with OSC, the OSC message is converted into an internal parameter change message that is sent to the realtime audio thread, which is where this issue occurs. It would have been the same if the originating parameter change came from a midi cc or from a gRPC call.

domenico · 3 2020 09:16

Hi Gustav, thank you.
My post was not very clear indeed: I ran sushi with the debug flag on purpose, as suggested in the documentation. I already used this successfully to track down other memory allocation issues in another plugin and I wanted to do the same here.

I was aware that sushi uses the OSC message to change the parameter that the vst exposes (so it doesn’t depend on the msg. itself) but I’m baffled that juce performs allocation when setting a parameter that is declared in this way (as it seems the official way to do it in Juce).
I will definitely look into the quirks of the juce vst wrapper but I actually don’t mind, for now, if a mode switch happens only the first time that the parameter is used.

Even tho I don’t mind, my issue now is that I would like to debug other mode-switches that may happen after the first one (caused by how my code reacts to the new parameter value), but telling gdb to ignore the first signal caught (as suggested in the same documentation paragraph) and running causes sushi to exit as soon as the first signal is ignored:

Program terminated with signal SIGXCPU, CPU time limit exceeded.
The program no longer exists.

Am I doing something terribly wrong here? Is this occurrence so problematic that, when running with the debug flag, the program exits?

Stefano · 3 2020 10:46

Ciao @domenico,
you can use the ignore N command in gdb to ignore the first N occurrences of a signal catchpoint. In that way, you should be able to see what happens after the first ones, if that helps.

domenico · 3 2020 11:17

Ciao @Stefano,
thank you, unfortunately when doing

catch signal SIGXCPU
ignore 1 1

and running, the exact same allocation causes a catch (same backtrace), while using ignore 1 2 the program terminates with this message:

Stefano · 3 2020 11:37

Hi @domenico,
but if you run the same plugin without gdb does it crash or “only” causes MSW?

domenico · 3 2020 11:57

If I run it without gdb and without the --debug-mode-sw arg it does not crash.
Just now I tried calling it with the debug flag (without gdb, I don’t know if it makes sense) and it crashed indeed, with CPU time limit exceeded

Stefano · 3 2020 12:02

Hi @domenico,
difficult to check more without seeing the binary / code, but can you access the backtrace etc. after the crash?

Because the intended use for the --debug-mode-sw flag is just to halt the execution when a MSW occurs so you can see e.g. which line was causing it. It’s not supposed to be possible to restart execution again after you hit the catchpoint.

domenico · 3 2020 14:33

Hi @Stefano
No, I cannot access it, all the threads exit, the program no longer exists and there is no stack anymore.

I got the code down to just this simple test vst:

// Inside Processor class
juce::AudioParameterFloat* recstate;

Processor() {
    addParameter (recstate = new juce::AudioParameterFloat ("recstate", // ID
                                                            "recstate", // name
                                                            0.0f,   // min value
                                                            1.0f,   // max value
                                                            0.5f)); // def value
}

void processBlock (AudioBuffer<float>& buffer, MidiBuffer& midiMessages) {
    if(recstate->get() == 1.0)
        throw std::logic_error("Parameter 'recstate' was set correctly to 1.0");
}

The situation is (assumed that every time i send that OSC message resulting in MSW):

Regular call to sushi: program reaches throw (good, obviously MSW happens)
gdb sushi and run with --debug-mode-sw:
2.1 with catch SIGXCPU: gdb stops at catchpoint,before throw (good, msw detected)
2.2 with catch SIGXCPU and ignore 1 1: gdb stops again at catchpoint (why? It should ignore it)
2.3 with catch SIGXCPU and ignore 1 2: Program terminates with signal SIGXCPU without catching anything

Output of 2.3:

[New Thread 0x7ff667a090 (LWP 977)]
[New Thread 0x7ff5e26090 (LWP 978)]
[New Thread 0x7ff5625090 (LWP 979)]
[New Thread 0x7ff4e24090 (LWP 980)]
[New Thread 0x7feffff090 (LWP 981)]
[New Thread 0x7fef7fe090 (LWP 982)]
[New Thread 0x7feea8a090 (LWP 983)]
[New Thread 0x7fee289090 (LWP 984)]
[New Thread 0x7fee268090 (LWP 985)]
[New Thread 0x7feda67090 (LWP 986)]
[New Thread 0x7fed266090 (LWP 987)]
[New Thread 0x7feca65090 (LWP 988)]
[New Thread 0x7fd7fff090 (LWP 989)] 
                                   <---- Here sushi sets the param.
[Thread 0x7feca65090 (LWP 988) exited]
[Thread 0x7fee289090 (LWP 984) exited]
[Thread 0x7feda67090 (LWP 986) exited]
[Thread 0x7feea8a090 (LWP 983) exited]
[Thread 0x7ff71029d0 (LWP 973) exited]
[Thread 0x7fed266090 (LWP 987) exited]
[Thread 0x7ff667a090 (LWP 977) exited]
[Thread 0x7ff4e24090 (LWP 980) exited]
[Thread 0x7fd7fff090 (LWP 989) exited]
[Thread 0x7ff5625090 (LWP 979) exited]
[Thread 0x7fee268090 (LWP 985) exited]
[Thread 0x7ff5e26090 (LWP 978) exited]
[Thread 0x7ff707c090 (LWP 976) exited]
[Thread 0x7fef7fe090 (LWP 982) exited]

Program terminated with signal SIGXCPU, CPU time limit exceeded.
The program no longer exists.
(gdb) bt
No stack.
(gdb)

I don’t want to annoy you with this as I could comment out the code that exposes the parameter, mock up in some way the call that sets it internally and go on debugging the rest on the code. However I thought that I could use the ignore command in gdb exactly for this issue: skipping the first signal and go on with the rest of the execution.

If the program cannot proceed after a SIGXCPU, regardless of the fact that I told gdb to ignore it’s first N occurrences, what would be the use of ignore N?

Gustav · 3 2020 17:01

Thanks for explaining in more detail domenico. Interesting problem and I’m not exactly sure whats going on, Stefano gave some good input.

I’m not sure that the last bit of code you’ve posted will do the same thing. You’re explicitly throwing an exception i the audio thread. Tbh, I don’t know what happens here if you throw an exception in the realtime audio thread. Is it even caught somewhere? Does this run normally without the --debug-mode-sw option, or does it exit due to the exception?

I took a quick look in the juce code mentioned in the first backtrace and it does look as is there is a lock (listenerlock) in void AudioProcessorParameter::sendValueChangedMessageToListeners (float newValue) (in juce_AudioProcessor.cpp) that will be called in the setParameterCB. It is least worth digging into more and see what it need the ThreadLocalValue for.

Does it break in exactly the same spot when using ignore 1 1 and not? Maybe it could be 2 allocations close to each other that you only see as 2 separate when debugging? Also, if you’re not running a debug build of the plugin (some things look like they’re optimised out) you could try that and see if that helps.

But as you say, if you can verify that the mode switch only happens the first time the parameter is set, you might be able to live with it.

domenico · 4 2020 14:22

Hi,
it’s a real head-scratcher right?!

Yes the exception are not caught, they are there just for me to have a string printed on the screen as soon as the condition is met, only for this test. However they make no difference at all with regard to the aforementioned issues (I tried with and without the throws).

It exits due to the exception which was what I intended, It’s really just for me to quickly see that it did something.

Yes! So strange. It breaks precisely on the same spot: the backtrace is identical (I computed the diff to be sure).
It’s also a great suggestion to try with a debug build however I was already doing that. I quickly switched back to the release configuration to test things out and the behavior is identical (only the stacktrace is a little less informative).

Ruben · 4 2020 14:24

I don’t have much feedback towards problem solving. However when I made the forks for the community plugins I remember seeing this one mode switch the first time you modify a parameter on a plugin based on JUCE. From what I remember I only saw it the first time and after that the mode switches remained constant as you mentioned.

millind · 1 2022 09:40

true thatmemory alloction is tha main cause of mode switching