JUCE + Tracktion Engine on ELK Audio OS

I just tested with the patched int64 Time::getHighResolutionTicks(). Plugin seems to work fine, but I still see mode switches. Using gdb sushi_b64 and catching SIGXCPU this is the output I get:

elk-pi:~$ gdb sushi_b64
GNU gdb (GDB) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "aarch64-elk-linux".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from sushi_b64...(no debugging symbols found)...done.
(gdb) catch signal SIGXCPU
Catchpoint 1 (signal SIGXCPU)
(gdb) run -r --debug-mode-sw -c /home/mind/sushi_conf_tracktion_engine_test.json
Starting program: /usr/bin/sushi_b64 -r --debug-mode-sw -c /home/mind/sushi_conf_tracktion_engine_test.json
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
[New Thread 0x7ff707c090 (LWP 325)]
SUSHI - Copyright 2017-2020 Elk, Stockholm
SUSHI is licensed under the Affero GPL 3.0. Source code is available at github.com/elk-audio
[New Thread 0x7ff667a090 (LWP 326)]
[New Thread 0x7ff5e26090 (LWP 327)]
[New Thread 0x7ff5625090 (LWP 328)]
[New Thread 0x7ff4e24090 (LWP 329)]
[New Thread 0x7feffff090 (LWP 330)]
[New Thread 0x7fef7fe090 (LWP 331)]
[New Thread 0x7fede1d090 (LWP 332)]
[New Thread 0x7fed61c090 (LWP 333)]
[New Thread 0x7fece1b090 (LWP 334)]
[New Thread 0x7fd7fff090 (LWP 335)]
[New Thread 0x7fd77fe090 (LWP 336)]
[New Thread 0x7fd6ffd090 (LWP 337)]
[New Thread 0x7fd67fc090 (LWP 338)]
[New Thread 0x7fd5ffb090 (LWP 339)]
[New Thread 0x7fd57fa090 (LWP 340)]
[New Thread 0x7fd4ff9090 (LWP 341)]
[New Thread 0x7fb7fff090 (LWP 342)]
[New Thread 0x7fb77fe090 (LWP 343)]
Creating Default Controllers...
Finding MIDI I/O
MIDI output: MIDI Output (enabled)
opening MIDI out device:MIDI Output
MIDI input: MIDI Input
Audio block size: 512  Rate: 44100
Rebuilding Wave Device List...
Wave In: Input 1 (enabled): 0 (L)
Wave In: Input 2 (enabled): 1 (L)
Wave Out: Output 1 + 2 (enabled): 0 (L), 1 (R)
Default Wave Out: Output 1 + 2
Default MIDI Out: MIDI Output
Default Wave In: Input 1
Default MIDI In: 
[New Thread 0x7fb6ffd090 (LWP 344)]
[New Thread 0x7fb67fc090 (LWP 345)]
[New Thread 0x7fb5ffb090 (LWP 346)]
Rebuilding Wave Device List...
Wave In: Input 1 (enabled): 0 (L)
Wave In: Input 2 (enabled): 1 (L)
Wave Out: Output 1 + 2 (enabled): 0 (L), 1 (R)
Default Wave Out: Output 1 + 2
Default MIDI Out: MIDI Output
Default Wave In: Input 1
Default MIDI In: 
Rebuilding Wave Device List...
Wave In: Input 1 (enabled): 0 (L)
Wave In: Input 2 (enabled): 1 (L)
Wave Out: Output 1 + 2 (enabled): 0 (L), 1 (R)
Default Wave Out: Output 1 + 2
Default MIDI Out: MIDI Output
Default Wave In: Input 1
Default MIDI In: 
[New Thread 0x7fb57fa090 (LWP 347)]
[New Thread 0x7fb4ff9090 (LWP 348)]
[Thread 0x7fb57fa090 (LWP 347) exited]
[New Thread 0x7fb57fa090 (LWP 349)]
[New Thread 0x7ff403b090 (LWP 350)]
[New Thread 0x7f97fff090 (LWP 351)]
[New Thread 0x7f977fe090 (LWP 352)]
[New Thread 0x7f96ffd090 (LWP 353)]
[New Thread 0x7f967fc090 (LWP 354)]
[New Thread 0x7f95ffb090 (LWP 355)]
[Switching to Thread 0x7ff403b090 (LWP 350)]

Thread 27 "sushi_b64" hit Catchpoint 1 (signal SIGXCPU), futex_wake (private=<optimized out>, processes_to_wake=2147483647, futex_word=<optimized out>)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:231
231	../sysdeps/unix/sysv/linux/futex-internal.h: No such file or directory.
(gdb) 
(gdb) bt
#0  futex_wake (private=<optimized out>, processes_to_wake=2147483647, futex_word=<optimized out>) at ../sysdeps/unix/sysv/linux/futex-internal.h:231
#1  __pthread_cond_broadcast (cond=0x7fe40515a8) at pthread_cond_broadcast.c:86
#2  0x0000007ff799369c in std::condition_variable::notify_all() () from /usr/lib/libstdc++.so.6
#3  0x0000007fee38e4b4 in juce::WaitableEvent::signal() const () from /home/mind/EngineInPluginDemo.so
#4  0x0000007fee807810 in tracktion_engine::MultiCPU::ParallelMixOperation::perform() () from /home/mind/EngineInPluginDemo.so
#5  0x0000007fee7cdaa8 in tracktion_engine::MixerAudioNode::multiCpuRender(tracktion_engine::AudioRenderContext const&) ()
   from /home/mind/EngineInPluginDemo.so
#6  0x0000007fee7ce124 in tracktion_engine::MixerAudioNode::renderAdding(tracktion_engine::AudioRenderContext const&) () from /home/mind/EngineInPluginDemo.so
#7  0x0000007fee7cb6e0 in tracktion_engine::WaveOutputDeviceInstance::fillNextAudioBlock(tracktion_engine::PlayHead&, tracktion_engine::EditTimeRange, float**, int) () from /home/mind/EngineInPluginDemo.so
#8  0x0000007fee7dfddc in tracktion_engine::EditPlaybackContext::fillNextAudioBlock(tracktion_engine::EditTimeRange, float**, int) ()
   from /home/mind/EngineInPluginDemo.so
#9  0x0000007fee7efa6c in tracktion_engine::DeviceManager::audioDeviceIOCallback(float const**, int, float**, int, int) ()
   from /home/mind/EngineInPluginDemo.so
#10 0x0000007fee8dc020 in juce::AudioDeviceManager::audioDeviceIOCallbackInt(float const**, int, float**, int, int) () from /home/mind/EngineInPluginDemo.so
#11 0x0000007fee7f1000 in tracktion_engine::HostedAudioDeviceInterface::processBlock(juce::AudioBuffer<float>&, juce::MidiBuffer&) ()
   from /home/mind/EngineInPluginDemo.so
#12 0x0000007fee1fd9c4 in EngineInPluginDemo::processBlock(juce::AudioBuffer<float>&, juce::MidiBuffer&) () from /home/mind/EngineInPluginDemo.so
#13 0x0000007fee1f1c78 in void JuceVSTWrapper::internalProcessReplacing<float>(float**, float**, int, JuceVSTWrapper::VstTempBuffers<float>&) ()
   from /home/mind/EngineInPluginDemo.so
#14 0x000000555560c094 in ?? ()
#15 0x00000055555bf468 in ?? ()
#16 0x00000055555bfd60 in ?? ()
--Type <RET> for more, q to quit, c to continue without paging--
#17 0x00000055555b1d58 in ?? ()
#18 0x00000055555a59ec in ?? ()
#19 0x000000555562ffb8 in ?? ()
#20 0x0000007ff7e9c0c8 in ?? () from /usr/xenomai/lib/libcobalt.so.2
#21 0x0000007ff7da5824 in start_thread (arg=0x7ffffff3c6) at pthread_create.c:486
#22 0x0000007ff77717bc in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
(gdb) c
Continuing.

Thread 27 "sushi_b64" hit Catchpoint 1 (signal SIGXCPU), futex_wake (private=<optimized out>, processes_to_wake=2147483647, futex_word=<optimized out>)
    at ../sysdeps/unix/sysv/linux/futex-internal.h:231
231	in ../sysdeps/unix/sysv/linux/futex-internal.h
(gdb) c
Continuing.

[2]+  Stopped                 gdb sushi_b64

Does this look familiar? Looks like juce::WaitableEvent::signal() might be triggering some system call?

Definitely, it looks like some sort of condition variable signaling, so it will trigger a mode switch for sure.

Do you need the signaling from RT to non-RT, in the opposite direction or both?

I am working on something right now that requires some sort of RT->non-RT condition variable signaling but it will take a while to package inside TWINE. I can give you some pointers if you want to hack around a temporary solution, though.
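As a rough illustration of a temporary RT->non-RT workaround (a minimal sketch, not the TWINE feature mentioned above and not JUCE's or Tracktion's code): the RT thread only increments an atomic counter, and a non-RT thread polls it instead of waiting on a condition variable. `PolledEvent` and its method names are hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper, not part of JUCE, Tracktion Engine or TWINE.
// The RT thread records that something happened; a non-RT thread polls for it.
class PolledEvent
{
public:
    // Called from the RT thread: a single atomic increment, no system call
    // (assuming std::atomic<std::uint32_t> is lock-free on the target).
    void signal() noexcept
    {
        pending_.fetch_add(1, std::memory_order_release);
    }

    // Called from the non-RT thread, for example from a timer or from a loop
    // that sleeps between polls. Returns how many signals arrived since the
    // previous call and resets the count to zero.
    std::uint32_t consume() noexcept
    {
        return pending_.exchange(0, std::memory_order_acquire);
    }

private:
    std::atomic<std::uint32_t> pending_{0};
};
```

The trade-off is latency: the non-RT side reacts only at its next poll, so this fits housekeeping work better than anything time-critical.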

I think this is now going beyond the frontiers of my knowledge, as I don’t understand most of this hehe. Since the plugin still works fine, I’ll continue development for now without fixing this and maybe have a deeper look when/if it becomes a problem.

That said, looking at the stack trace it occurred to me that maybe the mode switches are happening because tracktion engine is trying to parallelize some tasks and the RT kernel does not support that. Do you think this makes any sense? Because if it does, I guess there’s some easy way to tell tracktion not to parallelize.

I was able to compile libusb as part of my project, pretty easily :slight_smile:

I’m now doing some tests deploying my app on the Elk board. Things don’t work as expected so far (it seems to hang), but I need to investigate further before posting any specific errors. My app is trying to communicate with Push using USB MIDI ports and JUCE’s MidiInput::openDevice functions. I seem to recall you mentioning this could be a problem? (My app works fine when compiled on macOS and with Raspbian.)

Also another question I have is what part of the plugin is run in the RT thread and what part in the rest. Is it only JUCE’s PluginProcessor::processBlock that happens in RT kernel?

Anything that uses “normal” Posix threads for parallelization inside an RT thread will definitely cause mode switches. The solution is either to disable them as you suggested, to implement the parallelization using TWINE’s worker thread pool, or to use Xenomai’s Cobalt APIs to convert the Posix calls.

The issue might be there if a plugin opens up MIDI ports, which should be the host’s responsibility. I think you should be able to debug and test this using the AppImage of SUSHI for normal Linux, too.

Correct, that’s the only callback in a plugin that runs in the RT context.

Thanks, I’ll continue investigating and try to narrow down the issues, maybe opening new forum threads if needed. I’ll also publish code and instructions to run tracktion engine on ELK as soon as I get in a position to do so.

Good luck frederic. I’d love to try out the Tracktion engine on the Elk whenever you make your code public.

Thanks, I’ll let you know. I’m making some progress, but I’m stumbling across weird behaviours like ELK AudioOS hanging the 2nd time in a row I run my app, but working the first time (and stuff like that). Also there are some kernel mode switches (MSW) that need to be fixed. I set tracktion engine to use a single thread, and now the MSWs have been reduced a lot, but there are still some I need to debug.

I think I’m getting closer but still more work to be done :slight_smile:

Actually @Stefano, about this issue of AudioOS hanging the 2nd time I run sushi with my plugin, do you have any guess about what might be happening? Clearly my app is quitting and leaving some stuff in a weird state… is there any command to “reset” the audio hardware that I could run as a test between sushi runs (might be related to Distorted/noisy audio)? Have you ever come across anything similar?

Hi @frederic,
it sounds more like it’s on our side (driver not closing down properly) than on yours.

We had something similar in the past which has been fixed but there might have been other situations on the Pi 4. Could you send us the output of sudo dmesg next time it happens?

I’ll do that (maybe tonight if I get some time). I’m testing with 2 tracktion engine based projects, a simpler one and another one that is a bit more complex (not too much). It only happens with one of them. In any case I’ll do more tests and share the output of sudo dmesg. Thanks!

Hi again,

I tried sudo dmesg but I don’t get too far, because when ELK hangs I cannot run it anymore. If I have it running beforehand (in watch mode) I see nothing special. In any case, I see audio open and audio close messages where it makes sense:

[Apr28 17:09] audio_rtdm: audio_open.     - here I run sushi
[  +4.664678] rtdm_event_wait failed          - here I hit ctrl+c to stop sushi
[  +0.007131] audio_rtdm: audio_close.     
[  +4.159085] audio_rtdm: audio_open.      - here I run sushi for the second time
                         - here OS freezes and I have to power off

This is one issue, but for this one I’m not super worried now. The other issue is what has to do with the MSW. I can see MSW growing with watch -n 0.5 cat /proc/xenomai/sched/stat although the plugin works. But after a while it crashes/hangs OS sometimes. I have the feeling this could be caused by too many MSWs? In any case @Stefano said MSW should be completely avoided so I continued my investigations. Here are some things I learned so far:

  • Running the tracktion EngineInPluginDemo project (with a small modification of adding a background audio file), I get many MSW, at a rate of “hundreds” per second.

  • If I configure tracktion engine to do audio computations in a single thread, the number of MSW is reduced a lot, now grows at a rate of ~5 per second. I found out about this potential problem with

  • If I use gdb with catch signal SIGXCPU, the debugger seems to stop execution at different points. Here are the functions I see in the backtraces:

    There are probably more, but these are the different ones I came across. Originally there was one related to parallelisation (the one I reported above in this thread), but was fixed by setting num threads to 1 for the tracktion engine.

  • I can find other time functions in JUCE codebase that call gettimeofday, but I don’t know if this is a problem.

I’m not sure how to continue fixing the MSWs. I guess the first step would be to replace the nanoseconds system call with something RT-safe, but I don’t know how to do that using twine. Then if the juce::ScopedLocks are a problem, I don’t know what to do with them, maybe talk to the JUCE/tracktion engine guys.

Hmmm looks like a clean use of tracktion engine in ELK is not going to be easy at all…

Hi @Stefano, any hints regarding my previous message? Thanks!

Hi @frederic,
sorry for the slow response.

Crashes of this kind are usually related to use of timer-related functions in the RT thread, from our experience. Having many MSWs will produce audio dropouts but shouldn’t crash the system.

The timer call here looks like the most probable candidate for the crashes…

If called from a RT context it is absolutely an issue.

The function to call to get current time in nanoseconds with TWINE is twine::current_rt_time().

Locks are not possible in a RT thread, period. They shouldn’t be used in a desktop application either, but here the Tracktion team (who know all these issues very well; their latest ADC talks are the best reference ever on the topic) might have been using them between two RT threads with the same priority, which is fine in a normal OS.
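For readers wondering what the usual lock-free alternative looks like, here is a minimal sketch of a single-producer/single-consumer ring buffer built on atomics. It is not code from JUCE, Tracktion Engine, SUSHI or TWINE; `SpscQueue` is a hypothetical name, and it assumes exactly one thread pushes and exactly one thread pops.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <type_traits>

// Hypothetical single-producer/single-consumer queue (illustration only).
// One slot is always left empty, so it holds at most Capacity - 1 items.
template <typename T, std::size_t Capacity>
class SpscQueue
{
    static_assert(Capacity >= 2, "need at least two slots");
    static_assert(std::is_trivially_copyable<T>::value,
                  "keep copies cheap and non-allocating for the RT thread");

public:
    // Producer side only. Returns false instead of blocking when full.
    bool push(const T& value) noexcept
    {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t next = (head + 1) % Capacity;
        if (next == tail_.load(std::memory_order_acquire))
            return false;                               // queue is full
        slots_[head] = value;
        head_.store(next, std::memory_order_release);   // publish the item
        return true;
    }

    // Consumer side only. Returns false instead of blocking when empty.
    bool pop(T& out) noexcept
    {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire))
            return false;                               // queue is empty
        out = slots_[tail];
        tail_.store((tail + 1) % Capacity, std::memory_order_release);
        return true;
    }

private:
    std::array<T, Capacity> slots_{};
    std::atomic<std::size_t> head_{0};  // next slot to write (producer)
    std::atomic<std::size_t> tail_{0};  // next slot to read (consumer)
};
```

Neither side ever waits for the other: a full or empty queue is reported to the caller, which decides whether to drop the item or retry on the next audio block.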

Thanks @Stefano, I thought twine::current_rt_time() was a sort of replacement for clock_gettime (CLOCK_MONOTONIC, &t); (sorry, I’m a real noob in this hehe). Also I’ll talk to the tracktion guys about the locks thing. However, I double-checked my text and saw I was wrong about the nanoseconds thing. The function I need to patch in JUCE is not nanoseconds but nanosleep:

void JUCE_CALLTYPE Thread::sleep (int millisecs)
{
    struct timespec time;
    time.tv_sec = millisecs / 1000;
    time.tv_nsec = (millisecs % 1000) * 1000000;
    nanosleep (&time, nullptr);
}

Also, is there a way to run sushi in the ELK board and all audio ins/outs without using the RT kernel? Would be useful for testing. I imagine the downside would be non guaranteed performance because the RT kernel would not be there with priority, etc, but that would be completely fine for testing purposes.

Hi @frederic,

Ok, we don’t have any sleep equivalent mapped in TWINE yet - typically RT threads never sleep but there is a use case for e.g. spinlocks.

If you want to implement it yourself quickly, take a look at the Xenomai / Cobalt headers that are included in TWINE’s implementation files and then simply replace nanosleep with __cobalt_nanosleep. That should do the job if you are already linking against TWINE, I think…
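A minimal sketch of how that swap could be structured, assuming a POSIX system: the millisecond conversion is the same as in the JUCE snippet quoted earlier, and the actual sleep call is passed in as a function pointer so the platform call can be replaced. `millisecondsToTimespec`, `SleepFn` and `sleepMilliseconds` are hypothetical names, and the idea that __cobalt_nanosleep has the same signature as nanosleep is an assumption to check against the Xenomai headers.

```cpp
#include <time.h>   // timespec, nanosleep (POSIX)

// Hypothetical helper: split a millisecond count into seconds + nanoseconds.
inline timespec millisecondsToTimespec(int millisecs) noexcept
{
    timespec t{};
    t.tv_sec  = millisecs / 1000;
    t.tv_nsec = static_cast<long>(millisecs % 1000) * 1000000L;
    return t;
}

// Signature shared by nanosleep and (assumed, not verified here) by
// Xenomai's __cobalt_nanosleep.
using SleepFn = int (*)(const timespec*, timespec*);

// Sleeps through the given function. The default is plain nanosleep, which
// causes a mode switch when called from a Xenomai RT thread; on ELK one would
// pass the Cobalt variant instead, as suggested above.
inline int sleepMilliseconds(int millisecs, SleepFn sleepFn = &nanosleep)
{
    const timespec t = millisecondsToTimespec(millisecs);
    return sleepFn(&t, nullptr);
}
```

In a patched JUCE build the body of Thread::sleep would then reduce to a single call to the replacement function; whether sleeping in the RT thread is a good idea at all is a separate question.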

Not easily on the Elk Pi shield. There is an ALSA driver for that codec but you’d need to tweak it for our CPLD and other things.

The easiest way to compare Elk w/ Xenomai VS Linux ALSA w/ PREEMPT_RT on the same hardware could be to get a HiFiBerry shield, which we have just added support for and for which there are already ALSA / PREEMPT_RT distributions that can be set up relatively easily.

Thanks, I’ll try the nanosleep thing.
About the testing without RT, I don’t think the HiFiBerry option would work because I’m interested in the multichannel audio i/o (and midi). In fact, I could test using Raspbian (and of course without the ELK Pi hat) and my app works perfectly, including communication with Ableton’s Push 2, drawing on screen, and tracktion engine as a plugin. But then of course I don’t have multichannel audio i/o. Let’s hope I can fix all the RT issues and have it working nicely on the ELK board, which is the goal :slight_smile:

Hi again @Stefano, I’m debugging a problem loading a file using tracktion engine with the help of Dave from the tracktion team (https://forum.juce.com/t/step-sequencer-working-on-macos-build-but-not-when-running-in-elk/39529/18). Apparently there is an issue with locks, but the code is not running in the RT thread. Are locks also not allowed in ELK outside the RT thread?

Hi @frederic,
locks outside the RT thread should not be an issue and, as a matter of fact, we use quite a few in SUSHI itself (using C++ std library).