JUCE + Tracktion Engine on ELK Audio OS

Hi there,

I’m trying to run a JUCE app that uses Tracktion Engine on ELK Audio OS (on a Raspberry Pi). I created a very simple example project that just plays an audio file using Tracktion Engine (with the engine configured inside a JUCE plugin). Here are the source files: https://www.dropbox.com/s/s63wc6cb0yxcrpm/TracktionEngineTestELK.zip?dl=1

The project compiles and works nicely on macOS. I’m able to cross-compile the project for ELK using Docker on macOS. I can run the generated VST2 plugin on the board, but Tracktion Engine hangs at some point during initialization. Comparing the terminal output on the board with the output on the desktop, I’d say it hangs when it tries to connect to the audio buses:

Output when running in macOS

JUCE v5.4.5
Finding MIDI I/O
MIDI output: Scarlett 6i6 USB
MIDI output: IAC Driver IAC Bus 1
MIDI output: Tracktion MIDI Device
MIDI input: Scarlett 6i6 USB
MIDI input: IAC Driver IAC Bus 1
Audio block size: 512  Rate: 44100
Rebuilding Wave Device List...
Wave In: Input 1 (enabled): 0 (L)
Wave In: Input 2 (enabled): 1 (L)
Wave In: Input 3: 2 (L)
Wave In: Input 4: 3 (L)
Wave In: SPDIF 1: 4 (L)
Wave In: SPDIF 2: 5 (L)
Wave Out: Output 1 + 2 (enabled): 0 (L), 1 (R)
Wave Out: Output 3 + 4: 2 (L), 3 (R)
Wave Out: Output 5 + 6: 4 (L), 5 (R)
Default Wave Out: Output 1 + 2
Default MIDI Out: 
Default Wave In: Input 1
Default MIDI In: 
Playing file from engine...

Output when running from ELK Audio OS (in the board)

SUSHI - Sensus Universal Sound Host Interface
Copyright 2016-2018 MIND Music Labs, Stockholm
Finding MIDI I/O
MIDI output: Midi Through Port-0
MIDI input: Midi Through Port-0
opening MIDI in device: Midi Through Port-0
opening MIDI out device:Midi Through Port-0

And it hangs endlessly. I can’t even stop the app with ctrl+c/ctrl+z or by killing the process.

The files I attached include a README with compilation instructions, but it should be no different from other JUCE plugins for ELK. The sushi config file I’m using is also included, but it is basically the same as the one from a demo project that worked perfectly for me.

Any suggestions on how I could debug this issue further?

Most likely it’s a call to Linux system timers like clock_gettime() done in the RT thread. This is RT-safe for normal Linux but hangs Xenomai.

Tracktion probably uses the JUCE Timer class for that. I’d suggest either looking at the source code or running SUSHI with the dummy frontend, attaching strace to the real-time thread and looking for timer-related syscalls generated by the process.

We provide a safe replacement for these functions in the TWINE library in case the functionality is needed by the plugin.

Interesting, that’s actually what you told me at ADC.

Searching for clock_gettime in the Tracktion repository returns nothing, but searching for it in the JUCE repository does return some matches (not many). Particularly interesting is Time::getHighResolutionTicks(), which is indeed used in the device manager of Tracktion Engine (see https://github.com/Tracktion/tracktion_engine/search?q=getHighResolutionTicks&unscoped_q=getHighResolutionTicks).

Do you think an option would be to patch Time::getHighResolutionTicks() in JUCE codebase to use the TWINE equivalent instead of patching tracktion engine?

Where can I find TWINE source and documentation?

Other timer-related syscalls that I found in the JUCE codebase are gettimeofday and nanosleep (neither of them appears in Tracktion Engine).

Hi @frederic,

You can patch either of them, my personal choice would be to keep those patches as close as possible and therefore maybe it’s better to do it in the Tracktion engine, which should also be easier to maintain than JUCE.

We’ll publish the repo on Github next week, for the moment you should have a version inside the VM image in the ~/work/ directory.

The issue is more where these are called. If a plugin calls them in a non-RT thread, it’s totally fine to have them, the problem is only for those that are done in the RT callback.

Thanks for the quick answer!
Good that I did not delete the VM yet :slight_smile:

Got the twine folder. I’ll play around with this and see if I can get it working. I suggested patching JUCE because I would only need to patch the wrapper to the system call and it looks easier than replacing several calls in tracktion. Maybe I can try both options and see which one looks cleaner.

As for the RT/non-RT thread, I’m no expert at all in these matters and I don’t know how JUCE/sushi/whoever decides when a thread should run in RT or not, but I guess making all timer syscalls RT safe should make everything safer right? That’s what I guess I’d get by patching JUCE directly. Maybe there’s a performance overhead?

Anyway, I’ll let you know what I come up with…

That’s actually a valid point. The relevant bits in TWINE are this function:

std::chrono::nanoseconds current_rt_time();

Replacing everything shouldn’t hurt but the only replacement we provide in TWINE is for the high resolution timers, which are the ones typically used in a RT thread.

Short story, the RT thread is created by SUSHI and is the one that calls back into the process* function exposed by your plugin. In TWINE there’s another helper function:

bool is_current_thread_realtime();

that you can use as Red-Pill / Blue-Pill test, i.e. asking “am I running in a real-time context?”
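A minimal sketch of how these two helpers might be combined, so the RT-safe clock is only used when the caller really is in a real-time context. The twine_stub namespace is a stand-in invented so the example compiles without TWINE: only the two function signatures come from the declarations quoted above, the stub bodies are placeholders, and safe_ticks_us is a hypothetical name.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Stand-in for TWINE so this compiles anywhere. Only the two signatures are
// taken from the thread; the bodies are placeholders, NOT the real library.
namespace twine_stub {
inline bool is_current_thread_realtime() { return false; }

inline std::chrono::nanoseconds current_rt_time() {
    return std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::steady_clock::now().time_since_epoch());
}
}  // namespace twine_stub

// Hypothetical helper: microsecond ticks that avoid the regular Linux clock
// when called from a real-time context.
inline std::int64_t safe_ticks_us() {
    using namespace std::chrono;
    if (twine_stub::is_current_thread_realtime()) {
        // RT thread: use the RT-safe clock provided by the library.
        return duration_cast<microseconds>(twine_stub::current_rt_time()).count();
    }
    // Non-RT thread: the regular monotonic clock is fine here.
    return duration_cast<microseconds>(steady_clock::now().time_since_epoch()).count();
}
```

With the real library, the stub namespace would be replaced by the TWINE header and the twine:: functions; the branching itself would stay the same.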

aha!

Look at the source code for Time::getHighResolutionTicks() in JUCE:

int64 Time::getHighResolutionTicks() noexcept
{
   #if JUCE_BELA
    return rt_timer_read() / 1000;
   #else
    timespec t;
    clock_gettime (CLOCK_MONOTONIC, &t);
    return (t.tv_sec * (int64) 1000000) + (t.tv_nsec / 1000);
   #endif
}

I assume the Bela guys had exactly the same issue, and instead of using clock_gettime they use rt_timer_read, which is included somehow from the Bela code. Cool! So basically I could most probably get away with adding an extra branch here, something like #if JUCE_ELK, which would use the TWINE current_rt_time method (dividing by 1000 to go from nanoseconds to microseconds, which is the unit the existing code returns). Then I’ll also have to put #if JUCE_ELK somewhere else to actually include the TWINE library.
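A rough sketch of what such a dedicated branch could look like, written as a free function rather than as a patch to JUCE itself. JUCE_ELK is a hypothetical flag (JUCE does not define it), the function name high_resolution_ticks_us is made up, and the TWINE include path is an assumption to check against the TWINE sources. The flag defaults to 0 so the sketch builds on a regular desktop machine.

```cpp
#include <cassert>
#include <cstdint>
#include <time.h>

// JUCE_ELK is a hypothetical flag; it is not part of JUCE.
#ifndef JUCE_ELK
 #define JUCE_ELK 0
#endif

#if JUCE_ELK
 // Assumed include path; verify it against the TWINE sources.
 #include <twine/twine.h>
#endif

// Returns a monotonic tick count in microseconds.
inline std::int64_t high_resolution_ticks_us() noexcept {
#if JUCE_ELK
    // current_rt_time() returns std::chrono::nanoseconds; /1000 gives microseconds.
    return twine::current_rt_time().count() / 1000;
#else
    // Regular POSIX path, same arithmetic as the JUCE code quoted above.
    struct timespec t {};
    clock_gettime(CLOCK_MONOTONIC, &t);
    return (static_cast<std::int64_t>(t.tv_sec) * 1000000) + (t.tv_nsec / 1000);
#endif
}
```

Keeping the flag separate from JUCE_BELA means none of the other Bela-specific code paths get switched on by accident.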

Oh interesting and not surprising at all :slight_smile:

Then I guess the quickest way will be just to define JUCE_BELA and use that. I can’t imagine OTOH other things that are set by that #define and that could potentially break JUCE for Elk.

Edit: well, most likely that flag depends on having the Bela toolchain and other Bela-related code available, so it’s probably better to use a dedicated #define like you were suggesting and use TWINE.

Hi again,

I managed to compile the project with a patched JUCE that uses twine. This is what I did as a quick hack:

int64 Time::getHighResolutionTicks() noexcept
{
   #if JUCE_BELA
    return rt_timer_read() / 1000;
   #else
    return twine::current_rt_time().count() / 1000;
    //timespec t;
    //clock_gettime (CLOCK_MONOTONIC, &t);
    //return (t.tv_sec * (int64) 1000000) + (t.tv_nsec / 1000);
   #endif
}

After some trials I could compile using this .count() method on the current_rt_time() return value. Not sure if it is the correct thing to do. In any case, when running the newly compiled version of the project I still see the same hanging issue. Some ideas I have:

  1. getHighResolutionTicks is maybe patched OK, but there are still other calls to clock_gettime in JUCE which have not been patched. I also see Tracktion Engine calling Thread::sleep, which will use nanosleep under the hood. Do you know if nanosleep is RT-safe?
  2. For the Bela implementation they use rt_timer_read() from the alchemy library, which I think is part of Xenomai. I could try that instead of TWINE. Also, they only patch getHighResolutionTicks.
  3. Maybe the issue I’m having is unrelated to RT safety because, as far as I understand, calling a non-RT-safe method would decrease performance because of the kernel switch but would not completely hang the app, right? When trying to debug system calls using strace as suggested above, I see an endless loop of log messages apparently trying to do something, but I see no time-related syscalls that I can recognize. Here is a sample of the strace output when attached to a running sushi process: https://www.dropbox.com/s/wwc7b2xhyklnxk9/strace_process_sample.txt?dl=1

I guess the first thing I need to do is confirm that the problem is indeed caused by the time-related syscalls. I also need to know which ones are not RT-safe so I can track them all. I’m not sure if I’m running strace correctly though. What I do is run sushi in dummy mode (btw, why should it be dummy mode?), then use htop to get the process ID (many sushi processes appear in fact, but only two seem to be using CPU), and then run strace -p PID for the different sushi processes that seem to be doing something. This produces an output like the file I linked above. It also happens that by the time I manage to run strace my app has already hung, so maybe I should do this differently.
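On the .count() question: calling .count() on a std::chrono::nanoseconds value returns the raw number of nanoseconds, so dividing by 1000 gives microseconds, which matches what the original clock_gettime code returns. A small sketch that makes the unit explicit with std::chrono::duration_cast instead of a bare division (to_microsecond_ticks is a made-up helper name, not part of JUCE or TWINE):

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Converts a nanosecond duration into a microsecond tick count.
// duration_cast truncates toward zero, like the integer division by 1000.
inline std::int64_t to_microsecond_ticks(std::chrono::nanoseconds t) noexcept {
    return std::chrono::duration_cast<std::chrono::microseconds>(t).count();
}
```

For example, to_microsecond_ticks(std::chrono::nanoseconds(1500000)) gives 1500, the same value as 1500000 / 1000.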

Comments, suggestions?

Thanks for all the help! Hopefully at the end of the process I’ll be able to show a nice example app running tracktion engine in ELK so others will benefit as well :slight_smile:

Hi @frederic,
we are very interested indeed in having a plugin with the Tracktion engine running :slight_smile:

Could you try to run the plugin with the offline or dummy frontend on the Pi and see if it still hangs / crashes? If it does, we can rule out a Xenomai-specific issue.

Another thing that came to my mind were those messages regarding MIDI port connections that you posted. Is the plugin trying to access direct MIDI HW through ALSA? That is something we never tried and might cause some issues… Is it possible to disable this feature when compiling the Tracktion engine? You can still get MIDI from the host through the VST API.

The app also hangs when running sushi in dummy mode. So when running in dummy mode, it is not using the Xenomai kernel at all? That’s what’s in the docs, "Dummy, without any connection to audio I/O, useful to debug some real-time safety issues on normal Linux machines", but it was not 100% clear to me.

What you mention about MIDI seems to be indeed a problem because I think tracktion will use ALSA for MIDI instead of getting whatever input from the plugin. I’ll check if I can disable MIDI for tracktion engine and try running the app.

Only the RASPA frontend uses Xenomai; the dummy / offline / JACK frontends use normal POSIX threads.
Can you try running it on the SUSHI AppImage in a VM or similar? Another test would be to run the same Linux build using another VST host like Carla.

I did some more experiments. Basically I configured the app to not access MIDI directly but simply use the channels provided by the plugin host. In fact there is an example project for Tracktion Engine that does that [the example named EngineInPlugin in the tracktion repo] and I compiled that one for ELK. Now the app does not hang, but it reports a number of other problems and then ends in a segmentation fault. This is the output (it is the same using either dummy or RASPA in sushi):

SUSHI - Sensus Universal Sound Host Interface
Copyright 2016-2018 MIND Music Labs, Stockholm
Creating Default Controllers...
*** ERROR: Rogue call to triggerAndWaitForCallback()
*** ERROR: triggerAndWaitForCallback() unable to complete
*** ERROR: Rogue call to triggerAndWaitForCallback()
*** ERROR: triggerAndWaitForCallback() unable to complete
*** ERROR: Rogue call to triggerAndWaitForCallback()
*** ERROR: triggerAndWaitForCallback() unable to complete
*** ERROR: Rogue call to triggerAndWaitForCallback()
*** ERROR: triggerAndWaitForCallback() unable to complete
*** ERROR: Rogue call to triggerAndWaitForCallback()
*** ERROR: triggerAndWaitForCallback() unable to complete
*** ERROR: Rogue call to triggerAndWaitForCallback()
*** ERROR: triggerAndWaitForCallback() unable to complete
*** ERROR: Rogue call to triggerAndWaitForCallback()
*** ERROR: triggerAndWaitForCallback() unable to complete
*** ERROR: Rogue call to triggerAndWaitForCallback()
*** ERROR: triggerAndWaitForCallback() unable to complete
*** ERROR: Rogue call to triggerAndWaitForCallback()
*** ERROR: triggerAndWaitForCallback() unable to complete
*** ERROR: Rogue call to triggerAndWaitForCallback()
*** ERROR: triggerAndWaitForCallback() unable to complete
*** ERROR: Rogue call to triggerAndWaitForCallback()
*** ERROR: triggerAndWaitForCallback() unable to complete
Finding MIDI I/O
MIDI output: MIDI Output
MIDI input: MIDI Input
opening MIDI out device:MIDI Output
*** ERROR: Failed to open MIDI output MIDI Output
Audio block size: 512  Rate: 44100
Rebuilding Wave Device List...
Wave In: Input 1 (enabled): 0 (L)
Wave In: Input 2 (enabled): 1 (L)
Wave Out: Output 1 + 2 (enabled): 0 (L), 1 (R)
Default Wave Out: Output 1 + 2
Default MIDI Out: 
Default Wave In: Input 1
Default MIDI In: MIDI Input
Rebuilding Wave Device List...
*** ERROR: Rogue call to triggerAndWaitForCallback()
*** ERROR: triggerAndWaitForCallback() unable to complete
Wave In: Input 1 (enabled): 0 (L)
Wave In: Input 2 (enabled): 1 (L)
Wave Out: Output 1 + 2 (enabled): 0 (L), 1 (R)
Default Wave Out: Output 1 + 2
Default MIDI Out: 
Default Wave In: Input 1
Default MIDI In: MIDI Input
Rebuilding Wave Device List...
*** ERROR: Rogue call to triggerAndWaitForCallback()
*** ERROR: triggerAndWaitForCallback() unable to complete
Wave In: Input 1 (enabled): 0 (L)
Wave In: Input 2 (enabled): 1 (L)
Wave Out: Output 1 + 2 (enabled): 0 (L), 1 (R)
Default Wave Out: Output 1 + 2
Default MIDI Out: 
Default Wave In: Input 1
Default MIDI In: MIDI Input
Segmentation fault

My next step will be to investigate what this triggerAndWaitForCallback method is and also to try to compile and run this project in the Linux VM.

EDIT: here is the code for triggerAndWaitForCallback: https://github.com/Tracktion/tracktion_engine/blob/5b281a4410c5cb6469170f1636f792e941254e4c/modules/tracktion_engine/utilities/tracktion_AsyncFunctionUtils.h#L125 It is used to call something on the message thread.

Is this the example you were referring to:

?

It seems like it still tries to access MIDI devices outside the host in setupInputs().

Another fishy thing from the output you pasted is this one:

is the plugin trying to set up audio through ALSA directly?

Yes this is the example.

I think these devices will be the ones returned by the host, as Tracktion Engine runs under the “plugin behaviour”. I did try modifying the example to not call the device manager and the result is the same. I’ll ask in the JUCE forums to see if I can get more info.

Again, it should not, because the whole point of having Tracktion Engine inside a plugin is that it only sees audio/MIDI in/out from the host, and therefore this should not happen. I’ll also ask in the JUCE forums to get more info.

I did an experiment modifying the example so that I don’t run setupInputs or initialize the synth plugin, and I simply load and play an audio file in a Tracktion Engine track (similar to the first testing code I was using). I get the segmentation fault as soon as I trigger the play function in Tracktion.

The function that returns an error is (I think) used to dispatch something on the message thread. I assume this is not the RT thread (or should not be). It looks like the issue could be related to that.

I got answers in the JUCE forum: Tracktion Engine won’t try to access MIDI/audio devices when running as a plugin https://forum.juce.com/t/using-the-tracktion-engine-within-a-plugin/32563/11

I did a couple more experiments with the engine example, but I’m always getting these *** ERROR: Rogue call to triggerAndWaitForCallback() errors and a segmentation fault as soon as I call Tracktion Engine’s transport play function.

The next step will be to compile the same project and run it in a VM with the SUSHI AppImage.

Extra quick question: copying compiled plugins to the ELK board over ssh/ethernet is veeeeery slow. Do you know if there is an easy way to speed that up?

Thanks!

Hi @frederic,

Running inside the VM is a good idea, so at least you can work out which issues come from SUSHI and which are Xenomai-specific. If this is the example Tracktion project, could you share the code? That way we can give it a try, too.

Yes, that’s due to the Ethernet going through the USB HUB in the RPi, which is terribly slow especially when set at USB 1.1.

You can either:

  • Remove the option dwc_otg.speed=1 from /boot/config.txt. This will set the USB back to 2.0 but then USB MIDI controllers might drop some packets (e.g. hanging notes)
  • Use WiFi which doesn’t go through the USB hub and is significantly faster

Here is the code I’m using: https://www.dropbox.com/s/du100sgwa3kihp5/engine_in_plugin_example_to_share.zip?dl=1
The Makefile will work using the cross-compilation toolchain. I also included the JUCE headless and the VST2 SDK that I’m using, but I guess you don’t really need those.

You’ll find the Projucer file and Linux Makefile in tracktion_engine/examples/projects/EngineInPluginDemo
I removed the other engine examples to simplify the file list.

I’m currently compiling this in a VM to see if it works. I’ll let you know…

UPDATE: I could compile in the VM. When running with sushi -d I get the same output, with all the *** ERROR: Rogue call to triggerAndWaitForCallback() errors. At the end it says Segmentation fault (core dumped).

If I run it with -j (JACK) it fails while trying to initialize the frontend; maybe there is some JACK thing I have to configure. In any case I’ll focus on the -d option for now, as it is curious that the error is the same as on ELK. I’ll try running other VST hosts and see…

Interesting, thanks. I’ll take a look during the weekend, but it’s good that it already fails in a simple-to-analyze case like the dummy frontend.

Be cautious about distributing the VST2 SDK! It is not allowed by Steinberg…
Not a huge issue while this forum still has restricted access, but better to remove it before next week, when this area will be public.