Sunday, September 7, 2014

Grasshopper3: Circular Buffer, Post Triggering, and Continuous Modes

Previously I have implemented a bare-bones RAM buffering program for the Grasshopper3 USB3 camera. The idea was to strip out all operations other than transferring raw image data from the USB3 port to RAM, so that the full frame rate of the camera can be buffered even on a Microsoft Surface2 tablet. While the RAM buffer is filling, no image conversion or saving to disk operations are going on. The GUI is running on another thread, and the image preview rate is held to 30Hz.

One-shot linear buffer with pre-trigger: 1) After triggering, frames are transferred into RAM (yellow). 2) RAM buffer is full. 3) Images are converted and saved to disk (not in real-time).
At the time I also tried to implement a circular buffer, where the oldest images are continuously overwritten in RAM. This allows for post-triggering, a common feature of high-speed cameras. The motivation for post-triggering is that the buffer is short (order of seconds) but you don't know when the exciting thing is going to happen. So you capture image data at full frame rate, continuously overwriting the oldest images in the buffer, until something exciting happens. The trigger can come after the action, stopping the image data capture and locking the previous N image frames in the buffer. The entire buffer can then be saved starting from the oldest image and ending at the trigger.

Circular buffer with post-trigger: 1) Buffer begins filling with frames. 2) After full, the oldest frame is overwritten with the newest. 3) A post-trigger stops buffering new frames and starts saving the previous N frames to disk.
It didn't work the first time I tried it; the frame rate would drop after the first time through the buffer. But I did a little code cleanup - now the first time through it constructs the array elements that make up the frame buffer, but on subsequent passes it assumes the structures are already in place and just transfers in data. This makes a wonderful flat-top trapezoidal wave of RAM usage that corresponds exactly with the allocated buffer size:

Post-triggering is not the only thing a circular buffer structure is good for. It can also be used as the basis for robust (buffered) continuous saving to disk. Assuming a fast enough write speed, frames can be continuously taken out of the buffer on a First-In First-Out (FIFO) basis and written to disk. I say "disk" but in order to get fast enough write speeds it really does need to be a solid-state drive. And even then it is a challenge.

For one, the sequential write speed of even the fastest SSDs struggles to keep up with USB3. To achieve maximum frame rate, the saved images must be in a raw format, both to keep the size down (one color per pixel, not de-Bayered) and to avoid having the processor bottleneck the entire process during image conversion. Luckily there is an option to just spit out raw Bayer data in 8- or 12-bit-per-pixel formats. IrfanView (my favorite image viewer) has a plug-in that is capable of parsing and color-processing raw images. The plug-in also works with the batch conversion portion of IrfanView, so you can convert an entire folder of raw frames.

The other challenge is that the operations required to save to disk take up processor time. In the FlyCap2 software that comes with the camera, the image capture loop has no trouble running at full frame rate, but turning on the saving operation causes the frame processing rate to drop on my laptop and especially on the MS Surface 2. To try to combat this problem, I did something I've never done before: actually intentionally write a multi-threaded application the right way. The image capture loop runs on one thread while the save loop runs on a separate thread. (And the GUI runs on an entirely different thread...) This way, a slow-down on the saving thread ideally doesn't cause a frame rate drop on the capture thread. The FIFO might fill up a little, but it can catch up later.

Continuous saving: Images are put into the RAM buffer on one thread (yellow) and removed from it to write to disk on another thread (green). This happens simultaneously and continuously as long as the disk write can keep up.
There's another interesting twist to the continuous-saving circular buffer: the frame rate in doesn't necessarily have to equal the frame rate out. For example, it's possible to buffer into RAM at 150fps but save only every 5th frame, for 30fps output. Then, if something exciting happens, the outgoing rate can be switched to 150fps temporarily to capture the high-speed action. If the write-to-disk thread can't keep up, the FIFO grows in size. As long as the outgoing rate is switched back to 30fps before the buffer is full, the excess FIFO elements can be unloaded later.

The key parameter for this continuous saving mode is the number of frames of delay between the incoming and the outgoing thread. The target delay could be zero, but then you would have to know in advance if you want to switch to high-speed saving. Setting the target delay to some number part-way through the buffer allows for post-triggering of the high-speed saving period, which seems more useful. I added a buffer graphic to the GUI to show both the target and the actual saving delay during continuous saving. My mind still has trouble thinking about when to turn the frame rate divider on and off, but I think it could be useful in some cases.

Here's some video I took to try out these new modes. It's all captured on the Microsoft Surface 2, so no fancy hardware required and it's all still very portable.

This is a simple test of the circular buffer at 1080p150 with post-trigger. The coin in particular is a good example of when it's nice to just leave the buffer running and post-trigger after a lucky spin gets the coin to land in frame.

More coin spinning, but this time using the continuous saving mode. Frames go into RAM at 150fps, but normally they are only written to disk at 30fps. When something interesting happens (such as the coin actually being in frame...), a burst of 150fps writing to disk is triggered. On the Surface 2, the write thread is slower than the read thread, so it can only proceed for a little while until the FIFO gets too full. Switching back to 30fps saving allows the FIFO to catch up.

Finally. a quick test of lower resolution / higher frame rate operation. At 480p, the frame rate can get up to 360+ fps. Buffering is fine at this frame rate (the overall data rate is actually lower). It actually doesn't require an insane amount of light either - the iPhone display is the only source of light here. You can see its IR proximity sensor LED flashing, as well as individual frame transitions on the display, behind the water stream. The maximum frame rate goes all the way up to 1100+ fps at 120p, sometime I have yet to try out.

That's it for now. The program (which started out as the FlyCapture2SimpleGUI source that comes with the camera) has a nice VC# GUI:

I can't distribute the source since it's derived from the proprietary SDK, but now you know it's possible and relatively easy to get it to capture and save efficiently with a bit of good programming. It was a fun project since I've never intentionally written interacting multi-threaded programs, other than maybe separating GUI threads from other things. I guess I'm only ten years or so behind on my application programming skills now...