FIFO Error -1074397140 During PCIe-1429 Grab

Updated Jul 3, 2018

Reported In

Hardware

  • PCIe-1429

Driver

  • NI-IMAQ
  • Vision Acquisition Software

Issue Details

When I do a grab, ring, or sequence with my PCIe-1429, I receive an error that states:

Error -1074397140: FIFO overflow caused acquisition to halt.

How can I avoid it?

Solution

While it may be impossible to eliminate this error on certain motherboards, the following suggestions may alleviate the problem:

System Selection
Select a workstation-class computer. These computers typically allow a maximum packet payload size of 256 bytes, versus the 128-byte maximum that is typical of desktop-class systems.

System Setup
  • Make sure that the memory installed in your system is paired correctly. Most modern chipsets can simultaneously access two memory modules if they are installed in the correct memory slots. Check your motherboard documentation to verify that your memory is installed in the correct slots. Otherwise, your memory throughput will be halved.
  • Disable any power management features in your computer’s BIOS.
  • Try different PCI Express slots on your motherboard. Select the interface in MAX and verify that the slots negotiate at x4 lanes.
  • Upgrade to the latest BIOS for your motherboard.
  • If your motherboard has a built-in Ethernet NIC, consider disabling it in the Windows Device Manager and using a plug-in Ethernet NIC. In some motherboards, built-in Ethernet NICs are given higher priority access to host memory than devices in the IO slots.
  • If your motherboard has an integrated video card, consider disabling it and using a PCI or PCIe video card instead. In some motherboards, integrated video cards use the main memory to store frame buffers, which tremendously increases the load on the memory controller. Plug-in cards, by contrast, have their own onboard memory to store frame buffers.
  • If you have a PCI Express graphics card:
    • Use a display resolution that is native, not interpolated, for your graphics card. See your graphics card's documentation for details. Sometimes smaller resolutions are not native resolutions of the card and therefore require CPU involvement.
    • Reduce the display resolution.
    • Replace your PCI Express graphics card with a PCI graphics card so that the PCIe-1429 can use the graphics slot. In some computers, this improves performance because the graphics slot connects to the North Bridge and gets better memory access than slots on the South Bridge. Verify that the link negotiates to x4. However, in many Intel chipsets, even if the link negotiates to x4, the buffer sizes in the two directions are mismatched (optimized for video output, not input). As a result, in some computers, the graphics slot may underperform other IO slots.
    • Disable Windows visual effects that create heavy traffic to the graphics card (such as animating windows when minimizing or maximizing). Go to Start»Control Panel»System»Advanced tab»Visual Effects tab. Select the Adjust for best performance radio button and click OK. Or, selectively disable options per your preference.
  • Remove other PCI Express cards from your computer.

Camera
If your camera can reduce its pixel clock rate while still achieving your desired frame rate, change this setting in the camera. This reduces the “peakiness” of the data load to the frame grabber.
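The effect of the pixel clock on “peakiness” can be sketched with a small calculation. The following is a minimal illustration only: the function name camera_rates and the camera figures in the usage note are hypothetical, and blanking intervals are ignored. It compares the average data rate needed for a given frame rate with the peak rate set by the pixel clock.

```python
def camera_rates(width, height, bytes_per_pixel, frame_rate_hz,
                 pixel_clock_hz, taps):
    """Return (average_mb_s, peak_mb_s) for a camera.

    average: data the camera must deliver per second to hold the frame rate.
    peak:    data rate while pixels are actually being clocked out.
    Simplifying assumption: blanking intervals are ignored.
    """
    average_mb_s = width * height * bytes_per_pixel * frame_rate_hz / 1e6
    peak_mb_s = pixel_clock_hz * taps * bytes_per_pixel / 1e6
    return average_mb_s, peak_mb_s


if __name__ == "__main__":
    # Hypothetical 8-tap, 8-bit camera at 1280 x 1024 and 200 frames/s.
    for clock_hz in (85e6, 40e6):
        avg, peak = camera_rates(1280, 1024, 1, 200, clock_hz, 8)
        print(f"{clock_hz / 1e6:.0f} MHz: average {avg:.1f} MB/s, peak {peak:.1f} MB/s")
```

In this hypothetical case the average rate is the same at both clock settings, but the lower pixel clock roughly halves the peak rate that the frame grabber and bus must absorb.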

Application design
  • Do as few image copies as possible. For example, a ring application does fewer copies than a grab application.
  • Do not display images if it is not necessary. When display is necessary, update it only on an as-needed basis rather than for every acquired image. (This reduces competition between the processor and IO devices for access to host memory.)

Additional Information

When is the error generated?
When an acquisition is in progress, the NI PCIe-1429 continually acquires data from the camera and stores it in memory on the board. At the same time, it continually negotiates with the PCI Express bus for permission to send this data to the computer’s main memory in a First-In-First-Out (FIFO) fashion. If the PCI Express bus is unable to accept data at a rate at least as fast as the incoming rate from the camera, the onboard memory will eventually fill up and the FIFO overflow error is generated.
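As a rough illustration of this behavior, the sketch below estimates how long the onboard memory lasts when the bus drains data more slowly than the camera delivers it. The function name seconds_until_overflow and the 1,000,000-byte FIFO size are hypothetical values chosen for the example, not specifications of the PCIe-1429.

```python
def seconds_until_overflow(fifo_bytes, camera_mb_s, bus_mb_s):
    """Return seconds until the onboard FIFO fills, or None if the bus keeps up.

    camera_mb_s: sustained incoming rate from the camera (MB/s).
    bus_mb_s:    sustained rate at which the bus accepts data (MB/s).
    """
    net_fill_bytes_per_s = (camera_mb_s - bus_mb_s) * 1e6
    if net_fill_bytes_per_s <= 0:
        return None  # bus is at least as fast as the camera: no overflow
    return fifo_bytes / net_fill_bytes_per_s


if __name__ == "__main__":
    # Hypothetical FIFO size; a 680 MB/s camera against a 600 MB/s bus.
    print(seconds_until_overflow(1_000_000, 680, 600))
    print(seconds_until_overflow(1_000_000, 680, 700))
```

Even a modest sustained shortfall fills a small buffer within a fraction of a second, which is why the error can appear almost immediately after the acquisition starts.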
 
What causes the error? I thought PCI Express had more than enough dedicated bandwidth.
When comparing the raw PCI Express bandwidth, it appears that there will always be more than enough to accommodate any full configuration CameraLink camera. The peak throughput of a x4 Gen 1 PCI Express link is 1 GB/s (1000 MB/s, or 250 MB/s per lane in each direction), and a full configuration camera can generate at most 680 MB/s. However, the peak bandwidth can never be maintained indefinitely, and there are additional factors that must be considered.

Bus Overhead
There are several types of overhead on the PCI Express bus, but we’ll take a look at the largest one: the packet header. For each packet sent across the bus, some time on the bus must be used to transmit header information. For the optimal type of packet for sending image data to host memory, the PCI Express standard requires a per-packet header of 20 bytes. As the table below shows, this can reduce the achievable throughput far below 1000 MB/s for small packet sizes.

Packet Payload Size (bytes)    Peak Packet Throughput (MB/s)
8                              286
16                             444
32                             615
64                             762
128                            865
256                            928
512                            962
1024                           981
2048                           990
4096                           995


The maximum packet size is negotiated between the two devices sharing a PCI Express link. As of 2008, most desktop chipsets allow at most 128-byte packets, and most workstation and server chipsets allow at most 256-byte packets. So for a typical desktop PC, the maximum achievable throughput is 865 MB/s, even when the packet header is the only overhead taken into account.
 
There are also other types of overhead including acknowledgement packets, update flow control packets and alignment packets that are beyond the scope of this article.
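The figures in the table can be reproduced with a short calculation. This is a sketch under two assumptions: the link peak is taken as 1000 MB/s (4 lanes at 250 MB/s each), which is the value that matches the table, and the 20-byte header is the only overhead considered. The function name peak_packet_throughput is illustrative.

```python
HEADER_BYTES = 20         # per-packet header size quoted in this article
LINK_PEAK_MB_S = 1000.0   # assumption: x4 Gen 1 link, 4 lanes x 250 MB/s


def peak_packet_throughput(payload_bytes, header_bytes=HEADER_BYTES,
                           link_peak_mb_s=LINK_PEAK_MB_S):
    """Return payload throughput in MB/s when every packet carries a header."""
    return link_peak_mb_s * payload_bytes / (payload_bytes + header_bytes)


if __name__ == "__main__":
    for payload in (8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096):
        print(f"{payload:5d} bytes: {peak_packet_throughput(payload):4.0f} MB/s")
```

Only the fraction payload / (payload + header) of the link time carries image data, so doubling the payload from 128 to 256 bytes recovers a noticeable share of the bandwidth lost to headers.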
  
Bus Hierarchy
The PCI Express bus architecture is, at its lowest level, a series of point-to-point connections where each device has its own dedicated link to its partner that is not shared with other devices. While it seems like each device would be fully independent, this is not usually how systems are actually built.
 
In order to get to host memory, a PCI Express packet typically traverses multiple PCI Express links. In a simple computer, the non-graphics PCI Express IO slots might connect to the South Bridge of the chipset while the x16 slot intended for graphics connects to the North Bridge of the chipset. In these systems, the packet in an IO slot would go from the image acquisition device to the South Bridge, from the South Bridge to the North Bridge, and from the North Bridge to the host memory. In other systems, there might be a more substantial bus hierarchy. The IO slots might connect to a switch which might connect to other switches before eventually finding a connection to the South Bridge and from there to the North Bridge and host memory.
 
At each level of the hierarchy, switches or bridges are tasked with combining traffic from the image acquisition device with traffic from the devices on other connected links. These devices might be plugged into other IO slots or they might be built into the motherboard (an Ethernet interface, for example). If each of the connections moving closer to the host memory were sized as large as the sum of all the links below it (plus some extra capacity to allow for switching overhead), then there would be no throughput loss due to the bus hierarchy. However, since very few systems call for all the devices to transmit at full rate all the time, system designers don’t design systems this way. Doing so would add significant cost to every computer made without benefit to the average user. As a result, the number of steps in the hierarchy and the size of the connections all the way back to host memory can have an impact on high-end performance. Workstations and servers are typically designed with a larger “pipe” size all the way back to the host memory than desktop computers.
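One way to picture this sizing trade-off is as an oversubscription ratio: the sum of the downstream link rates divided by the rate of the shared upstream connection. The sketch below uses a hypothetical topology and made-up link rates, and the function name oversubscription_ratio is illustrative.

```python
def oversubscription_ratio(downstream_mb_s, uplink_mb_s):
    """Return total downstream capacity divided by the shared uplink capacity.

    A ratio above 1.0 means the devices below cannot all transmit at full
    rate at the same time without waiting for the uplink.
    """
    return sum(downstream_mb_s) / uplink_mb_s


if __name__ == "__main__":
    # Hypothetical bridge: one x4 slot, two x1 slots, and a built-in device,
    # all sharing a single 1000 MB/s connection toward host memory.
    links = [1000, 250, 250, 500]
    print(oversubscription_ratio(links, 1000))
```

A ratio well above 1.0 is harmless while the other devices are idle, but it leaves less headroom for the image acquisition device whenever they become active.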
 
Bus Latency
We discussed that the throughput between the switches, chipset and the memory controller is less than the sum of all the throughputs of the individual PCI Express links. An additional outcome of all IO devices and the processor eventually needing to get to the same resource (host memory) is that there will be multiple devices wanting access to host memory at the same time. As a result, there can be a time delay between when a device requests access and when it receives it.

PCI Express devices all have a small amount of memory to “wait out” this delay. However, this memory is not infinite. The North Bridge will queue some transactions. When its memory gets full, it tells the South Bridge to stop sending it new data. When the South Bridge gets full, it tells the switch to stop sending new data, and so on down the line until the last switch connected to the IO slot tells the device in the IO slot to stop sending data. Once this happens, the image acquisition device starts filling its memory until eventually it, too, is full. When the image acquisition device is full, it doesn’t have a way to tell the camera that it is full. New image data keeps coming, but there is nowhere to store it, so the board returns the FIFO overflow error to notify the user.

This is different from the user experience when other devices run out of memory. For example, on an Ethernet device the user doesn’t even know that the device filled its memory. Like an image acquisition board, the TCP/IP protocol used over Ethernet checks for missing data. However, when missing data is found on Ethernet, a request is sent to the sending device to retransmit the data. Usually, by the time that the data is retransmitted, there is space to hold it and the application continues with only a minor delay. In CameraLink, however, there is no mechanism to ask the camera to retransmit the data.
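The role of latency can be illustrated with a toy simulation in which the bus is faster than the camera on average but is occasionally told to stop sending. This is a sketch only: the function name simulate_fifo, the tick-based model, and all sizes and rates are hypothetical and do not describe the actual PCIe-1429 hardware.

```python
def simulate_fifo(fifo_bytes, camera_bytes_per_tick, bus_bytes_per_tick,
                  stalled_ticks, total_ticks):
    """Step a FIFO through time; return the tick at which it overflows, or None.

    stalled_ticks: set of ticks during which the device may not send data
                   upstream (it has been told to stop).
    """
    level = 0
    for tick in range(total_ticks):
        level += camera_bytes_per_tick            # the camera never pauses
        if tick not in stalled_ticks:
            level -= min(level, bus_bytes_per_tick)  # drain when permitted
        if level > fifo_bytes:
            return tick                           # nowhere to store new data
    return None


if __name__ == "__main__":
    # Bus (150/tick) is faster than the camera (100/tick) when not stalled.
    short_stall = set(range(10, 15))   # 5 ticks: FIFO absorbs the delay
    long_stall = set(range(10, 22))    # 12 ticks: FIFO runs out of space
    print(simulate_fifo(1000, 100, 150, short_stall, 100))
    print(simulate_fifo(1000, 100, 150, long_stall, 100))
```

In this model the short stall is absorbed and the FIFO drains again afterward, while the longer stall overflows the FIFO even though the bus has spare bandwidth on average.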
  
Bus Power Management
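Current computer power management policies estimate when devices are idle in order to put them into lower-power states. If they guess incorrectly while an acquisition is in progress, this can cause performance issues and FIFO overflows, which is why disabling power management features in the BIOS is suggested above.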
