Prevent Embedded Design Errors (CC 25th Anniversary Preview)

Attention, electrical engineers and programmers! Our upcoming 25th Anniversary Issue (available in early 2013) isn’t solely a look back at the history of this publication. Sure, we cover a bit of history. But the issue also features design tips, projects, interviews, and essays on topics ranging from user interface (UI) tips for designers to the future of small RAM devices, FPGAs, and 8-bit chips.

Circuit Cellar’s 25th Anniversary issue … coming in early 2013

Circuit Cellar columnist Robert Lacoste is one of the engineers whose essay will focus on present-day design tips. He explains that electrical engineering projects such as mixed-signal designs can be tedious, tricky, and exhausting. In his essay, Lacoste details 25 errors that once made will surely complicate (at best) or ruin (at worst) an embedded design project. Below are some examples and tips.

Thinking about bringing an electronics design to market? Lacoste highlights a common error many designers make.

Error 3: Not Anticipating Regulatory Constraints

Another common error is forgetting to plan for regulatory requirements from day one. Unless you’re working on a prototype that won’t ever leave your lab, there is a high probability that you will need to comply with some regulations. FCC and CE are the most common, but you’ll also find local regulations as well as product-class requirements for a broad range of products, from toys to safety devices to motor-based machines. (Refer to my article, “CE Marking in a Nutshell,” in Circuit Cellar 257 for more information.)

Let’s say you design a wireless gizmo with the U.S. market and later find that your customers want to use it in Europe. This means you lose years of work, as well as profits, because you overlooked your customers’ needs and the regulations in place in different locals.

When designing a wireless gizmo that will be used outside the U.S., having adequate information from the start will help you make good decisions. An example would be selecting a worldwide-enabled band like the ubiquitous 2.4 GHz. Similarly, don’t forget that EMC/ESD regulations require that nearly all inputs and outputs should be protected against surge transients. If you forget this, your beautiful, expensive prototype may not survive its first day at the test lab.

Watch out for errors

Here’s another common error that could derail a project. Lacoste writes:

Error 10: You Order Only One Set of Parts Before PCB Design

I love this one because I’ve done it plenty of times even though I knew the risk.

Let’s say you design your schematic, route your PCB, manufacture or order the PCB, and then order the parts to populate it. But soon thereafter you discover one of the following situations: You find that some of the required parts aren’t available. (Perhaps no distributor has them. Or maybe they’re available but you must make a minimum order of 10,000 parts and wait six months.) You learn the parts are tagged as obsolete by its manufacturer, which may not be known in advance especially if you are a small customer.

If you are serious about efficiency, you won’t have this problem because you’ll order the required parts for your prototypes in advance. But even then you might have the same issue when you need to order components for the first production batch. This one is tricky to solve, but only two solutions work. Either use only very common parts that are widely available from several sources or early on buy enough parts for a couple of years of production. Unfortunately, the latter is the only reasonable option for certain components like LCDs.

Ok, how about one more? You’ll have to check out the Anniversary Issue for the list of the other 22 errors and tips. Lacoste writes:

Error 12: You Forget About Crosstalk Between Digital and Analog Signals

Full analog designs are rare, so you have probably some noisy digital signals around your sensor input or other low-noise analog lines. Of course, you know that you must separate them as much as possible, but you can be sure that you will forget it more than once.

Let’s consider a real-world example. Some years ago, my company designed a high-tech Hi-Fi audio device. It included an on-board I2C bus linking a remote user interface. Do you know what happened? Of course, we got some audible glitches on the loudspeaker every time there was an I2C transfer. We redesigned the PCB—moving tracks and adding plenty of grounded copper pour and vias between sensitive lines and the problem was resolved. Of course we lost some weeks in between. We knew the risk, but underestimated it because nothing is as sensitive as a pair of ears. Check twice and always put guard-grounded planes between sensitive tracks and noisy ones.

Circuit Cellar’s Circuit Cellar 25th Anniversary Issue will be available in early 2013. Stay tuned for more updates on the issue’s content.





Debugging USB Firmware

You’ve written firmware for your USB device and are ready to test it. You attach the device to a PC and the hardware wizard announces: “The device didn’t start.” Or, the device installs but doesn’t send or receive data. Or, data is being dropped, the throughput is low, or some other problem presents itself. What do you do?

This article explores tools and techniques to debug the USB devices you design. The focus is on USB 2.0 devices, but much of the information also applies to developing USB 3.0 (SuperSpeed) devices and USB hosts for embedded systems.


If you do anything beyond a small amount of USB developing, a USB protocol analyzer will save you time and trouble. Analyzers cost less than they used to and are well worth the investment.

A hardware-based analyzer connects in a cable segment upstream from the device under test (see Photo 1).

Photo 1: The device under test connects to the analyzer, which
captures the data and passes it unchanged to the device’s host. The
cable on the back of the analyzer carries the captured data to the
analyzer’s host PC for display.

You can view the data down to each packet’s individual bytes and see exactly what the host and device did and didn’t send (see Photo 2).

Photo 2: This bus capture shows the host’s request for a configuration
descriptor and the bytes the device sent in response. Because the endpoint’s
maximum packet size is eight, the device sends the first 8 bytes in one
transaction and the final byte in a second transaction.

An analyzer can also decode data to show standard USB requests and class-specific data (see Photo 3).

Photo 3: This display decodes a received configuration descriptor and its subordinate descriptors.

To avoid corrupted data caused by the electrical effects of the analyzer’s connectors and circuits, use short cables (e.g., 3’ or less) to connect the analyzer to the device under test.

Software-only protocol analyzers, which run entirely on the device’s host PC, can also be useful. But, this kind of analyzer only shows data at the host-driver level, not the complete packets on the bus.


The first rule for developing USB device firmware is to remember that the host computer controls the bus. Devices just need to respond to received data and events. Device firmware shouldn’t make assumptions about what the host will do next.

For example, some flash drives work under Windows but break when attached to a host with an OS that sends different USB requests or mass-storage commands, sends commands in a different order, or detects errors Windows ignores. This problem is so common that Linux has a file, unusual_devs.h, with fixes for dozens of misbehaving drives.

The first line of defense in writing USB firmware is the free USB-IF Test Suite from the USB Implementers Forum (USB-IF), the trade group that publishes the USB specifications. During testing, the suite replaces the host’s USB driver with a special test driver. The suite’s USB Command Verifier tool checks for errors (e.g., malformed descriptors, invalid responses to standard USB requests, responses to Suspend and Resume signaling, etc.). The suite also provides tests for devices in some USB classes, such as human interface devices (HID), mass storage, and video.

Running the tests will usually reveal issues that need attention. Passing the tests is a requirement for the right to display the USB-IF’s Certified USB logo.


Like networks, USB communications have layers that isolate different logical functions (see Table 1).

Table 1: USB communications use layers, which are each responsible for a
specific logical function.

The USB protocol layer manages USB transactions, which carry data packets to and from device endpoints. A device endpoint is a buffer that is a source or sink of data at the device. The host sends data to Out endpoints and receives data from In endpoints. (Even though endpoints are on devices, In and Out are defined from the host’s perspective.)

The device layer manages USB transfers, with each transfer moving a chunk of data consisting of one or more transactions. To meet the needs of different peripherals, the USB 2.0 specification defines four transfer types: control, interrupt, bulk, and isochronous.

The function layer manages protocols specific to a device’s function (e.g., mouse, printer, or drive). The function protocols may be a combination of USB class, industry, and vendor-defined protocols.


The layers supported by device firmware vary with the device hardware. At one end of the spectrum, a Future Technology Devices International (FTDI) FT232R USB UART controller handles all the USB protocols in hardware. The chip has a USB device port that connects to a host computer and a UART port that connects to an asynchronous serial port on the device.

Device firmware reads and writes data on the serial port, and the FT232R converts it between the USB and UART protocols. The device firmware doesn’t have to know anything about USB. This feature has made the FT232R and similar chips popular!

An example of a chip that is more flexible but requires more firmware support is Microchip Technology’s PIC18F4550 microcontroller, which has an on-chip, full-speed USB device controller. In return for greater firmware complexity, the PIC18F4550 isn’t limited to a particular host driver and can support any USB class or function.

Each of the PIC18F4550’s USB endpoints has a series of registers—called a buffer descriptor table (BDT)—that store the endpoint buffer’s address, the number of bytes to send or receive, and the endpoint’s status. One of the BDT’s status bits determines the BDT’s ownership. When the CPU owns the BDT, firmware can write to the registers to prepare to send data or to retrieve received data. When the USB module owns the BDT, the endpoint can send or receive data on the bus.

To send a data packet from an In endpoint, firmware stores the bytes’ starting address to send and the number of bytes and sets a register bit to transfer ownership of the BDT to the USB module. The USB module sends the data in response to a received In token packet on the endpoint and returns BDT ownership to the CPU so firmware can set up the endpoint to send another packet.

To receive a packet on an Out endpoint, firmware stores the buffer’s starting address for received bytes and the maximum number of bytes to receive and transfers ownership of the BDT to the USB module. When data arrives, the USB module returns BDT ownership to the CPU so firmware can retrieve the data and transfer ownership of the BDT back to the USB module to enable the receipt of another packet.

Other USB controllers have different architectures and different ways of managing USB communications. Consult your controller chip’s datasheet and programming guide for details. Example code from the chip vendor or other sources can be helpful.


A USB 2.0 transaction consists of a token packet and, as needed, a data packet and a handshake packet. The token packet identifies the packet’s type (e.g., In or Out), the destination device and endpoint, and the data packet direction.

The data packet, when present, contains data sent by the host or device. The handshake packet, when present, indicates the transaction’s success or failure.

The data and handshake packets must transmit quickly after the previous packet, with only a brief inter-packet delay and bus turnaround time, if needed. Thus, device hardware typically manages the receiving and sending of packets within a transaction.

For example, if an endpoint’s buffer has room to accept a data packet, the endpoint stores the received data and returns ACK in the handshake packet. Device firmware can then retrieve the data from the buffer. If the buffer is full because firmware didn’t retrieve previously received data, the endpoint returns NAK, requiring the host to try again. In a similar way, an In endpoint will NAK transactions until firmware has loaded the endpoint’s buffer with data to send.

Fine tuning the firmware to quickly write and retrieve data can improve data throughput by reducing or eliminating NAKs. Some device controllers support ping-pong buffers that enable an endpoint to store multiple packets, alternating between the buffers, as needed.


In all but isochronous transfers, a data-toggle value in the data packet’s packet identification (PID) field guards against missed or duplicate data packets. If you’re debugging a device where data is transmitting on the bus and the receiver is returning ACK but ignoring or discarding the data, chances are good that the device isn’t sending or expecting the correct data-toggle value. Some device controllers handle the data toggles completely in hardware, while others require some firmware control.

Each endpoint maintains its own data toggle. The values are DATA0 (0011B) and DATA1 (1011B). Upon detecting an incoming data packet, the receiver compares its data toggle’s state with the received data toggle. If the values match, the receiver toggles its value and returns ACK, causing the sender to toggle its value for the next transaction.

The next received packet should contain the opposite data toggle, and again the receiver toggles its bit and returns ACK. Except for control transfers, the data toggle on each end continues to alternate in each transaction. (Control transfers always use DATA0 in the Setup stage, toggle the value for each transaction in the Data stage, and use DATA1 in the Status stage.)

If the receiver returns NAK or no response, the sender doesn’t toggle its bit and tries again with the same data and data toggle. If a receiver returns ACK, but for some reason the sender doesn’t see the ACK, the sender thinks the receiver didn’t receive the data and tries again using the same data and data toggle. In this case, the repeated data receiver ignores the data, doesn’t toggle the data toggle, and returns ACK, resynchronizing the data toggles. If the sender mistakenly sends two packets in a row with the same data-toggle value, upon receiving the second packet, the receiver ignores the data, doesn’t toggle its value, and returns ACK.


All USB devices must support control transfers and may support other transfer types. Control transfers provide a structure for sending requests but have no guaranteed delivery time. Interrupt transfers have a guaranteed maximum latency (i.e., delay) between transactions, but the host permits less bandwidth for interrupt transfers compared to other transfer types. Bulk transfers are the fastest on an otherwise idle bus, but they have no guaranteed delivery time, and thus can be slow on a busy bus. Isochronous transfers have guaranteed delivery time but no built-in error correction.

A transfer’s amount of data depends in part on the higher-level protocol that determines the data packets’ contents. For example, a keyboard sends keystroke data in an interrupt transfer that consists of one transaction with 8 data bytes. To send a large file to a drive, the host typically uses one or more large transfers consisting of multiple transactions. For a high-speed drive, each transaction, except possibly the last one, has 512 data bytes, which is the maximum-allowed packet size for high-speed bulk endpoints.

What determines a transfer’s end varies with the USB class or vendor protocol. In many cases, a transfer ends with a short packet, which is a packet that contains less than the packet’s maximum-allowed data bytes. If the transfer has an even multiple of the packet’s maximum-allowed bytes, the sender may indicate the end of the transfer with a zero-length packet (ZLP), which is a data packet with a PID and error-checking bits but no data.

For example, USB virtual serial-port devices in the USB communications device class use short packets to indicate the transfer’s end. If a device has sent data that is an exact multiple of the endpoint’s maximum packet size and the host sends another In token packet, the endpoint should return a ZLP to indicate the data’s end.


Upon device attachment, in a process called enumeration, the host learns about the device by requesting a series of data structures called descriptors. The host uses the descriptors’ information to assign a driver to the device.

If enumeration doesn’t complete, the device doesn’t have an assigned driver, and it can’t perform its function with the host. When Windows fails to find an appropriate driver, the file in Windowsinf (for Windows 7) can offer clues about what went wrong. A protocol analyzer shows if the device returned all requested descriptors and reveals mistakes in the descriptors.

During device development, you may need to change the descriptors (e.g., add, remove, or edit an endpoint descriptor). Windows has the bad habit of remembering a device’s previous descriptors on the assumption that a device will never change its descriptors. To force Windows to use new descriptors, uninstall then physically remove and reattach the device from Windows Device Manager. Another option is to change the device descriptor’s product ID to make the device appear as a different device.


Unlike the other transfer types, control transfers have multiple stages: setup, (optional) data, and status. Devices must accept all error-free data packets that follow a Setup token packet and return ACK. If the device is in the middle of another control transfer and the host sends a new Setup packet, the device must abandon the first transfer and begin the new one. The data packet in the Setup stage contains important information firmware should completely decode (see Table 2).

Table 2: Device firmware should fully decode the data received in a control transfer’s Setup stage. (Source: USB Implementers Forum, Inc.)

The wLength field specifies how many bytes the host wants to receive. A device shouldn’t assume how much data the host wants but should check wLength and send no more than the requested number of bytes.

For example, a request for a configuration descriptor is actually a request for the configuration descriptor and all of its subordinate descriptors. But, in the first request for a device’s configuration descriptor, the host typically sets the wLength field to 9 to request only the configuration descriptor. The descriptor contains a wTotalLength value that holds the number of bytes in the configuration descriptor and its subordinate descriptors. The host then resends the request setting wLength to wTotalLength or a larger value (e.g., FFh). The device returns the requested descriptor set up to wTotalLength. (Don’t assume the host will do it this way. Always check wLength!)

Each Setup packet also has a bmRequestType field. This field specifies the data transfer direction (if any), whether the recipient is the device or an interface or endpoint, and whether the request is a standard USB request, a USB class request, or a vendor-defined request. Firmware should completely decode this field to correctly identify received requests.

A composite device has multiple interfaces that function independently. For example, a printer might have a printer interface, a mass-storage interface for storing files, and a vendor-specific interface to support vendor-defined capabilities. For requests targeted to an interface, the wIndex field typically specifies which interface applies to the request.


For interrupt endpoints, the endpoint descriptor contains a bInterval value that specifies the endpoint’s maximum latency. This value is the longest delay a host should use between transaction attempts.

A host can use the bInterval delay time or a shorter period. For example, if a full-speed In endpoint has a bInterval value of 10, the host can poll the endpoint every 1 to 10 ms. Host controllers typically use predictable values, but a design shouldn’t rely on transactions occurring more frequently than the bInterval value.

Also, the host controller reserves bandwidth for interrupt endpoints, but the host can’t send data until a class or vendor driver provides something to send. When an application requests data to be sent or received, the transfer’s first transaction may be delayed due to passing the request to the driver and scheduling the transfer.

Once the host controller has scheduled the transfer, any additional transaction attempts within the transfer should occur on time, as defined by the endpoint’s maximum latency. For this reason, sending a large data block in a single transfer with multiple transactions can be more efficient than using multiple transfers with a portion of the data in each transfer.


Most devices’ functions fit a defined USB class (e.g., mass storage, printer, audio, etc.). The USB-IF’s class specifications define protocols for devices in the classes.

For example, devices in the HID class must send and receive all data in data structures called reports. The supported report’s length and the meaning of its data (e.g., keypresses, mouse movements, etc.) are defined in a class-specific report descriptor.

If your HID-class device is sending data but the host application isn’t seeing the data, verify the number of bytes the device is sending matches the number of bytes in a defined report. The device should prepend a report-ID byte to the data only if the HID supports report IDs other than the zero default value.

In many devices, class specifications define class-specific requests or other requirements. For example, a mass storage device that uses the bulk-only protocol must provide a unique serial number in a string descriptor. Carefully read and heed any class specifications that apply to your device!

Many devices also support industry protocols to perform higher-level functions. Printers typically support one or more printer-control languages (e.g., PCL and Postscript). Mass-storage devices support SCSI commands to transfer data blocks and a file system (e.g., FAT32) to define a directory structure.

The higher-level industry protocols don’t depend on a particular hardware interface, so there is little about debugging them that is USB-specific. But, because these protocols can be complicated, example code for your device can be helpful.

In the end, much about debugging USB firmware is like debugging any hardware or software. A good understanding of how the communications should work provides a head start on writing good firmware and finding the source of any problems that may appear.

Jan Axelson is the author of USB Embedded Hosts, USB Complete, and Serial Port Complete. Jan’s PORTS web forum is available at


Jan Axelson’s Lakeview Research, “USB Development Tools: Protocol analyzers,”

This article appears in Circuit Cellar 268 (November 2012).

Q&A: Hai (Helen) Li (Academic, Embedded System Researcher)

Helen Li came to the U.S. from China in 2000 to study for a PhD at Purdue University. Following graduation she worked for Intel, Qualcomm, and Seagate. After about five years of working in industry, she transitioned to academia by taking a position at the Polytechnic Institute of New York University, where she teaches courses such as circuit design (“Introduction to VLSI”), advanced computer architecture (“VLSI System and Architecture Design”), and system-level applications (“Real-Time Embedded System Design”).

Hai (Helen) Li

In a recent interview Li described her background and provided details about her research relating to spin-transfer torque RAM-based memory hierarchy and memristor-based computing architecture.

An abridged version of the interview follows.

NAN: What were some of your most notable experiences working for Intel, Qualcomm, and Seagate?

HELEN: The industrial working experience is very valuable to my whole career life. At Seagate, I led a design team on a test chip for emerging memory technologies. Communication and understanding between device engineers and design communities is extremely important. The joined effects from all the related disciplines (not just one particular area anymore) became necessary. The concept of cross layers (including device/circuit/architecture/system) cooptimization, and design continues in my research career.

NAN: In 2009, you transitioned from an engineering career to a career teaching electrical and computer engineering at the Polytechnic Institute of New York University (NYU). What prompted this change?

HELEN: After five years of working at various industrial companies on wireless communication, computer systems, and storage, I realized I am more interested in independent research and teaching. After careful consideration, I decided to return to an academic career and later joined the NYU faculty.

NAN: How long have you been teaching at the Polytechnic Institute of NYU? What courses do you currently teach and what do you enjoy most about teaching?

HELEN: I have been teaching at NYU-Poly since September 2009. My classes cover a wide range of computer engineering, from basic circuit design (“Introduction to VLSI”), to advanced computer architecture (“VLSI System and Architecture Design”), to system-level applications (“Real-Time Embedded System Design”).

Though I have been teaching at NYU-Poly, I will be taking a one-year leave of absence from fall 2012 to summer 2013. During this time, I will continue my research on very large-scale integration (VLSI) and computer engineering at University of Pittsburgh.

I enjoy the interaction and discussions with students. They are very smart and creative. Those discussions always inspire new ideas. I learn so much from students.

Helen and her students are working on developing a 16-Kb STT-RAM test chip.

NAN: You’ve received several grants from institutions including the National Science Foundation and the Air Force Research Lab to fund your embedded systems endeavors. Tell us a little about some of these research projects.

HELEN: The objective of the research for “CAREER: STT-RAM-based Memory Hierarchy and Management in Embedded Systems” is to develop an innovative spin-transfer torque random access memory (STT-RAM)-based memory hierarchy to meet the demands of modern embedded systems for low-power, fast-speed, and high-density on-chip data storage.

This research provides a comprehensive design package for efficiently integrating STT-RAM into modern embedded system designs and offers unparalleled performance and power advantages. System architects and circuit designers can be well bridged and educated by the research innovations. The developed techniques can be directly transferred to industry applications under close collaborations with several industry partners, and directly impact future embedded systems. The activities in the collaboration also include tutorials at the major conferences on the technical aspects of the projects and new course development.

The main goal of the research for “CSR: Small Collaborative Research: Cross-Layer Design Techniques for Robustness of the Next-Generation Nonvolatile Memories” is to develop design technologies that can alleviate the general robustness issues of next-generation nonvolatile memories (NVMs) while maintaining and even improving generic memory specifications (e.g., density, power, and performance). Comprehensive solutions are integrated from architecture, circuit, and device layers for the improvement on the density, cost, and reliability of emerging nonvolatile memories.

The broader impact of the research lies in revealing the importance of applying cross-layer design techniques to resolve the robustness issues of the next-generation NVMs and the attentions to the robust design context.

The research for “Memristor-Based Computing Architecture: Design Methodologies and Circuit Techniques” was inspired by memristors, which have recently attracted increased attention since the first real device was discovered by Hewlett-Packard Laboratories (HP Labs) in 2008. Their distinctive memristive characteristic creates great potentials in future computing system design. Our objective is to investigate process-variation aware memristor modeling, design methodology for memristor-based computing architecture, and exploitation of circuit techniques to improve reliability and density.

The scope of this effort is to build an integrated design environment for memristor-based computing architecture, which will provide memristor modeling and design flow to circuit and architecture designers. We will also develop and implement circuit techniques to achieve a more reliable and efficient system.

An electric car model controlled by programmable emerging memories is in the developmental stages.

NAN: What types of projects are you and your students currently working on?

HELEN: Our major efforts are on device modeling, circuit design techniques, and novel architectures for computer systems and embedded systems. We primarily focus on the potentials of emerging devices and leveraging their advantages. Two of our latest projects are a 16-Kb STT-RAM test chip and an electric car model controlled by programmable emerging memories.

The complete interview appears in Circuit Cellar 267 (October 2012).