Self-Organizing Wi-Fi Mesh Network

Using PIC32 MCUs

Gone are the days when networking embedded devices was a big deal. Today, such devices can be linked in powerful mesh networks over wireless protocols. In this article, two Cornell students describe how they used Microchip PIC32 MCUs and Espressif’s ESP8266 Wi-Fi module to create a mesh network of wirelessly connected devices. The mesh network configures itself and requires no manual intervention to connect the nodes.

In this project, we created a mesh network of Microchip PIC32 microcontrollers (MCUs) that were connected to each other wirelessly through ESP8266 Wi-Fi modules (Figure 1). The primary objective was to create a self-contained wireless mesh network of MCUs. The criteria were that the network should be able to add new nodes as they turn on, and should be robust to nodes disconnecting.

FIGURE 1 – Two complete nodes. The left one is turned on and is actively scanning for other nodes.

To create a network of nodes, we needed multiple wireless devices that could be connected to one another simultaneously. We considered several wireless technologies, including Bluetooth, packet radio and Wi-Fi. Preliminary investigations revealed that most hobbyist Bluetooth modules had relatively short ranges and couldn’t maintain connections to multiple modules at once. Most of the packet radio modules that we found could be configured only as transmitters or receivers, and multiplexing those nodes would have resulted in significant packet drop [1]. We settled on the ESP8266 Wi-Fi module from Espressif Systems because it met the requirements for this project and has a relatively long range.

The hardware for our project was designed around the PIC32 MCU. We designed a schematic for connecting the PIC32 and the ESP8266 through their serial connections. Our software was designed as a layered architecture. This type of architecture is common in network stacks and allows independent implementation and optimization of individual layers. This approach can help simplify the design process and make the implementation easier.

HARDWARE DESIGN
The primary hardware components of a node were a PIC32 MCU and an ESP8266 Wi-Fi module. From our previous experience with the PIC32, we knew it to be an inexpensive, powerful chip that would allow us to meet the demands of the mesh network. The ESP8266 Wi-Fi module is known for its versatility, cost-effectiveness and ease of use.

The PIC32 and the ESP8266 both require a 3.3 V power supply. To meet this requirement, we dedicated a section of our board to regulating any 4.2 V to 12 V supply down to 3.3 V. We confirmed with a voltmeter that the output of the voltage regulator was a fixed 3.3 V. This output was connected to the VCC pin on the ESP8266 Wi-Fi module and to the appropriate pins on the PIC32.

One of the peripherals added to our nodes was an LED. The LED was useful for testing our algorithms and visualizing the behavior of our system. It blinked continuously while the node was searching for a connection, and stayed lit once the node had connected to another node. If an expected connection failed to occur, or an unexpected one appeared, the LED made the problem easy to spot. Likewise, when the LED behaved as expected, it helped confirm that the node was working.

As shown in Figure 2, each of our nodes had four sockets: the Microstick socket, the UART socket, the Wi-Fi socket and the PIC32 socket. These sockets were mainly composed of DIP (dual in-line package) sockets and male headers. Having sockets for our most important parts allowed us to swap out components and replace faulty parts with relative ease.

FIGURE 2 – Mesh network node schematic

Another key aspect of the hardware design was the inclusion of the Wi-Fi debug jumpers. As shown in the Figure 2 schematic, pin 1 of the Wi-Fi debugger is connected to RB7, while pin 2 is connected to RA2. This wiring is how we intended the node to be connected for normal use. Since RB7 was connected to TX on the ESP8266, and RA2 was connected to RX, the inclusion of Wi-Fi jumpers led to fairly easy debugging on the Wi-Fi module. However, we could also disconnect the Wi-Fi and PIC UART modules and then use a cable to communicate directly with the Wi-Fi module from a PC. As explained later, one of the times we had to use this direct communication with the Wi-Fi module was when we flashed the firmware.

FIRMWARE AND SOFTWARE
The firmware on an ESP8266 module determines the commands we can give to the module over UART. To get the most up-to-date commands working on the ESP8266 modules, we needed to flash the latest firmware from Espressif [2]. This required the ESP8266 to be put in flash mode, which was done by pulling the GPIO_0 pin low during reset. Then, we used a serial connection to a computer with the firmware-flashing software to load the new firmware into the ESP8266. We used a USB-to-UART cable to connect the ESP8266’s RX and TX pins to a computer, and used Espressif’s esptool to flash the firmware.

The software design was largely influenced by the constraints of the ESP8266 Wi-Fi modules. Wi-Fi devices are typically configured either as stations or as access points. A station is a device such as a computer, which can connect to one Wi-Fi network at a time. An access point is something like a router, which allows many stations to connect to it and acts as a hub for connecting Wi-Fi devices. The ESP8266 Wi-Fi modules can be put in a third mode, which is a hybrid of the two: a single chip can act as both a station and an access point. However, this mode limits the total number of connections to five.

A station can only connect to a single access point. This means that although a Wi-Fi module can have up to five other modules connect to it, it can itself make only one connection to another module. If each Wi-Fi module is treated as a node in a graph and each connection as an edge, every node contributes at most one outgoing edge, so the number of edges in our network can never exceed the number of Wi-Fi modules. This means we can’t make a very fault-tolerant network, since a connected network can contain at most one loop. For this reason, we decided to focus on creating software that tries to interconnect as many devices as possible. To accomplish this, we split the software into four logically separate layers: Serial, Wi-Fi, Routing and Application. Now let’s discuss each of these four layers.
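The edge bound can be checked with a small host-side model. The sketch below is written in Python purely for illustration (it is not the PIC32 firmware), and the names check_topology, uplink and MAX_CLIENTS are hypothetical. It assumes each node records at most one access point that it has joined.

```python
from collections import Counter

MAX_CLIENTS = 5  # per-access-point connection limit quoted in the article


def check_topology(uplink):
    """uplink maps each node to the access point it joined, or None.

    Returns the number of edges and the access points with too many clients.
    """
    edges = [(sta, ap) for sta, ap in uplink.items() if ap is not None]
    clients = Counter(ap for _, ap in edges)
    overloaded = [ap for ap, n in clients.items() if n > MAX_CLIENTS]
    return len(edges), overloaded


# Four nodes in a chain plus one closing link: T -> S -> A -> B -> T.
uplink = {"T": "S", "S": "A", "A": "B", "B": "T"}
edge_count, overloaded = check_topology(uplink)
```

Because the mapping holds one entry per node, the edge count can never exceed the node count. The four-node ring in the example uses every available edge and contains exactly one loop.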

SERIAL LAYER
At the bottom of the software stack is the serial layer. This layer was responsible for communicating with the ESP8266 and exposing a simplified API for sending and receiving data from the Wi-Fi module. The UART hardware on the PIC32 has a buffer for up to eight characters, but if the buffer doesn’t get read, subsequent characters will be dropped by the UART module. This becomes an issue, because the ESP8266 can sometimes send data over UART when we aren’t expecting it—such as when it receives a message from another Wi-Fi module. To ensure that all characters that come in over the UART are stored, we used the PIC32’s DMA controller.

We configured one of the DMA channels to move data from the UART RX queue into a large buffer statically allocated in the PIC32’s main memory. The DMA controller automatically wraps back around to the beginning of the buffer once it has been filled. In this sense, the buffer is treated as a ring buffer. To keep track of the write head of the ring buffer, we set up an interrupt that fired whenever a byte was transferred using DMA and incremented a write pointer. When we wanted to read data from the buffer, we waited until the write pointer advanced past the read pointer, then marched the read pointer through the data of interest.
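The pointer bookkeeping described above can be modelled on a host machine. This is a minimal Python sketch, not the PIC32 code: the class RingBuffer and its method names are hypothetical, dma_write stands in for the DMA channel and its per-byte interrupt, and the two pointers are kept as ever-increasing counters (an assumption made here to keep the arithmetic simple).

```python
class RingBuffer:
    """Host-side model of a DMA-filled receive ring buffer."""

    def __init__(self, size):
        self.buf = bytearray(size)
        self.size = size
        self.write = 0  # advanced once per byte, as the DMA interrupt would
        self.read = 0   # advanced by the reader

    def dma_write(self, data):
        """Stand-in for the DMA channel plus its per-byte interrupt."""
        for byte in data:
            self.buf[self.write % self.size] = byte  # wrap at end of buffer
            self.write += 1

    def available(self):
        """Number of bytes written but not yet read."""
        return self.write - self.read

    def read_bytes(self, count):
        """Return up to `count` unread bytes, marching the read pointer."""
        count = min(count, self.available())
        out = bytearray()
        for _ in range(count):
            out.append(self.buf[self.read % self.size])
            self.read += 1
        return bytes(out)


rb = RingBuffer(8)
rb.dma_write(b"OK\r\n")
first = rb.read_bytes(4)
rb.dma_write(b"+IPD,0")   # crosses the end of the 8-byte buffer and wraps
second = rb.read_bytes(6)
```

This model does not detect the case where the writer laps the reader. A buffer large enough for the longest burst from the module avoids that situation.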

Subsequently, we abstracted this functionality into two functions that could be used by the layer above the serial layer to communicate with the ESP8266. The first function sent a string of characters to the ESP8266 over UART. The second used the above method of waiting for the write pointer to advance past the read pointer to read data from the ring buffer and return the data to the layer above. This abstraction hid the complexity of DMA and UART, and allowed the next layer to concern itself only with the bidirectional communication stream between it and the ESP8266.

Wi-Fi LAYER
The layer above the serial layer is the Wi-Fi layer. This layer is mainly concerned with setting up the Wi-Fi module, handling connections and disconnections from stations and access points and receiving messages from other Wi-Fi modules. All communication with the ESP8266 is done by issuing AT commands to the device over UART and listening for a response. We were able to obtain a full list of the supported AT commands for the version of the firmware that we flashed onto the devices [3].

The Wi-Fi layer issues several AT commands when setting up the ESP8266. First, it sets the Wi-Fi module into the hybrid station+access point mode we discussed earlier by issuing the following command:

AT+CWMODE=3

Next, the Wi-Fi layer reads the module’s MAC address, which the rest of the software uses as a unique identifier for this node. It does this by executing the following command and listening for a response:

AT+CIPAPMAC_CUR?

During testing, we hard-coded an IP address for each access point. However, we discovered that there were issues connecting two Wi-Fi devices that had the same IP address. To correct these problems, we gave each device an IP address based on its MAC address, where the %d is the lower 8 bits of the MAC address:

AT+CIPAP_CUR="192.168.%d.1","192.168.%d.1","255.255.255.0"

Next, we needed to allow multiple connections with the Wi-Fi module. We also found in the documentation that the multiple connections mode was required to start up a TCP server on the Wi-Fi module:

AT+CIPMUX=1

Then, we initialized the TCP server on port 80:

AT+CIPSERVER=1,80

And finally, we set the SSID of the Wi-Fi module so that other nodes could find it, where %d is again the lower 8 bits of the MAC address:

AT+CWSAP_CUR="ESP8266-Mesh-%d"

The primary reasons for using a TCP server instead of a UDP server were that we wanted reliable packet delivery between nodes, and we also wanted to have knowledge about the state of connections. TCP fits these requirements well because it is a reliable, connection-oriented transport protocol. This completed the setup portion of our code.
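The setup sequence can be summarized as a function that builds the command strings from a MAC address. This is a Python sketch for illustration only. The helper names node_id_from_mac and setup_commands are hypothetical, the MAC is assumed to have been read already with AT+CIPAPMAC_CUR?, and the commands are reproduced in the abbreviated form printed in this article.

```python
def node_id_from_mac(mac):
    """Lower 8 bits of a MAC address written as 'aa:bb:cc:dd:ee:ff'."""
    return int(mac.split(":")[-1], 16)


def setup_commands(mac):
    """AT commands issued after the MAC address has been read."""
    n = node_id_from_mac(mac)
    ip = f"192.168.{n}.1"
    return [
        "AT+CWMODE=3",                                   # station + access point
        f'AT+CIPAP_CUR="{ip}","{ip}","255.255.255.0"',   # per-node subnet
        "AT+CIPMUX=1",                                   # allow multiple connections
        "AT+CIPSERVER=1,80",                             # TCP server on port 80
        f'AT+CWSAP_CUR="ESP8266-Mesh-{n}"',              # advertise mesh SSID
    ]


cmds = setup_commands("5e:cf:7f:12:34:2a")  # example MAC, last byte 0x2a = 42
```

With this scheme, two modules whose MAC addresses end in the same byte would receive the same IP address and SSID.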

Next, we abstracted several AT commands into simple functions. The first was a function to scan for Wi-Fi modules to connect to. The ESP8266 has a command to return a list of all nearby Wi-Fi access points:

AT+CWLAP

We created a function that would invoke this command and filter the results to return only a list of Wi-Fi access points with SSIDs starting with “ESP8266-Mesh-”. Once a node has found another node to connect to, it needs to do two things. First, it needs to connect to the node’s access point, which is given by the SSID in the list returned by the scanning function:

AT+CWJAP_CUR="<access point SSID>"

Second, it needs to connect to the TCP server running on port 80 on that node:

AT+CIPSTART="TCP","<other node's ip address>",80
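The scan filter and the two connection commands can be sketched as follows, again in Python as a host-side illustration. The function names mesh_ssids and join_commands are hypothetical, the sample scan lines are made up, and the field layout of a +CWLAP line (encryption, quoted SSID, signal strength, MAC, channel) is an assumption that depends on the firmware version. Deriving the peer’s IP address from the number in its SSID relies on both being built from the same lower 8 bits of the MAC address.

```python
import re

MESH_PREFIX = "ESP8266-Mesh-"
# Captures the quoted SSID that follows the first numeric field of each line.
SSID_RE = re.compile(r'\+CWLAP:\(\d+,"([^"]*)"')


def mesh_ssids(scan_output):
    """Return only the SSIDs that belong to mesh nodes."""
    ssids = SSID_RE.findall(scan_output)
    return [s for s in ssids if s.startswith(MESH_PREFIX)]


def join_commands(ssid):
    """Commands to join a mesh node's access point and its TCP server."""
    node_id = int(ssid[len(MESH_PREFIX):])
    return [
        f'AT+CWJAP_CUR="{ssid}"',
        f'AT+CIPSTART="TCP","192.168.{node_id}.1",80',
    ]


sample = (
    '+CWLAP:(0,"ESP8266-Mesh-42",-48,"5e:cf:7f:12:34:2a",1)\r\n'
    '+CWLAP:(3,"HomeRouter",-70,"00:11:22:33:44:55",6)\r\n'
    '+CWLAP:(0,"ESP8266-Mesh-7",-61,"5e:cf:7f:ab:cd:07",1)\r\n'
    "\r\nOK\r\n"
)
found = mesh_ssids(sample)
join = join_commands(found[0])
```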

Last, we created a function that would send messages between two connected nodes. This function first invokes the command:

AT+CIPSENDBUF=<connection id>,<data length>

This command tells the Wi-Fi module which connection it should send the data to, and how many bytes of data will follow. Then, the function sends each byte of the message to the Wi-Fi module.

When a Wi-Fi module receives data from another Wi-Fi module, or when another Wi-Fi device connects to it, the ESP8266 sends out messages over UART indicating the event. We set up a loop in our code to constantly listen for these events and invoke event handlers when the events were detected.
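The send and receive paths can be modelled together. The Python sketch below is illustrative rather than the project’s code: send_frames, parse_events and the event tuples are hypothetical names, and the unsolicited message formats it looks for (`+IPD,<id>,<len>:<data>` for received data, `<id>,CONNECT` and `<id>,CLOSED` for link changes) are assumed from the ESP8266 AT firmware in multiple-connection mode.

```python
import re

IPD_RE = re.compile(rb"\+IPD,(\d+),(\d+):")
LINK_RE = re.compile(rb"(\d+),(CONNECT|CLOSED)\r\n")


def send_frames(conn_id, payload):
    """The two UART writes needed to send `payload` on a connection."""
    header = f"AT+CIPSENDBUF={conn_id},{len(payload)}\r\n".encode()
    return [header, payload]


def parse_events(stream):
    """Scan raw UART bytes and return the events found, in order."""
    events, pos = [], 0
    while pos < len(stream):
        m = IPD_RE.match(stream, pos)
        if m:
            conn, length = int(m.group(1)), int(m.group(2))
            start = m.end()
            events.append(("data", conn, stream[start:start + length]))
            pos = start + length  # skip the payload without scanning it
            continue
        m = LINK_RE.match(stream, pos)
        if m:
            events.append((m.group(2).decode().lower(), int(m.group(1))))
            pos = m.end()
            continue
        pos += 1  # byte is not the start of a recognised event
    return events


frames = send_frames(0, b"hello")
events = parse_events(b"0,CONNECT\r\n\r\n+IPD,0,5:hello1,CLOSED\r\n")
```

A real driver would typically wait for the module’s prompt between the two writes returned by send_frames, and would dispatch each parsed event to its handler.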

ROUTING LAYER
The layer above the Wi-Fi layer is the routing layer. It stores the network topology as a graph, and sends messages to the routing layer of other nodes to construct the graph. We considered an on-demand routing approach, in which each node only knows about its direct neighbors and then sends out special packets to discover paths to other nodes. However, we realized that mesh network applications would want to know about the topology of the network to optimize the connectivity of the network. Therefore, we decided to make a custom routing algorithm that used special messages to alert the network about the addition and removal of edges in the network. This is similar to the way some link-state routing protocols are implemented.

When a station “S” connects to an access point “A,” it may be the case that the two nodes are on separate sides of a partitioned network. Therefore, they must exchange their current network topology graphs, to merge their two network graphs. Instead of having both nodes be responsible for this, only the station node S receives this so-called “bootstrap packet” from the access point (node A). This bootstrap packet contains node A’s current network topology graph. Node S will then figure out the differences between A’s graph and its own graph. Edges that are in A’s graph but not S’s graph will need to be flooded to S’s side of the network. Edges that are in S’s graph but not in A’s graph will need to be flooded to A’s side of the network. Finally, node S will flood the new S-A edge to the whole network.

As a concrete example, consider the following scenario. Node T is connected to node S and node A is connected to node B. The network graph of T and S thus consists of nodes S and T connected by an edge. The network graph of A and B consists of nodes A and B connected by an edge. Next, node S connects to node A. An overview of the messages sent between the four nodes is given in Figure 3.

FIGURE 3 – The messages created during a connection event. Time flows from top to bottom. All the edge-creation messages originate from Node S.

As described above, the access point A sends the bootstrap packet back to node S, and then node S initiates the flood of messages to get every node in the network up to date on the new network topology. After the flood of messages has subsided, all four nodes in the network contain the same network topology graph: node B connected to node A, node A connected to node S, and node S connected to node T. The resulting topology graph for this scenario is shown in Figure 4.

FIGURE 4 – The network topology stored in every node after nodes A and S are connected. The edges between nodes point from a station to the access point to which the station is connected.
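The graph comparison that node S performs can be expressed with set differences. This Python sketch is only an illustration: the function merge_on_connect is hypothetical, and edges are assumed to be stored as (station, access point) pairs, following the direction convention of Figure 4.

```python
def merge_on_connect(station, access_point, station_graph, bootstrap_graph):
    """Work out what the station must flood after receiving a bootstrap packet.

    Graphs are sets of (station, access_point) edges.
    """
    to_station_side = bootstrap_graph - station_graph  # edges only A knew
    to_ap_side = station_graph - bootstrap_graph       # edges only S knew
    new_edge = (station, access_point)                 # flooded to everyone
    merged = station_graph | bootstrap_graph | {new_edge}
    return to_station_side, to_ap_side, new_edge, merged


# The scenario from the text: T is joined to S, A is joined to B, S joins A.
s_graph = {("T", "S")}
a_graph = {("A", "B")}
to_s, to_a, new_edge, merged = merge_on_connect("S", "A", s_graph, a_graph)
```

In this scenario, the A-B edge is flooded toward T, the T-S edge is flooded toward A and B, and the merged graph holds three edges.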

To stop messages from circulating infinitely throughout the network, we use sequence numbers. Each message that a node creates is given a unique sequence number, which allows other nodes to identify and drop duplicate messages. The routing layer also implements an algorithm for sending directed messages between two nodes. Since the routing layer stores a graph of the mesh network, it can use a shortest-path algorithm to route messages through the network from a source node to a destination node. The algorithm we used was a breadth-first search algorithm. Whenever a node receives a directed message, it first checks to see if it is the intended recipient. If it is, then it passes the message up to a higher layer. Otherwise, it finds the shortest path between it and the destination node, and sends it along that path. It also checks the sequence number in the directed message, to prevent a message from being sent in a cycle forever.
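The two mechanisms in this paragraph, duplicate suppression and shortest-path forwarding, can be sketched in Python. This is not the project’s implementation. The names next_hop and DuplicateFilter are hypothetical, links are assumed to carry traffic in both directions, and the filter assumes a message is identified by the pair of originating node and sequence number.

```python
from collections import deque


def next_hop(edges, source, dest):
    """First node on a shortest path from source to dest, or None."""
    neighbors = {}
    for a, b in edges:  # treat every link as usable in both directions
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    parent = {source: None}
    queue = deque([source])
    while queue:  # breadth-first search outward from the source
        node = queue.popleft()
        if node == dest:
            break
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    if dest not in parent or dest == source:
        return None
    node = dest
    while parent[node] != source:  # walk back to the source's neighbor
        node = parent[node]
    return node


class DuplicateFilter:
    """Remembers (origin, sequence number) pairs that were already handled."""

    def __init__(self):
        self.seen = set()

    def is_new(self, origin, seq):
        key = (origin, seq)
        if key in self.seen:
            return False
        self.seen.add(key)
        return True


edges = {("T", "S"), ("S", "A"), ("A", "B")}
hop = next_hop(edges, "T", "B")
```

In the four-node chain from Figure 4, a message from T to B is handed first to S. Each intermediate node repeats the same search from its own position.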

APPLICATIONS AND TESTING
The serial, Wi-Fi, and routing layers formed the core of our mesh network software. We decided to build a few simple applications on top of the core software to demonstrate its capabilities. The first application we built was a way to view the network graphically as nodes came online. To do this, we set up a simple loop that would constantly use the scanning function exposed by the Wi-Fi layer to scan for other modules. If a module was found, the application would then tell the routing layer to connect to that device. Because the nodes used Wi-Fi to communicate, we could connect to the mesh network using any Wi-Fi-enabled device. We connected a laptop to a node in the mesh network as if it were a regular access point. We implemented the same protocol that we created on the nodes for the laptop, essentially turning the laptop into another node. This allowed the laptop to receive the edge-creation messages and have its own graph of the network. We then added some code to display the network graph on the laptop’s screen.

We tested this code by first turning on a single node and connecting the laptop to the node. We then turned on two additional nodes, and gave them some time to find each other and establish a connection. We observed this self-connecting behavior in the graph displayed on the laptop (Figure 5). The full code used for this project can be found on GitHub [4].

FIGURE 5 – A screenshot from the demo video for connecting the three nodes and the laptop. The graph on the screen shows the laptop’s current view of the network topology. The video is available on Circuit Cellar’s article materials webpage.

To test our network as a communication network, we used the same basic auto-connection functionality from the previous application. We added an LED to one of the nodes, along with some code to turn the LED on and off, depending on what message the node received. We then started up all the nodes as before and sent messages from the laptop to the node with the LED attached to it. We set up the network so that the message would have to pass through at least one node before it got to the node with the LED. This would confirm that our message-routing algorithm worked.

We observed that a few seconds after sending the command from the computer, the LED turned on or off. The primary reason for the lag was that we hard-coded a delay of 5 seconds between iterations of the main application loop. This was chiefly for debugging purposes and could have been removed. Removing it would have shortened the lag, ideally to the point where the LED appeared to react instantly to the message being sent.

RESULTS
Overall, the hardware performed well. The boards we created had no issues and reliably connected the ESP8266 modules, the PIC32 and the UART-to-USB debugging cable. The main problem we had with the software was that when more than a few modules were present in the network, some modules would often disconnect. This may have been due to the 30-second timeout for the TCP connection and the large 5-second delay that was introduced to aid debugging. Although we would have liked to test our implementation without the 5-second delay, this would have required rewriting parts of our DMA buffer reading code in a non-trivial manner.

Additionally, sometimes the Wi-Fi modules were unable to see the access points of other Wi-Fi modules on their scans, even when they were very close. Furthermore, sometimes the Wi-Fi modules failed to set up immediately on power up. A hard reset of the Wi-Fi module usually resolved these issues, though we were unable to identify the cause. Nevertheless, we successfully validated the self-organizing property of the mesh network and the ability of the routing layer to route a message from a source node to a destination node.

CONCLUSIONS
We were quite pleased with the outcome of our work. We met most of our initial goals and made some interesting software along the way. One consideration for future work on this project is to improve the functionality of our routing algorithm on a larger scale. We tested our routing algorithm with a relatively small number of nodes. It likely would not scale to a greater number of nodes, because each node needs to know about the existence of every other node. Furthermore, we would have liked to optimize the speed at which the network could propagate messages. However, this would have required rewriting some of the lower-level code and eliminating the 5-second debug delay. Unfortunately, we ran out of time while creating this project. We also had minimal support for handling disconnections and link failures. We had some ideas about how to solve this problem, but didn’t get a chance to adequately implement them.

In future work on this project, we would like to implement and test some algorithms for keeping the network graph consistent for all nodes when edges are removed. Finally, we want to test how well our system performs as a long-range communication system, by having the network bootstrap itself into a multi-hop mesh network and trying to get two computers at the endpoints to communicate with each other.

References:
[1] http://people.ece.cornell.edu/land/courses/ece4760/FinalProjects/f2016/bac239_aw528_rk534/bac239_aw528_rk534/bac239_aw528_rk534/index.html
[2] https://github.com/espressif/ESP8266_NONOS_SDK
[3] https://www.espressif.com/sites/default/files/documentation/4b-esp8266_at_command_examples_en.pdf
[4] https://github.com/Dan12/ece4760-final

RESOURCES
Espressif Systems | www.espressif.com
Microchip Technology | www.microchip.com
SparkFun | www.sparkfun.com

PUBLISHED IN CIRCUIT CELLAR MAGAZINE • NOVEMBER 2019 #352


Daniel Weber is a Master of Engineering in Computer Science student at Cornell University. He is interested in programming languages and distributed systems, and enjoys participating in CTF competitions with the Cornell Hacking Club.

Michaelangelo Rodriguez is a graduate of Cornell University's electrical & computer engineering program. He is interested in working with embedded systems, and enjoys learning more about the practical applications of these embedded systems through projects with the Arduino and Raspberry Pi. In his spare time, he also enjoys film and playing soccer and chess.