The Future of Real-Time in Databases
The term “real-time” is bandied about casually by database system vendors, but real-time has long had a specific meaning in embedded systems. To quote geeksforgeeks.org: “A realtime system means that the system is subjected to real-time, in other words, the response should be guaranteed within a specified timing constraint or the system should meet the specified deadline. For example, flight control systems, realtime monitors, etc.” To put it another way, real-time does not mean real fast. Speed is not the measure of success in real-time systems; determinism is.
Real-time systems are evolving rapidly. They used to be relatively simple, like the anti-lock braking systems in aircraft and, later, automobiles. Today they are far more complex, and Advanced Driver Assistance Systems (ADAS) are the poster child. ADAS must ingest data from multiple, dissimilar sources (LIDAR, SONAR, RADAR, optical cameras, GPS, and maps, to name a few), creating a significant sensor data fusion problem. All that data must be brought into a central location (a database!) so that it can be correlated, analyzed, and acted upon (start, stop, turn, and so forth), all within hard real-time deadlines.
But here’s the rub: Until recently, no commercial off-the-shelf (COTS) embedded database could be used in real-time systems because none of the offerings were cognizant of deadlines. To illustrate, consider a real-time system that must react within 50 milliseconds. As shown in Figure 1, the real-time task completes its first two steps at the 5- and 10-millisecond marks, respectively. Then the task calls the database run-time. However, the database run-time has no awareness of the deadline and doesn’t return control to the task until well after the deadline has expired. This system has failed.
To be suited for use in real-time systems, an embedded database run-time system must achieve three objectives. The first is that it must be made aware of deadlines and keep track of elapsed time vis-à-vis the deadline given to it. The second is that it must not have any external dependencies that are not also time cognizant. For example, the database run-time should not call malloc() (the C run-time function for allocating memory dynamically). The third objective is that it must be able to schedule database transactions in a manner appropriate for real-time systems.
Deadlines—If an embedded database system must manage deadlines, it follows that the embedded database run-time must have the means to be made aware of deadlines. Insofar as the unit of work in databases is a transaction, the database API to begin a transaction is the logical place to pass the deadline into the database run-time. As the transaction proceeds, the database run-time needs to check progress against the deadline frequently and, if necessary, abort the transaction to meet the deadline. In a real-time database system, transactions can meet (successfully commit) or miss (successfully abort) their deadline, but can never be late (exceed their deadline). Accomplishing this isn’t as straightforward as you might think unless you’re targeting only one real-time operating system (RTOS), because different RTOSes have different ways of managing clocks and timers. Figure 2 illustrates a transaction’s timeline. Of interest are the deadline verification control points and the deadline control point.
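The idea of passing a deadline in at transaction start and checking it at control points can be sketched in C. This is a minimal illustration under stated assumptions, not the API of eXtremeDB/rt or any other product: the names rt_txn, rt_txn_begin, rt_txn_checkpoint, rt_txn_commit, and the rollback reserve are all hypothetical. The clock is supplied as a function pointer because, as noted above, each RTOS exposes time differently; fake_clock is a stand-in for a real tick source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical clock hook: each RTOS port supplies its own monotonic time source. */
typedef uint64_t (*clock_us_fn)(void);

typedef enum { TXN_ACTIVE, TXN_COMMITTED, TXN_ABORTED } txn_state;

typedef struct {
    clock_us_fn now;         /* time source, in microseconds */
    uint64_t    deadline_us; /* absolute time by which commit or abort must be done */
    uint64_t    rollback_us; /* assumed worst-case time needed to undo the work */
    txn_state   state;
} rt_txn;

/* Stand-in for an RTOS clock so the sketch can be exercised on a host. */
uint64_t fake_now_us = 0;
uint64_t fake_clock(void) { return fake_now_us; }

/* The deadline enters the run-time when the transaction begins. */
void rt_txn_begin(rt_txn *t, clock_us_fn now, uint64_t budget_us, uint64_t rollback_us)
{
    assert(t != NULL && now != NULL);
    t->now = now;
    t->deadline_us = now() + budget_us;
    t->rollback_us = rollback_us;
    t->state = TXN_ACTIVE;
}

/* Verification point, called between units of work. If the time left no longer
 * exceeds the rollback reserve, the transaction is aborted now so that the
 * abort itself can still finish before the deadline. */
bool rt_txn_checkpoint(rt_txn *t)
{
    if (t->state != TXN_ACTIVE)
        return false;
    if (t->now() + t->rollback_us >= t->deadline_us) {
        /* A real run-time would undo the transaction's changes here. */
        t->state = TXN_ABORTED;
        return false;
    }
    return true;
}

/* Commit only if the deadline check still passes. */
bool rt_txn_commit(rt_txn *t)
{
    if (!rt_txn_checkpoint(t))
        return false;
    t->state = TXN_COMMITTED;
    return true;
}
```

With a 50-millisecond budget and a 10-millisecond rollback reserve, a checkpoint at the 5-millisecond mark passes, while one at the 41-millisecond mark aborts the transaction. Either way, control returns to the caller before the deadline, which is the "meet or miss, but never late" behavior described above.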
External dependencies—Most embedded and real-time systems are written in C/C++. Programmers tend to use the functions in the C run-time (CRT) library liberally. In many cases this is innocuous, but calls to CRT functions such as malloc, or to functions that perform input/output, should be avoided. They carry the same risk illustrated in Figure 1: the calling task (the database run-time in this case) can disappear inside a CRT function that has no time-cognizance and not return until the deadline has been violated, risking system failure.
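One common way to avoid malloc() on a time-critical path is a fixed-block pool carved out of memory reserved at startup, so that every allocation and release takes a constant number of steps. The sketch below illustrates that general technique; it is not a description of how any particular database run-time manages memory. The names fixed_pool, pool_init, pool_alloc, and pool_free and the two size constants are hypothetical, and the code is not thread-safe as written.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BLOCK_SIZE  64   /* bytes per block (illustrative value) */
#define POOL_BLOCK_COUNT 128  /* blocks reserved up front (illustrative value) */

/* A free block stores the link to the next free block in its own memory. */
typedef union pool_block {
    union pool_block *next;
    unsigned char     payload[POOL_BLOCK_SIZE];
} pool_block;

typedef struct {
    pool_block  storage[POOL_BLOCK_COUNT];
    pool_block *free_list;
} fixed_pool;

/* Run once at startup, before any deadline applies: thread all blocks
 * onto the free list. */
void pool_init(fixed_pool *p)
{
    p->free_list = NULL;
    for (size_t i = 0; i < POOL_BLOCK_COUNT; i++) {
        p->storage[i].next = p->free_list;
        p->free_list = &p->storage[i];
    }
}

/* Constant time: pop the head of the free list, or return NULL if exhausted. */
void *pool_alloc(fixed_pool *p)
{
    pool_block *b = p->free_list;
    if (b != NULL)
        p->free_list = b->next;
    return b;
}

/* Constant time: push the block back onto the free list. */
void pool_free(fixed_pool *p, void *ptr)
{
    assert(ptr != NULL);
    pool_block *b = ptr;
    b->next = p->free_list;
    p->free_list = b;
}
```

The trade-off is that capacity is fixed: when the pool is exhausted, pool_alloc returns NULL immediately instead of searching or growing a heap, so the caller gets a prompt, predictable failure it can handle within its deadline.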
Scheduling—Databases are commonly used by multiple tasks/threads/processes. The database run-time must coordinate the tasks’ access to the database to avoid conflicts. In industry parlance, this is called concurrency control and falls into two broad categories: optimistic and pessimistic concurrency control. With pessimistic concurrency control, a task requests access to a resource, which could be the entire database, a database table or set of tables, a database page, or a row of a table. Regardless of the granularity of such requests, a component of the database run-time, often called a lock arbiter or lock manager, needs to coordinate these requests.
Ordinarily, these requests are granted in first-in-first-out order, but that is inadequate for a real-time database system. A real-time database system must schedule transactions first by a developer-specified priority and then, within the same priority, by earliest deadline first. Alternatively, the real-time database run-time must leverage priority inheritance so that a low-priority task’s transaction that is already running can be elevated to the priority of a newly scheduled, higher-priority transaction.
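The "priority first, then earliest deadline" ordering can be expressed as a small comparison function that a lock manager might use in place of a FIFO queue. The following C sketch rests on assumptions: the txn_request structure, the function names, and the convention that a larger number means higher priority are hypothetical, and priority inheritance is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int      priority;    /* developer-assigned; larger = more urgent (assumed) */
    uint64_t deadline_us; /* absolute deadline of the transaction */
} txn_request;

/* Nonzero if request a should be granted before request b:
 * higher priority wins; within equal priority, the earlier deadline wins. */
int txn_runs_before(const txn_request *a, const txn_request *b)
{
    assert(a != NULL && b != NULL);
    if (a->priority != b->priority)
        return a->priority > b->priority;
    return a->deadline_us < b->deadline_us;
}

/* Scan the pending requests and return the index of the one to grant next,
 * or -1 if there are none. Exact ties keep arrival order, because a later
 * entry replaces the current best only when it is strictly ahead of it. */
int txn_pick_next(const txn_request *pending, size_t count)
{
    if (count == 0)
        return -1;
    size_t best = 0;
    for (size_t i = 1; i < count; i++) {
        if (txn_runs_before(&pending[i], &pending[best]))
            best = i;
    }
    return (int)best;
}
```

A linear scan keeps the sketch short, and its cost is bounded by the number of pending requests. A run-time with many waiters might prefer a priority queue, but the ordering rule would stay the same.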
Today’s real-time systems are seeing growth much like embedded systems did in the late ’90s and early 2000s, when embedded databases became a necessity because embedded systems were being tasked to do so much more. Line-of-business and departmental databases needed to be re-imagined to operate within the resource constraints of embedded systems. That’s the driving force behind the development of McObject’s eXtremeDB. Now, with the growing demand for data management in hard real-time systems, we’ve re-imagined and re-engineered eXtremeDB to create eXtremeDB/rt, the first COTS deterministic database management system to operate within the constraints of mission- and safety-critical real-time systems.
McObject | www.mcobject.com
PUBLISHED IN CIRCUIT CELLAR MAGAZINE • APRIL 2022 #381