# Hibernus: Sustaining Computation during Intermittent Supply for Energy-Harvesting Systems

Domenico Balsamo, Alex S. Weddell, Member, IEEE, Geoff V. Merrett, Member, IEEE, Bashir M. Al-Hashimi, Fellow, IEEE, Davide Brunelli, Member, IEEE, and Luca Benini, Fellow, IEEE

Abstract—A key challenge to the future of energy-harvesting systems is the discontinuous power supply that is often generated. We propose a new approach, *Hibernus*, which enables computation to be sustained during intermittent supply. The approach has a low energy and time overhead which is achieved by reactively hibernating: saving system state only once, when power is about to be lost, and then sleeping until the supply recovers. We validate the approach experimentally on a processor with FRAM non-volatile memory, allowing it to reactively hibernate using only energy stored in its decoupling capacitance. When compared to a recently proposed technique, the approach reduces processor time and energy overheads by 76-100% and 49-79% respectively.

Index Terms—energy harvesting, checkpointing, embedded software

# I. INTRODUCTION

Energy-harvesting systems power themselves by extracting energy from the environment [1]. However, the energy provided is often highly temporally dynamic, providing an intermittent supply that is incapable of sustaining computation. This is because processors switch off when the supply drops below their minimum operating voltage and, when power is available again, restart computation from the beginning.

Fig. 1. Operation of *Hibernus* in response to intermittent supply voltage.

To manage an intermittent supply, one approach is to use a battery or supercapacitor to buffer energy. However, the level of miniaturisation required to realise medical implants [2] or visions of 'smart dust' [3] causes energy storage to be minimised, constraining the computational ability of systems. Recently, a different approach (Mementos [4]) was proposed, which uses the well-known concept of checkpoints [5] placed at compile-time. Mementos saves periodic snapshots of system state to non-volatile memory, which enable it to return to a previous checkpoint after a power failure. A number of checkpoint placement heuristics are proposed, including at the beginning of every function call or before any loop. At run-time, when these checkpoints are reached, the supply voltage ($V_{\mathrm{CC}}$) of the processor is inspected using an analog-to-digital converter (ADC). If it is deemed to be failing ($V_{\mathrm{CC}}$ below a threshold $V_{\mathrm{M}}$), a snapshot of the system state is saved to non-volatile memory. This requires regular polling of the supply voltage, and can result in multiple snapshots being saved when the supply voltage is close to the threshold; both introduce time and energy overheads.
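The polling behaviour described above can be illustrated with a minimal sketch. It is not the Mementos implementation: `poll_stats`, `poll_checkpoint` and `run_checkpoints` are hypothetical names, and the supply voltage is passed in as a number rather than read from an ADC. The sketch only counts how many ADC reads and snapshot saves a polling scheme performs while the supply hovers around the threshold.

```c
#include <stdbool.h>
#include <stddef.h>

/* Work performed by a polling checkpoint scheme (hypothetical type). */
typedef struct {
    unsigned adc_reads;  /* one supply-voltage poll per checkpoint reached */
    unsigned snapshots;  /* snapshots written to non-volatile memory */
} poll_stats;

/* One checkpoint: poll the supply and snapshot if it is below v_m.
 * vcc stands in for an ADC reading (assumption for illustration).
 * Returns true if a snapshot was saved. */
bool poll_checkpoint(double vcc, double v_m, poll_stats *stats)
{
    stats->adc_reads++;
    if (vcc < v_m) {
        stats->snapshots++;
        return true;
    }
    return false;
}

/* Run a sequence of checkpoints, one per entry of vcc_at_checkpoint. */
poll_stats run_checkpoints(const double *vcc_at_checkpoint, size_t n, double v_m)
{
    poll_stats stats = {0, 0};
    for (size_t i = 0; i < n; i++) {
        poll_checkpoint(vcc_at_checkpoint[i], v_m, &stats);
    }
    return stats;
}
```

With checkpoint voltages of 3.0, 2.6, 2.4, 2.45 and 2.3 V and a threshold of 2.5 V, this performs five ADC reads and saves three snapshots, of which only the most recent would be used on restore.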

This brief proposes *Hibernus*<sup>1</sup>, a new approach which automatically saves a snapshot only once (without the need for checkpoint placement heuristics), immediately before power failure, then sleeps. *Hibernus* saves the system's complete volatile memory; this is enabled in part by developments in Ferroelectric RAM (FRAM), a non-volatile memory technology that is more efficient than flash, and is now being monolithically integrated into low-power microcontrollers [6]. The speed and efficiency of integrated FRAM means we can react to power loss and save a snapshot using only the energy stored in a system's decoupling capacitance.

#### II. HIBERNUS

The *Hibernus* approach has two states: active and hibernating. It moves between these states when the supply voltage ($V_{\mathrm{CC}}$) passes thresholds (Fig. 1). It uses a hardware interrupt to detect when $V_{\mathrm{CC}}$ drops below $V_{\mathrm{H}}$, then prompts a reactive hibernation: saving an immediate snapshot of volatile memory, then entering deep sleep. The snapshot is restored by another interrupt, when the supply voltage rises above $V_{\mathrm{R}}$. The approach is illustrated in Fig. 2 and differs from Mementos, whose checkpoint locations are set in advance. Due to this, our approach is more energy- and time-efficient than existing approaches (experimentally demonstrated in Sec. III), and does not depend on checkpoint placement heuristics.

> <sup>1</sup>In computing, 'hibernation', from the Latin hībernus, is the process of saving state to allow power to be removed.

D. Balsamo, A. S. Weddell, G. V. Merrett and B. M. Al-Hashimi are with the Pervasive Systems Centre, Electronics and Computer Science, University of Southampton, UK.

D. Brunelli is with the Department of Industrial Engineering, University of

D. Balsamo and L. Benini are with the Department of Electrical, Electronic and Information Engineering "Guglielmo Marconi" (DEI), University of Bologna, Italy.

Fig. 2. Flow-chart illustrating the *Hibernus* approach.

*Hibernus* is application-agnostic and transparent to the programmer, because it can reactively hibernate at any time during the execution of an application. Therefore, to save a snapshot of system state, it copies all registers and volatile memory to non-volatile memory. The energy consumed by this process, $E_{\sigma}$, depends on the size of the volatile memory and the energy consumption for copying each byte:

$$E_{\sigma} = n_{\alpha} E_{\alpha} + n_{\beta} E_{\beta} \tag{1}$$

Here, $n_{\alpha}$ and $n_{\beta}$ are the sizes of the RAM and registers (in bytes). $E_{\alpha}$ and $E_{\beta}$ are the energy required to copy each RAM and register byte to non-volatile memory (J/byte).

*Hibernus* requires sufficient non-volatile memory to save the contents of all processor registers and RAM. This is the case with modern microcontrollers, e.g. [6]. It also requires enough energy to be stored in the capacitance between the supply rails to save a full snapshot. Energy-harvesting systems normally operate across a range of voltages, from $V_{\min}$ to $V_{\max}$. Below $V_{\min}$, processors may operate unpredictably (brown-out), or shut down completely. Given the total capacitance ($\sum C$), the energy $E_{\delta}$ stored between a given voltage $V$ and $V_{\min}$ is:

$$E_{\delta} = \frac{V^2 - V_{\min}^2}{2} \cdot \sum C \tag{2}$$

$V$ is used to define the threshold $V_{\mathrm{H}}$, and $V_{\mathrm{R}}$ is set higher to add hysteresis, allowing the system to restore without taking $V_{\mathrm{CC}}$ below $V_{\mathrm{H}}$. For small embedded microcontrollers (with relatively small $n_{\alpha}$) using fast-write non-volatile memory (therefore relatively low $E_{\alpha}$), it is possible to save a snapshot without additional $C$ (using only the system's decoupling capacitance); this is explored in Sec. III. However, if $E_{\delta} < E_{\sigma}$ with $V = V_{\max}$, it will not be possible to guarantee that snapshots can be taken reliably, and extra $C$ must be added.

The total time, $T_{\text{hibernus}}$, to execute a test algorithm with *Hibernus* is given by (3), where $T_a$ is the CPU time required to execute the algorithm, $n_{\iota}$ is the number of power interruptions (where $V_{\mathrm{CC}} < V_{\min}$) per algorithm execution, $T_s$ is the time required to save a snapshot to non-volatile memory, $T_r$ is the time required to restore from non-volatile memory, and $\overline{T_{\lambda}}$ is the average time spent sleeping (after a snapshot has been saved but before $V_{\mathrm{CC}} = V_{\min}$, and on power-up when $V_{\min} < V_{\mathrm{CC}} < V_{\mathrm{R}}$).

$$\underbrace{T_{\text{hibernus}}}_{\text{Total execution}} = T_a + \underbrace{n_{\iota}}_{\text{No. interruptions}} \left(T_s + \underbrace{T_r}_{\text{Restore snapshot}} + \overline{T_{\lambda}}\right) \tag{3}$$

The total time, $T_{\text{mementos}}$, to execute an algorithm with Mementos is given by (4), where $n_m$ is the number of checkpoints per complete execution of the algorithm, $T_m$ is the time taken for an ADC reading of $V_{\mathrm{CC}}$, and $P_s$ is the proportion of checkpoints resulting in a snapshot, taking $T_s$.

$$\underbrace{T_{\text{mementos}}}_{\text{Total execution}} = T_a + \underbrace{n_{\iota}}_{\text{No. interruptions}} \left(T_r + \underbrace{\frac{T_a}{2n_m}}_{\text{Backtrack}}\right) + n_m\left(T_m + P_s T_s\right) \tag{4}$$

Hence, $T_{\text{hibernus}} < T_{\text{mementos}}$ provided $n_{\iota}(T_a/2n_m) + n_m T_m + (n_m P_s - n_{\iota}) T_s > n_{\iota} \overline{T_{\lambda}}$; that is, *Hibernus* spends less time sleeping than Mementos spends on backtracks (re-running code that was executed between a snapshot and a power interruption), sampling $V_{\mathrm{CC}}$, and redundant snapshot saves. This is evaluated experimentally in the next section.

### III. EXPERIMENTAL VALIDATION

*Hibernus* has been validated with an intermittent power supply and a representative workload. Its energy and time overheads have been evaluated, and compared against Mementos.

# A. Implementing Hibernus

While most microcontrollers have flash non-volatile memory (and consequently a high $E_{\sigma}$), processors are emerging that incorporate fast, low-power FRAM rather than flash (and hence have a low $E_{\sigma}$). The test platform (Fig. 3) uses a development board combining a Texas Instruments MSP430 processor [6] with FRAM non-volatile memory. This means that its decoupling capacitance alone allows $E_{\delta} \gg E_{\sigma}$ when $V = V_{\max}$, requiring no additional energy storage (battery or large capacitor) to support *Hibernus*.

Fig. 3. The test platform used to experimentally validate *Hibernus*.

From the platform's datasheet, $E_{\alpha}$ was identified as 4.2 nJ/byte and $E_{\beta}$ as 2.7 nJ/byte, with a total RAM size of 1024 bytes and register size of 524 bytes. The platform operates with $V_{\max} = 3.6$ V and $V_{\min} = 2.0$ V. Using (1), a complete operation copying all registers and RAM to FRAM consumes 5.7 $\mu$J. The decoupling capacitance on the board totals $\sum C = 16\ \mu$F. Using (2), it was found that this alone is sufficient for *Hibernus*, and $V_{\mathrm{H}}$ was set to 2.17 V. It was found empirically that $V_{\mathrm{R}} = 2.27$ V was sufficient for reliable operation. The test platform's $V_{\mathrm{CC}}$ input (S2) is connected to the output of a signal generator (S1) through a diode, which prevents back-flow of charge to the harvester (Fig. 3). Square and sinusoidal traces (Fig. 4) with a peak amplitude of 3.6 V are presented as examples. The slower decay of S2 compared to S1 is due to the input diode; the slow decay on the negative edge illustrates the discharge of the decoupling capacitance by the current drawn by the processor.

Fig. 4. Measured behavior of signals S1 and S2 (Fig. 3) with (a) 6 Hz square wave input; (b) 6 Hz sinusoidal input.

```
#include "hibernus.h"

// flag: snapshot-valid indicator (assumed to be provided by hibernus.h)
int main(void) {
  if (flag) restore();   // restore system state
  else initialise();     // initialise hibernus

  /* application code goes here */
}

// Comparator interrupt handler
interrupt void COMP_D_ISR(void) {
  hibernate();           // save system state, then sleep
}
```

Fig. 5. Example code used for evaluation of *Hibernus*.
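The energy budget quoted above for the test platform can be checked with a short sketch of (1) and (2). The function names are hypothetical and the code is only a numerical illustration of the two equations, using the datasheet and board values given in this subsection.

```c
/* Eq. (1): energy (J) to copy RAM and registers to non-volatile memory. */
double snapshot_energy_j(double n_ram_bytes, double e_ram_j_per_byte,
                         double n_reg_bytes, double e_reg_j_per_byte)
{
    return n_ram_bytes * e_ram_j_per_byte + n_reg_bytes * e_reg_j_per_byte;
}

/* Eq. (2): energy (J) stored in capacitance c_f (farads)
 * between supply voltage v and v_min. */
double stored_energy_j(double v, double v_min, double c_f)
{
    return 0.5 * (v * v - v_min * v_min) * c_f;
}

/* Margin (J) between the stored energy at v and the snapshot cost e_sigma_j.
 * A positive margin means a full snapshot fits in the stored energy. */
double snapshot_margin_j(double v, double v_min, double c_f, double e_sigma_j)
{
    return stored_energy_j(v, v_min, c_f) - e_sigma_j;
}
```

For the values above, `snapshot_energy_j(1024, 4.2e-9, 524, 2.7e-9)` gives about 5.7 $\mu$J, and `stored_energy_j(3.6, 2.0, 16e-6)` gives about 71.7 $\mu$J, so $E_{\delta} \gg E_{\sigma}$ at $V_{\max}$. Setting $E_{\delta} = E_{\sigma}$ in (2) and solving for $V$ gives $V = \sqrt{2E_{\sigma}/\sum C + V_{\min}^2} \approx 2.17$ V, in line with the chosen $V_{\mathrm{H}}$.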

The *Hibernus* functionality is contained in a library; an application need only include this library and call the initialise(), hibernate() and restore() routines, as illustrated in Fig. 5. As shown in Fig. 2, the algorithm requires that interrupts are generated when $V_{\mathrm{CC}}$ passes $V_{\mathrm{H}}$ and $V_{\mathrm{R}}$; this is facilitated by comparators and voltage references. The test platform has an on-chip comparator configured with an on-chip variable reference voltage generator, and an external voltage divider ($R = 200$ k$\Omega$) giving $V_{\mathrm{CC}}/2$, as inputs. This is set up in the initialise() routine. Depending on whether the system is active or hibernating, the interrupt is set to trigger on either $V_{\mathrm{CC}} \leq V_{\mathrm{H}}$ or $V_{\mathrm{CC}} \geq V_{\mathrm{R}}$, respectively. The handler then calls hibernate() or restore().

When hibernate() (Fig. 2) is called, it first pushes the core registers to the FRAM. It then copies the entire RAM contents (stack segment, local and global variables) into the FRAM, followed by the general registers, and finally the Stack Pointer (SP) and Program Counter (PC). It then sleeps in a low-power mode. The system remains in sleep mode until $V_{\mathrm{CC}} > V_{\mathrm{R}}$. The restore() routine is then called and the complete previous system state is restored. The restore is performed in phases so that the operating state is reinstated reliably: the general registers are restored first, followed by the RAM, and lastly the core registers including the SP and PC. When the PC is restored from the snapshot, the system implicitly transfers to the application and resumes operation.

# B. Experimental Setup

The evaluation test case represents a common long-running task for energy-harvesting systems: a Fast Fourier Transform (FFT) analysis of three arrays, each holding 128 8-bit samples of tri-axial accelerometer data. The FFT algorithm was chosen as an illustration: *Hibernus* is application-agnostic and will provide the same functionality to any embedded program, with minimal impact on the application developer (see Fig. 5). Supply interruption frequencies $f_{\iota}$ (of 2, 4, 6, 8 and 10 Hz, and DC) were chosen to represent the types of intermittent power output that may be expected from an energy harvester (e.g. micro wind turbine or inductive power transfer to a rotating object). They allow the overheads of the *Hibernus* approach to be compared against Mementos.

Our implementation of Mementos places static checkpoints after function calls or before loops, referred to as 'function' and 'loop'. ADC ($V_{\mathrm{CC}}$) measurements are taken and compared to a threshold ($V_{\mathrm{M}} = 2.5$ V), chosen for each scheme to ensure that a snapshot can be saved at least once before power failure. At each checkpoint, $V_{\mathrm{CC}} < V_{\mathrm{M}}$ indicates imminent power failure, and a snapshot is saved. Mementos consumes energy for multiple checkpoints, both for ADC readings and saving snapshots. In contrast, *Hibernus* consumes energy for a single hibernation per power outage, plus the quiescent consumption of the voltage reference and comparator.

The power consumption at mid-range between $V_{\max}$ and $V_{\min}$ of the FFT algorithm (without *Hibernus* or Mementos running), ADC, voltage reference, and comparator were measured as 2.7 mW, 310 $\mu$W, 17 $\mu$W and 130 $\mu$W respectively. These values are used to estimate the energy consumption of the different approaches. For each of the three schemes, the following were recorded: (1) the number of system restores required to complete the computation of the FFT algorithm, (2) the number of times snapshots were stored, or checkpoints were called, (3) the energy overhead, and (4) the processor time overhead. The results were averaged over three complete executions of the test program. The overheads are evaluated with reference to the time and energy for the processor to complete the FFT algorithm with a steady supply: without Mementos or *Hibernus*, it completed in 100 ms.

# C. Results

Fig. 6(a) shows how many checkpoints were made by *Hibernus* and Mementos during a single execution of the FFT. As can be seen, *Hibernus* reduces the number of times that checkpoints are taken. This can also be seen from Fig. 7, which shows when *Hibernus* and Mementos checkpoint (for the case when $f_{\iota} = 6$ Hz): whereas *Hibernus* snapshots (hibernates) only once per interruption (twice in total), Mementos executes a static number of checkpoints (12 for 'loop' and 27 for 'function'), although some are repeated when $V_{\mathrm{CC}} < V_{\min}$ during a snapshot.

Fig. 6(b) shows that, at higher $f_{\iota}$ values, *Hibernus* completes execution of the FFT over fewer power interruptions (3, instead of 5). This is because the mean processor time overheads (Fig. 6(d)) of *Hibernus* are 80-100% shorter than Mementos (function), and 76-100% shorter than Mementos (loop); this leaves more time to execute the application (also shown in Fig. 7, where the arrows denote the total execution time). Furthermore, Fig. 6(c) shows that the energy overheads of running *Hibernus* are 65-79% lower than Mementos (function) and 49-76% lower than Mementos (loop).



Fig. 6. Comparison of *Hibernus* against Mementos, showing performance when running the FFT test program (averaged over 3 executions): (a) number of checkpoints/snapshot saves, (b) number of times snapshots were restored, (c) energy overhead, (d) time overhead.



Fig. 7. Results comparing when *Hibernus* and Mementos hibernate, checkpoint, and restore. Results shown were measured over a complete execution of the test FFT algorithm, powered by a sinusoidal supply with  $f_{\iota}$  = 6 Hz.

TABLE I. EXPERIMENTALLY MEASURED PARAMETERS (SEE EQUATIONS (3), (4)).

| $f_{\iota}$ (Hz) | $T_a$ (ms) | $T_s$ (ms) | $T_r$ (ms) | $T_m$ (ms) | $\overline{T_{\lambda}}$ (ms) | *Hibernus* $n_{\iota}$ | Loop $n_{\iota}$ | Loop $n_m$ | Loop $P_s$ | Function $n_{\iota}$ | Function $n_m$ | Function $P_s$ |
|------|------|------|------|------|------|------|------|------|------|------|------|------|
| 0    | 100  | 2.85 | 2.2  | 0.65 | -    | -    | -    | 12   | 0.00 | -    | 27   | 0.00 |
| 2    | 100  | 2.85 | 2.2  | 0.65 | 17   | 0    | 0    | 12   | 0.08 | 0    | 27   | 0.11 |
| 4    | 100  | 2.85 | 2.2  | 0.65 | 9.5  | 1    | 1    | 12   | 0.25 | 1    | 27   | 0.19 |
| 6    | 100  | 2.85 | 2.2  | 0.65 | 6.5  | 2    | 2    | 12   | 0.50 | 2    | 27   | 0.33 |
| 8    | 100  | 2.85 | 2.2  | 0.65 | 3.8  | 3    | 5    | 12   | 1.00 | 5    | 27   | 0.70 |
| 10   | 100  | 2.85 | 2.2  | 0.65 | 2.8  | 3    | 5    | 12   | 0.83 | 5    | 27   | 0.67 |

The benefits of *Hibernus* are particularly noticeable at $f_{\iota} = 0$ Hz (i.e. DC, when $V_{\mathrm{CC}}$ is uninterrupted), where negligible time and energy overheads are imposed (see Fig. 6(c) and (d)), while Mementos still requires the same number of checkpoints. This increases the required processor active time and energy by at least 10% and 11% respectively. Table I shows experimentally obtained values for the parameters of (3) and (4). Evaluating these equations supports our measured results, and confirms that *Hibernus* spends less time sleeping than Mementos spends on redundant snapshot saves, backtracks, and sampling $V_{\mathrm{CC}}$.
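As an illustration of how (3) and (4) can be evaluated from Table I, the sketch below codes both timing models. The function names are hypothetical, and (4) is taken in the form consistent with the inequality stated in Sec. II, i.e. with the restore time $T_r$ and the backtrack term $T_a/2n_m$ incurred once per interruption. The outputs are model estimates, not measurements.

```c
/* Eq. (3): total execution time with Hibernus (all times in ms). */
double t_hibernus_ms(double t_a, unsigned n_i,
                     double t_s, double t_r, double t_sleep)
{
    return t_a + n_i * (t_s + t_r + t_sleep);
}

/* Eq. (4): total execution time with Mementos (all times in ms).
 * n_i: power interruptions, n_m: checkpoints per execution,
 * p_s: proportion of checkpoints that save a snapshot. */
double t_mementos_ms(double t_a, unsigned n_i, double t_r,
                     unsigned n_m, double t_m, double p_s, double t_s)
{
    double backtrack = t_a / (2.0 * n_m);  /* mean re-executed time per interruption */
    return t_a + n_i * (t_r + backtrack) + n_m * (t_m + p_s * t_s);
}
```

Using the $f_{\iota} = 6$ Hz row of Table I, `t_hibernus_ms(100, 2, 2.85, 2.2, 6.5)` evaluates to 123.1 ms, while `t_mementos_ms(100, 2, 2.2, 12, 0.65, 0.50, 2.85)` gives about 137.6 ms for 'loop' and `t_mementos_ms(100, 2, 2.2, 27, 0.65, 0.33, 2.85)` about 151.0 ms for 'function'.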

#### IV. CONCLUSION

A new approach for sustaining computation during intermittent supply, *Hibernus*, has been proposed. This allows a system to sustain computation through power outages, which are common in energy-harvesting systems. It has a lower energy and time overhead than a recently proposed scheme, as demonstrated experimentally. This contributes to the development of future energy-harvesting systems.

# ACKNOWLEDGMENT

This work is part of the PRIME programme: EPSRC grant EP/K034448/1 (www.prime-project.org). It was also supported by a Telecom Italia s.p.a. PhD grant, and PHIDIAS, an EU 7th Framework Programme project (CA 318013).

#### REFERENCES

- [1] P. D. Mitcheson *et al.*, "Energy Harvesting From Human and Machine Motion for Wireless Electronic Devices," *Proc. IEEE*, vol. 96, no. 9, pp. 1457-1486, Sept. 2008.
- [2] M. R. Mhetre *et al.*, "Micro energy harvesting for biomedical applications: A review," *Proc. ICECT 2011*, vol. 3, pp. 1-5, 8-10 April 2011.
- [3] B. A. Warneke and K. S. J. Pister, "An ultra-low energy microcontroller for Smart Dust wireless sensor networks," *Proc. IEEE ISSCC 2004*, vol. 1, pp. 316-317, Feb. 2004.
- [4] B. Ransford *et al.*, "Mementos: System Support for Long-Running Computation on RFID-Scale Devices," *ASPLOS11*, Newport Beach, CA, USA, Mar. 5-11, 2011.
- [5] P. A. Bernstein *et al.*, "Concurrency Control and Recovery in Database Systems," *Addison-Wesley Longman Publishing*, Boston, USA, 1987.
- [6] M. Zwerg *et al.*, "An 82μA/MHz microcontroller with embedded FeRAM for energy-harvesting applications," *Proc. IEEE ISSCC 2011*, pp. 334-336, 20-24 Feb. 2011.