Difference between revisions of "Safe TinyOS"

From TinyOS Wiki
Jump to: navigation, search
(To Do)
 
(98 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 
= What is Safe TinyOS? =
 
= What is Safe TinyOS? =
  
Safe TinyOS is an alternate toolchain, released as part of TinyOS 2.1, that uses the [http://deputy.cs.berkeley.edu/ Deputy] compiler (now part of the [http://ivy.cs.berkeley.edu Ivy] project) to enforce type and memory safety at run time on motes.  Together, type and memory safety give Java-like guarantees to TinyOS programs: pointer and array errors are trapped before they occur.
+
Safe TinyOS is a modified compilation toolchain, released as part of TinyOS 2.1, that uses the [http://deputy.cs.berkeley.edu Deputy] compiler (now part of the [http://ivy.cs.berkeley.edu Ivy] project) to enforce type and memory safety at run time on motes.  Together, type and memory safety give Java-like guarantees to TinyOS programs: pointer and array errors are trapped before they occur.
  
Memory errors are particularly harmful on motes because
+
Untrapped memory errors are particularly harmful in sensor network applications because:
* Lacking virtual memory, safety errors are never trapped as segmentation violations.
+
* Lacking memory protection hardware and a user/kernel boundary, it is easy to accidentally corrupt OS code; these errors are never trapped as segmentation violations.
* Since memory objects are tightly packed, even walking one element past the end of an array is likely to corrupt RAM.
+
* Since memory objects are tightly packed, walking even one element past the end of an array is likely to corrupt RAM that your application relies on.
* Since deployed motes are very difficult to debug, and since other failure modes exist (battery outage, connectivity failure, etc.) it can be very hard to tell when memory corruption is the root cause of a serious deployment problem.
+
* Since deployed motes are difficult to debug, and since other failure modes exist (concurrency errors, battery outage, connectivity failure, etc.) it can be very hard to tell when memory corruption is the root cause of a serious deployment problem.
 +
* They might permit someone to [http://portal.acm.org/citation.cfm?id=1455775 own your motes]
  
Safe TinyOS was described in a paper at SenSys 2007: [http://www.cs.utah.edu/~coop/me/pubs/sensys07.htm Efficient Memory Safety for TinyOS].
+
Safe TinyOS is described in detail in a paper from SenSys 2007: [http://www.cs.utah.edu/~coop/me/pubs/sensys07.htm Efficient Memory Safety for TinyOS].
  
 
= Supported Platforms =
 
= Supported Platforms =
  
Safe compilation of applications for these platforms are supported by default by TinyOS 2.1:
+
Safe compilation is supported by default by TinyOS 2.1 for these platforms:
 
* Mica2
 
* Mica2
 
* MicaZ
 
* MicaZ
 
* TelosB
 
* TelosB
  
We welcome patches supporting additional platforms.
+
= Building a Safe Application =
  
= Getting Started =
+
Getting started is easy:
 
 
Getting started is easy.  To compile a safe application go to for example <tt>$TOSROOT/apps/Blink</tt> and run:
 
  
 
<pre>
 
<pre>
  make micaz safe
+
cd $TOSROOT/apps/Blink
 +
make micaz safe
 
</pre>
 
</pre>
  
 
Since Blink contains no known safety errors, the safe version of Blink should act just like its unsafe counterpart.
 
Since Blink contains no known safety errors, the safe version of Blink should act just like its unsafe counterpart.
  
As a sanity check, make sure that the code size of the safe version of Blink is larger than that
+
As a sanity check, you may want to make sure that the safe version of Blink has a larger code size than the unsafe version:
of the unsafe version:
 
  
 
<pre>
 
<pre>
Line 56: Line 55:
 
</pre>
 
</pre>
  
As illustrated here, Safe TinyOS does not have any RAM overhead.  The ROM overhead is due to bounds-checking pointers prior to dereferences.
+
The ROM overhead is due to bounds-checking code inserted by the Deputy compiler.  Safe TinyOS does not have any RAM overhead.
  
 
= Interpreting Safety Violations =
 
= Interpreting Safety Violations =
  
By default, a mote that encounters a safety violation repeatedly blinks a failure code on its LEDs. This code is a sequence of 8 digits each in the range 0-3. The read this code, wait for the mote's LEDs to "roll," or rapidly blink several times in a 1-2-3 sequence. Next, some number of LEDs will be lit-- write down this number. Usually the first digit or two of the FLID will be zero. Following each digit of the FLID, all three LEDs will be very briefly illuminated to serve as a separator. After all 8 digits have been presented the LEDs will roll and then the FLID is repeated. Once you have the sequence of 8 digits, it can be decoded as follows:
+
TinyOS 2.1 comes with an application that is designed to fail; it is here:
 +
 
 +
<pre>
 +
$TOSROOT/apps/tutorials/BlinkFail
 +
</pre>
 +
 
 +
Looking at BlinkFailC.nc, it is easy to see that the 11th time the Timer1.fired() is signaled, it will access out of bounds storage.
 +
 
 +
First, run the unsafe version of this application:
 +
 
 +
<pre>
 +
make micaz install
 +
</pre>
 +
 
 +
In general, it is hard to predict the consequences of a memory safety violation in a TinyOS program. In many cases, the observable symptom is node crash. That is what should happen after running the unsafe BlinkFail for a few seconds.
  
 +
Next run the safe version:
 +
 +
<pre>
 +
make micaz safe install
 +
</pre>
 +
 +
Prior to accessing <tt>a[10]</tt>, the Safe TinyOS failure handler is invoked, resulting in lots of LED activity.  The purpose of this activity is to give you a "FLID" (fault location identifier) indicating the source-code location of the faulting access.
 +
 +
A FLID is a sequence of 8 base-4 digits.  To read a FLID, wait for the mote's LEDs to "roll," or rapidly blink several times in a 1-2-3 sequence. Next, some number of LEDs will be lit: write down this number. Usually the first digit or two of a FLID will be zero. Following each digit of the FLID, all three LEDs will be very briefly illuminated to serve as a separator. After all 8 digits have been presented, the LEDs will again roll and then the FLID is repeated. Once you have the sequence of 8 digits, it can be decoded as follows:
 +
 +
<pre>
 
tos-decode-flid flid-file flid
 
tos-decode-flid flid-file flid
 +
</pre>
 +
 
The flid-file argument specifies a file that helps the script map from the flid to an error message. Building a safe application causes this file to be placed here:
 
The flid-file argument specifies a file that helps the script map from the flid to an error message. Building a safe application causes this file to be placed here:
  
 +
<pre>
 
build/[platform]/flids.txt
 
build/[platform]/flids.txt
Mixing Safe and Unsafe Code
+
</pre>
  
A module with the @safe attribute is compiled as type-safe and a module with the @unsafe attribute is compiled type-unsafe. By default, a module with neither of these attributes is considered unsafe. This default can be overridden using the nesC command line option -fnesc-default-safe.
+
For example, for the BlinkFail application we get:
  
TODO: explain how to find and eliminate unwanted trusted modules
+
<pre>
 +
[regehr@babel BlinkFail]$ tos-decode-flid ./build/micaz/flids.txt 00001020
 +
Deputy error message for flid 0x0048:
  
= Safety Violations in a Simulator =
+
BlinkFailC__a <= BlinkFailC__a + BlinkFailC__i++ + 1 (with no overflow): BlinkFailC.nc:70: Assertion failed in CPtrArithAccess:
 +
  BlinkFailC__a + BlinkFailC__i++ + 1 <= BlinkFailC__a + 10 (with no overflow)
 +
</pre>
  
To see how trapping safety violations is useful, go to this directory:
+
This confirms what we see by inspecting BlinkFailC.nc: line 70 is a safety problem.
  
$TOSROOT/apps/tutorials/BlinkFail
+
= Safety Violations in a Simulator =
Take a look inside BlinkFailC.nc. It is easy to see that the 11th time the Timer1.fired() is signaled, it will access out of bounds storage. First run the unsafe version:
 
  
make [platform] install
+
A simulator offers a pleasant alternative to debugging safety violations on real hardware.  To try this out, first download and install the CVS version of [http://compilers.cs.ucla.edu/avrora Avrora].
Although the consequences of memory safety violations in C or nesC programs are arbitrary, the expected case is that RAM corruption will eventually cause the node to crash. In this case, a crash should happen after just a few seconds.
 
  
Now run the safe version:
+
Next: build a safe version of BlinkFail, make a copy with the <tt>.elf</tt> extension, and run it in Avrora:
  
make [platform] safe install
+
<pre>
The trapped safety violation is indicated by lots of LED activity. If you want to run this in Avrora:
+
cd $TOSROOT/apps/tutorials/BlinkFail
 +
make micaz safe
 
cp build/micaz/main.exe main.elf
 
cp build/micaz/main.exe main.elf
avrora -simulation=sensor-network -platform=micaz -monitors=leds,break,sleep,interrupts main.elf
+
java avrora.Main -simulation=sensor-network -platform=micaz -monitors=leds,break,sleep,interrupts main.elf
The Avrora output will contain text like this:
+
</pre>
 +
 
 +
The Avrora output should contain text like this:
  
 +
<pre>
 
   0      54435382  break instruction @ 0x0B9E, r30:r31 = 0x0048
 
   0      54435382  break instruction @ 0x0B9E, r30:r31 = 0x0048
 
   0      54435382        @ deputy_fail_noreturn_fast
 
   0      54435382        @ deputy_fail_noreturn_fast
Line 95: Line 128:
 
   0      54435382        @ VirtualizeTimerC$0$fireTimers
 
   0      54435382        @ VirtualizeTimerC$0$fireTimers
 
   0      54435382        @ AlarmToTimerC$0$fired$runTask
 
   0      54435382        @ AlarmToTimerC$0$fired$runTask
The value loaded into the Z register (r30:r31) in code from fail.c is the FLID (see below).
+
</pre>
 +
 
 +
The callstack leading up to the safety violation is shown.  The value loaded into the register pair r30:r31 (the AVR's Z register) is the FLID.
  
 
= Understanding Safe Code =
 
= Understanding Safe Code =
  
Safe code contains safety annotations that are used to communicate invariants to the Deputy compiler so that it can insert appropriate safety checks. Most annotations appear in function prototypes and on global variables, but sometimes annotations are needed inside functions.
+
Safe code contains annotations that express invariants in such a way that the Deputy compiler can perform appropriate static and dynamic checking. Most annotations appear in nesC interface files and in global variable declarations, but sometimes annotations are needed inside functions.
  
 
Annotations that you will find in TinyOS code are:
 
Annotations that you will find in TinyOS code are:
  
ONE -- a pointer that always refers to a single object, similar to a C++ reference
+
; <tt>ONE</tt> : A pointer that always refers to a single object, similar to a C++ reference.
ONE_NOK -- same as ONE but may be null
+
 
COUNT(n) -- a pointer that always refers to a block of at least n objects
+
; <tt>ONE_NOK</tt> : Same as ONE but may be NULL.
COUNT_NOK(n) -- same as COUNT but may be null
+
 
BND(n,m) -- a pointer p such that n≤p<m, and that is aligned with respect to n and m
+
; <tt>COUNT(n)</tt> : A pointer that always refers to a block of at least n objects.
BND_NOK(n,m) -- same as BND but may be null
+
 
TCAST(type,expr) -- a trusted cast, which tells Deputy to just trust the programmer. This is needed to perform casts that are inherently unsafe (e.g., casting an array of bytes into a message_t) or to perform casts that are safe but beyond the reach of Deputy's type system (e.g. some kinds of getHeader() and getFooter() calls)
+
; <tt>COUNT_NOK(n)</tt> : Same as COUNT but may be NULL.
TRUSTEDBLOCK -- code that is completely trusted (i.e., ignored by Deputy). This is used in very few places, and should be avoided when possible.
+
 
The non-null annotations should be used whenever possible, for two reasons. First, the are good documentation. Second, they can help Deputy produce better code since dereferences of these pointers need not be preceded by a null check.
+
; <tt>BND(n,m)</tt> : A pointer p such that n≤p<m, and that is aligned with respect to n and m.
  
= Understanding Annotations in tos/interfaces/*.nc =
+
; <tt>BND_NOK(n,m)</tt> : Same as BND but may be NULL.
  
Write me.
+
; <tt>TCAST(type,expr)</tt> : A trusted cast, which tells Deputy to just trust the programmer. This is needed to perform casts that are safe, but are beyond the reach of Deputy's type system.  Pointer casts in the network stack (e.g. getHeader() and getFooter() code) tend to fall into this category.
  
 +
; <tt>TRUSTEDBLOCK</tt> : Code that is completely trusted (i.e., ignored by Deputy). This is used in very few places, and should be avoided when possible.
 +
 +
The [http://deputy.cs.berkeley.edu/quickref.html quick reference] and [http://deputy.cs.berkeley.edu/manual.html user manual] for Deputy are the best starting points for learning to use its type system, but note that the Safe TinyOS annotations are slightly different from the original Deputy ones.
  
 
= Writing Safe Code =
 
= Writing Safe Code =
  
Writing safe code is generally very easy: in the common case annotations are only needed on pointers that reference multiple objects. The best way to get started is to compile regular nesC or C code in safe mode, see what warnings/errors are emitted by the Deputy compiler, and then start annotating the code until it compiles cleanly. In a small minority of cases, it is necessary to do some refactoring.
+
Writing safe code is generally very easy: in the common case annotations are only needed in declarations for pointers. The best way to get started is to compile regular nesC code in safe mode and see what warnings/errors are emitted by the Deputy compiler.  Then, start annotating the code until it compiles cleanly and runs properly. In a small minority of cases, it is necessary to do some refactoring.
  
Jeremy Condit's quick reference and manual for Deputy are the best starting points for learning to use its type system.
+
<blockquote>
 +
<b>Important:</b> To get safe compilation, modules must be marked with nesC's <tt>@safe()</tt> attribute (see [[#Mixing Safe and Unsafe Code]] below).  Modules lacking this annotation are trusted code: they are compiled without safety checks.
 +
</blockquote>
  
= Customized Failure Handlers =
+
The non-null annotations should be used whenever possible, for two reasons. First, they serve as good documentation. Second, they generally result in more efficient code since dereferences of these pointers need not be preceded by a NULL check.
  
The default safety violation handler is here:
+
The [http://deputy.cs.berkeley.edu/quickref.html quick reference] and [http://deputy.cs.berkeley.edu/manual.html user manual] for Deputy are the best starting points for learning to use its type system, but note that the Safe TinyOS annotations are slightly different from the original Deputy ones.
$SAFE_TINYOS_HOME/tinyos-2.x/tos/lib/safe/platform/fail.c
+
 
This code can be changed to send a packet, log to flash, reboot the mote, or whatever you like. Note that it is highly unlikely that you could send a packet if a fault happened to occur in the radio stack.
+
= Safe Code Examples =
Optimization
+
 
 +
To declare msg as a variable of type "pointer to message_t" that always points to a valid message_t:
 +
 
 +
<pre>
 +
message_t* ONE msg = ...;
 +
</pre>
 +
 
 +
This pointer is never NULL and pointer arithmetic cannot be performed on it.
 +
 
 +
To declare uartQueue as an array containing UART_QUEUE_LEN elements, where the type of each element is "pointer to message_t" and each element of the array must at all times either be NULL or refer to a valid message_t:
 +
 
 +
<pre>
 +
message_t* ONE_NOK uartQueue[UART_QUEUE_LEN];
 +
</pre>
 +
 
 +
This pointer can be used in array expressions as follows:
 +
 
 +
<pre>
 +
p = uartQueue[-1];                // illegal
 +
p = uartQueue[0];                // legal
 +
p = uartQueue[1];                // legal
 +
p = uartQueue[UART_QUEUE_LEN-1];  // legal
 +
p = uartQueue[UART_QUEUE_LEN];    // illegal
 +
</pre>
 +
 
 +
To declare reqBuf as a pointer that is at all times either NULL or pointing to a valid block of at least reqBytes uint8_ts:
 +
 
 +
<pre>
 +
uint8_t * COUNT_NOK(reqBytes) reqBuf;
 +
</pre>
 +
 
 +
To declare pos as a pointer to uint16_t that is always either NULL or refers to a valid uint16_t inside the block of memory delimited by buffer and buffer+count:
 +
 
 +
<pre>
 +
norace uint16_t * BND_NOK(buffer, buffer+count) pos;
 +
</pre>
 +
 
 +
Multiply indirect pointers (rarely seen in TinyOS) require annotations at each level of indirection.  For example, to declare pp as a pointer to pointer to integer such that both pp and anything it points to are valid at all times:
 +
 
 +
<pre>
 +
int * ONE * ONE pp = ...;
 +
</pre>
 +
 
 +
The TCAST macro evaluates its second argument and turns it into a pointer of the type of its first argument without any safety checks:
 +
 
 +
<pre>
 +
cc2420_header_t* ONE x =
 +
  TCAST(cc2420_header_t* ONE,
 +
        (uint8_t *)msg + offsetof(message_t, data) - sizeof( cc2420_header_t ));
 +
</pre>
 +
 
 +
= Understanding Interface Annotations =
 +
 
 +
The TinyOS Core Working Group decided that it was important to avoid hurting the readability of the primary TinyOS interfaces by cluttering them up with safety annotations.  Therefore, the Safe TinyOS annotations in the interfaces use a different syntax than the one generally used for code in modules.
 +
To see how this works, consider this example from <tt>$TOSDIR/interfaces/Packet.nc</tt>:
 +
 
 +
<pre>
 +
/**
 +
  * Return a pointer to a protocol's payload region in a packet.
 +
  * If the caller intends to write to the payload region then
 +
  * the <tt>len</tt> parameter must reflect the maximum required
 +
  * length. If the caller (only) wants to read from the payload
 +
  * region, then <tt>len</tt> may be set to the value of
 +
  * payloadLength(). If the payload region is smaller than
 +
  * <tt>len</tt> this command returns NULL. The offset where
 +
  * the payload region starts within a packet is fixed, i.e. for
 +
  * a given <tt>msg</tt> this command will always return the same
 +
  * pointer or NULL.
 +
  *
 +
  * @param 'message_t* ONE msg'  the packet
 +
  * @param len  the length of payload required
 +
  * @return 'void* COUNT_NOK(len)'    a pointer to the packet's data payload for this layer
 +
  *              or NULL if <tt>len</tt> is too big
 +
  */
 +
command void* getPayload(message_t* msg, uint8_t len);
 +
</pre>
 +
 
 +
Instead of inlining the Safe TinyOS annotations into the <tt>getPayload()</tt> prototype, the annotations are placed in the [[nesdoc]] comment.  The syntax above is equivalent to
 +
 
 +
<pre>
 +
command void* COUNT_NOK(len) getPayload(message_t* ONE msg, uint8_t len);
 +
</pre>
 +
 
 +
It would be prudent to avoid mixing the two kinds of annotation syntax within a single function. 
 +
 
 +
The current convention (which is not automatically enforced) is to use the nesdoc annotation syntax in <tt>tos/interfaces</tt> and the inline syntax elsewhere.
 +
 
 +
= Mixing Safe and Unsafe Code =
 +
 
 +
By default, a TinyOS module is compiled as <i>unsafe</i> code.  In other words, it is trusted code and Deputy
 +
will not insert any checks into it.  The TinyOS Core Working Group decided on unsafe as the default compilation
 +
mode in order to make it easier to transition applications to Safe TinyOS.  This default can be overridden using the
 +
nesC compiler command line option <tt>-fnesc-default-safe</tt>.
 +
 
 +
A module with the <tt>@safe()</tt> attribute is compiled in safe mode: Deputy checks are inserted.
 +
For example in <tt>tos/system/SimpleArbiterP.nc</tt> starts with:
 +
 
 +
<pre>
 +
generic module SimpleArbiterP() @safe() {
 +
</pre>
 +
 
 +
A module with the <tt>@unsafe()</tt> attribute is explicitly exempt from Deputy's checking.
 +
 
 +
In general, the following conventions should be respected:
 +
* Use <tt>@safe()</tt> for any module that has been "Deputized," meaning that it contains sufficient annotations to compile and execute properly in safe mode
 +
* Use <tt>@unsafe()</tt> for any module that fundamentally cannot be compiled in safe mode (at the time this is written, no such modules exist)
 +
* Leave out annotations for modules that have not yet been Deputized
 +
 
 +
When compiling without Safe TinyOS (e.g. <tt>make micaz</tt>), safety annotations are ignored.
 +
 
 +
An application that contains no unsafe modules is globally safe: any safety error that can be caught by Deputy, will be caught.  The easiest way to check if an application that you are using or writing is globally safe is to inspect its <tt>app.c</tt> for <tt>DEPUTY_TRUSTEDBLOCK</tt> annotations, which are inserted by the nesC compiler to tell Deputy to avoid compiling functions from unsafe modules as safe code.  For example:
 +
 
 +
<pre>
 +
[regehr@babel Blink]$ cd $TOSROOT/apps/Blink
 +
[regehr@babel Blink]$ make micaz safe
 +
mkdir -p build/micaz
 +
    compiling BlinkAppC to a micaz binary
 +
ncc -o build/micaz/main.exe -DSAFE_TINYOS ...
 +
    compiled BlinkAppC to build/micaz/main.exe
 +
            2562 bytes in ROM
 +
              51 bytes in RAM
 +
avr-objcopy --output-target=srec build/micaz/main.exe build/micaz/main.srec
 +
avr-objcopy --output-target=ihex build/micaz/main.exe build/micaz/main.ihex
 +
    writing TOS image
 +
[regehr@babel Blink]$ grep -c DEPUTY_TRUSTEDBLOCK build/micaz/app.c
 +
14
 +
</pre>
  
These instructions give a binary that is not heavily optimized. It is possible to reduce the overheads of safety using our cXprop optimizer. We'll add instructions later on how to do this.
+
This shows that the <tt>app.c</tt> for Blink on MicaZ contains 14 DEPUTY_TRUSTEDBLOCK annotations.
 +
However, a detailed inspection shows that almost all of this trusted code is in the failure handlers
 +
themselves. It makes no sense to compile these as safe code.
  
 +
= Customizing the Failure Handler =
  
= To Do =
+
The default safety violation handler is here:
  
We have not yet, but plan to:
+
<pre>
Support platforms other than Mica2, Micaz, and TelosB
+
$TOSDIR/lib/safe/SafeFailureHandlerP.nc
Make TOSSIM work in safe mode
+
</pre>
Integrate Safe TinyOS with a stack depth analysis tool to avoid unsafety through stack overflow
 
Solve the problem of unsafe accesses to pointers to dead stack frames -- these are not covered by Deputy
 
  
Write documentation
+
This code could be changed to notify a base station that a safety violation occurred, log a FLID to flash, reboot the mote, or whatever you like.  Ideally, failure handling code should be atomic, should be independent of the state of the mote's RAM and hardware devices at the time a violation happens, and should not call into any safe code.  Thus, it is highly unlikely that you could successfully send a packet reporting a fault that occurred in the TinyOS network stack without having written a small, customized network driver whose specific purpose is to report safety errors in the main TinyOS network stack.
Eliminate as much trusted code (TCAST and TRUSTEDBLOCK) as possible
 
Let us know if you'd like to help!
 
  
 
= Feedback =
 
= Feedback =
  
Please send problem reports and other feedback to directly to John Regehr or to one of the TinyOS mailing lists. We can create a mailing list specific to Safe TinyOS if that seems desirable.
+
Please send problem reports and other feedback to <tt>tinyos-help</tt> or <tt>tinyos-devel</tt>, as appropriate.
People
+
 
 +
= To Do =
  
We are John Regehr, Eric Eide, Nathan Cooprider, Will Archer, and Yang Chen at Utah, and David Gay at Intel Research Berkeley.
+
For this document:
  
= Acknowledgments =
+
* Factor some of this information out into a TEP
 +
* Explain how to find and understand the Deputy output
 +
* Split out stuff for 2.1 vs. CVS HEAD (where there's a difference)
 +
* Show examples of how to refactor code to be Deputy-friendly
  
The Deputy and CIL people at Berkeley, mainly Jeremy Condit (now at Microsoft Research), Matt Harren, and Zachary Anderson, have been extremely helpful.
+
For Safe TinyOS in general:
  
This work has benefited from input from the TinyOS 2 Core Working Group. In particular, Phil Levis pushed through an important change to the getPayload interface.
+
* Support intelmote2, shimmer, and iris
 +
* Commit the Utah patch for safe TOSSIM
 +
* Although both the Deputy compiler and gcc both attempt to remove unneeded safety checks, there is still plenty of room for improvement.  We'd like advanced compiler techniques (e.g. calling out to a modern decision procedure) for showing that safety checks definitely do or do not fire
 +
* Write an [[Avrora]] monitor that goes ahead and automatically decodes the FLID
 +
* Write an [[mspsim]] hack that does the same
 +
* Add a nesC compiler option that prints a list of unsafe modules linked into an app
 +
* Integrate Safe TinyOS with a stack depth analysis tool to avoid unsafety through stack overflow
 +
* Solve the problem of unsafe accesses through pointers to dead stack frames
 +
* Eliminate as much trusted code (TCAST and TRUSTEDBLOCK) as possible
  
This work is supported by NSF awards CNS-0615367 and CNS-0448047.
+
Let us know if you'd like to help.

Latest revision as of 02:25, 26 November 2009

What is Safe TinyOS?

Safe TinyOS is a modified compilation toolchain, released as part of TinyOS 2.1, that uses the Deputy compiler (now part of the Ivy project) to enforce type and memory safety at run time on motes. Together, type and memory safety give Java-like guarantees to TinyOS programs: pointer and array errors are trapped before they occur.

Untrapped memory errors are particularly harmful in sensor network applications because:

  • Lacking memory protection hardware and a user/kernel boundary, it is easy to accidentally corrupt OS code; these errors are never trapped as segmentation violations.
  • Since memory objects are tightly packed, walking even one element past the end of an array is likely to corrupt RAM that your application relies on.
  • Since deployed motes are difficult to debug, and since other failure modes exist (concurrency errors, battery outage, connectivity failure, etc.) it can be very hard to tell when memory corruption is the root cause of a serious deployment problem.
  • They might permit someone to own your motes

Safe TinyOS is described in detail in a paper from SenSys 2007: Efficient Memory Safety for TinyOS.

Supported Platforms

Safe compilation is supported by default by TinyOS 2.1 for these platforms:

  • Mica2
  • MicaZ
  • TelosB

Building a Safe Application

Getting started is easy:

cd $TOSROOT/apps/Blink
make micaz safe

Since Blink contains no known safety errors, the safe version of Blink should act just like its unsafe counterpart.

As a sanity check, you may want to make sure that the safe version of Blink has a larger code size than the unsafe version:

$ make micaz
mkdir -p build/micaz
    compiling BlinkAppC to a micaz binary
ncc -o build/micaz/main.exe ....
    compiled BlinkAppC to build/micaz/main.exe
            2052 bytes in ROM
              51 bytes in RAM
avr-objcopy --output-target=srec build/micaz/main.exe build/micaz/main.srec
avr-objcopy --output-target=ihex build/micaz/main.exe build/micaz/main.ihex
    writing TOS image

$ make micaz safe
mkdir -p build/micaz
    compiling BlinkAppC to a micaz binary
ncc -o build/micaz/main.exe ...
    compiled BlinkAppC to build/micaz/main.exe
            2562 bytes in ROM
              51 bytes in RAM
avr-objcopy --output-target=srec build/micaz/main.exe build/micaz/main.srec
avr-objcopy --output-target=ihex build/micaz/main.exe build/micaz/main.ihex
    writing TOS image

The ROM overhead is due to bounds-checking code inserted by the Deputy compiler. Safe TinyOS does not have any RAM overhead.

Interpreting Safety Violations

TinyOS 2.1 comes with an application that is designed to fail; it is here:

$TOSROOT/apps/tutorials/BlinkFail

Looking at BlinkFailC.nc, it is easy to see that the 11th time the Timer1.fired() is signaled, it will access out of bounds storage.

First, run the unsafe version of this application:

make micaz install

In general, it is hard to predict the consequences of a memory safety violation in a TinyOS program. In many cases, the observable symptom is node crash. That is what should happen after running the unsafe BlinkFail for a few seconds.

Next run the safe version:

make micaz safe install

Prior to accessing a[10], the Safe TinyOS failure handler is invoked, resulting in lots of LED activity. The purpose of this activity is to give you a "FLID" (fault location identifier) indicating the source-code location of the faulting access.

A FLID is a sequence of 8 base-4 digits. To read a FLID, wait for the mote's LEDs to "roll," or rapidly blink several times in a 1-2-3 sequence. Next, some number of LEDs will be lit: write down this number. Usually the first digit or two of a FLID will be zero. Following each digit of the FLID, all three LEDs will be very briefly illuminated to serve as a separator. After all 8 digits have been presented, the LEDs will again roll and then the FLID is repeated. Once you have the sequence of 8 digits, it can be decoded as follows:

tos-decode-flid flid-file flid

The flid-file argument specifies a file that helps the script map from the flid to an error message. Building a safe application causes this file to be placed here:

build/[platform]/flids.txt

For example, for the BlinkFail application we get:

[regehr@babel BlinkFail]$ tos-decode-flid ./build/micaz/flids.txt 00001020
Deputy error message for flid 0x0048:

BlinkFailC__a <= BlinkFailC__a + BlinkFailC__i++ + 1 (with no overflow): BlinkFailC.nc:70: Assertion failed in CPtrArithAccess:
  BlinkFailC__a + BlinkFailC__i++ + 1 <= BlinkFailC__a + 10 (with no overflow)

This confirms what we see by inspecting BlinkFailC.nc: line 70 is a safety problem.

Safety Violations in a Simulator

A simulator offers a pleasant alternative to debugging safety violations on real hardware. To try this out, first download and install the CVS version of Avrora.

Next: build a safe version of BlinkFail, make a copy with the .elf extension, and run it in Avrora:

cd $TOSROOT/apps/tutorials/BlinkFail
make micaz safe
cp build/micaz/main.exe main.elf
java avrora.Main -simulation=sensor-network -platform=micaz -monitors=leds,break,sleep,interrupts main.elf

The Avrora output should contain text like this:

   0      54435382  break instruction @ 0x0B9E, r30:r31 = 0x0048
   0      54435382        @ deputy_fail_noreturn_fast
   0      54435382        @ deputy_fail_mayreturn
   0      54435382        @ VirtualizeTimerC$0$fireTimers
   0      54435382        @ AlarmToTimerC$0$fired$runTask

The callstack leading up to the safety violation is shown. The value loaded into the register pair r30:r31 (the AVR's Z register) is the FLID.

Understanding Safe Code

Safe code contains annotations that express invariants in such a way that the Deputy compiler can perform appropriate static and dynamic checking. Most annotations appear in nesC interface files and in global variable declarations, but sometimes annotations are needed inside functions.

Annotations that you will find in TinyOS code are:

ONE 
A pointer that always refers to a single object, similar to a C++ reference.
ONE_NOK 
Same as ONE but may be NULL.
COUNT(n) 
A pointer that always refers to a block of at least n objects.
COUNT_NOK(n) 
Same as COUNT but may be NULL.
BND(n,m) 
A pointer p such that n≤p<m, and that is aligned with respect to n and m.
BND_NOK(n,m) 
Same as BND but may be NULL.
TCAST(type,expr) 
A trusted cast, which tells Deputy to just trust the programmer. This is needed to perform casts that are safe, but are beyond the reach of Deputy's type system. Pointer casts in the network stack (e.g. getHeader() and getFooter() code) tend to fall into this category.
TRUSTEDBLOCK 
Code that is completely trusted (i.e., ignored by Deputy). This is used in very few places, and should be avoided when possible.

The quick reference and user manual for Deputy are the best starting points for learning to use its type system, but note that the Safe TinyOS annotations are slightly different from the original Deputy ones.

Writing Safe Code

Writing safe code is generally very easy: in the common case annotations are only needed in declarations for pointers. The best way to get started is to compile regular nesC code in safe mode and see what warnings/errors are emitted by the Deputy compiler. Then, start annotating the code until it compiles cleanly and runs properly. In a small minority of cases, it is necessary to do some refactoring.

Important: To get safe compilation, modules must be marked with nesC's @safe() attribute (see #Mixing Safe and Unsafe Code below). Modules lacking this annotation are trusted code: they are compiled without safety checks.

The non-null annotations should be used whenever possible, for two reasons. First, they serve as good documentation. Second, they generally result in more efficient code since dereferences of these pointers need not be preceded by a NULL check.

The quick reference and user manual for Deputy are the best starting points for learning to use its type system, but note that the Safe TinyOS annotations are slightly different from the original Deputy ones.

Safe Code Examples

To declare msg as a variable of type "pointer to message_t" that always points to a valid message_t:

message_t* ONE msg = ...;

This pointer is never NULL and pointer arithmetic cannot be performed on it.

To declare uartQueue as an array containing UART_QUEUE_LEN elements, where the type of each element is "pointer to message_t" and each element of the array must at all times either be NULL or refer to a valid message_t:

message_t* ONE_NOK uartQueue[UART_QUEUE_LEN];

This pointer can be used in array expressions as follows:

p = uartQueue[-1];                // illegal
p = uartQueue[0];                 // legal
p = uartQueue[1];                 // legal
p = uartQueue[UART_QUEUE_LEN-1];  // legal
p = uartQueue[UART_QUEUE_LEN];    // illegal

To declare reqBuf as a pointer that is at all times either NULL or pointing to a valid block of at least reqBytes uint8_ts:

uint8_t * COUNT_NOK(reqBytes) reqBuf;

To declare pos as a pointer to uint16_t that is always either NULL or refers to a valid uint16_t inside the block of memory delimited by buffer and buffer+count:

norace uint16_t * BND_NOK(buffer, buffer+count) pos;

Multiply indirect pointers (rarely seen in TinyOS) require annotations at each level of indirection. For example, to declare pp as a pointer to pointer to integer such that both pp and anything it points to are valid at all times:

int * ONE * ONE pp = ...; 

The TCAST macro evaluates its second argument and turns it into a pointer of the type of its first argument without any safety checks:

cc2420_header_t* ONE x = 
  TCAST(cc2420_header_t* ONE, 
        (uint8_t *)msg + offsetof(message_t, data) - sizeof( cc2420_header_t ));

Understanding Interface Annotations

The TinyOS Core Working Group decided that it was important to avoid hurting the readability of the primary TinyOS interfaces by cluttering them up with safety annotations. Therefore, the Safe TinyOS annotations in the interfaces use a different syntax than the one generally used for code in modules. To see how this works, consider this example from $TOSDIR/interfaces/Packet.nc:

/**
  * Return a pointer to a protocol's payload region in a packet.
  * If the caller intends to write to the payload region then
  * the <tt>len</tt> parameter must reflect the maximum required
  * length. If the caller (only) wants to read from the payload
  * region, then <tt>len</tt> may be set to the value of
  * payloadLength(). If the payload region is smaller than 
  * <tt>len</tt> this command returns NULL. The offset where
  * the payload region starts within a packet is fixed, i.e. for
  * a given <tt>msg</tt> this command will always return the same
  * pointer or NULL.
  *
  * @param 'message_t* ONE msg'   the packet 
  * @param len   the length of payload required
  * @return 'void* COUNT_NOK(len)'     a pointer to the packet's data payload for this layer
  *              or NULL if <tt>len</tt> is too big
  */
 command void* getPayload(message_t* msg, uint8_t len);

Instead of inlining the Safe TinyOS annotations into the getPayload() prototype, the annotations are placed in the nesdoc comment. The syntax above is equivalent to

command void* COUNT_NOK(len) getPayload(message_t* ONE msg, uint8_t len);

It would be prudent to avoid mixing the two kinds of annotation syntax within a single function.

The current convention (which is not automatically enforced) is to use the nesdoc annotation syntax in tos/interfaces and the inline syntax elsewhere.

Mixing Safe and Unsafe Code

By default, a TinyOS module is compiled as unsafe code. In other words, it is trusted code and Deputy will not insert any checks into it. The TinyOS Core Working Group decided on unsafe as the default compilation mode in order to make it easier to transition applications to Safe TinyOS. This default can be overridden using the nesC compiler command line option -fnesc-default-safe.

A module with the @safe() attribute is compiled in safe mode: Deputy checks are inserted. For example in tos/system/SimpleArbiterP.nc starts with:

generic module SimpleArbiterP() @safe() {

A module with the @unsafe() attribute is explicitly exempt from Deputy's checking.

In general, the following conventions should be respected:

  • Use @safe() for any module that has been "Deputized," meaning that it contains sufficient annotations to compile and execute properly in safe mode
  • Use @unsafe() for any module that fundamentally cannot be compiled in safe mode (at the time this is written, no such modules exist)
  • Leave out annotations for modules that have not yet been Deputized

When compiling without Safe TinyOS (e.g. make micaz), safety annotations are ignored.

An application that contains no unsafe modules is globally safe: any safety error that can be caught by Deputy, will be caught. The easiest way to check if an application that you are using or writing is globally safe is to inspect its app.c for DEPUTY_TRUSTEDBLOCK annotations, which are inserted by the nesC compiler to tell Deputy to avoid compiling functions from unsafe modules as safe code. For example:

[regehr@babel Blink]$ cd $TOSROOT/apps/Blink
[regehr@babel Blink]$ make micaz safe
mkdir -p build/micaz
    compiling BlinkAppC to a micaz binary
ncc -o build/micaz/main.exe -DSAFE_TINYOS ...
    compiled BlinkAppC to build/micaz/main.exe
            2562 bytes in ROM
              51 bytes in RAM
avr-objcopy --output-target=srec build/micaz/main.exe build/micaz/main.srec
avr-objcopy --output-target=ihex build/micaz/main.exe build/micaz/main.ihex
    writing TOS image
[regehr@babel Blink]$ grep -c DEPUTY_TRUSTEDBLOCK build/micaz/app.c 
14

This shows that the app.c for Blink on MicaZ contains 14 DEPUTY_TRUSTEDBLOCK annotations. However, a detailed inspection shows that almost all of this trusted code is in the failure handlers themselves. It makes no sense to compile these as safe code.

Customizing the Failure Handler

The default safety violation handler is here:

$TOSDIR/lib/safe/SafeFailureHandlerP.nc

This code could be changed to notify a base station that a safety violation occurred, log a FLID to flash, reboot the mote, or whatever you like. Ideally, failure handling code should be atomic, should be independent of the state of the mote's RAM and hardware devices at the time a violation happens, and should not call into any safe code. Thus, it is highly unlikely that you could successfully send a packet reporting a fault that occurred in the TinyOS network stack without having written a small, customized network driver whose specific purpose is to report safety errors in the main TinyOS network stack.

Feedback

Please send problem reports and other feedback to tinyos-help or tinyos-devel, as appropriate.

To Do

For this document:

  • Factor some of this information out into a TEP
  • Explain how to find and understand the Deputy output
  • Split out stuff for 2.1 vs. CVS HEAD (where there's a difference)
  • Show examples of how to refactor code to be Deputy-friendly

For Safe TinyOS in general:

  • Support intelmote2, shimmer, and iris
  • Commit the Utah patch for safe TOSSIM
  • Although both the Deputy compiler and gcc both attempt to remove unneeded safety checks, there is still plenty of room for improvement. We'd like advanced compiler techniques (e.g. calling out to a modern decision procedure) for showing that safety checks definitely do or do not fire
  • Write an Avrora monitor that goes ahead and automatically decodes the FLID
  • Write an mspsim hack that does the same
  • Add a nesC compiler option that prints a list of unsafe modules linked into an app
  • Integrate Safe TinyOS with a stack depth analysis tool to avoid unsafety through stack overflow
  • Solve the problem of unsafe accesses through pointers to dead stack frames
  • Eliminate as much trusted code (TCAST and TRUSTEDBLOCK) as possible

Let us know if you'd like to help.