This is a very general question, but I'm sure others will want to know the answer...
I'm using headless RPis for a remote monitoring application. These RPis will be buried in utility closets and other generally inaccessible places. I've already written Python code that POSTs readings to our server, and I'm trapping (almost) all of the errors that would cause my app to crash.
What tools and techniques would you recommend to maximize the app's uptime "no matter what" (e.g. restart the app if it hangs, reboot the machine if that doesn't work), and to diagnose problems after the system crashes?
Note: I have a vague sense that an init.d script lets me register the application as a service that can be started and stopped, and that update-rc.d will register it to launch at startup. I can also use syslog or syslog-ng to log errors remotely. But what's the best way to create an app-level watchdog and a system-level watchdog?
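To make the app-level part concrete, the kind of thing I have in mind is a heartbeat file: the app touches a file after every successful reading, and a separate supervisor restarts the app if that file gets too old. Here is a rough sketch of the idea. The path, the threshold, and the function names are all placeholders I made up, and I don't know whether this is the recommended approach.

```python
import os
import tempfile
import time

# Placeholder values, not taken from any real setup.
HEARTBEAT_PATH = os.path.join(tempfile.gettempdir(), "monitor.heartbeat")
MAX_AGE_SECONDS = 120


def beat(path=HEARTBEAT_PATH):
    """Called by the app after each successful reading/POST."""
    # Create the file if it is missing, then set its mtime to "now".
    with open(path, "a"):
        os.utime(path, None)


def is_stale(path=HEARTBEAT_PATH, max_age=MAX_AGE_SECONDS, now=None):
    """Return True if the heartbeat is missing or older than max_age seconds."""
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(path)
    except OSError:
        # No heartbeat file at all counts as stale.
        return True
    return age > max_age
```

A separate process (run from cron or as its own service) would call `is_stale()` periodically and restart the app when it returns True. That only covers a hung app, not a hung machine, which is why I'm also asking about the system-level side.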
Anything else I should be thinking about?