6

I have a Raspberry Pi that is acting as a sensor which I am sending out to various customers. The sensor is recording approximately 1 GB every 2-3 days, so I would like to have a way to remove old data.

The only way I have found is a cron job that deletes the files, which relies on the Pi knowing the correct time, and that in turn requires internet access. The commands I have found are the following: run crontab -e, then add:

0 15 * * * find PATH -mindepth 1 -mtime +30 -delete

This would delete files more than 30 days old at 3pm every day. The problem is that these sensors won't have access to the internet, so the time will reset when they are restarted. Is there an alternative that wouldn't require internet?

To give a little more information: the sensors will typically be in place for about 10 days before being turned off and back on a few days later. The reason I want the files deleted only after some time is in case there is a problem. If files were deleted 30 days after creation, this would give the customer time to ship the sensor back to me so I could take a look at it.

I already have the folders sequentially numbered, so that route may be easiest for me. The folders are labelled 1, 2, 3 etc., with each folder holding the data from one startup/shutdown cycle, and every file inside the folders is a .csv file. Is there a way that I could write a cron command so that when the Pi is close to running out of space, it would delete, say, folders 1-X to clear 10 GB of space or something like that? If it's easier, I could also tell it to delete folders 1-5 when it's running out of space.

I also have looked into the savelog option a little bit, and this might be what I am looking for although I am not sure how to use it. I would want to use it with folders of data since that would be easier. If it could only keep the last 5 folders at a time that would work since I expect each folder to be ~2-3 GB.

Greenonline

5 Answers

14

You could add a battery-backed real-time clock (RTC) chip to each Pi, but there is an obvious cost to this.

One other way is to use a sequential number for each file and delete the older files based on the number sequence. Obviously, you would need to keep track of the numbers but it should be possible to work out the values if you know how many are generated per day.
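As a rough sketch of the sequence-number idea, assuming the folders are named with plain integers (1, 2, 3, ...) as described in the question, something like the following keeps only the newest few folders and removes the rest. The function name prune_numbered_dirs and any paths are made up for illustration.

```shell
# Hypothetical helper: keep only the KEEP highest-numbered folders in DATA_DIR.
# Assumes folder names are plain non-negative integers; anything else is ignored.
prune_numbered_dirs() {
    local data_dir="$1"
    local keep="$2"
    local d name n

    # Print the purely numeric directory names, one per line.
    for d in "$data_dir"/*/; do
        name=$(basename "$d")
        case $name in
            ''|*[!0-9]*) continue ;;   # skip non-numeric names
        esac
        printf '%s\n' "$name"
    done |
    # Highest number first, then skip the newest $keep entries.
    sort -rn | tail -n +"$((keep + 1))" |
    # Whatever is left is older than the newest $keep folders: remove it.
    while read -r n; do
        rm -rf -- "$data_dir/$n"
    done
}
```

For example, prune_numbered_dirs /home/pi/data 5 (the path is only an example) would leave the five highest-numbered folders. It could be called at startup, just before a new folder is created, so it doesn't depend on the clock at all.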

14

Since you don't have NTP over the Internet, file timestamps are unreliable: the system clock is reset to the time stored in the Pi's "/etc/fake-hwclock.data" whenever the system boots.

If the daily data file(s) have a standard name (e.g. /PATH/sensor1.log) or a common tag (e.g. the ".data" extension), you can use savelog (or logrotate) to keep only the most recent (e.g. 30) copies:

savelog -l -n -c30 /PATH/sensor1.log

or

savelog -l -n -c30 /PATH/*.data
  • '-l' Don't compress
  • '-n' Do not rotate empty files
  • '-c' Save cycle versions

Then just add it to your user crontab, for example:

0 15 * * * savelog -l -n -c30 /PATH/sensor1.log
0 15 * * * savelog -l -n -c30 /PATH/*.data

You'll end up with the current file plus consecutively numbered older copies, and the oldest copy is deleted automatically once the 30-cycle limit is reached.

p.s. The commands 'man savelog' or 'savelog --help' are your friends.

dave58
7

On a system where the time resets on boot, you can't know what time it is when the system starts, nor how long the system was down. But you can count reboots, and you probably have a clock that works while the system is up.

Have a global counter ("run id"), stored in a file, a database, or whatever, that you increment by one every time your sampling software starts. Then keep a running counter while the software is running, and increment that by one for each sample. With the samples tagged with [run id, sample id] pairs, you'll be able to determine their real-time order, tell where the gaps from a shutdown are, and remove the oldest files. Alternatively, use the time elapsed since the software started instead of sample IDs.
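A minimal sketch of the run counter, assuming it lives in a small text file; the function name next_run_id and the file location are placeholders, not anything from the asker's setup.

```shell
# Hypothetical helper: bump a persistent run counter and print the new value.
# The counter file path is passed in by the caller.
next_run_id() {
    local counter_file="$1"
    local last=0
    local next

    # Read the previous value if the file exists; ignore read errors.
    if [ -r "$counter_file" ]; then
        read -r last < "$counter_file" || true
    fi

    # Fall back to 0 if the file was empty or did not hold a plain number.
    case $last in
        ''|*[!0-9]*) last=0 ;;
    esac

    next=$((last + 1))

    # Write to a temporary file and rename it, so an interrupted write is
    # less likely to leave a corrupted counter behind.
    printf '%s\n' "$next" > "$counter_file.tmp"
    mv -f "$counter_file.tmp" "$counter_file"

    printf '%s\n' "$next"
}
```

The sampling software's start-up script could then do something like run_id=$(next_run_id /home/pi/data/run_counter) and create a folder named after $run_id (both paths are only examples).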

You can also count the samples to determine how long the system was up on each run, which gives a lower bound on the elapsed wall-clock time. But I don't think that's necessary; you're probably fine with just removing the oldest samples (or whole runs) when the storage space starts running out.
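Removing whole runs when space gets low could look roughly like this. It is only a sketch: it assumes each run is a folder named with a plain integer, and the names free_kib, list_runs and prune_until_free, as well as the thresholds, are invented for the example.

```shell
# Free space (in KiB) on the filesystem that holds the given path.
free_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Print the numeric run folder names in DATA_DIR, oldest (lowest) first.
list_runs() {
    local d name
    for d in "$1"/*/; do
        name=$(basename "$d")
        case $name in
            ''|*[!0-9]*) continue ;;   # skip non-numeric names
        esac
        printf '%s\n' "$name"
    done | sort -n
}

# Hypothetical helper: delete the lowest-numbered runs until at least
# MIN_FREE_KIB is available, but never touch the newest KEEP runs.
# Returns 1 if it had to stop before enough space was freed.
prune_until_free() {
    local data_dir="$1"
    local min_free_kib="$2"
    local keep="${3:-1}"
    local count oldest

    while [ "$(free_kib "$data_dir")" -lt "$min_free_kib" ]; do
        count=$(( $(list_runs "$data_dir" | wc -l) ))
        # Stop rather than delete the most recent runs.
        if [ "$count" -le "$keep" ]; then
            return 1
        fi
        oldest=$(list_runs "$data_dir" | sed -n '1p')
        rm -rf -- "$data_dir/$oldest"
    done
}
```

For instance, prune_until_free /home/pi/data 10485760 2 would try to keep about 10 GiB free while always leaving the two newest runs in place. Since the check looks only at free space and folder numbers, it could be run at boot or at a fixed interval without the clock being correct.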

ilkkachu
6

Raspbian includes fake-hwclock, which saves the clock to the SD card on shutdown and restores it on boot. However, if you restart the Pi by simply cutting the power, this is of little use: the Pi never shuts down cleanly, so each boot restores the same last-saved time.

Deanna Earley's solution is to add a line to /etc/crontab:

* * * * * root fake-hwclock save

This will save the clock every minute, allowing your other cron job to work properly even if the Pi's power is cut. (Provided that fake-hwclock hasn't been uninstalled, anyway.)

Note that the clock won't tick while the power is off, so it will fall progressively further behind real time; you won't be able to guarantee that the files get wiped at 3pm.

wizzwizz4
4

You could

  • name your files sequentially and delete the oldest when there’s not much free space left
  • use a larger SD card
  • get the time by other means, for example GPS, DCF77 or RDS
Martin