4

I am using Raspberry Pi 2 (B+) boards in a biology lab to do real-time tracking of small insects. Basically, we have a Python/OpenCV program that detects the positions of animals and saves them in a MySQL database. We typically run experiments for 2 weeks, sometimes on as many as 30 Pis simultaneously. Often, some of our Pis freeze: the green ACT LED stops blinking and they no longer respond to ping/ssh. The power LED remains on, and the only way I have found to reboot the devices is to power-cycle them by hand. This is very hard to reproduce, since a device can run fine for several days and then crash (or not). journalctl does not provide any clues after it happens.

We keep a log of crashing devices, and it appears that some have never crashed while others keep crashing after a few days.

For several reasons, we think it is related to faulty SD cards:

  • We had a much higher propensity to crash with alternative cards (Verbatim microSDHC, 32GB class 10).
  • Swapping cards between devices indicates that the issue follows the card, as opposed to the power supply or another hardware issue.
  • It does not seem that we run out of RAM either.

I have tried to:

  • Reburn card from img file
  • Update firmware
  • Use Pi3

Because of the random nature of the bug, testing every possible solution is a matter of time and statistics, so I am not quite sure where to start.
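One way to put numbers on the "time and statistics" part is to compare crash counts between two groups of devices (for example two card models) with Fisher's exact test, which works for small counts. The sketch below is a plain-Python illustration, not something we currently run; the function name and the counts in the usage example are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a, b: crashed / not crashed in group 1
    c, d: crashed / not crashed in group 2
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x crashes in group 1, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum every table that is at most as likely as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    # Hypothetical counts: 10 Pis on card A (7 crashed), 20 on card B (3 crashed).
    print(fisher_exact_two_sided(7, 3, 3, 17))
```

A small p-value would suggest that the difference in crash rate between the two groups is unlikely to be chance alone; it says nothing about why the cards differ.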

Technical details:

$ uname -a
Linux e043 4.4.13-1-ARCH #1 SMP Wed Jun 8 19:31:47 MDT 2016 armv7l GNU/Linux

The SD cards we use are '32G Samsung EVO SD card':

$ grep . /sys/class/mmc_host/mmc0/mmc0:*/* 2>/dev/null
/sys/class/mmc_host/mmc0/mmc0:0001/cid:1b534d303030303010c337142500f147
/sys/class/mmc_host/mmc0/mmc0:0001/csd:400e00325b590000ee7f7f800a404055
/sys/class/mmc_host/mmc0/mmc0:0001/date:01/2015
/sys/class/mmc_host/mmc0/mmc0:0001/erase_size:512
/sys/class/mmc_host/mmc0/mmc0:0001/fwrev:0x0
/sys/class/mmc_host/mmc0/mmc0:0001/hwrev:0x1
/sys/class/mmc_host/mmc0/mmc0:0001/manfid:0x00001b
/sys/class/mmc_host/mmc0/mmc0:0001/name:00000
/sys/class/mmc_host/mmc0/mmc0:0001/oemid:0x534d
/sys/class/mmc_host/mmc0/mmc0:0001/preferred_erase_size:4194304
/sys/class/mmc_host/mmc0/mmc0:0001/scr:02b5800200000000
/sys/class/mmc_host/mmc0/mmc0:0001/serial:0xc3371425
/sys/class/mmc_host/mmc0/mmc0:0001/type:SD
/sys/class/mmc_host/mmc0/mmc0:0001/uevent:DRIVER=mmcblk
/sys/class/mmc_host/mmc0/mmc0:0001/uevent:MMC_TYPE=SD
/sys/class/mmc_host/mmc0/mmc0:0001/uevent:MMC_NAME=00000
/sys/class/mmc_host/mmc0/mmc0:0001/uevent:MODALIAS=mmc:block
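For comparing cards across a fleet, the cid string alone is enough, since it packs the manufacturer, name, revision, serial and date fields. Below is a minimal sketch of decoding it, assuming the standard SD CID register layout (1-byte manufacturer ID, 2-byte OEM ID, 5-byte product name, 1-byte revision, 4-byte serial, 12-bit manufacturing date, then the CRC byte). The function name decode_sd_cid is made up for illustration.

```python
def decode_sd_cid(cid_hex):
    """Decode a 32-hex-digit SD card CID string as exposed in sysfs."""
    raw = bytes.fromhex(cid_hex)
    if len(raw) != 16:
        raise ValueError("CID must be 16 bytes (32 hex digits)")
    revision = raw[8]
    # Date field: 4 reserved bits, 8-bit year offset from 2000, 4-bit month.
    mdt = int.from_bytes(raw[13:15], "big") & 0x0FFF
    return {
        "manfid": raw[0],
        "oemid": raw[1:3].decode("ascii", errors="replace"),
        "name": raw[3:8].decode("ascii", errors="replace"),
        "hwrev": revision >> 4,
        "fwrev": revision & 0x0F,
        "serial": int.from_bytes(raw[9:13], "big"),
        "date": f"{mdt & 0x0F:02d}/{2000 + (mdt >> 4)}",
    }


if __name__ == "__main__":
    # CID copied from the sysfs listing above.
    print(decode_sd_cid("1b534d303030303010c337142500f147"))
```

Run on the CID from the listing, the decoded fields agree with the separate manfid, oemid, name, hwrev, fwrev, serial and date entries, so logging just the cid per device should be sufficient to tell card batches apart.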

Boot config file (Pi3):

$ /opt/vc/bin/vcgencmd get_config int
arm_freq=1200
audio_pwm_mode=1
config_hdmi_boost=5
core_freq=400
desired_osc_freq=0x36ee80
disable_camera_led=1
disable_commandline_tags=2
disable_l2cache=1
force_eeprom_read=1
force_pwm_open=1
framebuffer_ignore_alpha=1
framebuffer_swap=1
gpu_freq=300
hdmi_force_cec_address=65535
init_uart_clock=0x2dc6c00
lcd_framerate=60
over_voltage_avs=0x13d62
overscan_bottom=48
overscan_left=48
overscan_right=48
overscan_top=48
pause_burst_frames=1
program_serial_random=1
sdram_freq=450
temp_limit=85

goldilocks

5 Answers

1

I faced this problem in June 2020. Turning the swap file off fixed it:

sudo dphys-swapfile swapoff
sudo dphys-swapfile uninstall
sudo update-rc.d dphys-swapfile remove

In /etc/dphys-swapfile, set CONF_SWAPSIZE=0 (it was 100 for me).

This fixed the random freezing in my case. Check the result by running free before the change, after it, and again after a reboot. Swap should be 0.
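If you have many Pis to check, the same information that free prints can be read from /proc/meminfo. Here is a minimal sketch; the function name swap_total_kib is hypothetical, and it assumes the usual "SwapTotal:  <number> kB" line format.

```python
import os


def swap_total_kib(meminfo_text):
    """Return the SwapTotal value (in kB) from the text of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # Line looks like: "SwapTotal:        102396 kB"
            return int(line.split()[1])
    raise ValueError("SwapTotal not found")


if __name__ == "__main__":
    path = "/proc/meminfo"
    if os.path.exists(path):
        with open(path) as f:
            total = swap_total_kib(f.read())
        print("swap is off" if total == 0 else f"swap still enabled: {total} kB")
```

Run it once after the change and once after a reboot; it should report that swap is off both times.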

Woodoo
0

I have experienced problems with corrupting SD cards on the Pi 3, so I suspect this is what you are experiencing also. My solution was to use a different SD card. Another solution is to copy the ext4 filesystem to a USB memory stick and change the config on the boot partition of the SD card so that it points to the new location of the root filesystem.

Rebroad
0

SD cards are flimsy little things. I had a similar issue with a fleet of BeagleBone Black boards. Since you've tracked the issue down to the SD card itself, and reflashing the card didn't solve it, replace the SD card. They are cheap.

tlhIngan
0

I always try to stick with class 4 (or lower) microSD cards, and have never had this problem. Do you supply enough current? Do you have some USB gadget that pulls >100mA? Insufficient power is a nasty microSD cause-of-death. My Pi2B+ gets 2A.

I am not putting much faith in the 'use the highest-quality Samsung/known-brand SDHC' lore. All my cards are no-name and cheap. I leave a bit (15-20%) unallocated, since I expect they will degrade faster. To prevent the autoresize, rename the resize script to autoresizeforfuturereference.sh and move it to /root before booting a freshly imaged microSD card.

If your application requires heavy disk I/O, use a USB harddisk with a USB Y-cable, and power it from a sufficient source.

user2497
0

I don't know if the following will do any good, but it won't hurt to give it a try.

  1. I usually use F3 (an alternative to the h2testw utility) to run a robust read/write test on my microSD cards and/or USB memory sticks before I use them.
  2. As a double check, I also compute full checksums of a microSD card after using dd to write an OS image to it.
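The checksum step can be sketched as follows: hash the image file, then hash the same number of bytes read back from the card, and compare. This is an illustration under assumptions, not the exact procedure I use; the function names are made up, and the device path in the usage comment is a placeholder. Re-insert the card (or otherwise drop the page cache) before reading back, otherwise the data may come from RAM rather than from the card.

```python
import hashlib
import os


def sha256_prefix(path, length, chunk_size=1024 * 1024):
    """SHA-256 of the first `length` bytes of a file or block device."""
    digest = hashlib.sha256()
    remaining = length
    with open(path, "rb") as f:
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                raise ValueError(f"{path} is shorter than {length} bytes")
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


def image_matches_device(image_path, device_path):
    """True if the device starts with exactly the bytes of the image."""
    size = os.path.getsize(image_path)
    return sha256_prefix(image_path, size) == sha256_prefix(device_path, size)


# Usage (needs read access to the device; /dev/sdX is a placeholder):
#   image_matches_device("os.img", "/dev/sdX")
```

Only the first image-sized part of the card is compared, because the card is normally larger than the image and the rest holds unrelated data.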
user91822