3

I have been using a Micro SD card that has been getting corrupted very frequently. I tested it with the badblocks command on another Linux box and it said there were 0 errors. I also checked it with the F3 Fight False Flash tool and the results were 0% corruption.

I then checked the checksum of the image I was using (Raspbian Jessie Lite) and it was the same as the checksum on the official website. I doubt that the image was corrupted as I wrote the image to the card because I have done it multiple times with other images and until now have had no problems.

The reason I believe it is corrupted is because whenever I do an update/upgrade or shortly after while trying to install a package, it ends up giving a fatal error like:

Selecting previously unselected package htop.
dpkg: unrecoverable fatal error, aborting:
 files list file for package 'tzdata' is missing final newline
E: Sub-process /usr/bin/dpkg returned an error code (2)

I have run the apt-get -f install command, but that does not help. Could it be that I am not giving the Pi enough power? I am using a 2A power supply and have another card that does not have any problems with the Pi 3 and that power adapter. Also, the red light is solid most of the time, only blinking every couple of minutes. I am using SSH and running it headless, so there is not a great power draw. What could be the issue? My power supply? The card?

EDIT Forgot to mention, Raspbian Lite automatically expands the filesystem on boot so I know it is not because the file system was full. Also, I ran df -h and it said 12% full for the /root partition.

UPDATE I found that I can fix the system by removing the .list file of the package named in the dpkg error and then running apt install package --reinstall. However, my question still remains: why do parts of the file system get corrupted like this on a regular basis?
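A rough sketch of that workaround as a shell helper, in case it is useful to others. It only prints the commands rather than running them. The function names are made up for illustration, and the /var/lib/dpkg/info/ location of the .list file is an assumption about a standard dpkg layout (on multiarch systems the file may be named package:arch.list instead).

```shell
# Pull the package name out of a dpkg "files list file" error line (read on stdin).
pkg_from_dpkg_error() {
    sed -n "s/.*files list file for package '\([^']*\)'.*/\1/p"
}

# Print (do not run) the repair commands for the package named in the error.
# Assumption: the list file is /var/lib/dpkg/info/<package>.list.
suggest_dpkg_repair() {
    pkg=$(pkg_from_dpkg_error)
    [ -n "$pkg" ] || return 1
    printf 'sudo rm /var/lib/dpkg/info/%s.list\n' "$pkg"
    printf 'sudo apt install --reinstall %s\n' "$pkg"
}

# Example with the error line from above:
echo " files list file for package 'tzdata' is missing final newline" | suggest_dpkg_repair
```

Printing the commands first makes it easy to confirm the package name before anything is deleted.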

UPDATE After running e2fsck with -f I get:

e2fsck 1.42.13 (17-May-2015) 
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity 
Pass 4: Checking reference counts
Unattached inode 139333
Connect to /lost+found<y>? yes
Inode 139333 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 139343
Connect to /lost+found<y>? yes
Inode 139343 ref count is 2, should be 1.  Fix<y>? yes
Unattached inode 139380
Connect to /lost+found<y>? yes
Inode 139380 ref count is 2, should be 1.  Fix<y>? yes
Inode 139402 ref count is 1, should be 2.  Fix<y>? yes 
Inode 139403 ref count is 1, should be 2.  Fix<y>? yes 
Inode 139404 ref count is 1, should be 2.  Fix<y>? yes 
Inode 139405 ref count is 1, should be 2.  Fix<y>? yes 
Inode 139406 ref count is 1, should be 2.  Fix<y>? yes 
Inode 139408 ref count is 1, should be 2.  Fix<y>? yes 
Inode 139409 ref count is 1, should be 2.  Fix<y>? yes 
Inode 140012 ref count is 1, should be 2.  Fix<y>? yes 
Inode 140355 ref count is 1, should be 2.  Fix<y>? yes 
Pass 5: Checking group summary information
/dev/mmcblk0p2: ***** FILE SYSTEM WAS MODIFIED ***** 
/dev/mmcblk0p2: 63236/482880 files (0.2% non-contiguous), 452979/1948928 blocks 
NULL

2 Answers

4

I tested it with the badblocks command on another Linux box and it said there were 0 errors.

badblocks is of little use with SD cards, because they use virtual addressing and there is no guarantee that a logical address consistently corresponds to the same physical address; in fact, even simply reading data may cause it to be physically moved in a way that is opaque to the operating system, and hence to any software running on it. Even if you do a vigorous read/write test using badblocks, it is in theory possible that, on a 4 GB card, the controller could simply reuse the exact same 1 MB erase block 4000+ times during the scan, meaning it would have verified the functionality of less than 0.1% of the volume.
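To put a number on that worst case, here is the arithmetic as a small shell sketch. The 1 MB erase block and 4 GB card are only the illustrative figures used above (taken here as 1 MiB and 4096 MiB), and the function name is made up for the example.

```shell
# Percentage of the card actually exercised if the controller kept
# reusing a single erase block. Both arguments are sizes in MiB.
coverage_percent() {
    awk -v block="$1" -v card="$2" 'BEGIN { printf "%.4f\n", block / card * 100 }'
}

# One 1 MiB erase block on a 4096 MiB card:
coverage_percent 1 4096
```

The result is about 0.024%, comfortably under the 0.1% bound.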

the F3 Fight False Flash tool and the results were 0% corruption

I'm unfamiliar with this, but it does at least sound like it's intended for checking flash-based devices. Ultimately, though, I'm uncertain to what extent that is truly possible with SD cards in particular, which, unlike SSDs, are intended to be as cheap as possible and may lack features that would allow them to be reliably checked at all (you could always dig into the software docs to see what they have to say about that). Like other flash media they are potentially self-checking, and this is, I think, the primary means of ensuring their reliability. But again, while SD cards are great considering how cheap they are, the emphasis in their design is on keeping them cheap.

That said, filesystem corruption is something that can occur on a perfectly sound piece of hardware, so even with a traditional spinning disk, being able to demonstrate there are no bad blocks doesn't mean a particular filesystem is in good order. A device can pass a low level hardware test with garbage on it.

What you should focus on, since you have another Linux system to check the card with, is testing the root filesystem with e2fsck. This does get run intermittently (based on number of boots; see note 1 below) or if the filesystem was not cleanly unmounted, but it can't be done easily once the system is up and running normally. You can force a check on every boot with a parameter passed to systemd via the kernel command line, i.e., by adding fsck.mode=force to /boot/cmdline.txt, but this will add substantially to your boot times. The most straightforward way to do it is certainly by putting the card in another machine.
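If you do want the forced check on every boot, one way to add the parameter is sketched below. The function name is hypothetical, and it assumes the file is a single-line kernel command line, as /boot/cmdline.txt is on Raspbian. It is safe to call more than once.

```shell
# Append fsck.mode=force to a one-line kernel command line file, only once.
# Pass the file path as the first argument (e.g. /boot/cmdline.txt).
add_force_fsck() {
    file=$1
    if grep -q 'fsck\.mode=force' "$file"; then
        return 0    # already present, leave the file alone
    fi
    tmp=$(mktemp) || return 1
    # The file must stay a single line, so extend line 1 rather than add a new one.
    sed '1 s/$/ fsck.mode=force/' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage on the Pi (needs root; back up the file first):
#   sudo cp /boot/cmdline.txt /boot/cmdline.txt.bak
#   sudo sh -c '. ./add_force_fsck.sh; add_force_fsck /boot/cmdline.txt'
```

The script file name in the usage comment is just a placeholder for wherever you save the function.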

To ensure a full check is done even if the filesystem appears "clean", use e2fsck -f.

So, if you suspect something fishy of that sort is going on, then you should check that. If it says it is okay, it is probably okay, because unlike the low level hardware tests, this is an actual check of the integrity of the data on the device according to a maintained record -- which the low level hardware tools don't care about because that's not their purpose. A filesystem on the card could have been completely trashed and those tools will not know any better as long as the hardware checks out.
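As a sketch of how that check might look from the other machine: /dev/sdX2 below is a placeholder for wherever the card's root partition shows up there (check with lsblk), and the helper name is made up. The helper just turns e2fsck's exit status, which the man page documents as a bitmask, into readable text.

```shell
# Translate an e2fsck exit status (a bitmask, per the e2fsck man page) into text.
describe_e2fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    if [ $((status & 1)) -ne 0 ]; then echo "errors corrected"; fi
    if [ $((status & 2)) -ne 0 ]; then echo "errors corrected, reboot advised"; fi
    if [ $((status & 4)) -ne 0 ]; then echo "errors left uncorrected"; fi
    if [ $((status & 8)) -ne 0 ]; then echo "operational error"; fi
    if [ $((status & 16)) -ne 0 ]; then echo "usage or syntax error"; fi
    if [ $((status & 32)) -ne 0 ]; then echo "canceled by user"; fi
    if [ $((status & 128)) -ne 0 ]; then echo "shared library error"; fi
    return 0
}

# Usage with the card in another Linux box (partition must not be mounted):
#   sudo umount /dev/sdX2
#   sudo e2fsck -f /dev/sdX2
#   describe_e2fsck_status $?
```

A status of 0 after a forced check is the "it is probably okay" case; anything with the 4 bit set means problems remain.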

From searching around, it does sound like your problem is probably due to file corruption, but that doesn't mean filesystem corruption, or SD card corruption in the sense of hardware at fault. What constitutes "file corruption" depends on the context of use. If I write a text file and then something inserts random parts of a jpeg into it, that file is effectively corrupted, but not necessarily from the perspective of the filesystem or hardware.

In other words, it could simply mean some piece of software garbled it because, e.g., it was interrupted at the wrong point. Very few things are bulletproof that way, and making them so brings along disadvantages of its own.


1. You can get the two significant values here with sudo tune2fs -l /dev/mmcblk0p2 | grep -i "mount count". One is the number of mounts since the last check, the "maximum" is the number of mounts before a check is forced. If this number is -1, it will never happen, and you can set it with, e.g.,

sudo tune2fs -c 10 /dev/mmcblk0p2

Both these operations are okay to do while the filesystem is in use.
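A small sketch for reading those two values in one go. It parses the output of tune2fs -l from standard input; the function name is made up, and treating 0 the same as -1 (checks disabled) is my reading of the tune2fs man page rather than something tested here.

```shell
# Read `tune2fs -l` output on stdin and summarize the mount-count check settings.
mount_check_summary() {
    awk -F: '
        /^Mount count/         { gsub(/[ \t]/, "", $2); count = $2 }
        /^Maximum mount count/ { gsub(/[ \t]/, "", $2); max = $2 }
        END {
            if (count == "" || max == "") {
                print "mount count fields not found"
                exit 1
            }
            # Assumption: a maximum of 0 or -1 means the count is disregarded.
            if (max + 0 <= 0)
                print "periodic check disabled (maximum mount count " max ")"
            else
                print count " of " max " mounts since last check"
        }'
}

# Usage on the Pi:
#   sudo tune2fs -l /dev/mmcblk0p2 | mount_check_summary
```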

goldilocks
2

You are using an inadequate power supply. All of your filesystem problems will be down to this. The power supply warning square shouldn't be on your screen.

Get something like the official and recommended universal USB micro power supply or the Adafruit 5V 2.4 Amp + MicroUSB cable power adapter.

Eugen
scruss