
Error recovery (was: The "too small to fail" memory-allocation rule)

Posted Dec 26, 2014 18:24 UTC (Fri) by rwmj (subscriber, #5474)
In reply to: Error recovery (was: The "too small to fail" memory-allocation rule) by cesarb
Parent article: The "too small to fail" memory-allocation rule

I used to think this was an indictment of Unix, but look at modern cloud systems with their multiple virtual machines, any one of which is expected to be able to fail without affecting the service, or at Erlang with its philosophy of failing early and recovering failed processes. Well, now it's writing all that error handling code which looks stupid.



Error recovery (was: The "too small to fail" memory-allocation rule)

Posted Dec 29, 2014 14:09 UTC (Mon) by epa (subscriber, #39769) [Link] (7 responses)

Yes, because the next time your mobile phone crashes you can seamlessly switch to one of the cloud of redundant phones you carry with you at all times...

Error recovery (was: The "too small to fail" memory-allocation rule)

Posted Dec 29, 2014 21:58 UTC (Mon) by rwmj (subscriber, #5474) [Link] (2 responses)

You obviously don't understand how Erlang works. Not many do, which I guess explains the state of programming these days.

Error recovery (was: The "too small to fail" memory-allocation rule)

Posted Dec 30, 2014 10:31 UTC (Tue) by epa (subscriber, #39769) [Link] (1 responses)

I'm sure Erlang works reliably, but the kernel cannot be written in Erlang.

Error recovery (was: The "too small to fail" memory-allocation rule)

Posted Jan 23, 2015 9:03 UTC (Fri) by Frej (guest, #4165) [Link]

No, but the philosophy could be followed. In many ways it is just the microkernel vs. monolithic kernel debate. If subsystems could be completely separate, you could just restart the subsystem and retry the request. But it's never quite that simple, and especially so for the kernel.
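The "restart the subsystem and retry the request" idea can be sketched in user space, where process isolation makes it easy. This is only an illustration of the pattern, not anything from the kernel or from Erlang itself: the worker script, `supervised_call`, `fail_until`, and `max_restarts` are all hypothetical names, and the worker is made to crash on its first attempts purely for demonstration.

```python
import subprocess
import sys

# Hypothetical isolated "subsystem": it crashes on every attempt numbered
# below fail_until, and otherwise answers the request by upper-casing it.
WORKER_SOURCE = """
import sys
attempt, fail_until, request = int(sys.argv[1]), int(sys.argv[2]), sys.argv[3]
if attempt < fail_until:
    sys.exit(1)  # fail early instead of trying to limp along
print(request.upper())
"""


def supervised_call(request, fail_until=2, max_restarts=5):
    """Run the request in a fresh worker process, restarting after each crash.

    Returns (reply, number_of_restarts_needed).
    """
    for attempt in range(max_restarts + 1):
        result = subprocess.run(
            [sys.executable, "-c", WORKER_SOURCE,
             str(attempt), str(fail_until), request],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0:
            return result.stdout.strip(), attempt
        # The worker died; nothing of its state survives, so simply retry.
    raise RuntimeError(f"worker still failing after {max_restarts} restarts")


if __name__ == "__main__":
    print(supervised_call("ping"))
```

The retry is only this simple because the worker shares no state with its supervisor; that separation is exactly what is hard to get between kernel subsystems.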

Error recovery (was: The "too small to fail" memory-allocation rule)

Posted Dec 30, 2014 15:59 UTC (Tue) by rgmoore (✭ supporter ✭, #75) [Link] (3 responses)

OTOH, Android expects its apps to be able to handle crashes cleanly. When it needs to free up memory, it just kills something in the background, and it's up to the app not to have problems from that. Seamless recovery from being killed is mandatory.

Error recovery (was: The "too small to fail" memory-allocation rule)

Posted Dec 30, 2014 16:40 UTC (Tue) by epa (subscriber, #39769) [Link]

It is sound design to make the application handle crashes cleanly, so it can recover without losing more than a tiny bit of work in progress. And that applies to the kernel too: journalling filesystems are designed so that even a hard crash will not lose data.

That doesn't really mean you can do without error handling code in the kernel, though. It's great if your filesystem doesn't get horribly corrupted when the machine crashes, but still the crash is not appreciated by the user. Yes, if you are running a farm of several machines then you can fail over to another and the service stays up; that doesn't really work as a remedy for your laptop locking up, unless you happen to carry around a redundant laptop with you at all times.

And in the case of Android, the apps are killed and restarted, but it would not be acceptable for the kernel itself to just panic on any error condition and require restarting the phone. Which is what we are talking about here: *kernel* error recovery.
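The application-level half of this argument (recover from a crash without losing more than a little work) can be sketched with a checkpoint file that is replaced atomically. Note that this is the write-then-rename technique, not journalling as a filesystem implements it, and `save_checkpoint` and `load_checkpoint` are hypothetical helper names. It assumes a POSIX-like filesystem where renaming within one directory is atomic.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state so that a crash leaves either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live in the same directory (same filesystem)
    # as the target so that the final rename can be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".checkpoint-")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)  # atomically swap in the new checkpoint
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # A stricter version would also fsync the directory; omitted for brevity.


def load_checkpoint(path, default):
    """Return the last saved state, or default if nothing was ever saved."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default
```

A process that calls `save_checkpoint` after each unit of work can be killed at any moment and, on restart, resume from whatever `load_checkpoint` returns. None of this helps with the point made above: the crash itself is still visible to the user.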

Re: OTOH, Android expects its apps to be able to handle crashes cleanly.

Posted Dec 30, 2014 20:47 UTC (Tue) by ldo (guest, #40946) [Link] (1 responses)

Not quite. The framework will always explicitly call onDestroy() before killing your Activity. If this is happening because the system is running low on resources, not because of direct user action, then onSaveInstanceState() will be called before that, so you can save whatever UI state is necessary. When the user returns to your app, it can then be transparently restarted to make it look like it never stopped.

Re: OTOH, Android expects its apps to be able to handle crashes cleanly.

Posted Dec 30, 2014 21:41 UTC (Tue) by cesarb (subscriber, #6266) [Link]

Not true, see drivers/staging/android/lowmemorykiller.c. It directly sends SIGKILL to the process, without calling any Java function.

