The Story So Far...
Posted Dec 28, 2014 23:33 UTC (Sun) by cesarb (subscriber, #6266)
In reply to: The Story So Far... by ldo
Parent article: The "too small to fail" memory-allocation rule
Stop right here. The problem is not with cleanups which should always be executed; these are already well-tested. The problem is with cleanups which should *never* be executed, unless an error happens.
In exception-handling terms, what you are talking about is a "finally" block, which executes both on success and on failure. The error-recovery paths in question, however, are "except" blocks, which execute only on failure.
For a filesystem-related example, since we're supposed to be talking about filesystems: let's say a user program just appended something to a file, and you're in the function which will submit the newly appended data to the block subsystem. As part of that, it has to allocate space for the new data, update several of the filesystem's control structures, and allocate the control structures for the block subsystem.
If any allocation fails in the middle of this complex process, you have to unwind all the bookkeeping updates, otherwise the filesystem gets into an inconsistent state. If no allocation fails, however, the bookkeeping changes must be kept, since they represent the new state of the filesystem.
Is the pseudo-loop technique able to do the equivalent of an "except" block without getting even uglier? Remember that the rewind will have to be invoked from many places, since any of the allocations can fail, and that the failure can also happen in the middle of the initial update of the control structures, so the rewind might be partial.
The traditional way of doing this in the Linux kernel is somewhat like the following:
int foo(...)
{
        int err = 0;

        /* do stuff */
        err = bar(...);
        if (err)
                goto error;

        /* do even more stuff */
        err = baz(...);
        if (err)
                goto error;

        /* do yet more stuff */
out:
        /* free temporary memory and unlock mutexes */
        return err;
error:
        /* unwind state updates */
        goto out;
}
The kernel developers are pragmatic. If a kernel developer feels that using a "pseudo-loop" results in cleaner code, they do use it. However, for error handling the usual design pattern is either a "goto error" (single error label, pointers initialized to NULL) or a "goto error_foo" (many error labels, each falling through to the next, releasing resources in reverse order). On success, either the unlock is duplicated (it's usually just an unlock; memory is rarely allocated just for the duration of one function), or the error handling does a "goto out" as in the example above.
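The second pattern, with many labels each falling through to the next, can be sketched roughly as follows (the fail_b/fail_c flags are hypothetical fault-injection stand-ins for real allocation failures, not kernel APIs):

```c
#include <errno.h>
#include <stdlib.h>

/* Sketch of the "goto error_foo" pattern: one label per acquired
 * resource, releasing in reverse order of acquisition.  fail_b and
 * fail_c simulate allocation failures partway through. */
int submit_data(int fail_b, int fail_c)
{
        void *a, *b, *c;
        int err = -ENOMEM;

        a = malloc(16);
        if (!a)
                return -ENOMEM;

        b = fail_b ? NULL : malloc(16);
        if (!b)
                goto err_free_a;

        c = fail_c ? NULL : malloc(16);
        if (!c)
                goto err_free_b;

        /* ... commit the new state here; the bookkeeping is kept ... */
        free(c);
        free(b);
        free(a);
        return 0;

err_free_b:
        free(b);        /* release b, then fall through to release a */
err_free_a:
        free(a);
        return err;
}
```

Note how each failure jumps to the label that releases exactly the resources acquired so far, which is what makes a partial rewind possible.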
Re: The problem is with cleanups which should *never* be executed, unless an error happens.
Posted Dec 29, 2014 17:35 UTC (Mon) by ldo (guest, #40946)
[Link] (4 responses)
This is why cleanups should be idempotent--pass the cleanup routine a NULL argument, and it does nothing. So the steps become something like
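A minimal C sketch of that idempotent-cleanup pattern (dispose() and the fault-injection flag are hypothetical illustrations, not real kernel APIs):

```c
#include <stdlib.h>

/* Hypothetical NULL-tolerant release helper: a no-op on NULL. */
static void dispose(void **p)
{
        free(*p);       /* free(NULL) is defined to do nothing */
        *p = NULL;
}

/* fail_b injects an allocation failure for illustration. */
int do_work(int fail_b)
{
        void *a = NULL, *b = NULL;
        int err = -1;

        do { /* pseudo-loop: "break" jumps straight to the cleanups */
                a = malloc(16);
                if (!a)
                        break;
                b = fail_b ? NULL : malloc(16);
                if (!b)
                        break;
                /* ... the real work goes here ... */
                err = 0;
        } while (0);

        /* run every cleanup unconditionally; NULLs are no-ops */
        dispose(&b);
        dispose(&a);
        return err;
}
```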
So when you get to the end, you simply execute all the relevant cleanups unconditionally, and the ones with NULL arguments turn into no-ops.
Re: The problem is with cleanups which should *never* be executed, unless an error happens.
Posted Dec 29, 2014 18:14 UTC (Mon) by cesarb (subscriber, #6266)
[Link] (3 responses)
You still are in the "cleanup" mentality. But the problem with filesystems is not "cleanup", it's "rollback". No "finally"-style cleanup subsumes the relevant code, and it subsumes nothing else; it really is something which executes only on failure.
> So when you get to the end, you simply execute all the relevant cleanups unconditionally, and the ones with NULL arguments turn into no-ops.
That doesn't help with the situation in question, where the memory allocation routines never return "failure". The relevant error-recovery code would always get passed NULL, and so the branch which runs when it receives a non-NULL argument never gets tested.
Sure, you *called* the code unconditionally, but it doesn't change the fact that it *executes* conditionally. Whether the branch point is outside or inside it is immaterial. For instance, it's well-known that the standard "free(void *ptr)" function does nothing when called with a NULL argument, but that's because it begins with a conditional branch: "if (!ptr) return;". If it were always called with a NULL pointer, the most complex part of its code would never be exercised.
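The point can be made concrete with an instrumented idempotent cleanup (purely illustrative; the counter stands in for the real recovery work, the way free()'s bookkeeping stands behind its NULL check):

```c
#include <stddef.h>

static int recovery_runs;

/* An idempotent cleanup still branches internally, like free():
 * if it is only ever called with NULL, the recovery path below
 * is never exercised, no matter how unconditionally it is called. */
static void cleanup(void *p)
{
        if (!p)
                return;         /* the only path taken when nothing failed */
        recovery_runs++;        /* the error-recovery work that needs testing */
}
```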
Re: You still are in the "cleanup" mentality.
Posted Dec 30, 2014 20:34 UTC (Tue) by ldo (guest, #40946)
[Link] (2 responses)
That’s right. You keep insisting that “cleanup” and “rollback” are entirely different, whereas they are really two aspects of the same thing, and can be treated as such.
> That doesn't help with the situation in question, where the memory
> allocation routines never return "failure".

Returning NULL from an allocation request is a failure.
> Sure, you *called* the code unconditionally, but it doesn't change
> the fact that it *executes* conditionally.

Do you know what “abstraction” means?
Re: You still are in the "cleanup" mentality.
Posted Dec 30, 2014 22:33 UTC (Tue) by cesarb (subscriber, #6266)
[Link] (1 response)
> Returning NULL from an allocation request is a failure.

Please, reread the article this comment thread is attached to.
The whole issue is that, under certain conditions, the allocation requests were *never* returning NULL, even when they should!
> > Sure, you *called* the code unconditionally, but it doesn't change the fact that it *executes* conditionally.
> Do you know what “abstraction” means?

Please, reread the article this comment thread is attached to.
Abstraction or not, it doesn't change the fact that, since the allocation requests were *never* returning NULL, even when they should, the error handling code was *never* being executed. It doesn't matter whether it has been abstracted away or not, untested code is untested code.
Re: requests were *never* returning NULL, even when they should!
Posted Dec 31, 2014 21:00 UTC (Wed) by ldo (guest, #40946)
[Link]
Which is why I am advocating simpler error-handling paths, as in my example.