Add force-lock command #1124

jjshanks · 2024-01-15T22:53:40Z

Summary

This proposal is to add a new force-lock command.

Problem Statement

There is no way to manually lock a state via the CLI, only unlock one.

User-facing description

Usage would be similar to force-unlock except it wouldn't take a lock id it would return one

> tofu force-lock
Lock ID is 0d66876d-8ac8-4724-9cda-91d5592a96da

Technical Description

New command force-lock which will be just like force-unlock except it won't take a lock id and will call Lock() instead of Unlock() on the state manager
For backends that don't allow locking (if any) or need the process to keep running to maintain the lock an error message will be printed.
Documentation

Rationale and alternatives

Teams can have multiple reasons to want to lock state such as deployment freezes and refactoring.

Currently the only way to lock state is either editing the metadata of the state file by hand or using a utility like https://github.com/minamijoyo/tflock

An implementation over a subset of backends meets a need from the community. Expanding to other backends can be delayed until users of those backends request it.

Downsides

Command won't work for all backends and could lead to confusion/questions from users

Unresolved Questions

If a state is already locked should an error be returned or just return the existing lock id?

Related Issues

No response

Proof of Concept

No response

The text was updated successfully, but these errors were encountered:

ghost · 2024-01-24T14:15:53Z

Hey @jjshanks thank you for this RFC. There are some concerns around the ability to hold a lock with OpenTofu exiting on backends, such as Consul as documented here. Could you expand your RFC and address how this would work with regards to specific backends?

jjshanks · 2024-01-30T21:57:06Z

Thanks for the link @janosdebugs. To summarize the other discussions the potential options are

Implement for all backends, this would require some backends (local, consul, etc.) to continue to run and enforce the lock.
- For cases where the process needs to keep running could add a flag to make it explicit
Implement for some backends, this would require outputting some kind of error when unsupported backends are invoked

I don't have any experience using backends that aren't s3/dynamo so I'm not sure I'm the best person to say how other backends should behave. But my gut reaction would be to give an error for anything that needs to keep running for the lock, because users of those backends already don't use force-unlock.

I'll update the RFC to reflect the above for now until there is more feedback.

apparentlymart · 2024-10-09T01:11:31Z

The current API for locking is the Lock method of statemgr.Locker.

Right now that doesn't have any way to distinguish between a lock that can safely be dropped automatically when the process exits vs. a lock that must survive beyond the end of the current process (or return an error if that's not possible).

It seems like the smallest change to the existing API would be to extend statemgr.LockInfo with a new field that serves as a hint to the implementation that it should prefer to fail if it can't promise to preserve the lock beyond the life of the process. The local and consul backends could then check for that hint and return an error if it's set, unless we can find some reasonable way for those backends to create a more persistent lock in future.

It's a awkward that the default behavior for an unmodified backend would be to disregard the new hint field and acquire a useless lock that will immediately be dropped, but thankfully since we have all of the backends in-tree right now we can make sure they all get updated to support the new hint field (if needed) as part of the same PR.

This is perhaps an example of something that would be a little harder if we waited until after #382, because we'd no longer be able to update all existing backends to support this new capability. (Though I think we could still do it, just that we'd need to take a different API design strategy such as having force-lock call an entirely new method that existing backends don't support at all, and then the default would be for the new command to not be supported until all backends have been updated, so there would likely always be backend implementations out there that could support force-lock but nonetheless haven't been updated to support it yet.)

darpham · 2024-10-25T04:28:08Z

Cross-posting to share the knowledge once force-unlock is implemented:)

If it's of any use to folks stumbling across this, here's a bash/zsh compatible interactive script to more easily unlock state. Could be updated for running in CI/CD in an automated fashion with some pre-validation - YMMY and use with due caution.

tf_unlock() {
    local LOCK_ID
    local ERROR_OUTPUT
    local LOCK_DETAILS

    # Capture both stdout and stderr
    ERROR_OUTPUT=$(terraform plan -json 2>&1)

    # Check if there's a lock error
    if [[ $ERROR_OUTPUT == *"Error acquiring the state lock"* ]]; then
        LOCK_DETAILS=$(echo "$ERROR_OUTPUT" | jq -r '.diagnostic.detail // empty' 2>/dev/null)
        if [[ -z "$LOCK_DETAILS" ]]; then
            LOCK_DETAILS=$(echo "$ERROR_OUTPUT" | grep -A10 "Lock Info:")
        fi
        LOCK_ID=$(echo "$LOCK_DETAILS" | awk '/ID:/ {print $2; exit}')

        echo "State is locked. Lock details:"
        echo "$LOCK_DETAILS"
        echo

        echo -n "Do you want to unlock this state? Type 'yes' to confirm: "
        read response
        if [[ "$response" == "yes" ]]; then
            echo "Attempting to unlock..."
            if terraform force-unlock --force "${LOCK_ID}"; then
                echo "Terraform state has been successfully unlocked!"
            else
                echo "Failed to unlock the state. Please check the error message above." >&2
                return 1
            fi
        else
            echo "Unlock cancelled."
            return 0
        fi
    elif [[ $ERROR_OUTPUT == *"Error:"* ]]; then
        echo "Error occurred while checking Terraform state:" >&2
        echo "$ERROR_OUTPUT" >&2
        return 1
    else
        echo "State is not locked. No action needed."
    fi
}

Originally posted by @darpham in hashicorp/terraform#30277 (comment)

RLRabinowitz · 2024-11-25T16:29:12Z

@klongmitre tagging you here, this is the issue we've discussed at KubeCon recently.

If you are interested in this feature, please make sure to upvote the original issue, as we rely on the upvotes to determine community demand

jjshanks added pending-decision This issue has not been accepted for implementation nor rejected. It's still open to discussion. rfc labels Jan 15, 2024

ghost added the question Further information is requested label Jan 24, 2024

github-actions bot mentioned this issue Apr 16, 2024

Top-Ranking Issues 📊 #1496

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add force-lock command #1124

Add force-lock command #1124

jjshanks commented Jan 15, 2024 •

edited

Loading

ghost commented Jan 24, 2024

jjshanks commented Jan 30, 2024

apparentlymart commented Oct 9, 2024 •

edited

Loading

darpham commented Oct 25, 2024 •

edited

Loading

RLRabinowitz commented Nov 25, 2024

Add force-lock command #1124

Add force-lock command #1124

Comments

jjshanks commented Jan 15, 2024 • edited Loading

Summary

Problem Statement

User-facing description

Technical Description

Rationale and alternatives

Downsides

Unresolved Questions

Related Issues

Proof of Concept

ghost commented Jan 24, 2024

jjshanks commented Jan 30, 2024

apparentlymart commented Oct 9, 2024 • edited Loading

darpham commented Oct 25, 2024 • edited Loading

RLRabinowitz commented Nov 25, 2024

jjshanks commented Jan 15, 2024 •

edited

Loading

apparentlymart commented Oct 9, 2024 •

edited

Loading

darpham commented Oct 25, 2024 •

edited

Loading