Encountered 433 files that should have been pointers, but weren't; --fixup does nothing #5162

Open
solace-rcampbell opened this issue Oct 31, 2022 · 27 comments
@solace-rcampbell

Describe the bug
I have a repository with some Git LFS issues that I'm trying to figure out how to fix. When I check out a specific branch, I get "stuck" in the branch. I can't switch out of the branch and I can't seem to fix the repository either.

I'm sure I could renormalize the files and commit the change, but I'm trying to "fix" the history to be as if I never made the mistake.

When I check out this branch, I get this message:

$ git switch git.4
branch 'git.4' set up to track 'origin/git.4'.
Switched to a new branch 'git.4'
Encountered 433 files that should have been pointers, but weren't:

I think someone (almost certainly me) modified .gitattributes and then committed some LFS files but did NOT add the .gitattributes file to the commit itself (or maybe the exact opposite). Now I'm trying to repair the damage.

I read #1939, and tried all the guidance in there, but nothing seemed to work.

I tried git lfs migrate import --fixup --everything, which said it was rewriting 32 commits:

$ git lfs migrate import --fixup --everything
migrate: override changes in your working copy?  All uncommitted changes will be lost! [y/N] y
migrate: changes in your working copy will be overridden ...
migrate: Sorting commits: ..., done.
migrate: Rewriting commits: 100% (32/32), done.
  git.4 7f591f25b9b1444802671946aa7eafd3bec772de -> 7f591f25b9b1444802671946aa7eafd3bec772de
  main  72020868f38a80fa818d28538c146176bdf3807b -> 72020868f38a80fa818d28538c146176bdf3807b
migrate: Updating refs: ..., done.
migrate: checkout: ..., done.

But "git status" still shows the same number of locally modified files, and git push does nothing:

$ git push --all --force origin
Everything up-to-date

Expected behavior
I expected --fixup to change my local repository such that these files were correctly LFS pointers.

System environment
I'm running inside CentOS 7.

Output of git lfs env

$ git lfs env
git-lfs/3.2.0 (GitHub; linux amd64; go 1.18.2)
git version 2.38.0

Endpoint=https://github.com/SolaceDev/broker.git/info/lfs (auth=basic)
LocalWorkingDir=/opt/sbox/rcampbell/git-3
LocalGitDir=/opt/sbox/rcampbell/git-3/.git
LocalGitStorageDir=/opt/sbox/rcampbell/git-3/.git
LocalMediaDir=/opt/sbox/rcampbell/git-3/.git/lfs/objects
LocalReferenceDirs=
TempDir=/opt/sbox/rcampbell/git-3/.git/lfs/tmp
ConcurrentTransfers=8
TusTransfers=false
BasicTransfersOnly=false
SkipDownloadErrors=false
FetchRecentAlways=false
FetchRecentRefsDays=7
FetchRecentCommitsDays=0
FetchRecentRefsIncludeRemotes=true
PruneOffsetDays=3
PruneVerifyRemoteAlways=false
PruneRemoteName=origin
LfsStorageDir=/opt/sbox/rcampbell/git-3/.git/lfs
AccessDownload=basic
AccessUpload=basic
DownloadTransfers=basic,lfs-standalone-file,ssh
UploadTransfers=basic,lfs-standalone-file,ssh
GIT_EXEC_PATH=/usr/libexec/git-core
git config filter.lfs.process = "git-lfs filter-process"
git config filter.lfs.smudge = "git-lfs smudge -- %f"
git config filter.lfs.clean = "git-lfs clean -- %f"

#1939

@chrisd8088
Member

chrisd8088 commented Nov 1, 2022

Hey, I'm sorry you're having trouble.

If by chance your problem is that .gitattributes had not yet been committed when the files that now match its patterns were introduced into your Git history, then note that git lfs migrate import --fixup will not help. It only converts a file into a Git LFS object if, at the point where the file appears in your history, it matches a pattern in .gitattributes but somehow is not already a Git LFS object.

The documentation says:

In practice, this option imports any filepaths which should be tracked by Git LFS according to the repository’s .gitattributes file(s), but aren’t already pointers.

To make that clearer, here's an example where .gitattributes is created in the history first, then a "broken" file which matches a pattern in .gitattributes is introduced, and git lfs migrate import --fixup repairs it:

$ git init test
$ cd test
$ git lfs track '*.bin'
$ cat .gitattributes
*.bin filter=lfs diff=lfs merge=lfs -text
$ git add .gitattributes
$ git commit -m attribs

$ git lfs uninstall
$ echo foo >foo.bin
$ git add foo.bin
$ git commit -m foo
$ git show HEAD:foo.bin
foo

$ git lfs install
$ git lfs fsck
pointer: unexpectedGitObject: "foo.bin" (treeish a59051c2430b6e97187d2751eff82b37a52726f6) should have been a pointer but was not

$ git lfs migrate info --fixup
migrate: Fetching remote refs: ..., done.
migrate: Sorting commits: ..., done.
migrate: Examining commits: 100% (2/2), done.
*.bin	4 B	1/1 file 	100%

$ git lfs migrate import --fixup
migrate: Fetching remote refs: ..., done.
migrate: Sorting commits: ..., done.
migrate: Rewriting commits: 100% (2/2), done.
  main	a59051c2430b6e97187d2751eff82b37a52726f6 -> d4b72c0f48df5906d42ad28fa319df7db152319f
migrate: Updating refs: ..., done.
migrate: checkout: ..., done.

$ git show HEAD:foo.bin
version https://git-lfs.github.com/spec/v1
oid sha256:b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c
size 4

$ git lfs fsck
Git LFS fsck OK
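
For reference, the three lines printed by git show above follow the Git LFS pointer format: a version line, the SHA-256 of the file content, and the content size in bytes. The following is a minimal sketch that rebuilds the pointer for the example content, assuming sha256sum from GNU coreutils is available. The make_pointer helper is hypothetical and is not how the Git LFS client itself is implemented.

```shell
#!/bin/sh
# Hypothetical helper: print a Git LFS pointer for the content of a file.
make_pointer() {
    file=$1
    # The oid is the SHA-256 digest of the raw file content.
    oid=$(sha256sum "$file" | cut -d ' ' -f 1)
    # The size is the content length in bytes.
    size=$(wc -c < "$file" | tr -d ' ')
    printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize %s\n' "$oid" "$size"
}

tmp=$(mktemp)
echo foo > "$tmp"            # same content as foo.bin in the example above
pointer=$(make_pointer "$tmp")
rm -f "$tmp"
echo "$pointer"
```

Comparing this text with the output of git show HEAD:foo.bin is a quick way to confirm that a blob in history really is a pointer rather than raw content.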

But if your history has no .gitattributes before (or in the same commit as) the point where the files are introduced, and those files are only matched by patterns in a .gitattributes added later, git lfs migrate import --fixup won't help out of the box.

(Also note that in the example above, the commit ID changes in the "Rewriting commits" phase, whereas in your example, the commit IDs don't change, meaning no history was actually altered.)
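
The "old -> new" lines can also be checked mechanically. This is a small sketch, assuming the output format shown in this thread; the unchanged_refs helper is hypothetical and simply compares the two commit IDs on each ref line.

```shell
#!/bin/sh
# Hypothetical helper: read `git lfs migrate import` output on stdin and
# print every ref whose commit ID is identical before and after the run.
unchanged_refs() {
    awk 'NF == 4 && $3 == "->" && $2 == $4 { print $1 }'
}

# Ref lines copied from the run reported at the top of this issue.
stuck='migrate: Rewriting commits: 100% (32/32), done.
  git.4 7f591f25b9b1444802671946aa7eafd3bec772de -> 7f591f25b9b1444802671946aa7eafd3bec772de
  main  72020868f38a80fa818d28538c146176bdf3807b -> 72020868f38a80fa818d28538c146176bdf3807b
migrate: checkout: ..., done.'

# Ref line copied from the working example above.
fixed='  main a59051c2430b6e97187d2751eff82b37a52726f6 -> d4b72c0f48df5906d42ad28fa319df7db152319f'

unchanged_stuck=$(printf '%s\n' "$stuck" | unchanged_refs)
unchanged_fixed=$(printf '%s\n' "$fixed" | unchanged_refs)
echo "unchanged in first run: $unchanged_stuck"
echo "unchanged in second run: $unchanged_fixed"
```

If every ref is reported as unchanged, the migration rewrote nothing, which matches the behaviour described in the original report.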

What you would need to do is determine when the problematic files were introduced into your history and then use something like git rebase to introduce a commit before that point which updates .gitattributes to include patterns to match those files. Then git lfs migrate import --fixup should work as you hope.
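
The rebase step described above can be sketched in a throwaway repository. This is only an illustration under assumptions: the branch name fixlfs, the commit messages, and the single *.bin pattern are placeholders, the .gitattributes line is written by hand instead of with git lfs track, and a real history with merges would need more care.

```shell
#!/bin/sh
# Throwaway repository; every name below is illustrative.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m init
echo foo > foo.bin
git add foo.bin
git commit -q -m foo          # file introduced with no .gitattributes in history

# New branch at the commit just before the file was introduced.
git checkout -q -b fixlfs main~1
# Same line that `git lfs track '*.bin'` writes in the examples above.
echo '*.bin filter=lfs diff=lfs merge=lfs -text' > .gitattributes
git add .gitattributes
git commit -q -m attrs

# Replay the rest of main on top of the new commit.
git checkout -q main
git rebase -q fixlfs

history=$(git log --reverse --format=%s main | paste -sd ' ' -)
echo "$history"
```

After the rebase, the commit that adds .gitattributes precedes the commit that adds foo.bin, which is the ordering that --fixup relies on.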

@chrisd8088
Member

chrisd8088 commented Nov 1, 2022

And just to show the reverse situation, which you may be facing:

$ git init test
$ cd test
$ echo foo >foo.bin
$ git add foo.bin
$ git commit -m foo
$ git show HEAD:foo.bin
foo

$ git lfs track '*.bin'
$ cat .gitattributes
*.bin filter=lfs diff=lfs merge=lfs -text
$ git add .gitattributes
$ git commit -m attribs

$ git lfs fsck
pointer: unexpectedGitObject: "foo.bin" (treeish 2f0019e8ba0ee5fcd07b2c193989dcd793e7bf49) should have been a pointer but was not

$ git lfs migrate info --fixup
migrate: Fetching remote refs: ..., done.
migrate: Sorting commits: ..., done.
migrate: Examining commits: 100% (2/2), done.

$ git lfs migrate import --fixup
migrate: changes in your working copy will be overridden ...
migrate: Fetching remote refs: ..., done.
migrate: Sorting commits: ..., done.
migrate: Rewriting commits: 100% (2/2), done.
  master	2f0019e8ba0ee5fcd07b2c193989dcd793e7bf49 -> 2f0019e8ba0ee5fcd07b2c193989dcd793e7bf49
migrate: Updating refs: ..., done.
migrate: checkout: ..., done.

$ git show HEAD:foo.bin
foo

Note that the git lfs migrate info --fixup "dry run" didn't report a *.bin pattern, and the git lfs migrate import --fixup run didn't change the commit ID, so nothing was altered. You'd need to weave the .gitattributes file into history before or coincident with the introduction of any matching files in order for it to have an effect.
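
One way to check for this ordering problem is to look at the commit that first added a file and see whether .gitattributes already existed there. The sketch below assumes a single top-level .gitattributes and does not verify that any pattern actually matches the path; attrs_at_intro is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: report whether a top-level .gitattributes existed in
# the oldest commit (reachable from REV) that added PATH.
attrs_at_intro() {
    path=$1
    rev=${2:-HEAD}
    intro=$(git log --diff-filter=A --format=%H "$rev" -- "$path" | tail -n 1)
    [ -n "$intro" ] || { echo "unknown"; return 1; }
    if git cat-file -e "$intro:.gitattributes" 2>/dev/null; then
        echo "present"
    else
        echo "missing"
    fi
}

# Demonstration: file first, .gitattributes afterwards, as in the example above.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m init
echo foo > foo.bin
git add foo.bin
git commit -q -m foo
echo '*.bin filter=lfs diff=lfs merge=lfs -text' > .gitattributes
git add .gitattributes
git commit -q -m attribs

late=$(attrs_at_intro foo.bin)
echo "$late"
```

A result of "missing" indicates the situation described here, where --fixup has nothing to match against at the point the file enters history.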

@solace-rcampbell
Author

Thank you for the great explanation. I now understand why --fixup is doing exactly what it was designed to do, and how that will not help my situation.

I still wish there was a better way to fix this. Maybe I'm asking too much, but if --fixup can rewrite history, why can't it do the work of injecting a commit in the right spot to fix .gitattributes? I guess at this point, this "bug" becomes a feature request.

@solace-rcampbell
Author

solace-rcampbell commented Nov 1, 2022

I started investigating the "rebase" option, but git refuses to let me do a rebase too:

error: cannot rebase: You have unstaged changes.
error: Please commit or stash them.

Git thinks that I've modified these files, but won't allow me to revert or stash them, which really limits what I'm able to do.

This seems like the bug here.

Why are these files marked as modified? I did not modify them. There is something wrong with the repository, but not my checkout.

The error occurred when I committed a modified .gitattributes file WITHOUT re-adding those files. That's what LFS should flag. Not a later checkout of the files.

@chrisd8088
Member

Maybe I'm asking too much, but if --fixup can rewrite history, why can't it do the work of injecting a commit in the right spot to fix .gitattributes? I guess at this point, this "bug" becomes a feature request.

That would be a feature request, yeah, and I think the reason it isn't a feature already is because git rebase essentially covers the same ground. In order for git lfs migrate import to know what to do in such a case, you'd have to supply it with knowledge of what file match patterns to use. This would probably take the form of a "final" version of your .gitattributes, with all the patterns you wanted specified. But if you have such a thing, then just rebasing to insert it at the start of your history and then running the existing git lfs migrate import --fixup will do the trick.

When you run git diff, what do you see? Something like this?

diff --git a/foo.bin b/foo.bin
index 257cc56..ad483a9 100644
--- a/foo.bin
+++ b/foo.bin
@@ -1 +1,3 @@
-foo
+version https://git-lfs.github.com/spec/v1
+oid sha256:b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c
+size 4

You might try the git lfs uninstall command I showed in one of my examples. Or you could manually edit your ~/.gitconfig and comment out the [filter "lfs"] section. That will ensure Git doesn't run the Git LFS client, and so you can edit your repo without any Git LFS hook functions being invoked.

Then you should be able to rework your history, dropping the recent commit of .gitattributes and inserting it early in your history, before running git lfs install again to restore the [filter "lfs"] Git configuration, at which point git lfs migrate import --fixup should do what you want.
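
As an alternative to uninstalling, the filter settings can be overridden for a single command with git -c, the same technique used in a reproduction later in this thread. The wrapper below is a hypothetical convenience around those exact options; it is a sketch, not a replacement for the uninstall and install steps described above.

```shell
#!/bin/sh
# Hypothetical wrapper: run one git command with the "lfs" filter overridden,
# using the -c settings that appear later in this thread.
git_nolfs() {
    git -c filter.lfs.process= \
        -c filter.lfs.clean=cat \
        -c filter.lfs.required=false \
        "$@"
}

# Demonstration in a throwaway repository.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
echo '*.bin filter=lfs diff=lfs merge=lfs -text' > .gitattributes
echo foo > foo.bin
git_nolfs add .gitattributes foo.bin

# The staged blob holds the raw content rather than a pointer.
staged=$(git show :foo.bin)
echo "$staged"
```

Because the override applies only to the wrapped command, the system, global, and repository configuration files are left untouched.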

@solace-rcampbell
Author

Before I try the uninstall procedure, I'll answer the question you asked. Here is the output from git diff:

diff --git a/.../StoreApi.class b/.../StoreApi.class
index 7878fe16..39973621 100644
Binary files a/.../StoreApi.class and b/.../StoreApi.class differ

That's it. It doesn't actually show the content.

@solace-rcampbell
Author

solace-rcampbell commented Nov 2, 2022

git lfs uninstall didn't change any of the behaviour. I still cannot perform a rebase.

$ git lfs uninstall
Hooks for this repository have been removed.
Global Git LFS configuration has been removed.

$ git status
On branch git.4
Your branch is up to date with 'origin/git.4'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified:   third-party/.../StoreApi.class
...

$ git reset --hard
HEAD is now at 7f591f25 setsolver now supports git
Encountered 319 files that should have been pointers, but weren't:
third-party/.../StoreApi.class
...

$ git rebase -i
error: cannot rebase: You have unstaged changes.
error: Please commit or stash them.

I also tried a new clone after uninstalling LFS, but I had similar results.

@chrisd8088
Member

Sorry to hear you're still having trouble. The Encountered 319 files that should have been pointers, but weren't: output means that the Git LFS client is still being invoked, so I presume there are still filter.lfs configurations in place somewhere.

What does git config -l show (with any private information redacted)?

@solace-rcampbell
Author

I still see git filter.lfs configuration.

Running git lfs uninstall again results in this error:

$ git lfs uninstall
warning: error running /usr/libexec/git-core/git 'config' '--includes' '--global' '--remove-section' 'filter.lfs': 'fatal: no such section: filter.lfs' 'exit status 128'
open /opt/sbox/rcampbell/git-3/.git/hooks/pre-push: no such file or directory
Hooks for this repository have been removed.
Global Git LFS configuration has been removed.
$ git config -l
filter.lfs.required=true
filter.lfs.clean=git-lfs clean -- %f
filter.lfs.smudge=git-lfs smudge -- %f
filter.lfs.process=git-lfs filter-process
credential.helper=store
core.repositoryformatversion=0
core.filemode=true
core.bare=false
core.logallrefupdates=true
remote.origin.fetch=+refs/heads/*:refs/remotes/origin/*
branch.main.remote=origin
branch.main.merge=refs/heads/main
lfs.repositoryformatversion=0
lfs.https://github.com/.../info/lfs.access=basic
branch.git.4.remote=origin
branch.git.4.merge=refs/heads/git.4

@chrisd8088
Member

OK, if you still see the filter.lfs settings then you'll have to manually look through the potential sources of those in your system, global (a.k.a. user), repository, and worktree settings. Using the following should help:

$ git config -l --system
$ git config -l --global
$ git config -l --local
$ git config -l --worktree

I expect you'll find a [filter "lfs"] stanza somewhere in the first three, which are (at least by default) in /etc/gitconfig, ~/.gitconfig, and .git/config in your current repo.
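
A shorter way to locate the setting is to ask Git which file each value came from. This sketch uses git config --show-origin with --get-regexp; the lfs_filter_origins name is hypothetical, and the demonstration sets a repository-local value only so that there is something to find.

```shell
#!/bin/sh
# Hypothetical helper: list every filter.lfs.* setting together with the
# file it was read from, across all configuration scopes.
lfs_filter_origins() {
    git config --show-origin --get-regexp '^filter\.lfs\.'
}

# Demonstration in a throwaway repository with a local filter setting.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config filter.lfs.clean "git-lfs clean -- %f"

origins=$(lfs_filter_origins)
echo "$origins"
```

Each output line starts with the originating file, so a value that comes from a system-wide file such as /etc/gitconfig stands out immediately.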

@solace-rcampbell
Author

solace-rcampbell commented Nov 2, 2022

Thanks for the direction.
It was in the --system output. I guess our automated install system enabled it during the install of git lfs. git lfs uninstall --system worked. Now I can try the rebase!

I feel disappointed that git lfs uninstall says it worked, but it was still installed afterwards. It would be nice if it told me that it was installed in multiple ways; or that it couldn't uninstall or something, anything to let me know that it didn't ACTUALLY uninstall.

@chrisd8088
Member

I can appreciate that. The bare git lfs uninstall command did report only that the Global Git LFS configuration has been removed, so that was correct. But I also find Git's distinction between "global" and "system" confusing, myself.

@solace-rcampbell
Author

solace-rcampbell commented Nov 4, 2022

OK, I was able to use "rebase" to inject changes to ".gitattributes" at the right spot in history, but --fixup still doesn't seem to be working for me.

You can see from the log output that .gitattributes was added before the relevant file; but even after a --fixup, git lfs fsck still shows this file should have been a pointer but was not. (this was not the first time I ran --fixup, but I added it here to show that I really did run it.)

$ git lfs migrate import --fixup --everything
migrate: Sorting commits: ..., done.
migrate: Rewriting commits: 100% (41/41), done.
git.4 24e16f412606c73690d236f3d070725ca4479024 -> 24e16f412606c73690d236f3d070725ca4479024
migrate: Updating refs: ..., done.
migrate: checkout: ..., done.

$ git log --graph --pretty=oneline
* 24e16f412606c73690d236f3d070725ca4479024 (HEAD -> git.4, origin/git.4) setsolver now supports git
* dc97b0b265f223ffc9282b1c85c3433af0e88d5d Updates to pre-commit hook.
* 18a9448d5da5d8d36c83e4811f56258e5ce928fe first commit
* 4f4f1542daa4069cfa313c153984f3e265b31c68 (origin/fixroot) Add lfs categorizations.
* 22bfaeb7ed5b2284ddaff52705fcd703f4a2300e root commit

rcampbell@dev3-181 ~/mine/git-5
$ git log "third-party/swagger-codegen/samples/client/petstore/kotlin-string/build/classes/java/main/io/swagger/client/models/ApiResponse.class"
commit 18a9448d5da5d8d36c83e4811f56258e5ce928fe
Author: Rolf Campbell <[email protected]>
Date:   Wed Sep 21 11:07:59 2022 -0400

    first commit

rcampbell@dev3-181 ~/mine/git-5
$ git log .gitattributes
commit 4f4f1542daa4069cfa313c153984f3e265b31c68 (origin/fixroot)
Author: Rolf Campbell <[email protected]>
Date:   Fri Nov 4 10:25:34 2022 -0400

    Add lfs categorizations.

rcampbell@dev3-181 ~/mine/git-5
$ git lfs fsck | grep "third-party/swagger-codegen/samples/client/petstore/kotlin-string/build/classes/java/main/io/swagger/client/models/ApiResponse.class"
pointer: unexpectedGitObject: "third-party/swagger-codegen/samples/client/petstore/kotlin-string/build/classes/java/main/io/swagger/client/models/ApiResponse.class" (treeish 24e16f412606c73690d236f3d070725ca4479024) should have been a pointer but was not

@chrisd8088
Member

And, sorry, what does your .gitattributes look like?

@solace-rcampbell
Author

It's long, but it has a line that looks like this:

$ grep class .gitattributes
*.class filter=lfs diff=lfs merge=lfs -text

@chrisd8088
Member

chrisd8088 commented Nov 4, 2022

Hmm. Well, that does all look right; I'm not certain why the git lfs migrate import --fixup isn't catching the same file that git lfs fsck is.

You could try with git lfs migrate info --fixup --everything to see a bit more detail, I suppose.

It would also be interesting to see the output of GIT_TRACE=1 git lfs fsck and GIT_TRACE=1 git lfs migrate info --fixup --everything (or migrate import --fixup --everything).

Lastly, would you be able to either create a small reproduction case that could be shared, or share your repo as it is?

@chrisd8088
Member

chrisd8088 commented Nov 5, 2022

One other thing you could try is to check out the commit before the one which adds the ApiResponse.class file and try adding it manually in a new commit, to see if it gets properly converted into a Git LFS object or not. Something like:

$ git checkout 4f4f1542daa4069cfa313c153984f3e265b31c68
$ echo foo >third-party/swagger-codegen/samples/client/petstore/kotlin-string/build/classes/java/main/io/swagger/client/models/ApiResponse.class
$ git add third-party/swagger-codegen/samples/client/petstore/kotlin-string/build/classes/java/main/io/swagger/client/models/ApiResponse.class
$ git commit -m foo

$ git show HEAD:third-party/swagger-codegen/samples/client/petstore/kotlin-string/build/classes/java/main/io/swagger/client/models/ApiResponse.class
$ git lfs fsck

@solace-rcampbell
Author

I can generate those logs, but they are huge (one of them is 170 MB). They also contain too much information about our company's code to post on a public issue like this. Is there some way I could communicate the logs without posting them to a public forum?

@chrisd8088
Member

Is there some way I could communicate the logs without posting them to a public forum?

Yes, I'm sure we could arrange that, but before we get that far, it might be useful to try the little experiment I mentioned above, and also, if you're running a command like GIT_TRACE=1 git lfs migrate info --fixup --everything, I don't think we need the whole log. It's more likely we can focus on places where the problematic file path is reported and look in those areas to see if anything stands out.
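
Narrowing a large trace to one path can be done with a fixed-string grep. The sketch below is one possible approach; trace_for_path is a hypothetical helper, and the demonstration log contains placeholder lines, not real trace output. A real log could be captured with something like GIT_TRACE=1 git lfs fsck >fsck.log 2>&1, since Git writes trace output to standard error.

```shell
#!/bin/sh
# Hypothetical helper: print only the lines of a trace log that mention PATH,
# with CONTEXT lines around each match (default 2).
trace_for_path() {
    log=$1
    path=$2
    context=${3:-2}
    # -F treats the path as a fixed string, so dots are not regex wildcards.
    grep -F -C "$context" -- "$path" "$log"
}

# Demonstration with placeholder lines (not real Git trace output).
log=$(mktemp)
printf '%s\n' \
    'placeholder line 1' \
    'placeholder line 2 mentions dir/foo.bin' \
    'placeholder line 3' \
    'placeholder line 4' > "$log"
matches=$(trace_for_path "$log" dir/foo.bin 0)
rm -f "$log"
echo "$matches"
```

Filtering this way also keeps unrelated file names out of anything that is shared publicly.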

@solace-rcampbell
Author

Ok. Here's the first log for GIT_TRACE=1 git lfs fsck filtered to just include the one path I've been mentioning:
git-lfs-fsck.txt

@solace-rcampbell
Author

solace-rcampbell commented Nov 8, 2022

And here's the filtered log for GIT_TRACE=1 git lfs migrate info --fixup --everything.

git-lfs-migrate-info.txt.filtered.txt

@solace-rcampbell
Author

solace-rcampbell commented Nov 8, 2022

Here's the output from the experiment you wanted me to run (it worked as expected).

$ git checkout 4f4f1542daa4069cfa313c153984f3e265b31c68
Note: switching to '4f4f1542daa4069cfa313c153984f3e265b31c68'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at 4f4f1542 Add lfs categorizations.

rcampbell@dev3-181 ~/mine/git-2
$ echo foo >third-party/swagger-codegen/samples/client/petstore/kotlin-string/build/classes/java/main/io/swagger/client/models/ApiResponse.class
-bash: third-party/swagger-codegen/samples/client/petstore/kotlin-string/build/classes/java/main/io/swagger/client/models/ApiResponse.class: No such file or directory

rcampbell@dev3-181 ~/mine/git-2
$ echo foo > third-party/swagger-codegen/samples/client/petstore/kotlin-string/build/classes/java/main/io/swagger/client/models/ApiResponse.class
-bash: third-party/swagger-codegen/samples/client/petstore/kotlin-string/build/classes/java/main/io/swagger/client/models/ApiResponse.class: No such file or directory

rcampbell@dev3-181 ~/mine/git-2
$ mkdir -p third-party/swagger-codegen/samples/client/petstore/kotlin-string/build/classes/java/main/io/swagger/client/models/

rcampbell@dev3-181 ~/mine/git-2
$ echo foo > third-party/swagger-codegen/samples/client/petstore/kotlin-string/build/classes/java/main/io/swagger/client/models/ApiResponse.class

rcampbell@dev3-181 ~/mine/git-2
$ git add third-party/swagger-codegen/samples/client/petstore/kotlin-string/build/classes/java/main/io/swagger/client/models/ApiResponse.class

rcampbell@dev3-181 ~/mine/git-2
$ git commit -m foo
[detached HEAD eb3d36a5] foo
 1 file changed, 3 insertions(+)
 create mode 100644 third-party/swagger-codegen/samples/client/petstore/kotlin-string/build/classes/java/main/io/swagger/client/models/ApiResponse.class

rcampbell@dev3-181 ~/mine/git-2
$ git show HEAD:third-party/swagger-codegen/samples/client/petstore/kotlin-string/build/classes/java/main/io/swagger/client/models/ApiResponse.class
version https://git-lfs.github.com/spec/v1
oid sha256:b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c
size 4

rcampbell@dev3-181 ~/mine/git-2
$ git lfs fsck
Git LFS fsck OK

@chrisd8088
Member

Thanks for these, I will try to take a look soon! I confess $DAYJOB is consuming all my cycles at the moment, so it might take me longer than I'd like; please give me a reminder if I forget for too long. And, if you see anything of interest yourself in the logs, by all means call it out.

At a high level, what I'm hoping we can do is create a simple reproduction case, because then it's easier to figure out what's going on. Sometimes, though, even to get to that point we need to add tracing to the client until it's clear. You might experiment yourself (only if you have time, obviously) with building the client from source and throwing in some trace calls; that's pretty much all I'm going to be doing, I suspect, until the cause is identified.

@solace-rcampbell
Author

OK, I think I've come up with a recipe. I don't know why this triggers it, but here's what I did:


mkdir test1
cd test1
echo "# test1" >> README.md
git init
git add README.md
git commit -m "first commit"
git branch -M main
git remote add origin https://github.com/solace-rcampbell/test1.git
git push -u origin main

git branch one
git switch one
dd if=/dev/urandom of=test.bin bs=1k count=20000 iflag=fullblock
git add test.bin
git commit -m "Adding large binary file without LFS."
git push --set-upstream origin one
git lfs track "*.bin"
git add .gitattributes
git commit -m "Track *.bin too late."
git push

# create a new clone and check out "main"
git merge origin/one --no-ff  --no-commit
git add --renormalize .
git commit -m "Try renormalize merge."
git push

git checkout 53f0e15d07a0e29efb72557d48ce1864cafbc889 # hash of first commit
git branch fixlfs
git switch fixlfs
git lfs track "*.bin"
git add .gitattributes
git commit -m "Track *.bin early."
git push

git switch main
git rebase -i fixlfs
git lfs migrate import --fixup --everything
git reset --hard

@chrisd8088
Member

Excellent, thank you!! I have been able to reproduce the problem using a slightly adapted version of the above steps. First, create the base repo:

$ mkdir test1
$ cd test1
$ echo "# test1" >README.md
$ git init
$ git add README.md
$ git commit -m "first commit"
$ git branch -M main

$ git branch one
$ git switch one

$ dd if=/dev/urandom of=test.bin bs=1k count=20000
$ git add test.bin
$ git commit -m "Adding large binary file without LFS."
$ git lfs track "*.bin"
$ git add .gitattributes
$ git commit -m "Track *.bin too late."

$ cd ..

Then clone it locally, make modifications, and try to migrate:

$ git clone test1 test2
Cloning into 'test2'...
done.
Encountered 1 file that should have been a pointer, but wasn't:
	test.bin

$ cd test2
$ git checkout main
error: Your local changes to the following files would be overwritten by checkout:
	test.bin
Please commit your changes or stash them before you switch branches.
Aborting

$ git lfs uninstall
$ git checkout main

$ git merge origin/one --no-ff --no-commit
Automatic merge went well; stopped before committing as requested
$ git add --renormalize .
$ git commit -m "Try renormalize merge."

$ git log --graph --oneline
*   38105e3 (HEAD -> main) Try renormalize merge.
|\  
| * 115fbdb (origin/one, origin/HEAD, one) Track *.bin too late.
| * fa31c44 Adding large binary file without LFS.
|/  
* db3e824 (origin/main) first commit

$ git checkout HEAD^
Note: switching to 'HEAD^'.
...
HEAD is now at db3e824 first commit

$ git branch fixlfs
$ git switch fixlfs
$ git lfs track "*.bin"
$ git add .gitattributes
$ git commit -m "Track *.bin early."

$ git switch main
Switched to branch 'main'
Your branch is ahead of 'origin/main' by 3 commits.
  (use "git push" to publish your local commits)

$ git rebase -i fixlfs

$ git lfs install
$ git lfs fsck
pointer: unexpectedGitObject: "test.bin" (treeish 85b88989c110fa8aebd1db11b3859e61611ecbc3) should have been a pointer but was not

$ git lfs migrate import --fixup --everything
migrate: Sorting commits: ..., done.                                            
migrate: Rewriting commits: 100% (5/5), done.                                   
  fixlfs        f2f02dd4c8280cd02c35f04bf7654f4842fcf799 -> f2f02dd4c8280cd02c35f04bf7654f4842fcf799
  main          85b88989c110fa8aebd1db11b3859e61611ecbc3 -> 85b88989c110fa8aebd1db11b3859e61611ecbc3
  one           115fbdbfa4e645a2b4a2676909eaa687a03f7b6c -> 115fbdbfa4e645a2b4a2676909eaa687a03f7b6c
migrate: Updating refs: ..., done.                                              
migrate: checkout: ..., done.                                                   

$ git lfs fsck
pointer: unexpectedGitObject: "test.bin" (treeish 85b88989c110fa8aebd1db11b3859e61611ecbc3) should have been a pointer but was not

I suspect, but have no proof yet, that this problem might stem from how the binary file blob turns up in two branches in different relation to the .gitattributes file. I wonder if it gets visited on the one branch first, marked internally as "do not convert", and then is ignored when seen again on the main branch.

I haven't yet had a chance to take a look at your log file excerpts, but this is fantastic as we can now play with the reproduction case repeatedly. Thank you very much for your patience and help on this issue!

$ git log --graph --oneline main
* 85b8898 (HEAD -> main) Adding large binary file without LFS.
* f2f02dd (fixlfs) Track *.bin early.
* db3e824 (origin/main) first commit

$ git log --graph --oneline one
* 115fbdb (origin/one, origin/HEAD, one) Track *.bin too late.
* fa31c44 Adding large binary file without LFS.
* db3e824 (origin/main) first commit

$ git ls-tree main
100644 blob 4edd5acb13dba6a9f44a206fa1b6a1789fbcc50a    .gitattributes
100644 blob 9aedc8b8a329bc27c5eb813195e3096fce6f7f31    README.md
100644 blob dff46678282e426573f2752cdfcffd602e846998    test.bin

$ git ls-tree one^
100644 blob 9aedc8b8a329bc27c5eb813195e3096fce6f7f31    README.md
100644 blob dff46678282e426573f2752cdfcffd602e846998    test.bin

@chrisd8088
Member

OK, I've been able to narrow this down to a simpler replication case which focusses on having two branches where a potential Git LFS object is introduced in opposite order to the matching file pattern in .gitattributes.

The exact behaviour of git lfs migrate import --fixup --everything is wrong in both flavours of this reproduction case, and depends on the ordering of commits in the output of git rev-list --topo-order <refs>, because that's what the Git history rewriting logic uses. And that in turn depends on the alphabetical ordering of the ref names!

$ git init test
$ cd test
$ git commit --allow-empty -m init

$ echo foo >foo.bin
$ git add foo.bin
$ git commit -m foo
$ git lfs track '*.bin'
$ git add .gitattributes
$ git commit -m attrs

$ git checkout -b aaa HEAD^^

$ git lfs track '*.bin'
$ git add .gitattributes
$ git commit -m attrs
$ echo foo >foo.bin
$ git -c filter.lfs.process= -c filter.lfs.clean=cat -c filter.lfs.required=false add foo.bin
$ git commit -m foo

$ git checkout main
$ git lfs migrate import --fixup --everything
migrate: Sorting commits: ..., done.
migrate: Rewriting commits: 100% (5/5), done.
  main          09028eba46839d9c580dd3c82d0806e8e194eb64 -> 09028eba46839d9c580dd3c82d0806e8e194eb64
  zzz           fb8a190286e5361fdc33ca6c8d7f647563417a4c -> fb8a190286e5361fdc33ca6c8d7f647563417a4c
migrate: Updating refs: ..., done.
migrate: checkout: ..., done.

$ git lfs fsck
pointer: unexpectedGitObject: "foo.bin" (treeish 09028eba46839d9c580dd3c82d0806e8e194eb64) should have been a pointer but was not

$ git log --oneline main
09028eb (HEAD -> main) foo
d02aff0 attrs
71df322 init

$ git log --oneline zzz
fb8a190 (zzz) attrs
a8a9b68 foo
71df322 init

$ git rev-list --topo-order main zzz
09028eba46839d9c580dd3c82d0806e8e194eb64
d02aff093ae34788251aeddf3465213a21b96b75
fb8a190286e5361fdc33ca6c8d7f647563417a4c
a8a9b6876d48305c6c3af2e04c81d58ca2c688ae
71df32225d64eac372b50dd2f6f1e10cbf7c2ecd

And as a reverse of the same problem, if we take the exact same repository but rename our zzz branch to aaa, so that it is placed ahead of main in the list passed to git rev-list, we get a situation where the foo.bin file is converted to a Git LFS object on the aaa branch when it first appears in the branch history, even though .gitattributes does not yet exist in that branch at that point:

$ git branch -m zzz aaa

$ git lfs migrate import --fixup --everything
migrate: Sorting commits: ..., done.
migrate: Rewriting commits: 100% (5/5), done.
  aaa           fb8a190286e5361fdc33ca6c8d7f647563417a4c -> bc50458018dfa3f372c5d63eddfac7907847a40a
  main          09028eba46839d9c580dd3c82d0806e8e194eb64 -> fb29dfe76a51be919a28364eb83cee3b1d12cbfe
migrate: Updating refs: ..., done.
migrate: checkout: ..., done.

$ git lfs fsck
Git LFS fsck OK

$ git log --oneline main
fb29dfe (HEAD -> main) foo
d02aff0 attrs
71df322 init

$ git log --oneline aaa
bc50458 (aaa) attrs
fea3695 foo
71df322 init

$ git rev-list --topo-order aaa main
bc50458018dfa3f372c5d63eddfac7907847a40a
fea36952937fd36653ac3f80405c02eb92a9184f
fb29dfe76a51be919a28364eb83cee3b1d12cbfe
d02aff093ae34788251aeddf3465213a21b96b75
71df32225d64eac372b50dd2f6f1e10cbf7c2ecd

$ git ls-tree aaa^
100644 blob ad483a923bcf4e7ef83182817802d740ee4d2ec6    foo.bin

$ git show aaa^:foo.bin
version https://git-lfs.github.com/spec/v1
oid sha256:b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c
size 4

So, that's a definite bug, and if it wasn't obvious enough already from your reports, it seems like a fairly serious one. I'm going to stick our "bug" label on this issue and we'll try to address this as soon as time permits.

Thanks again for all your patience and help; it's greatly appreciated. Open source runs on this kind of feedback from users! ❤️
