Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BTRFS from basic master #3065

Merged
merged 55 commits into from
Nov 27, 2024
Merged

BTRFS from basic master #3065

merged 55 commits into from
Nov 27, 2024

Conversation

simsong
Copy link
Member

@simsong simsong commented Oct 26, 2024

This is #413 but brought up to develop.

  • Add test case for btrfs
  • Make sure any other code changed is also tested.

@simsong simsong mentioned this pull request Oct 26, 2024
@simsong
Copy link
Member Author

simsong commented Oct 27, 2024

Still waiting for AppVeyor, but this now works on windows, linux and MacOS. It would be great for @bcarrier and @jayaramcs to take a look, as this is a new file system for TSK. That's pretty significant! @joachimmetz and @uckelman-sf are also invited to comment.

@simsong
Copy link
Member Author

simsong commented Oct 28, 2024

The AppVeyor fails were because:

  • PyYAML didn't build
  • AppVeyor doesn't have libewf installed.

@simsong simsong marked this pull request as ready for review October 28, 2024 14:30
@joachimmetz
Copy link
Contributor

Nice, would recommend adding a fuzz target as well https://github.com/sleuthkit/sleuthkit/tree/develop/ossfuzz

@simsong
Copy link
Member Author

simsong commented Oct 28, 2024

I don't know how to add the fuzz targets, but if you tell me how, I'll do it!

@joachimmetz
Copy link
Contributor

I don't know how to add the fuzz targets, but if you tell me how, I'll do it!

sure, should not be too involved, let me ask a couple of questions first to get some more context

@simsong
Copy link
Member Author

simsong commented Oct 28, 2024

I don't know how to add the fuzz targets, but if you tell me how, I'll do it!

sure, should not be too involved, let me ask a couple of questions first to get some more context

How about https://corp.digitalcorpora.org/corpora/drives/2015-basicmaster/btrfs_testimage_50MB.E01?

btrfs sits on top of raw storage, like ext.

@joachimmetz
Copy link
Contributor

joachimmetz commented Oct 29, 2024

How about https://corp.digitalcorpora.org/corpora/drives/2015-basicmaster/btrfs_testimage_50MB.E01?

to isolate just fuzzing the btrfs backend a smaller only btrfs raw file system (let me see if I can create a script and add it to https://github.com/dfirlabs/btrfs-specimens). also note that the ossfuzz build does not build with E01 support

btrfs sits on top of raw storage, like ext.

the fls fuzz target likely can be reused then https://github.com/sleuthkit/sleuthkit/blob/develop/ossfuzz/fls_fuzzer.cc
and only an update to the Makefile is needed https://github.com/sleuthkit/sleuthkit/blob/develop/Makefile.am#L825 comparable to

fls_btrfs_fuzzer_SOURCES = \
	ossfuzz/fls_fuzzer.cc \
	ossfuzz/mem_img.h

fls_btrfs_fuzzer_CPPFLAGS = \
	-DFSTYPE=TSK_FS_TYPE_BTRFS_DETECT \
	$(AM_CPPFLAGS)

fls_btrfs_fuzzer_LDADD = \
	@LIB_FUZZING_ENGINE@ \
	$(TSK_LIB)

@joachimmetz
Copy link
Contributor

let me see if I can create a script and add it to https://github.com/dfirlabs/btrfs-specimens

Looks like the minimum required size is 114294784, that might be too large. Note that the seed file is not required but helps with more targeted fuzzing.

@simsong
Copy link
Member Author

simsong commented Oct 29, 2024 via email

@joachimmetz
Copy link
Contributor

I’m unclear what the seed file does if it is not needed.

tldr it helps with improving test coverage, maybe useful context https://google.github.io/oss-fuzz/getting-started/new-project-guide/#seed-corpus

why not make it as part of the fuzzing operation?

not sure what you specifically are asking with this, but what the fuzzer does is invoke the fuzz targets with mutated input based on the seed corpus

@simsong
Copy link
Member Author

simsong commented Oct 29, 2024 via email

@joachimmetz
Copy link
Contributor

The seed file is not required, adding the fuzz target should be sufficient for now. Seed file can always be added later as well.

@simsong
Copy link
Member Author

simsong commented Oct 29, 2024 via email

@joachimmetz
Copy link
Contributor

I'm unaware of academic paper specifically but the project documentation can be found here https://google.github.io/oss-fuzz/

The proposed changes to Makefile.am (in #3065 (comment)) and adding fls_btrfs_fuzzer to the corresponding noinst_PROGRAMS in should be sufficient to get a btrfs fuzz target binary to build and picked up by OSSFuzz.

@simsong
Copy link
Member Author

simsong commented Oct 29, 2024

I'm unaware of academic paper specifically but the project documentation can be found here https://google.github.io/oss-fuzz/

Thanks. I wonder how hard it would be to get it to work with bulk_extractor.

The proposed changes to Makefile.am (in #3065 (comment)) and adding fls_btrfs_fuzzer to the corresponding noinst_PROGRAMS in should be sufficient to get a btrfs fuzz target binary to build and picked up by OSSFuzz.

I'll look into it and see if I can add it to this PR. Meanwhile, I'm during this BT back to a draft until I hear from @jayaramcs.

@simsong simsong marked this pull request as draft October 29, 2024 13:26
simsong added a commit that referenced this pull request Oct 29, 2024
@simsong
Copy link
Member Author

simsong commented Oct 29, 2024

fls_btrfs_fuzzer_SOURCES = \
	ossfuzz/fls_fuzzer.cc \
	ossfuzz/mem_img.h

fls_btrfs_fuzzer_CPPFLAGS = \
	-DFSTYPE=TSK_FS_TYPE_BTRFS_DETECT \
	$(AM_CPPFLAGS)

fls_btrfs_fuzzer_LDADD = \
	@LIB_FUZZING_ENGINE@ \
	$(TSK_LIB)

Oh, I get it. This compiles a fls_btrfs_fuzzer executable with the fuzzing engine linked in and LLVMFuzzerTestOneInput as the test harness. But the fuzzing doesn't happen in the GitHub action, it happens over at Google in GCP. That's why the size is an issue.

I'm still unclear on where the code analysis is happening. Is the fuzz happening with source code inspection or object code inspection? This is presumably data mutation to cause more execution graph exploration, right?

@joachimmetz
Copy link
Contributor

I'm still unclear on where the code analysis is happening. Is the fuzz happening with source code inspection or object code inspection?

It is configured to run on clusterfuzz. Also see https://google.github.io/oss-fuzz/further-reading/clusterfuzz/ but you can also run it locally.

Issues are available through https://issues.oss-fuzz.com/ Brian currently has access, given he is configure as project owner in https://github.com/google/oss-fuzz/tree/master/projects/sleuthkit

@joachimmetz
Copy link
Contributor

Regarding 1f84b56#diff-0462e381b2fb3286568215681c8983490a37ac9ae0f0c5ee304df7fa6426d4afR807

you also need to add fls_btrfs_fuzzer to noinst_PROGRAMS on line 805

@uckelman
Copy link
Contributor

There should be a test image for validating how Sleuthkit reads BTRFS, to go with the other test images in #3044. Has there been any progress on that?

@simsong
Copy link
Member Author

simsong commented Oct 29, 2024

There should be a test image for validating how Sleuthkit reads BTRFS, to go with the other test images in #3044. Has there been any progress on that?

Yes. We have one here:
https://corp.digitalcorpora.org/corpora/drives/2015-basicmaster/btrfs_testimage_50MB.E01

My understanding is that @joachimmetz is concerned about the size.

@simsong
Copy link
Member Author

simsong commented Oct 29, 2024 via email

@@ -0,0 +1,27 @@
AM_CPPFLAGS = -I../.. -I$(srcdir)/../.. -Wall
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New makefiles shouldn't be created in subdirectories. This should go into the main (and only) Makefile.am.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Legacy cruft from the original PR. It's not being used. I'll delete it. Thanks for catching it.

@simsong
Copy link
Member Author

simsong commented Nov 2, 2024

It's unclear to me why the AppVeryor failed, so I'm doing an update branch and we will see if it fails again.

Makefile.am Outdated
@@ -701,9 +704,6 @@ endif
tsk_jar = $(top_builddir)/bindings/java/dist/sleuthkit-$(PACKAGE_VERSION).jar
jardir = $(prefix)/share/java

$(tsk_jar):
cd bindings/java ; ant dist $(ant_args)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this being removed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

user error. I'll add it back.

Makefile.am Outdated Show resolved Hide resolved
Makefile.am Outdated Show resolved Hide resolved
@simsong
Copy link
Member Author

simsong commented Nov 26, 2024 via email

@@ -37,7 +37,7 @@ jobs:
linkage: "shared"
compiler: "gcc"
runner: "ubuntu-22.04"
configure_opts: "--with-libewf --with-libqcow --with-libvhdi --with-libvmdk --disable-java CFLAGS=\"-Wno-unused-command-line-argument -Werror\" CXXFLAGS=-Werror"
configure_opts: "--with-libewf --with-libqcow --with-libvhdi --with-libvmdk --disable-java CXXFLAGS=-Werror"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CFLAGS=-Werror should not be removed.

Makefile.am Outdated
@@ -2,7 +2,7 @@ ACLOCAL_AMFLAGS = -I m4

AM_CPPFLAGS = -I$(srcdir)/tsk
AM_CFLAGS = -Wall -Wextra
AM_CXXFLAGS = -Wall -Wextra -Woverloaded-virtual
AM_CXXFLAGS = -Wall -Wextra -Woverloaded-virtual
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra space.

@simsong
Copy link
Member Author

simsong commented Nov 26, 2024 via email

@@ -49,7 +49,7 @@ jobs:
linkage: "shared"
compiler: "clang"
runner: "ubuntu-22.04"
configure_opts: "--with-libewf --with-libqcow --with-libvhdi --with-libvmdk --disable-java CC=clang CXX=clang++ CFLAGS=-Werror CXXFLAGS=-Werror"
configure_opts: "--with-libewf --with-libqcow --with-libvhdi --with-libvmdk --disable-java CC=clang CXX=clang++ CFLAGS=-Werror CXXFLAGS=-Werror
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is missing the trailing double quote.

Copy link

codecov bot commented Nov 27, 2024

Codecov Report

Attention: Patch coverage is 80.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 19.18%. Comparing base (db89b4e) to head (69d8604).
Report is 58 commits behind head on develop.

Files with missing lines Patch % Lines
tsk/fs/fs_open.c 0.00% 2 Missing ⚠️
tsk/fs/dcat_lib.cpp 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #3065      +/-   ##
===========================================
+ Coverage    16.50%   19.18%   +2.68%     
===========================================
  Files          145      148       +3     
  Lines        34107    36233    +2126     
===========================================
+ Hits          5627     6948    +1321     
- Misses       28480    29285     +805     
Files with missing lines Coverage Δ
tsk/fs/btrfs.cpp 58.43% <ø> (ø)
tsk/fs/btrfs_csum.cpp 100.00% <100.00%> (ø)
tsk/fs/fs_types.c 14.81% <ø> (ø)
tsk/fs/tsk_btrfs.h 100.00% <100.00%> (ø)
tsk/fs/tsk_fs.h 0.00% <ø> (ø)
tsk/fs/dcat_lib.cpp 0.00% <0.00%> (ø)
tsk/fs/fs_open.c 30.22% <0.00%> (-0.44%) ⬇️

... and 5 files with indirect coverage changes

@simsong
Copy link
Member Author

simsong commented Nov 27, 2024

Thanks to @uckelman-sf for all of your help!

@simsong simsong merged commit 9608dae into develop Nov 27, 2024
24 checks passed
@simsong simsong deleted the develop-btrfs branch November 27, 2024 07:22
@uckelman-sf
Copy link
Contributor

Wait, this should not have been merged yet.

  • I didn't review the substantive parts of the code---I was waiting for all the build problems to be fixed before doing that.
  • How are we validating that any of this works? How was the test data produced?

@simsong
Copy link
Member Author

simsong commented Nov 27, 2024

Wait, this should not have been merged yet.

  • I didn't review the substantive parts of the code---I was waiting for all the build problems to be fixed before doing that.
  • How are we validating that any of this works? How was the test data produced?

Whoops.
I validated it on the disk image btrfs_testimage_50MB.E01

@simsong
Copy link
Member Author

simsong commented Nov 27, 2024

Wait, this should not have been merged yet.

  • I didn't review the substantive parts of the code---I was waiting for all the build problems to be fixed before doing that.
  • How are we validating that any of this works? How was the test data produced?

Whoops. I validated it on the disk image btrfs_testimage_50MB.E01

How do you want to proceed?

@uckelman-sf
Copy link
Contributor

Whoops. I validated it on the disk image btrfs_testimage_50MB.E01

How? By hand? Against some expected results?

@simsong
Copy link
Member Author

simsong commented Nov 27, 2024

Whoops. I validated it on the disk image btrfs_testimage_50MB.E01

How? By hand? Against some expected results?

By hand. I created the test image and ran fiwalk on it and I got results that I expected.
Separately, I've been working with NIST to see if we can create a new forensic image generator that generates a disk image and creates a DFXML file at the same time, so that we can automate the process of creating novel complex images and then testing them.

@uckelman-sf
Copy link
Contributor

Ok. I'd like to have a look through tsk/fs/btrfs.cpp and tsk/fs/tsk_btrfs.h when I'm back on Tuesday.

@simsong
Copy link
Member Author

simsong commented Nov 27, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants