Skip to content

Conversation

@IanManske
Copy link
Member

@IanManske IanManske commented Nov 20, 2024

Description

This PR cleans up the code surrounding formatting and displaying file sizes.

  • The byte_unit crate we use for file size units displays kilobytes as KB, which is not the SI or ISO/IEC standard. Rather it should be kB, so this fixes SI Kilobyte should be written as "kB", not "KB" #8872. On some systems, KB actually means KiB, so this avoids any potential confusion.
  • The byte_unit crate, when displaying file sizes, casts integers to floats which will lose precision for large file sizes. This PR adds a custom Display implementation for Filesize that can give an exact string representation of a Filesize for metric/SI units.
  • This PR also removes the dependency on the byte_unit crate which brought in several other dependencies.

Additionally, this PR makes some changes to the config for filesize formatting ($env.config.filesize).

  • The previous filesize config had the metric and format options. If a metric (SI) unit was set in format, but metric was set to false, then the metric option would take precedence and convert format to the corresponding binary unit (or vice versa). E.g., { format: kB, metric: false } => KiB. Instead, this PR adds the unit option to replace the format and metric options. unit can be set to a fixed file size unit like kB or KiB, or it can be set to one of the special options: binary or metric. These options tells nushell to format file sizes using an appropriately scaled metric or binary unit (examples below).
    # precision = null
    
    # unit = kB
    1kB  # 1 kB
    1KiB # 1.024 kB
    
    # unit = KiB
    1kB  # 0.9765625 KiB
    1KiB # 1 KiB
    
    # unit = metric
    1000B     # 1 kB
    1024B     # 1.024 kB
    10_000MB  # 10 GB
    10_240MiB # 10.73741824 GB
    
    # unit = binary
    1000B     # 1000 B
    1024B     # 1 KiB
    10_000MB  # 9.313225746154785 GiB
    10_240MiB # 10 GiB
  • In addition, this PR also adds the precision option to the filesize config. It determines how many digits to show after the decimal point. If set to null, then everything after the decimal point is shown.
  • The default filesize config is { unit: metric, precision: 1 }.

User-Facing Changes

  • Commands that use the config to format file sizes will follow the changes described above (e.g., table, into string, to text, etc.).
  • The file size unit/format passed to format filesize is now case sensitive. An error with the valid units is shown if the case does not match.
  • $env.config.filesize.format and $env.config.filesize.metric are deprecated and replaced by $env.config.filesize.unit.
  • A new $env.config.filesize.precision option was added.

Tests + Formatting

Mostly updated test expected outputs.

After Submitting

This PR does not change the way NUON serializes file sizes, because that would require changing the nu parser to be able to losslessly decode the new, exact string representation introduced in this PR.

Similarly, this PR also does not change the file size parsing in any way. Although the file size units provided to format filesize or the filesize config are now case-sensitive, the same is not yet true for file size literals in nushell code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Holy moly is that removing many dependencies!

}

// Safety: all the characters in `buf` are valid UTF-8.
let fract = unsafe { std::str::from_utf8_unchecked(&buf[..i]) };
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The debug_assert above should catch potential errors that could make this invalid UTF-8. I could change this to std::str::from_utf8 paired with an expect instead if that is better.

@IanManske IanManske added A:configuration Issue related to nu's configuration notes:breaking-changes This PR implies a change affecting users and has to be noted in the release notes notes:fixes Include the release notes summary in the "Bug fixes" section deprecated:pr-commands (deprecated: too vague) This PR changes our commands in some way A:nuon-format I/O and spec of the nuon data format and removed A:nuon-format I/O and spec of the nuon data format labels Nov 30, 2024
@IanManske IanManske marked this pull request as ready for review November 30, 2024 01:50
Copy link
Member

@sholderbach sholderbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In your changes to the PR description you talk about a space option. The code doesn't reflect that yet, so maybe you miss a push here. In teneral I kind of lean in the direction that we should for our Display output not include a space as this corresponds to the literal instead of being typographically correct.

I haven't vetted the nasty formatting algorithm to minimize the float errors with base 10 yet. If your algo is tested well I am probably fine with that debug_assert!

Generally I like the changes in the config layout as is (don't care for the space) and happy about the more Nushelly structure around the config internals. (and good riddance byte_unit's transitive dependencies(

@IanManske
Copy link
Member Author

Sorry for the confusion, I added the space option in another commit, but decided against it and removed it. My reason being that most software (citation needed) seems to display file sizes with a space between the unit and number. For example, Windows Explorer (if I remember correctly), the Firefox inspect network tab, various Linux file managers, Finder on macOS, etc.

If necessary, we can always add the space option in the future.

@IanManske IanManske merged commit 93e1217 into nushell:main Jan 23, 2025
16 checks passed
@github-actions github-actions bot added this to the v0.102.0 milestone Jan 23, 2025
cloud-np added a commit to cloud-np/nushell that referenced this pull request May 3, 2025
The PR that forces this commit nushell/nushell#14397
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A:configuration Issue related to nu's configuration deprecated:pr-commands (deprecated: too vague) This PR changes our commands in some way notes:breaking-changes This PR implies a change affecting users and has to be noted in the release notes notes:fixes Include the release notes summary in the "Bug fixes" section

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SI Kilobyte should be written as "kB", not "KB"

2 participants