Releases: mongodb/js-bson
v6.10.1
6.10.1 (2024-11-27)
The MongoDB Node.js team is pleased to announce version 6.10.1 of the bson package!
Release Notes
Fix issue with the internal unbounded type cache
As an optimization, a previous release cached the type information of seen objects to avoid recalculating it. The assumption was that garbage collection would clear the cache often enough to sustain normal operation, but under extreme load and high memory usage the cache kept growing and caused an issue in the driver. The cache has been removed and a different optimized type check is used in its place.
Cache the hex string of an ObjectId lazily
When ObjectId.cacheHexString is set to true we no longer convert the buffer to a hex string in the constructor, since the cache is already being filled in any call to objectid.toHexString().
Additionally, if a string is passed into the constructor we can cache this immediately as there is no performance impact and no extra memory that needs to be allocated.
This improves the performance for situations where you are parsing ObjectIds from a string (e.g. JSON) and want to avoid recalculating the hex. It also improves situations where you have ObjectIds coming from BSON and only convert some of them to strings, perhaps after applying a filter to eliminate the rest.
With cacheHexString enabled, deserializing ObjectIds from BSON shows a ~80% performance improvement, and calling toString on ObjectIds that were constructed from a string is ~40% faster!
Thanks to @SeanReece for contributing this improvement!
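The lazy caching described above can be sketched roughly as follows. This is only an illustration of the idea, assuming a hypothetical LazyHexId class; it is not the bson ObjectId implementation and it assumes string inputs are valid, even-length hex.

```javascript
// Hypothetical stand-in for an id type; NOT the bson ObjectId implementation.
class LazyHexId {
  constructor(input) {
    if (typeof input === 'string') {
      // The hex string is already in hand, so caching it allocates nothing extra.
      this.bytes = Uint8Array.from(input.match(/../g), (pair) => parseInt(pair, 16));
      this.cachedHex = input.toLowerCase();
    } else {
      // Built from bytes: defer the hex conversion until someone asks for it.
      this.bytes = input;
      this.cachedHex = null;
    }
  }

  toHexString() {
    if (this.cachedHex === null) {
      this.cachedHex = Array.from(this.bytes, (b) => b.toString(16).padStart(2, '0')).join('');
    }
    return this.cachedHex;
  }
}

const fromBytes = new LazyHexId(Uint8Array.of(0x00, 0xff, 0x10));
const hexOfBytes = fromBytes.toHexString(); // computed on first call, cached afterwards
```

Ids that are filtered out before anyone calls toHexString() never pay for the conversion.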
We invite you to try the bson library immediately, and report any issues to the NODE project.
v6.10.0
6.10.0 (2024-11-18)
The MongoDB Node.js team is pleased to announce version 6.10.0 of the bson package!
Release Notes
BSON Binary Vector Support!
The Binary class has new helpers to assist with using the newly minted Vector sub_type of Binary (sub_type == 9) 🎉! For more on how these types can be used with MongoDB, take a look at How to Ingest Quantized Vectors!
Here's a summary of the API:
class Binary {
toInt8Array(): Int8Array;
toFloat32Array(): Float32Array;
toPackedBits(): Uint8Array;
static fromInt8Array(array: Int8Array): Binary;
static fromFloat32Array(array: Float32Array): Binary;
static fromPackedBits(array: Uint8Array, padding: number = 0): Binary;
}
Relatively self-explanatory: each one supports converting to and constructing from a native JavaScript data type that corresponds to one of the three vector types: Int8, Float32, PackedBit.
Vector Bytes Format
When a Binary is sub_type 9 the first two bytes are set to important metadata about the vector.
- binary.buffer[0] - The datatype that indicates what the following bytes are.
- binary.buffer[1] - The padding amount, a value 0-7 that indicates how many bits to ignore in a PackedBit vector.
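As a rough illustration of that layout, the sketch below splits a raw byte array into its two metadata bytes and the remaining vector data. The function name splitVectorHeader is hypothetical, and the datatype byte used in the example is an arbitrary placeholder rather than a value taken from these notes.

```javascript
// Hypothetical helper: separates the two header bytes from the vector payload.
function splitVectorHeader(bytes) {
  if (bytes.length < 2) {
    throw new RangeError('a vector needs at least the two header bytes');
  }
  const padding = bytes[1];
  if (padding > 7) {
    throw new RangeError('padding must be a value from 0 to 7');
  }
  return {
    datatype: bytes[0],       // says how to interpret the payload
    padding,                  // bits to ignore at the end of a packed-bit payload
    data: bytes.subarray(2),  // view over the payload, no copy
  };
}

// 0x10 is only a placeholder datatype byte for this example.
const header = splitVectorHeader(Uint8Array.of(0x10, 4, 0xff, 0xf0));
```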
Packed Bits 📦
static fromPackedBits(array: Uint8Array, padding: number = 0)
When handling packed bits, the last byte may not be entirely used. For example, a PackedBit vector = [0xFF, 0xF0] with padding = 4 ignores those last four 0s, making the bit vector logically equal to 12 ones.
F F F 0
[1111 1111 1111] // ignored: the four 0s are padding
Important
When using the fromPackedBits method, be sure to set your padding amount to avoid inadvertently extending your bit vector.
Unpacking Bits 🧳
Packed bits get special treatment with two styles of conversion methods to suit your vector-y needs. toBits will return individually addressable bits shifted apart into an array. fromBits takes the same format in reverse and packs the bits into bytes.
Notice there is no argument to set the padding. That is because it can be determined by the array's length. Recall the 12 ones from the previous example: the padding has to be 4 to reach a multiple of 8.
class Binary {
toBits(): Int8Array;
static fromBits(bits: ArrayLike<number>): Binary;
}
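To make the relationship between bit count and padding concrete, here is a minimal standalone sketch of unpacking and packing. The helpers unpackBits and packBits are hypothetical and do not call the bson library; they assume the most significant bit of each byte comes first, which matches the [0xFF, 0xF0] example above.

```javascript
// Hypothetical helpers; they mirror the idea of toBits/fromBits without using bson.
function unpackBits(bytes, padding) {
  const bitCount = bytes.length * 8 - padding;
  const bits = new Int8Array(bitCount);
  for (let i = 0; i < bitCount; i++) {
    // Bit 0 of the vector is the most significant bit of byte 0.
    bits[i] = (bytes[i >> 3] >> (7 - (i & 7))) & 1;
  }
  return bits;
}

function packBits(bits) {
  const byteCount = Math.ceil(bits.length / 8);
  // Padding is whatever is needed to reach a multiple of 8.
  const padding = byteCount * 8 - bits.length;
  const bytes = new Uint8Array(byteCount);
  for (let i = 0; i < bits.length; i++) {
    if (bits[i] === 1) bytes[i >> 3] |= 1 << (7 - (i & 7));
  }
  return { bytes, padding };
}

const twelveOnes = unpackBits(Uint8Array.of(0xff, 0xf0), 4);
const repacked = packBits(twelveOnes);
```

Round-tripping the 12 ones yields the original two bytes and a derived padding of 4.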
Caution
We highly encourage using ONLY these methods to interact with vector data and avoid operating directly on the byte format. Other Binary class methods (put(), write(), read(), and value()) and direct access of data in a Binary's buffer beyond the 1st index should only be used in exceptional circumstances and with extreme caution after closely consulting the BSON Vector specification.
Details to keep in mind
- A JavaScript engine's endianness is platform dependent whereas BSON is always in little-endian format, so if viewing bytes as Float32s take care to re-order bytes as needed.
- Int8 vectors are signed bytes but read() always returns unsigned bytes.
- The vector data begins at offset 2.
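The first two points can be illustrated with plain typed-array code. This is a sketch under the assumption that you already hold the raw payload bytes; readFloat32LE and toSignedByte are hypothetical helper names, not bson APIs.

```javascript
// Reads a little-endian float32 regardless of the host's byte order.
function readFloat32LE(bytes, offset = 0) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return view.getFloat32(offset, true); // true = little-endian
}

// Reinterprets an unsigned byte (0..255) as a signed Int8 (-128..127).
function toSignedByte(byte) {
  return (byte << 24) >> 24;
}

const one = readFloat32LE(Uint8Array.of(0x00, 0x00, 0x80, 0x3f)); // bytes of 1.0
const minusOne = toSignedByte(0xff);
```

Using a DataView with an explicit little-endian flag avoids depending on how a Float32Array happens to be laid out on the current platform.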
Binary's read() returns a view of Binary.buffer
The return type of Binary's read() claimed it would return number[] or Uint8Array, which was true in previous BSON versions that didn't always store a Uint8Array on the buffer property like Binary does today.
The length parameter of read() did not respect the position value, which allowed reading bytes beyond the data that is actually stored in the Binary. This has been corrected.
Additionally, this method returned a view in Node.js environments and a copy in Web environments. It has been fixed to always return a view.
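The difference between a view and a copy, and the clamping to position, can be sketched with plain typed arrays. The readView function below is a hypothetical illustration and not the library's read() implementation.

```javascript
// Hypothetical sketch: return a view limited to the bytes actually written.
function readView(buffer, position, start, length) {
  const end = Math.min(start + length, position); // never read past the stored data
  return buffer.subarray(start, end);             // subarray shares memory with buffer
}

const storage = Uint8Array.of(1, 2, 3, 4, 0, 0, 0, 0); // capacity 8
const written = 4;                                     // only 4 bytes hold data
const view = readView(storage, written, 2, 10);        // asks for more than exists
const copy = storage.slice(2, 4);                      // slice allocates a new array

storage[2] = 9; // visible through the view, not through the copy
```

Because a view shares memory, callers that need a stable snapshot should copy it themselves.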
We invite you to try the bson library immediately, and report any issues to the NODE project.
v6.9.0
6.9.0 (2024-10-15)
The MongoDB Node.js team is pleased to announce version 6.9.0 of the bson package!
Release Notes
Timestamp now has t and i properties
To make this type a bit easier to use we are surfacing the breakdown of the two internal 32-bit segments of a Timestamp value.
const ts = new Timestamp({ i: 2, t: 1 });
ts.i // 2
ts.t // 1
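For intuition, the two segments can be derived from a single 64-bit value with BigInt arithmetic. This sketch assumes the usual BSON layout in which t occupies the high 32 bits and i the low 32 bits; splitTimestamp is a hypothetical helper, not part of the bson API.

```javascript
// Hypothetical helper: splits an unsigned 64-bit timestamp into its halves.
function splitTimestamp(value) {
  const i = Number(value & 0xffffffffn);         // low 32 bits: increment
  const t = Number((value >> 32n) & 0xffffffffn); // high 32 bits: seconds
  return { t, i };
}

const parts = splitTimestamp((1n << 32n) | 2n); // same t and i as the example above
```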
ObjectId.isValid(string) performance improvement
ObjectId.isValid is often used to check that a hex string has the correct length and format before constructing an ObjectId for querying. It now validates strings much faster than before. Many thanks to @SeanReece for the contribution!
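One common way to make this kind of check cheap is to test the length first and then scan character codes instead of running a regular expression. The sketch below shows that general technique only; looksLikeObjectIdHex is a hypothetical function and is not claimed to be how the library implements isValid.

```javascript
// Hypothetical validator: 24 characters, each one a hex digit.
function looksLikeObjectIdHex(str) {
  if (typeof str !== 'string' || str.length !== 24) return false;
  for (let i = 0; i < 24; i++) {
    const code = str.charCodeAt(i);
    const isDigit = code >= 48 && code <= 57;  // '0'..'9'
    const isUpper = code >= 65 && code <= 70;  // 'A'..'F'
    const isLower = code >= 97 && code <= 102; // 'a'..'f'
    if (!isDigit && !isUpper && !isLower) return false;
  }
  return true;
}

const accepted = looksLikeObjectIdHex('507f1f77bcf86cd799439011');
const rejected = looksLikeObjectIdHex('507f1f77bcf86cd79943901z');
```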
Serialization performance improved.
Optimizations have been implemented with respect to BSON serialization across the board, resulting in up to 20% gains in serialization with a sample of MFlix documents. Thanks again to @SeanReece for the contribution!
Performance Improvements
- NODE-6344: improve ObjectId.isValid(string) performance (#708) (064ba91)
- NODE-6356: Improve serialization performance (#709) (61537f5)
We invite you to try the bson library immediately, and report any issues to the NODE project.
v6.8.0
6.8.0 (2024-06-27)
The MongoDB Node.js team is pleased to announce version 6.8.0 of the bson package!
Release Notes
Add Signature to GitHub Releases
The GitHub release for js-bson now contains a detached signature file for the npm package (named bson-X.Y.Z.tgz.sig), on every major and patch release to 6.x and 5.x. To verify the signature, follow the instructions in the 'Release Integrity' section of the README.md file.
Optimize performance of Long.fromBigInt
Internally, fromBigInt was originally implemented using toString of the bigint value. Long.fromBigInt has now been refactored to use bitwise operations, greatly improving performance.
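The general idea of using bitwise operations instead of a string round trip can be sketched as below. The helper splitInt64 is hypothetical and only shows how a bigint can be divided into the signed low and high 32-bit halves that a 64-bit integer type typically stores; it is not the library's code.

```javascript
// Hypothetical helper: bigint -> signed 32-bit low/high halves, no string parsing.
function splitInt64(value) {
  const wrapped = BigInt.asIntN(64, value);                  // clamp to 64-bit two's complement
  const low = Number(wrapped & 0xffffffffn) | 0;             // "| 0" reinterprets as signed int32
  const high = Number((wrapped >> 32n) & 0xffffffffn) | 0;
  return { low, high };
}

const negativeOne = splitInt64(-1n);
const twoToThe32 = splitInt64(1n << 32n);
```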
We invite you to try the bson library immediately, and report any issues to the NODE project.
v6.7.0
6.7.0 (2024-05-01)
The MongoDB Node.js team is pleased to announce version 6.7.0 of the bson package!
Release Notes
Add Long.fromStringStrict method
The Long.fromStringStrict method is almost identical to the Long.fromString method, except it throws a BSONError if any of the following are true:
- input string has invalid characters, for the given radix
- the string contains whitespace
- the value the input parameters represent is too large or too small to be a 64-bit Long
Unlike Long.fromString, this method does not coerce the inputs '+/-Infinity' and 'NaN' to Long.ZERO in any case.
Examples:
Long.fromStringStrict('1234xxx5'); // throws BSONError
Long.fromString('1234xxx5'); // coerces input and returns new Long(123400)
// when writing in radix 10, 'n' and 'a' are both invalid characters
Long.fromStringStrict('NaN'); // throws BSONError
Long.fromString('NaN'); // coerces input and returns Long.ZERO
Note
Long.fromStringStrict's functionality will be present in Long.fromString in the V7 BSON release.
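The three strictness rules listed above can be illustrated with a small standalone parser built on BigInt. The function parseInt64Strict is hypothetical and throws plain errors rather than BSONError; treat it as a sketch of the rules, not as the library's implementation.

```javascript
const INT64_MIN = -(2n ** 63n);
const INT64_MAX = 2n ** 63n - 1n;

// Hypothetical strict parser: rejects bad digits, whitespace, and out-of-range values.
function parseInt64Strict(str, radix = 10) {
  // An optional sign followed by alphanumerics only; any whitespace fails here.
  const match = /^([+-]?)([0-9a-z]+)$/i.exec(str);
  if (match === null) throw new TypeError(`"${str}" is not a valid integer string`);
  let value = 0n;
  for (const ch of match[2].toLowerCase()) {
    const digit = parseInt(ch, 36); // '0'..'9' -> 0..9, 'a'..'z' -> 10..35
    if (digit >= radix) throw new TypeError(`"${ch}" is not a digit in radix ${radix}`);
    value = value * BigInt(radix) + BigInt(digit);
  }
  if (match[1] === '-') value = -value;
  if (value < INT64_MIN || value > INT64_MAX) {
    throw new RangeError(`"${str}" does not fit in a signed 64-bit integer`);
  }
  return value;
}

const parsed = parseInt64Strict('1234');
```

With this shape, 'NaN' is rejected in radix 10 simply because 'n' and 'a' are not decimal digits, with no special-casing needed.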
Add static Double.fromString method
This method attempts to create a Double type from a string, and will throw a BSONError on any string input that is not representable as an IEEE-754 64-bit double.
Notably, this method will also throw on the following string formats:
- Strings in non-decimal and non-exponential formats (binary, hex, or octal digits)
- Strings with characters other than sign, numeric, floating point, or slash characters (Note: 'Infinity', '-Infinity', and 'NaN' input strings are still allowed)
- Strings with leading and/or trailing whitespace
Strings with leading zeros, however, are also allowed.
Add static Int32.fromString method
This method attempts to create an Int32 type from a string, and will throw a BSONError on any string input that is not representable as an Int32.
Notably, this method will also throw on the following string formats:
- Strings in non-decimal formats (exponent notation, binary, hex, or octal digits)
- Strings with non-numeric and non-leading sign characters (ex: '2.0', '24,000')
- Strings with leading and/or trailing whitespace
Strings with leading zeros, however, are allowed.
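The Int32 rules above map naturally onto a format check followed by a range check. The sketch below uses a hypothetical parseInt32Strict function that throws plain errors instead of BSONError; it illustrates the rules and is not the library's implementation.

```javascript
// Hypothetical strict parser for 32-bit integers written in plain decimal.
function parseInt32Strict(str) {
  // Optional leading sign, then decimal digits only: no '.', ',', 'e', '0x', or spaces.
  if (!/^[+-]?\d+$/.test(str)) {
    throw new TypeError(`"${str}" is not a plain decimal integer`);
  }
  const value = Number(str);
  if (value < -2147483648 || value > 2147483647) {
    throw new RangeError(`"${str}" does not fit in an Int32`);
  }
  return value | 0; // also normalizes -0 to 0
}

const seven = parseInt32Strict('007'); // leading zeros are accepted
```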
UTF-8 validation now throws a BSONError on overlong encodings in Node.js
Specifically, this affects deserialize when UTF-8 validation is enabled, which is the default.
An overlong encoding is when the number of bytes in an encoding is inflated by padding the code point with leading 0s.
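To see what an overlong encoding looks like in bytes, the sketch below uses the platform's standard TextDecoder in fatal mode, which also rejects such sequences. This demonstrates the concept only; it does not call bson's deserialize, and decodeStrict is a hypothetical helper name.

```javascript
const strictDecoder = new TextDecoder('utf-8', { fatal: true });

// Hypothetical helper: decodes UTF-8 and throws on any malformed sequence.
function decodeStrict(bytes) {
  return strictDecoder.decode(bytes);
}

const wellFormed = Uint8Array.of(0x2f);     // '/' in its shortest, one-byte form
const overlong = Uint8Array.of(0xc0, 0xaf); // the same code point padded out to two bytes

const slash = decodeStrict(wellFormed);

let overlongRejected = false;
try {
  decodeStrict(overlong);
} catch (error) {
  overlongRejected = true; // fatal mode refuses the overlong form
}
```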
Long.fromString takes radix into account before coercing '+/-Infinity' and 'NaN' to Long.ZERO
Long.fromString no longer coerces the following cases to Long.ZERO when the provided radix supports all characters in the string:
- '+Infinity', '-Infinity', or 'Infinity' when 35 <= radix <= 36
- 'NaN' when 24 <= radix <= 36
// when writing in radix 27, 'n' and 'a' are valid characters, so 'NaN' represents the decimal number 17060
Long.fromString('NaN', 27); // new Long(17060)
Long.fromString('NaN', 10); // new Long(0) <-- Since 'NaN' is not a valid input in base 10, it gets coerced to Long.ZERO
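The radix thresholds quoted above follow from the largest digit each word contains. As a worked check, the hypothetical minimumRadix helper below computes the smallest radix in which every character of an unsigned word is a valid digit; it uses only the built-in parseInt.

```javascript
// Hypothetical helper: smallest radix that can represent every character of `word`.
function minimumRadix(word) {
  let largestDigit = 0;
  for (const ch of word.toLowerCase()) {
    const digit = parseInt(ch, 36); // 'a' -> 10 ... 'z' -> 35
    if (Number.isNaN(digit)) throw new TypeError(`"${ch}" is not alphanumeric`);
    largestDigit = Math.max(largestDigit, digit);
  }
  return Math.max(2, largestDigit + 1);
}

const nanRadix = minimumRadix('NaN');           // 'n' is digit 23, so radix 24 is the minimum
const infinityRadix = minimumRadix('Infinity'); // 'y' is digit 34, so radix 35 is the minimum
const nanInBase27 = parseInt('NaN', 27);        // 23 * 27 * 27 + 10 * 27 + 23
```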
Features
- NODE-5648: add Long.fromStringStrict() (#675) (9d5a5df)
- NODE-6086: add Double.fromString() method (#671) (e943cdb)
- NODE-6087: add Int32.fromString method (#670) (5a21889)
Bug Fixes
- NODE-6123: utf8 validation is insufficiently strict (#676) (ae8bac7)
- NODE-6144: Long.fromString incorrectly coerces valid inputs to Long.ZERO in special cases (#677) (208f7e8)
We invite you to try the bson library immediately, and report any issues to the NODE project.
v6.6.0
6.6.0 (2024-04-01)
The MongoDB Node.js team is pleased to announce version 6.6.0 of the bson package!
Release Notes
Binary.toString and Binary.toJSON align with BSON serialization
When BSON serializes a Binary instance it uses the bytes between 0 and binary.position, since Binary supports pre-allocating empty space and writing segments of data using .put()/.write(). Erroneously, the toString() and toJSON() methods did not use the position property to limit how much of the underlying buffer to transform into the final value, potentially returning a string longer than the actual data of the Binary instance.
In general, you may not encounter this bug if Binary instances are created from a data source (new Binary(someBuffer)) or are returned by the database, because in both of these cases binary.position is equal to the length of the underlying buffer.
Fixed example creating an empty Binary:
new BSON.Binary().toString();
// old output: '\x00\x00\x00\x00...' (256 zeros)
// new output: ''
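The underlying idea is that a pre-allocated buffer has a capacity that is larger than the data written so far, so any conversion should stop at the write position. A minimal sketch, using a hypothetical ScratchBytes class rather than the real Binary:

```javascript
// Hypothetical growable byte holder; NOT the bson Binary class.
class ScratchBytes {
  constructor(capacity = 256) {
    this.buffer = new Uint8Array(capacity); // pre-allocated, zero-filled space
    this.position = 0;                      // number of bytes actually written
  }

  put(byte) {
    this.buffer[this.position++] = byte;
  }

  // Only the written bytes take part in the conversion.
  toLatin1String() {
    return String.fromCharCode(...this.buffer.subarray(0, this.position));
  }
}

const scratch = new ScratchBytes();
const emptyText = scratch.toLatin1String(); // nothing written yet
scratch.put(0x68);
scratch.put(0x69);
const writtenText = scratch.toLatin1String();
```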
Experimental APIs
This release contains experimental APIs that are not suitable for production use. As a reminder, anything marked @experimental is not a part of the stable semantically versioned API and is subject to change in any subsequent release.
We invite you to try the bson library immediately, and report any issues to the NODE project.
v6.5.0
6.5.0 (2024-03-12)
The MongoDB Node.js team is pleased to announce version 6.5.0 of the bson package!
Release Notes
Fixed float byte-wise handling on big-endian systems
Caution
Among the platforms BSON and the MongoDB driver support, this issue impacts s390x big-endian systems. x86, ARM, and other little-endian systems are not affected. Existing versions of the driver can be upgraded to this release.
A recent change to the BSON library started parsing and serializing floats using a Float64Array. When reading the bytes from this array the ordering is dependent on the platform it is running on, and we now properly account for that ordering.
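The platform dependence can be handled by detecting the host byte order once and reversing the bytes when needed. The sketch below shows that general approach with hypothetical names (hostIsBigEndian, doubleToLittleEndianBytes); it is an assumption about how such a fix can look, not the library's actual code.

```javascript
// 0x00ff is stored as [0xff, 0x00] on little-endian hosts and [0x00, 0xff] on big-endian ones.
const hostIsBigEndian = new Uint8Array(new Uint16Array([0x00ff]).buffer)[0] === 0x00;

// Hypothetical helper: always yields the 8 bytes of a double in little-endian order.
function doubleToLittleEndianBytes(value) {
  const float64 = new Float64Array(1);
  float64[0] = value;
  const bytes = new Uint8Array(float64.buffer.slice(0)); // bytes in host order (copied)
  if (hostIsBigEndian) bytes.reverse();                  // BSON wants little-endian
  return bytes;
}

const encoded = doubleToLittleEndianBytes(1.5);
```

A DataView with an explicit little-endian flag gives the same result on every platform and is a convenient way to cross-check this kind of code.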
Add SUBTYPE_SENSITIVE on Binary class
When a BSON.Binary object is of 'sensitive' subtype, the object's subtype will equal 0x08.
Features
- NODE-5506: add Binary subtype sensitive (#657) (748ca60)
- NODE-5957: add BSON indexing API (#654) (2ac17ec)
We invite you to try the bson library immediately, and report any issues to the NODE project.
v6.4.0
6.4.0 (2024-02-29)
The MongoDB Node.js team is pleased to announce version 6.4.0 of the bson package!
Release Notes
BSON short basic latin string writing performance improved!
The BSON library's string encoding logic now attempts to optimize for basic latin (ASCII) characters. This will apply to both BSON keys and BSON values that are or contain strings. If strings are less than 6 bytes we observed approximately 100% increase in speed while around 24 bytes the performance was about 33% better. For any non-basic latin bytes or at 25 bytes or greater the BSON library will continue to use Node.js' Buffer.toString API.
The intent is to generally target the serialization of BSON keys which are often short and only use basic latin.
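A fast path of this kind can be sketched as a loop that copies character codes directly while they stay in the ASCII range and falls back to a general encoder otherwise. The writeString function and the SHORT_STRING_LIMIT value are hypothetical, and the fallback here uses the standard TextEncoder rather than any Node.js Buffer API.

```javascript
const SHORT_STRING_LIMIT = 24; // illustrative threshold, not taken from the library
const utf8Encoder = new TextEncoder();

// Hypothetical writer: returns the number of bytes written into `target` at `offset`.
function writeString(target, offset, str) {
  if (str.length <= SHORT_STRING_LIMIT) {
    let i = 0;
    for (; i < str.length; i++) {
      const code = str.charCodeAt(i);
      if (code > 0x7f) break;      // not basic latin: give up on the fast path
      target[offset + i] = code;   // one char code is exactly one UTF-8 byte
    }
    if (i === str.length) return str.length;
  }
  // General path: full UTF-8 encoding (overwrites any partially written bytes).
  const encoded = utf8Encoder.encode(str);
  target.set(encoded, offset);
  return encoded.length;
}

const out = new Uint8Array(16);
const asciiLength = writeString(out, 0, '_id');
const accentLength = writeString(out, 8, 'é');
```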
Fixed objectId symbol property not defined on instances from cross cjs and mjs
We recommend that users of the driver use the BSON APIs exported from the driver. One reason is that the driver currently ships only in CommonJS format, so it only imports the CommonJS BSON bundle. If your application uses import syntax, there will be both a CommonJS and an ES module instance of BSON in the current process, which prevents things like instanceof from working.
Also, private symbols defined in one package will not be equal to symbols defined in the other. This caused an issue with ObjectId's private symbol property, preventing the .equals method from one package from operating on an ObjectId created by the other.
Thanks to @dot-i's contribution we've changed the private symbol to a private string property so that the .equals() method works across module types.
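The symbol mismatch is easy to reproduce in isolation: two symbols created with the same description are still distinct keys, whereas identical strings always name the same property. In the sketch below, the two symbols stand in for the ones each bundle would create, and all names are hypothetical.

```javascript
// Each module copy creates its own symbol, even with the same description.
const idKeyFromCommonJs = Symbol('id');
const idKeyFromEsModule = Symbol('id');

const createdByCommonJs = { [idKeyFromCommonJs]: 'abc', __id: 'abc' };

// The other copy looks up its own symbol and finds nothing...
const viaSymbol = createdByCommonJs[idKeyFromEsModule];
// ...but a string-keyed property is visible to both copies.
const viaString = createdByCommonJs['__id'];
```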
Deserialization performance increased
If BSON data does not contain Doubles and UTF8 validation is disabled, the deserializer is careful not to allocate the data structures needed to support that functionality. This has been shown to greatly increase (1.3x-2x) the performance of the deserializer.
Thank you @billouboq for this contribution!
Improve the performance of small byte copies
When serializing ObjectIds, Decimal128, and UUID values we can get better performance by writing the byte-copying logic in Javascript for loops rather than using the TypedArray.set API. ObjectId serialization performance is 1.5x-2x faster.
Improved the performance of serializing and deserializing doubles and bigints
We now use bit shifting and multiplication operators in place of DataView getX/setX calls to parse and serialize bigints and a Float64Array to convert a double to bytes. This change has been shown to increase deserializing performance ~1.3x and serializing performance ~1.75x.
Use allocUnsafe for ObjectIds and Decimal128
For small allocations, Node.js performance can be improved by using pre-allocated pooled memory. ObjectId and Decimal128 instances will now use allocUnsafe on Node.js.
Bug Fixes
- NODE-5873: objectId symbol property not defined on instances from cross cjs and mjs (#643) (4d9884d)
Performance Improvements
- NODE-5557: move DataView and Set allocation used for double parsing and utf8 validation to nested path (#611) (9a150e1)
- NODE-5910: optimize small byte copies (#651) (24d035e)
- NODE-5934: replace DataView uses with bit math (#649) (6d343ab)
- NODE-5955: use pooled memory when possible (#653) (78c4264)
We invite you to try the bson library immediately, and report any issues to the NODE project.
v6.3.0
6.3.0 (2024-01-31)
The MongoDB Node.js team is pleased to announce version 6.3.0 of the bson package!
Release Notes
BSON short basic latin string parsing performance improved! 🐎
The BSON library's string decoding logic now attempts to optimize for basic latin (ASCII) characters. This will apply to both BSON keys and BSON values that are or contain strings. If strings are less than 6 bytes we observed approximately 100% increase in speed, while around 15 bytes the performance was about 30% better. For any non-basic latin bytes, or at 20 bytes or greater, the BSON library will continue to use Node.js' Buffer.toString API.
The intent is to generally target the deserialization of BSON keys, which are often short and only use basic latin. Et tu, _id?
Using a number type as input to the ObjectId constructor is deprecated
Instead, use static createFromTime() to set a numeric value for the new ObjectId.
// previously
new ObjectId(Date.now())
// recommended
ObjectId.createFromTime(Date.now())
Features
- NODE-3034: deprecate number as an input to ObjectId constructor (#640) (44bec19)
- NODE-5861: optimize parsing basic latin strings (#642) (cdb779b)
We invite you to try the bson library immediately, and report any issues to the NODE project.
v6.2.0
6.2.0 (2023-10-16)
The MongoDB Node.js team is pleased to announce version 6.2.0 of the bson package!
Release Notes
Node.js BSON now supports inspect options, specifically visualizing in color
Other Notable Changes
- Strings have consistent single quotes around them, other than the case of a code block that has a string within
- Code blocks with newlines will be visually printed with newlines for easier reading
Clarify BSONVersionError message
Previously, our thrown BSONVersionError stated that the "bson type must be from 6.0 or later". Our intention is to prevent cross-major BSON types from reaching the serialization logic, as breaking changes to the types could lead to silent incompatibilities in the serialization process. We've updated the message to make that intention clear: "bson types must be from bson 6.x.x".
We invite you to try the bson library immediately, and report any issues to the NODE project.