feat: exposing `deno_cache` + `ai cache interceptor` #446

kallebysantos · 2024-11-19T08:58:10Z

What kind of change does this PR introduce?

feature, enhancement

What is the current behavior?

When using huggingface/transformers.js, the model assets can't be cached, so each worker life cycle has to fetch these assets again.

What is the new behavior?

The deno_cache is now exposed to JS land, so the transformers.js lib can use the global caches object to store model assets. An interceptor has also been attached to deno_cache; with it we can intercept only .onnx requests and use them during session load.

How does it work?

By attaching an interceptor, we can filter for .onnx requests and return the URL bytes instead of fetching the model. This way the transformers.js lib will think that these are the model's own bytes and use them during session init.
The exposed onnx runtime knows whether the incoming bytes are a URL string or model bytes, so it can choose to load from the internal cache or from memory.
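
The two steps described here can be illustrated with a minimal TypeScript sketch. It is not the code from this PR: the real interceptor and the URL check live in Rust land, and the names `interceptOnnxRequest`, `classifyPayload` and `ModelSource`, as well as the example URL, are hypothetical and only used for illustration. The sketch assumes that the URL is handed over as UTF-8 bytes and that only `http(s)` URLs count as a URL payload.

```typescript
// Hypothetical illustration only; the actual logic is implemented in Rust.

// What the runtime ends up loading from: a URL (internal cache) or raw bytes (memory).
type ModelSource =
  | { kind: "url"; url: string }
  | { kind: "bytes"; bytes: Uint8Array };

// Interceptor side: for `.onnx` requests, return the URL encoded as bytes
// instead of fetching the model. Returns null when the request is not intercepted.
function interceptOnnxRequest(requestUrl: string): Uint8Array | null {
  let parsed: URL;
  try {
    parsed = new URL(requestUrl);
  } catch {
    return null; // not a valid absolute URL, nothing to intercept
  }
  if (!parsed.pathname.endsWith(".onnx")) {
    return null; // only model files are intercepted
  }
  return new TextEncoder().encode(parsed.href);
}

// Runtime side: decide whether the incoming bytes are a URL string or model bytes.
function classifyPayload(payload: Uint8Array): ModelSource {
  try {
    // `fatal: true` makes decode() throw on invalid UTF-8.
    const text = new TextDecoder("utf-8", { fatal: true }).decode(payload);
    const url = new URL(text); // throws when the text is not an absolute URL
    if (url.protocol === "http:" || url.protocol === "https:") {
      return { kind: "url", url: url.href };
    }
  } catch {
    // Not UTF-8 or not a URL: fall through and treat the payload as model bytes.
  }
  return { kind: "bytes", bytes: payload };
}

// Usage: what the library receives for an `.onnx` request, and how the runtime reads it.
const payload = interceptOnnxRequest("https://example.com/models/onnx/model.onnx");
if (payload !== null) {
  const source = classifyPayload(payload);
  console.log(source.kind === "url" ? "load from internal cache" : "load from memory");
}
```

Passing the URL through as bytes keeps the library side unchanged: it still hands "model bytes" to the session, and only the runtime needs to know that a payload may actually be a URL.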

When loading from the internal cache, it achieves almost the same behaviour as PR #368 (Pipeline RFC).

Final considerations:

This is adapted from #368, where we split out only the core features that improve ort support for edge-runtime.

Finally, thanks to @nyannyacha, who helped me with this proposal and the code as well 🙏


laktek · 2024-11-24T17:34:52Z

@kallebysantos @nyannyacha Thanks for these changes! Merging this PR (I will run some performance tests in our staging environment before rolling this out)

github-actions · 2024-11-24T17:35:33Z

🎉 This PR is included in version 1.64.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

kallebysantos force-pushed the feat-cache-api branch from c65261e to 370b075 Compare November 20, 2024 00:09

nyannyacha force-pushed the feat-cache-api branch from 6a0e6cd to 5ae3ba3 Compare November 21, 2024 04:26

nyannyacha marked this pull request as ready for review November 21, 2024 08:33

laktek reviewed Nov 22, 2024


nyannyacha force-pushed the feat-cache-api branch 2 times, most recently from aa91f49 to 1f5024b Compare November 23, 2024 01:17

kallebysantos force-pushed the feat-cache-api branch 2 times, most recently from 3e81242 to e9b2f50 Compare November 23, 2024 12:08

kallebysantos and others added 22 commits November 23, 2024 21:11

chore: add deno_cache crate

ae73ab2

feat(sb_core): expose deno_cache to js land

3bc0757

**NOTE:** the cache is saved in a `temp folder`; consider moving it to a better location

stamp: add example for cache api

64944f5

stamp: add onnxruntime cache adapter

e5820f9

- adding a cache adapter that intercepts `transformers-cache`, fetching and caching models from Rust land

fix: tracing wrong import

a9814ef

test(sb_ai): adding ort snapshots in a separated file

50b77b0

- using a separate file to store `ort` prediction snapshots for both `x64` and `arm64`.

stamp(test): add test command with debug support

47df481

- Checking for `RUST_LOG` env, then executing tests as `debug` with tracing enabled

fix(sb_ai): apply url check before load_from_url()

74ad37f

- Applying url check to prevent request errors, matching between `Url` and `Model Bytes`.

test(sb_ai): add integration tests for ort cache

db1e3e1

- Adding test scenarios for `env.useBrowserCache = true`

stamp: clippy :)

98524cc

chore: rid all things that are related to model downloading

cf6d509

chore: cleanup ort integration tests

547ab81

refactor(sb_ai): polishing and cleanup

483625f

stamp: cleanup

98eed40

stamp: polishing

5575909

chore: update dependencies

1e3ea24

chore: update Cargo.lock

2bfb57a

stamp(sb_ai): polishing

963fed7

chore: allow install_onnx.sh able to be used in another os

fc67215

chore: add x86_64 snapshots of ort integration tests

18be84b

stamp: attach the os prefix to snapshots

06322ed

stamp: should be cleanup unused sessions while running integration tests

7199130

Nyannyacha and others added 3 commits November 23, 2024 12:15

chore: add x86_64 (linux) snapshots of ort integration tests

8a5668c

stamp: snapshots must not be created in CI

8255b74

stamp: disabling Web Cache API for general use

123bd2f

- blocking `Web Cache API` to only allow caching of `.onnx` files coming from the `transformers.js` lib.

kallebysantos force-pushed the feat-cache-api branch from e9b2f50 to 8f9a535 Compare November 23, 2024 12:17

nyannyacha and others added 6 commits November 23, 2024 12:17

stamp: format

8a29a32

stamp: allow cache dir to be specified through env var

e8b63dd

stamp: apply cargo fmt

67e6d18

Revert "stamp(test): add test command with debug support"

7a0c3eb

This reverts commit d6bf8a7.

stamp: solve merge conflicts

8d0bc66

chore: add aarch64 (osx) snapshots for integration tests

7bc241a

(cherry picked from commit 8f9a535)

nyannyacha force-pushed the feat-cache-api branch from 8f9a535 to 7bc241a Compare November 23, 2024 13:20

laktek approved these changes Nov 24, 2024


laktek merged commit 13ee2a2 into supabase:main Nov 24, 2024
3 checks passed

github-actions bot added the released label Nov 24, 2024

This was referenced Nov 24, 2024

fix: bump edge-runtime to 1.64.0 supabase/cli#2910

Closed

fix: bump edge-runtime to 1.64.1 supabase/cli#2911

Merged

kallebysantos deleted the feat-cache-api branch November 26, 2024 21:54
