Token Cache

Hard

A language model inference service caches previously computed token sequences to avoid redundant computation. The cache uses a compressed prefix tree (radix tree) to store integer token sequences efficiently. Sequences that share a common prefix are merged into shared nodes, reducing memory usage.

Design a data structure called TokenCache to store and query token sequences using this structure. ...
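The full interface is truncated above, so the following is only a minimal sketch of the radix-tree idea the statement describes: sequences sharing a prefix are merged into shared edges, and a query reports how many leading tokens of a sequence are already cached. The class name `TokenCache` comes from the statement; the method names `insert` and `match_prefix` are assumptions, not the required API.

```python
class RadixNode:
    def __init__(self):
        # Maps the first token of each outgoing edge to (edge_label, child),
        # where edge_label is a tuple of tokens compressed onto one edge.
        self.edges = {}


class TokenCache:
    def __init__(self):
        self.root = RadixNode()

    def insert(self, tokens):
        """Store a token sequence, splitting edges where prefixes diverge."""
        node = self.root
        i = 0
        while i < len(tokens):
            first = tokens[i]
            if first not in node.edges:
                # No shared prefix here: store the remainder on a single edge.
                node.edges[first] = (tuple(tokens[i:]), RadixNode())
                return
            label, child = node.edges[first]
            # Length of the common prefix of the edge label and tokens[i:].
            j = 0
            while j < len(label) and i + j < len(tokens) and label[j] == tokens[i + j]:
                j += 1
            if j < len(label):
                # Divergence inside the edge: split it at position j.
                mid = RadixNode()
                node.edges[first] = (label[:j], mid)
                mid.edges[label[j]] = (label[j:], child)
                child = mid
            i += j
            node = child

    def match_prefix(self, tokens):
        """Return how many leading tokens of `tokens` are already cached."""
        node = self.root
        i = 0
        while i < len(tokens):
            first = tokens[i]
            if first not in node.edges:
                return i
            label, child = node.edges[first]
            j = 0
            while j < len(label) and i + j < len(tokens) and label[j] == tokens[i + j]:
                j += 1
            i += j
            if j < len(label):
                return i  # query ended or diverged partway along this edge
            node = child
        return i


# Example: two sequences sharing the prefix [1, 2] occupy one shared edge.
cache = TokenCache()
cache.insert([1, 2, 3, 4])
cache.insert([1, 2, 5])
print(cache.match_prefix([1, 2, 3]))  # 3 tokens already cached
print(cache.match_prefix([1, 2, 9]))  # only the shared prefix [1, 2] matches
```

The edge-splitting step in `insert` is what keeps the tree compressed: chains of single-child nodes never arise, so memory scales with the number of distinct branch points rather than total token count.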