Token Cache
Hard
A language model inference service caches previously computed token sequences to avoid redundant computation. The cache uses a compressed prefix tree (radix tree) to store integer token sequences efficiently. Sequences that share a common prefix are merged into shared nodes, reducing memory usage.
Design a data structure called TokenCache to store and query token sequences using this structure.
...
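The full problem statement is truncated above, so the required API is unknown. Below is a minimal sketch of the storage structure the description names: a radix tree over integer token sequences, where sequences sharing a prefix share nodes and each edge carries a compressed run of tokens. The operations `insert` and `longest_cached_prefix` are illustrative assumptions, not the problem's actual interface.

```python
class RadixNode:
    def __init__(self, tokens=()):
        self.tokens = list(tokens)  # compressed edge label: a run of token ids
        self.children = {}          # first token of a child's edge -> RadixNode
        self.is_end = False         # True if a cached sequence ends here


class TokenCache:
    def __init__(self):
        self.root = RadixNode()

    def insert(self, seq):
        """Store an integer token sequence, splitting edges on divergence."""
        node = self.root
        i = 0
        while i < len(seq):
            first = seq[i]
            if first not in node.children:
                # No shared prefix beyond this point: attach the remainder.
                child = RadixNode(seq[i:])
                child.is_end = True
                node.children[first] = child
                return
            child = node.children[first]
            label = child.tokens
            # Length of the common prefix of the edge label and the rest of seq.
            j = 0
            while j < len(label) and i + j < len(seq) and label[j] == seq[i + j]:
                j += 1
            if j < len(label):
                # Split the edge at the divergence point.
                mid = RadixNode(label[:j])
                mid.children[label[j]] = child
                child.tokens = label[j:]
                node.children[first] = mid
                child = mid
            i += j
            node = child
        node.is_end = True

    def longest_cached_prefix(self, seq):
        """Length of the longest prefix of seq that is a cached sequence."""
        node = self.root
        i = 0
        best = 0
        while i < len(seq) and seq[i] in node.children:
            child = node.children[seq[i]]
            label = child.tokens
            if seq[i:i + len(label)] != label:
                break  # seq diverges mid-edge
            i += len(label)
            node = child
            if node.is_end:
                best = i
        return best
```

For example, after inserting `[1, 2, 3, 4]` and `[1, 2, 5]`, the tree holds a shared node for the prefix `[1, 2]` with two branches, and `longest_cached_prefix([1, 2, 3, 4, 9])` returns 4.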