Token Cache

Hard

A language model inference service caches previously computed token sequences to avoid redundant computation. The cache uses a compressed prefix tree (radix tree) to store integer token sequences efficiently. Sequences that share a common prefix are merged into shared nodes, reducing memory usage.
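To make the prefix-sharing idea concrete, here is a minimal sketch of a radix tree over integer token sequences. It is illustrative only: the names `Node`, `RadixTreeSketch`, `insert`, `longest_prefix_len`, and `node_count` are hypothetical and are not taken from the problem statement, and the operations the problem actually requires may differ.

```python
from typing import Dict, List, Optional


class Node:
    """One radix-tree node; `edge` is the run of tokens leading into it."""

    def __init__(self, edge: Optional[List[int]] = None, terminal: bool = False) -> None:
        self.edge: List[int] = edge or []
        self.children: Dict[int, "Node"] = {}  # keyed by the first token of each child's edge
        self.terminal = terminal  # True if a stored sequence ends exactly here


def _common_len(edge: List[int], tokens: List[int], start: int) -> int:
    """Length of the common prefix of `edge` and `tokens[start:]`."""
    k = 0
    while k < len(edge) and start + k < len(tokens) and edge[k] == tokens[start + k]:
        k += 1
    return k


class RadixTreeSketch:
    """Hypothetical sketch, not the required TokenCache interface."""

    def __init__(self) -> None:
        self.root = Node()

    def insert(self, tokens: List[int]) -> None:
        node, i = self.root, 0
        while i < len(tokens):
            child = node.children.get(tokens[i])
            if child is None:
                # Nothing shared from here on: store the remainder as one edge.
                node.children[tokens[i]] = Node(tokens[i:], terminal=True)
                return
            k = _common_len(child.edge, tokens, i)
            if k < len(child.edge):
                # Partial match: split the edge so the shared part becomes its own node.
                mid = Node(child.edge[:k])
                child.edge = child.edge[k:]
                mid.children[child.edge[0]] = child
                node.children[tokens[i]] = mid
                child = mid
            node, i = child, i + k
        node.terminal = True

    def longest_prefix_len(self, tokens: List[int]) -> int:
        """Count how many leading tokens of `tokens` are already in the tree."""
        node, i = self.root, 0
        while i < len(tokens):
            child = node.children.get(tokens[i])
            if child is None:
                break
            k = _common_len(child.edge, tokens, i)
            i += k
            if k < len(child.edge):
                break  # diverged (or ran out of tokens) partway along an edge
            node = child
        return i

    def node_count(self) -> int:
        """Total number of nodes, including the root."""
        stack, count = [self.root], 0
        while stack:
            cur = stack.pop()
            count += 1
            stack.extend(cur.children.values())
        return count
```

In this sketch, inserting `[1, 2, 3, 4]` and then `[1, 2, 5]` splits the first edge at the shared prefix `[1, 2]`, so both sequences reuse one node for that prefix instead of storing it twice.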

Design a data structure called TokenCache to store and query token sequences using this structure. ...