AISDs: Devices with superpowers

AI Super Devices (AISDs) are memory and storage devices with the superpower of parallel computation, a critical step toward accelerating future AI, HPC, and database workloads on memory-centric, domain-specific architectures. The essential raw ingredients of the AISD recipe are:
  • a fifth-generation or later PCI Express (PCIe 5+) interface supporting the new Compute Express Link (CXL) protocols, especially CXL.mem and, optionally, CXL.cache;
  • on-board compute capable of Near-Data Processing, which offloads data-intensive (in particular, data-movement-intensive) operations from CPUs and GPUs, where such operations pollute oversubscribed caches, TLBs, and interfaces with poorly reused streaming data; and, finally,
  • device-side software that lets AISDs work better with CPUs and GPUs and, more importantly, with each other, so that parallel computations such as sorting, shuffling, and transposing can be offloaded from large groups of CPUs or GPUs to a large group of devices holding sharded data.

Consider the case of Large Language Model (LLM) inference backed by Retrieval-Augmented Generation (RAG). As generative applications seeking to avoid hallucination shift their LLMs toward “learning to learn” capabilities and away from memorizing irrelevant facts from public data sources, curated unstructured and structured enterprise content, exposed as embeddings stored in vector databases, takes on greater relevance. These extracted and stored embeddings are searched by Approximate Nearest Neighbor Search (ANNS) algorithms using distance functions. For vector databases holding from a few terabytes to tens of terabytes of data, CPUs and GPUs will be able to offload Top-K ANNS operations to a single AISD. For larger use cases dealing with petabytes to exabytes of context data, the data will be partitioned into objects and collections spanning perhaps hundreds to thousands of devices. AISDs used in these settings will be able to collectively offload massive index lookup and index build operations from the CPUs and GPUs running generative AI tasks such as token generation with Transformer-style neural networks.
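To make the sharded Top-K idea concrete, here is a minimal sketch in Python of how a query might fan out to several devices and how their partial results could be merged. It is an illustration under stated assumptions, not an AISD API: the function names (shard_top_k, merge_top_k, sharded_top_k) are hypothetical, each "device" is simply an in-memory list of (id, vector) pairs, and an exact brute-force scan with Euclidean distance stands in for a real ANNS index.

```python
import heapq
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Hit = Tuple[float, int]  # (distance, vector id)


def l2_distance(a: Vector, b: Vector) -> float:
    """Euclidean distance, one common choice of distance function."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def shard_top_k(shard: List[Tuple[int, Vector]], query: Vector, k: int) -> List[Hit]:
    """Work a single device would do locally (hypothetical).

    Exact scan over the shard; a real device would likely use an
    approximate index instead. Returns up to k hits, nearest first.
    """
    return heapq.nsmallest(k, ((l2_distance(vec, query), vid) for vid, vec in shard))


def merge_top_k(partials: List[List[Hit]], k: int) -> List[Hit]:
    """Combine per-shard results into a global top-k, nearest first."""
    return heapq.nsmallest(k, (hit for partial in partials for hit in partial))


def sharded_top_k(shards: List[List[Tuple[int, Vector]]], query: Vector, k: int) -> List[Hit]:
    """Fan the query out to every shard, then merge the partial results.

    The loop is sequential here; on real hardware each shard would be
    searched in parallel on the device that holds it.
    """
    partials = [shard_top_k(shard, query, k) for shard in shards]
    return merge_top_k(partials, k)


if __name__ == "__main__":
    # Toy data: five 2-D embeddings spread over three "devices".
    shards = [
        [(0, [0.0, 0.0]), (1, [5.0, 5.0])],
        [(2, [1.0, 0.0]), (3, [9.0, 9.0])],
        [(4, [0.0, 2.0])],
    ]
    for distance, vid in sharded_top_k(shards, [0.0, 0.0], k=3):
        print(vid, round(distance, 3))
```

The point of the design is that only k candidates per shard travel back to the host, rather than the embeddings themselves, which is the kind of data-movement saving that near-data processing aims for.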

We are actively researching AISDs and their applications now.

 
