Install Litmus Edge on NVIDIA DGX Spark
Overview

The NVIDIA DGX Spark is NVIDIA's compact, high-performance AI workstation, built to deliver data-center-level compute in a desktop form factor. It is designed for developers who want powerful inference, model tuning, and edge analytics capabilities locally.

Key hardware highlights include:

- Grace + Blackwell superchip architecture: the system integrates a Grace CPU and a Blackwell GPU over NVLink-C2C for high-bandwidth, low-latency coherent memory access.
- High compute density: up to 1 petaFLOP of AI compute (with sparsity / mixed precision) in a compact chassis.
- Unified memory: 128 GB of coherent memory spanning CPU and GPU, enabling large models and datasets to reside in shared memory without explicit copies.
- NVMe storage: high-speed NVMe SSDs for fast data throughput and low-latency I/O.
- ConnectX-7 SmartNIC: built-in high-speed networking and inter-Spark connectivity for model scaling across multiple devices.
- Scalability: two DGX Spark units can be linked via peer-to-peer interconnect to support larger model sizes (e.g., up to 405B parameters).

By combining these capabilities with Litmus Edge, you can run edge telemetry, device orchestration, and local data analytics on top of a powerful AI compute substrate. Below is a step-by-step Docker-based setup that runs directly on the DGX Spark.

System Requirements

- Operating system: NVIDIA DGX OS (Ubuntu)
- Architecture: ARM64 (aarch64)
- Docker Engine already installed
- Network: outbound access to pull the Litmus Edge image
- Ports: 8443 (HTTPS) open for dashboard access

Verify GPU and Docker Environment

```
# nvidia-smi should return GPU details when run from a Linux terminal
nvidia-smi
```

Pull and Run the Litmus Edge Docker Image

```
docker pull litmusedge.azurecr.io/litmusedge-std-docker:latest

docker run \
  --name le \
  -d \
  --cap-add=NET_ADMIN \
  --restart unless-stopped \
  -p 8443:443 \
  litmusedge.azurecr.io/litmusedge-std-docker:latest
```

Access the Litmus Edge Dashboard

https://<dgx-ip>:8443

To extend Litmus Edge with local LLM capabilities, you can deploy Ollama on the same DGX Spark. Ollama is a lightweight framework for running LLMs/SLMs locally; it allows you to serve models such as Llama, Mistral, or Gemma directly from your DGX Spark without cloud dependency.

Run the Ollama Container on DGX Spark

```
docker run -d \
  --name litmus-ollama \
  --gpus all \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  ollama/ollama:0.12.5
```

What this does:

- Starts Ollama as a background service
- Allocates all GPUs for model inference
- Mounts persistent storage for model files
- Exposes port 11434 for API access

Once running, Ollama can be integrated with Litmus Edge Analytics to provide local inference, summarization, or AI-assisted decision logic directly on your DGX Spark.

Pull Specific Models in Ollama

Once the container is up, you can pull the AI models of your choice. Connect to the container shell and pull a model:

```
docker exec -it litmus-ollama /bin/bash
ollama pull qwen3:30b
```

List the available models to confirm the download:

```
ollama ls
```

Each model is stored locally and uses GPU acceleration through the DGX Spark hardware. Larger models such as qwen3:30b or deepseek-r1:32b take longer to load but provide higher accuracy and reasoning depth.

Connect Ollama to Litmus Edge Analytics

1. Open the Litmus Edge dashboard and navigate to Analytics > Models.
2. Click Add Connection.
3. In the Provider field, select Ollama API.
4. Enter the DGX Spark's local Ollama endpoint in the URL field: http://<dgx-ip>:11434
5. Specify the model you want to use, for example qwen3:30b.
6. Click Verify, then Save.

Your connection will now appear under AI Models, as shown in the example screenshot.
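If the Verify step fails, it is usually worth confirming from a terminal that the Ollama API is reachable over the network before troubleshooting inside Litmus Edge. A minimal check, assuming the container from the earlier step is running and <dgx-ip> is your DGX Spark's address:

```
# Confirm the Ollama API answers on port 11434 (returns the server version)
curl http://<dgx-ip>:11434/api/version

# List the models Ollama has pulled locally; qwen3:30b should appear here
curl http://<dgx-ip>:11434/api/tags
```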
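You can also send a quick test prompt straight to the API to confirm that the model loads and generates a response; this is only an illustrative request against Ollama's standard /api/generate endpoint, assuming qwen3:30b has already been pulled. Note that the first call to a large model can take a while as it is loaded into memory.

```
# One-off test prompt; "stream": false returns a single JSON response instead of a token stream
curl http://<dgx-ip>:11434/api/generate -d '{
  "model": "qwen3:30b",
  "prompt": "Summarize the purpose of edge analytics in one sentence.",
  "stream": false
}'
```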
Use Ollama Models in Litmus Edge Analytics

Once the connection is verified, Ollama models can be used in Litmus Edge Analytics for real-time inference:

- In the Instances section, add an AI Processor node referencing your connected Ollama model.
- Use inputs from DataHub, preprocess them via JSONata, and feed them into the AI Processor.
- Collect the inference output downstream for visualization or publication back to devices.

Some sample workflows are shown in the example screenshots.

This workflow allows you to execute LLM-based analytics locally, without sending data to external cloud APIs. The DGX Spark GPU accelerates model inference, making it ideal for high-throughput or edge AI deployments. This architecture enables:

- Offline or on-premise generative AI
- Real-time reasoning and fault analysis from edge data
- Full control of AI pipelines without depending on external LLM APIs

Conclusion

You have now deployed Litmus Edge on an NVIDIA DGX Spark using Docker and integrated it with Ollama for local large language model inference. The DGX Spark's advanced hardware architecture, featuring unified memory, high compute density, and seamless CPU-GPU coordination, provides a strong foundation for running AI workloads at the edge.

With Litmus Edge Analytics, you can create intelligent data flows that collect, process, and analyze industrial data in real time. By connecting the Ollama container, Litmus Edge gains direct access to local SLMs such as Gemma, Qwen, Mistral, or DeepSeek, enabling offline, high-performance inference without external dependencies. This setup allows users to design analytics pipelines that combine device data, preprocessing logic, and AI reasoning within the same environment.

The result is a unified, GPU-accelerated platform that supports on-premise model execution, predictive insights, and operational decision-making at the edge. Together, Litmus Edge and DGX Spark form a scalable edge AI ecosystem capable of transforming industrial data into actionable intelligence while keeping processing secure, fast, and completely within your infrastructure.