Running a 13-billion-parameter language model without ever touching the cloud is the headline pitch for the Firefly AIBOX-9075, a fanless industrial box built around Qualcomm's Dragonwing IQ-9075. Its Hexagon Tensor Processor reaches up to 200 TOPS at INT8 and, per Firefly's own figures, pushes roughly 22 tokens per second on Llama 2-7B and about 12 tokens per second on 13B-class models. Open weights like DeepSeek-R1, Gemma, Llama, Qwen, and Phi are all listed as supported, alongside Stable Diffusion for local image generation, making this a self-contained inference appliance rather than a thin client.

The IQ-9075 silicon underneath pairs an octa-core Kryo Gen 6 cluster (Cortex-A78C-based) clocked up to 2.36 GHz with four Cortex-R52 real-time cores at 1.85 GHz, an Adreno 663 GPU good for 1.2 TFLOPS of FP32, and an Adreno VPU 765 that decodes a single 8Kp60 stream or up to 32 simultaneous 1080p30 feeds. Firefly fits it with 36GB of LPDDR5 with ECC, 128GB of UFS 2.2 storage, and an M.2 2280 slot wired for PCIe 4.0 x4 NVMe. The GPU exposes Vulkan 1.2, OpenGL ES 3.2, and OpenCL 2.0, so the compute stack is not locked to a single proprietary path.

Software is where the open-source angle firms up. The AIBOX-9075 ships with Ubuntu and Yocto Linux on the Qualcomm Linux stack, and runs Qualcomm's QNN inference runtime in addition to TensorFlow, TensorFlow Lite, PyTorch, ONNX, Keras, and Caffe. The timing is notable: Qualcomm has just committed to an upstream-first Qualcomm Linux 2.0 for its Dragonwing IoT platforms, with a board support package tracking mainline rather than a vendor fork, which should make long-term kernel maintenance on hardware like this considerably less painful. The IQ-9075 is explicitly named as a covered platform in the early Qualcomm Linux 2.0 documentation, with Linux 6.18 LTS as the kernel base and the tree publicly hosted under the qualcomm-linux organization on GitHub. Most platform drivers target upstream inclusion or reside there, though the AI runtime, camera pipeline, graphics stack, and hypervisor remain as closed-source binary packages. Firefly's wiki documents the getting-started flow, debug console, watchdog and RTC usage, and the AidLux, AidLite, AidGen, and AidStream development tools.

The I/O reflects the box's vision and robotics targets. Eight GMSL2 camera inputs arrive over two 4-pin Mini FAKRA connectors, joined by HDMI 2.0 output at 4K60, two USB 3.0 Type-A ports, and a USB-C debug console. A 36-pin Phoenix terminal block breaks out two RS485 lines, two CAN-FD channels, and twelve opto-isolated digital I/O, while networking covers dual 2.5GbE with TSN plus optional Wi-Fi 6 and 4G/5G via internal slots. The aluminum chassis measures 110.2 x 102.25 x 72.0 mm (4.3 x 4.0 x 2.8 inches), rates for -40°C to 85°C (-40°F to 185°F) operation, and draws 12W under normal load against a 60W peak.

The AIBOX-9075 is available direct from Firefly starting at $1,250 (€1,150). Optional add-ons include a Wi-Fi module at $15 (€14), an EC20 4G-LTE module at $39 (€36), and an M.2 2280 SSD at $79 (€73).