LemonyOS – AI Offloading for Business Teams. Run or migrate agents and AI apps on-prem. Fast, auditable, and fully in your control. Sign up today.

Plans that adapt as your needs evolve

Lemony Node, OS and Apps included

2-week remote testing period

Try Lemony for free

1x Lemony Node (Remote Access), 5 Users, 3 pre-loaded AI Models and much more.

Get started

Compare Plans

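Base
499 / month

Lemony nodes: 1x
Users: 5 Users / 1 Team
AI models: 1
Storage: 1TB
Tokens/sec per user (Llama 4 17B): ~25
GB RAM (273 GB/s bandwidth): 128
ARMv9 cores: 14
TB NVMe storage: 4
Max power (USB-C): 90W
TOPS (FP8): Up to 1,000
TOPS (FP32): Up to 7.8
Tokens/year (Llama 4 17B): Up to 4B
Stack-to-Scale: 48 GB/s (12-lane PCIe)
Max model size: 90 billion parameters

Extended
999 / month

Lemony nodes: 2x
Users: 25 Users / 5 Teams
AI models: 3
Storage: 4TB
AI model updates: $50/month/node
Support Plan Update: $49/month direct support
Tokens/sec per user (Llama 4 17B): ~32
GB RAM (273 GB/s bandwidth): 256
ARMv9 cores: 28
TB NVMe storage: 8
Max power (USB-C): 180W
TOPS (FP8): Up to 2,000
TOPS (FP32): Up to 15.6
Tokens/year (Llama 4 17B): Up to 26B
Stack-to-Scale: 48 GB/s (12-lane PCIe)
Max model size: 120 billion parameters

Scale
1299 / month

Lemony nodes: 4x
Users: 50 Users / 10 Teams
AI models: 6
Storage: 8TB
AI model updates: $50/month/node
Support Plan Update: $49/month direct support
Tokens/sec per user (Llama 4 17B): ~40
GB RAM (273 GB/s bandwidth): 384
ARMv9 cores: 42
TB NVMe storage: 12
Max power (USB-C): 270W
TOPS (FP8): Up to 3,000
TOPS (FP32): Up to 23.4
Tokens/year (Llama 4 17B): Up to 46B
Stack-to-Scale: 48 GB/s (12-lane PCIe)
Max model size: 140 billion parameters

Enterprise
Custom

Users: Up to 500 users
AI models: 8
Storage: 48TB
Tokens/sec per user (Llama 4 17B): ~60
GB RAM (273 GB/s bandwidth): 512
ARMv9 cores: 56
TB NVMe storage: 16
Max power (USB-C): 360W
TOPS (FP8): Up to 4,000
TOPS (FP32): Up to 31.2
Tokens/year (Llama 4 17B): Up to 95B
Stack-to-Scale: 48 GB/s (12-lane PCIe)
Max model size: 180 billion parameters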
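
As a rough illustration of how the listed prices combine, the sketch below adds the per-node AI model update fee and the direct support fee to a plan's monthly price. It assumes these two fees are optional add-ons billed on top of the plan price, and that plan prices use the same currency as the add-ons; the page does not state either point. The names `Plan`, `PLANS`, and `monthly_cost` are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    price_per_month: int   # plan price as listed on the page
    nodes: int             # "Lemony nodes" value as listed
    addons_listed: bool    # whether the page lists add-on fees for this plan


# Figures copied from the plan comparison; Enterprise is "Custom" and omitted.
PLANS = {
    "Base": Plan("Base", 499, 1, addons_listed=False),
    "Extended": Plan("Extended", 999, 2, addons_listed=True),
    "Scale": Plan("Scale", 1299, 4, addons_listed=True),
}

MODEL_UPDATES_PER_NODE = 50  # "AI model updates", per month per node
DIRECT_SUPPORT = 49          # "direct support", per month


def monthly_cost(plan: Plan, model_updates: bool = False,
                 direct_support: bool = False) -> int:
    """Estimate the monthly total, ASSUMING add-ons are billed on top of the plan."""
    if (model_updates or direct_support) and not plan.addons_listed:
        raise ValueError(f"No add-on fees are listed for the {plan.name} plan")
    total = plan.price_per_month
    if model_updates:
        total += MODEL_UPDATES_PER_NODE * plan.nodes
    if direct_support:
        total += DIRECT_SUPPORT
    return total


if __name__ == "__main__":
    for plan in PLANS.values():
        with_addons = plan.addons_listed
        cost = monthly_cost(plan, model_updates=with_addons,
                            direct_support=with_addons)
        print(f"{plan.name}: {cost} / month")
```

Under those assumptions, Extended with both add-ons comes to 999 + 2 x 50 + 49 = 1,148 per month.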

Lemony Specs

Lemony Node

Power

max. 80W

USB Type C

110/230V Power Adapter (included)

Data

Direct Connect: USB-C Adapter (included)

Multi User: RJ45 Ethernet (included)

Cluster: RJ45 10Gbit/s (included)

Supported AI Models

Open-Source LLMs
Multimodal LLMs
Custom Models
Small Language Models

Default Preloaded

Llama 3.2 11B

Llama 3.1 8B

Llama 3.2 1B

AI Unit

NPU

AI Accelerator Cluster

Size

9.45” x 8.66” x 3.74” (240 x 220 x 95 mm)

Weight

1.68 lbs

760 g

Lemony Node Cluster

1x Lemony Node

TOPS 285
Tokens/sec Llama 3.2 11B: 19
Tokens/year >0.6B
Max. Model Size 90B
RAM @220GB/s 64GB
Max Power 80W
Knowledge Storage 1TB

2x Lemony Node

TOPS 520
Tokens/sec Llama 3.2 11B: 24
Tokens/year >1B
Max. Model Size 140B
RAM @220GB/s 128GB
Max Power 140W
Knowledge Storage 4TB

3x Lemony Node

TOPS 750
Tokens/sec Llama 3.2 11B: 32
Tokens/year >1.5B
Max. Model Size 220B
RAM @220GB/s 192GB
Max Power 200W
Knowledge Storage 6TB

4x Lemony Node

TOPS 980
Tokens/sec Llama 3.2 11B: 39
Tokens/year >2B
Max. Model Size 300B
RAM @220GB/s 256GB
Max Power 260W
Knowledge Storage 8TB
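
The cluster figures above can be compared per node with a few lines of Python. This is a minimal sketch that only restates the numbers listed on this page; `ClusterSpec`, `tops_per_node`, and `speedup_vs_single` are hypothetical helper names, not part of any Lemony API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSpec:
    nodes: int
    tops: int
    tokens_per_sec: int   # Llama 3.2 11B, as listed
    ram_gb: int
    max_power_w: int


# Values copied from the Lemony Node Cluster listing.
CLUSTERS = [
    ClusterSpec(nodes=1, tops=285, tokens_per_sec=19, ram_gb=64, max_power_w=80),
    ClusterSpec(nodes=2, tops=520, tokens_per_sec=24, ram_gb=128, max_power_w=140),
    ClusterSpec(nodes=3, tops=750, tokens_per_sec=32, ram_gb=192, max_power_w=200),
    ClusterSpec(nodes=4, tops=980, tokens_per_sec=39, ram_gb=256, max_power_w=260),
]


def tops_per_node(spec: ClusterSpec) -> float:
    """Listed TOPS divided evenly across the nodes in the cluster."""
    return spec.tops / spec.nodes


def speedup_vs_single(spec: ClusterSpec, single: ClusterSpec) -> float:
    """Listed tokens/sec relative to the single-node figure."""
    return spec.tokens_per_sec / single.tokens_per_sec


if __name__ == "__main__":
    single = CLUSTERS[0]
    for spec in CLUSTERS:
        print(
            f"{spec.nodes}x: {tops_per_node(spec):.0f} TOPS/node, "
            f"{speedup_vs_single(spec, single):.2f}x tokens/sec, "
            f"{spec.max_power_w / spec.nodes:.0f} W/node"
        )
```

With the listed figures, four nodes deliver about 2.05x the single-node tokens/sec (39 vs. 19), while TOPS per node drops from 285 to 245.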

Lemony Application

Update AI Models

4x /year

no Internet connection required

Flash drive (via mail)

Update Lemony Node

4x /year

no Internet connection required

Flash drive (via mail)

Update Lemony App

4-8x /year

Web App: No internet, delivered via flash drive

macOS/Windows App: Internet required only for download

Web App

hosted on Lemony Node

no Internet required

macOS App

hosted on your Mac

no Internet required

Windows App

hosted on your PC/Laptop

no Internet required

Still have questions?

How long does it take to set up Lemony?
Do I need IT support to install Lemony?
Where does my data stay?
Is my data used to train external AI models?
What types of teams use Lemony?
Why do I need more than one Lemony?
Does Lemony support custom AI models?
How do I use other LLMs?
Which LLM engines does Lemony support?
What is an assistant?
How can I use my documents privately without sharing them with my team?
Can I run 90B models?
Is there a way to connect a camera and microphone to the node and capture the data stream?
Can Lemony take meeting notes and encrypt audio?
Does it have Voice RAG?
Can we develop our own app on top of Lemony?

Know Your AI

Fixed-Cost, No Limits on Messages, Tokens, or APIs
Get started