security
Runyard Team
@runyard_dev
9 min read

Tags

#security #litellm #supply-chain #ai-security #pypi
Runyard.dev — Find AI Models That Run on Your Hardware

LiteLLM Supply Chain Attack: What Happened and How to Fix It

On March 24, 2026, two versions of the LiteLLM Python package on PyPI were silently replaced with malware. The attack stole API keys, SSH keys, cloud credentials, and cryptocurrency wallets — then installed a persistent backdoor that phoned home every 50 minutes. Because LiteLLM is downloaded ~3.4 million times per day and sits between your application and every AI provider you use, the blast radius was enormous. This is what happened, who did it, and exactly what you need to do.

What Is LiteLLM?

LiteLLM is a Python proxy library that provides a unified interface to over 100 AI model providers — OpenAI, Anthropic, Google, Azure, AWS Bedrock, and more. It's used in AI agents, chatbots, developer tooling, and MCP servers. Because it handles all your API keys in one place, a compromise of LiteLLM means a compromise of every AI credential on the affected machine.

The Attack: How It Started

The threat actor, tracked as TeamPCP, did not attack LiteLLM directly. They first compromised Trivy — a widely used open-source security scanner that runs inside many CI/CD pipelines, including LiteLLM's own. Stolen credentials from Trivy gave TeamPCP access to LiteLLM's PyPI publishing pipeline. Two poisoned versions — 1.82.7 and 1.82.8 — were uploaded directly to PyPI. No corresponding GitHub release tags existed for either version, which was the key forensic indicator that something was wrong.

Red flag to watch for: if a PyPI release has no matching GitHub tag, treat it as suspicious. Legitimate maintainers almost always cut a GitHub release before publishing to PyPI.
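This cross-check can be scripted against PyPI's JSON API and GitHub's tags API, both of which are public. A sketch, with the caveats that the helper names are illustrative, the GitHub tags endpoint paginates (100 per page here) and rate-limits unauthenticated callers, and tag naming conventions vary (this only normalizes a leading "v"):

```python
import json
import urllib.request

def fetch_json(url):
    """Fetch and decode a JSON document (network access required)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)

def untagged_releases(pypi_versions, github_tags):
    """Return PyPI versions that have no matching GitHub tag.

    Tags are normalized by stripping a leading 'v', so 'v1.82.6'
    matches '1.82.6'. A non-empty result is the red flag."""
    normalized = {t.lstrip("v") for t in github_tags}
    return sorted(v for v in pypi_versions if v not in normalized)

def check_package(pkg, repo):
    """Cross-check a PyPI package against its GitHub repo's tags."""
    pypi = fetch_json(f"https://pypi.org/pypi/{pkg}/json")
    tags = fetch_json(f"https://api.github.com/repos/{repo}/tags?per_page=100")
    return untagged_releases(pypi["releases"].keys(), [t["name"] for t in tags])
```

Running check_package("litellm", "BerriAI/litellm") lists any PyPI versions that lack a matching tag; during the attack window, that list would have included 1.82.7 and 1.82.8.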

The Three-Stage Malware

Stage 1 — The Silent Launcher (.pth file)

The malicious package installed a file named litellm_init.pth into Python's site-packages directory. At every interpreter startup, Python's site machinery processes the .pth files it finds there and executes any line beginning with "import" — no import of litellm required, no explicit trigger needed. The moment any Python process started on the infected system, the payload ran. This included IDE terminals, background services, cron jobs, and MCP servers auto-loaded by tools like Cursor.
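You can see the mechanism for yourself with a benign stand-in. This sketch writes a .pth file into a throwaway directory and asks the site module to process it the same way interpreter startup processes site-packages — the import line runs immediately, with no explicit import of anything:

```python
import os
import site
import tempfile

# Any line in a .pth file that starts with "import " is executed by
# Python's site machinery when the directory is processed.
demo_dir = tempfile.mkdtemp()
pth_path = os.path.join(demo_dir, "demo_launcher.pth")
with open(pth_path, "w") as f:
    # Benign stand-in for the malicious payload: set a marker env var.
    f.write('import os; os.environ.setdefault("PTH_DEMO_EXECUTED", "1")\n')

# site.addsitedir() processes .pth files exactly as interpreter
# startup does for site-packages directories.
site.addsitedir(demo_dir)
print(os.environ.get("PTH_DEMO_EXECUTED"))  # -> 1

# Tidy up the demo file.
os.remove(pth_path)
os.rmdir(demo_dir)
```

The real litellm_init.pth used this exact startup hook; the only difference is that its import line chained into the Stage 2 payload instead of setting an environment variable.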

Stage 2 — Credential Harvesting

Stage 2 performed a full system enumeration and exfiltrated everything: environment variables, SSH private keys and configs, .env files, AWS/GCP/Azure credentials and metadata endpoint tokens, Kubernetes service account tokens and secrets, Terraform and Helm artifacts, CI/CD secrets, GitHub tokens, shell history, .gitconfig, and cryptocurrency wallet files. Data was encrypted with AES-256-CBC under a random session key; the session key was then wrapped with a 4096-bit RSA public key, and the package was sent to attacker-controlled domains that spoofed LiteLLM's own branding.

exfiltration-targets.txt
# What Stage 2 collected from infected systems:
Environment variables (API keys, tokens, passwords)
~/.ssh/id_rsa, id_ed25519, known_hosts, config
~/.aws/credentials, ~/.aws/config
~/.config/gcloud/credentials.db
~/.kube/config + Kubernetes service account tokens
Terraform state files (.tfstate)
CI/CD secrets (GitHub Actions, GitLab CI, CircleCI)
Shell history (.bash_history, .zsh_history)
.env files (all directories, recursive)
Crypto wallet files (Bitcoin, Ethereum, Solana)
.gitconfig (may contain tokens)

# Exfiltrated to attacker-controlled C2:
models.litellm.cloud  (spoofing LiteLLM's domain)
checkmarx.zone/raw    (IP: 83.142.209.11)
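A quick way to gauge your personal blast radius is to check which of these files actually exist on your machine — everything present was within Stage 2's reach. A minimal sketch (the path list is abbreviated from the targets above, and the helper name is illustrative):

```python
from pathlib import Path

# High-value paths targeted by Stage 2 (non-exhaustive; see list above).
TARGETS = [
    "~/.ssh/id_rsa",
    "~/.ssh/id_ed25519",
    "~/.ssh/config",
    "~/.aws/credentials",
    "~/.aws/config",
    "~/.config/gcloud/credentials.db",
    "~/.kube/config",
    "~/.bash_history",
    "~/.zsh_history",
    "~/.gitconfig",
]

def present_targets(paths):
    """Return the subset of paths that exist on this machine --
    i.e. what Stage 2 could have exfiltrated from it."""
    return [p for p in paths if Path(p).expanduser().exists()]

for p in present_targets(TARGETS):
    print("at risk:", p)
```

Every path this prints corresponds to a credential you should rotate if you installed a compromised version.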

Stage 3 — Persistence and Backdoor

Stage 3 installed a persistent backdoor as ~/.config/sysmon/sysmon.py and registered it as a systemd service. Every 50 minutes it polled a remote C2 server for new payloads and commands. The malware also attempted to create privileged pods in any discovered Kubernetes cluster (kube-system namespace), turning a single compromised developer machine into a foothold in production infrastructure.
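To audit for the Kubernetes half of this, look for privileged pods in kube-system that you did not deploy. A hedged sketch: the helper names are illustrative, the filter only inspects spec.containers (it ignores initContainers and ephemeral containers), and audit_kube_system assumes kubectl is installed and has cluster access:

```python
import json
import subprocess

def privileged_pods(pod_list):
    """Given parsed `kubectl get pods -o json` output, return names of
    pods with at least one privileged container -- the kind of pod
    Stage 3 tried to create in kube-system."""
    flagged = []
    for pod in pod_list.get("items", []):
        for c in pod.get("spec", {}).get("containers", []):
            if c.get("securityContext", {}).get("privileged"):
                flagged.append(pod["metadata"]["name"])
                break
    return flagged

def audit_kube_system():
    """Run kubectl against the kube-system namespace and flag
    privileged pods for manual review."""
    out = subprocess.run(
        ["kubectl", "get", "pods", "-n", "kube-system", "-o", "json"],
        capture_output=True, text=True, check=True,
    ).stdout
    return privileged_pods(json.loads(out))
```

Note that legitimate system components (CNI plugins, some node agents) also run privileged, so treat the output as a review list, not a verdict.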

Attack Timeline

  • March 23, 2026 — TeamPCP compromises the KICS GitHub Action (Checkmarx), hijacking 35 release tags
  • March 24, ~10:39 UTC — Malicious litellm 1.82.7 appears on PyPI
  • March 24, ~10:52 UTC — Malicious litellm 1.82.8 published to PyPI
  • March 24, ~13:03 UTC — First GitHub issue reporting the attack is closed as "not planned" (maintainer account likely also compromised)
  • March 24, ~16:00 UTC — PyPI quarantines the package after independent researcher reports
  • March 24, ~20:15 UTC — Compromised versions yanked from PyPI
  • Total live exposure window: approximately 6–9 hours on PyPI
Malicious Open-Source Packages Discovered per Year

  • 2020: 16K packages
  • 2021: 29K packages
  • 2022: 70K packages
  • 2023: 245K packages
  • 2024: 513K packages

According to Sonatype's 2024 State of the Software Supply Chain report, 512,847 malicious open-source packages were discovered in a single year — a 156% year-over-year increase. Malicious packages on PyPI, npm, and other registries grew 1,300% between 2020 and 2023 (ReversingLabs). The LiteLLM attack is not an outlier; it is part of an accelerating pattern targeting AI infrastructure specifically.

Are You Affected?

You are potentially affected if you installed litellm between approximately March 24, 2026 10:39 UTC and March 24, 2026 20:15 UTC without a pinned version, or if your environment auto-updated to the latest release. Check which version you have:

terminal (bash)
# Check installed litellm version
pip show litellm

# If you see 1.82.7 or 1.82.8 — you are affected
# Safe version: 1.82.6 or earlier

# Check for the malicious .pth launcher
find /usr -name "litellm_init.pth" 2>/dev/null
find ~/.local -name "litellm_init.pth" 2>/dev/null
python -c "import site; print(site.getsitepackages())"
# Then check those directories for litellm_init.pth

# Check for the persistence backdoor
ls -la ~/.config/sysmon/
systemctl status sysmon 2>/dev/null
cat ~/.config/systemd/user/*.service 2>/dev/null | grep sysmon

Immediate Response: If You Were Affected

If you installed either compromised version, treat the entire system as fully compromised. Do not attempt to patch around the issue — the attacker had complete credential access and may have used those credentials before you detected the infection.

  1. Isolate the machine from the network if possible while you work through the response steps
  2. Remove the malicious .pth file: find all site-packages directories and delete litellm_init.pth
  3. Remove the persistence backdoor: delete ~/.config/sysmon/sysmon.py and disable the systemd service
  4. Audit Kubernetes for unauthorized pods in the kube-system namespace: kubectl get pods -n kube-system
  5. Rotate every credential that existed on the machine: SSH keys, AWS/GCP/Azure keys, Kubernetes tokens, GitHub tokens, database passwords, CI/CD secrets, .env file variables, API keys for all AI providers, cryptocurrency wallet keys
  6. Check outbound network logs for connections to models.litellm.cloud or 83.142.209.11 (checkmarx.zone)
  7. Consider rebuilding the system from a clean state rather than attempting full remediation in place — the backdoor may have delivered additional payloads
  8. Downgrade litellm to a safe version: pip install litellm==1.82.6
cleanup.sh (bash)
# 1. Remove malicious .pth file from all site-packages
python -c "import site; print('\n'.join(site.getsitepackages() + [site.getusersitepackages()]))" | while read dir; do
  rm -f "$dir/litellm_init.pth"
done

# 2. Remove persistence backdoor
rm -rf ~/.config/sysmon/
systemctl --user stop sysmon 2>/dev/null
systemctl --user disable sysmon 2>/dev/null

# 3. Downgrade to safe version
pip install litellm==1.82.6

# 4. Verify clean install
pip show litellm | grep Version
python -c "import litellm; print('litellm loaded cleanly')"
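As a final check after cleanup, you can verify the installed version from Python itself rather than trusting pip output alone. A small sketch using the standard importlib.metadata module (helper names are illustrative):

```python
from importlib.metadata import version, PackageNotFoundError

# The two versions poisoned in this attack.
COMPROMISED = {"1.82.7", "1.82.8"}

def is_compromised(ver):
    """True if the given version string is a known-bad release."""
    return ver in COMPROMISED

def check_installed(pkg="litellm"):
    """Report whether the installed package is a known-bad version."""
    try:
        v = version(pkg)
    except PackageNotFoundError:
        return f"{pkg} not installed"
    if is_compromised(v):
        return f"{pkg} {v}: COMPROMISED -- work through the response steps"
    return f"{pkg} {v}: not a known-bad version"

print(check_installed())
```

Note that "not a known-bad version" only clears you of these two releases; it does not prove the machine is clean if either one was ever installed.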

Long-Term Fixes: Stop It Happening Again

  • Pin all dependencies with exact versions and hashes — use pip-compile, Poetry lockfiles, or pip install with --require-hashes
  • Never use uvx, pipx run, or similar tools to auto-pull the latest version of packages that have access to secrets
  • Verify PyPI releases have a corresponding GitHub tag before installing — a missing tag is a red flag
  • Use the official LiteLLM Proxy Docker image for production — it uses pinned dependencies and was not affected by this attack
  • Implement network egress filtering so unexpected outbound connections to unknown domains are blocked or alerted
  • Audit your CI/CD pipeline tools (security scanners, linters, test runners) — these run in privileged contexts and are high-value targets, exactly as Trivy was here
  • Consider running AI tooling in isolated environments (containers, VMs) that do not have direct access to SSH keys or cloud credentials
requirements.txt
# Pin litellm with a hash to prevent silent upgrades
# Generate hashes with: pip download litellm==1.82.6 -d /tmp/pkg && pip hash /tmp/pkg/litellm-*.whl

litellm==1.82.6 \
    --hash=sha256:<insert-hash-here>

# Or use pip-compile to lock all transitive dependencies:
# pip install pip-tools
# pip-compile requirements.in --generate-hashes

The LiteLLM Docker proxy image (ghcr.io/berriai/litellm) was NOT affected by this attack because it pins its own dependencies internally. If you are running LiteLLM in production, the Docker image is now the recommended deployment method.

What This Means for Local LLM Users

LiteLLM is commonly used as a local proxy to route between different AI providers — including local models served by Ollama. If you have been using LiteLLM to manage multiple providers alongside locally-running models, this attack could have exposed every API key in your environment. Tools like runyard.dev help you identify which models can run entirely on your own hardware without ever sending data to an external API — reducing the attack surface by eliminating the need for multiple cloud provider keys in the first place. The fewer secrets on disk, the less there is to steal.

OWASP ranked Supply Chain Vulnerability at #3 in its LLM Top 10 for 2025 — and this attack is a textbook example of why. As AI tooling matures, the libraries that sit between your application and AI providers will become increasingly attractive targets. Treat every AI dependency with the same scrutiny you would apply to a payment or auth library: pin it, audit it, and isolate its access.

© 2026 RUNYARD.DEV — All rights reserved.