Supply chain malware in PyTorch Lightning — check your deps before your next training run.
Top Signal
Malware found inside PyTorch Lightning's dependency chain
platform change
HN Front Page, Semgrep
Semgrep discovered a Shai-Hulud themed malicious package injected into PyTorch Lightning's dependency tree — the most widely used high-level training library in the Python ML ecosystem. The malware was embedded in a legitimate-looking transitive dependency, making it invisible to standard pip audits. This is the second major Python ML supply chain attack this month (after Bitwarden CLI via Checkmarx). Builders should immediately audit their pytorch-lightning installations: run `pip show pytorch-lightning` and check pinned versions against Semgrep's advisory. Lock your ML dependencies with pip 26.1's new lockfile support (shipped last week). If you run training jobs in CI/CD, add Semgrep's supply chain rules to your pipeline. The broader signal: AI training infrastructure is becoming a high-value supply chain target.
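The audit step above is easy to script. A minimal sketch using Python's stdlib `importlib.metadata` — the pinned version here is an illustrative placeholder, not the advisory's actual known-good version, which you should take from Semgrep:

```python
from importlib import metadata

def audit_pins(pins: dict[str, str]) -> dict[str, str]:
    """Compare installed package versions against a pinned allowlist.

    Returns {package: installed_version} for every mismatch, with
    "MISSING" for packages that are not installed at all.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = "MISSING"
        if have != wanted:
            mismatches[name] = have
    return mismatches

# The pin below is a placeholder -- use the known-good version
# from Semgrep's advisory, and pin transitive deps too.
print(audit_pins({"pytorch-lightning": "2.4.0"}))
```

Running this in CI and failing the build on any non-empty result gives you a cheap tripwire until a full lockfile workflow is in place.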
Read more →
Fast Signals
IBM Granite 4.1: Apache-2.0 models where 8B matches 32B MoE
new tool
HN Front Page, r/LocalLLaMA
IBM released Granite 4.1 in 3B/8B/30B sizes plus a speech variant, all Apache 2.0. The 8B dense model claims parity with 32B MoE architectures on key benchmarks — if this holds in practice, it's a compelling option for latency-sensitive deployments where you can't afford MoE routing overhead. Worth benchmarking against Qwen3.6 27B for your specific use case.
Link →
Codex CLI adds /goal — OpenAI ships the Ralph loop pattern
workflow
Simon Willison
OpenAI's Codex CLI 0.128.0 adds a `/goal` command that implements the Ralph loop: set a high-level objective and let the agent work toward it autonomously, re-evaluating after each step. Simon Willison notes this is OpenAI's version of the pattern Geoffrey Huntley coined. If you're building agent scaffolds, this is converging toward a standard interaction model worth adopting.
Link →
Honker ships queues, streams, pub/sub, and cron inside SQLite
new tool
HN Front Page
Honker, the Rust extension that brings Postgres NOTIFY/LISTEN semantics to SQLite, now has a full site and HN traction (161 points). It adds durable queues, streams, pub/sub, and a cron scheduler — all inside your SQLite file. For builders running agent workflows or local-first apps, this eliminates the need for Redis or a separate message broker in simpler architectures.
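To see what a durable queue inside a SQLite file amounts to, here is a generic sketch in plain Python `sqlite3` — this is not Honker's actual API, just the underlying jobs-table pattern it packages up:

```python
import sqlite3

# Generic durable-queue pattern in plain SQLite. Single-connection
# sketch: a real broker would claim jobs atomically across workers.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE IF NOT EXISTS queue (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        payload TEXT NOT NULL,
        claimed_at REAL  -- NULL until a worker claims the job
    )
""")

def enqueue(payload):
    conn.execute("INSERT INTO queue (payload) VALUES (?)", (payload,))
    conn.commit()

def dequeue():
    # Take the oldest unclaimed job and mark it claimed.
    row = conn.execute(
        "SELECT id, payload FROM queue "
        "WHERE claimed_at IS NULL ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute(
        "UPDATE queue SET claimed_at = strftime('%s','now') WHERE id = ?",
        (row[0],),
    )
    conn.commit()
    return row[1]

enqueue("embed-doc-42")
print(dequeue())  # -> embed-doc-42
```

The appeal of tools like Honker is getting this plus streams, pub/sub, and cron with proper concurrency semantics, without standing up Redis next to your SQLite file.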
Link →
Copy Fail (CVE-2026-31431): PostgreSQL COPY vulnerability hits 457 HN points
platform change
HN Front Page
A new PostgreSQL vulnerability affecting the COPY protocol is drawing significant attention, with distro developers reportedly not given advance disclosure. If you run Postgres in production — and most of us do — check your version against the CVE advisory and patch immediately. The oss-security thread suggests the disclosure process itself was flawed.
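A quick way to triage a fleet is to compare each server's `SHOW server_version` output against the patched minor release for its major line. The patched minimums below are hypothetical placeholders — substitute the real numbers from the CVE-2026-31431 advisory:

```python
# Patched minor release per supported major line.
# PLACEHOLDER values -- take the real ones from the CVE advisory.
PATCHED_MIN = {15: 9, 16: 5, 17: 1}

def is_patched(version: str) -> bool:
    """version as returned by `SHOW server_version`, e.g. '16.4'."""
    major_s, _, minor_s = version.partition(".")
    major, minor = int(major_s), int(minor_s or 0)
    floor = PATCHED_MIN.get(major)
    if floor is None:
        return False  # unknown or EOL major: treat as unpatched
    return minor >= floor
```

Feed it the version strings from your inventory and page on any `False` before worrying about the disclosure-process drama.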
Link →
llama.cpp now runs CUDA and ROCm backends simultaneously
workflow
r/LocalLLaMA
With `-DGGML_BACKEND_DL=ON`, llama.cpp can load both CUDA and ROCm backends in the same process. This means mixed NVIDIA+AMD GPU setups — previously a pain to configure — just work. Practical for anyone building inference servers with heterogeneous hardware, and a signal that the local inference stack is maturing beyond single-vendor assumptions.
Link →
AMD Halo Box: 128GB unified memory for local inference, shipping June
platform change
r/LocalLLaMA
AMD's in-house Ryzen 395 box with 128GB unified memory is confirmed for June, with photos surfacing and multiple r/LocalLLaMA posts showing strong community interest. 128GB comfortably fits 70B models at 8-bit (unquantized fp16 weights alone would need ~140GB) or heavily quantized 400B-class models — a potential Mac Studio competitor for inference workloads at a price point worth watching.
Link →
Radar
Kanwas: shared context board for teams and agents
Open-source tool for maintaining shared context across human team members and AI agents. Early (56 HN points) but addresses the real problem of context fragmentation in multi-agent workflows.
Link →
llama-swap adds matrix grouping for multi-model serving
New feature lets you define which models can co-reside in VRAM and intelligently unloads based on groups. Solves the practical problem of running STT + LLM + RAG on the same GPU without manual swapping.
Link →
Qwen-Scope: official SAEs for model interpretability
Qwen releases official Sparse Autoencoders for their 3.5 model family. If you're doing model steering, feature extraction, or safety research on Qwen models, these are first-party interpretability tools.
Link →
jcode: lightweight coding agent harness on GitHub Trending
A minimal coding agent harness trending on GitHub. Worth watching as the agent scaffold space fragments — simpler harnesses may win over full frameworks for specific workflows.
Link →
Convergence Watch
qwen 3.6
TRENDING
12 mentions across r/LocalLLaMA, HN Front Page, GitHub Trending
Day 7 of dominance. The community has moved past benchmarks into production optimization: KV cache tuning, long-context coding at 128K+, multi-GPU setups on consumer cards. The 27B model is becoming the de facto local coding model, with the 35B-A3B MoE variant gaining traction for memory-constrained setups.
mistral medium 3.5
6 mentions across HN Front Page, r/LocalLLaMA
Day 2. Open weights at 128B parameters with MLX 4-bit quants (~70GB) already available. Terminal Bench scores being evaluated. The practical question: does it justify 4-5x the VRAM of Qwen3.6-27B for meaningfully better output? Early local runners are testing.
hipfire
TRENDING
3 mentions across r/LocalLLaMA, GitHub Trending
Day 4. AMD-native inference engine now being Dockerized and tested on consumer GPUs (RX 7900 XTX). The AMD local inference ecosystem is maturing from experiment to usable stack, with hipfire as the dedicated engine alongside llama.cpp's ROCm backend.
ml supply chain attacks
TRENDING
3 mentions across HN Front Page, Semgrep
PyTorch Lightning malware follows Bitwarden CLI (Checkmarx) from April 24. AI/ML Python packages are now active supply chain targets. Builders need dependency lockfiles and supply chain scanning as standard practice, not an afterthought.
claude code ecosystem
TRENDING
3 mentions across HN Front Page, Simon Willison
Day 7. The HERMES.md billing-routing issue hit 925 HN points yesterday and is now drawing follow-up coverage. Zig's anti-AI policy and OpenClaw-related billing concerns add to the broader conversation about coding-agent trust and billing transparency.