The MCP supply chain is under active attack: audit your plugins now, before a poisoned package reaches your stack.
Top Signal
Malware campaign actively targeting MCP developers via poisoned packages
platform change
HN Front Page, GitHub Trending
Socket.dev disclosed three malware families—mini-shai-hulud, miasma, and hades worms—specifically targeting MCP (Model Context Protocol) developers via malicious npm/PyPI packages. The worms embed nuclear/biological weapons text as an obfuscation technique to evade content-filter analysis. MCP's rapid adoption has made its package ecosystem a priority attack surface: attackers know builders are pulling unvetted community skills. Action now: audit every third-party MCP server in your stack, check for unrecognized publishers, and treat any MCP server you didn't write as untrusted code. NVIDIA/SkillSpector (trending on GitHub today) is a new security scanner specifically for agent skill packages—it detects vulnerabilities and malicious patterns before installation. Run it against your current skill set. This is the first documented targeted campaign against MCP builders; the window for naive trust closes here.
Read more →
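As a starting point for that audit, here is a minimal sketch that lists configured MCP servers and flags any whose package is not on a reviewed allowlist. It assumes the client stores servers under an "mcpServers" key with "command" and "args" fields; the allowlist, the example config, and the helper names are hypothetical. It does not call SkillSpector or any other scanner, and it only tells you what to look at, not whether a package is safe.

```python
import json
from typing import List, Optional, Set, Tuple

# Hypothetical allowlist: packages you have actually reviewed.
TRUSTED_PACKAGES = {"@modelcontextprotocol/server-filesystem"}

# Example config in the assumed "mcpServers" layout; both entries are illustrative.
EXAMPLE_CONFIG = json.loads("""
{
  "mcpServers": {
    "files": {"command": "npx", "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]},
    "notes": {"command": "npx", "args": ["-y", "some-community-mcp-server"]}
  }
}
""")


def package_of(server: dict) -> Optional[str]:
    """Return the first non-flag argument, which for npx/uvx-style launchers is usually the package."""
    for arg in server.get("args", []):
        if not arg.startswith("-"):
            return arg
    return None


def untrusted_servers(config: dict, trusted: Set[str]) -> List[Tuple[str, str]]:
    """List (server name, package or command) pairs that are not on the allowlist."""
    flagged = []
    for name, server in config.get("mcpServers", {}).items():
        # Fall back to the raw command when no package-like argument is present.
        pkg = package_of(server) or server.get("command", "<unknown>")
        if pkg not in trusted:
            flagged.append((name, pkg))
    return flagged


for name, pkg in untrusted_servers(EXAMPLE_CONFIG, TRUSTED_PACKAGES):
    print(f"review before use: {name} -> {pkg}")
```

The match is exact, so a version-pinned argument such as a package name with an @version suffix will be flagged until the allowlist includes that exact string. That errs on the side of review, which fits the "untrusted by default" stance above.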
Fast Signals
EAGLE3 speculative decoding merged into llama.cpp — free 2-4x throughput
platform change
r/LocalLLaMA
EAGLE3, the third-generation speculative decoding algorithm, is now merged into llama.cpp. It uses a small draft model to predict multiple tokens, then verifies in parallel—typically 2–4x throughput gains at zero quality cost. If you run any local llama.cpp inference stack, update and benchmark today; no config changes required.
Link →
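To make the draft-then-verify idea concrete, here is a toy sketch of plain speculative decoding with greedy acceptance. It is not EAGLE3 itself and not llama.cpp's implementation: the "models" are hypothetical deterministic functions from a token list to the next token, and verification is written as a loop where a real engine would use one batched forward pass of the target model. What it does show is why greedy output is unchanged: every token kept is one the target model would have produced anyway.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]


def greedy_decode(target: NextToken, prompt: List[int], n_new: int) -> List[int]:
    """Baseline: one target-model call per generated token."""
    seq = list(prompt)
    for _ in range(n_new):
        seq.append(target(seq))
    return seq


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], n_new: int, k: int = 4) -> List[int]:
    """Draft proposes up to k tokens; target keeps the longest agreeing prefix."""
    seq = list(prompt)
    goal = len(prompt) + n_new
    while len(seq) < goal:
        # 1. The cheap draft model proposes a short continuation.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(seq))):
            proposal.append(draft(seq + proposal))
        # 2. The target checks each proposed position.
        accepted: List[int] = []
        for tok in proposal:
            expected = target(seq + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                # First disagreement: keep the target's token and discard the rest.
                accepted.append(expected)
                break
        else:
            # Every draft token matched; take one extra target token if there is room.
            if len(seq) + len(accepted) < goal:
                accepted.append(target(seq + accepted))
        seq.extend(accepted)
    return seq


# Hypothetical stand-in models: the draft agrees with the target most of the time.
def toy_target(seq: List[int]) -> int:
    return (seq[-1] * 3 + 1) % 11


def toy_draft(seq: List[int]) -> int:
    guess = toy_target(seq)
    return guess if seq[-1] % 4 else (guess + 1) % 11


OUTPUT = speculative_decode(toy_target, toy_draft, [1], n_new=20)
```

The speedup in a real system comes from step 2 being a single parallel pass, so the gain depends on how often the draft agrees with the target. That is why benchmarking on your own workload matters.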
Test-time compute lets local 27B models beat Claude Mythos on code optimization
research to practice
r/LocalLLaMA
A community experiment scaled test-time compute (repeated sampling + best-of-N selection) for Qwen-3.6-27B and Gemma-4-31B, surpassing Claude Mythos on code optimization and speedup benchmarks—no fine-tuning, just running the model multiple times. Concrete signal: TTC is worth wiring into local coding pipelines before reaching for a larger cloud model.
Link →
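A minimal sketch of repeated sampling with best-of-N selection for a code optimization task follows. It is not the experiment's harness: the candidate pool stands in for model samples, and sample, score, and best_of_n are hypothetical names. The scorer rejects incorrect candidates outright and ranks the rest by measured runtime, which is one plausible selection rule among several.

```python
import math
import time
from typing import Callable, Optional

Candidate = Callable[[int], int]


def loop_sum(n: int) -> int:
    total = 0
    for i in range(n):
        total += i
    return total


def closed_form(n: int) -> int:
    return n * (n - 1) // 2


def off_by_one(n: int) -> int:
    return n * (n + 1) // 2  # fast but wrong


# Stand-in for model samples; a real pipeline would call the model at non-zero temperature.
POOL = [loop_sum, off_by_one, closed_form]


def sample(i: int) -> Candidate:
    return POOL[i % len(POOL)]


def score(fn: Candidate, n: int = 100_000) -> float:
    """Return -inf for incorrect candidates, otherwise the negated best-of-3 runtime."""
    if fn(n) != sum(range(n)):
        return -math.inf
    best = math.inf
    for _ in range(3):
        start = time.perf_counter()
        fn(n)
        best = min(best, time.perf_counter() - start)
    return -best


def best_of_n(sample: Callable[[int], Candidate],
              score: Callable[[Candidate], float], n: int) -> Optional[Candidate]:
    """Draw n candidates and keep the highest-scoring one; None if all are rejected."""
    best, best_score = None, -math.inf
    for i in range(n):
        cand = sample(i)
        s = score(cand)
        if s > best_score:
            best, best_score = cand, s
    return best


BEST = best_of_n(sample, score, n=6)
print(BEST.__name__ if BEST else "no correct candidate")
```

Checking correctness before speed matters here: the off-by-one candidate is among the fastest in the pool and would win a pure timing contest.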
MiniMax M3: 428B MoE drops on HuggingFace with novel sparse attention architecture
new tool
r/LocalLLaMA
MiniMax released M3 publicly—a large MoE model with MiniMax Sparse Attention (MSA) for efficient long-context handling. Unsloth GGUF quantized weights are already uploading. Key caveat: llama.cpp currently falls back to dense attention (MSA not yet implemented), making inference significantly slower. Hold benchmarks until the MSA PR lands.
Link →
agentsview: single-binary analytics for 20+ coding agents, 100x faster than ccusage
new tool
GitHub Trending
agentsview is a local-first, no-account binary that browses, searches, and tracks costs across Claude Code, Codex, and 20+ other coding agents. It is positioned as a 100x faster replacement for ccusage. If you're running multiple coding agents and want cost visibility without shipping data to a service, this is a drop-in install.
Link →
WASI 0.3 ships native async — WASM agent sandboxing is now production-viable
platform change
HN Front Page
WASI 0.3 delivers proper async support and component model stability for the WebAssembly System Interface. For builders using WASM as an agent code execution sandbox (untrusted tool calls, multi-tenant pipelines), this is the maturity milestone that makes it worth adopting. Its 225 HN points signal that it is on the ecosystem radar.
Link →
Copy-paste checklist to reduce AI frontend slop in codegen pipelines
workflow
HN Front Page
A practical blog post catalogs specific LLM frontend generation failure modes—layout drift, accessibility shortcuts, spacing inconsistency—and the prompt rules that fix each one. The checklist is system-prompt-ready; drop it directly into any frontend codegen pipeline to tighten output quality without changing the model.
Link →
Radar
SIA: autonomous self-improving AI framework (arXiv-backed)
hexo-ai/SIA (Self-Improving AI) is an open framework for autonomously improving any AI system or agent on a benchmark task with no human intervention, backed by an arXiv paper. Claims are unverified but the pattern—automated iterative pipeline improvement—is the right direction. Watch for benchmark reproducibility before adopting.
Link →
PP-OCRv6 officially released
PaddleOCR's v6 is out. If you're building document ingestion pipelines for RAG or structured extraction, PP-OCR has been among the strongest open-source options for complex layout parsing. Worth benchmarking against your current OCR layer.
Link →
Kimi-K2.7-Code: Moonshot's new coding model on HuggingFace
Moonshot AI dropped Kimi-K2.7-Code on HuggingFace today with minimal announcement. No community benchmarks yet, but Moonshot has shipped competitive prior models. Flag for your next local coding agent eval run.
Link →
Convergence Watch
agent skill security
2 mentions across HN Front Page, GitHub Trending
MCP-targeted malware (Socket.dev) and NVIDIA/SkillSpector (security scanner for agent skills) appearing the same day signals agent supply chain security is crystallizing as its own discipline. As MCP adoption accelerates, expect an ecosystem of auditing and scanning tools to follow this week's incident.
minimax m3
5 mentions across r/LocalLLaMA
Five independent r/LocalLLaMA posts today covered the model release, GGUF quantization, the sparse attention architecture, HuggingChat availability, and inference caveats. The coverage is single-source but high-volume. The key blocker is that MSA is not yet implemented in llama.cpp; watch for that PR before treating M3 as runnable locally.
STALE: Latent Space newest item is >48h old