Ollama release notes, breaking changes, and upgrade notes.

Get up and running with large language models locally. StackPulse turns upstream changelogs into scannable summaries with risky changes, deprecations, migration notes, and source links.

releases: 11
breaking: 0
security: 0
deprecated: 0
migrations: 0


what stackpulse tracks

Ollama releases from GitHub

StackPulse watches Ollama release notes and keeps the original source link close to every summary.

upgrade risk

Breaking changes and deprecations

Risky changes are separated from normal feature notes so you can scan upgrade impact before changing production dependencies.

migration notes

Source-backed next steps

Migration steps and recommended actions are only shown when the upstream release notes support them.

# latest_releases

source-backed
v0.30.11 | medium | feature | Jun 25, 2026

v0.30.11

This release adds automatic installation of Claude Code and OpenCode when they are missing, improves Vulkan GPU classification on Windows hybrid graphics systems, unifies and tunes speculative decoding in MLXRunner, and updates the documentation.

affected

Users of Claude Code or OpenCode, and users on Windows hybrid graphics systems, will benefit from these changes.

action

Update to get automatic installation of Claude Code and OpenCode and improved Windows graphics handling.

release_signals
+Auto-install Claude Code when missing
+Auto-install OpenCode when missing
+Improved Vulkan classification on Windows hybrid graphics
+Unified and tuned speculative decoding in MLXRunner
+Added NVIDIA CUDA sm_86 architecture support in Windows presets
view source on github->
v0.30.11-rc1 | medium | feature | prerelease | Jun 25, 2026

v0.30.11

This release focuses on improving auto-install capabilities, fixing GPU classification issues, and enhancing model performance with speculative decoding and memory management improvements.

affected

Users relying on auto-install features, GPU classification, or memory management will benefit from these changes.

action

Update to this version if you use Claude Code or opencode, or if you are affected by the GPU classification issues.

release_signals
+Auto-install Claude Code and opencode when missing
+Added thinking capability detection to opencode
+Improved speculative decoding in mlxrunner
+Added sm_86 architecture to cuda_v13_windows preset
+Default qwen2.5vl window attention metadata
view source on github->
v0.30.11-rc0 | medium | feature | prerelease | Jun 25, 2026

v0.30.11

This release focuses on improving launch capabilities, fixing Vulkan classification issues on Windows, and enhancing CUDA support. It also introduces auto-installation features for Claude Code and opencode, along with optimizations for speculative decoding and memory management.

affected

Users relying on Windows hybrid graphics or requiring auto-installation of Claude Code and opencode will benefit from this release.

release_signals
+Added thinking capability detection to opencode
+Auto-install Claude Code when missing
+Auto-install opencode when missing
+Fixed inverted iGPU/dGPU Vulkan classification on Windows hybrid graphics
+Unified and tuned speculative decoding
view source on github->
v0.30.10 | medium | feature | Jun 17, 2026

v0.30.10

Added support for Command A and North family models on Apple Silicon using the MLX engine. Updated the underlying llama.cpp engine.

affected

Users running Command A or North family models on Apple Silicon hardware are affected.

action

Update to take advantage of improved Apple Silicon support.

release_signals
+Command A and North family models now run on Apple Silicon with the MLX engine
view source on github->
v0.30.9-rc1 | medium | feature | prerelease | Jun 15, 2026

v0.30.9

This release updates the underlying llama.cpp library to version b9637, which may include performance improvements or bug fixes.

affected

Users relying on the llama.cpp library may benefit from potential improvements or fixes.

view source on github->
v0.30.9 | medium | feature | Jun 15, 2026

v0.30.9

This release adds support for the Cohere2Moe architecture, fixes several issues including token output limitations, and improves the LFM2 parser and renderer.

affected

Users leveraging Cohere2Moe architecture or experiencing token output issues will be affected.

action

Upgrade to v0.30.9 to benefit from the new architecture support and bug fixes.

release_signals
+Support for Cohere2Moe architecture
view source on github->
v0.30.9-rc2 | medium | feature | prerelease | Jun 15, 2026

v0.30.9

This release adds support for the Cohere2Moe architecture, fixes several issues including token output limitations in coding agent use cases, and improves the LFM2 parser and renderer.

affected

Users leveraging coding agents or assistants, and those working with Cohere2Moe architecture, are most affected.

action

Update to the latest release to benefit from new features and fixes.

release_signals
+Support for Cohere2Moe architecture
view source on github->
v0.30.8 | medium | feature | Jun 12, 2026

v0.30.8

This release focuses on stability improvements, particularly in MLX inference and prompt caching, along with fixes for provider selection in `ollama launch`.

affected

Users relying on MLX inference or recurrent models may benefit from improved stability and performance.

release_signals
+Improved prompt caching by decoupling it from context shift for better KV cache reuse
+More stable MLX inference with hardened linear and embedding layers
+MLX runner now creates snapshots during prompt processing and speculative decoding for improved reliability
+Improved recurrent model support with per-boundary states from the gated-delta kernels
view source on github->
v0.30.7 | high | feature | Jun 7, 2026

v0.30.7

This release introduces Hermes Desktop, a native desktop app for the Hermes agent that provides a visual interface for managing conversations, integrations, and messaging apps. It also updates the OpenAI-compatible API models list and improves the documentation.

affected

Users of the Hermes agent who want a visual desktop interface will benefit from this release.

action

Run `ollama launch hermes-desktop` to start using Hermes Desktop.

release_signals
+Hermes Desktop is now available via `ollama launch hermes-desktop` with native Windows configuration path support
+OpenAI-compatible API models list now aligns with available model tags
+Added documentation describing the llama.cpp update process
+Updated Zod schema examples to use the native toJSONSchema helper
view source on github->
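
The models-list signal above can be checked from a client. The following is a minimal sketch, not taken from the release notes: it assumes a local Ollama server on the default port 11434, the `/v1/models` path of the OpenAI-compatible API, and an OpenAI-style response with a `data` array of objects carrying an `id`. The helper names `model_ids` and `fetch_model_ids` and the sample model tags are hypothetical.

```python
import json
import urllib.request

# Assumption: a local Ollama server listening on its default port.
BASE_URL = "http://localhost:11434"


def model_ids(payload: dict) -> list[str]:
    """Return the sorted model ids from an OpenAI-style model list payload."""
    return sorted(item["id"] for item in payload.get("data", []))


def fetch_model_ids(base_url: str = BASE_URL) -> list[str]:
    """Fetch /v1/models from a running server and return its model ids."""
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=5) as resp:
        return model_ids(json.load(resp))


if __name__ == "__main__":
    # Illustrative payload only; these model tags are made up for the sketch.
    sample = {
        "object": "list",
        "data": [
            {"id": "llama3.2:latest", "object": "model"},
            {"id": "gemma3:4b", "object": "model"},
        ],
    }
    print(model_ids(sample))
```

Comparing the ids returned by `fetch_model_ids()` before and after an upgrade is one way to see whether the list now matches the tags you have pulled.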
v0.30.6 | high | feature | Jun 5, 2026

v0.30.6

This release introduces Gemma 4 models optimized with Quantization-Aware Training (QAT) for reduced memory usage and improved performance. It also improves quantization in MLX embedding layers on Apple Silicon and adds an integration with Oh My Pi for AI coding assistance.

affected

Users leveraging Gemma 4 models or Apple Silicon devices will benefit from improved performance and memory efficiency.

action

Update to v0.30.6 to take advantage of the new QAT-optimized Gemma 4 models and enhanced quantization on Apple Silicon.

release_signals
+Gemma 4 models optimized with Quantization-Aware Training (QAT)
+Integration with Oh My Pi for AI coding assistance
+MLX embedding layers now use NVFP4 global scale for improved quantization on Apple Silicon
view source on github->
v0.30.5 | medium | feature | Jun 4, 2026

v0.30.5

This release fixes a critical crash issue with `gemma4:12b` on multiple platforms and improves Hermes Desktop integration, including native Windows support.

affected

Users running `gemma4:12b` on x86, CUDA, Linux, or Windows systems are affected by the crash fix.

action

Update to v0.30.5 to resolve the `gemma4:12b` crash issue.

release_signals
+`ollama launch hermes-desktop` skips rebuilding when a packaged desktop app is already installed.
+`ollama launch hermes` now supports native Windows installs via the Hermes PowerShell installer.
+Added Cline CLI integration documentation.
view source on github->
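
An action such as "Update to v0.30.5" only matters for installs older than the fix, and the list above mixes final releases with release candidates. The following is a rough sketch of how tags could be compared, assuming they always look like `v0.30.11` or `v0.30.11-rc1`. The helpers `tag_key` and `includes_fix` are hypothetical and not part of Ollama or StackPulse.

```python
import re

# Assumption: tags look like "v0.30.11" or "v0.30.11-rc1" and nothing else.
TAG_RE = re.compile(r"^v(\d+)\.(\d+)\.(\d+)(?:-rc(\d+))?$")


def tag_key(tag: str) -> tuple:
    """Build a sort key in which a final release follows its release candidates."""
    m = TAG_RE.match(tag)
    if m is None:
        raise ValueError(f"unrecognized tag: {tag!r}")
    major, minor, patch, rc = m.groups()
    # (1, 0) for a final release sorts after (0, n) for any release candidate n.
    rc_rank = (1, 0) if rc is None else (0, int(rc))
    return (int(major), int(minor), int(patch), *rc_rank)


def includes_fix(installed: str, fixed_in: str) -> bool:
    """Return True if the installed tag is at or after the tag carrying the fix."""
    return tag_key(installed) >= tag_key(fixed_in)


if __name__ == "__main__":
    tags = ["v0.30.9-rc1", "v0.30.9", "v0.30.9-rc2"]
    print(sorted(tags, key=tag_key))
    print(includes_fix("v0.30.4", "v0.30.5"))
```

Release candidates get their own rank so that `v0.30.9-rc2` sorts before `v0.30.9`, which plain string comparison would not guarantee.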