v0.6.6

Jan v0.6.6: Enhanced llama.cpp integration and smarter model management

Highlights 🎉

Jan v0.6.6 delivers significant improvements to the llama.cpp backend, introduces Hugging Face as a built-in provider, and brings smarter model management with auto-unload capabilities. This release also includes numerous MCP refinements and platform-specific enhancements.

🚀 Major llama.cpp Backend Overhaul

We’ve completely revamped the llama.cpp integration with:

  • Smart Backend Management: The backend now auto-updates and persists your settings properly
  • Device Detection: Jan automatically detects available GPUs and hardware capabilities
  • Direct llama.cpp Access: Models now interface directly with llama.cpp (previously hidden behind Cortex)
  • Automatic Migration: Your existing models seamlessly move from Cortex to direct llama.cpp management
  • Better Error Handling: Clear error messages when models fail to load, with actionable solutions
  • Per-Model Overrides: Configure specific settings for individual models (see the sketch after this list)
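
For a feel of what these per-model overrides correspond to in llama.cpp itself, here is a minimal sketch using the llama-cpp-python bindings. This is our illustration of the underlying options, not Jan’s internal code, and the model path is a placeholder.

```python
# Illustration only: Jan's per-model overrides map onto llama.cpp options
# like the ones below. Sketched with the llama-cpp-python bindings;
# the model path is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/example-7b-q4_k_m.gguf",  # placeholder GGUF file
    n_ctx=8192,       # per-model context window
    n_gpu_layers=32,  # layers offloaded to the GPU (0 = CPU-only)
    n_batch=512,      # prompt-processing batch size
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
)
print(out["choices"][0]["message"]["content"])
```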

🤗 Hugging Face Cloud Router Integration

Connect to Hugging Face’s new cloud inference service:

  • Access pre-configured models running on various providers (Fireworks, Together AI, and more)
  • Hugging Face handles the routing to the best available provider
  • Simplified setup with just your HF token (see the sketch after this list)
  • The provider is marked non-deletable to prevent accidental removal
  • Note: Direct model ID search in Hub remains available as before
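
Because the router speaks the OpenAI-compatible API, you can also reach it outside Jan. Here is a minimal sketch with the openai Python package, assuming your HF token has inference access; the model ID is just an example, not a Jan default.

```python
# Minimal sketch: calling Hugging Face's cloud router directly via its
# OpenAI-compatible endpoint. The model ID is an example, not a Jan default.
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key="hf_...",  # your Hugging Face token
)

resp = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # HF routes it to a provider
    messages=[{"role": "user", "content": "Say hello"}],
)
print(resp.choices[0].message.content)
```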

🧠 Smarter Model Management

New intelligent features to optimize your system resources:

  • Auto-Unload Old Models: Automatically free up memory by unloading unused models
  • Persistent Settings: Your model capabilities and settings now persist across app restarts
  • Zero GPU Layers Support: Set N-GPU Layers to 0 for CPU-only inference
  • Memory Calculation Improvements: More accurate memory usage reporting (a back-of-envelope example follows this list)
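
To see why auto-unloading idle models pays off, note that a loaded model occupies roughly its weight file plus a KV cache that grows with context length. A back-of-envelope sketch with illustrative Llama-3-8B-style numbers (not Jan’s actual calculation):

```python
# Back-of-envelope KV-cache estimate for a Llama-3-8B-style model.
# Illustrative numbers only; Jan's memory reporting accounts for more.
n_layers   = 32    # transformer layers
n_kv_heads = 8     # grouped-query-attention KV heads
head_dim   = 128   # dimension per head
n_ctx      = 8192  # context window
bytes_el   = 2     # fp16 cache entries

# Factor of 2 covers the K and the V tensors.
kv_bytes = 2 * n_layers * n_ctx * n_kv_heads * head_dim * bytes_el
print(f"KV cache at full context: {kv_bytes / 2**30:.1f} GiB")  # ~1.0 GiB
```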

🎯 MCP Refinements

Enhanced MCP experience with:

  • Tool approval dialog improvements with scrollable parameters
  • Better experimental feature edge case handling
  • Fixed tool call button disappearing issue
  • JSON editing tooltips for easier configuration
  • Auto-focus on “Always Allow” action for smoother workflows

📚 New MCP Integration Tutorials

Comprehensive guides for powerful MCP integrations:

  • Canva MCP: Create and manage designs through natural language - generate logos, presentations, and marketing materials directly from chat
  • Browserbase MCP: Control cloud browsers with AI - automate web tasks, extract data, and monitor sites without complex scripting
  • Octagon Deep Research MCP: Access finance-focused research capabilities - analyze markets, investigate companies, and generate investment insights

🖥️ Platform-Specific Improvements

Windows:

  • Fixed terminal windows popping up during model loading
  • Better process termination handling
  • VCRuntime included in installer for compatibility
  • Improved NSIS installer that checks whether the app is already running

Linux:

  • AppImage now works properly with the newest Tauri version, and its size dropped from nearly 1 GB to under 200 MB
  • Better Wayland compatibility

macOS:

  • Improved build process and artifact naming

🎨 UI/UX Enhancements

Quality of life improvements throughout:

  • Fixed rename thread dialog showing incorrect thread names
  • Assistant instructions now have proper defaults
  • Download progress indicators remain visible when scrolling
  • Better error pages with clearer messaging
  • GPU detection now shows accurate backend information
  • Improved clickable areas for better usability

🔧 Developer Experience

Behind the scenes improvements:

  • New automated QA system using CUA (Computer Use Automation)
  • Standardized build process across platforms
  • Enhanced error stream handling and parsing
  • Better proxy support for the new downloader
  • Reasoning format support for advanced models

🐛 Bug Fixes

Notable fixes include:

  • Factory reset no longer fails with access denied errors
  • OpenRouter provider stays selected properly
  • Model search in Hub shows latest data only
  • Temporary download files are cleaned up on cancel
  • Legacy threads no longer appear above new threads
  • Fixed encoding issues on various platforms

Breaking Changes

  • Models previously managed by Cortex now interface directly with llama.cpp (automatic migration included)
  • Some sampling parameters have been removed from the llama.cpp extension for consistency
  • Cortex extension is deprecated in favor of direct llama.cpp integration

Coming Next

We’re working on expanding MCP capabilities, improving model download speeds, and adding more provider integrations. Stay tuned!

Update your Jan installation or download the latest version.

For the complete list of changes, see the GitHub release notes.