Models
SammyAI combines local and cloud-based models: Gemma 3:4B, Gemini-2.5-Flash, Kimi-K2:1T, and Deepseek-V3.2, each chosen for strong creative-writing performance. All four are available on free tiers with generous daily limits, so there is no cost to you, while respecting your data privacy. Together they provide a powerful and accessible writing experience.
Gemma 3:4B
Gemma 3:4B is a fast, efficient language model from Google, built on the same research as Gemini. Its compact size makes it an ideal partner for brainstorming and rapid iteration in creative writing projects, where quick exploration of ideas matters more than raw scale. It runs effectively on most consumer hardware: we recommend a PC with at least 8GB of RAM, and while the model can run on integrated GPUs, an Nvidia GeForce RTX 3060 or better gives the best performance. Because Gemma 3:4B runs locally on your PC, nothing is sent to external servers, giving you complete control and privacy. Its 128k context window leaves room for complex narratives and detailed world-building, and the model is completely free to use.
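As a sketch of what a local call can look like, assuming Gemma 3:4B is served through a local Ollama install listening on its default port (the endpoint, `gemma3:4b` model tag, and helper names are illustrative, not SammyAI's actual code):

```python
import json
import urllib.request

def build_generate_request(prompt: str, model: str = "gemma3:4b") -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate_local(prompt: str, host: str = "http://localhost:11434") -> str:
    """Send a prompt to the local Ollama server and return the generated text."""
    body = json.dumps(build_generate_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Because everything stays on localhost, prompts and drafts never leave your machine.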
Gemini-2.5-Flash
Gemini-2.5-Flash is Google's well-rounded, versatile language model, which Google positions as its best price-performance option. This cloud-based model excels at both brainstorming and the actual writing, thanks to a 1 million-token context window that allows incredibly deep and nuanced exploration of long documents and ideas. Google also offers a generous daily free-tier rate limit, so you can test and experiment with Gemini-2.5-Flash without incurring any costs. SammyAI connects to Gemini-2.5-Flash via Google's API.
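A connection through Google's API can be sketched with the official `google-genai` Python client. This is a minimal example rather than SammyAI's actual code; it assumes an API key in a `GEMINI_API_KEY` environment variable, and the helper names and prompt wording are illustrative:

```python
import os

def build_brainstorm_prompt(premise: str, n_ideas: int = 5) -> str:
    """Assemble a brainstorming prompt from a story premise."""
    return (
        f"Brainstorm {n_ideas} distinct directions for this story premise, "
        f"one sentence each:\n{premise}"
    )

def brainstorm(premise: str) -> str:
    """Ask Gemini-2.5-Flash for story directions (requires `pip install google-genai`)."""
    from google import genai  # imported here so the sketch stays self-contained
    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=build_brainstorm_prompt(premise),
    )
    return response.text
```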
Kimi-K2:1T
Kimi-K2:1T, developed by Moonshot AI, is a mixture-of-experts (MoE) language model with 1 trillion total parameters and a 256k context window. That scale lets it generate human-like stories and complex narratives with exceptional detail and nuance. It is a cloud model accessible through Ollama, with generous free daily usage limits, so you can explore its capabilities without financial commitment. SammyAI connects to Kimi-K2 via Ollama's API.
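Ollama's API uses the same request shape for local and cloud models. A minimal sketch of a cloud chat call, assuming an API key in an `OLLAMA_API_KEY` environment variable; the host, endpoint, `kimi-k2:1t-cloud` model tag, and helper names are assumptions, not SammyAI's actual code:

```python
import json
import os
import urllib.request

def build_chat_request(prompt: str, model: str = "kimi-k2:1t-cloud") -> dict:
    """Build the JSON body for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def chat_cloud(prompt: str, host: str = "https://ollama.com") -> str:
    """Send a chat request to Ollama's cloud service and return the reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{host}/api/chat",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['OLLAMA_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]
```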
Deepseek-V3.2
Deepseek-V3.2 is a powerful open-source Mixture-of-Experts (MoE) language model with 671 billion total parameters. It is accessible through Ollama with generous daily usage limits and excels at both brainstorming and writing tasks. Its 128K context window supports deep dives into complex narratives and intricate world-building, making it a compelling, freely accessible option for generating imaginative content. SammyAI connects to Deepseek-V3.2 via Ollama's API.
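For long-form writing, streaming the reply lets the draft appear as it is generated. A sketch using the `ollama` Python package (`pip install ollama`); the `deepseek-v3.2:cloud` model tag and helper names are assumptions, not SammyAI's actual code:

```python
def build_draft_messages(outline: str) -> list:
    """Wrap a scene outline in a single-turn chat message list."""
    return [{"role": "user", "content": f"Draft a scene from this outline:\n{outline}"}]

def draft_scene(outline: str, model: str = "deepseek-v3.2:cloud") -> str:
    """Stream a scene draft from Deepseek-V3.2 via an Ollama client."""
    from ollama import Client  # imported here so the sketch stays optional
    client = Client()  # assumes the local Ollama daemon has cloud access configured
    parts = []
    for chunk in client.chat(model=model, messages=build_draft_messages(outline), stream=True):
        parts.append(chunk["message"]["content"])
    return "".join(parts)
```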
Read full documentation: SammyAI Documentation