Find Your Perfect Local AI Model
Compare system requirements, performance benchmarks, and get personalized recommendations based on your hardware.
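As a rough illustration of how a model's size relates to hardware requirements, the sketch below estimates the memory needed to load a model from its parameter count. The function names, the 4-bit quantization default, and the 20% overhead factor are assumptions made for this example, not values published on this page; actual usage also depends on context length and the runtime.

```python
def estimate_memory_gb(params_billions: float,
                       bits_per_weight: int = 4,
                       overhead: float = 1.2) -> float:
    """Rough memory estimate in GB (1 GB = 1e9 bytes).

    Weights take params * (bits / 8) bytes. The overhead factor is a
    hypothetical allowance for the KV cache and runtime buffers.
    """
    bytes_per_weight = bits_per_weight / 8
    weights_gb = params_billions * bytes_per_weight
    return round(weights_gb * overhead, 1)


def fits_in_memory(params_billions: float,
                   available_gb: float,
                   bits_per_weight: int = 4,
                   overhead: float = 1.2) -> bool:
    """Return True if the estimated footprint fits in the available memory."""
    needed = estimate_memory_gb(params_billions, bits_per_weight, overhead)
    return needed <= available_gb
```

Under these assumptions, `estimate_memory_gb(8)` gives 4.8 for an 8B model and `estimate_memory_gb(70)` gives 42.0 for a 70B model, so `fits_in_memory(70, 16)` is false on a 16 GB machine.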
Showing 57 of 57 models
Compact and fast embedding model. Good balance of quality and speed for semantic search.
ollama pull all-minilm

Mistral-based vision model. Fast multimodal inference with good image understanding.
ollama pull bakllava

Multilingual embedding model supporting 100+ languages. Great for cross-lingual retrieval.
ollama pull bge-m3

Meta's specialized coding model based on Llama 2. Excellent for code completion and generation.
ollama pull codellama

Google's coding variant of Gemma. Optimized for code completion and generation.
ollama pull codegemma

Cohere's model optimized for RAG and tool use. Excellent at following complex instructions.
ollama pull command-r

Cohere's largest model. Optimized for RAG, tool use, and complex multi-step tasks.
ollama pull command-r-plus

Advanced coding model with MoE architecture. Excellent for complex programming tasks.
ollama pull deepseek-coder-v2

State-of-the-art reasoning model with chain-of-thought capabilities. Excels at math, coding, and complex reasoning.
ollama pull deepseek-r1

Mid-sized distilled R1. Excellent reasoning capabilities for its size.
ollama pull deepseek-r1:14b

Compact distilled version of R1. Strong reasoning in an efficient package.
ollama pull deepseek-r1:7b

Distilled version of R1 based on Qwen. Great reasoning in a smaller package.
ollama pull deepseek-r1-distill-qwen

Uncensored Llama 3 fine-tune. General purpose assistant without content restrictions.
ollama pull dolphin-llama3

Uncensored Mixtral fine-tune. Helpful, harmless, and honest without refusals.
ollama pull dolphin-mixtral

Extended context Llama model. Capable of handling very long documents.
ollama pull everythinglm

Google's open model built from Gemini research. Available in 2B, 9B, and 27B sizes.
ollama pull gemma2

Google's smallest Gemma. Fast and efficient for basic tasks.
ollama pull gemma2:2b

Google's efficient 9B model. Great performance in a compact size.
ollama pull gemma2:9b

IBM's enterprise-focused coding model. Trained on 116 programming languages.
ollama pull granite-code

Meta's powerful 70B model. Excellent reasoning and instruction following with 128K context.
ollama pull llama3.1:70b

Meta's versatile 8B model with 128K context. Great balance of capability and efficiency.
ollama pull llama3.1:8b

Efficient smaller models from Meta, perfect for on-device deployment. Available in 1B and 3B sizes.
ollama pull llama3.2

Ultra-lightweight model from Meta. Perfect for edge devices and fast inference.
ollama pull llama3.2:1b

Multimodal model that can understand images and text. Available in 11B and 90B variants.
ollama pull llama3.2-vision

Meta's largest vision model. State-of-the-art multimodal understanding.
ollama pull llama3.2-vision:90b

Meta's latest and most capable open model. Excellent for general tasks, coding, and reasoning with 128K context.
ollama pull llama3.3

Visual instruction-tuned model combining CLIP vision with Llama. Great for image understanding.
ollama pull llava

Large vision-language model. Excellent image understanding and reasoning.
ollama pull llava:34b

OSS-Instruct trained coding model. Excels at generating clean, documented code.
ollama pull magicoder

Highly efficient 7B model that punches above its weight. Great balance of speed and capability.
ollama pull mistral

Mistral's most capable model. Excellent for complex reasoning and enterprise use.
ollama pull mistral-large

Latest Mistral model optimized for efficiency. Enterprise-grade quality in a compact size.
ollama pull mistral-small

Mixture of Experts model that activates only 2 experts per token. Fast inference with high quality.
ollama pull mixtral

Tiny but capable vision model. Perfect for edge devices and fast image analysis.
ollama pull moondream

State-of-the-art embedding model. Top performance on MTEB benchmarks.
ollama pull mxbai-embed-large

Intel's fine-tuned Mistral for conversational AI. Optimized for dialogue.
ollama pull neural-chat

High-quality text embedding model. Perfect for RAG, semantic search, and similarity matching.
ollama pull nomic-embed-text

Nous Research's flagship model. Excellent for roleplay, creative writing, and reasoning.
ollama pull nous-hermes2

Powerful Mistral fine-tune trained on a large synthetic dataset. Great instruction following.
ollama pull openhermes

Compact model with strong reasoning. Good balance between size and capability.
ollama pull orca-mini

Microsoft's small language model. Surprisingly capable for its size, great for resource-constrained environments.
ollama pull phi3

Microsoft's latest small model with exceptional reasoning. Trained on high-quality synthetic data.
ollama pull phi4

Alibaba's flagship model with excellent multilingual support. Available from 0.5B to 72B.
ollama pull qwen2.5

Efficient 7B version of Alibaba's Qwen 2.5. Great multilingual support and reasoning.
ollama pull qwen2.5:7b

Specialized coding model with excellent code completion and generation. Supports 92 programming languages.
ollama pull qwen2.5-coder

Efficient coding model supporting 92 languages. Great for code completion on consumer hardware.
ollama pull qwen2.5-coder:7b

Hugging Face's tiny but capable model. Perfect for on-device inference and testing.
ollama pull smollm2

High-performance embedding model. Top results on MTEB retrieval benchmarks.
ollama pull snowflake-arctic-embed

Upstage's depth-upscaled model. Strong performance through a novel training approach.
ollama pull solar

Stability AI's coding model. Good for code completion and generation tasks.
ollama pull stable-code

Code LLM trained on The Stack v2. Excellent for code completion across 600+ programming languages.
ollama pull starcoder2

Ultra-compact 1.1B model. Perfect for testing, edge devices, and resource-limited environments.
ollama pull tinyllama

Uncensored model combining Wizard and Vicuna. Good for creative and unrestricted tasks.
ollama pull wizard-vicuna-uncensored

Evol-Instruct trained coding model. Strong on complex programming tasks.
ollama pull wizardcoder

01.AI's bilingual model excelling in English and Chinese. Strong reasoning capabilities.
ollama pull yi

01.AI's specialized coding model. Strong performance on code generation benchmarks.
ollama pull yi-coder
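
Once a model has been pulled, it can be queried through Ollama's local HTTP API. The following is a minimal sketch, assuming the Ollama server is running on its default address (`http://localhost:11434`) and that the model named in the call has already been pulled. The helper names `build_generate_request` and `generate` are made up for this example.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumed here).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the generated text.

    Requires a running Ollama server; raises urllib.error.URLError otherwise.
    """
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read())
    return body["response"]
```

For example, after `ollama pull mistral`, calling `generate("mistral", "Summarize what a Mixture of Experts model is.")` would return the model's reply as a string. Building the request separately from sending it keeps the payload easy to inspect without a server.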