Welcome to your guide for enhancing your development workflow with AI assistance! Whether you're new to AI-powered coding or looking to optimize your setup, this tutorial will walk you through seamless IDE integration.
- Introduction
- Cursor: the dedicated IDE for seamless AI integration
- Implementing AI Assistance
- Hands-on implementation guide using Ollama and Continue
Modern development environments now leverage AI capabilities to enhance code synthesis, automate routine operations, and optimize development workflows. IDE integration of language models enables contextual code assistance, automated testing, and intelligent refactoring while maintaining established development practices.
The Pros: Boosting Productivity and Innovation
- Increased productivity : AI tools can automate repetitive coding tasks, generate code snippets, and provide intelligent code suggestions, allowing developers to focus on more complex problems and accelerate development cycles.
- Improved code quality : AI-powered code analysis can detect bugs, security vulnerabilities, and code smells, helping developers write cleaner and more maintainable code.
- Enhanced learning experience : AI assistants can provide real-time guidance, explanations, and best practice recommendations, facilitating the learning process for novice developers.
- Accessibility : AI tools can make coding more accessible to non-experts by bridging the gap between novice and experienced developers, enabling a broader range of individuals to participate in software development.
The Cons: Challenges and Considerations
- Potential bias and errors : AI models can inherit biases from their training data or make mistakes, leading to incorrect code suggestions or flawed analysis.
- Over-reliance and loss of skills : Excessive dependence on AI tools may hinder developers' ability to write code independently and erode their problem-solving skills over time.
- Security and privacy concerns : AI tools may inadvertently expose sensitive data or introduce security vulnerabilities if not properly secured and maintained.
- Integration challenges : Integrating AI tools into existing development workflows and IDEs can be complex, requiring additional configuration and compatibility considerations.
- Cost and accessibility : Advanced AI tools may come with significant licensing costs or require specialized hardware, making them less accessible to individual developers or smaller organizations.
Caution
Code assistance models complement developer expertise by augmenting development workflows through contextual suggestions and documentation integration. These systems enhance implementation speed and technical learning while relying on established programming principles and human oversight.
Always validate AI-generated code against official documentation and established development standards. Implement thorough code review practices when using automated suggestions.
Cursor integrates language models into a VS Code-based editor, providing contextual code completion, codebase analysis, and natural language editing capabilities. The platform supports both cloud-based and local AI models.
Currently Supported Models :
- Claude 3.5 Sonnet by Anthropic
- o1-preview / o1-mini and GPT-4o by OpenAI
Some key features of Cursor include:
- Code understanding and context : Cursor.sh can understand the context of your entire codebase, allowing it to provide tailored suggestions, generate unit tests, and even implement new features by modifying the relevant files.
- Natural language editing : You can edit code using natural language prompts, such as changing an entire method or class with a single instruction.
- Code generation : Cursor.sh can generate code from scratch based on simple instructions provided by the user.
- Copilot++ : An enhanced version of GitHub Copilot that can suggest mid-line completions and entire diffs, trained to autocomplete based on sequences of edits.
- Integration with VSCode : Cursor.sh is a fork of VSCode, allowing users to import all their extensions, themes, and keybindings with a single click.
- Privacy options : Cursor.sh offers a privacy mode where none of your code is stored on their servers or logs, catering to security-critical work.
First, let's pick the right AI provider for your coding needs! Consider:
- What features you'll use most
- Where your code will live (local or cloud)
- Your privacy preferences
- How it fits with your favorite IDE
Tip
Take a moment to explore different options - finding the right fit makes all the difference!
Reference the Artificial Analysis Leaderboard for comparative analysis of LLM providers across key performance metrics: pricing, token generation speed, response latency, and context window capabilities.
Caution
Security Consideration : Cloud-based code assistance services process codebase data through remote infrastructure, potentially exposing sensitive information to third-party systems. Consider data privacy implications and compliance requirements when implementing proprietary code processing services.
Proprietary code assistance services may utilize submitted code for model training and refinement. Review provider terms of service regarding data collection and model training practices.
Cloud-based Proprietary Providers
| Tool | Description |
|---|---|
| AskCodi | An AI-powered coding assistant that offers code suggestions, debugging help, and explanations for code snippets. |
| Blackbox | An AI platform that helps businesses automate processes, make predictions, and optimize decision-making. |
| Boxy | An AI coding assistant by CodeSandbox providing real-time code suggestions and completions. |
| Codeium | An AI-powered code completion tool that helps developers write code faster and more accurately. |
| CodeWhisperer | Developed by Amazon, it provides real-time code suggestions and completions. |
| Codium | An AI-powered tool that analyzes your code, docstrings, and comments and suggests tests as you code. |
| Copilot | Developed by GitHub and OpenAI, it provides real-time code suggestions and completions. |
| JetBrains AI | JetBrains is integrating AI capabilities into their development tools. |
| Replit AI | A coding assistant and tutorial platform developed by Replit, offering code suggestions and explanations. |
| Tabnine | An AI-powered code completion tool that helps developers write code faster and more accurately. |
Open-Source Providers
| Tool | Description |
|---|---|
| Aider | An AI-powered pair-programming tool designed to assist developers in writing and editing code directly from the command line. |
| Continue | An open-source autopilot for software development that enables developers to create their own AI code assistant within their integrated development environment (IDE), such as VS Code or JetBrains IDEs. |
| Llamacoder | An open-source take on Claude Artifacts: generate small apps with one prompt. |
| Open Interpreter | An innovative open-source project that allows language models to execute code on a user's computer to complete various tasks. |
Cloud-based Providers
| Tool | Description |
|---|---|
| AI21 Labs | Known for their language models like Jurassic-1 Jumbo, focused on quality, safety, and controllability. |
| Amazon Web Services | Offers models like Amazon CodeWhisperer for code generation and understanding through their SageMaker platform. |
| Anthropic | Known for their constitutional AI model Claude, focused on being helpful, harmless, and honest. |
| Cerebras | An AI company that has developed innovative hardware and software solutions for AI computing. |
| Cohere | Provides an enterprise AI platform with models like Cohere Generate for custom content creation. |
| Databricks | A unified, open analytics platform that provides tools and services for data processing, analytics, and artificial intelligence at scale. |
| DeepInfra | A platform that provides scalable and cost-effective infrastructure for deploying machine learning models. |
| DeepSeek | An AI company that has developed several notable AI models and technologies. |
| Fireworks AI | A comprehensive solution for companies looking to deploy AI into production, focusing on performance, cost-effectiveness, and developer experience. |
| Google | Provides models like LaMDA, PaLM, and Bard for language understanding, generation, and multimodal AI tasks. |
| Groq | Specializes in high-performance AI inference with custom LPU (Language Processing Unit) hardware, offering models like Meta's Llama 3. |
| Hugging Face | Often described as the GitHub of AI; offers a platform hosting most open-source models, like BERT, GPT-Neo, and Llama, for various AI tasks. |
| LeptonAI | A platform that provides cloud-based infrastructure and tools for deploying and running AI applications efficiently. |
| Microsoft Azure AI | A comprehensive suite of AI services and tools designed to help developers and organizations build, deploy, and manage AI applications at scale. |
| Mistral AI | A French artificial intelligence company that specializes in developing large language models (LLMs) and AI products. |
| OctoAI | A full-stack inference platform designed specifically for generative AI applications. |
| OpenAI | Offers models like GPT-4, DALL-E, and Whisper for natural language processing, image generation, and speech recognition. |
| OpenRouter | A versatile platform designed to provide access to a wide range of large language models (LLMs) from both proprietary and open-source sources. |
| Perplexity Labs | An online platform that provides free access to various powerful open-source large language models (LLMs) for experimentation and use in a wide range of applications. |
| Poe | An AI chatbot aggregator platform developed by Quora that provides users access to multiple advanced language models and chatbots within a single interface. |
| Reka AI | An AI company that develops advanced multimodal AI models and technologies. |
| Replicate | A cloud platform that allows developers to easily run and deploy open-source machine learning models. |
| SambaNova | An artificial intelligence company that provides a comprehensive AI platform for enterprises. |
| Together | A cloud platform designed for building and running generative AI applications. |
| Vercel | A powerful tool for developers looking to explore and integrate various AI models into their applications efficiently. |
Local Providers
Important
Deploy LLMs locally with our implementation guide for privacy-focused language processing and model experimentation on your hardware.
| Tool | Description | OS | Models |
|---|---|---|---|
| AnythingLLM | An open-source, full-stack application that allows users to chat with their documents in a private and enterprise-friendly environment. | All | Open-source models and proprietary models via API |
| Enchanted | iOS and macOS app for chatting with private, self-hosted language models. | macOS/iOS | Open-source models |
| GPT4ALL | An open-source software ecosystem developed by Nomic AI that enables users to run powerful large language models (LLMs) locally on their personal computers. | All | Open-source models |
| Jan | Clean UI with useful features like system monitoring and an LLM library. | All | Open-source models |
| LM Studio | Elegant UI with the ability to run every Hugging Face repository. | All | Open-source models |
| Msty | An AI chat application that offers a user-friendly interface for interacting with both local and online AI language models. | All | Open-source models and proprietary models via API |
| Ollama | Fastest when used on the terminal, and any model can be downloaded with a single command. | All | Open-source models |
Now that you've chosen your provider, let's find the right model for your coding needs.
We designed two tables to help you select the right model for coding.
The first table highlights open-source models, which you can typically run locally on your machine (probably not the massive models, though). To ensure a smooth experience running these models locally, I will provide the necessary hardware requirements for optimal performance. Please note that while you may be able to run certain models with lower hardware specifications, you can expect slower output and performance as a result.
The second table features Proprietary models that usually operate on cloud-based providers.
The models are ranked according to LiveCodeBench Pass@1 Code Generation scores (with higher scores indicating better performance). Pass@1 is the probability of passing a given problem in one attempt. LiveCodeBench offers a more comprehensive, up-to-date, and contamination-aware evaluation of code-related capabilities compared to HumanEval.
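For intuition, here is the standard unbiased pass@k estimator introduced with HumanEval, which LiveCodeBench follows: for each problem, generate n samples, count the c that pass, and average over problems. A minimal sketch, assuming the usual definition:

$$\text{pass@}k \;=\; \mathbb{E}_{\text{problems}}\left[1 - \frac{\binom{n-c}{k}}{\binom{n}{k}}\right], \qquad \text{pass@}1 \;=\; \mathbb{E}_{\text{problems}}\left[\frac{c}{n}\right]$$

Setting k = 1 reduces the general formula to the simple fraction of passing samples per problem.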
Caution
Please note that the hardware requirements provided are only indicative, and the specified VRAM requirements in the tables below apply specifically to the default Ollama Q4_0 quantization version of the models. Learn more about LLM quantization.
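As a rough back-of-envelope check (an assumption on my part, not an official sizing rule): Q4_0 stores weights at roughly 4.5 bits each once per-block scales are included, so

$$\text{VRAM} \;\approx\; N_{\text{params}} \times \frac{4.5}{8}\ \text{bytes} \;+\; \text{context and runtime overhead}$$

For Codestral-22B, that gives $22 \times 10^9 \times 0.5625 \approx 12.4$ GB of weights, which lines up with the 14GB+ recommendation in the mid-sized table below once the KV cache and runtime overhead are accounted for.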
Tip
You can always try different models to find your perfect match. Many developers mix and match based on different tasks!
Badges provide direct links to model downloads, provider services, and official documentation.
Massive models : Challenging for local deployment due to computational requirements. Use them via any of the listed Cloud-based Providers (e.g. Groq / Together / Mistral or OpenRouter).
| Organization | Model | Model size | Hardware requirement | Pass@1 score (/100) | Ollama library | Cloud-based providers |
|---|---|---|---|---|---|---|
| Alibaba | Qwen2.5-72B-Instruct | 72.2B | 47GB+ VRAM GPU (2x RTX 4090 or better) | 48.7 | | |
| DeepSeek | DeepSeek-V2.5 | 236B | 130GB+ VRAM GPU (2x H100 or better) | 41.8 | | |
| DeepSeek | DeepSeek-Coder-V2-Instruct | 236B | 130GB+ VRAM GPU (2x H100 or better) | 41.6 | | |
| Meta | Llama-3.1-405B-Instruct | 405B | 230GB+ VRAM GPU (3x H100 or better) | 39.9 | | |
| Mistral AI | Mistral-Large-2-Instruct | 123B | 70GB+ VRAM GPU (1x H100 or better) | 39.1 | | |
| Alibaba | Qwen2-72B-Instruct | 72.7B | 47GB+ VRAM GPU (2x RTX 4090 or better) | 30.5 | | |
| Meta | Llama-3.1-70B-Instruct | 70B | 40GB+ VRAM GPU (2x RTX 4090 or better) | 30 | | |
| Meta | Llama3-70B-Instruct | 70B | 40GB+ VRAM GPU (2x RTX 4090 or better) | 25.8 | | |
| Mistral AI | Mixtral-8x22B-Instruct-v0.1 | 141B | 80GB+ VRAM GPU (1x H100 or better) | 24.9 | | |
| Cohere | Command R+ | 104B | 60GB+ VRAM GPU (3x RTX 4090 or better) | 19.2 | | |
| Meta | CodeLlama-70B-Instruct | 70B | 40GB+ VRAM GPU (2x RTX 4090 or better) | 12.6 | | None |
Mid-sized models : Suitable for deployment on a high-performance local workstation.
| Organization | Model | Model size | Hardware requirement | Pass@1 score (/100) | Ollama library | Cloud-based providers |
|---|---|---|---|---|---|---|
| Alibaba | Qwen2.5-32B-Instruct | 32.8B | 20GB+ VRAM GPU (RX 7900 XT or RTX 4090 or better) | 46.6 | | |
| Mistral AI | Codestral-22B | 22B | 14GB+ VRAM GPU (RX 7800 or RTX 4070 Ti or better) | 31.8 | | |
| DeepSeek | DeepSeek-Coder-V2-Lite-Instruct | 16B | 10GB+ VRAM GPU (RX 7800 or RTX 4070 or better) | 27.9 | | |
| DeepSeek | DeepSeek-Coder-33B-Instruct | 33B | 20GB+ VRAM GPU (RX 7900 XT or RTX 4090 or better) | 26.7 | | |
| | OpenCodeInterpreter-DS-33B | 33B | 20GB+ VRAM GPU (RX 7900 XT or RTX 4090 or better) | 19 | None | |
Small models : Lightweight and deployable on most local machines.
| Organization | Model | Model size | Hardware requirement | Pass@1 score (/100) | Ollama library | Cloud-based providers |
|---|---|---|---|---|---|---|
| Alibaba | Qwen2.5-Coder-7B-Instruct | 7B | 5GB+ VRAM GPU (RX 6500 or RTX 3050 or better) | 34 | | |
| 01.AI | Yi-Coder-9B-Chat | 9B | 5GB+ VRAM GPU (RX 6500 or RTX 3050 or better) | 31.2 | | |
| Alibaba | Qwen2.5-7B-Instruct | 7.62B | 6GB+ VRAM GPU (RX 7600 or RTX 4060 or better) | 29.9 | | |
| Mistral AI | Mamba-codestral-7B-v0.1 | 7B | 5GB+ VRAM GPU (RX 6500 or RTX 3050 or better) | 21.1 | No | |
| Alibaba | CodeQwen1.5-7B-Chat | 7B | 5GB+ VRAM GPU (RX 6500 or RTX 3050 or better) | 19.6 | | None |
| Meta | Llama3.1-8B-Instruct | 8B | 5GB+ VRAM GPU (RX 6500 or RTX 3050 or better) | 19.2 | | |
| DeepSeek | DeepSeek-Coder-6.7B-Instruct | 7B | 5GB+ VRAM GPU (RX 6500 or RTX 3050 or better) | 18.9 | | |
| | OpenCodeInterpreter-DS-6.7B | 6.7B | 5GB+ VRAM GPU (RX 6500 or RTX 3050 or better) | 18.7 | No | |
| Zhipu AI | CodeGeeX4-All-9B | 9B | 8GB+ VRAM GPU (RX 7600 or RTX 4060 or better) | 17.8 | | |
Proprietary models : Typically accessible through a paid API or web interface.
| Provider | Model | Pass@1 score (/100) |
|---|---|---|
| OpenAI | o1-preview | 56.7 |
| Anthropic | Claude-3.5 Sonnet | 50.9 |
| OpenAI | GPT-4o | 47.7 |
| Google | Gemini Pro 1.5 | 43 |
| OpenAI | GPT-4 Turbo | 41.8 |
| OpenAI | GPT-4o-mini | 38.7 |
| Google | Gemini Flash 1.5 | 37.2 |
| Anthropic | Claude-3 Opus | 35.3 |
Step-by-step guide for privacy-preserving code assistance using open-source solutions.
For those who prioritize keeping everything local and private, I'll be using Ollama as the provider in this tutorial. If you haven't already, be sure to check out our tutorial, A Practical Tutorial to Run a Local Model, to learn how to install Ollama and get started.
Note
In this tutorial, I am using the Codestral:22b model. This model is specialized for coding and was trained by Mistral AI on vast datasets.
I selected it specifically for the hardware in my PC; it stands out for its size and capabilities. As with any model, it has its strengths and weaknesses, so to get the most out of this tool, be sure to choose wisely among the available options.
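If you want to follow along with the same model, you can fetch it up front. This assumes Ollama is already installed per the tutorial linked above; the download is roughly 13 GB at the default Q4_0 quantization:

```bash
# Fetch the Codestral 22B model from the Ollama library
ollama pull codestral:22b

# Confirm the model now appears in your local model list
ollama list
```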
- Click on the Extensions icon in the left sidebar or use the keyboard shortcut: Ctrl + Shift + X (Windows/Linux) or Cmd + Shift + X (macOS).
- In the Extensions marketplace, search for "Continue" to find the extension.
- Click on the "Continue" extension to open its page.
- Click the "Install" button to begin the installation process.
- Wait for the extension to download and install. This might take a few seconds.
- Once the installation is complete, click on the "Reload Required" button to reload VS Code.
- Open the Continue.dev extension by clicking on its icon in the left sidebar.
- The extension will prompt you for a Model Provider; select the Local Provider option.
- If you have set up Ollama correctly, it should automatically detect your models.
- At the bottom of the window, you will find a list of all your installed models. For this tutorial, I chose Codestral:22b.
- By default, the extension is set to use starcoder2:3b for autocomplete. If you want to keep using this model, ensure it is installed in your Ollama setup. Otherwise, if you wish to switch to a more capable model, click on the settings icon at the bottom right of the extension window.
- This will open the config.json file, where you can customize the extension's behavior.
- Navigate to the following snippet:
"tabAutocompleteModel": {
"title": "starcoder-2-3b",
"provider": "ollama",
"model": "starcoder-2-3b"
},
- Replace starcoder2:3b with the model you prefer. For example, if you want to use Codestral:22b, update the config to:
"tabAutocompleteModel": {
"title": "codestral:22b",
"provider": "ollama",
"model": "codestral:22b"
},
or use Qwen2.5-Coder-7B-Instruct for less powerful computers:
"tabAutocompleteModel": {
"title": "qwen2.5-coder:7b-instruct",
"provider": "ollama",
"model": "qwen2.5-coder:7b-instruct"
},
- Save your modifications and close the config file.
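For reference, the chat model you selected from the model list lives in the models array of the same config.json. Here is a minimal sketch of both entries together, using Codestral:22b for chat and autocomplete; the field names mirror the snippets above, but treat this as a starting point rather than the definitive schema for your Continue version:

```json
{
  "models": [
    {
      "title": "Codestral 22B",
      "provider": "ollama",
      "model": "codestral:22b"
    }
  ],
  "tabAutocompleteModel": {
    "title": "codestral:22b",
    "provider": "ollama",
    "model": "codestral:22b"
  }
}
```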
If you don't have the necessary hardware to run models locally or want to tap into more powerful capabilities, this tutorial is for you. I'll be using Groq as the provider, leveraging their infrastructure's fast inference capabilities.
Note
In this tutorial, I'll be leveraging Groq, but feel free to substitute it with your preferred provider - whether that's OpenAI, Anthropic, or another that suits your needs.
When choosing your model, keep in mind that each model has its strengths and weaknesses.
Before we get started, make sure you have a Groq account set up - if you haven't already, take a moment to create one.
- Click on the Continue icon in the left sidebar to open the Continue panel.
- If it is the first time you open it, the extension will prompt you for a Model Provider; select the Cloud Provider option.
- Select Groq as your preferred AI provider.
- Head to the GroqCloud section of the Groq website (look for the icon at the bottom).
- Create a new API key and copy it to your clipboard.
- Return to the Continue window and paste your API key into the apiKey input field (the equivalent config.json entry is sketched after this list).
- Select the AI model you want to use from the available options. Here on Groq, I'll choose the "Llama-3-70b Model".
- You now have access to a capable model, ready to be used as a programming assistant within Visual Studio Code.
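If you prefer editing config.json directly instead of the UI flow above, the resulting entry looks roughly like the sketch below. The "groq" provider name and the llama3-70b-8192 model identifier are assumptions based on Groq's API naming at the time of writing; check Continue's documentation for your version before relying on them:

```json
{
  "models": [
    {
      "title": "Llama 3 70B (Groq)",
      "provider": "groq",
      "model": "llama3-70b-8192",
      "apiKey": "YOUR_GROQ_API_KEY"
    }
  ]
}
```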