
👋 Welcome to your guide for enhancing your development workflow with AI assistance! Whether you're new to AI-powered coding or looking to optimize your setup, this tutorial will walk you through seamless IDE integration. 👋

🟧 Level: Intermediate      🧐 Reading Time: 16 min

Table of Contents

  • Introduction
  • Cursor: the dedicated IDE for seamless AI integration
  • Implementing AI Assistance
  • LLM Models
  • Hands-on implementation guide using Ollama and Continue
  • (Bonus) Setup Groq as the Cloud-based Model Provider

Introduction

Modern development environments now leverage AI capabilities to enhance code synthesis, automate routine operations, and optimize development workflows. IDE integration of language models enables contextual code assistance, automated testing, and intelligent refactoring while maintaining established development practices.

The Pros: Boosting Productivity and Innovation

  • Increased productivity: AI tools can automate repetitive coding tasks, generate code snippets, and provide intelligent code suggestions, allowing developers to focus on more complex problems and accelerate development cycles.
  • Improved code quality: AI-powered code analysis can detect bugs, security vulnerabilities, and code smells, helping developers write cleaner and more maintainable code.
  • Enhanced learning experience: AI assistants can provide real-time guidance, explanations, and best practice recommendations, facilitating the learning process for novice developers.
  • Accessibility: AI tools can make coding more accessible to non-experts by bridging the gap between novice and experienced developers, enabling a broader range of individuals to participate in software development.

The Cons: Challenges and Considerations

  • Potential bias and errors: AI models can inherit biases from their training data or make mistakes, leading to incorrect code suggestions or flawed analysis.
  • Over-reliance and loss of skills: Excessive dependence on AI tools may hinder developers' ability to write code independently and erode their problem-solving skills over time.
  • Security and privacy concerns: AI tools may inadvertently expose sensitive data or introduce security vulnerabilities if not properly secured and maintained.
  • Integration challenges: Integrating AI tools into existing development workflows and IDEs can be complex, requiring additional configuration and compatibility considerations.
  • Cost and accessibility: Advanced AI tools may come with significant licensing costs or require specialized hardware, making them less accessible to individual developers or smaller organizations.

Caution

Code assistance models complement developer expertise by augmenting development workflows through contextual suggestions and documentation integration. These systems enhance implementation speed and technical learning while relying on established programming principles and human oversight.

🚩 Always validate AI-generated code against official documentation and established development standards. Implement thorough code review practices when using automated suggestions. 🚩


Cursor: the dedicated IDE for seamless AI integration

Cursor integrates language models into a VS Code-based editor, providing contextual code completion, codebase analysis, and natural language editing capabilities. The platform supports both cloud-based and local AI models.

Currently Supported Models:

  • Claude 3.5 Sonnet by Anthropic
  • o1-preview / o1-mini and GPT-4o by OpenAI

Some key features of Cursor include:

  • Code understanding and context: Cursor.sh can understand the context of your entire codebase, allowing it to provide tailored suggestions, generate unit tests, and even implement new features by modifying the relevant files.
  • Natural language editing: You can edit code using natural language prompts, such as changing an entire method or class with a single instruction.
  • Code generation: Cursor.sh can generate code from scratch based on simple instructions provided by the user.
  • Copilot++: An enhanced version of GitHub Copilot that can suggest mid-line completions and entire diffs, trained to autocomplete based on sequences of edits.
  • Integration with VSCode: Cursor.sh is a fork of VSCode, allowing users to import all their extensions, themes, and keybindings with a single click.
  • Privacy options: Cursor.sh offers a privacy mode where none of your code is stored on their servers or in their logs, catering to security-critical work.


Implementing AI Assistance

Model providers

First, let's pick the right AI provider for your coding needs! Consider what features you'll use most, where your code will live (local or cloud), your privacy preferences, and how it fits with your favorite IDE.

Tip

Take a moment to explore different options: finding the right fit makes all the difference!

Reference the Artificial Analysis Leaderboard for comparative analysis of LLM providers across key performance metrics: pricing, token generation speed, response latency, and context window capabilities.

Caution

🚩 Security Consideration: Cloud-based code assistance services process codebase data through remote infrastructure, potentially exposing sensitive information to third-party systems. Consider data privacy implications and compliance requirements when implementing proprietary code processing services.

Proprietary code assistance services may utilize submitted code for model training and refinement. Review provider terms of service regarding data collection and model training practices. 🚩

Code dedicated providers

Cloud-based Proprietary Providers

| Tool | Description | Licence | Pricing |
|---|---|---|---|
| AskCodi | An AI-powered coding assistant that offers code suggestions, debugging help, and explanations for code snippets. | proprietary | Freemium |
| Blackbox | An AI platform that helps businesses automate processes, make predictions, and optimize decision-making. | proprietary | Freemium |
| Boxy | An AI coding assistant by CodeSandbox providing real-time code suggestions and completions. | proprietary | Freemium |
| Codeium | An AI-powered code completion tool that helps developers write code faster and more accurately. | proprietary | Freemium |
| CodeWhisperer | Developed by Amazon, provides real-time code suggestions and completions. | proprietary | Freemium |
| Codium | An AI-powered tool that analyzes your code, docstrings, and comments and suggests tests as you code. | proprietary | Freemium |
| Copilot | Developed by GitHub and OpenAI, provides real-time code suggestions and completions. | proprietary | Paid |
| JetBrains AI | JetBrains is working on integrating AI capabilities into their development tools. | proprietary | Paid |
| Replit AI | A coding assistant and tutorial platform developed by Replit, offering code suggestions and explanations. | proprietary | Freemium |
| Tabnine | An AI-powered code completion tool that helps developers write code faster and more accurately. | proprietary | Freemium |

Open-Source Providers

| Tool | Description | Licence | Pricing |
|---|---|---|---|
| Aider | An AI-powered pair-programming tool designed to assist developers in writing and editing code directly from the command line. | opensource | Free |
| Continue | An open-source autopilot for software development that enables developers to create their own AI code assistant within their IDE, such as VS Code or JetBrains IDEs. | opensource | Free |
| Llamacoder | An open-source take on Claude Artifacts: generate small apps with one prompt. | opensource | Free |
| Open Interpreter | An innovative open-source project that allows language models to execute code on a user's computer to complete various tasks. | opensource | Free |

Generalist Models Providers

Cloud-based Providers

| Tool | Description | Pricing |
|---|---|---|
| ai21 | Known for their language models like Jurassic-1 Jumbo, focused on quality, safety, and controllability. | Freemium |
| Amazon | Offers models like Amazon CodeWhisperer for code generation and understanding through their SageMaker platform. | Paid |
| Anthropic | Known for their constitutional AI model Claude, focused on being helpful, harmless, and honest. | Freemium |
| Cerebras | An AI company that has developed innovative hardware and software solutions for AI computing. | Freemium |
| cohere | Provides an enterprise AI platform with models like Cohere Generate for custom content creation. | Freemium |
| databricks | A unified, open analytics platform that provides tools and services for data processing, analytics, and artificial intelligence at scale. | Paid |
| DeepInfra | A platform that provides scalable and cost-effective infrastructure for deploying machine learning models. | Freemium |
| Deepseek | An AI company that has developed several notable AI models and technologies. | Freemium |
| Fireworks | A comprehensive solution for companies looking to deploy AI into production, focusing on performance, cost-effectiveness, and developer experience. | Freemium |
| Google | Provides models like LaMDA, PaLM, and Bard for language understanding, generation, and multimodal AI tasks. | Freemium |
| Groq | Specializes in high-performance AI inference with custom LPU (Language Processing Unit) hardware, offering models like Meta's Llama 3. | Freemium |
| Hugging Face | The AI-dedicated GitHub: offers a platform hosting most open-source models, like BERT, GPT-Neo, and Llama, for various AI tasks. | Free |
| LeptonAI | A platform that provides cloud-based infrastructure and tools for deploying and running AI applications efficiently. | Freemium |
| microsoft | A comprehensive suite of AI services and tools designed to help developers and organizations build, deploy, and manage AI applications at scale. | Paid |
| Mistral | A French artificial intelligence company that specializes in developing large language models (LLMs) and AI products. | Freemium |
| OctoAI | A full-stack inference platform designed specifically for generative AI applications. | Freemium |
| OpenAI | Offers models like GPT-4, DALL-E, and Whisper for natural language processing, image generation, and speech recognition. | Freemium |
| OpenRouter | A versatile platform designed to provide access to a wide range of large language models (LLMs) from both proprietary and open-source sources. | Paid |
| Perplexity | An online platform that provides free access to various powerful open-source large language models (LLMs) for experimentation and use in a wide range of applications. | Free |
| Poe | An AI chatbot aggregator platform developed by Quora that provides users access to multiple advanced language models and chatbots within a single interface. | Freemium |
| Reka | An AI company that develops advanced multimodal AI models and technologies. | Freemium |
| Replicate | A cloud platform that allows developers to easily run and deploy open-source machine learning models. | Paid |
| SambaNova | An artificial intelligence company that provides a comprehensive AI platform for enterprises. | Paid |
| Together | A cloud platform designed for building and running generative AI applications. | Paid |
| Vercel | A powerful tool for developers looking to explore and integrate various AI models into their applications efficiently. | Freemium |

Local Providers

Important

Deploy LLMs locally with our implementation guide for privacy-focused language processing and model experimentation on your hardware.

| Tool | Description | OS | Models |
|---|---|---|---|
| AnythingLLM | An open-source, full-stack application that allows users to chat with their documents in a private and enterprise-friendly environment. | All | Open-source models and proprietary models via API |
| Enchanted | iOS and macOS app for chatting with private, self-hosted language models. | macOS/iOS | Open-source models |
| GPT4ALL | An open-source software ecosystem developed by Nomic AI that enables users to run powerful large language models (LLMs) locally on their personal computers. | All | Open-source models |
| Jan | Clean UI with useful features like system monitoring and an LLM library. | All | Open-source models |
| LM Studio | Elegant UI with the ability to run every Hugging Face repository. | All | Open-source models |
| Msty | An AI chat application that offers a user-friendly interface for interacting with both local and online AI language models. | All | Open-source models and proprietary models via API |
| Ollama | Fastest when used on the terminal, and any model can be downloaded with a single command. | All | Open-source models |


LLM Models

Now that you've chosen your provider, let's find the right model for your coding needs.

We designed two tables to assist those who require guidance in selecting the ideal model for coding purposes.

The first table highlights open-source models, which you can typically run locally on your machine (probably not the Massive models). To ensure a smooth experience running these models locally, I will provide the necessary hardware requirements for optimal performance. Please note that while you may be able to run certain models with lower hardware specifications, you can expect slower output and performance as a result.

The second table features Proprietary models that usually operate on cloud-based providers.

The models are ranked according to LiveCodeBench Pass@1 Code Generation scores (with higher scores indicating better performance). Pass@1 is the probability of passing a given problem in one attempt. LiveCodeBench offers a more comprehensive, up-to-date, and contamination-aware evaluation of code-related capabilities compared to HumanEval.
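
For intuition: pass@1 is estimated per problem and then averaged across the benchmark. Below is a minimal Python sketch of the standard unbiased pass@k estimator popularized by the HumanEval paper; the function name and sample counts are illustrative, not LiveCodeBench's actual harness.

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples passes, given c correct out of n generated samples."""
    if n - c < k:
        return 1.0  # too few failing samples to fill k draws: guaranteed pass
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)

# For k=1 this reduces to c/n: with 10 samples and 3 passing, pass@1 = 0.3
print(pass_at_k(n=10, c=3, k=1))  # ≈ 0.3
```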

Caution

🚩 Please note that the hardware requirements provided are only indicative, and the specified VRAM requirements in the tables below apply specifically to the default Ollama Q4_0 quantization version of the models. Learn more about LLMs Quantization. 🚩
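
If you want a rough sense of where those VRAM figures come from, here is a back-of-envelope sketch (my own heuristic, not an official Ollama formula): Q4_0 stores weights at roughly 4.5 bits per parameter, and you need some headroom for the KV cache and runtime buffers. The 20% overhead factor is an assumption for illustration.

```python
def q4_vram_estimate_gb(params_billion: float,
                        bits_per_weight: float = 4.5,
                        overhead: float = 1.2) -> float:
    """Back-of-envelope VRAM estimate for a Q4_0-quantized model:
    weights at ~4.5 bits/parameter (4-bit values plus per-block scales),
    with ~20% headroom assumed for KV cache and runtime buffers."""
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb * overhead

# Codestral-22B: ~14.9 GB, in line with the "14GB+ VRAM" row below
print(round(q4_vram_estimate_gb(22), 1))
# A 7B model: ~4.7 GB, in line with the "5GB+ VRAM" rows below
print(round(q4_vram_estimate_gb(7), 1))
```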

Tip

You can always try different models to find your perfect match. Many developers mix and match based on different tasks!

🚩 Badges provide direct links to model downloads, provider services, and official documentation. 🚩

Open Source Models

Massive models : Challenging for local deployment due to computational requirements. Use them via any of the listed Cloud-based Providers (e.g. Groq / Together / Mistral or OpenRouter).

| Organization | Model | Model Size | Hardware requirement | Pass@1 score (/100) | Ollama libraries | Cloud-based providers |
|---|---|---|---|---|---|---|
| Alibaba | Qwen2.5-72B-Instruct | 72.2B | 47GB+ VRAM GPU (2xRTX 4090 or better) | 48.7 | Ollama | Hugging Face |
| Deepseek | DeepSeek-V2.5 | 236B | 130GB+ VRAM GPU (2xH100 or better) | 41.8 | Ollama | Deepseek |
| Deepseek | DeepSeek-Coder-V2-Instruct | 236B | 130GB+ VRAM GPU (2xH100 or better) | 41.6 | Ollama | Deepseek |
| Meta | Llama-3.1-405b-Instruct | 405B | 230GB+ VRAM GPU (3xH100 or better) | 39.9 | Ollama | OpenRouter |
| Mistral | Mistral-Large-2-Instruct | 123B | 70GB+ VRAM GPU (1xH100 or better) | 39.1 | Ollama | Mistral |
| Alibaba | Qwen2-72B-Instruct | 72.7B | 47GB+ VRAM GPU (2xRTX 4090 or better) | 30.5 | Ollama | Hugging Face |
| Meta | Llama-3.1-70b-Instruct | 70B | 40GB+ VRAM GPU (2xRTX 4090 or better) | 30 | Ollama | Groq |
| Meta | Llama3-70B-instruct | 70B | 40GB+ VRAM GPU (2xRTX 4090 or better) | 25.8 | Ollama | Groq |
| Mistral | Mixtral-8x22b-Instruct-v0.1 | 141B | 80GB+ VRAM GPU (1xH100 or better) | 24.9 | Ollama | Perplexity |
| cohere | Command R+ | 104B | 60GB+ VRAM GPU (3xRTX 4090 or better) | 19.2 | Ollama | cohere |
| Meta | CodeLlama-70B-Instruct | 70B | 40GB+ VRAM GPU (2xRTX 4090 or better) | 12.6 | Ollama | None |

Mid-sized models : Suitable for deployment on a high-performance local workstation.

| Organization | Model | Model Size | Hardware requirement | Pass@1 score (/100) | Ollama libraries | Cloud-based providers |
|---|---|---|---|---|---|---|
| Alibaba | Qwen2.5-32B-Instruct | 32.8B | 20GB+ VRAM GPU (RX 7900 XT or RTX 4090 or better) | 46.6 | Ollama | Hugging Face |
| Mistral | Codestral-22B | 22B | 14GB+ VRAM GPU (RX 7800 or RTX 4070 Ti or better) | 31.8 | Ollama | Mistral |
| Deepseek | DeepSeek-Coder-V2-Lite-Instruct | 16B | 10GB+ VRAM GPU (RX 7800 or RTX 4070 or better) | 27.9 | Ollama | Deepseek |
| Deepseek | DeepSeek-Coder-33B-instruct | 33B | 20GB+ VRAM GPU (RX 7900 XT or RTX 4090 or better) | 26.7 | Ollama | Deepseek |
| Map | OpenCodeInterpreter-DS-33B | 33B | 20GB+ VRAM GPU (RX 7900 XT or RTX 4090 or better) | 19 | Ollama | None |

Small models : Lightweight and deployable on most local machines.

| Organization | Model | Model Size | Hardware requirement | Pass@1 score (/100) | Ollama libraries | Cloud-based providers |
|---|---|---|---|---|---|---|
| Alibaba | Qwen2.5-Coder-7B-Instruct | 7B | 5GB+ VRAM GPU (RX 6500 or RTX 3050 or better) | 34 | Ollama | Hugging Face |
| 01AI | Yi-Coder-9B-Chat | 9B | 5GB+ VRAM GPU (RX 6500 or RTX 3050 or better) | 31.2 | Ollama | Hugging Face |
| Alibaba | Qwen2.5-7B-Instruct | 7.62B | 6GB+ VRAM GPU (RX 7600 or RTX 4060 or better) | 29.9 | Ollama | Hugging Face |
| Mistral | Mamba-codestral-7B-v0.1 | 7B | 5GB+ VRAM GPU (RX 6500 or RTX 3050 or better) | 21.1 | No | Mistral |
| Alibaba | CodeQwen1.5-7B-Chat | 7B | 5GB+ VRAM GPU (RX 6500 or RTX 3050 or better) | 19.6 | Ollama | None |
| Meta | Llama3.1-8B-instruct | 8B | 5GB+ VRAM GPU (RX 6500 or RTX 3050 or better) | 19.2 | Ollama | Groq |
| Deepseek | DeepSeek-Coder-6.7B-instruct | 7B | 5GB+ VRAM GPU (RX 6500 or RTX 3050 or better) | 18.9 | Ollama | Deepseek |
| Map | OpenCodeInterpreter-DS-6.7B | 6.7B | 5GB+ VRAM GPU (RX 6500 or RTX 3050 or better) | 18.7 | No | Hugging Face |
| THUDM | CodeGeeX4-All-9B | 9B | 8GB+ VRAM GPU (RX 7600 or RTX 4060 or better) | 17.8 | Ollama | Hugging Face |

Proprietary Models

Typically accessible through a paid API or web interface.

| Provider | Model | Pass@1 score (/100) | Pricing |
|---|---|---|---|
| OpenAI | o1-preview | 56.7 | Paid |
| Anthropic | Claude-3.5 Sonnet | 50.9 | Paid |
| OpenAI | GPT-4o | 47.7 | Paid |
| Google | Gemini Pro 1.5 | 43 | Paid |
| OpenAI | GPT-4 Turbo | 41.8 | Paid |
| OpenAI | GPT-4o-mini | 38.7 | Freemium |
| Google | Gemini Flash 1.5 | 37.2 | Paid |
| Anthropic | Claude-3 Opus | 35.3 | Paid |


Hands-on implementation guide using Ollama and Continue

Step-by-step guide for privacy-preserving code assistance using open-source solutions.

Setup Ollama as the Local Model Provider

For those who prioritize keeping everything local and private, I'll be using Ollama as the provider in this tutorial. If you haven't already, be sure to check out our guide, A Practical Tutorial to Run a Local Model, to learn how to install Ollama and get started.

Note

In this tutorial, I am using the Codestral:22b model. This particular model is specialized for coding and has been meticulously trained on vast datasets by Mistral.ai.

I selected this model specifically for the hardware in my PC; it stands out for its balance of size and capability. As with any model, it has its unique strengths and weaknesses. To get the most out of this tool, be sure to choose wisely among the available options.
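
Assuming Ollama is already installed, pulling and sanity-checking the models from this tutorial looks roughly like this (the tags follow the Ollama library at the time of writing and may change):

```bash
# Pull the models used in this tutorial
ollama pull codestral:22b              # main model, needs ~14GB+ VRAM
ollama pull qwen2.5-coder:7b-instruct  # lighter alternative for smaller GPUs

# Confirm the models are available, then try one from the terminal
ollama list
ollama run codestral:22b "Write a Python function that reverses a string."
```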

Install Continue extension

Official Continue.dev website

  • Click on the Extensions icon in the left sidebar or use the keyboard shortcut: Ctrl + Shift + X (Windows/Linux) or Cmd + Shift + X (macOS).
  • In the Extensions marketplace, search for "Continue" to find the extension.
  • Click on the "Continue" extension to open its page.
  • Click the "Install" button to begin the installation process.
  • Wait for the extension to download and install. This might take a few seconds.
  • Once the installation is complete, click on the "Reload Required" button to reload VS Code.
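
Alternatively, if the VS Code `code` command is on your PATH, you can install the extension from the terminal (a sketch; the ID below is the extension's marketplace identifier):

```bash
# Install the Continue extension via the VS Code CLI
code --install-extension Continue.continue
```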

Sidebar chat Model

  • Open the Continue.dev extension by clicking on its icon in the left sidebar.

  • The extension will prompt you for a Model Provider; select the Local Provider option.

  • If you have set up Ollama correctly, it should automatically detect your models.

  • At the bottom of the window, you will find a list of all your installed models. For this tutorial, I chose Codestral:22b (the equivalent config.json entry is sketched below).
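
If you prefer editing the configuration by hand, the equivalent chat-model entry in Continue's config.json looks roughly like this (a sketch following the same field pattern as the tabAutocompleteModel snippet shown later; the title is arbitrary, and the model tag must match one listed by `ollama list`):

```json
"models": [
  {
    "title": "Codestral 22B",
    "provider": "ollama",
    "model": "codestral:22b"
  }
]
```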

Inline autocomplete Model

  • By default, the extension is set to use starcoder-2-3b for autocomplete. If you want to keep using this model, ensure it is installed with your Ollama setup. Otherwise, if you wish to switch to a more capable model, click on the settings icon at the bottom right of the extension window.

  • This will open the config.json file, where you can customize the extension's behavior.

  • Navigate to the following snippet:

 "tabAutocompleteModel": {
    "title": "starcoder-2-3b",
    "provider": "ollama",
    "model": "starcoder-2-3b"
  },
  • Replace starcoder-2-3b with the model you prefer. For example, if you want to use Codestral:22b, update the code to:
"tabAutocompleteModel": {
   "title": "codestral:22b",
   "provider": "ollama",
   "model": "codestral:22b"
 },

or use Qwen2.5-Coder-7B-Instruct for less powerful computers:

```json
"tabAutocompleteModel": {
  "title": "qwen2.5-coder:7b-instruct",
  "provider": "ollama",
  "model": "qwen2.5-coder:7b-instruct"
},
```
  • Save your modifications and close the config file.

👏 That's it for this tutorial! You now have an AI model integrated into your VSCode setup. 👏


(Bonus) Setup Groq as the Cloud-based Model Provider

If you don't have the necessary hardware to run models locally or want to tap into more powerful capabilities, this tutorial is for you. I'll be using Groq as the provider, leveraging their infrastructure's fast inference capabilities.

Note

In this tutorial, I'll be leveraging Groq, but feel free to substitute it with your preferred provider, whether that's OpenAI, Anthropic, or another that suits your needs.

When choosing your model, keep in mind that each model has its strengths and weaknesses.

Before we get started, make sure you have a Groq account set up - if you haven't already, take a moment to create one.

  • Click on the Continue icon in the left sidebar to open the Continue panel.
  • If this is the first time you open it, the extension will prompt you for a Model Provider; select the Cloud Provider option.
  • Select Groq as your preferred AI provider.
  • Head to the GroqCloud section of the Groq website (look for the icon at the bottom).
  • Create a new API key and copy it to your clipboard.
  • Return to the Continue window and paste your API key into the apiKey input field.
  • Select the AI model you want to use from the available options. Here on Groq, I'll choose the "Llama-3-70b Model".
  • You now have access to a capable model, ready to be used as a programming assistant within Visual Studio Code.
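
For reference, the resulting entry in Continue's config.json should look roughly like this (a sketch: the exact model identifier comes from Groq's console and may have changed since writing, and the apiKey value is a placeholder you must replace with your own key, which should never be committed to version control):

```json
"models": [
  {
    "title": "Llama 3 70B (Groq)",
    "provider": "groq",
    "model": "llama3-70b-8192",
    "apiKey": "YOUR_GROQ_API_KEY"
  }
]
```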