Hands-on

Code assistants have gained considerable attention as an early use case for generative AI - especially following the launch of Microsoft's GitHub Copilot. But if you don't relish the idea of letting Microsoft loose on your code, or paying $10/month for the privilege, you can always build your own.
While Microsoft was among the first to commercialize an AI code assistant and integrate it into an IDE, it's far from the only option out there. In fact, there are numerous large language models (LLMs) trained specifically with code generation in mind.
What's more, there's a good chance the computer you're sitting in front of right now is capable of running these models. The trick is integrating them into an IDE in a way that's actually useful.
This is where apps like Continue come into play. The open source code assistant is designed to plug into popular IDEs like JetBrains and Visual Studio Code and connect to popular LLM runners you might already be familiar with - like Ollama, Llama.cpp, and LM Studio.
Like other popular code assistants, Continue supports code completion and generation, and can optimize, comment, or refactor your code for different use cases. It also sports an integrated chatbot with RAG functionality, which effectively lets you talk to your codebase.
We'll be looking at using Continue with Ollama in this guide, but the app also works with several proprietary models - including OpenAI and Anthropic - via their respective APIs, if you'd rather pay per token than a fixed monthly price.
For this guide, we'll be deploying Continue in VSCodium. To get started, launch the IDE and open the extensions panel. From there, search for and install "Continue."
After a few seconds, Continue's initial setup wizard should launch, directing you to choose whether you'd like to host your models locally or tap into another provider's API.
In this case, we're going to host our models locally via Ollama, so we'll select "Local models." This configures Continue to use the following models out of the box. We'll discuss how to swap these for alternatives in a bit, but for now they offer a good starting place:

- Llama 3 8B for chat and code generation
- Starcoder2 3B for tab-autocomplete suggestions
- nomic-embed-text for embeddings
If for whatever reason Continue skips past the launch wizard, don't worry; you can pull these models manually using Ollama by running the following in your terminal:
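Assuming Continue's usual local defaults of Llama 3 8B, Starcoder2 3B, and nomic-embed-text (the exact model tags are our assumption and may differ slightly depending on your Ollama version), the pulls would look like this:

```shell
# Chat and code-generation model (tag is an assumption; check Ollama's library for exact names)
ollama pull llama3:8b

# Tab-autocomplete model
ollama pull starcoder2:3b

# Embedding model used to make your codebase searchable
ollama pull nomic-embed-text
```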
For more information on setting up and deploying models with Ollama, check out our quick start guide here.
Before we continue, it's worth noting that, by default, Continue collects anonymized telemetry data.
You can opt out of this by modifying Continue's config file in your home directory or by unticking the "Continue: Telemetry Enabled" box in VS Code settings.
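If you go the config-file route, the relevant setting is, to the best of our knowledge, a single key in Continue's config.json (the key name is our assumption; check Continue's docs if it doesn't take effect):

```json
{
  "allowAnonymousTelemetry": false
}
```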
More information on Continue's data gathering policies can be found here.
With the installation out of the way, we can start digging into the various ways to integrate Continue into your workflow. The first of these is arguably the most obvious: generating code snippets from scratch.
If, for example, you wanted to generate a basic web page for a project, you'd open the action bar with Continue's keyboard shortcut and enter your prompt.
In this case, our prompt was "Generate a simple landing page in HTML with inline CSS." Upon submitting our prompt, Continue loads the relevant model - this can take a few seconds depending on your hardware - and presents us with a code snippet to accept or reject.
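Output will vary from model to model, but the accepted snippet might look something along these lines (an illustrative mockup, not actual model output):

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>My Project</title>
</head>
<body style="font-family: sans-serif; margin: 0;">
  <!-- Hero section styled entirely with inline CSS -->
  <header style="background: #1a1a2e; color: #ffffff; padding: 4rem 2rem; text-align: center;">
    <h1 style="margin: 0 0 0.5rem;">My Project</h1>
    <p style="margin: 0;">A simple landing page with inline CSS</p>
  </header>
  <main style="padding: 2rem; text-align: center;">
    <a href="#" style="background: #e94560; color: #ffffff; padding: 0.75rem 1.5rem; text-decoration: none;">Get started</a>
  </main>
</body>
</html>
```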
Continue can also be used to refactor, comment, optimize, or otherwise edit your existing code.
For example, let's say you've got a Python script for running an LLM in PyTorch that you want to refactor to run on an Apple Silicon Mac. You'd start by selecting your document, hitting Continue's keyboard shortcut, and prompting the assistant to do just that.
After a few seconds, Continue passes along the model's recommendations for what changes it thinks you should make - with new code highlighted in green and code marked for removal in red.
In addition to refactoring existing code, this functionality can also be useful for generating comments and/or docstrings after the fact. These functions can be found under "Continue" in the right-click context menu.
While code generation can be useful for quickly mocking up proof of concepts or refactoring existing code, it can still be a little hit and miss depending on what model you're using.
Anyone who's ever asked ChatGPT to generate a block of code will know that sometimes it just starts hallucinating packages or functions. These hallucinations do become pretty obvious, since bad code tends to fail rather spectacularly. But, as we've previously discussed, these hallucinated packages can become a security threat if suggested frequently enough.
If letting an AI model write your code for you is a bridge too far, Continue also supports code completion functionality. That at least gives you more control over what edits or changes the model does or doesn't make.
This functionality works a bit like tab completion in the terminal. As you type, Continue will automatically feed your code into a model - like Starcoder2 or Codestral - and offer suggestions for how to complete a string or function.
The suggestions appear in gray and are updated with each keystroke. If Continue guesses correctly, you can accept the suggestion by pressing Tab.
Along with code generation and prediction, Continue features an integrated chatbot with RAG-style functionality. You can learn more about RAG in our hands-on guide here, but in the case of Continue, it uses a combination of Llama 3 8B and the nomic-embed-text embedding model to make your codebase searchable.
This functionality is admittedly a bit of a rabbit hole, but here are a couple of examples of how it can be used to speed up your workflow:
How reliable Continue actually is in practice depends on the models you're using, as the plug-in itself is really more of a framework for integrating LLMs and code models into your IDE. While it dictates how you interact with these models, it has no control over the actual quality of the generated code.
The good news is Continue isn't married to any one model or technology. As we mentioned earlier, it plugs into all manner of LLM runners and APIs. If a new model is released that's optimized for your go-to programming language, there's nothing stopping you - other than your hardware, of course - from taking advantage of it.
And since we're using Ollama as our model server, swapping out models is, for the most part, a relatively straightforward task. For example, if you'd like to swap out Llama 3 for Google's Gemma 2 9B and Starcoder2 for Codestral you'd run the following commands:
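Assuming standard Ollama model tags (our assumption; check Ollama's model library for the exact names), the swap would look like this:

```shell
# Replace Llama 3 8B with Google's Gemma 2 9B for chat and code generation
ollama pull gemma2:9b

# Replace Starcoder2 with Codestral for tab autocomplete
ollama pull codestral
```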
Note: At 22 billion parameters and with a context window of 32,000 tokens, Codestral is a pretty hefty model to run at home even when quantized to 4-bit precision. If you're having trouble with it crashing, you may want to look at something smaller like DeepSeek Coder's 1B or 7B variants.
To swap out the model used for the chatbot and code generator, you can select it from Continue's model selection menu. Alternatively, you can cycle through downloaded models using a keyboard shortcut.
Changing out the model used for the tab autocomplete functionality is a little trickier and requires tweaking the plug-in's config file.
After pulling down your model of choice [1], click the gear icon in the lower-right corner of the Continue sidebar [2] and modify the "title" and "model" entries under the "tabAutocompleteModel" section [3]. If you're using Codestral, that section should look something like this:
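A sketch of the relevant config.json section, assuming an Ollama-served Codestral (the "provider" and "model" strings are our assumptions; adjust the model tag to match whatever you actually pulled):

```json
"tabAutocompleteModel": {
  "title": "Codestral",
  "provider": "ollama",
  "model": "codestral"
}
```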
By default, Continue collects data on how you build your software. This data can be used to fine-tune custom models based on your particular style and workflows.
To be clear, this data is stored locally in your home directory and, from what we understand, isn't included in the telemetry data Continue gathers by default. But if you're concerned, we recommend turning it off.
The specifics of fine-tuning large language models are beyond the scope of this article, but you can find out more about the kind of data collected by the app and how it can be utilized in this blog post.
We hope to explore fine-tuning in more detail in a future hands-on, so be sure to share your thoughts on local AI tools like Continue as well as what you'd like to see us try next in the comments section. ®