Configuring Context Window settings in LM Studio for better AI memory

A guide to adjusting the context window size so the AI does not 'forget' earlier parts of the conversation, and to understanding token limits in local models.

LM Studio OpenAI Tokenizer AI Tools Optimization

Segment Details

Source Video Time - 3:38

Duration 2.4 mins

Key Insights

Understanding How AI Processes Text

An LLM does not measure text in words or characters. It measures text in 'tokens', which are chunks of text that are often shorter than a word. For instance, a sentence of 26 words might be counted as 38 tokens.
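To make the gap between words and tokens concrete, here is a minimal sketch that compares a word count with a rough token estimate. It assumes the common rule of thumb of about 4 characters per token for English text; that ratio and the helper names are illustrative assumptions, not how LM Studio or any specific model counts. Real counts depend on the tokenizer of the model you load.

```python
import math


def count_words(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count.

    ASSUMPTION: ~4 characters per token. This is a rule of thumb,
    not a real tokenizer, so treat the result as a ballpark figure.
    """
    return math.ceil(len(text) / chars_per_token)


sentence = "Right now I'm reading a book called How to Take Smart Notes."
print("words:", count_words(sentence))
print("estimated tokens:", estimate_tokens(sentence))
```

The estimate is only useful for judging how quickly a conversation will fill the context window; the 'Usage' indicator in the app remains the reference.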

Why AI Becomes 'Forgetful'

When the token usage indicator exceeds 100%, the AI begins to discard the earliest information in the conversation to make room for new tokens. This causes the AI to 'forget' the original topic.
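The 'forgetting' described above can be sketched as a sliding window over the chat history: when the total exceeds the limit, the oldest messages are dropped first. This is an illustration of the behavior, not LM Studio's actual implementation. The function names, the word-based token counter, and the 55-token limit are all hypothetical, chosen so the example stays small.

```python
from collections import deque


def rough_tokens(text: str) -> int:
    # Crude stand-in for a tokenizer: one token per whitespace-separated word.
    return len(text.split())


def fit_to_context(messages, context_length, count_tokens=rough_tokens):
    """Drop the oldest messages until the remainder fits in context_length."""
    kept = deque(messages)
    total = sum(count_tokens(m) for m in kept)
    while kept and total > context_length:
        total -= count_tokens(kept.popleft())  # earliest message goes first
    return list(kept)


history = [
    "Right now I'm reading a book called How to Take Smart Notes.",
    "Tell me a story about cows.",
    "Once upon a time " * 10,  # hypothetical long reply from the model
    "What book am I reading right now?",
]

visible = fit_to_context(history, context_length=55)
for message in visible:
    print(message.strip())
```

With this small limit, the message that names the book is the one that falls out of the window, so the final question can no longer be answered from what the model still sees.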

The Impact of Increasing the Context Window

Increasing the Context Window (e.g., from 2048 to 4096) allows the AI to remember longer conversations, but it will consume more of your computer's memory (VRAM/RAM).
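Most of the extra memory goes to the key/value cache, which stores one key and one value vector per token for every layer. The sketch below estimates that cache for a hypothetical model shape (32 layers, 8 KV heads, head dimension 128, 16-bit values). These numbers are assumptions for illustration and do not describe any particular model; models that use sliding-window attention or a quantized cache will need less, and the model weights are a separate cost.

```python
def kv_cache_bytes(context_length, n_layers, n_kv_heads, head_dim,
                   bytes_per_value=2):
    """Estimate KV cache size for a standard transformer.

    The leading factor of 2 accounts for storing both a key and a
    value vector per token, per layer, per KV head.
    """
    return (2 * n_layers * n_kv_heads * head_dim
            * context_length * bytes_per_value)


# HYPOTHETICAL model shape, chosen only to make the arithmetic concrete.
shape = dict(n_layers=32, n_kv_heads=8, head_dim=128)

for ctx in (2048, 4096):
    mib = kv_cache_bytes(ctx, **shape) / 2**20
    print(f"{ctx} tokens -> {mib:.0f} MiB of KV cache")
```

Under these assumptions the cache grows linearly with the context length, so doubling the window from 2048 to 4096 doubles this part of the memory footprint.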

Prompts

AI Memory Test

Target: LM Studio

Right now I'm reading a book called How to Take Smart Notes.

Prompt to Increase Token Usage

Target: LM Studio

Tell me a story about cows.

Memory Re-check

Target: LM Studio

What book am I reading right now?

Step by Step

How to Adjust the Context Window in LM Studio

Open the LM Studio application on your computer.
Select the local model you want to use from the list (e.g., Gemma 3 4B).
Locate the configuration settings panel on the right side of the screen.
Find the 'Context Length' or 'Context Window' section.
Enter your desired token value (e.g., change from 2048 to 4096 for larger memory capacity).
Click the blue 'Load Model' button at the top.
Click the 'AI Chat' tab (message icon) on the left sidebar to start a new session.
Type your message in the chat box and press Enter.
Monitor the 'Usage' indicator at the bottom of the screen to see the percentage of token capacity used.
If the AI starts forgetting earlier information, click the 'Eject Model' button at the top.
Set 'Context Length' to a higher value as in the earlier steps, then click 'Load Model' again and start a new chat.
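The monitor-and-reload loop in the steps above comes down to simple arithmetic, sketched here. The helper names and the doubling strategy are assumptions for illustration, and `max_length` stands for whatever maximum context your model and hardware can handle; LM Studio computes the 'Usage' percentage for you, so this only shows the reasoning behind picking the next value.

```python
def usage_percent(tokens_used: int, context_length: int) -> float:
    """Share of the context window currently occupied, in percent."""
    return 100.0 * tokens_used / context_length


def suggest_context_length(tokens_used: int, context_length: int,
                           max_length: int) -> int:
    """Double the context length until the conversation fits.

    ASSUMPTION: doubling (2048 -> 4096 -> 8192) is a convenient
    heuristic; the result is capped at max_length.
    """
    new_length = context_length
    while tokens_used > new_length and new_length < max_length:
        new_length = min(new_length * 2, max_length)
    return new_length


print(usage_percent(3000, 2048))                  # above 100: time to reload
print(suggest_context_length(3000, 2048, 8192))   # next value to try
```

A larger value is only worth setting if the machine has the memory for it, so it is reasonable to step up one doubling at a time.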
