Alex Credits and Limits

Alex uses a credit system that reflects model complexity, conversation length, and the amount of context the AI has to process. The goal is transparency: the panel shows your current limit state, consumption, and available options.

This page explains the principles in a simple, practical way. Exact limits can vary based on your package, service configuration, and active add-ons.

What Affects Consumption

Each request can include several parts:

Part	What it means for consumption
Your prompt	The message you send to Alex, including any details you provide
Context	Conversation history, relevant memory, server state, tool outputs, or file content
Response	Alex’s answer, analysis, proposed solution, and steps it performs
Tools	Work with console, files, logs, web search, sub-agents, or other integrations
Cache	Repeated context that selected models can process more efficiently

Consumption is therefore not only about the number of messages. A short question is usually lighter than a deep diagnostic task that reads logs, inspects files, and performs multiple steps.
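As a rough illustration, the parts listed above can be thought of as adding up to the total for one request. The following Python sketch is not Alex's actual accounting: the field names, the unit counts, and the model_multiplier parameter are all hypothetical, chosen only to show why a deep diagnostic task weighs more than a short question.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical breakdown of one request; all units are illustrative."""
    prompt: int    # the message you send
    context: int   # history, memory, server state, file content
    response: int  # Alex's answer and the steps it performs
    tools: int     # console, files, logs, web search, sub-agents


def estimate_consumption(req: Request, model_multiplier: float = 1.0) -> float:
    """Sum the parts and scale by an assumed per-model multiplier."""
    total = req.prompt + req.context + req.response + req.tools
    return total * model_multiplier


# Invented numbers: a short question versus a multi-step diagnostic task.
short_question = Request(prompt=20, context=200, response=150, tools=0)
deep_diagnostic = Request(prompt=60, context=4000, response=1200, tools=2500)

print(estimate_consumption(short_question))   # 370.0
print(estimate_consumption(deep_diagnostic))  # 7760.0
```

Both requests are a single message, yet the second one is many times heavier because of the context it reads and the tools it runs.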

How Cache Works

Some models support prompt caching. Repeated parts of the conversation or system context may be processed more efficiently in later requests.

Cache works automatically. You do not need to enable or configure it. If the selected model supports cache and the request is suitable for it, Alex uses it in the background.

When cache helps most

You continue in the same conversation instead of starting a new one.
You work on one area step by step, such as the same app, server, or issue.
You follow up on a previous plan, the result of a check, or a file review.
You avoid pasting the same long logs repeatedly after Alex has already read them.
You let Alex reuse existing context instead of restating the same background every time.

When cache helps less

Every request starts a new conversation.
You frequently switch topics, servers, or projects.
Each message contains a large amount of new text.
The selected model does not support cache or cannot use it for the current request.

Cache does not mean zero consumption. It is an optimization that can reduce the cost of repeated context.
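The arithmetic behind this can be sketched as follows. This is a simplified model, not the real pricing: the cached_rate value of 0.25 is an assumed discount used only for illustration, and the actual effect depends on the selected model and the request.

```python
def context_cost(total_context: int, cached_context: int,
                 cached_rate: float = 0.25) -> float:
    """Cost of context when part of it is served from cache.

    cached_rate is an assumed fraction of the normal cost. It is greater
    than zero, so cached context is cheaper but never free.
    """
    if not 0 <= cached_context <= total_context:
        raise ValueError("cached_context must be between 0 and total_context")
    fresh_context = total_context - cached_context
    return fresh_context + cached_context * cached_rate


# Follow-up in the same conversation: most of the context is repeated.
print(context_cost(total_context=4000, cached_context=3600))  # 1300.0

# New conversation: nothing to reuse, so the full context is counted.
print(context_cost(total_context=4000, cached_context=0))     # 4000.0
```

Under these assumptions, the follow-up request still consumes something for its cached context, which is why cache reduces consumption without removing it.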

How To Save Credits and Limits

1. Send a clear prompt upfront

Instead of adding one sentence at a time, include the goal, constraints, and expected outcome in the first message.

Less effective:

Check the website.

More effective:

example.com returns 502 after deployment. Check nginx, PHP-FPM, and the latest logs, find the cause, and write a repair plan before making changes.

2. Continue in the same conversation

If you are solving one issue, stay in the same thread. Alex can continue from the history, plan, and previous results. On supported models, this also gives cache a better chance to work efficiently.

3. Do not resend data Alex already has

Once Alex has read a log, configuration, or part of a project, refer back to it:

Continue from the previous nginx log review and also check PHP-FPM.

You do not need to paste the same log again unless it changed.

4. Use Planning Mode for complex tasks

For incidents, migrations, or larger debugging sessions, turn on Planning Mode. Alex analyzes first and proposes a plan. This reduces blind trial-and-error and often avoids unnecessary consumption.

5. Choose the model based on task complexity

Routine checks, status requests, simple configuration, and quick explanations usually work well with standard models. Stronger models are best reserved for complex debugging, architecture, multi-file analysis, or higher-risk decisions.

6. Split very large work into logical stages

One well-prepared task is usually more efficient than many tiny follow-ups. Very large work is the exception: split it into stages, such as plan, first part, review, next part. This keeps quality and consumption easier to control.

Types of Limits

Alex may track multiple limits at the same time:

Limit	Purpose
Short-term window	Protects against sudden overuse or unexpectedly high consumption over a short period
Long-term window	Tracks overall consumption within your package
Model availability	Some models or features may be available only on selected tariffs
Special features	Image generation or selected tools may have their own rules

The exact state is always visible in the panel. If you reach a limit, Alex will notify you and show the next available options.
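Because several limits can apply at once, a request has to fit within each of them. The sketch below shows one way to picture this in Python. The Window class, the window names, and all numbers are hypothetical, and the real checks are performed by the service, not by you.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Hypothetical limit window; names and numbers are illustrative."""
    name: str
    limit: float
    used: float

    def remaining(self) -> float:
        return max(self.limit - self.used, 0.0)


def blocking_windows(windows: list[Window], request_cost: float) -> list[str]:
    """Return the names of the windows that cannot fit the request."""
    return [w.name for w in windows if w.remaining() < request_cost]


windows = [
    Window("short-term", limit=1000, used=950),
    Window("long-term", limit=50000, used=12000),
]

print(blocking_windows(windows, request_cost=200))  # ['short-term']
print(blocking_windows(windows, request_cost=40))   # []
```

In this example the long-term window has plenty of room, but the larger request is still held back by the short-term window until it refreshes.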

Tariffs

Your available limit and features depend on your active package and account configuration. Higher tariffs usually provide more room for demanding work, team usage, or premium models.

If you are unsure which tariff fits you, use your workflow as a guide:

Usage	Typical approach
Occasional questions	Standard tariff and basic models
Regular server management	Higher limit and cache-friendly workflow
Development, debugging, incidents	Higher tariff, Planning Mode, and stronger models when needed
Team or intensive work	Higher availability with a clear budget setup

Pay-as-you-go

If pay-as-you-go (PAYG) is enabled, you can keep working after selected limits are exhausted, as allowed by your account settings. Overage consumption is deducted from your balance and can be controlled by a cap you configure.

PAYG is useful when you do not want work to stop in the middle of an incident or longer analysis. We still recommend monitoring consumption and using cache-friendly workflows.
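To show how a cap might bound overage, here is a minimal Python sketch. It assumes that a charge can exceed neither the remaining balance nor the room left under the cap. The function name, its parameters, and the units are hypothetical and do not describe the actual billing logic.

```python
def payg_charge(overage: float, balance: float, cap: float,
                already_charged: float = 0.0) -> float:
    """Return how much of the overage can be charged.

    The charge is bounded by the remaining balance and by the room left
    under the configured cap. All units are illustrative.
    """
    room_under_cap = max(cap - already_charged, 0.0)
    return float(max(min(overage, balance, room_under_cap), 0.0))


# Plenty of room under the cap: the whole overage is charged.
print(payg_charge(overage=30, balance=100, cap=50))                      # 30.0

# Most of the cap is already used: only the remaining room is charged.
print(payg_charge(overage=30, balance=100, cap=50, already_charged=40))  # 10.0
```

A lower cap trades continuity for predictability: work may stop sooner, but overage cannot grow beyond the amount you set.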

Tracking Consumption

In chat

The chat shows your current limit state, remaining room, and warnings when you are close to exhaustion.

In the panel

The service or account overview shows consumption history, used models, PAYG entries where applicable, and other information available for your tariff.

For some models, the overview may also show the cache portion of consumption. Treat it as an indication of how much repeated context could be processed more efficiently.

What Happens When You Reach a Limit

If you reach a limit, Alex notifies you directly in the interface. Depending on your setup, you can:

wait for the limit to refresh,
choose a more efficient model or smaller task scope,
use an available upgrade or PAYG,
contact support if the situation is urgent.

FAQ

Why does consumption differ for similar requests?

In one case Alex may answer directly, while in another it needs to read history, logs, files, or run diagnostics. Consumption is also affected by the selected model and whether cache can be used.

Does cache always reduce consumption?

Not always. Cache helps most with repeated or stable context and models that support it. If every message adds a large amount of new content, cache has less room to help.

Should I start fewer conversations because of cache?

Yes, when you are solving the same issue or project. Continuing in the same conversation helps Alex reuse context and may improve efficiency. For unrelated topics, start a new conversation so context does not get mixed.

What if Alex hits an error?

If a request fails for a technical reason, consumption is calculated from the work that was actually performed. The overview shows what was counted. Contact support if anything looks unclear.

Next Steps

Alex Models and Limits - How to choose the right model
Best Practices - How to write prompts and save credits
Alex Memory - How Alex remembers preferences and context
Interactive Questions - How Alex asks questions in chat

Need to increase your limit or have a question? Open a support ticket.