Chatbot, copilot, or agent?

Chatbots respond.
Copilots help.
Agents do work within boundaries.

An AI agent is a system that does part of a business process on its own: it reads a ticket, an invoice or a record, picks the next step and works in your tools (CRM, inbox, spreadsheets). It operates inside boundaries you set and hands anything it is not sure about to a human. Deploying an agent for one named process typically starts at €6 000 net. This page is your test: how to tell a real agent from a repackaged chatbot before someone sells you one.

Written by Artem Lisovtsov and the Syntalith engineering team · Updated 2 July 2026

In short

  • An AI agent runs the process itself: it chooses the next step and uses tools in a loop, until it reaches the goal or hands the case to a person.
  • A chatbot answers, a copilot assists, an automation runs on a fixed path. What sets an agent apart is that the model picks the next step, within the boundaries you set.
  • An agent pays for itself only where the next step depends on the content of the case; the rest is handled more cheaply by a script or an app.
  • An "agent" is not a marketing label but a system that passes seven criteria: work, context, tools, boundaries, escalation, measurement, trace. Miss one and it is not an agent yet.
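The loop from the first bullet can be sketched in a few lines. This is a minimal illustration, not a production design: `Step`, `run_agent`, `toy_policy` and the `lookup_order` tool are hypothetical names, and a plain function stands in for the model so the example runs without any API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    kind: str       # "tool", "done" or "escalate"
    tool: str = ""
    arg: str = ""


def run_agent(case: str,
              choose_next_step: Callable[[str, list], Step],
              tools: dict,
              max_turns: int = 5) -> tuple:
    """Loop until the goal is reached, the policy escalates, or turns run out."""
    history = []
    for _ in range(max_turns):
        step = choose_next_step(case, history)
        if step.kind in ("done", "escalate"):
            return step.kind, history
        if step.tool not in tools:
            # Unknown tool: stop and hand the case to a person.
            return "escalate", history
        result = tools[step.tool](step.arg)
        history.append((step.tool, step.arg, result))
    return "escalate", history  # turn budget spent: hand over


def toy_policy(case: str, history: list) -> Step:
    """Stand-in for the model: look the order up once, then decide."""
    if not history:
        return Step("tool", "lookup_order", case)
    last_result = history[-1][2]
    return Step("done") if last_result is not None else Step("escalate")


TOOLS = {"lookup_order": lambda order_id: {"A-1": "shipped"}.get(order_id)}

outcome_known, _ = run_agent("A-1", toy_policy, TOOLS)
outcome_unknown, _ = run_agent("Z-9", toy_policy, TOOLS)
```

The known order ends in "done"; the unknown one ends in "escalate". Both exits are explicit, which is the point of the loop.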

What an agent is made of

Agent = work + context + tools + boundaries + escalation + measurement + trace

If one element is missing, it is not an agent. We build a simpler script, integration, or LLM application instead, and we say so plainly.
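Read as a check, the formula is an all-or-nothing test. A minimal sketch, assuming each criterion is recorded as a simple yes/no; the example systems are hypothetical:

```python
CRITERIA = ("work", "context", "tools", "boundaries",
            "escalation", "measurement", "trace")


def missing_criteria(system: dict) -> list:
    """Return the criteria a system does not satisfy, in the order above."""
    return [name for name in CRITERIA if not system.get(name, False)]


def is_agent(system: dict) -> bool:
    """A system counts as an agent only when nothing is missing."""
    return not missing_criteria(system)


# Hypothetical example: a RAG chatbot sold under the agent label.
rag_chatbot = {"work": False, "context": True, "tools": False,
               "boundaries": False, "escalation": False,
               "measurement": True, "trace": True}
full_system = {name: True for name in CRITERIA}
```

Listing what is missing is more useful than a bare yes or no: it names the gap to close or the simpler form to build instead.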

Agent-washing

Most "agents" on the market are a repackaged chatbot.

The same word now sticks to a chatbot, a RAG app, a copilot, and a script with a model. So we do not start from the label on a slide; we start from the work and the boundaries, and the seven criteria below settle what you are actually getting. The reverse also holds: even a real agent costs more and carries more risk than a script or an app, so we propose one only where the next step genuinely depends on the content of the case. Everywhere else we point you to a simpler, cheaper form, even when that means a smaller job for us.

The market in numbers: adoption and agent-washing.

  • 8.4%

    of Polish companies used AI in 2025 (EU average: 20%, highest: Denmark 42%). A low bar, a real edge for whoever deploys it properly.

    Eurostat 2025
  • >40%

    of agentic AI projects will be canceled by end of 2027: cost, unclear ROI, and weak risk controls. So pin down scope, cost, and risk control before you start, not after.

    Gartner 2025
  • 6%

    is all that token prices fell in 2026 through May (against 39% in the second half of 2025): deflation stalled and buyers keep shifting toward pricier premium models. That is why we meter run cost and cap it, instead of assuming it keeps falling.

    YipitData 2026
  • ~130

    real agent vendors Gartner counted among thousands. The rest is agent-washing: rebranded chatbots and RPA.

    Gartner 2025

One word, five levels: from a chatbot to a full runtime.

Before you trust the word "agent," pin down which of these levels someone means.

  1. Chatbot or RAG renamed as an agent

    Answers questions, does not do work.

    Chatbot with RAG, not an agent.

  2. Copilot in a process

    Helps a person; the person does the work.

    Copilot, not an agent.

  3. Operational tool-using agent

    Sees context, calls APIs, escalates.

    Can be an agent if all seven criteria are present.

  4. Multi-agent orchestration

    Many roles, shared memory, queues.

    A team of agents, often overhyped.

  5. Full agentic runtime

    Persistent, long-running, tool permissions, lifecycle.

    Operational stack. We do not claim to own a proprietary runtime.

Seven questions that turn a demo into a production system.

This is your detector. Ask anyone selling you an “agent” these seven questions: if they cannot answer them, it is not an agent yet, just an interface to a model. With the answers you know what it does, what it must not do, and when it hands a case to a human.

These seven questions come from our production deployments, including a listing-scoring system for real estate across four portals. Where a number is confidential, we describe the shape of the work, not the value. See deployments

Scope

What the agent should do and what it may reach.

Work

What specific work does the agent do?

A good answer

One process, one trigger, one measurable outcome.

Common mistake

“It handles customers.” Too broad to measure anything.

Context

What must the agent understand: data, rules, exceptions?

A good answer

Named data sources, the rules, and the list of exceptions.

Common mistake

Passing the full conversation history on every call. The model drifts and the token bill climbs.

Tools

Which systems does it use, and with what permissions?

A good answer

Least privilege, read-only first, dedicated service accounts, an isolated environment.

Common mistake

The agent inherits someone's credentials, sessions, and files.

Control

Where the agent stops and who takes the exception.

Boundaries

What must it never do on its own?

A good answer

A hard “must not” list enforced in code, not in the prompt. External content is data, never a command.

Common mistake

The rule lives only in the prompt. Under load the agent skips it.
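What "enforced in code, not in the prompt" can look like, as a minimal sketch: every action the model proposes passes through one gate before anything runs. The action names and the `execute` function are hypothetical.

```python
class BoundaryViolation(Exception):
    """Raised when a proposed action is outside the agent's boundaries."""


# Hypothetical "must not" list: the gate refuses these whatever the prompt,
# an email or a document says.
FORBIDDEN_ACTIONS = {"post_to_ledger", "release_payment"}


def execute(action: str, payload: dict, handlers: dict):
    """Run a proposed action only if it is allowed and known."""
    if action in FORBIDDEN_ACTIONS:
        raise BoundaryViolation(f"{action} is on the must-not list")
    if action not in handlers:
        # Allow-list behaviour: anything not registered is refused too.
        raise BoundaryViolation(f"{action} is not a registered action")
    return handlers[action](payload)


HANDLERS = {"draft_entry": lambda p: {"status": "draft", **p}}

draft = execute("draft_entry", {"invoice": "INV-1"}, HANDLERS)

try:
    execute("release_payment", {"invoice": "INV-1"}, HANDLERS)
    blocked = False
except BoundaryViolation:
    blocked = True
```

Because the check sits outside the model, it behaves the same under load and the same whether or not the prompt mentions the rule.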

Escalation

When does it hand off to a human?

A good answer

Explicit triggers, production writes behind approval, handoff with full context.

Common mistake

The agent decides its next move with no gate.
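Explicit triggers and a handoff with full context can be sketched as below. The thresholds, field names and the `Proposal` type are assumptions chosen for illustration, not values from our deployments.

```python
from dataclasses import dataclass, field

AMOUNT_LIMIT_CENTS = 500_000   # assumed approval limit
MIN_CONFIDENCE = 0.85          # assumed confidence floor


@dataclass
class Proposal:
    action: str
    amount_cents: int
    confidence: float
    writes_to_production: bool
    evidence: list = field(default_factory=list)


def escalation_reasons(p: Proposal) -> list:
    """Return every trigger that fires; an empty list means no gate is needed."""
    reasons = []
    if p.writes_to_production:
        reasons.append("production write needs approval")
    if p.amount_cents > AMOUNT_LIMIT_CENTS:
        reasons.append("amount above limit")
    if p.confidence < MIN_CONFIDENCE:
        reasons.append("confidence below threshold")
    return reasons


def handoff(p: Proposal):
    """Build the package a human receives, or None when no trigger fired."""
    reasons = escalation_reasons(p)
    if not reasons:
        return None
    return {"proposed_action": p.action, "reasons": reasons,
            "amount_cents": p.amount_cents, "confidence": p.confidence,
            "evidence": p.evidence}


routine = Proposal("draft_entry", 12_000, 0.97, False)
risky = Proposal("post_entry", 900_000, 0.60, True, ["invoice INV-7", "PO-3"])
```

The reviewer gets the proposed action, the reasons and the evidence in one package instead of a bare "please check".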

Proof

Whether the system is worth keeping and how to replay what it did.

Measurement

How do we know it's worth keeping?

A good answer

Weekly metrics, including Cost Per Query and Escalation Rate.

Common mistake

“It works,” with no numbers.
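A minimal sketch of the two metrics named above, computed from a week of run records. The definitions are assumptions made for illustration: Cost Per Query as total spend divided by the number of runs, Escalation Rate as the share of runs handed to a human. Costs are kept in whole cents to avoid rounding noise.

```python
def weekly_metrics(runs: list) -> dict:
    """runs: list of dicts with 'cost_cents' (int) and 'escalated' (bool)."""
    if not runs:
        raise ValueError("no runs recorded this week")
    total_cost = sum(r["cost_cents"] for r in runs)
    escalated = sum(1 for r in runs if r["escalated"])
    return {
        "runs": len(runs),
        "cost_per_query_cents": total_cost / len(runs),
        "escalation_rate": escalated / len(runs),
    }


# Hypothetical week of four runs.
week = [
    {"cost_cents": 10, "escalated": False},
    {"cost_cents": 30, "escalated": True},
    {"cost_cents": 20, "escalated": False},
    {"cost_cents": 20, "escalated": False},
]
metrics = weekly_metrics(week)
```

Here the week averages 20 cents per query with a quarter of the runs escalated, two numbers that can be compared week over week.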

Trace

What does it record, and who can audit it?

A good answer

Every action logged: what, when, on what basis. A central, replayable log.

Common mistake

State kept in model memory. Yesterday's run can't be reconstructed.
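One way to make a log both replayable and tamper-evident is to chain entries by hash. This is a sketch of that technique under our own assumptions (an in-memory list and SHA-256 over a JSON body); a real deployment would write to central storage the agent cannot modify.

```python
import hashlib
import json


class TraceLog:
    """Append-only log: each entry carries the hash of the one before it."""

    _FIELDS = ("action", "basis", "at", "prev")

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(body: dict) -> str:
        return hashlib.sha256(
            json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()

    def append(self, action: str, basis: str, at: str) -> None:
        prev = self._entries[-1]["hash"] if self._entries else ""
        body = {"action": action, "basis": basis, "at": at, "prev": prev}
        self._entries.append({**body, "hash": self._digest(body)})

    def replay(self) -> list:
        """Return what happened, when, and on what basis, in order."""
        return [(e["at"], e["action"], e["basis"]) for e in self._entries]

    def verify(self) -> bool:
        """Recompute the chain; any edited entry breaks it."""
        prev = ""
        for e in self._entries:
            body = {k: e[k] for k in self._FIELDS}
            if e["prev"] != prev or self._digest(body) != e["hash"]:
                return False
            prev = e["hash"]
        return True


log = TraceLog()
log.append("matched invoice INV-1 to PO-3", "line totals equal", "2026-07-01T08:00")
log.append("escalated INV-2", "first-time supplier", "2026-07-01T08:05")
```

`replay()` reconstructs yesterday's run from the log alone, and `verify()` fails if any earlier entry was rewritten.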

This is not a marketing checklist: the same seven points recur in the scan, the contract, and the deployment report, so you can hold the promise against the proof.

Rather not interrogate vendors yourself? Bring one process and we'll ask these questions for you.

Book a free process scan (30 min)

EU AI Act mapping

We map the seven agent criteria onto the EU AI Act requirements that feed operator documentation: trace, transparency, human oversight and monitoring. The mapping gives the technical documentation its structure.

  • Art. 12 · Record-keeping: Trace
  • Art. 13 · Transparency: Work + Context
  • Art. 13 + 26 · Tools and deployer obligations: Tools
  • Art. 14 · Human oversight: Boundaries + Escalation
  • Art. 17 + 72 · Quality and monitoring: Measurement

Technical input to operator documentation, not legal advice: we do not classify the system as high-risk and do not determine your obligations. Most of these requirements apply to high-risk systems; separately, Art. 50 (telling a user they are talking to AI) binds providers and deployers from 2 August 2026, regardless of risk class. EUR-Lex ↗

Since February 2025, Art. 4 of the EU AI Act requires providers and deployers to ensure sufficient AI literacy among the people who use it. That's why our training is run by engineers. AI training for companies

Five forms, one word

Agent, chatbot, copilot, RAG, or plain automation?

One word describes five different things. The difference is not the model; it is who runs the process and who performs the step. RAG in this table means an app that first searches your documents, then answers.

  • Chatbot

    What it does: Answers questions in conversation, turn by turn.

    The test that separates it from an agent: It never executes multi-step work on its own between your messages.

  • Copilot

    What it does: Suggests as you work, while a human approves every step.

    The test that separates it from an agent: The human drives and executes; the system only assists.

  • RAG app

    What it does: Searches your documents and answers from them, with the source cited.

    The test that separates it from an agent: The developer fixes the retrieval path, not the model. The moment the model starts deciding when and why to fetch, RAG crosses into an agent and must pass the same seven-question test.

  • Automation

    What it does: Runs known steps along rules written in code.

    The test that separates it from an agent: A developer sets the path, not the model.

  • AI agent

    What it does: Runs the process from event to outcome, within set boundaries.

    The test that separates it from an agent: The model chooses the next step and tool, in a loop, until the goal or an escalation.

The simpler the form, the cheaper and more reliable it is. We propose an agent only when a cheaper form won't do.

How it works

Purchase-invoice reconciliation, run as an agent

The same work, broken into the seven criteria that let us call a system an agent:

Agent · ready
  1. Work: Runs each supplier invoice from arrival to a posting-ready entry: a three-way match against the order and goods receipt, reconciled line by line, with every discrepancy flagged.
  2. Context: Sees the invoice, the matching order and goods receipt, the supplier's terms and prior invoices, and your posting and approval rules.
  3. Tools: Reads the PDF or structured e-invoice, queries the ERP for the order and receipt, normalizes the supplier across aliases, and drafts the journal entry with its account coding.
  4. Boundaries: Posts nothing to the ledger and releases no payment on its own. Above a set amount or below a confidence threshold, it stops at a human gate instead of guessing.
  5. Escalation: Mismatches, duplicates and first-time suppliers go to a controller with the proposed entry, the cited evidence and the reasoning already attached.
  6. Measurement: We track the share of invoices reconciled without a correction and the time from arrival to posting-ready.
  7. Trace: Every match, decision and source lands in an append-only log the agent cannot rewrite, so the audit trail assembles itself as the work runs.

The effect of this pattern: most invoices wait in the morning ready to post, and the accountant touches only the exceptions.

The example illustrates the pattern; it is not a specific client's result.
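The three-way match at the heart of this example can be sketched as a pure function. The data shapes and the sample figures are assumptions for illustration; a real ERP exposes orders and receipts in its own format.

```python
def three_way_match(invoice: dict, order: dict, receipt: dict) -> list:
    """Compare invoice lines with the order and the goods receipt.

    invoice, order: {sku: (quantity, unit_price_cents)}
    receipt:        {sku: quantity_received}
    Returns (sku, problem) tuples; an empty list means a clean match.
    """
    discrepancies = []
    for sku, (qty, price) in invoice.items():
        if sku not in order:
            discrepancies.append((sku, "not on the order"))
            continue
        ordered_qty, ordered_price = order[sku]
        if price != ordered_price:
            discrepancies.append((sku, "unit price differs from the order"))
        if qty > ordered_qty:
            discrepancies.append((sku, "billed more than ordered"))
        if qty > receipt.get(sku, 0):
            discrepancies.append((sku, "billed more than received"))
    return discrepancies


# Hypothetical documents for one supplier invoice.
order = {"bolt": (100, 12), "nut": (100, 5)}
receipt = {"bolt": 100, "nut": 80}
invoice = {"bolt": (100, 12), "nut": (100, 6), "washer": (10, 1)}

issues = three_way_match(invoice, order, receipt)
```

The matching itself is deterministic code. The model's job in the pattern above is reading the documents and choosing the next step, not doing arithmetic.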

Reference material

Where automation ends and an agent begins.

A lead, a ticket, or an invoice is rarely one class of work: some steps have fixed rules, some need language, and some need a human decision. We call a system an agent only when it owns the whole case, not a single step.

Who owns the case

  1. Fixed steps, known rules

    A rule runs fixed steps

    A test can check the result. A model improves nothing here; it only adds cost and risk. A script, integration, or workflow is enough.

  2. The model reads text, but does not own the case

    The model reads, the run is closed

    The model reads text: it classifies, extracts data, summarizes, drafts a reply. The run stays closed, so it is still automation, just with better reading.

  3. The outcome belongs to a person

    A person approves and executes

    The system prepares, a person approves: a draft reply, a document summary, a record ready to accept. The write, the send, and the decision stay with the person.

  4. Decision, tools, escalation, trace (the agent threshold)

    The agent owns the case, within boundaries

    The next step depends on the case, not on a script. The agent reaches for tools, stops on exceptions, measures the result, and records the reason for every step.

Recommendation

If a script, integration, or LLM application is enough, we will say so before a proposal. An agent is right only where the next step genuinely depends on what is in the case.

Running cost

What does running an agent actually cost?

As much as the work you give it. An agent doesn't run one query, it runs a loop: it plans, reaches for tools, checks the result and corrects, so it burns 5 to 30 times more tokens per task than a chatbot. The unit that matters is cost per completed task, not cost per prompt.

5–30×

more tokens per task than a chatbot

Gartner 2026, via CIO Dive

Cheaper tokens don't mean a cheaper agent: unit prices are falling ever more slowly, while consumption rises faster, so the bill rises. The biggest drain is a gateless loop that retries on and on.
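A short worked example of why cost per completed task is the unit that matters: runs that fail or retry still burn tokens, and that spend lands on the tasks that did complete. The token counts and the price are made-up illustration values.

```python
def cost_per_completed_task(runs: list, price_per_1k_tokens_cents: float) -> float:
    """runs: list of (tokens_used, completed) pairs, failed runs included."""
    total_tokens = sum(tokens for tokens, _ in runs)
    completed = sum(1 for _, done in runs if done)
    if completed == 0:
        raise ValueError("no completed tasks to divide by")
    return total_tokens / 1000 * price_per_1k_tokens_cents / completed


# Two completed runs and one that looped and was abandoned.
runs = [(20_000, True), (30_000, True), (50_000, False)]
cost = cost_per_completed_task(runs, price_per_1k_tokens_cents=1.0)
```

The two completed tasks used 50 000 tokens between them, but the bill covers 100 000, so each completed task costs 50 cents rather than 25.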

How we keep cost under control

  1. Cache

    Repeated context computed once.

  2. Model routing

    A cheaper, smaller model for simple steps, the frontier model for the hard ones.

  3. Hard budget

    A token-and-turn budget at the escalation gate.

We meter cost from day one (Cost Per Query). The same threshold that makes the agent safe guards the bill.
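Model routing and a hard budget can be sketched together. This is a simplified illustration: the model names, the set of "simple" steps and the token figures are assumptions, and caching is left out because it depends on the model provider.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    pass


@dataclass
class RunBudget:
    max_tokens: int
    max_turns: int
    tokens_used: int = 0
    turns_used: int = 0

    def charge(self, tokens: int) -> None:
        """Record one turn and stop the run once either limit is crossed."""
        self.tokens_used += tokens
        self.turns_used += 1
        if self.tokens_used > self.max_tokens or self.turns_used > self.max_turns:
            raise BudgetExceeded(
                f"{self.tokens_used} tokens in {self.turns_used} turns")


SIMPLE_STEPS = {"classify", "extract"}  # assumed split between easy and hard


def pick_model(step_kind: str) -> str:
    """Route simple steps to a small model, everything else to the large one."""
    return "small-model" if step_kind in SIMPLE_STEPS else "frontier-model"


def run_with_budget(steps: list, budget: RunBudget) -> tuple:
    """steps: list of (step_kind, estimated_tokens). Escalate when over budget."""
    models_used = []
    try:
        for kind, tokens in steps:
            budget.charge(tokens)
            models_used.append(pick_model(kind))
    except BudgetExceeded:
        return "escalate", models_used
    return "done", models_used


plan = [("classify", 500), ("plan", 3000), ("plan", 3000)]
outcome, models = run_with_budget(plan, RunBudget(max_tokens=5000, max_turns=10))
```

The third step would push the run past 5 000 tokens, so the run stops and escalates instead of retrying: the budget and the escalation gate are the same mechanism.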

Safety

What stops an agent when someone tries to hijack it?

Not a filter at the door, but boundaries at the exit. An agent acts with real privilege, and the most common attack is prompt injection: a hidden instruction in an email, a document, or a page it reads starts steering it. OWASP puts agent goal hijack, the agentic form of this attack, at the top of its 2026 risk list. No filter catches everything, so we design it so even a hijacked agent can't do anything dangerous.

OWASP 2026

Even a hijacked agent can't do anything dangerous

  • Tools: Least privilege, read-only first.
  • Boundaries: A "must not" list enforced in code; external content is data, not a command.
  • Escalation: Irreversible steps wait for a human's approval.
  • Trace: Every action logged and replayable.

These are the same four criteria that separate an agent from a fake. Boundaries aren't an add-on to safety: they are the safety.
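A minimal sketch of "boundaries at the exit": whatever a poisoned email persuades the model to propose, the authorization step looks only at the tool's mode, the permissions granted and whether a human approved. Tool names and modes are hypothetical.

```python
# Hypothetical tool registry: each tool is tagged by how much damage it can do.
TOOL_MODES = {
    "read_email": "read",
    "draft_reply": "write",
    "send_payment": "irreversible",
}


def authorize(tool: str, granted: set, approved_by: str = "") -> str:
    """Decide on a proposed tool call without looking at the prompt or content."""
    mode = TOOL_MODES.get(tool)
    if mode is None:
        return "deny"                      # unknown tools never run
    if mode == "irreversible":
        # Irreversible steps always wait for a named human approver.
        return "allow" if approved_by else "hold_for_approval"
    return "allow" if mode in granted else "deny"


# A read-only deployment. Suppose an injected instruction makes the model
# propose a payment: the text of the email never reaches this function.
read_only = {"read"}
decision_read = authorize("read_email", read_only)
decision_write = authorize("draft_reply", read_only)
decision_hijack = authorize("send_payment", read_only)
```

The hijacked proposal ends in "hold_for_approval", not in a payment. The filter at the door may miss the injection; the gate at the exit does not depend on catching it.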

Where it runs

In the EU by default (GDPR), on a major cloud (AWS, Azure, Google Cloud), or in your own cloud account if governance prefers. Code, prompts and data stay with you, no vendor lock-in.

Choosing the form

Agent or plain automation?

The shape of the work points to the form and the entry price. Five questions narrow it down; the four most common situations that follow them map directly onto our service lines.

Five questions before you decide

  1. Can you write every step and exception on one sheet of paper?

    If yes, that's automation. A script or an integration will do it cheaper and more predictably than a model.

  2. Is the input unstructured text: emails, PDFs, notes?

    A model can read, classify and extract while the flow stays closed. That's automation with AI, not yet an agent.

  3. Does the next step depend on what's in the case?

    That's agent territory: the system leads the case, picks the next step and reaches for tools, inside boundaries you set.

  4. Must a human make the call: law, risk, relationship?

    Then we build a copilot or an app: the system prepares, a human approves the effect.

  5. Can you say what the system must never do and when it hands the case over?

    If not yet, we fix the process first. Without boundaries and escalation we don't deploy an agent.
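The five questions can be read as a small decision function. The order of precedence below is our assumption for the sketch (a required human decision wins first, then case-dependent steps, then unstructured input); a real scan weighs the answers with more nuance.

```python
def choose_form(steps_fit_on_one_sheet: bool,
                input_is_unstructured_text: bool,
                next_step_depends_on_case: bool,
                human_must_decide: bool,
                boundaries_defined: bool) -> str:
    """Map yes/no answers to the five questions onto a recommended form."""
    if human_must_decide:
        return "copilot or app"
    if next_step_depends_on_case:
        # Agent territory, but only once limits and handoff are written down.
        return "agent" if boundaries_defined else "fix the process first"
    if input_is_unstructured_text:
        return "automation with AI"
    if steps_fit_on_one_sheet:
        return "automation"
    return "process scan"


# Hypothetical cases.
report_sync = choose_form(True, False, False, False, True)
email_triage = choose_form(True, True, False, False, True)
claims = choose_form(False, True, True, False, True)
claims_no_limits = choose_form(False, True, True, False, False)
legal_review = choose_form(False, True, True, True, True)
```

The agent branch is the only one that checks for boundaries, mirroring the rule that an agent is not deployed without them.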

  1. A repeatable process with known rules: inbox, documents, reports, syncing systems.

    AI Automations

    from €3 500

  2. A tool with an LLM inside: a copilot, RAG, document extraction, an internal panel. A person approves the consequence.

    AI Apps

    from €6 000

  3. The system should run the process itself inside set boundaries: pick the next step, use tools, escalate exceptions.

    AI Agents

    from €6 000

  4. There is a process, but no obvious place to start: the form and the order still need to be set.

    Free process scan

    €0

Net prices. After the scan you get a written takeaway within 2 business days.

From zero to production

How an agent that actually does the work gets built.

We build in three stages. Each one ends at a gate: a concrete result in hand and your decision whether to go on. No "commit now, see it later."

  1. Stage 01

    Audit: the process first, not the model

    A free scan: 30 minutes with an engineer on the process that costs you the most. For a complex case, an implementation specification is produced: a process map, architecture and a fixed quote.

    Gate

    The takeaway and the specification are yours. Sometimes the best call is not to build an agent, and we will say so plainly.

    written takeaway within 2 business days

  2. Stage 02

    Pilot: proof on your own data

    The agent runs on a slice of real work, inside boundaries, with a human at the gate. We write down one measurable target before any code.

    Gate

    You decide after the pilot, not before. The goal is written down before the build; if the pilot misses it, we fix it within the fixed price.

    ~6–8 weeks · target in writing

  3. Stage 03

    Production: a system that runs the work

    We deploy the agent in an isolated environment, with least-privilege access, a full trace, and escalation of exceptions.

    Yours

    The code, prompts, and data are yours. Maintenance and development scale with the process.

    production, measurement and maintenance

Evolution

From automation, through apps, to agents

Each year moved the boundary of what "AI" can do inside a company. The agent is the newest form, not the only one: below are four dates that shaped it.

  1. 2022. Automation. Fixed rules: it does exactly what it was programmed to, with no exceptions and no context.
  2. 2023. LLM apps. It understands language and generates, but waits to be asked: one step per prompt.
  3. 2025. Agents. It plans and acts across steps in a loop, and stops at the human boundary.
  4. 2026. Agents on standards. Shared tool protocols and production deployments: the question stopped being "whether" and became "within what boundaries".

Dates are market milestones, not our own timeline: before ChatGPT (November 2022), automations were rule-based; GPT-4 (2023) opened LLM applications; 2025 was called the year of agents; and 2025–2026 brought standardization of tool protocols.

We don't sell technology. We give a recommendation.

After the scan or the implementation specification you get one of three answers:

Build

When the work, boundaries, and trace are clear enough, the recommendation can lead to a full build or a narrower pilot.

Narrow or clean up

When the process is promising but too broad or underspecified, we first reduce scope or clean up the inputs.

Do not build an agent yet

When plain automation, a script, or a human decision is the better answer, we write that plainly.

Each recommendation comes back with a price and scope, so you can compare it before you commission anything.

Free process scan

Start with a free process scan.

  • 30 minutes with the engineer who would build it, not a salesperson.
  • A review of the processes that cost you the most time and money.
  • A written summary: what to automate, in what order, with cost ranges.

No sales deck and no obligations. If automation doesn't make sense, we'll write that too.

€0

30 minutes · written takeaway within 2 business days

Common questions

Before you call something an agent, ask these.

This is a quick filter before any build conversation. It does not settle the full architecture, but it protects you from buying an agent where a simpler pattern is enough.

  • Does every process need an AI agent?

  • When is automation enough instead of an AI agent?

  • When does an LLM app become an agent?

  • How is an AI agent different from a chatbot?

  • Does an AI agent run without human oversight?

  • How do you know an agent is working correctly?

Bring one process. We will check whether it really needs an agent.

Book a free process scan (30 min)
