Version: 2.4

Process Transactions

This page covers how the tSM Process Engine manages transactions — wait states, transaction boundaries, external tasks, concurrency, and the SAGA pattern. For general transaction concepts, see the Transactions overview. For SpEL-specific transaction behaviour (the {async: true} flag, #businessException pitfalls), see SpEL and Transactions.

Wait States

A wait state is any point where the engine saves the current process state to the database and commits the transaction. The process then pauses until something triggers it to continue.

Common wait states:

User Tasks — waiting for a person to complete a form
Receive Tasks — waiting for a message or signal
Timer Events — waiting until a timer expires
Message/Signal Events — waiting for an external trigger
External Tasks — handed off to a worker process

Once a wait state is reached, the process engine:

Saves the execution context (all process variables, states, tokens) and business object changes (e.g., Order status, characteristics) to the database.
Commits the transaction so that all work done so far is permanent.
Pauses until the next event or message triggers continuation.
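
The three steps above can be sketched with a toy in-memory model. This is not the tSM engine API: `ToyEngine`, `reach_wait_state`, and `trigger` are hypothetical names used only to illustrate the save, commit, and pause sequence.

```python
import copy


class ToyEngine:
    """Toy model of a wait state; not the tSM engine API."""

    def __init__(self):
        self.committed = {}      # stands in for the database
        self.working = {}        # uncommitted state of the current transaction
        self.waiting_for = None  # event the process is paused on

    def set_variable(self, name, value):
        self.working[name] = value

    def reach_wait_state(self, event_name):
        # Save the execution context and commit it, then pause.
        self.committed = copy.deepcopy(self.working)
        self.waiting_for = event_name

    def trigger(self, event_name):
        # Continue only when the awaited event arrives.
        if event_name != self.waiting_for:
            return False
        self.waiting_for = None
        return True


engine = ToyEngine()
engine.set_variable("orderStatus", "IN_PROGRESS")
engine.reach_wait_state("userTaskCompleted")
```

An unrelated event leaves the process paused; only the awaited event lets it continue.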

Transaction Boundaries

A transaction boundary is the span of execution between two wait states. The engine executes all BPMN steps (service tasks, gateways, script tasks, etc.) within this span in a single database transaction. If an error occurs and is not handled by an error boundary event, the engine rolls back to the last committed wait state.

Asynchronous Continuations

You can create additional transaction boundaries within a flow by using asyncBefore or asyncAfter on a task. This splits the execution into smaller transactions:

<bpmn:serviceTask id="UpdateBillingSystem"
                  name="Update Billing"
                  camunda:asyncAfter="true"
                  camunda:expression="${@billingSystem.updateOrder(#order)}"/>

asyncBefore — commits right before the task begins (a new transaction starts for this task).
asyncAfter — commits right after the task ends (the next task runs in a new transaction).

Use asynchronous continuations for:

Long-running flows (to avoid holding a single database transaction open for extended periods).
Isolating non-transactional calls so that a failure in a later step does not roll back an already-completed external call.
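
To see how the two flags cut a flow into transactions, here is a small sketch. The helper `split_into_transactions` and the step names other than `UpdateBillingSystem` are hypothetical; the function only models where commits fall, not how the engine schedules jobs.

```python
def split_into_transactions(steps):
    """Group step names into transactions, cutting at async boundaries.

    Each step is a (name, async_before, async_after) tuple.
    """
    transactions, current = [], []
    for name, async_before, async_after in steps:
        if async_before and current:
            transactions.append(current)  # commit before this step starts
            current = []
        current.append(name)
        if async_after:
            transactions.append(current)  # commit right after this step ends
            current = []
    if current:
        transactions.append(current)
    return transactions


flow = [
    ("ValidateOrder", False, False),
    ("UpdateBillingSystem", False, True),  # asyncAfter="true", as in the XML above
    ("NotifyCustomer", False, False),
]
groups = split_into_transactions(flow)
```

With `asyncAfter` on `UpdateBillingSystem`, a failure in `NotifyCustomer` rolls back only the second group, so the billing update stays committed.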

Rollback on Exception

If an exception arises within a transaction scope:

All database changes made in that scope are discarded.
The process reverts to the most recent committed wait state.
Control returns to the caller or job executor.
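
The rollback behaviour can be reproduced with any transactional database. The sketch below uses Python's built-in `sqlite3` module; the `orders` table and the `run_scope` helper are assumptions made for the example and do not reflect the engine's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO orders VALUES (1, 'WAITING')")
conn.commit()  # plays the role of the last committed wait state


def run_scope(connection, fail):
    """Run one transaction scope; optionally fail after the first step."""
    try:
        connection.execute("UPDATE orders SET status = 'BILLED' WHERE id = 1")
        if fail:
            raise RuntimeError("billing system unavailable")  # simulated error
        connection.commit()
        return True
    except RuntimeError:
        connection.rollback()  # discard every change made in this scope
        return False


succeeded = run_scope(conn, fail=True)
status = conn.execute("SELECT status FROM orders WHERE id = 1").fetchone()[0]
```

After the failed scope, the row still holds the value from the last commit, and control returns to the caller with a failure result.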

Duplicate Reprocessing

When a rollback happens, external calls made within the rolled-back scope are typically invoked again when the scope is retried. For non-idempotent operations (e.g., "charge a credit card"), the repeated call can cause duplicates unless you build in safeguards:

Idempotent keys or unique tokens
A separate async boundary so the completed step is committed before the external call
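
As an illustration of the first safeguard, the sketch below shows a receiving service that remembers idempotency keys. `PaymentGateway` and the key format are hypothetical; a real service would store the keys durably.

```python
class PaymentGateway:
    """Hypothetical stand-in for a non-idempotent external service."""

    def __init__(self):
        self.charges = []
        self._seen = {}  # idempotency key -> earlier result

    def charge(self, idempotency_key, amount):
        # A repeated key returns the earlier result instead of charging again.
        if idempotency_key in self._seen:
            return self._seen[idempotency_key]
        self.charges.append(amount)
        receipt = f"receipt-{len(self.charges)}"
        self._seen[idempotency_key] = receipt
        return receipt


gateway = PaymentGateway()
key = "order-42:charge"            # derived from stable business data
first = gateway.charge(key, 100)
second = gateway.charge(key, 100)  # re-invocation after a rollback
```

The key must be derived from data that is identical on every retry (for example the order ID plus the step name), not generated fresh on each attempt.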

External Task Mechanism

External tasks delegate specialized work to another microservice. They are the safest way to call external systems from a process, because the external call runs in a completely separate transaction.

The engine persists the task (this is a wait state — the transaction commits).
A worker fetches and locks the task, performs the work, then reports completion.
The process continues upon notification.
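
The fetch-and-lock cycle can be modelled in a few lines. This is a toy queue, not the tSM or Camunda API: `ExternalTaskQueue`, the topic name, and the explicit `now` parameter (used instead of a real clock) are assumptions made for the example.

```python
class ExternalTaskQueue:
    """Toy fetch-and-lock queue; not the tSM or Camunda API."""

    def __init__(self):
        self.tasks = {}  # task id -> task record

    def persist(self, task_id, topic):
        # The engine stores the task and commits (wait state).
        self.tasks[task_id] = {
            "topic": topic, "locked_until": 0, "done": False, "result": None,
        }

    def fetch_and_lock(self, topic, now, lock_seconds):
        # Hand out one open task whose lock is free or has expired.
        for task_id, task in self.tasks.items():
            free = task["locked_until"] <= now
            if task["topic"] == topic and not task["done"] and free:
                task["locked_until"] = now + lock_seconds
                return task_id
        return None

    def complete(self, task_id, result):
        # The worker reports completion; the process can continue.
        self.tasks[task_id]["done"] = True
        self.tasks[task_id]["result"] = result


queue = ExternalTaskQueue()
queue.persist("task-1", "create-ticket")
worker_a = queue.fetch_and_lock("create-ticket", now=0, lock_seconds=30)
worker_b = queue.fetch_and_lock("create-ticket", now=10, lock_seconds=30)  # still locked
worker_c = queue.fetch_and_lock("create-ticket", now=31, lock_seconds=30)  # lock expired
queue.complete(worker_c, {"issueKey": "PROJ-123"})
```

Here the first worker never reports back, so once its lock runs out a second worker picks up the same task and completes it.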

Example: JIRA + SAP via External Tasks

When you need the response from external systems — or when you need guaranteed isolation between calls — use External Tasks (or Kafka Tasks) instead of the {async: true} SpEL flag.

Advantages:

Each external call is in its own transaction — a failure in SAP does not affect the already-committed JIRA task.
You get the response back (e.g., the JIRA issue key PROJ-123).
Failed tasks are retried automatically by the job executor.
The process engine is decoupled from the external system's availability.

A Kafka Task works similarly but uses Kafka messaging instead of REST polling — see Kafka Task for details.

Advantages of External Tasks over {async: true}:

Workers can be implemented in any language.
The process engine is decoupled from the external logic.
If a worker fails mid-task, the lock eventually expires and another instance can retry.
You receive the result back into the process.

Concurrency Issues

When multiple transactions interact with the same process, concurrency problems can arise:

A user completes a task at the same time a message arrives.
A timer or async continuation triggers while another step is executing.
Parallel gateways or multi-instance tasks require synchronization.

Optimistic Locking

The engine uses a revision column in the database to detect concurrent modifications. If two updates collide, one fails with an OptimisticLockingException. The engine retries internal jobs that fail this way.
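
The revision check is a general database technique and can be sketched with `sqlite3`. The `execution` table, the `rev` column name, and the local `OptimisticLockError` class are assumptions made for the example; the engine's actual schema and exception type may differ.

```python
import sqlite3


class OptimisticLockError(Exception):
    """Raised when the row changed since it was read."""


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE execution (id INTEGER PRIMARY KEY, state TEXT, rev INTEGER)")
db.execute("INSERT INTO execution VALUES (1, 'waiting', 1)")
db.commit()


def update_state(connection, row_id, new_state, expected_rev):
    # The WHERE clause matches only if the revision is still the one we read.
    cursor = connection.execute(
        "UPDATE execution SET state = ?, rev = rev + 1 WHERE id = ? AND rev = ?",
        (new_state, row_id, expected_rev),
    )
    if cursor.rowcount == 0:
        connection.rollback()
        raise OptimisticLockError(f"row {row_id} is no longer at rev {expected_rev}")
    connection.commit()


# Two actors both read revision 1, then both try to write.
update_state(db, 1, "task completed", expected_rev=1)  # wins
try:
    update_state(db, 1, "message received", expected_rev=1)  # loses
    collided = False
except OptimisticLockError:
    collided = True
```

The losing writer has to re-read the row and try again, which is what a job retry amounts to.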

If you call an external service within the same scope, it might be re-called after a retry, leading to duplicate side-effects. Therefore:

Add asyncBefore or asyncAfter around non-transactional calls so the call is retried in isolation.
For more advanced scenarios, consider a lock at the business key level (e.g., a distributed lock held in Redis) to ensure that only one transaction modifies a given entity at a time.
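
The idea of serializing work per business key can be illustrated in a single process with the standard `threading` module. This is only a sketch: `business_key_lock` is a hypothetical helper, and a deployment with several engine nodes would need a distributed lock rather than an in-process one.

```python
import threading
from collections import defaultdict
from contextlib import contextmanager

_locks = defaultdict(threading.Lock)  # one lock per business key
_registry_guard = threading.Lock()    # protects the dictionary itself


@contextmanager
def business_key_lock(key):
    """Allow only one holder per business key at a time."""
    with _registry_guard:
        lock = _locks[key]
    with lock:
        yield


counter = {"order-42": 0}


def modify(key):
    for _ in range(1000):
        with business_key_lock(key):
            counter[key] += 1  # read-modify-write, safe under the lock


threads = [threading.Thread(target=modify, args=("order-42",)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Work on different keys proceeds in parallel; only updates to the same key wait for each other.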

Job Executor and Prioritization

The job executor handles timers, asynchronous continuations, and retries. It picks up jobs from a job queue, subject to locking and concurrency controls.

Job Priority

Jobs can have a numeric priority. Higher-priority jobs are picked up first when the executor is under load.
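
A priority-ordered queue can be sketched with the standard `heapq` module. `JobQueue` and the job names are hypothetical, and the first-in-first-out tie-break for equal priorities is an assumption of this sketch, not documented engine behaviour.

```python
import heapq
import itertools


class JobQueue:
    """Toy job queue that hands out the highest-priority job first."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker: first in, first out

    def add(self, name, priority=0):
        # heapq is a min-heap, so negate the priority to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._order), name))

    def next_job(self):
        return heapq.heappop(self._heap)[2] if self._heap else None


jobs = JobQueue()
jobs.add("send-newsletter", priority=1)
jobs.add("activate-service", priority=50)
jobs.add("archive-order", priority=1)
picked = [jobs.next_job() for _ in range(3)]
```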

Handling Failed Jobs

If a job fails:

The retry counter decreases.
The job is unlocked so another attempt can be made.
Once retries reach zero, an incident is created, requiring manual resolution.
For non-idempotent operations, the default number of retries may be set to zero so that a failed call is not repeated automatically.
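
The retry lifecycle described above can be modelled as follows. `Job`, `execute`, and the incident text are hypothetical; the sketch only shows the counter, the unlock, and the incident, not the engine's scheduling or back-off.

```python
class Job:
    """Toy job with a retry counter; not the engine's job entity."""

    def __init__(self, action, retries=3):
        self.action = action
        self.retries = retries
        self.locked = False
        self.incident = None


def execute(job):
    """Run one attempt; return True on success."""
    if job.incident is not None:
        return False  # blocked until someone resolves the incident
    job.locked = True
    try:
        job.action()
        return True
    except Exception as error:
        job.retries = max(job.retries - 1, 0)  # the retry counter decreases
        if job.retries == 0:
            job.incident = f"incident: {error}"  # needs manual resolution
        return False
    finally:
        job.locked = False  # unlock so another attempt can be made


calls = []


def always_fails():
    calls.append("attempt")
    raise RuntimeError("SAP unreachable")


job = Job(always_fails, retries=3)
attempts = [execute(job) for _ in range(5)]
```

In this model the action runs three times; the fourth and fifth attempts are refused because the incident is open.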

Business Transactions (SAGA Pattern)

In large workflows spanning multiple microservices, a single database transaction is impractical. Instead, the engine orchestrates business transactions using a pattern known as SAGA:

Each operation has a local commit and a compensating action.
If a later step fails, the engine executes the compensating actions of all previously completed steps.

Model this with BPMN error events, compensation events, or dedicated rollback service tasks.
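
The orchestration logic behind a SAGA can be sketched as a plain function. `run_saga` and the step names are hypothetical, and running the compensations in reverse order of completion is a common convention assumed here, not something this page specifies.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps; undo completed ones on failure."""
    completed = []
    log = []
    for name, action, compensation in steps:
        try:
            action()  # local commit of this step
            log.append(f"done:{name}")
            completed.append((name, compensation))
        except Exception:
            log.append(f"failed:{name}")
            # Compensate every completed step, most recent first.
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"compensated:{done_name}")
            return False, log
    return True, log


def reject():
    raise RuntimeError("activation rejected")


ok, saga_log = run_saga([
    ("reserve-number", lambda: None, lambda: None),
    ("create-invoice", lambda: None, lambda: None),
    ("activate-service", reject, lambda: None),
])
```

The failed step itself is not compensated, because it never committed anything locally.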

Choosing the Right Approach

Aspect	Immediate Call	`{async: true}`	External / Kafka Task
When is the call made?	During the transaction	After commit (via Kafka)	In a separate transaction
Response available?	✅ Yes	❌ No (returns `null`)	✅ Yes
Safe on rollback?	❌ External effect remains	✅ Never sent	✅ Isolated transaction
Retry on failure?	❌ Manual	✅ Kafka retries	✅ Job executor retries
Complexity	Low	Low	Medium
Use when	Read-only queries, idempotent calls	Fire-and-forget, notifications, comments	Need response, must handle errors, critical operations

For details on the {async: true} flag and SpEL-specific examples, see SpEL and Transactions.
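
The "after commit" column of the comparison can be illustrated with a toy transaction that defers outgoing calls. `Transaction`, `call_async`, and the list standing in for Kafka are assumptions made for the example, not the tSM implementation of `{async: true}`.

```python
class Transaction:
    """Toy transaction that defers outgoing calls until commit."""

    def __init__(self, outbox):
        self._outbox = outbox  # stands in for the message broker
        self._pending = []

    def call_async(self, message):
        # Nothing is sent yet, so no response is available to the caller.
        self._pending.append(message)
        return None

    def commit(self):
        self._outbox.extend(self._pending)  # calls go out only after commit
        self._pending.clear()

    def rollback(self):
        self._pending.clear()  # the deferred calls are never sent


sent = []

tx1 = Transaction(sent)
response = tx1.call_async("add comment to ticket")
tx1.rollback()

tx2 = Transaction(sent)
tx2.call_async("notify customer")
tx2.commit()
```

Only the call from the committed transaction reaches the outbox, which is why this style suits fire-and-forget work but not calls whose result the process needs.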

Practical Recommendations

Situation	Recommendation
Fire-and-forget (notifications, comments, audit)	Use `{async: true}` on SpEL clients
Need the response from an external system	Use an External Task or Kafka Task
Validation before external calls	Always validate before making non-transactional calls
Long-running external operations	Use External Tasks (separate worker, separate transaction)
Multiple external systems in sequence	Use External Tasks or Kafka Tasks for each, creating isolated transactions
Idempotency concerns	Use idempotent keys or async boundaries
