Overview
What are BeX AI Assistants? BeX AI Assistants are AI agents built to power automation and natural conversations. They interact with users, understand intent, and execute the most suitable workflow — all orchestrated by the BeX AI Engine. A single Assistant can contain and manage many workflows.
How do we use the Assistant APIs? The most flexible way is to call the Assistant API and pass user inputs as context keys. Context keys are a free‑form JSON object containing the fields your workflow expects; you fully control the schema. For example, if you design a calculator workflow, your context might look like:
{
"first_num" : 9,
"second_num" : 8,
"operation" : "sum"
}
Execution modes. Assistants support three ways to decide which workflow runs. In Default mode, you send mode: "Specific" without an opcode_id; if the Assistant has a configured default workflow, that workflow runs. In Specific mode, you send mode: "Specific" together with an opcode_id to run a particular workflow. In Full mode, you send mode: "Full" and the Assistant chooses the best workflow based on your context keys, which is useful when multiple workflows could apply.
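The mode rules above can be encoded in a small client-side helper. This is an illustrative sketch, not part of the API; the function name and validation are our own:

```python
def build_request(org_id, sub_org_id, assistant_name, context,
                  mode="Specific", opcode_id=None):
    """Build an execute request body, enforcing the mode rules above.

    In Specific mode, an opcode_id targets one workflow; omitting it
    relies on the Assistant's configured default workflow. In Full
    mode the Assistant picks the workflow, so no opcode_id is sent.
    """
    if mode not in ("Specific", "Full"):
        raise ValueError(f"unknown mode: {mode}")
    body = {
        "org_id": org_id,
        "sub_org_id": sub_org_id,
        "assistant_name": assistant_name,
        "mode": mode,
        "context": context,
    }
    if mode == "Specific" and opcode_id is not None:
        body["opcode_id"] = opcode_id
    return body
```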
Streaming vs non‑streaming. You decide whether to receive one final snapshot or a stream of events as work completes. With stream: true, the API returns newline‑delimited JSON (NDJSON), where each line is a complete JSON event. This is ideal when you want to render progress or partial results. With stream: false, you receive a single JSON response only after all steps complete.
Consider a hiring assistant that processes a candidate’s CV in two steps: (1) analyze the CV and propose department fit with a qualification score; (2) draft a final HR‑ready message. If stream is true, you’ll see the analysis output as soon as it’s ready and later receive the final message. If stream is false, the API returns once both steps finish.
When streaming is enabled, you can further tune how events arrive: enable cumulative events to include everything seen so far; request token events from LLM steps to display text as it’s generated; and enable buffered sentences so token streams arrive as readable sentences instead of raw sub‑word pieces.
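Client code can apply the same buffering idea when consuming raw token frames. The helper below is a hypothetical sketch of sentence buffering, not the server's implementation:

```python
import re

def buffer_sentences(token_texts):
    """Coalesce raw token fragments into readable sentences.

    Mirrors the idea behind stream_buffer_sentences: instead of
    surfacing each sub-word piece, flush only once a sentence-ending
    punctuation mark has been seen.
    """
    sentences, buf = [], ""
    for text in token_texts:
        buf += text
        # Split off any complete sentences accumulated so far.
        while True:
            m = re.search(r"[.!?](\s+|$)", buf)
            if not m:
                break
            sentences.append(buf[: m.end()].strip())
            buf = buf[m.end():]
    if buf.strip():
        sentences.append(buf.strip())
    return sentences
```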
“Remember me”. Assistants can remember previous executions to improve future runs. Include a Memory step in your workflow design and pass a stable memory_log_id (any unique string) in your requests. Memory is currently scoped to the workflow level.
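The API accepts any unique string as memory_log_id. One option, assuming you already have stable user and workflow identifiers of your own, is a name-based UUID so the same pair always maps to the same memory scope:

```python
import uuid

def memory_log_id_for(user_id: str, workflow_id: str) -> str:
    """Derive a stable memory_log_id from identifiers you already have.

    A name-based UUID (v5) yields the same value for the same
    user/workflow pair on every run, so the Assistant keeps
    retrieving the same memory scope across requests.
    """
    return str(uuid.uuid5(uuid.NAMESPACE_URL, f"{user_id}/{workflow_id}"))
```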
Parameters (request body)
| Field | Type | Required | Default | Short description |
|---|---|---|---|---|
| org_id | string | Yes | — | Organization ID. |
| sub_org_id | string | Yes | — | Sub‑organization ID. |
| assistant_name | string | Yes | — | Assistant identifier (assistant_id). |
| mode | enum | No | Specific | "Specific" (default) or "Full". See Overview. |
| opcode_id | string | Conditional | — | Required when mode=Specific unless a default workflow is configured. |
| context | object | Yes | — | Free‑form JSON holding your input fields. |
| memory_log_id | string | No | — | Stable ID to persist/retrieve workflow memory. |
| stream | boolean | No | false | Enable NDJSON streaming of events. |
| stream_cumulative | boolean | No | false | Include cumulative outputs in each event. |
| emit_token_events | boolean | No | true | Emit per‑token frames from LLM steps (when enabled in the workflow). |
| stream_buffer_sentences | boolean | No | false | Buffer tokens into readable sentences. |
Requests & Responses
Base URL: https://{host}/projecto/execute/api
Execute: POST /v2/assistant/execute
Example: POST https://dev-bex.coolriots.ai/projecto/execute/api/v2/assistant/execute
Final responses use Content-Type: application/json. Streams use application/x-ndjson (one JSON object per line).
{
"request_id": "uuid",
"assistant": { "assistant_name": "...", "selected_opcode_id": "..." },
"opcode": { "id": "...", "type": "AutoOp|ChatOp", "version": "..." },
"status": "ok|error",
"timing": { "started_at": "ISO8601", "duration_ms": 0 },
"outputs": [ { "step_id": "...", "label": "...", "data": { /* any */ }, "truncated": false } ],
"delta_outputs": [],
"warnings": [],
"errors": [ /* if any */ ]
}
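Assuming the envelope fields shown above, a minimal client-side sketch (the helper name is our own) for turning a final response into a label-to-data map might look like:

```python
def collect_outputs(envelope: dict) -> dict:
    """Map step labels to their data, failing fast on error envelopes."""
    if envelope.get("status") == "error":
        msgs = "; ".join(e.get("message", "") for e in envelope.get("errors", []))
        raise RuntimeError(f"assistant execution failed: {msgs}")
    return {o["label"]: o["data"] for o in envelope.get("outputs", [])}
```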
POST https://dev-bex.coolriots.ai/projecto/execute/api/v2/assistant/execute
Content-Type: application/json
{
"org_id": "665467d84b7c867c744381a0",
"sub_org_id": "66f0ff588c5ada289b447571",
"assistant_name": "Bexinsight_Payload_Generator",
"mode": "Specific",
"context": { "message": "Hi, I'm interested in your product.", "email": "[email protected]" },
"stream": false,
"stream_cumulative": false,
"emit_token_events": true,
"stream_buffer_sentences": false
}
{
"request_id": "b8b4a8d3-9f31-44a2-9cd6-68ac8d5e60ab",
"assistant": { "assistant_name": "Bexinsight_Payload_Generator", "selected_opcode_id": "default_workflow_id" },
"opcode": { "id": "default_workflow_id", "type": "AutoOp", "version": "1.0.0" },
"status": "ok",
"timing": { "started_at": "2025-09-17T08:10:00Z", "duration_ms": 1520 },
"outputs": [ { "step_id": "step1", "step_type": "LLM", "label": "greeting", "data": "Hello! How can I help you today?", "truncated": false } ],
"delta_outputs": [],
"warnings": [],
"errors": []
}
POST https://dev-bex.coolriots.ai/projecto/execute/api/v2/assistant/execute
Content-Type: application/json
{
"org_id": "665467d84b7c867c744381a0",
"sub_org_id": "66f0ff588c5ada289b447571",
"assistant_name": "Bexinsight_Payload_Generator",
"mode": "Specific",
"opcode_id": "lead_qualifier_opcode",
"context": { "message": "Hi" },
"stream": false,
"stream_cumulative": false,
"emit_token_events": true,
"stream_buffer_sentences": false
}
{
"request_id": "7b1bd71f-7f91-4ff2-8a2d-8a9c0f0e8e12",
"assistant": { "assistant_name": "Bexinsight_Payload_Generator", "selected_opcode_id": "lead_qualifier_opcode" },
"opcode": { "id": "lead_qualifier_opcode", "type": "AutoOp", "version": "1.0.0" },
"status": "ok",
"timing": { "started_at": "2025-09-17T08:15:00Z", "duration_ms": 980 },
"outputs": [ /* workflow outputs */ ],
"delta_outputs": [],
"warnings": [],
"errors": []
}
POST https://dev-bex.coolriots.ai/projecto/execute/api/v2/assistant/execute
Content-Type: application/json
{
"org_id": "665467d84b7c867c744381a0",
"sub_org_id": "66f0ff588c5ada289b447571",
"assistant_name": "ServiceDesk_Assistant",
"mode": "Full",
"context": { "user_query": "Reset my password" },
"stream": true,
"stream_cumulative": false,
"emit_token_events": false,
"stream_buffer_sentences": false
}
{"request_id":"42abf1c7-7fae-4c71-90f7-57a11cc154d8","assistant":{"assistant_name":"ServiceDesk_Assistant","selected_opcode_id":"pwd_reset_flow"},"opcode":{"id":"pwd_reset_flow","type":"AutoOp","version":"2.0"},"status":"running","timing":{"started_at":"2025-09-17T08:11:00Z","duration_ms":200},"outputs":[],"delta_outputs":[{"step_id":"step2","step_type":"Non-LLM","label":"lookup_user","data":{"exists":true},"truncated":false}],"warnings":[],"errors":[]}
{"request_id":"42abf1c7-7fae-4c71-90f7-57a11cc154d8","assistant":{"assistant_name":"ServiceDesk_Assistant","selected_opcode_id":"pwd_reset_flow"},"opcode":{"id":"pwd_reset_flow","type":"AutoOp","version":"2.0"},"status":"ok","timing":{"started_at":"2025-09-17T08:11:00Z","duration_ms":880},"outputs":[{"step_id":"step2","step_type":"Non-LLM","label":"lookup_user","data":{"exists":true},"truncated":false},{"step_id":"step3","step_type":"LLM","label":"final_message","data":"We've sent a reset link to your email.","truncated":false}],"delta_outputs":[],"warnings":[],"errors":[]}
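Since each running event in this example carries only new results in delta_outputs, a client has to accumulate them itself. A minimal sketch, assuming the envelope fields shown above (the helper name is ours):

```python
import json

def accumulate_stream(ndjson_lines):
    """Fold an NDJSON event stream into the final set of step outputs.

    Each event's delta_outputs holds results new since the previous
    event; the terminal event ("status": "ok" or "error") carries the
    full outputs list, which overwrites anything seen earlier.
    """
    by_step = {}
    final_status = "running"
    for line in ndjson_lines:
        event = json.loads(line)
        for out in event.get("delta_outputs", []):
            by_step[out["step_id"]] = out
        for out in event.get("outputs", []):
            by_step[out["step_id"]] = out
        final_status = event.get("status", final_status)
    return final_status, by_step
```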
POST https://dev-bex.coolriots.ai/projecto/execute/api/v2/assistant/execute
Content-Type: application/json
{
"org_id": "665467d84b7c867c744381a0",
"sub_org_id": "66f0ff588c5ada289b447571",
"assistant_name": "ChatBot",
"mode": "Full",
"context": { "message": "Hello" },
"stream": true,
"stream_cumulative": true,
"emit_token_events": true,
"stream_buffer_sentences": true
}
{"request_id":"d4a5a777-7122-4931-91f6-c5ebba2db57d","assistant":{"assistant_name":"ChatBot","selected_opcode_id":"chat_llm"},"opcode":{"id":"chat_llm","type":"ChatOp","version":"1.2"},"status":"running","timing":{"started_at":"2025-09-17T08:13:00Z","duration_ms":150},"outputs":[],"delta_outputs":[],"warnings":[],"errors":[],"event":{"type":"token","sequence":1},"step":{"id":"step3","type":"LLM","index":2},"token":{"text":"Hello there!"}}
{"request_id":"d4a5a777-7122-4931-91f6-c5ebba2db57d","assistant":{"assistant_name":"ChatBot","selected_opcode_id":"chat_llm"},"opcode":{"id":"chat_llm","type":"ChatOp","version":"1.2"},"status":"running","timing":{"started_at":"2025-09-17T08:13:00Z","duration_ms":300},"outputs":[],"delta_outputs":[],"warnings":[],"errors":[],"event":{"type":"token","sequence":2},"step":{"id":"step3","type":"LLM","index":2},"token":{"text":" How can I assist you today?"}}
{"request_id":"d4a5a777-7122-4931-91f6-c5ebba2db57d","assistant":{"assistant_name":"ChatBot","selected_opcode_id":"chat_llm"},"opcode":{"id":"chat_llm","type":"ChatOp","version":"1.2"},"status":"ok","timing":{"started_at":"2025-09-17T08:13:00Z","duration_ms":1600},"outputs":[{"step_id":"step3","step_type":"LLM","label":"reply","data":"Hello there! How can I assist you today?","truncated":false}],"delta_outputs":[],"warnings":[],"errors":[]}
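Assuming token frames always carry event.type == "token", an event.sequence number, and token.text as in this example, the streamed reply can be reassembled client-side; the helper below is an illustrative sketch:

```python
import json

def assemble_reply(ndjson_lines):
    """Rebuild the streamed reply text from token events.

    Token frames are identified by event.type == "token"; sorting by
    event.sequence guards against out-of-order delivery. Non-token
    lines (e.g. the final envelope) are skipped.
    """
    frames = []
    for line in ndjson_lines:
        evt = json.loads(line)
        if evt.get("event", {}).get("type") == "token":
            frames.append((evt["event"]["sequence"], evt["token"]["text"]))
    return "".join(text for _, text in sorted(frames))
```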
Code Samples
:: Final (non-streaming)
curl -X POST "https://dev-bex.coolriots.ai/projecto/execute/api/v2/assistant/execute" ^
-H "Content-Type: application/json" ^
-d "{
\"org_id\": \"665467d84b7c867c744381a0\",
\"sub_org_id\": \"66f0ff588c5ada289b447571\",
\"assistant_name\": \"Bexinsight_Payload_Generator\",
\"mode\": \"Specific\",
\"opcode_id\": \"lead_qualifier_opcode\",
\"context\": { \"message\": \"Hi\" },
\"stream\": false,
\"stream_cumulative\": false,
\"emit_token_events\": true,
\"stream_buffer_sentences\": false
}"
:: Streaming (NDJSON)
curl -N --no-buffer -X POST "https://dev-bex.coolriots.ai/projecto/execute/api/v2/assistant/execute" ^
-H "Content-Type: application/json" ^
-d "{
\"org_id\": \"665467d84b7c867c744381a0\",
\"sub_org_id\": \"66f0ff588c5ada289b447571\",
\"assistant_name\": \"ServiceDesk_Assistant\",
\"mode\": \"Full\",
\"context\": { \"user_query\": \"Reset my password\" },
\"stream\": true,
\"stream_cumulative\": true,
\"emit_token_events\": true,
\"stream_buffer_sentences\": true
}"
Errors & Rate Limits
Error envelope: Responses share a common envelope. On failures, status is "error" and details appear in errors[].
{
"request_id": "7b1bd71f-7f91-4ff2-8a2d-8a9c0f0e8e12",
"assistant": { "assistant_name": "ServiceDesk_Assistant" },
"opcode": { "id": "selected_opcode_id_if_any" },
"status": "error",
"timing": { "started_at": "2025-09-17T08:15:00Z", "duration_ms": 0 },
"outputs": [],
"delta_outputs": [],
"warnings": [],
"errors": [
{ "code": "ASSISTANT_EXEC_ERROR", "type": "RuntimeError", "message": "Unhandled exception during execution", "retryable": false, "time": "2025-09-17T08:15:00Z", "step": null }
]
}
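The retryable flag in errors[] lets a client decide whether a failed request is worth retrying. A minimal sketch, assuming the envelope shape above (the policy of requiring every error to be retryable is our own choice):

```python
def should_retry(envelope: dict) -> bool:
    """Retry only when the envelope failed and every error is retryable."""
    if envelope.get("status") != "error":
        return False
    errors = envelope.get("errors", [])
    return bool(errors) and all(e.get("retryable", False) for e in errors)
```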
| Rule | Typical value | Notes |
|---|---|---|
| Requests per second | 25 req/s | Across all endpoints unless overridden by deployment. |
| Requests per minute | 500 req/min | Bursts above this may receive HTTP 429. |
| Cooldown | 30 s | Retry after the indicated window in the response headers/body. |
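To stay under these typical limits, a client can throttle itself before sending requests. The token bucket below is an illustrative sketch, not part of any BeX SDK; the clock is injectable so the behaviour is deterministic in tests:

```python
import time

class TokenBucket:
    """Client-side limiter for staying under the documented 25 req/s.

    Tokens refill continuously at `rate` per second up to `capacity`;
    a request is allowed only when a whole token is available.
    """

    def __init__(self, rate=25.0, capacity=25.0, clock=None):
        self.rate, self.capacity = rate, capacity
        self.clock = clock or time.monotonic
        self.tokens = capacity
        self.last = self.clock()

    def allow(self) -> bool:
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

On a 429, also honor the cooldown window indicated in the response before resuming.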