Assistant API v2
Final & Streaming (NDJSON) • Developer Docs

Overview

What are BeX AI Assistants? BeX AI Assistants are AI agents built to power automation and natural conversations. They interact with users, understand intent, and execute the most suitable workflow — all orchestrated by the BeX AI Engine. A single Assistant can contain and manage many workflows.

How do we use the Assistant APIs? The most flexible way is to call the Assistant API and pass user inputs as context keys. Context keys are a free‑shape JSON dictionary containing the fields your workflow expects. You fully control the schema. For example, if you design a calculator workflow, your context might look like:

Context Example — Calculator
{
  "first_num" : 9,
  "second_num" : 8,
  "operation" : "sum"
}

Execution modes. Assistants support three ways to decide which workflow runs. In Default mode, you send mode: "Specific" without an opcode_id; if the Assistant has a configured default workflow, that workflow runs. In Specific mode, you provide an opcode_id to run a particular workflow. In Full mode, you set mode: "Full" and the Assistant chooses the best-matching workflow based on your context keys, which is useful when multiple workflows could apply.

Streaming vs non‑streaming. You decide whether to receive one final snapshot or a stream of events as work completes. With stream: true, the API returns newline‑delimited JSON (NDJSON) where each line is an event. This is ideal when you want to render progress or partial results. With stream: false, you receive a single JSON response only after all steps complete.

Consider a hiring assistant that processes a candidate’s CV in two steps: (1) analyze the CV and propose department fit with a qualification score; (2) draft a final HR‑ready message. If stream is true, you’ll see the analysis output as soon as it’s ready and later receive the final message. If stream is false, the API returns once both steps finish.

When streaming is enabled, you can further tune how events arrive: enable cumulative events to include everything seen so far; request token events from LLM steps to display text as it’s generated; and enable buffered sentences so token streams arrive as readable sentences instead of raw sub‑word pieces.
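
As a rough sketch in Python (field names taken from the sample envelopes later in this document; everything else is illustrative, not a prescribed client), splitting the NDJSON stream into events might look like this:

```python
import json

def iter_ndjson(lines):
    """Parse an NDJSON stream: one JSON event per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)

def classify(event):
    """Distinguish token frames from delta/final snapshots.

    Keys follow the sample envelopes shown in this document.
    """
    if event.get("event", {}).get("type") == "token":
        return ("token", event["token"]["text"])
    if event.get("status") == "ok":
        return ("final", event.get("outputs", []))
    return ("delta", event.get("delta_outputs", []))
```

A renderer can then show token text immediately and replace it with the final snapshot when status flips to "ok".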

“Remember me”. Assistants can remember previous executions to improve future runs. Include a Memory step in your workflow design and pass a stable memory_log_id (any unique string) in your requests. Memory is currently scoped to the workflow level.
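
For illustration, a request that opts into memory simply carries a caller-chosen ID. All values below are hypothetical placeholders, not real IDs:

```json
{
  "org_id": "your_org_id",
  "sub_org_id": "your_sub_org_id",
  "assistant_name": "Support_Assistant",
  "mode": "Specific",
  "opcode_id": "ticket_triage_flow",
  "context": { "message": "My order hasn't arrived yet." },
  "memory_log_id": "user-4711-support",
  "stream": false
}
```

Reusing the same memory_log_id on later requests lets the workflow's Memory step retrieve what it stored earlier.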

Parameters (request body)

Field                     Type      Required      Default    Short description
org_id                    string    Yes           -          Organization ID.
sub_org_id                string    Yes           -          Sub-organization ID.
assistant_name            string    Yes           -          Assistant identifier (assistant_id).
mode                      enum      Yes           Specific   "Specific" (default) or "Full". See Overview.
opcode_id                 string    Conditional   -          Required when mode=Specific unless a default workflow is configured.
context                   object    Yes           -          Free-shape JSON holding your input fields.
memory_log_id             string    No            -          Stable ID to persist/retrieve workflow memory.
stream                    boolean   No            false      Enable NDJSON streaming of events.
stream_cumulative         boolean   No            false      Include cumulative outputs in each event.
emit_token_events         boolean   No            true       Emit per-token frames from LLM steps (when enabled in the workflow).
stream_buffer_sentences   boolean   No            false      Buffer tokens into readable sentences.

Requests & Responses

Base URL & Endpoint

Base URL: https://{host}/projecto/execute/api
Execute: POST /v2/assistant/execute
Example: POST https://dev-bex.coolriots.ai/projecto/execute/api/v2/assistant/execute

Final responses use Content-Type: application/json. Streams use application/x-ndjson (one JSON object per line).

Response Envelope (Final Snapshot)
JSON
{
  "request_id": "uuid",
  "assistant": { "assistant_name": "...", "selected_opcode_id": "..." },
  "opcode": { "id": "...", "type": "AutoOp|ChatOp", "version": "..." },
  "status": "ok|error",
  "timing": { "started_at": "ISO8601", "duration_ms": 0 },
  "outputs": [ { "step_id": "...", "step_type": "LLM|Non-LLM", "label": "...", "data": { /* any */ }, "truncated": false } ],
  "delta_outputs": [],
  "warnings": [],
  "errors": [ /* if any */ ]
}
A) Default Mode (non‑streaming)
Request
POST https://dev-bex.coolriots.ai/projecto/execute/api/v2/assistant/execute
Content-Type: application/json

{
  "org_id": "665467d84b7c867c744381a0",
  "sub_org_id": "66f0ff588c5ada289b447571",
  "assistant_name": "Bexinsight_Payload_Generator",
  "mode": "Specific",
  "context": { "message": "Hi, I'm interested in your product.", "email": "[email protected]" },
  "stream": false,
  "stream_cumulative": false,
  "emit_token_events": true,
  "stream_buffer_sentences": false
}
Response (200)
{
  "request_id": "b8b4a8d3-9f31-44a2-9cd6-68ac8d5e60ab",
  "assistant": { "assistant_name": "Bexinsight_Payload_Generator", "selected_opcode_id": "default_workflow_id" },
  "opcode": { "id": "default_workflow_id", "type": "AutoOp", "version": "1.0.0" },
  "status": "ok",
  "timing": { "started_at": "2025-09-17T08:10:00Z", "duration_ms": 1520 },
  "outputs": [ { "step_id": "step1", "step_type": "LLM", "label": "greeting", "data": "Hello! How can I help you today?", "truncated": false } ],
  "delta_outputs": [],
  "warnings": [],
  "errors": []
}
B) Specific Mode (non‑streaming)
Request
POST https://dev-bex.coolriots.ai/projecto/execute/api/v2/assistant/execute
Content-Type: application/json

{
  "org_id": "665467d84b7c867c744381a0",
  "sub_org_id": "66f0ff588c5ada289b447571",
  "assistant_name": "Bexinsight_Payload_Generator",
  "mode": "Specific",
  "opcode_id": "lead_qualifier_opcode",
  "context": { "message": "Hi" },
  "stream": false,
  "stream_cumulative": false,
  "emit_token_events": true,
  "stream_buffer_sentences": false
}
Response (200)
{
  "request_id": "7b1bd71f-7f91-4ff2-8a2d-8a9c0f0e8e12",
  "assistant": { "assistant_name": "Bexinsight_Payload_Generator", "selected_opcode_id": "lead_qualifier_opcode" },
  "opcode": { "id": "lead_qualifier_opcode", "type": "AutoOp", "version": "1.0.0" },
  "status": "ok",
  "timing": { "started_at": "2025-09-17T08:15:00Z", "duration_ms": 980 },
  "outputs": [ /* workflow outputs */ ],
  "delta_outputs": [],
  "warnings": [],
  "errors": []
}
C) Full Mode — Streaming (no tokens, non‑cumulative)
Request
POST https://dev-bex.coolriots.ai/projecto/execute/api/v2/assistant/execute
Content-Type: application/json

{
  "org_id": "665467d84b7c867c744381a0",
  "sub_org_id": "66f0ff588c5ada289b447571",
  "assistant_name": "ServiceDesk_Assistant",
  "mode": "Full",
  "context": { "user_query": "Reset my password" },
  "stream": true,
  "stream_cumulative": false,
  "emit_token_events": false,
  "stream_buffer_sentences": false
}
Response (NDJSON — 2 lines)
{"request_id":"42abf1c7-7fae-4c71-90f7-57a11cc154d8","assistant":{"assistant_name":"ServiceDesk_Assistant","selected_opcode_id":"pwd_reset_flow"},"opcode":{"id":"pwd_reset_flow","type":"AutoOp","version":"2.0"},"status":"running","timing":{"started_at":"2025-09-17T08:11:00Z","duration_ms":200},"outputs":[],"delta_outputs":[{"step_id":"step2","step_type":"Non-LLM","label":"lookup_user","data":{"exists":true},"truncated":false}],"warnings":[],"errors":[]}
{"request_id":"42abf1c7-7fae-4c71-90f7-57a11cc154d8","assistant":{"assistant_name":"ServiceDesk_Assistant","selected_opcode_id":"pwd_reset_flow"},"opcode":{"id":"pwd_reset_flow","type":"AutoOp","version":"2.0"},"status":"ok","timing":{"started_at":"2025-09-17T08:11:00Z","duration_ms":880},"outputs":[{"step_id":"step2","step_type":"Non-LLM","label":"lookup_user","data":{"exists":true},"truncated":false},{"step_id":"step3","step_type":"LLM","label":"final_message","data":"We've sent a reset link to your email.","truncated":false}],"delta_outputs":[],"warnings":[],"errors":[]}
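
Because stream_cumulative is false here, each line carries only newly produced step outputs in delta_outputs, and the client merges them itself. A minimal sketch, assuming the events are already parsed into dicts:

```python
def merge_outputs(events):
    """Fold per-event delta_outputs (plus any final outputs) into one
    step_id-keyed collection; later events win on conflicts."""
    merged = {}
    for event in events:
        for out in event.get("delta_outputs", []) + event.get("outputs", []):
            merged[out["step_id"]] = out
    return merged
```

With stream_cumulative: true this bookkeeping is unnecessary, since each event already repeats everything seen so far.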
D) Full Mode — Streaming with tokens (buffered sentences)
Request
POST https://dev-bex.coolriots.ai/projecto/execute/api/v2/assistant/execute
Content-Type: application/json

{
  "org_id": "665467d84b7c867c744381a0",
  "sub_org_id": "66f0ff588c5ada289b447571",
  "assistant_name": "ChatBot",
  "mode": "Full",
  "context": { "message": "Hello" },
  "stream": true,
  "stream_cumulative": true,
  "emit_token_events": true,
  "stream_buffer_sentences": true
}
Response (NDJSON — 3 lines)
{"request_id":"d4a5a777-7122-4931-91f6-c5ebba2db57d","assistant":{"assistant_name":"ChatBot","selected_opcode_id":"chat_llm"},"opcode":{"id":"chat_llm","type":"ChatOp","version":"1.2"},"status":"running","timing":{"started_at":"2025-09-17T08:13:00Z","duration_ms":150},"outputs":[],"delta_outputs":[],"warnings":[],"errors":[],"event":{"type":"token","sequence":1},"step":{"id":"step3","type":"LLM","index":2},"token":{"text":"Hello there!"}}
{"request_id":"d4a5a777-7122-4931-91f6-c5ebba2db57d","assistant":{"assistant_name":"ChatBot","selected_opcode_id":"chat_llm"},"opcode":{"id":"chat_llm","type":"ChatOp","version":"1.2"},"status":"running","timing":{"started_at":"2025-09-17T08:13:00Z","duration_ms":300},"outputs":[],"delta_outputs":[],"warnings":[],"errors":[],"event":{"type":"token","sequence":2},"step":{"id":"step3","type":"LLM","index":2},"token":{"text":" How can I assist you today?"}}
{"request_id":"d4a5a777-7122-4931-91f6-c5ebba2db57d","assistant":{"assistant_name":"ChatBot","selected_opcode_id":"chat_llm"},"opcode":{"id":"chat_llm","type":"ChatOp","version":"1.2"},"status":"ok","timing":{"started_at":"2025-09-17T08:13:00Z","duration_ms":1600},"outputs":[{"step_id":"step3","step_type":"LLM","label":"reply","data":"Hello there! How can I assist you today?","truncated":false}],"delta_outputs":[],"warnings":[],"errors":[]}
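
With buffered sentences enabled, each token frame already holds a readable chunk, so reassembling the reply is a simple concatenation. A sketch over parsed events (field names as in the sample above):

```python
def assemble_reply(events, step_id):
    """Join token frames belonging to one LLM step into the full reply."""
    return "".join(
        event["token"]["text"]
        for event in events
        if event.get("event", {}).get("type") == "token"
        and event.get("step", {}).get("id") == step_id
    )
```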

Code Samples

cURL — Final & Streaming
# Final (non-streaming)
curl -X POST "https://dev-bex.coolriots.ai/projecto/execute/api/v2/assistant/execute" \
  -H "Content-Type: application/json" \
  -d '{
    "org_id": "665467d84b7c867c744381a0",
    "sub_org_id": "66f0ff588c5ada289b447571",
    "assistant_name": "Bexinsight_Payload_Generator",
    "mode": "Specific",
    "opcode_id": "lead_qualifier_opcode",
    "context": { "message": "Hi" },
    "stream": false,
    "stream_cumulative": false,
    "emit_token_events": true,
    "stream_buffer_sentences": false
  }'

# Streaming (NDJSON); -N disables output buffering so lines arrive as emitted
curl -N -X POST "https://dev-bex.coolriots.ai/projecto/execute/api/v2/assistant/execute" \
  -H "Content-Type: application/json" \
  -d '{
    "org_id": "665467d84b7c867c744381a0",
    "sub_org_id": "66f0ff588c5ada289b447571",
    "assistant_name": "ServiceDesk_Assistant",
    "mode": "Full",
    "context": { "user_query": "Reset my password" },
    "stream": true,
    "stream_cumulative": true,
    "emit_token_events": true,
    "stream_buffer_sentences": true
  }'
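
The same calls can be made from Python with only the standard library. This is a sketch: no authentication headers are shown because this document does not specify any, and error handling is omitted for brevity.

```python
import json
import urllib.request

BASE_URL = "https://dev-bex.coolriots.ai/projecto/execute/api"

def build_payload(org_id, sub_org_id, assistant_name, context,
                  mode="Specific", opcode_id=None, stream=False):
    """Assemble a request body per the Parameters table."""
    payload = {
        "org_id": org_id,
        "sub_org_id": sub_org_id,
        "assistant_name": assistant_name,
        "mode": mode,
        "context": context,
        "stream": stream,
    }
    if opcode_id is not None:
        payload["opcode_id"] = opcode_id
    return payload

def execute(payload):
    """POST to the execute endpoint; collect NDJSON lines when streaming."""
    req = urllib.request.Request(
        f"{BASE_URL}/v2/assistant/execute",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        if payload["stream"]:
            return [json.loads(line) for line in resp if line.strip()]
        return json.load(resp)
```

For true incremental rendering, iterate over the streaming response line by line instead of collecting it into a list.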

Errors & Rate Limits

Error envelope: Responses share a common envelope. On failures, status is "error" and details appear in errors[].

Error Shape
{
  "request_id": "7b1bd71f-7f91-4ff2-8a2d-8a9c0f0e8e12",
  "assistant": { "assistant_name": "ServiceDesk_Assistant" },
  "opcode": { "id": "selected_opcode_id_if_any" },
  "status": "error",
  "timing": { "started_at": "2025-09-17T08:15:00Z", "duration_ms": 0 },
  "outputs": [],
  "delta_outputs": [],
  "warnings": [],
  "errors": [
    { "code": "ASSISTANT_EXEC_ERROR", "type": "RuntimeError", "message": "Unhandled exception during execution", "retryable": false, "time": "2025-09-17T08:15:00Z", "step": null }
  ]
}
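
Since each error entry carries a retryable flag, a client can decide whether a retry is worthwhile. A small helper as a sketch (the UPSTREAM_TIMEOUT code in the test is a hypothetical example, not a documented code):

```python
def first_retryable_error(envelope):
    """Return the first retryable error from an error envelope, else None."""
    if envelope.get("status") != "error":
        return None
    for err in envelope.get("errors", []):
        if err.get("retryable"):
            return err
    return None
```
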
Rate Limits
Rule                  Typical value   Notes
Requests per second   25 req/s        Across all endpoints unless overridden by deployment.
Requests per minute   500 req/min     Bursts above this may receive HTTP 429.
Cooldown              30 s            Retry after the indicated window in the response headers/body.
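
On HTTP 429, back off exponentially and cap the wait at the documented 30 s cooldown (honoring a Retry-After header if one is present). A minimal delay-schedule sketch:

```python
def backoff_delays(base=1.0, cap=30.0, attempts=5):
    """Yield exponential backoff delays, capped at the 30 s cooldown."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay = min(delay * 2, cap)
```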