Assistant API v2
Final & Streaming (NDJSON) • Developer Docs

Overview

What are BeX AI Assistants? BeX AI Assistants are AI agents built to power automation and natural conversations. They interact with users, understand intent, and execute the most suitable workflow — all orchestrated by the BeX AI Engine. A single Assistant can contain and manage many workflows.

How do we use the Assistant APIs? The most flexible way is to call the Assistant API and pass user inputs as context keys. Context keys are a free‑shape JSON dictionary containing the fields your workflow expects. You fully control the schema. For example, if you design a calculator workflow, your context might look like:

Context Example — Calculator
{
  "first_num" : 9,
  "second_num" : 8,
  "operation" : "sum"
}

Execution modes. Assistants support three ways to decide which workflow runs. In Default mode, you send mode: "Specific" without an opcode_id; if the Assistant has a configured default workflow, that workflow runs. In Specific mode, you provide an opcode_id to run a particular workflow. In Full mode, you set mode: "Full" and the Assistant chooses the best-matching workflow based on your context keys, which is useful when multiple workflows could apply.

Streaming vs non‑streaming. You decide whether to receive one final snapshot or a stream of events as work completes. With stream: true, the API returns newline‑delimited JSON (NDJSON) where each line is an event. This is ideal when you want to render progress or partial results. With stream: false, you receive a single JSON response only after all steps complete.

Consider a hiring assistant that processes a candidate’s CV in two steps: (1) analyze the CV and propose department fit with a qualification score; (2) draft a final HR‑ready message. If stream is true, you’ll see the analysis output as soon as it’s ready and later receive the final message. If stream is false, the API returns once both steps finish.

When streaming is enabled, you can further tune how events arrive: enable cumulative events to include everything seen so far; request token events from LLM steps to display text as it’s generated; and enable buffered sentences so token streams arrive as readable sentences instead of raw sub‑word pieces.
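
As a rough sketch in Python (field names taken from the sample envelopes later in this document; everything else is illustrative, not a prescribed client), splitting the NDJSON stream into events might look like this:

```python
import json

def iter_ndjson(lines):
    """Parse an NDJSON stream: one JSON event per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)

def classify(event):
    """Distinguish token frames from delta/final snapshots.

    Keys follow the sample envelopes shown in this document.
    """
    if event.get("event", {}).get("type") == "token":
        return ("token", event["token"]["text"])
    if event.get("status") == "ok":
        return ("final", event.get("outputs", []))
    return ("delta", event.get("delta_outputs", []))
```

A renderer can then show token text immediately and replace it with the final snapshot when status flips to "ok".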

“Remember me”. Assistants can remember previous executions to improve future runs. Include a Memory step in your workflow design and pass a stable memory_log_id (any unique string) in your requests. Memory is currently scoped to the workflow level.
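
For illustration, a request that opts into memory simply carries a caller-chosen ID. All values below are hypothetical placeholders, not real IDs:

```json
{
  "org_id": "your_org_id",
  "sub_org_id": "your_sub_org_id",
  "assistant_name": "Support_Assistant",
  "mode": "Specific",
  "opcode_id": "ticket_triage_flow",
  "context": { "message": "My order hasn't arrived yet." },
  "memory_log_id": "user-4711-support",
  "stream": false
}
```

Reusing the same memory_log_id on later requests lets the workflow's Memory step retrieve what it stored earlier.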

Parameters (request body)

Field                     Type      Required      Default    Short description
org_id                    string    Yes           -          Organization ID.
sub_org_id                string    Yes           -          Sub-organization ID.
assistant_name            string    Yes           -          Assistant identifier (assistant_id).
mode                      enum      Yes           Specific   "Specific" (default) or "Full". See Overview.
opcode_id                 string    Conditional   -          Required when mode=Specific unless a default workflow is configured.
context                   object    Yes           -          Free-shape JSON holding your input fields.
memory_log_id             string    No            -          Stable ID to persist/retrieve workflow memory.
stream                    boolean   No            false      Enable NDJSON streaming of events.
stream_cumulative         boolean   No            false      Include cumulative outputs in each event.
emit_token_events         boolean   No            true       Emit per-token frames from LLM steps (when enabled in the workflow).
stream_buffer_sentences   boolean   No            false      Buffer tokens into readable sentences.

Requests & Responses

Base URL & Endpoint

Base URL: https://{host}/projecto/execute/api
Execute: POST /v2/assistant/execute
Example: POST https://dev-bex.coolriots.ai/projecto/execute/api/v2/assistant/execute

Final responses use Content-Type: application/json. Streams use application/x-ndjson (one JSON object per line).

Response Envelope (Final Snapshot)
JSON
{
  "request_id": "uuid",
  "assistant": { "assistant_name": "...", "selected_opcode_id": "..." },
  "opcode": { "id": "...", "type": "AutoOp|ChatOp", "version": "..." },
  "status": "ok|error",
  "timing": { "started_at": "ISO8601", "duration_ms": 0 },
  "outputs": [ { "step_id": "...", "step_type": "LLM|Non-LLM", "label": "...", "data": { /* any */ }, "truncated": false } ],
  "delta_outputs": [],
  "warnings": [],
  "errors": [ /* if any */ ]
}
A) Default Mode (non‑streaming)
Request
POST https://dev-bex.coolriots.ai/projecto/execute/api/v2/assistant/execute
Content-Type: application/json

{
  "org_id": "665467d84b7c867c744381a0",
  "sub_org_id": "66f0ff588c5ada289b447571",
  "assistant_name": "Bexinsight_Payload_Generator",
  "mode": "Specific",
  "context": { "message": "Hi, I'm interested in your product.", "email": "[email protected]" },
  "stream": false,
  "stream_cumulative": false,
  "emit_token_events": true,
  "stream_buffer_sentences": false
}
Response (200)
{
  "request_id": "b8b4a8d3-9f31-44a2-9cd6-68ac8d5e60ab",
  "assistant": { "assistant_name": "Bexinsight_Payload_Generator", "selected_opcode_id": "default_workflow_id" },
  "opcode": { "id": "default_workflow_id", "type": "AutoOp", "version": "1.0.0" },
  "status": "ok",
  "timing": { "started_at": "2025-09-17T08:10:00Z", "duration_ms": 1520 },
  "outputs": [ { "step_id": "step1", "step_type": "LLM", "label": "greeting", "data": "Hello! How can I help you today?", "truncated": false } ],
  "delta_outputs": [],
  "warnings": [],
  "errors": []
}
B) Specific Mode (non‑streaming)
Request
POST https://dev-bex.coolriots.ai/projecto/execute/api/v2/assistant/execute
Content-Type: application/json

{
  "org_id": "665467d84b7c867c744381a0",
  "sub_org_id": "66f0ff588c5ada289b447571",
  "assistant_name": "Bexinsight_Payload_Generator",
  "mode": "Specific",
  "opcode_id": "lead_qualifier_opcode",
  "context": { "message": "Hi" },
  "stream": false,
  "stream_cumulative": false,
  "emit_token_events": true,
  "stream_buffer_sentences": false
}
Response (200)
{
  "request_id": "7b1bd71f-7f91-4ff2-8a2d-8a9c0f0e8e12",
  "assistant": { "assistant_name": "Bexinsight_Payload_Generator", "selected_opcode_id": "lead_qualifier_opcode" },
  "opcode": { "id": "lead_qualifier_opcode", "type": "AutoOp", "version": "1.0.0" },
  "status": "ok",
  "timing": { "started_at": "2025-09-17T08:15:00Z", "duration_ms": 980 },
  "outputs": [ /* workflow outputs */ ],
  "delta_outputs": [],
  "warnings": [],
  "errors": []
}
C) Full Mode — Streaming (no tokens, non‑cumulative)
Request
POST https://dev-bex.coolriots.ai/projecto/execute/api/v2/assistant/execute
Content-Type: application/json

{
  "org_id": "665467d84b7c867c744381a0",
  "sub_org_id": "66f0ff588c5ada289b447571",
  "assistant_name": "ServiceDesk_Assistant",
  "mode": "Full",
  "context": { "user_query": "Reset my password" },
  "stream": true,
  "stream_cumulative": false,
  "emit_token_events": false,
  "stream_buffer_sentences": false
}
Response (NDJSON — 2 lines)
{"request_id":"42abf1c7-7fae-4c71-90f7-57a11cc154d8","assistant":{"assistant_name":"ServiceDesk_Assistant","selected_opcode_id":"pwd_reset_flow"},"opcode":{"id":"pwd_reset_flow","type":"AutoOp","version":"2.0"},"status":"running","timing":{"started_at":"2025-09-17T08:11:00Z","duration_ms":200},"outputs":[],"delta_outputs":[{"step_id":"step2","step_type":"Non-LLM","label":"lookup_user","data":{"exists":true},"truncated":false}],"warnings":[],"errors":[]}
{"request_id":"42abf1c7-7fae-4c71-90f7-57a11cc154d8","assistant":{"assistant_name":"ServiceDesk_Assistant","selected_opcode_id":"pwd_reset_flow"},"opcode":{"id":"pwd_reset_flow","type":"AutoOp","version":"2.0"},"status":"ok","timing":{"started_at":"2025-09-17T08:11:00Z","duration_ms":880},"outputs":[{"step_id":"step2","step_type":"Non-LLM","label":"lookup_user","data":{"exists":true},"truncated":false},{"step_id":"step3","step_type":"LLM","label":"final_message","data":"We've sent a reset link to your email.","truncated":false}],"delta_outputs":[],"warnings":[],"errors":[]}
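
Because stream_cumulative is false here, each line carries only newly produced step outputs in delta_outputs, and the client merges them itself. A minimal sketch, assuming the events are already parsed into dicts:

```python
def merge_outputs(events):
    """Fold per-event delta_outputs (plus any final outputs) into one
    step_id-keyed collection; later events win on conflicts."""
    merged = {}
    for event in events:
        for out in event.get("delta_outputs", []) + event.get("outputs", []):
            merged[out["step_id"]] = out
    return merged
```

With stream_cumulative: true this bookkeeping is unnecessary, since each event already repeats everything seen so far.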
D) Full Mode — Streaming with tokens (buffered sentences)
Request
POST https://dev-bex.coolriots.ai/projecto/execute/api/v2/assistant/execute
Content-Type: application/json

{
  "org_id": "665467d84b7c867c744381a0",
  "sub_org_id": "66f0ff588c5ada289b447571",
  "assistant_name": "ChatBot",
  "mode": "Full",
  "context": { "message": "Hello" },
  "stream": true,
  "stream_cumulative": true,
  "emit_token_events": true,
  "stream_buffer_sentences": true
}
Response (NDJSON — 3 lines)
{"request_id":"d4a5a777-7122-4931-91f6-c5ebba2db57d","assistant":{"assistant_name":"ChatBot","selected_opcode_id":"chat_llm"},"opcode":{"id":"chat_llm","type":"ChatOp","version":"1.2"},"status":"running","timing":{"started_at":"2025-09-17T08:13:00Z","duration_ms":150},"outputs":[],"delta_outputs":[],"warnings":[],"errors":[],"event":{"type":"token","sequence":1},"step":{"id":"step3","type":"LLM","index":2},"token":{"text":"Hello there!"}}
{"request_id":"d4a5a777-7122-4931-91f6-c5ebba2db57d","assistant":{"assistant_name":"ChatBot","selected_opcode_id":"chat_llm"},"opcode":{"id":"chat_llm","type":"ChatOp","version":"1.2"},"status":"running","timing":{"started_at":"2025-09-17T08:13:00Z","duration_ms":300},"outputs":[],"delta_outputs":[],"warnings":[],"errors":[],"event":{"type":"token","sequence":2},"step":{"id":"step3","type":"LLM","index":2},"token":{"text":" How can I assist you today?"}}
{"request_id":"d4a5a777-7122-4931-91f6-c5ebba2db57d","assistant":{"assistant_name":"ChatBot","selected_opcode_id":"chat_llm"},"opcode":{"id":"chat_llm","type":"ChatOp","version":"1.2"},"status":"ok","timing":{"started_at":"2025-09-17T08:13:00Z","duration_ms":1600},"outputs":[{"step_id":"step3","step_type":"LLM","label":"reply","data":"Hello there! How can I assist you today?","truncated":false}],"delta_outputs":[],"warnings":[],"errors":[]}
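
With buffered sentences enabled, each token frame already holds a readable chunk, so reassembling the reply is a simple concatenation. A sketch over parsed events (field names as in the sample above):

```python
def assemble_reply(events, step_id):
    """Join token frames belonging to one LLM step into the full reply."""
    return "".join(
        event["token"]["text"]
        for event in events
        if event.get("event", {}).get("type") == "token"
        and event.get("step", {}).get("id") == step_id
    )
```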

Code Samples

cURL — Final & Streaming
# Final (non-streaming)
curl -X POST "https://dev-bex.coolriots.ai/projecto/execute/api/v2/assistant/execute" \
  -H "Content-Type: application/json" \
  -d '{
    "org_id": "665467d84b7c867c744381a0",
    "sub_org_id": "66f0ff588c5ada289b447571",
    "assistant_name": "Bexinsight_Payload_Generator",
    "mode": "Specific",
    "opcode_id": "lead_qualifier_opcode",
    "context": { "message": "Hi" },
    "stream": false,
    "stream_cumulative": false,
    "emit_token_events": true,
    "stream_buffer_sentences": false
  }'

# Streaming (NDJSON); -N disables output buffering so lines arrive as emitted
curl -N -X POST "https://dev-bex.coolriots.ai/projecto/execute/api/v2/assistant/execute" \
  -H "Content-Type: application/json" \
  -d '{
    "org_id": "665467d84b7c867c744381a0",
    "sub_org_id": "66f0ff588c5ada289b447571",
    "assistant_name": "ServiceDesk_Assistant",
    "mode": "Full",
    "context": { "user_query": "Reset my password" },
    "stream": true,
    "stream_cumulative": true,
    "emit_token_events": true,
    "stream_buffer_sentences": true
  }'
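
The same calls can be made from Python with only the standard library. This is a sketch: no authentication headers are shown because this document does not specify any, and error handling is omitted for brevity.

```python
import json
import urllib.request

BASE_URL = "https://dev-bex.coolriots.ai/projecto/execute/api"

def build_payload(org_id, sub_org_id, assistant_name, context,
                  mode="Specific", opcode_id=None, stream=False):
    """Assemble a request body per the Parameters table."""
    payload = {
        "org_id": org_id,
        "sub_org_id": sub_org_id,
        "assistant_name": assistant_name,
        "mode": mode,
        "context": context,
        "stream": stream,
    }
    if opcode_id is not None:
        payload["opcode_id"] = opcode_id
    return payload

def execute(payload):
    """POST to the execute endpoint; collect NDJSON lines when streaming."""
    req = urllib.request.Request(
        f"{BASE_URL}/v2/assistant/execute",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        if payload["stream"]:
            return [json.loads(line) for line in resp if line.strip()]
        return json.load(resp)
```

For true incremental rendering, iterate over the streaming response line by line instead of collecting it into a list.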

Errors & Rate Limits

Error envelope: Responses share a common envelope. On failures, status is "error" and details appear in errors[].

Error Shape
{
  "request_id": "7b1bd71f-7f91-4ff2-8a2d-8a9c0f0e8e12",
  "assistant": { "assistant_name": "ServiceDesk_Assistant" },
  "opcode": { "id": "selected_opcode_id_if_any" },
  "status": "error",
  "timing": { "started_at": "2025-09-17T08:15:00Z", "duration_ms": 0 },
  "outputs": [],
  "delta_outputs": [],
  "warnings": [],
  "errors": [
    { "code": "ASSISTANT_EXEC_ERROR", "type": "RuntimeError", "message": "Unhandled exception during execution", "retryable": false, "time": "2025-09-17T08:15:00Z", "step": null }
  ]
}
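
Since each error entry carries a retryable flag, a client can decide whether a retry is worthwhile. A small helper as a sketch (the UPSTREAM_TIMEOUT code in the test is a hypothetical example, not a documented code):

```python
def first_retryable_error(envelope):
    """Return the first retryable error from an error envelope, else None."""
    if envelope.get("status") != "error":
        return None
    for err in envelope.get("errors", []):
        if err.get("retryable"):
            return err
    return None
```
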
Rate Limits
Rule                  Typical value   Notes
Requests per second   25 req/s        Across all endpoints unless overridden by deployment.
Requests per minute   500 req/min     Bursts above this may receive HTTP 429.
Cooldown              30 s            Retry after the indicated window in the response headers/body.
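
On HTTP 429, back off exponentially and cap the wait at the documented 30 s cooldown (honoring a Retry-After header if one is present). A minimal delay-schedule sketch:

```python
def backoff_delays(base=1.0, cap=30.0, attempts=5):
    """Yield exponential backoff delays, capped at the 30 s cooldown."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay = min(delay * 2, cap)
```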