Text Module

The text module provides pure-JS text processing utilities, with no external dependencies.

parseUrl

Generic URL parser. Extracts standard URL parts (domain, path, query) plus Prisme.ai-specific fields (workspaceId, file id/filename).

- run:
    module: text
    function: parseUrl
    parameters:
      url: "{{uploaded_file.url}}"
    output: parsed
# parsed.domain      → "api.prisme.ai"
# parsed.path        → "/v2/files/ws123/abc.report.pdf"
# parsed.id          → "abc"
# parsed.filename    → "report.pdf"
# parsed.ext         → "pdf"
# parsed.workspaceId → "ws123"
# parsed.mimetype    → "application/pdf"
# parsed.query       → { "token": "xyz" }

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `url` | string | yes | | Any URL (or raw string) |

Returns an object with the following fields:

| Field | Type | Description |
| --- | --- | --- |
| `domain` | string | Hostname (e.g. `api.prisme.ai`). Empty for non-URLs. |
| `path` | string | Full pathname (e.g. `/v2/files/ws123/abc.report.pdf`). |
| `id` | string | From the last path segment: everything before the first dot, only when 2+ dots are present. Empty otherwise. |
| `filename` | string | From the last path segment: everything after the first dot when 2+ dots are present. With one dot or no dot, equals the full segment. |
| `ext` | string | Lowercase file extension from the filename (e.g. `pdf`, `xlsx`). Empty if none. |
| `workspaceId` | string | Extracted from `/files/{wsId}/…` or `/workspaces/{wsId}/…`. Empty if not found. |
| `mimetype` | string | MIME type inferred from the file extension (e.g. `application/pdf`). |
| `query` | object | Query string parameters as key-value pairs. |
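
To make the field descriptions concrete, here is a rough Python approximation of how the standard parts and `workspaceId` could be derived. It is only a sketch, not the module's actual implementation: the function name `parse_url_basics`, the regular expression, the `https` scheme of the sample URL, and the use of Python's `mimetypes` table are all assumptions for illustration. The `id`, `filename` and `ext` fields are left out here.

```python
import mimetypes
import re
from urllib.parse import parse_qsl, urlsplit


def parse_url_basics(url):
    """Approximate the domain, path, workspaceId, mimetype and query fields."""
    parts = urlsplit(url)
    # Assumed pattern: the workspace id is the path segment that directly
    # follows /files/ or /workspaces/.
    match = re.search(r"/(?:files|workspaces)/([^/]+)", parts.path)
    # Infer the MIME type from the extension of the path, if any.
    mimetype, _ = mimetypes.guess_type(parts.path)
    return {
        "domain": parts.hostname or "",  # empty for non-URL strings
        "path": parts.path,
        "workspaceId": match.group(1) if match else "",
        "mimetype": mimetype or "",
        "query": dict(parse_qsl(parts.query)),
    }


# Sample URL assembled from the example values shown above (scheme assumed).
parsed = parse_url_basics(
    "https://api.prisme.ai/v2/files/ws123/abc.report.pdf?token=xyz"
)
print(parsed["domain"], parsed["workspaceId"], parsed["query"])
```

A raw string such as `README` has no hostname, so `domain` comes back empty while `path` holds the string itself.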

How `id` / `filename` splitting works

The last path segment is parsed based on the number of dots:

- 2+ dots (`{id}.{name}.{ext}`): `id` is everything before the first dot, `filename` is the rest.
- 1 dot (`{name}.{ext}`): the whole segment is the `filename`, no `id`.
- 0 dots: the whole segment is the `filename`, no `id`.

| Last segment | `id` | `filename` |
| --- | --- | --- |
| `abc123.report.pdf` | `abc123` | `report.pdf` |
| `doc.pdf` | (empty) | `doc.pdf` |
| `README` | (empty) | `README` |
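
The dot-counting rule above can be sketched in a few lines of Python. This is an illustrative reimplementation of the documented behavior, not the module's source. The helper name `split_last_segment` is hypothetical, and taking `ext` as the lowercase text after the last dot of the filename is an assumption.

```python
def split_last_segment(path):
    """Derive id, filename and ext from the last path segment."""
    segment = path.rsplit("/", 1)[-1]
    if segment.count(".") >= 2:
        # {id}.{name}.{ext}: cut once, at the first dot.
        file_id, filename = segment.split(".", 1)
    else:
        # One dot or none: the whole segment is the filename, no id.
        file_id, filename = "", segment
    # Assumed: ext is whatever follows the last dot, lowercased.
    ext = filename.rsplit(".", 1)[1].lower() if "." in filename else ""
    return {"id": file_id, "filename": filename, "ext": ext}


for segment in ("abc123.report.pdf", "doc.pdf", "README"):
    print(segment, split_last_segment(segment))
```

Running it on the three segments from the table gives the same `id` and `filename` values as the table.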

Examples

Extract a file ID from a native upload URL:

- run:
    module: text
    function: parseUrl
    parameters:
      url: "{{file_part.url}}"
    output: _parsed
- set:
    name: file_id
    value: "{{_parsed.id}}"

Extract workspace ID from any Prisme.ai URL:

- run:
    module: text
    function: parseUrl
    parameters:
      url: "{{webhook_url}}"
    output: _parsed
- set:
    name: ws_id
    value: "{{_parsed.workspaceId}}"

splitText

Splits text into chunks using a recursive character-splitting strategy. The splitter tries the separators in order and splits on the first one found in the text. It then merges small pieces back together up to `chunkSize`, keeps `chunkOverlap` characters of overlap between consecutive chunks, and recurses with the finer separators on any piece that is still too large.

- run:
    module: text
    function: splitText
    parameters:
      content: "{{document.text}}"
      chunkSize: 1000
      chunkOverlap: 200
    output: chunks

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `content` | string \| string[] | yes | | Text or array of texts to split |
| `chunkSize` | number | yes | | Maximum size of each chunk (in characters) |
| `chunkOverlap` | number | yes | | Number of overlapping characters between consecutive chunks |
| `separators` | string[] | no | `["\n\n", "\n", " ", ""]` | Ordered list of separators to try, from coarsest to finest |
| `keepSeparator` | boolean \| `"start"` \| `"end"` | no | `false` | Attach the separator to the chunk. `true` or `"end"` appends it to the preceding chunk, `"start"` prepends it to the following chunk. Only visible with non-whitespace separators (whitespace is trimmed). |

Returns an array of { content, size } objects:

[
  { "content": "First chunk text...", "size": 253 },
  { "content": "Second chunk text...", "size": 241 }
]
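
For readers who want to see the recursive strategy spelled out, the following Python sketch mimics the behavior described at the top of this section. It is a simplified illustration under stated assumptions, not the module's code: the names `split_text`, `_merge` and `split_to_records` are hypothetical, `keepSeparator` is not handled, and `size` is assumed to be the character length of the trimmed chunk.

```python
def _merge(pieces, separator, chunk_size, chunk_overlap):
    """Greedily join pieces into chunks of at most chunk_size characters."""
    chunks, window, total = [], [], 0  # total == len(separator.join(window))
    for piece in pieces:
        extra = len(piece) + (len(separator) if window else 0)
        if window and total + extra > chunk_size:
            chunks.append(separator.join(window))
            # Drop pieces from the front until the carried-over tail fits in
            # chunk_overlap and leaves room for the next piece.
            while window and (
                total > chunk_overlap
                or total + len(piece) + len(separator) > chunk_size
            ):
                total -= len(window[0]) + (len(separator) if len(window) > 1 else 0)
                window.pop(0)
        window.append(piece)
        total += len(piece) + (len(separator) if len(window) > 1 else 0)
    if window:
        chunks.append(separator.join(window))
    return chunks


def split_text(text, chunk_size, chunk_overlap, separators=("\n\n", "\n", " ", "")):
    """Split on the first separator found, recursing with finer ones as needed."""
    for index, separator in enumerate(separators):
        if separator == "" or separator in text:
            break
    finer = separators[index + 1:]
    pieces = list(text) if separator == "" else [p for p in text.split(separator) if p]

    chunks, small = [], []
    for piece in pieces:
        if len(piece) <= chunk_size:
            small.append(piece)
            continue
        # Flush the small pieces gathered so far, then handle the oversized one.
        chunks += _merge(small, separator, chunk_size, chunk_overlap)
        small = []
        if finer:
            chunks += split_text(piece, chunk_size, chunk_overlap, finer)
        else:
            chunks.append(piece)  # no finer separator left to try
    chunks += _merge(small, separator, chunk_size, chunk_overlap)
    return chunks


def split_to_records(content, chunk_size, chunk_overlap):
    """Accept a string or a list of strings and return {content, size} records."""
    texts = [content] if isinstance(content, str) else content
    records = []
    for text in texts:
        for chunk in split_text(text, chunk_size, chunk_overlap):
            chunk = chunk.strip()
            if chunk:
                records.append({"content": chunk, "size": len(chunk)})
    return records


print(split_to_records("aaa bbb ccc ddd", chunk_size=7, chunk_overlap=3))
```

With `chunkSize` 7 and `chunkOverlap` 3, the sample text yields `aaa bbb`, `bbb ccc` and `ccc ddd`: each chunk repeats the last word of the previous one, because that word is the largest tail that fits in the overlap budget.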

Split with custom separators (e.g. Markdown headings)

- run:
    module: text
    function: splitText
    parameters:
      content: "{{document.text}}"
      chunkSize: 1500
      chunkOverlap: 100
      separators:
        - "\n## "
        - "\n### "
        - "\n\n"
        - "\n"
        - " "
        - ""
      keepSeparator: start
    output: chunks
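
To illustrate what `keepSeparator` changes for a heading separator such as `"\n## "`, here is a small Python sketch of a single split step. The helper `split_keep` is hypothetical and only models the documented attach-then-trim behavior. It does not merge pieces or enforce `chunkSize`.

```python
def split_keep(text, separator, keep=False):
    """Split once on separator, optionally re-attaching it to the pieces."""
    parts = text.split(separator)
    if keep == "start":
        # Prepend the separator to every piece that followed one.
        pieces = [parts[0]] + [separator + part for part in parts[1:]]
    elif keep is True or keep == "end":
        # Append the separator to every piece that preceded one.
        pieces = [part + separator for part in parts[:-1]] + [parts[-1]]
    else:
        pieces = parts
    # Surrounding whitespace is trimmed, so only the non-whitespace part of
    # the separator remains visible.
    return [piece.strip() for piece in pieces if piece.strip()]


doc = "Intro\n## Setup\nsteps\n## Usage\nmore"
print(split_keep(doc, "\n## ", keep="start"))
print(split_keep(doc, "\n## ", keep=False))
```

With `keep="start"` each piece after the first begins with its own `##` marker, which is usually what you want for Markdown sections. With `keep=False` the heading markers are dropped.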

Iterate over chunks

- run:
    module: text
    function: splitText
    parameters:
      content: "{{document.text}}"
      chunkSize: 500
      chunkOverlap: 50
    output: chunks
- repeat:
    on: "{{chunks}}"
    do:
      - emit:
          event: chunk-ready
          payload:
            text: "{{item.content}}"
            size: "{{item.size}}"
