
fenic.api.functions.semantic

Semantic functions for Fenic DataFrames - LLM-based operations.

Functions:

  • analyze_sentiment

    Analyzes the sentiment of a string column. Returns one of 'positive', 'negative', or 'neutral'.

  • classify

    Classifies a string column into one of the provided labels.

  • embed

    Generate embeddings for the specified string column.

  • extract

    Extracts structured information from unstructured text using a provided schema.

  • map

    Applies a natural language instruction to one or more text columns, enabling rich summarization and generation tasks.

  • predicate

    Applies a natural language predicate to one or more string columns, returning a boolean result.

  • reduce

    Aggregate function: reduces a set of strings across columns into a single string using a natural language instruction.

analyze_sentiment

analyze_sentiment(column: ColumnOrName, model_alias: Optional[str] = None, temperature: float = 0) -> Column

Analyzes the sentiment of a string column. Returns one of 'positive', 'negative', or 'neutral'.

Parameters:

  • column (ColumnOrName) –

    Column or column name containing text for sentiment analysis.

  • model_alias (Optional[str], default: None ) –

    Optional alias for the language model to use for this operation. If None, uses the default language model.

  • temperature (float, default: 0 ) –

    Temperature parameter for the language model. Defaults to 0.0.

Returns:

  • Column ( Column ) –

    Expression containing sentiment results ('positive', 'negative', or 'neutral').

Raises:

  • ValueError

    If column is invalid or cannot be resolved.

Analyzing the sentiment of a user comment
semantic.analyze_sentiment(col('user_comment'))
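Because results are constrained to exactly three strings, downstream tallying is straightforward. A plain-Python sketch (not a fenic API) assuming the sentiment column has already been materialized into a Python list:

```python
from collections import Counter

# Hypothetical materialized results of semantic.analyze_sentiment
# (the list below is illustrative, not produced by fenic here).
results = ["positive", "neutral", "positive", "negative"]

# Outputs are limited to three labels, so a Counter suffices.
counts = Counter(results)
print(counts["positive"])  # 2
```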
Source code in src/fenic/api/functions/semantic.py
@validate_call(config=ConfigDict(strict=True, arbitrary_types_allowed=True))
def analyze_sentiment(
        column: ColumnOrName,
        model_alias: Optional[str] = None,
        temperature: float = 0,
) -> Column:
    """Analyzes the sentiment of a string column. Returns one of 'positive', 'negative', or 'neutral'.

    Args:
        column: Column or column name containing text for sentiment analysis.
        model_alias: Optional alias for the language model to use for this operation. If None, uses the default language model.
        temperature: Temperature parameter for the language model. Defaults to 0.0.

    Returns:
        Column: Expression containing sentiment results ('positive', 'negative', or 'neutral').

    Raises:
        ValueError: If column is invalid or cannot be resolved.

    Example: Analyzing the sentiment of a user comment
        ```python
        semantic.analyze_sentiment(col('user_comment'))
        ```
    """
    return Column._from_logical_expr(
        AnalyzeSentimentExpr(
            Column._from_col_or_name(column)._logical_expr,
            model_alias=model_alias,
            temperature=temperature,
        )
    )

classify

classify(column: ColumnOrName, labels: List[str] | type[Enum], examples: Optional[ClassifyExampleCollection] = None, model_alias: Optional[str] = None, temperature: float = 0) -> Column

Classifies a string column into one of the provided labels.

This is useful for tagging incoming documents with predefined categories.

Parameters:

  • column (ColumnOrName) –

    Column or column name containing text to classify.

  • labels (List[str] | type[Enum]) –

    List of category strings or an Enum defining the categories to classify the text into.

  • examples (Optional[ClassifyExampleCollection], default: None ) –

    Optional collection of example classifications to guide the model. Examples should be created using ClassifyExampleCollection.create_example(), with instruction variables mapped to their expected classifications.

  • model_alias (Optional[str], default: None ) –

    Optional alias for the language model to use for this operation. If None, uses the default language model.

  • temperature (float, default: 0 ) –

    Temperature parameter for the language model. Defaults to 0.0.

Returns:

  • Column ( Column ) –

    Expression containing the classification results.

Raises:

  • ValueError

    If column is invalid, or labels is empty or not a list of strings or an Enum type.

Categorizing incoming support requests
# Categorize incoming support requests
semantic.classify("message", ["Account Access", "Billing Issue", "Technical Problem"])
Categorizing incoming support requests with examples
examples = ClassifyExampleCollection()
examples.create_example(ClassifyExample(
    input="I can't reset my password or access my account.",
    output="Account Access"))
examples.create_example(ClassifyExample(
    input="You charged me twice for the same month.",
    output="Billing Issue"))
semantic.classify("message", ["Account Access", "Billing Issue", "Technical Problem"], examples)
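Since labels may also be given as an Enum, a sketch of the Enum route (the fenic call is shown as a comment because it needs a configured session; the category names are the same illustrative ones as above):

```python
from enum import Enum

# The same support categories expressed as an Enum instead of a list.
class TicketCategory(Enum):
    ACCOUNT_ACCESS = "Account Access"
    BILLING_ISSUE = "Billing Issue"
    TECHNICAL_PROBLEM = "Technical Problem"

# With a configured session, the Enum type can be passed directly:
# semantic.classify("message", TicketCategory)

labels = [c.value for c in TicketCategory]
print(labels)  # ['Account Access', 'Billing Issue', 'Technical Problem']
```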
Source code in src/fenic/api/functions/semantic.py
@validate_call(config=ConfigDict(strict=True, arbitrary_types_allowed=True))
def classify(
        column: ColumnOrName,
        labels: List[str] | type[Enum],
        examples: Optional[ClassifyExampleCollection] = None,
        model_alias: Optional[str] = None,
        temperature: float = 0,
) -> Column:
    """Classifies a string column into one of the provided labels.

    This is useful for tagging incoming documents with predefined categories.

    Args:
        column: Column or column name containing text to classify.

        labels: List of category strings or an Enum defining the categories to classify the text into.

        examples: Optional collection of example classifications to guide the model.
            Examples should be created using ClassifyExampleCollection.create_example(),
            with instruction variables mapped to their expected classifications.

        model_alias: Optional alias for the language model to use for this operation. If None, uses the default language model.

        temperature: Temperature parameter for the language model. Defaults to 0.0.

    Returns:
        Column: Expression containing the classification results.

    Raises:
        ValueError: If column is invalid, or labels is empty or not a list of strings or an Enum type.

    Example: Categorizing incoming support requests
        ```python
        # Categorize incoming support requests
        semantic.classify("message", ["Account Access", "Billing Issue", "Technical Problem"])
        ```

    Example: Categorizing incoming support requests with examples
        ```python
        examples = ClassifyExampleCollection()
        examples.create_example(ClassifyExample(
            input="I can't reset my password or access my account.",
            output="Account Access"))
        examples.create_example(ClassifyExample(
            input="You charged me twice for the same month.",
            output="Billing Issue"))
        semantic.classify("message", ["Account Access", "Billing Issue", "Technical Problem"], examples)
        ```
    """
    if isinstance(labels, List) and len(labels) == 0:
        raise ValueError(
            f"Must specify the categories for classification, found: {len(labels)} categories"
        )
    return Column._from_logical_expr(
        SemanticClassifyExpr(
            Column._from_col_or_name(column)._logical_expr,
            labels,
            examples=examples,
            model_alias=model_alias,
            temperature=temperature,
        )
    )

embed

embed(column: ColumnOrName, model_alias: Optional[str] = None) -> Column

Generate embeddings for the specified string column.

Parameters:

  • column (ColumnOrName) –

    Column or column name containing the values to generate embeddings for.

  • model_alias (Optional[str], default: None ) –

    Optional alias for the embedding model to use for this operation. If None, uses the default embedding model.

Returns:

  • Column

    A Column expression that represents the embeddings for each value in the input column.

Raises:

  • TypeError

    If the input column is not a string column.

Generate embeddings for a text column
df.select(semantic.embed(col("text_column")).alias("text_embeddings"))
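Embedding columns are typically compared with cosine similarity. A minimal plain-Python sketch of that comparison (toy 3-dimensional vectors; real embedding dimensions depend on the configured model):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

v1 = [1.0, 0.0, 0.0]
v2 = [0.0, 1.0, 0.0]
print(cosine_similarity(v1, v1))  # 1.0 (identical vectors)
print(cosine_similarity(v1, v2))  # 0.0 (orthogonal vectors)
```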
Source code in src/fenic/api/functions/semantic.py
@validate_call(config=ConfigDict(strict=True, arbitrary_types_allowed=True))
def embed(
    column: ColumnOrName,
    model_alias: Optional[str] = None,
) -> Column:
    """Generate embeddings for the specified string column.

    Args:
        column: Column or column name containing the values to generate embeddings for.
        model_alias: Optional alias for the embedding model to use for this operation.
            If None, uses the default embedding model.

    Returns:
        A Column expression that represents the embeddings for each value in the input column.

    Raises:
        TypeError: If the input column is not a string column.

    Example: Generate embeddings for a text column
        ```python
        df.select(semantic.embed(col("text_column")).alias("text_embeddings"))
        ```
    """
    return Column._from_logical_expr(
        EmbeddingsExpr(Column._from_col_or_name(column)._logical_expr, model_alias=model_alias)
    )

extract

extract(column: ColumnOrName, schema: Union[ExtractSchema, Type[BaseModel]], max_output_tokens: int = 1024, temperature: float = 0, model_alias: Optional[str] = None) -> Column

Extracts structured information from unstructured text using a provided schema.

This function applies an instruction-driven extraction process to text columns, returning structured data based on the fields and descriptions provided. Useful for pulling out key entities, facts, or labels from documents.

Parameters:

  • column (ColumnOrName) –

    Column containing text to extract from.

  • schema (Union[ExtractSchema, Type[BaseModel]]) –

    An ExtractSchema containing fields of type ExtractSchemaField that define the output structure and field descriptions, or a Pydantic model that defines the output structure with descriptions for each field.

  • model_alias (Optional[str], default: None ) –

    Optional alias for the language model to use for this operation. If None, uses the default language model.

  • temperature (float, default: 0 ) –

    Temperature parameter for the language model. Defaults to 0.0.

  • max_output_tokens (int, default: 1024 ) –

    Constrains the model to generate at most this many tokens. Defaults to 1024.

Returns:

  • Column ( Column ) –

    A new column with structured values (a struct) based on the provided schema.

Extracting product metadata from a description using an explicit ExtractSchema
schema = ExtractSchema([
     ExtractSchemaField(
         name="brand",
         data_type=DataType.STRING,
         description="The brand or manufacturer mentioned in the product description"
     ),
     ExtractSchemaField(
         name="capacity_gb",
         data_type=DataType.INTEGER,
         description="The storage capacity of the product in gigabytes, if mentioned"
     ),
     ExtractSchemaField(
         name="connectivity",
         data_type=DataType.STRING,
         description="The type of connectivity or ports described (e.g., USB-C, Thunderbolt)"
     )
 ])
df.select(semantic.extract("product_description", schema))
Extracting user intent from a support message using a Pydantic model
class UserRequest(BaseModel):
    request_type: str = Field(..., description="The type of request (e.g., refund, technical issue, setup help)")
    target_product: str = Field(..., description="The name or type of product the user is referring to")
    preferred_resolution: str = Field(..., description="The action the user is expecting (e.g., replacement, callback)")

df.select(semantic.extract("support_message", UserRequest))

Raises:

  • ValueError

    If any input expression is invalid, the schema is empty or invalid, or the schema contains fields with no descriptions.
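A note on the Pydantic route: each field's description is what guides the extraction, so every field should carry one. A minimal sketch with made-up field names (the fenic call is shown as a comment since it needs a configured session):

```python
from pydantic import BaseModel, Field

# Illustrative schema: every field has a description, which extract uses
# to guide the model. Field names here are invented for the example.
class Invoice(BaseModel):
    vendor: str = Field(..., description="The vendor issuing the invoice")
    total_usd: float = Field(..., description="The invoice total in US dollars")

# With a configured session, the model could be passed directly:
# df.select(semantic.extract("invoice_text", Invoice))

# The same model also validates extracted values client-side.
inv = Invoice(vendor="Acme Corp", total_usd=129.99)
print(inv.vendor)  # Acme Corp
```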

Source code in src/fenic/api/functions/semantic.py
@validate_call(config=ConfigDict(strict=True, arbitrary_types_allowed=True))
def extract(
        column: ColumnOrName,
        schema: Union[ExtractSchema, Type[BaseModel]],
        max_output_tokens: int = 1024,
        temperature: float = 0,
        model_alias: Optional[str] = None,
) -> Column:
    """Extracts structured information from unstructured text using a provided schema.

    This function applies an instruction-driven extraction process to text columns, returning
    structured data based on the fields and descriptions provided. Useful for pulling out key entities,
    facts, or labels from documents.

    Args:
        column: Column containing text to extract from.
        schema: An ExtractSchema containing fields of type ExtractSchemaField that define
            the output structure and field descriptions or a Pydantic model that defines the output structure with
            descriptions for each field.
        model_alias: Optional alias for the language model to use for this operation. If None, uses the default language model.
        temperature: Temperature parameter for the language model. Defaults to 0.0.
        max_output_tokens: Constrains the model to generate at most this many tokens. Defaults to 1024.

    Returns:
        Column: A new column with structured values (a struct) based on the provided schema.

    Example: Extracting product metadata from a description using an explicit ExtractSchema
        ```python
        schema = ExtractSchema([
             ExtractSchemaField(
                 name="brand",
                 data_type=DataType.STRING,
                 description="The brand or manufacturer mentioned in the product description"
             ),
             ExtractSchemaField(
                 name="capacity_gb",
                 data_type=DataType.INTEGER,
                 description="The storage capacity of the product in gigabytes, if mentioned"
             ),
             ExtractSchemaField(
                 name="connectivity",
                 data_type=DataType.STRING,
                 description="The type of connectivity or ports described (e.g., USB-C, Thunderbolt)"
             )
         ])
        df.select(semantic.extract("product_description", schema))
        ```

    Example: Extracting user intent from a support message using a Pydantic model
        ```python
        class UserRequest(BaseModel):
            request_type: str = Field(..., description="The type of request (e.g., refund, technical issue, setup help)")
            target_product: str = Field(..., description="The name or type of product the user is referring to")
            preferred_resolution: str = Field(..., description="The action the user is expecting (e.g., replacement, callback)")

        df.select(semantic.extract("support_message", UserRequest))
        ```
    Raises:
        ValueError: If any input expression is invalid, or if the schema
            is empty or invalid, or if the schema contains fields with no descriptions.
    """
    validate_extract_schema_structure(schema)

    return Column._from_logical_expr(
        SemanticExtractExpr(
            Column._from_col_or_name(column)._logical_expr,
            max_tokens=max_output_tokens,
            temperature=temperature,
            model_alias=model_alias,
            schema=schema,
        )
    )

map

map(instruction: str, examples: Optional[MapExampleCollection] = None, model_alias: Optional[str] = None, temperature: float = 0, max_output_tokens: int = 512) -> Column

Applies a natural language instruction to one or more text columns, enabling rich summarization and generation tasks.

Parameters:

  • instruction (str) –

    A string containing the semantic.map prompt. The instruction must include placeholders in curly braces that reference one or more column names. These placeholders will be replaced with actual column values during prompt construction at query execution.

  • examples (Optional[MapExampleCollection], default: None ) –

    Optional collection of examples to guide the semantic mapping operation. Each example should demonstrate the expected input and output for the mapping. The examples should be created using MapExampleCollection.create_example(), providing instruction variables and their expected answers.

  • model_alias (Optional[str], default: None ) –

    Optional alias for the language model to use for the mapping. If None, will use the language model configured as the default.

  • temperature (float, default: 0 ) –

    Temperature parameter for the language model. Defaults to 0.0.

  • max_output_tokens (int, default: 512 ) –

    Constrains the model to generate at most this many tokens. Defaults to 512.

Returns:

  • Column ( Column ) –

    A column expression representing the semantic mapping operation.

Raises:

  • ValueError

    If the instruction is not a string.

Mapping without examples
semantic.map("Given the product name: {name} and its description: {details}, generate a compelling one-line description suitable for a product catalog.")
Mapping with few-shot examples
examples = MapExampleCollection()
examples.create_example(MapExample(
    input={"name": "GlowMate", "details": "A rechargeable bedside lamp with adjustable color temperatures, touch controls, and a sleek minimalist design."},
    output="The modern touch-controlled lamp for better sleep and style."
))
examples.create_example(MapExample(
    input={"name": "AquaPure", "details": "A compact water filter that attaches to your faucet, removes over 99% of contaminants, and improves taste instantly."},
    output="Clean, great-tasting water straight from your tap."
))
semantic.map("Given the product name: {name} and its description: {details}, generate a compelling one-line description suitable for a product catalog.", examples)
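Placeholder substitution behaves like Python's `str.format` applied to one row's column values. A plain-Python illustration of how a single row fills the instruction (this mimics the idea, not fenic's actual prompt builder):

```python
instruction = (
    "Given the product name: {name} and its description: {details}, "
    "generate a compelling one-line description suitable for a product catalog."
)

# One row's column values (illustrative).
row = {"name": "GlowMate", "details": "A rechargeable bedside lamp."}

# Each placeholder is replaced by the corresponding column value.
prompt = instruction.format(**row)
print(prompt)
```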
Source code in src/fenic/api/functions/semantic.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True, strict=True))
def map(
        instruction: str,
        examples: Optional[MapExampleCollection] = None,
        model_alias: Optional[str] = None,
        temperature: float = 0,
        max_output_tokens: int = 512,
) -> Column:
    """Applies a natural language instruction to one or more text columns, enabling rich summarization and generation tasks.

    Args:
        instruction: A string containing the semantic.map prompt.
            The instruction must include placeholders in curly braces that reference one or more column names.
            These placeholders will be replaced with actual column values during prompt construction at
            query execution.
        examples: Optional collection of examples to guide the semantic mapping operation.
            Each example should demonstrate the expected input and output for the mapping.
            The examples should be created using MapExampleCollection.create_example(),
            providing instruction variables and their expected answers.
        model_alias: Optional alias for the language model to use for the mapping. If None, will use the language model configured as the default.
        temperature: Temperature parameter for the language model. Defaults to 0.0.
        max_output_tokens: Constrains the model to generate at most this many tokens. Defaults to 512.

    Returns:
        Column: A column expression representing the semantic mapping operation.

    Raises:
        ValueError: If the instruction is not a string.

    Example: Mapping without examples
        ```python
        semantic.map("Given the product name: {name} and its description: {details}, generate a compelling one-line description suitable for a product catalog.")
        ```

    Example: Mapping with few-shot examples
        ```python
        examples = MapExampleCollection()
        examples.create_example(MapExample(
            input={"name": "GlowMate", "details": "A rechargeable bedside lamp with adjustable color temperatures, touch controls, and a sleek minimalist design."},
            output="The modern touch-controlled lamp for better sleep and style."
        ))
        examples.create_example(MapExample(
            input={"name": "AquaPure", "details": "A compact water filter that attaches to your faucet, removes over 99% of contaminants, and improves taste instantly."},
            output="Clean, great-tasting water straight from your tap."
        ))
        semantic.map("Given the product name: {name} and its description: {details}, generate a compelling one-line description suitable for a product catalog.", examples)
        ```
    """
    return Column._from_logical_expr(
        SemanticMapExpr(
            instruction,
            examples=examples,
            max_tokens=max_output_tokens,
            model_alias=model_alias,
            temperature=temperature,
        )
    )

predicate

predicate(instruction: str, examples: Optional[PredicateExampleCollection] = None, model_alias: Optional[str] = None, temperature: float = 0) -> Column

Applies a natural language predicate to one or more string columns, returning a boolean result.

This is useful for filtering rows based on user-defined criteria expressed in natural language.

Parameters:

  • instruction (str) –

    A string containing the semantic.predicate prompt. The instruction must include placeholders in curly braces that reference one or more column names. These placeholders will be replaced with actual column values during prompt construction at query execution.

  • examples (Optional[PredicateExampleCollection], default: None ) –

    Optional collection of examples to guide the semantic predicate operation. Each example should demonstrate the expected boolean output for different inputs. The examples should be created using PredicateExampleCollection.create_example(), providing instruction variables and their expected boolean answers.

  • model_alias (Optional[str], default: None ) –

    Optional alias for the language model to use for this operation. If None, uses the default language model.

  • temperature (float, default: 0 ) –

    Temperature parameter for the language model. Defaults to 0.0.

Returns:

  • Column ( Column ) –

    A column expression that returns a boolean value after applying the natural language predicate.

Raises:

  • ValueError

    If the instruction is not a string.

Identifying product descriptions that mention wireless capability
semantic.predicate("Does the product description: {product_description} mention that the item is wireless?")
Filtering support tickets that describe a billing issue
semantic.predicate("Does this support message: {ticket_text} describe a billing issue?")
Filtering support tickets that describe a billing issue with examples
examples = PredicateExampleCollection()
examples.create_example(PredicateExample(
    input={"ticket_text": "I was charged twice for my subscription and need help."},
    output=True))
examples.create_example(PredicateExample(
    input={"ticket_text": "How do I reset my password?"},
    output=False))
semantic.predicate("Does this support ticket describe a billing issue? {ticket_text}", examples)
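Because the instruction must reference at least one column in curly braces, it can be handy to list which columns an instruction mentions before running it. A stdlib sketch using `string.Formatter` (plain Python, not a fenic API):

```python
from string import Formatter

def placeholder_names(instruction: str) -> list[str]:
    """Return the {placeholder} names appearing in an instruction string."""
    return [name for _, name, _, _ in Formatter().parse(instruction) if name]

cols = placeholder_names(
    "Does this support message: {ticket_text} describe a billing issue?"
)
print(cols)  # ['ticket_text']
```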
Source code in src/fenic/api/functions/semantic.py
@validate_call(config=ConfigDict(arbitrary_types_allowed=True, strict=True))
def predicate(
        instruction: str,
        examples: Optional[PredicateExampleCollection] = None,
        model_alias: Optional[str] = None,
        temperature: float = 0,
) -> Column:
    """Applies a natural language predicate to one or more string columns, returning a boolean result.

    This is useful for filtering rows based on user-defined criteria expressed in natural language.

    Args:
        instruction: A string containing the semantic.predicate prompt.
            The instruction must include placeholders in curly braces that reference one or more column names.
            These placeholders will be replaced with actual column values during prompt construction at
            query execution.
        examples: Optional collection of examples to guide the semantic predicate operation.
            Each example should demonstrate the expected boolean output for different inputs.
            The examples should be created using PredicateExampleCollection.create_example(),
            providing instruction variables and their expected boolean answers.
        model_alias: Optional alias for the language model to use for this operation. If None, uses the default language model.
        temperature: Temperature parameter for the language model. Defaults to 0.0.

    Returns:
        Column: A column expression that returns a boolean value after applying the natural language predicate.

    Raises:
        ValueError: If the instruction is not a string.

    Example: Identifying product descriptions that mention wireless capability
        ```python
        semantic.predicate("Does the product description: {product_description} mention that the item is wireless?")
        ```

    Example: Filtering support tickets that describe a billing issue
        ```python
        semantic.predicate("Does this support message: {ticket_text} describe a billing issue?")
        ```

    Example: Filtering support tickets that describe a billing issue with examples
        ```python
        examples = PredicateExampleCollection()
        examples.create_example(PredicateExample(
            input={"ticket_text": "I was charged twice for my subscription and need help."},
            output=True))
        examples.create_example(PredicateExample(
            input={"ticket_text": "How do I reset my password?"},
            output=False))
        semantic.predicate("Does this support ticket describe a billing issue? {ticket_text}", examples)
        ```
    """
    return Column._from_logical_expr(
        SemanticPredExpr(
            instruction,
            examples=examples,
            model_alias=model_alias,
            temperature=temperature,
        )
    )

reduce

reduce(instruction: str, model_alias: Optional[str] = None, temperature: float = 0, max_output_tokens: int = 512) -> Column

Aggregate function: reduces a set of strings across columns into a single string using a natural language instruction.

Parameters:

  • instruction (str) –

    A string containing the semantic.reduce prompt. The instruction can include placeholders in curly braces that reference column names. These placeholders will be replaced with actual column values during prompt construction at query execution.

  • model_alias (Optional[str], default: None ) –

    Optional alias for the language model to use for this operation. If None, uses the default language model.

  • temperature (float, default: 0 ) –

    Temperature parameter for the language model. Defaults to 0.0.

  • max_output_tokens (int, default: 512 ) –

    Constrains the model to generate at most this many tokens. Defaults to 512.

Returns:

  • Column ( Column ) –

    A column expression representing the semantic reduction operation.

Raises:

  • ValueError

    If the instruction is not a string.

Summarizing documents using their titles and bodies
semantic.reduce("Summarize these documents using each document's title: {title} and body: {body}.")
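Unlike map, reduce is an aggregate: placeholders are filled per row, and the whole group is condensed into a single output. A plain-Python sketch of the per-row rendering that conceptually feeds the aggregate prompt (illustrative only, not fenic internals):

```python
# A group of rows to be reduced (illustrative data).
rows = [
    {"title": "Q1 Report", "body": "Revenue grew 10%."},
    {"title": "Q2 Report", "body": "Revenue grew 12%."},
]

template = "title: {title} and body: {body}"

# Each row is rendered individually; the model then reduces the
# rendered group into one string.
rendered = [template.format(**row) for row in rows]
print(len(rendered))  # 2
```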
Source code in src/fenic/api/functions/semantic.py
@validate_call(config=ConfigDict(strict=True))
def reduce(
        instruction: str,
        model_alias: Optional[str] = None,
        temperature: float = 0,
        max_output_tokens: int = 512,
) -> Column:
    """Aggregate function: reduces a set of strings across columns into a single string using a natural language instruction.

    Args:
        instruction: A string containing the semantic.reduce prompt.
            The instruction can include placeholders in curly braces that reference column names.
            These placeholders will be replaced with actual column values during prompt construction at
            query execution.
        model_alias: Optional alias for the language model to use for this operation. If None, uses the default language model.
        temperature: Temperature parameter for the language model. Defaults to 0.0.
        max_output_tokens: Constrains the model to generate at most this many tokens. Defaults to 512.

    Returns:
        Column: A column expression representing the semantic reduction operation.

    Raises:
        ValueError: If the instruction is not a string.

    Example: Summarizing documents using their titles and bodies
        ```python
        semantic.reduce("Summarize these documents using each document's title: {title} and body: {body}.")
        ```
    """
    return Column._from_logical_expr(
        SemanticReduceExpr(
            instruction,
            max_tokens=max_output_tokens,
            model_alias=model_alias,
            temperature=temperature,
        )
    )