fenic.core

Core module for Fenic.

Classes:

  • ArrayType

    A type representing a homogeneous variable-length array (list) of elements.

  • ClassifyExample

    A single semantic example for classification operations.

  • ClassifyExampleCollection

    Collection of examples for semantic classification operations.

  • ColumnField

    Represents a typed column in a DataFrame schema.

  • DataType

    Base class for all data types.

  • DocumentPathType

    Represents a string containing a document's local (file system) or remote (URL) path.

  • EmbeddingType

    A type representing a fixed-length embedding vector.

  • ExtractSchema

    Represents a structured extraction schema.

  • ExtractSchemaField

    Represents a field within a structured extraction schema.

  • ExtractSchemaList

    Represents a list data type for structured extraction schema definitions.

  • JoinExample

    A single semantic example for semantic join operations.

  • JoinExampleCollection

    Collection of examples for semantic join operations.

  • LMMetrics

    Tracks language model usage metrics including token counts and costs.

  • MapExample

    A single semantic example for semantic mapping operations.

  • MapExampleCollection

    Collection of examples for semantic mapping operations.

  • OperatorMetrics

    Metrics for a single operator in the query execution plan.

  • PredicateExample

    A single semantic example for semantic predicate operations.

  • PredicateExampleCollection

    Collection of examples for semantic predicate operations.

  • QueryMetrics

    Comprehensive metrics for an executed query.

  • QueryResult

    Container for query execution results and associated metadata.

  • RMMetrics

    Tracks embedding model usage metrics including token counts and costs.

  • Schema

    Represents the schema of a DataFrame.

  • StructField

    A field in a StructType. Fields are nullable.

  • StructType

    A type representing a struct (record) with named fields.

  • TranscriptType

    Represents a string containing a transcript in a specific format.

Attributes:

  • BooleanType

    Represents a boolean value (True/False).

  • BranchSide

    Type alias representing the side of a branch in a lineage graph.

  • DataLike

    Union type representing any supported data format for both input and output operations.

  • DataLikeType

    String literal type for specifying data output formats.

  • DoubleType

    Represents a 64-bit floating-point number.

  • FloatType

    Represents a 32-bit floating-point number.

  • HtmlType

    Represents a string containing raw HTML markup.

  • IntegerType

    Represents a signed integer value.

  • JsonType

    Represents a string containing JSON data.

  • MarkdownType

    Represents a string containing Markdown-formatted text.

  • SemanticSimilarityMetric

    Type alias representing supported semantic similarity metrics.

  • StringType

    Represents a UTF-8 encoded string value.

BooleanType module-attribute

BooleanType = _BooleanType()

Represents a boolean value (True/False).

BranchSide module-attribute

BranchSide = Literal['left', 'right']

Type alias representing the side of a branch in a lineage graph.

Valid values:

  • "left": The left branch of a join.
  • "right": The right branch of a join.

DataLike module-attribute

DataLike = Union[pl.DataFrame, pd.DataFrame, Dict[str, List[Any]], List[Dict[str, Any]], pa.Table]

Union type representing any supported data format for both input and output operations.

This type encompasses all data structures that can be:

  1. Used as input when creating DataFrames
  2. Returned as output from query results

Supported formats:

  • pl.DataFrame: Native Polars DataFrame with efficient columnar storage
  • pd.DataFrame: Pandas DataFrame, optionally with PyArrow extension arrays
  • Dict[str, List[Any]]: Column-oriented dictionary where:
    • Keys are column names (str)
    • Values are lists containing all values for that column
  • List[Dict[str, Any]]: Row-oriented list where:
    • Each element is a dictionary representing one row
    • Dictionary keys are column names, values are cell values
  • pa.Table: Apache Arrow Table with columnar memory layout

Usage:

  • Input: Used in create_dataframe() to accept data in various formats
  • Output: Used in QueryResult.data to return results in requested format

The specific type returned depends on the DataLikeType format specified when collecting query results.
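
The two plain-Python layouts listed above carry the same information and can be converted into each other. The following is a minimal sketch using only the standard library; the helper names `pydict_to_pylist` and `pylist_to_pydict` are illustrative and are not part of fenic, and the sketch assumes every column list has the same length.

```python
from typing import Any, Dict, List


def pydict_to_pylist(columns: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Convert a column-oriented dict into a row-oriented list of dicts."""
    names = list(columns)
    # zip(*...) walks the column lists in lockstep, yielding one tuple per row.
    return [dict(zip(names, values)) for values in zip(*(columns[n] for n in names))]


def pylist_to_pydict(rows: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Convert a row-oriented list of dicts into a column-oriented dict."""
    # Column names are taken from the first row; all rows are assumed to share them.
    names = list(rows[0]) if rows else []
    return {name: [row[name] for row in rows] for name in names}


columns = {"name": ["Ada", "Grace"], "age": [36, 45]}
rows = pydict_to_pylist(columns)
print(rows)
print(pylist_to_pydict(rows) == columns)
```

Either layout should be acceptable wherever the text above says a DataLike value is accepted; which one is more convenient depends on whether the surrounding code works column by column or row by row.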

DataLikeType module-attribute

DataLikeType = Literal['polars', 'pandas', 'pydict', 'pylist', 'arrow']

String literal type for specifying data output formats.

Valid values:

  • "polars": Native Polars DataFrame format
  • "pandas": Pandas DataFrame with PyArrow extension arrays
  • "pydict": Python dictionary with column names as keys, lists as values
  • "pylist": Python list of dictionaries, each representing one row
  • "arrow": Apache Arrow Table format

Used as input parameter for methods that can return data in multiple formats.
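
Because the alias is a `Literal`, its allowed values can be read back at runtime with `typing.get_args`. The sketch below redefines the alias locally so that it runs without fenic installed; `check_format` is a hypothetical helper, not a fenic function.

```python
from typing import Literal, get_args

# Mirrors the alias shown above; redefined here so the sketch is self-contained.
DataLikeType = Literal["polars", "pandas", "pydict", "pylist", "arrow"]


def check_format(fmt: str) -> str:
    """Return fmt unchanged if it is one of the allowed literals, else raise."""
    allowed = get_args(DataLikeType)
    if fmt not in allowed:
        raise ValueError(f"Unsupported format {fmt!r}; expected one of {allowed}")
    return fmt


print(check_format("pydict"))
```

A check like this is only useful at boundaries where the format string comes from user input or configuration; static type checkers already flag invalid literals in ordinary code.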

DoubleType module-attribute

DoubleType = _DoubleType()

Represents a 64-bit floating-point number.

FloatType module-attribute

FloatType = _FloatType()

Represents a 32-bit floating-point number.

HtmlType module-attribute

HtmlType = _HtmlType()

Represents a string containing raw HTML markup.

IntegerType module-attribute

IntegerType = _IntegerType()

Represents a signed integer value.

JsonType module-attribute

JsonType = _JsonType()

Represents a string containing JSON data.

MarkdownType module-attribute

MarkdownType = _MarkdownType()

Represents a string containing Markdown-formatted text.

SemanticSimilarityMetric module-attribute

SemanticSimilarityMetric = Literal['cosine', 'l2', 'dot']

Type alias representing supported semantic similarity metrics.

Valid values:

  • "cosine": Cosine similarity, measures the cosine of the angle between two vectors.
  • "l2": Euclidean (L2) distance, measures the straight-line distance between two vectors.
  • "dot": Dot product similarity, the raw inner product of two vectors.

These metrics are commonly used for comparing embedding vectors in semantic search and other similarity-based applications.
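
For intuition, the three metrics can be written out in a few lines of plain Python. This is only a sketch of the textbook definitions; it is not fenic's implementation, and it makes no claim about whether fenic reports cosine as a similarity or as a distance.

```python
import math
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Raw inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance: smaller values mean the vectors are closer."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors, in [-1, 1]."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


u, v = [3.0, 4.0], [4.0, 3.0]
print(dot(u, v))     # 24.0
print(l2(u, v))      # square root of 2
print(cosine(u, v))  # 24 / 25
```

Note that "l2" is a distance (lower means more similar), while the dot product and cosine grow as vectors become more aligned; cosine additionally ignores vector length.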

StringType module-attribute

StringType = _StringType()

Represents a UTF-8 encoded string value.

ArrayType

Bases: DataType

A type representing a homogeneous variable-length array (list) of elements.

Attributes:

  • element_type (DataType) –

    The data type of each element in the array.

Create an array of strings
ArrayType(StringType)
ArrayType(element_type=StringType)

ClassifyExample

Bases: BaseModel

A single semantic example for classification operations.

Classify examples demonstrate the classification of an input string into a specific category string, used in a semantic.classify operation.

ClassifyExampleCollection

ClassifyExampleCollection(examples: List[ExampleType] = None)

Bases: BaseExampleCollection[ClassifyExample]

Collection of examples for semantic classification operations.

Classification operations categorize input text into predefined classes. This collection manages examples that demonstrate the expected classification results for different inputs.

Examples in this collection have a single input string and an output string representing the classification result.

Methods:

  • from_polars

    Create collection from a Polars DataFrame. Must have an 'output' column and an 'input' column.

Source code in src/fenic/core/types/semantic_examples.py
def __init__(self, examples: List[ExampleType] = None):
    """Initialize a collection of semantic examples.

    Args:
        examples: Optional list of examples to add to the collection. Each example
            will be processed through create_example() to ensure proper formatting
            and validation.

    Note:
        The examples list is initialized as empty if no examples are provided.
        Each example in the provided list will be processed through create_example()
        to ensure proper formatting and validation.
    """
    self.examples: List[ExampleType] = []
    if examples:
        for example in examples:
            self.create_example(example)

from_polars classmethod

from_polars(df: DataFrame) -> ClassifyExampleCollection

Create collection from a Polars DataFrame. Must have an 'output' column and an 'input' column.

Source code in src/fenic/core/types/semantic_examples.py
@classmethod
def from_polars(cls, df: pl.DataFrame) -> ClassifyExampleCollection:
    """Create collection from a Polars DataFrame. Must have an 'output' column and an 'input' column."""
    collection = cls()

    if EXAMPLE_INPUT_KEY not in df.columns:
        raise InvalidExampleCollectionError(
            f"Classify Examples DataFrame missing required '{EXAMPLE_INPUT_KEY}' column"
        )
    if EXAMPLE_OUTPUT_KEY not in df.columns:
        raise InvalidExampleCollectionError(
            f"Classify Examples DataFrame missing required '{EXAMPLE_OUTPUT_KEY}' column"
        )

    for row in df.iter_rows(named=True):
        if row[EXAMPLE_INPUT_KEY] is None:
            raise InvalidExampleCollectionError(
                f"Classify Examples DataFrame contains null values in '{EXAMPLE_INPUT_KEY}' column"
            )
        if row[EXAMPLE_OUTPUT_KEY] is None:
            raise InvalidExampleCollectionError(
                f"Classify Examples DataFrame contains null values in '{EXAMPLE_OUTPUT_KEY}' column"
            )

        example = ClassifyExample(
            input=row[EXAMPLE_INPUT_KEY],
            output=row[EXAMPLE_OUTPUT_KEY],
        )
        collection.create_example(example)

    return collection

ColumnField

Represents a typed column in a DataFrame schema.

A ColumnField defines the structure of a single column by specifying its name and data type. This is used as a building block for DataFrame schemas.

Attributes:

  • name (str) –

    The name of the column.

  • data_type (DataType) –

    The data type of the column, as a DataType instance.

DataType

Bases: ABC

Base class for all data types.

You won't instantiate this class directly. Instead, use one of the concrete types like StringType, ArrayType, or StructType.

Used for casting, type validation, and schema inference in the DataFrame API.

DocumentPathType

Bases: _StringBackedType

Represents a string containing a document's local (file system) or remote (URL) path.

EmbeddingType

Bases: DataType

A type representing a fixed-length embedding vector.

Attributes:

  • dimensions (int) –

    The number of dimensions in the embedding vector.

  • embedding_model (str) –

    Name of the model used to generate the embedding.

Create an embedding type for text-embedding-3-small
EmbeddingType(384, embedding_model="text-embedding-3-small")

ExtractSchema

Represents a structured extraction schema.

An extract schema contains a collection of named fields with descriptions that define what information should be extracted into each field.

Methods:

  • field_names

    Get a list of all field names in the schema.

field_names

field_names() -> List[str]

Get a list of all field names in the schema.

Returns:

  • List[str]

    A list of strings containing the names of all fields in the schema.

Source code in src/fenic/core/types/extract_schema.py
def field_names(self) -> List[str]:
    """Get a list of all field names in the schema.

    Returns:
        A list of strings containing the names of all fields in the schema.
    """
    return [field.name for field in self.struct_fields]

ExtractSchemaField

ExtractSchemaField(name: str, data_type: Union[DataType, ExtractSchemaList, ExtractSchema], description: str)

Represents a field within a structured extraction schema.

An extract schema field has a name, a data type, and a required description that explains what information should be extracted into this field.

Initialize an ExtractSchemaField.

Parameters:

  • name (str) –

    The name of the field.

  • data_type (Union[DataType, ExtractSchemaList, ExtractSchema]) –

    The data type of the field. Must be either a primitive DataType, ExtractSchemaList, or ExtractSchema.

  • description (str) –

    A description of what information should be extracted into this field.

Raises:

  • ValueError

    If data_type is a non-primitive DataType.

Source code in src/fenic/core/types/extract_schema.py
def __init__(
    self,
    name: str,
    data_type: Union[DataType, ExtractSchemaList, ExtractSchema],
    description: str,
):
    """Initialize an ExtractSchemaField.

    Args:
        name: The name of the field.
        data_type: The data type of the field. Must be either a primitive DataType,
            ExtractSchemaList, or ExtractSchema.
        description: A description of what information should be extracted into this field.

    Raises:
        ValueError: If data_type is a non-primitive DataType.
    """
    self.name = name
    if isinstance(data_type, DataType) and not isinstance(
        data_type, _PrimitiveType
    ):
        raise ValueError(
            f"Invalid data type: {data_type}. Only primitive types are supported directly. "
            f"For complex types, please use ExtractSchemaList or ExtractSchema instead."
        )
    self.data_type = data_type
    self.description = description

ExtractSchemaList

ExtractSchemaList(element_type: Union[DataType, ExtractSchema])

Represents a list data type for structured extraction schema definitions.

A schema list contains elements of a specific data type and is used for defining array-like structures in structured extraction schemas.

Initialize an ExtractSchemaList.

Parameters:

  • element_type (Union[DataType, ExtractSchema]) –

    The data type of elements in the list. Must be either a primitive DataType or another ExtractSchema.

Raises:

  • ValueError

    If element_type is a non-primitive DataType.

Source code in src/fenic/core/types/extract_schema.py
def __init__(
    self,
    element_type: Union[DataType, ExtractSchema],
):
    """Initialize an ExtractSchemaList.

    Args:
        element_type: The data type of elements in the list. Must be either a primitive
            DataType or another ExtractSchema.

    Raises:
        ValueError: If element_type is a non-primitive DataType.
    """
    if isinstance(element_type, DataType) and not isinstance(
        element_type, _PrimitiveType
    ):
        raise ValueError(
            f"Invalid element type: {element_type}. Only primitive types are supported directly. "
            f"For complex types, please use ExtractSchemaList or ExtractSchema instead."
        )
    self.element_type = element_type

JoinExample

Bases: BaseModel

A single semantic example for semantic join operations.

Join examples demonstrate the evaluation of two input strings across different datasets against a specific condition, used in a semantic.join operation.

JoinExampleCollection

JoinExampleCollection(examples: List[ExampleType] = None)

Bases: BaseExampleCollection[JoinExample]

Collection of examples for semantic join operations.

Methods:

  • from_polars

    Create collection from a Polars DataFrame. Must have 'left', 'right', and 'output' columns.

Source code in src/fenic/core/types/semantic_examples.py
def __init__(self, examples: List[ExampleType] = None):
    """Initialize a collection of semantic examples.

    Args:
        examples: Optional list of examples to add to the collection. Each example
            will be processed through create_example() to ensure proper formatting
            and validation.

    Note:
        The examples list is initialized as empty if no examples are provided.
        Each example in the provided list will be processed through create_example()
        to ensure proper formatting and validation.
    """
    self.examples: List[ExampleType] = []
    if examples:
        for example in examples:
            self.create_example(example)

from_polars classmethod

from_polars(df: DataFrame) -> JoinExampleCollection

Create collection from a Polars DataFrame. Must have 'left', 'right', and 'output' columns.

Source code in src/fenic/core/types/semantic_examples.py
@classmethod
def from_polars(cls, df: pl.DataFrame) -> JoinExampleCollection:
    """Create collection from a Polars DataFrame. Must have 'left', 'right', and 'output' columns."""
    collection = cls()

    required_columns = [
        EXAMPLE_LEFT_KEY,
        EXAMPLE_RIGHT_KEY,
        EXAMPLE_OUTPUT_KEY,
    ]
    for col in required_columns:
        if col not in df.columns:
            raise InvalidExampleCollectionError(
                f"Join Examples DataFrame missing required '{col}' column"
            )

    for row in df.iter_rows(named=True):
        for col in required_columns:
            if row[col] is None:
                raise InvalidExampleCollectionError(
                    f"Join Examples DataFrame contains null values in '{col}' column"
                )

        example = JoinExample(
            left=row[EXAMPLE_LEFT_KEY],
            right=row[EXAMPLE_RIGHT_KEY],
            output=row[EXAMPLE_OUTPUT_KEY],
        )
        collection.create_example(example)

    return collection

LMMetrics dataclass

LMMetrics(num_uncached_input_tokens: int = 0, num_cached_input_tokens: int = 0, num_output_tokens: int = 0, cost: float = 0.0, num_requests: int = 0)

Tracks language model usage metrics including token counts and costs.

Attributes:

  • num_uncached_input_tokens (int) –

    Number of uncached tokens in the prompt/input

  • num_cached_input_tokens (int) –

    Number of cached tokens in the prompt/input

  • num_output_tokens (int) –

    Number of tokens in the completion/output

  • cost (float) –

    Total cost in USD for the LM API call

  • num_requests (int) –

    Number of requests made to the language model

MapExample

Bases: BaseModel

A single semantic example for semantic mapping operations.

Map examples demonstrate the transformation of input variables to a specific output string, used in a semantic.map operation.

MapExampleCollection

MapExampleCollection(examples: List[ExampleType] = None)

Bases: BaseExampleCollection[MapExample]

Collection of examples for semantic mapping operations.

Map operations transform input variables into a text output according to specified instructions. This collection manages examples that demonstrate the expected transformations for different inputs.

Examples in this collection can have multiple input variables, each mapped to their respective values, with a single output string representing the expected transformation result.

Methods:

  • from_polars

    Create collection from a Polars DataFrame. Must have an 'output' column and at least one input column.

Source code in src/fenic/core/types/semantic_examples.py
def __init__(self, examples: List[ExampleType] = None):
    """Initialize a collection of semantic examples.

    Args:
        examples: Optional list of examples to add to the collection. Each example
            will be processed through create_example() to ensure proper formatting
            and validation.

    Note:
        The examples list is initialized as empty if no examples are provided.
        Each example in the provided list will be processed through create_example()
        to ensure proper formatting and validation.
    """
    self.examples: List[ExampleType] = []
    if examples:
        for example in examples:
            self.create_example(example)

from_polars classmethod

from_polars(df: DataFrame) -> MapExampleCollection

Create collection from a Polars DataFrame. Must have an 'output' column and at least one input column.

Source code in src/fenic/core/types/semantic_examples.py
@classmethod
def from_polars(cls, df: pl.DataFrame) -> MapExampleCollection:
    """Create collection from a Polars DataFrame. Must have an 'output' column and at least one input column."""
    collection = cls()

    if EXAMPLE_OUTPUT_KEY not in df.columns:
        raise ValueError(
            f"Map Examples DataFrame missing required '{EXAMPLE_OUTPUT_KEY}' column"
        )

    input_cols = [col for col in df.columns if col != EXAMPLE_OUTPUT_KEY]

    if not input_cols:
        raise ValueError(
            "Map Examples DataFrame must have at least one input column"
        )

    for row in df.iter_rows(named=True):
        if row[EXAMPLE_OUTPUT_KEY] is None:
            raise InvalidExampleCollectionError(
                f"Map Examples DataFrame contains null values in '{EXAMPLE_OUTPUT_KEY}' column"
            )

        input_dict = {
            col: str(row[col]) for col in input_cols if row[col] is not None
        }

        example = MapExample(input=input_dict, output=row[EXAMPLE_OUTPUT_KEY])
        collection.create_example(example)

    return collection
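
The per-row handling in the listing above can be summarized as: every column other than the output column becomes an input variable, null inputs are dropped, and remaining input values are converted with str(). A minimal stand-alone sketch of that rule over a plain dict follows. It assumes EXAMPLE_OUTPUT_KEY is the string "output" (as the docstring suggests), uses ValueError as a stand-in for InvalidExampleCollectionError, and `split_map_row` is a hypothetical helper name.

```python
from typing import Any, Dict, Tuple

OUTPUT_KEY = "output"  # assumed value of EXAMPLE_OUTPUT_KEY, per the docstring


def split_map_row(row: Dict[str, Any]) -> Tuple[Dict[str, str], Any]:
    """Split one named row into (input variables, output)."""
    if row.get(OUTPUT_KEY) is None:
        # Stand-in for InvalidExampleCollectionError in the real code.
        raise ValueError(f"null value in '{OUTPUT_KEY}' column")
    inputs = {
        col: str(value)  # input values are stringified
        for col, value in row.items()
        if col != OUTPUT_KEY and value is not None  # null inputs are dropped
    }
    return inputs, row[OUTPUT_KEY]


row = {"title": "Dune", "year": 1965, "note": None, "output": "Dune (1965)"}
print(split_map_row(row))
```

In this sketch the integer year becomes the string "1965" and the null note column disappears from the example's inputs, which mirrors what the listing does for each DataFrame row.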

OperatorMetrics dataclass

OperatorMetrics(operator_id: str, num_output_rows: int = 0, execution_time_ms: float = 0.0, lm_metrics: LMMetrics = LMMetrics(), rm_metrics: RMMetrics = RMMetrics(), is_cache_hit: bool = False)

Metrics for a single operator in the query execution plan.

Attributes:

  • operator_id (str) –

    Unique identifier for the operator

  • num_output_rows (int) –

    Number of rows output by this operator

  • execution_time_ms (float) –

    Execution time in milliseconds

  • lm_metrics (LMMetrics) –

    Language model usage metrics for this operator

  • rm_metrics (RMMetrics) –

    Embedding model usage metrics for this operator

  • is_cache_hit (bool) –

    Whether results were retrieved from cache

PredicateExample

Bases: BaseModel

A single semantic example for semantic predicate operations.

Predicate examples demonstrate the evaluation of input variables against a specific condition, used in a semantic.predicate operation.

PredicateExampleCollection

PredicateExampleCollection(examples: List[ExampleType] = None)

Bases: BaseExampleCollection[PredicateExample]

Collection of examples for semantic predicate operations.

Predicate operations evaluate conditions on input variables to produce boolean (True/False) results. This collection manages examples that demonstrate the expected boolean outcomes for different inputs.

Examples in this collection can have multiple input variables, each mapped to their respective values, with a single boolean output representing the evaluation result of the predicate.

Methods:

  • from_polars

    Create collection from a Polars DataFrame. Must have an 'output' column and at least one input column.

Source code in src/fenic/core/types/semantic_examples.py
def __init__(self, examples: List[ExampleType] = None):
    """Initialize a collection of semantic examples.

    Args:
        examples: Optional list of examples to add to the collection. Each example
            will be processed through create_example() to ensure proper formatting
            and validation.

    Note:
        The examples list is initialized as empty if no examples are provided.
        Each example in the provided list will be processed through create_example()
        to ensure proper formatting and validation.
    """
    self.examples: List[ExampleType] = []
    if examples:
        for example in examples:
            self.create_example(example)

from_polars classmethod

from_polars(df: DataFrame) -> PredicateExampleCollection

Create collection from a Polars DataFrame. Must have an 'output' column and at least one input column.

Source code in src/fenic/core/types/semantic_examples.py
@classmethod
def from_polars(cls, df: pl.DataFrame) -> PredicateExampleCollection:
    """Create collection from a Polars DataFrame."""
    collection = cls()

    # Validate output column exists
    if EXAMPLE_OUTPUT_KEY not in df.columns:
        raise InvalidExampleCollectionError(
            f"Predicate Examples DataFrame missing required '{EXAMPLE_OUTPUT_KEY}' column"
        )

    input_cols = [col for col in df.columns if col != EXAMPLE_OUTPUT_KEY]

    if not input_cols:
        raise InvalidExampleCollectionError(
            "Predicate Examples DataFrame must have at least one input column"
        )

    for row in df.iter_rows(named=True):
        if row[EXAMPLE_OUTPUT_KEY] is None:
            raise InvalidExampleCollectionError(
                f"Predicate Examples DataFrame contains null values in '{EXAMPLE_OUTPUT_KEY}' column"
            )

        input_dict = {col: row[col] for col in input_cols if row[col] is not None}

        example = PredicateExample(input=input_dict, output=row[EXAMPLE_OUTPUT_KEY])
        collection.create_example(example)

    return collection

QueryMetrics dataclass

QueryMetrics(execution_time_ms: float = 0.0, num_output_rows: int = 0, total_lm_metrics: LMMetrics = LMMetrics(), total_rm_metrics: RMMetrics = RMMetrics(), _operator_metrics: Dict[str, OperatorMetrics] = dict(), _plan_repr: PhysicalPlanRepr = lambda: PhysicalPlanRepr(operator_id='empty')())

Comprehensive metrics for an executed query.

Includes overall statistics and detailed metrics for each operator in the execution plan.

Attributes:

  • execution_time_ms (float) –

    Total query execution time in milliseconds

  • num_output_rows (int) –

    Total number of rows returned by the query

  • total_lm_metrics (LMMetrics) –

    Aggregated language model metrics across all operators

  • total_rm_metrics (RMMetrics) –

    Aggregated embedding model metrics across all operators

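Methods:

  • get_execution_plan_details

    Generate a formatted execution plan with detailed metrics.

  • get_summary

    Summarize the query metrics in a single line.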

get_execution_plan_details

get_execution_plan_details() -> str

Generate a formatted execution plan with detailed metrics.

Produces a hierarchical representation of the query execution plan, including performance metrics and language model usage for each operator.

Returns:

  • str

    A formatted string showing the execution plan with metrics.

Source code in src/fenic/core/metrics.py
def get_execution_plan_details(self) -> str:
    """Generate a formatted execution plan with detailed metrics.

    Produces a hierarchical representation of the query execution plan,
    including performance metrics and language model usage for each operator.

    Returns:
        str: A formatted string showing the execution plan with metrics.
    """

    def _format_node(node: PhysicalPlanRepr, indent: int = 1) -> str:
        op = self._operator_metrics[node.operator_id]
        indent_str = "  " * indent

        details = [
            f"{indent_str}{op.operator_id}",
            f"{indent_str}  Output Rows: {op.num_output_rows:,}",
            f"{indent_str}  Execution Time: {op.execution_time_ms:.2f}ms",
            f"{indent_str}  Cached: {op.is_cache_hit}",
        ]

        if op.lm_metrics.cost > 0:
            details.extend(
                [
                    f"{indent_str}  Language Model Usage: {op.lm_metrics.num_uncached_input_tokens:,} input tokens, {op.lm_metrics.num_cached_input_tokens:,} cached input tokens, {op.lm_metrics.num_output_tokens:,} output tokens",
                    f"{indent_str}  Language Model Cost: ${op.lm_metrics.cost:.6f}",
                ]
            )

        if op.rm_metrics.cost > 0:
            details.extend(
                [
                    f"{indent_str}  Embedding Model Usage: {op.rm_metrics.num_input_tokens:,} input tokens",
                    f"{indent_str}  Embedding Model Cost: ${op.rm_metrics.cost:.6f}",
                ]
            )
        return (
            "\n".join(details)
            + "\n"
            + "".join(_format_node(child, indent + 1) for child in node.children)
        )

    return f"Execution Plan\n{_format_node(self._plan_repr)}"
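
To see the shape of the resulting text, the recursion above can be reproduced with small stand-in classes. In this sketch `PlanNode` and `OpStats` are simplified substitutes for PhysicalPlanRepr and OperatorMetrics, the operator ids are invented, and the language and embedding model lines are omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PlanNode:  # stand-in for PhysicalPlanRepr
    operator_id: str
    children: List["PlanNode"] = field(default_factory=list)


@dataclass
class OpStats:  # stand-in for OperatorMetrics, trimmed to three fields
    num_output_rows: int = 0
    execution_time_ms: float = 0.0
    is_cache_hit: bool = False


def format_plan(node: PlanNode, stats: Dict[str, OpStats], indent: int = 1) -> str:
    """Render a node and its children, indenting two spaces per plan level."""
    op = stats[node.operator_id]
    pad = "  " * indent
    lines = [
        f"{pad}{node.operator_id}",
        f"{pad}  Output Rows: {op.num_output_rows:,}",
        f"{pad}  Execution Time: {op.execution_time_ms:.2f}ms",
        f"{pad}  Cached: {op.is_cache_hit}",
    ]
    body = "\n".join(lines) + "\n"
    # Children are rendered one level deeper, directly below their parent.
    return body + "".join(format_plan(c, stats, indent + 1) for c in node.children)


plan = PlanNode("filter_1", [PlanNode("scan_0")])
stats = {"filter_1": OpStats(1200, 3.5), "scan_0": OpStats(5000, 1.25, True)}
report = "Execution Plan\n" + format_plan(plan, stats)
print(report)
```

Each operator is printed below its parent with one extra level of indentation, so the root of the plan appears first and its inputs appear nested beneath it.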

get_summary

get_summary() -> str

Summarize the query metrics in a single line.

Returns:

  • str

    A concise summary of execution time, row count, and LM cost.

Source code in src/fenic/core/metrics.py
def get_summary(self) -> str:
    """Summarize the query metrics in a single line.

    Returns:
        str: A concise summary of execution time, row count, and LM cost.
    """
    return (
        f"Query executed in {self.execution_time_ms:.2f}ms, "
        f"returned {self.num_output_rows:,} rows, "
        f"language model cost: ${self.total_lm_metrics.cost:.6f}, "
        f"embedding model cost: ${self.total_rm_metrics.cost:.6f}"
    )
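
The format specifiers in the listing determine how the numbers are rendered: two decimals for the time, thousands separators for the row count, and six decimals for costs. A small sketch with invented values, using plain variables in place of a real QueryMetrics instance:

```python
# Invented sample values standing in for the fields of a QueryMetrics object.
execution_time_ms = 1234.5
num_output_rows = 12000
lm_cost = 0.0123
rm_cost = 0.0

summary = (
    f"Query executed in {execution_time_ms:.2f}ms, "
    f"returned {num_output_rows:,} rows, "
    f"language model cost: ${lm_cost:.6f}, "
    f"embedding model cost: ${rm_cost:.6f}"
)
print(summary)
```

With these values the line reads: Query executed in 1234.50ms, returned 12,000 rows, language model cost: $0.012300, embedding model cost: $0.000000.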

QueryResult dataclass

QueryResult(data: DataLike, metrics: QueryMetrics)

Container for query execution results and associated metadata.

This dataclass bundles together the materialized data from a query execution along with metrics about the execution process. It provides a unified interface for accessing both the computed results and performance information.

Attributes:

  • data (DataLike) –

    The materialized query results in the requested format. Can be any of the supported data types (Polars/Pandas DataFrame, Arrow Table, or Python dict/list structures).

  • metrics (QueryMetrics) –

    Execution metadata collected during query execution, including total execution time, number of output rows, and language and embedding model usage and cost.

Access query results and metrics
# Execute query and get results with metrics
result = df.filter(col("age") > 25).collect("pandas")
pandas_df = result.data  # Access the Pandas DataFrame
print(result.metrics.execution_time_ms)  # Total execution time in milliseconds
print(result.metrics.num_output_rows)  # Number of rows returned by the query

Work with different data formats
# Get results in different formats
polars_result = df.collect("polars")
arrow_result = df.collect("arrow")
dict_result = df.collect("pydict")

# All contain the same data, different formats
print(type(polars_result.data))  # <class 'polars.dataframe.frame.DataFrame'>
print(type(arrow_result.data))   # <class 'pyarrow.lib.Table'>
print(type(dict_result.data))    # <class 'dict'>

Note

The actual type of the data attribute depends on the format requested during collection. Use type checking or isinstance() if you need to handle the data differently based on its format.
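
As a sketch of the isinstance() approach, the helper below counts rows for the two plain-Python layouts ("pydict" and "pylist"). `count_rows` is a hypothetical function, not part of fenic, and the DataFrame and Arrow cases are left out so the example runs with the standard library alone.

```python
from typing import Any, Dict, List, Union

PlainData = Union[Dict[str, List[Any]], List[Dict[str, Any]]]


def count_rows(data: PlainData) -> int:
    """Return the number of rows for either plain-Python layout."""
    if isinstance(data, dict):
        # Column-oriented: each column list has one entry per row.
        return len(next(iter(data.values()), []))
    if isinstance(data, list):
        # Row-oriented: one dict per row.
        return len(data)
    raise TypeError(f"Unsupported data type: {type(data).__name__}")


print(count_rows({"name": ["Ada", "Grace"], "age": [36, 45]}))
print(count_rows([{"name": "Ada", "age": 36}]))
```

The same pattern extends to the other formats by adding branches for the Polars, Pandas, and Arrow types when those libraries are available.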

RMMetrics dataclass

RMMetrics(num_input_tokens: int = 0, num_requests: int = 0, cost: float = 0.0)

Tracks embedding model usage metrics including token counts and costs.

Attributes:

  • num_input_tokens (int) –

    Number of tokens to embed

  • cost (float) –

    Total cost in USD to embed the tokens

  • num_requests (int) –

    Number of requests made to the embedding model

Schema

Represents the schema of a DataFrame.

A Schema defines the structure of a DataFrame by specifying an ordered collection of column fields. Each column field defines the name and data type of a column in the DataFrame.

Attributes:

  • column_fields (List[ColumnField]) –

    An ordered list of ColumnField objects that define the structure of the DataFrame.

Methods:

  • column_names

    Get a list of all column names in the schema.

column_names

column_names() -> List[str]

Get a list of all column names in the schema.

Returns:

  • List[str]

    A list of strings containing the names of all columns in the schema.

Source code in src/fenic/core/types/schema.py
def column_names(self) -> List[str]:
    """Get a list of all column names in the schema.

    Returns:
        A list of strings containing the names of all columns in the schema.
    """
    return [field.name for field in self.column_fields]

StructField

A field in a StructType. Fields are nullable.

Attributes:

  • name (str) –

    The name of the field.

  • data_type (DataType) –

    The data type of the field.

StructType

Bases: DataType

A type representing a struct (record) with named fields.

Attributes:

  • fields (List[StructField]) –

    List of field definitions.

Create a struct with name and age fields
StructType([
    StructField("name", StringType),
    StructField("age", IntegerType),
])

TranscriptType

Bases: _StringBackedType

Represents a string containing a transcript in a specific format.