fenic.core.types.datatypes

Core data type definitions for the DataFrame API.

This module defines the type system used throughout the DataFrame API. It includes:

  • Base classes for all data types
  • Primitive types (string, integer, float, etc.)
  • Composite types (arrays, structs)
  • Specialized types (embeddings, markdown, etc.)

Classes:

  • ArrayType

    A type representing a homogeneous variable-length array (list) of elements.

  • DataType

    Base class for all data types.

  • DocumentPathType

    Represents a string containing a document's local (file system) or remote (URL) path.

  • EmbeddingType

    A type representing a fixed-length embedding vector.

  • StructField

    A field in a StructType. Fields are nullable.

  • StructType

    A type representing a struct (record) with named fields.

  • TranscriptType

    Represents a string containing a transcript in a specific format.

Attributes:

  • BooleanType

    Represents a boolean value (True/False).

  • DateType

    Represents a date value.

  • DoubleType

    Represents a 64-bit floating-point number.

  • FloatType

    Represents a 32-bit floating-point number.

  • HtmlType

    Represents a string containing raw HTML markup.

  • IntegerType

    Represents a signed integer value.

  • JsonType

    Represents a string containing JSON data.

  • MarkdownType

    Represents a string containing Markdown-formatted text.

  • StringType

    Represents a UTF-8 encoded string value.

  • TimestampType

    Represents a timestamp value.

BooleanType module-attribute

BooleanType = _BooleanType()

Represents a boolean value (True/False).

DateType module-attribute

DateType = _DateType()

Represents a date value.

DoubleType module-attribute

DoubleType = _DoubleType()

Represents a 64-bit floating-point number.

FloatType module-attribute

FloatType = _FloatType()

Represents a 32-bit floating-point number.

HtmlType module-attribute

HtmlType = _HtmlType()

Represents a string containing raw HTML markup.

IntegerType module-attribute

IntegerType = _IntegerType()

Represents a signed integer value.

JsonType module-attribute

JsonType = _JsonType()

Represents a string containing JSON data.

MarkdownType module-attribute

MarkdownType = _MarkdownType()

Represents a string containing Markdown-formatted text.

StringType module-attribute

StringType = _StringType()

Represents a UTF-8 encoded string value.

TimestampType module-attribute

TimestampType = _TimestampType()

Represents a timestamp value.

ArrayType

Bases: DataType

A type representing a homogeneous variable-length array (list) of elements.

Attributes:

  • element_type (DataType) –

    The data type of each element in the array.

Create an array of strings
ArrayType(StringType)
ArrayType(element_type=StringType)

DataType

Bases: ABC

Base class for all data types.

You won't instantiate this class directly. Instead, use one of the concrete types like StringType, ArrayType, or StructType.

Used for casting, type validation, and schema inference in the DataFrame API.
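
The relationship between the abstract base, the primitive module attributes (for example `StringType = _StringType()`), and composite types such as ArrayType can be illustrated with a minimal, library-free sketch. The classes below are stand-ins written for this example, not fenic's implementation. The `describe` method is hypothetical and exists only to make the stand-in base abstract.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class DataType(ABC):
    """Stand-in abstract base; not fenic's actual class."""

    @abstractmethod
    def describe(self) -> str:
        """Hypothetical method, used here only to make the base abstract."""


class _StringType(DataType):
    def describe(self) -> str:
        return "string"


# A single shared instance, mirroring `StringType = _StringType()`.
# Callers pass the instance itself: ArrayType(StringType), not ArrayType(StringType()).
StringType = _StringType()


@dataclass(frozen=True)
class ArrayType(DataType):
    element_type: DataType

    def describe(self) -> str:
        return f"array<{self.element_type.describe()}>"


def base_is_abstract() -> bool:
    """Return True if the abstract base refuses direct instantiation."""
    try:
        DataType()
    except TypeError:
        return True
    return False


# Composite types nest by taking other type instances as arguments.
nested = ArrayType(ArrayType(StringType))
print(nested.describe())
print(base_is_abstract())
```

Because each primitive is a single shared instance in this sketch, two composite types built from the same element type compare equal whether the argument is passed positionally or by keyword.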

DocumentPathType

Bases: _LogicalType

Represents a string containing a document's local (file system) or remote (URL) path.

EmbeddingType

Bases: _LogicalType

A type representing a fixed-length embedding vector.

Attributes:

  • dimensions (int) –

    The number of dimensions in the embedding vector.

  • embedding_model (str) –

    Name of the model used to generate the embedding.

Create an embedding type for text-embedding-3-small
EmbeddingType(1536, embedding_model="text-embedding-3-small")
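
What "fixed-length" implies in practice can be sketched with a stand-in dataclass that carries the two documented attributes. This is an illustration under assumptions, not fenic's class: the `fits` helper and the model names are hypothetical, and treating types with different models as unequal is a property of this sketch rather than a documented guarantee.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class EmbeddingType:
    """Stand-in with the two documented attributes; not fenic's class."""

    dimensions: int
    embedding_model: str


def fits(vector: Sequence[float], embedding_type: EmbeddingType) -> bool:
    """Hypothetical helper: True when the vector has the declared length."""
    return len(vector) == embedding_type.dimensions


# Hypothetical model names, chosen only for illustration.
toy = EmbeddingType(4, embedding_model="example-model")
other = EmbeddingType(4, embedding_model="other-model")

print(fits([0.1, 0.2, 0.3, 0.4], toy))  # length matches the declared dimensions
print(fits([0.1, 0.2], toy))            # too short for a 4-dimensional type
print(toy == other)                     # same length, different model
```

Keeping the model name in the type, as the attributes above suggest, lets a schema distinguish vectors that have the same length but come from different embedding models.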

StructField

A field in a StructType. Fields are nullable.

Attributes:

  • name (str) –

    The name of the field.

  • data_type (DataType) –

    The data type of the field.

StructType

Bases: DataType

A type representing a struct (record) with named fields.

Attributes:

  • fields

    List of field definitions.

Create a struct with name and age fields
StructType([
    StructField("name", StringType),
    StructField("age", IntegerType),
])
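
Structs can contain other structs, since a field's data_type is itself a DataType. The sketch below uses stand-in dataclasses, not fenic's classes, so it runs without the library installed. It builds a nested schema and flattens it into dotted field paths. The `field_paths` helper and the `address` field are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import List


class DataType:
    """Stand-in base class; fenic's real DataType is abstract."""


class _StringType(DataType):
    pass


class _IntegerType(DataType):
    pass


# Shared instances, mirroring the module attributes documented above.
StringType = _StringType()
IntegerType = _IntegerType()


@dataclass(frozen=True)
class StructField:
    name: str
    data_type: DataType


@dataclass(frozen=True)
class StructType(DataType):
    fields: List[StructField]


def field_paths(struct: StructType, prefix: str = "") -> List[str]:
    """Hypothetical helper: list leaf fields as dotted paths, depth first."""
    paths: List[str] = []
    for field in struct.fields:
        path = f"{prefix}{field.name}"
        if isinstance(field.data_type, StructType):
            # Recurse into nested structs, extending the dotted prefix.
            paths.extend(field_paths(field.data_type, prefix=f"{path}."))
        else:
            paths.append(path)
    return paths


person = StructType([
    StructField("name", StringType),
    StructField("age", IntegerType),
    StructField("address", StructType([
        StructField("city", StringType),
        StructField("zip", StringType),
    ])),
])

paths = field_paths(person)
print(paths)
```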

TranscriptType

Bases: _LogicalType

Represents a string containing a transcript in a specific format.