fenic.core.types.datatypes
Core data type definitions for the DataFrame API.
This module defines the type system used throughout the DataFrame API. It includes:
- Base classes for all data types
- Primitive types (string, integer, float, etc.)
- Composite types (arrays, structs)
- Specialized types (embeddings, markdown, etc.)
Classes:
- ArrayType – A type representing a homogeneous variable-length array (list) of elements.
- DataType – Base class for all data types.
- DocumentPathType – Represents a string containing a document's local (file system) or remote (URL) path.
- EmbeddingType – A type representing a fixed-length embedding vector.
- StructField – A field in a StructType. Fields are nullable.
- StructType – A type representing a struct (record) with named fields.
- TranscriptType – Represents a string containing a transcript in a specific format.
Attributes:
- BooleanType – Represents a boolean value (True/False).
- DateType – Represents a date value.
- DoubleType – Represents a 64-bit floating-point number.
- FloatType – Represents a 32-bit floating-point number.
- HtmlType – Represents a string containing raw HTML markup.
- IntegerType – Represents a signed integer value.
- JsonType – Represents a string containing JSON data.
- MarkdownType – Represents a string containing Markdown-formatted text.
- StringType – Represents a UTF-8 encoded string value.
- TimestampType – Represents a timestamp value.
BooleanType
module-attribute
BooleanType = _BooleanType()
Represents a boolean value (True/False).
DateType
module-attribute
DateType = _DateType()
Represents a date value.
DoubleType
module-attribute
DoubleType = _DoubleType()
Represents a 64-bit floating-point number.
FloatType
module-attribute
FloatType = _FloatType()
Represents a 32-bit floating-point number.
HtmlType
module-attribute
HtmlType = _HtmlType()
Represents a string containing raw HTML markup.
IntegerType
module-attribute
IntegerType = _IntegerType()
Represents a signed integer value.
JsonType
module-attribute
JsonType = _JsonType()
Represents a string containing JSON data.
MarkdownType
module-attribute
MarkdownType = _MarkdownType()
Represents a string containing Markdown-formatted text.
StringType
module-attribute
StringType = _StringType()
Represents a UTF-8 encoded string value.
TimestampType
module-attribute
TimestampType = _TimestampType()
Represents a timestamp value.
ArrayType
A type representing a homogeneous variable-length array (list) of elements.
DataType
Bases: ABC
Base class for all data types.
You won't instantiate this class directly. Instead, use one of the
concrete types like StringType, ArrayType, or StructType.
Used for casting, type validation, and schema inference in the DataFrame API.
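To illustrate why the base class is not instantiated directly, here is a small sketch built on Python's `abc` module. It is self-contained and does not use fenic; the abstract method `simple_name` and the `describe` helper are hypothetical names introduced only for this example, and fenic's real `DataType` may define different members.

```python
from abc import ABC, abstractmethod


class DataType(ABC):
    """Stand-in abstract base class (assumption: at least one abstract method)."""

    @abstractmethod
    def simple_name(self) -> str:
        """Hypothetical method returning a short type name."""


class _StringType(DataType):
    def simple_name(self) -> str:
        return "string"


StringType = _StringType()

# Instantiating the abstract base fails because simple_name is abstract.
try:
    DataType()
    instantiation_failed = False
except TypeError:
    instantiation_failed = True


def describe(dtype: DataType) -> str:
    """Simple type validation: accept only DataType instances."""
    if not isinstance(dtype, DataType):
        raise TypeError(f"expected a DataType, got {type(dtype).__name__}")
    return dtype.simple_name()


print(instantiation_failed)
print(describe(StringType))
```

An `isinstance` check against the base class, as in `describe`, is one plausible way an API could validate type arguments before casting or building a schema.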
DocumentPathType
Bases: _LogicalType
Represents a string containing a document's local (file system) or remote (URL) path.
EmbeddingType
Bases: _LogicalType
A type representing a fixed-length embedding vector.
Attributes:
- dimensions (int) – The number of dimensions in the embedding vector.
- embedding_model (str) – Name of the model used to generate the embedding.
Create an embedding type for text-embedding-3-small
EmbeddingType(384, embedding_model="text-embedding-3-small")
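Because an embedding type carries both a dimension count and a model name, two embedding columns are only comparable when both match. The sketch below models that idea with a frozen dataclass. It is a stand-in, not fenic's class: the `accepts` method and the model name `some-other-model` are hypothetical, and fenic's own equality rules may differ.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class EmbeddingType:
    """Stand-in with the two documented attributes."""

    dimensions: int
    embedding_model: str

    def accepts(self, vector: Sequence[float]) -> bool:
        """Hypothetical check: the vector length must equal `dimensions`."""
        return len(vector) == self.dimensions


small = EmbeddingType(384, embedding_model="text-embedding-3-small")
other = EmbeddingType(384, embedding_model="some-other-model")  # hypothetical model

print(small.accepts([0.0] * 384))  # vector length matches dimensions
print(small.accepts([0.0] * 10))   # vector length does not match
print(small == other)              # same dimensions, different model
```

Treating the model name as part of the type keeps vectors from different models apart even when their lengths happen to agree.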
StructField
A field in a StructType. Fields are nullable.
Attributes:
- name (str) – The name of the field.
- data_type (DataType) – The data type of the field.
StructType
Bases: DataType
A type representing a struct (record) with named fields.
Attributes:
- fields – List of field definitions.
Create a struct with name and age fields:
StructType([
    StructField("name", StringType),
    StructField("age", IntegerType),
])
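Since a field's `data_type` is itself a `DataType`, and `StructType` is a `DataType`, structs can be nested. The following sketch uses plain dataclasses as stand-ins for the documented classes to show a nested schema and one way to walk it. It does not import fenic, and `flatten_field_names` is a hypothetical helper, not part of the library.

```python
from dataclasses import dataclass
from typing import List


class DataType:
    """Stand-in base class for this sketch."""


class _StringType(DataType):
    pass


class _IntegerType(DataType):
    pass


StringType = _StringType()
IntegerType = _IntegerType()


@dataclass
class StructField:
    name: str
    data_type: DataType


@dataclass
class StructType(DataType):
    fields: List[StructField]


def flatten_field_names(struct: StructType, prefix: str = "") -> List[str]:
    """Hypothetical helper: list leaf fields as dotted paths."""
    names: List[str] = []
    for field in struct.fields:
        path = f"{prefix}{field.name}"
        if isinstance(field.data_type, StructType):
            # Recurse into nested structs, extending the dotted prefix.
            names.extend(flatten_field_names(field.data_type, prefix=f"{path}."))
        else:
            names.append(path)
    return names


person = StructType([
    StructField("name", StringType),
    StructField("age", IntegerType),
    StructField("address", StructType([
        StructField("city", StringType),
        StructField("zip", StringType),
    ])),
])

print(flatten_field_names(person))
```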
TranscriptType
Bases: _LogicalType
Represents a string containing a transcript in a specific format.