fenic.api.column

Column API for Fenic DataFrames - represents column expressions and operations.

Classes:

  • Column

    A column expression in a DataFrame.

Column

A column expression in a DataFrame.

This class represents a column expression that can be used in DataFrame operations. It provides methods for accessing, transforming, and combining column data.

Create a column reference
# Reference a column by name using col() function
col("column_name")
Use column in operations
# Perform arithmetic operations
df.select(col("price") * col("quantity"))
Chain column operations
# Chain multiple operations: test for a substring, then name the result
df.select(col("name").contains("John").alias("has_john"))
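
The method bodies listed on this page wrap a logical expression (for example `AliasExpr`) rather than computing values immediately. A minimal plain-Python sketch of that idea follows. `Expr`, `MiniColumn`, and `mini_col` are hypothetical stand-ins invented for illustration, not fenic classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    """A node in a toy expression tree (hypothetical, not a fenic class)."""
    op: str
    args: tuple = ()


class MiniColumn:
    """Toy stand-in for a column expression; it records operations lazily."""

    def __init__(self, expr: Expr) -> None:
        self.expr = expr

    def __mul__(self, other: "MiniColumn") -> "MiniColumn":
        # Arithmetic returns a new expression instead of computing a value.
        return MiniColumn(Expr("mul", (self.expr, other.expr)))

    def alias(self, name: str) -> "MiniColumn":
        # Wrap the current expression in a naming node.
        return MiniColumn(Expr("alias", (self.expr, name)))


def mini_col(name: str) -> MiniColumn:
    """Reference a column by name."""
    return MiniColumn(Expr("col", (name,)))


total = (mini_col("price") * mini_col("quantity")).alias("total_value")
print(total.expr)
```

Nothing is evaluated until some engine walks the tree, which is why such expressions can be chained freely.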

Methods:

  • alias

    Create an alias for this column.

  • asc

    Apply ascending order to this column during a dataframe sort or order_by.

  • asc_nulls_first

    Apply ascending order putting nulls first to this column during a dataframe sort or order_by.

  • asc_nulls_last

    Apply ascending order putting nulls last to this column during a dataframe sort or order_by.

  • cast

    Cast the column to a new data type.

  • contains

    Check if the column contains a substring.

  • contains_any

    Check if the column contains any of the specified substrings.

  • desc

    Apply descending order to this column during a dataframe sort or order_by.

  • desc_nulls_first

    Apply descending order putting nulls first to this column during a dataframe sort or order_by.

  • desc_nulls_last

    Apply descending order putting nulls last to this column during a dataframe sort or order_by.

  • ends_with

    Check if the column ends with a substring.

  • get_item

    Access an item in a struct or array column.

  • ilike

    Check if the column matches a SQL LIKE pattern (case-insensitive).

  • is_in

    Check if the column is in a list of values or a column expression.

  • is_not_null

    Check if the column contains non-NULL values.

  • is_null

    Check if the column contains NULL values.

  • like

    Check if the column matches a SQL LIKE pattern.

  • otherwise

    Evaluates a list of conditions and returns one of multiple possible result expressions.

  • rlike

    Check if the column matches a regular expression pattern.

  • starts_with

    Check if the column starts with a substring.

  • when

    Evaluates a list of conditions and returns one of multiple possible result expressions.

alias

alias(name: str) -> Column

Create an alias for this column.

This method assigns a new name to the column expression, which is useful for renaming columns or providing names for complex expressions.

Parameters:

  • name (str) –

    The alias name to assign

Returns:

  • Column ( Column ) –

    Column with the specified alias

Rename a column
# Rename a column to a new name
df.select(col("original_name").alias("new_name"))
Name a complex expression
# Give a name to a calculated column
df.select((col("price") * col("quantity")).alias("total_value"))
Source code in src/fenic/api/column.py
def alias(self, name: str) -> Column:
    """Create an alias for this column.

    This method assigns a new name to the column expression, which is useful
    for renaming columns or providing names for complex expressions.

    Args:
        name (str): The alias name to assign

    Returns:
        Column: Column with the specified alias

    Example: Rename a column
        ```python
        # Rename a column to a new name
        df.select(col("original_name").alias("new_name"))
        ```

    Example: Name a complex expression
        ```python
        # Give a name to a calculated column
        df.select((col("price") * col("quantity")).alias("total_value"))
        ```
    """
    return Column._from_logical_expr(AliasExpr(self._logical_expr, name))

asc

asc() -> Column

Apply ascending order to this column during a dataframe sort or order_by.

This method creates an expression that provides a column and sort order to the sort function.

Returns:

  • Column ( Column ) –

    A Column expression that provides a column and sort order to the sort function

Sort by age in ascending order
# Sort a dataframe by age in ascending order
df.sort(col("age").asc()).show()
Sort using column reference
# Sort using column reference with ascending order
df.sort(col("age").asc()).show()
Source code in src/fenic/api/column.py
def asc(self) -> Column:
    """Apply ascending order to this column during a dataframe sort or order_by.

    This method creates an expression that provides a column and sort order to the sort function.

    Returns:
        Column: A Column expression that provides a column and sort order to the sort function

    Example: Sort by age in ascending order
        ```python
        # Sort a dataframe by age in ascending order
        df.sort(col("age").asc()).show()
        ```

    Example: Sort using column reference
        ```python
        # Sort using column reference with ascending order
        df.sort(col("age").asc()).show()
        ```
    """
    return Column._from_logical_expr(SortExpr(self._logical_expr, ascending=True))

asc_nulls_first

asc_nulls_first() -> Column

Apply ascending order putting nulls first to this column during a dataframe sort or order_by.

This method creates an expression that provides a column and sort order to the sort function.

Returns:

  • Column ( Column ) –

    A Column expression that provides a column and sort order to the sort function

Sort by age in ascending order with nulls first
# Sort a dataframe by age in ascending order, with nulls appearing first
df.sort(col("age").asc_nulls_first()).show()
Sort using column reference
# Sort using column reference with ascending order and nulls first
df.sort(col("age").asc_nulls_first()).show()
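
The effect of null placement in an ascending sort can be illustrated without fenic. The following is a plain-Python sketch of the ordering described here, not fenic's implementation; the `ages` data is made up for illustration.

```python
ages = [31, None, 25, 40, None]

# The first tuple element decides where None goes; the second orders
# the non-null values ascending (0 is a placeholder for None).
nulls_first = sorted(ages, key=lambda v: (v is not None, 0 if v is None else v))
nulls_last = sorted(ages, key=lambda v: (v is None, 0 if v is None else v))

print(nulls_first)
print(nulls_last)
```

Only the position of the `None` entries differs between the two results; the non-null values keep the same ascending order.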
Source code in src/fenic/api/column.py
def asc_nulls_first(self) -> Column:
    """Apply ascending order putting nulls first to this column during a dataframe sort or order_by.

    This method creates an expression that provides a column and sort order to the sort function.

    Returns:
        Column: A Column expression that provides a column and sort order to the sort function

    Example: Sort by age in ascending order with nulls first
        ```python
        # Sort a dataframe by age in ascending order, with nulls appearing first
        df.sort(col("age").asc_nulls_first()).show()
        ```

    Example: Sort using column reference
        ```python
        # Sort using column reference with ascending order and nulls first
        df.sort(col("age").asc_nulls_first()).show()
        ```
    """
    return Column._from_logical_expr(
        SortExpr(self._logical_expr, ascending=True, nulls_last=False)
    )

asc_nulls_last

asc_nulls_last() -> Column

Apply ascending order putting nulls last to this column during a dataframe sort or order_by.

This method creates an expression that provides a column and sort order to the sort function.

Returns:

  • Column ( Column ) –

    A Column expression that provides a column and sort order to the sort function

Sort by age in ascending order with nulls last
# Sort a dataframe by age in ascending order, with nulls appearing last
df.sort(col("age").asc_nulls_last()).show()
Sort using column reference
# Sort using column reference with ascending order and nulls last
df.sort(col("age").asc_nulls_last()).show()
Source code in src/fenic/api/column.py
def asc_nulls_last(self) -> Column:
    """Apply ascending order putting nulls last to this column during a dataframe sort or order_by.

    This method creates an expression that provides a column and sort order to the sort function.

    Returns:
        Column: A Column expression that provides a column and sort order to the sort function

    Example: Sort by age in ascending order with nulls last
        ```python
        # Sort a dataframe by age in ascending order, with nulls appearing last
        df.sort(col("age").asc_nulls_last()).show()
        ```

    Example: Sort using column reference
        ```python
        # Sort using column reference with ascending order and nulls last
        df.sort(col("age").asc_nulls_last()).show()
        ```
    """
    return Column._from_logical_expr(
        SortExpr(self._logical_expr, ascending=True, nulls_last=True)
    )

cast

cast(data_type: DataType) -> Column

Cast the column to a new data type.

This method creates an expression that casts the column to a specified data type. The casting behavior depends on the source and target types:

Primitive type casting:

  • Numeric types (IntegerType, FloatType, DoubleType) can be cast between each other
  • Numeric types can be cast to/from StringType
  • BooleanType can be cast to/from numeric types and StringType
  • StringType cannot be directly cast to BooleanType (will raise TypeError)

Complex type casting:

  • ArrayType can only be cast to another ArrayType (with castable element types)
  • StructType can only be cast to another StructType (with matching/castable fields)
  • Primitive types cannot be cast to/from complex types

Parameters:

  • data_type (DataType) –

    The target DataType to cast the column to

Returns:

  • Column ( Column ) –

    A Column representing the cast expression

Cast integer to string
# Convert an integer column to string type
df.select(col("int_col").cast(StringType))
Cast array of integers to array of strings
# Convert an array of integers to an array of strings
df.select(col("int_array").cast(ArrayType(element_type=StringType)))
Cast struct fields to different types
# Convert struct fields to different types
new_type = StructType([
    StructField("id", StringType),
    StructField("value", FloatType)
])
df.select(col("data_struct").cast(new_type))

Raises:

  • TypeError

    If the requested cast operation is not supported

Source code in src/fenic/api/column.py
def cast(self, data_type: DataType) -> Column:
    """Cast the column to a new data type.

    This method creates an expression that casts the column to a specified data type.
    The casting behavior depends on the source and target types:

    Primitive type casting:

    - Numeric types (IntegerType, FloatType, DoubleType) can be cast between each other
    - Numeric types can be cast to/from StringType
    - BooleanType can be cast to/from numeric types and StringType
    - StringType cannot be directly cast to BooleanType (will raise TypeError)

    Complex type casting:

    - ArrayType can only be cast to another ArrayType (with castable element types)
    - StructType can only be cast to another StructType (with matching/castable fields)
    - Primitive types cannot be cast to/from complex types

    Args:
        data_type (DataType): The target DataType to cast the column to

    Returns:
        Column: A Column representing the casted expression

    Example: Cast integer to string
        ```python
        # Convert an integer column to string type
        df.select(col("int_col").cast(StringType))
        ```

    Example: Cast array of integers to array of strings
        ```python
        # Convert an array of integers to an array of strings
        df.select(col("int_array").cast(ArrayType(element_type=StringType)))
        ```

    Example: Cast struct fields to different types
        ```python
        # Convert struct fields to different types
        new_type = StructType([
            StructField("id", StringType),
            StructField("value", FloatType)
        ])
        df.select(col("data_struct").cast(new_type))
        ```

    Raises:
        TypeError: If the requested cast operation is not supported
    """
    return Column._from_logical_expr(CastExpr(self._logical_expr, data_type))

contains

contains(other: Union[str, Column]) -> Column

Check if the column contains a substring.

This method creates a boolean expression that checks if each value in the column contains the specified substring. The substring can be either a literal string or a column expression.

Parameters:

  • other (Union[str, Column]) –

    The substring to search for (can be a string or column expression)

Returns:

  • Column ( Column ) –

    A boolean column indicating whether each value contains the substring

Find rows where name contains "john"
# Filter rows where the name column contains "john"
df.filter(col("name").contains("john"))
Find rows where text contains a dynamic pattern
# Filter rows where text contains a value from another column
df.filter(col("text").contains(col("pattern")))
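
The difference between a literal substring and a column argument can be sketched row by row in plain Python. This only illustrates the described semantics; the `rows` data is invented, and null handling is not modeled.

```python
rows = [
    {"text": "error: disk full", "pattern": "disk"},
    {"text": "all systems nominal", "pattern": "nominal"},
]

# Literal substring: the same needle is tested against every row.
literal_mask = ["error" in row["text"] for row in rows]

# Column argument: each row supplies its own needle.
dynamic_mask = [row["pattern"] in row["text"] for row in rows]

print(literal_mask)
print(dynamic_mask)
```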
Source code in src/fenic/api/column.py
def contains(self, other: Union[str, Column]) -> Column:
    """Check if the column contains a substring.

    This method creates a boolean expression that checks if each value in the column
    contains the specified substring. The substring can be either a literal string
    or a column expression.

    Args:
        other (Union[str, Column]): The substring to search for (can be a string or column expression)

    Returns:
        Column: A boolean column indicating whether each value contains the substring

    Example: Find rows where name contains "john"
        ```python
        # Filter rows where the name column contains "john"
        df.filter(col("name").contains("john"))
        ```

    Example: Find rows where text contains a dynamic pattern
        ```python
        # Filter rows where text contains a value from another column
        df.filter(col("text").contains(col("pattern")))
        ```
    """
    if isinstance(other, str):
        return Column._from_logical_expr(ContainsExpr(self._logical_expr, other))
    else:
        return Column._from_logical_expr(
            ContainsExpr(self._logical_expr, other._logical_expr)
        )

contains_any

contains_any(others: List[str], case_insensitive: bool = True) -> Column

Check if the column contains any of the specified substrings.

This method creates a boolean expression that checks if each value in the column contains any of the specified substrings. The matching can be case-sensitive or case-insensitive.

Parameters:

  • others (List[str]) –

    List of substrings to search for

  • case_insensitive (bool, default: True ) –

    Whether to perform case-insensitive matching (default: True)

Returns:

  • Column ( Column ) –

    A boolean column indicating whether each value contains any substring

Find rows where name contains "john" or "jane" (case-insensitive)
# Filter rows where name contains either "john" or "jane"
df.filter(col("name").contains_any(["john", "jane"]))
Case-sensitive matching
# Filter rows with case-sensitive matching
df.filter(col("name").contains_any(["John", "Jane"], case_insensitive=False))
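
The effect of `case_insensitive` can be sketched for a single value in plain Python. The standalone `contains_any` function below is a hypothetical helper that mirrors the parameter names above; it is not fenic's implementation.

```python
from typing import List


def contains_any(value: str, others: List[str], case_insensitive: bool = True) -> bool:
    """Return True if value contains at least one of the substrings."""
    if case_insensitive:
        # Normalize both sides so that "JANE" matches "jane".
        value = value.lower()
        others = [s.lower() for s in others]
    return any(s in value for s in others)


names = ["John Smith", "JANE DOE", "Alice"]
default_mask = [contains_any(n, ["john", "jane"]) for n in names]
strict_mask = [contains_any(n, ["John", "Jane"], case_insensitive=False) for n in names]

print(default_mask)
print(strict_mask)
```

With case-sensitive matching, "JANE DOE" no longer matches "Jane", so the two masks differ on the second name.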
Source code in src/fenic/api/column.py
def contains_any(self, others: List[str], case_insensitive: bool = True) -> Column:
    """Check if the column contains any of the specified substrings.

    This method creates a boolean expression that checks if each value in the column
    contains any of the specified substrings. The matching can be case-sensitive or
    case-insensitive.

    Args:
        others (List[str]): List of substrings to search for
        case_insensitive (bool): Whether to perform case-insensitive matching (default: True)

    Returns:
        Column: A boolean column indicating whether each value contains any substring

    Example: Find rows where name contains "john" or "jane" (case-insensitive)
        ```python
        # Filter rows where name contains either "john" or "jane"
        df.filter(col("name").contains_any(["john", "jane"]))
        ```

    Example: Case-sensitive matching
        ```python
        # Filter rows with case-sensitive matching
        df.filter(col("name").contains_any(["John", "Jane"], case_insensitive=False))
        ```
    """
    return Column._from_logical_expr(
        ContainsAnyExpr(self._logical_expr, others, case_insensitive)
    )

desc

desc() -> Column

Apply descending order to this column during a dataframe sort or order_by.

This method creates an expression that provides a column and sort order to the sort function.

Returns:

  • Column ( Column ) –

    A Column expression that provides a column and sort order to the sort function

Sort by age in descending order
# Sort a dataframe by age in descending order
df.sort(col("age").desc()).show()
Sort using column reference
# Sort using column reference with descending order
df.sort(col("age").desc()).show()
Source code in src/fenic/api/column.py
def desc(self) -> Column:
    """Apply descending order to this column during a dataframe sort or order_by.

    This method creates an expression that provides a column and sort order to the sort function.

    Returns:
        Column: A Column expression that provides a column and sort order to the sort function

    Example: Sort by age in descending order
        ```python
        # Sort a dataframe by age in descending order
        df.sort(col("age").desc()).show()
        ```

    Example: Sort using column reference
        ```python
        # Sort using column reference with descending order
        df.sort(col("age").desc()).show()
        ```
    """
    return Column._from_logical_expr(SortExpr(self._logical_expr, ascending=False))

desc_nulls_first

desc_nulls_first() -> Column

Apply descending order putting nulls first to this column during a dataframe sort or order_by.

This method creates an expression that provides a column and sort order to the sort function.

Returns:

  • Column ( Column ) –

    A Column expression that provides a column and sort order to the sort function

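Sort by age in descending order with nulls first
# Sort a dataframe by age in descending order, with nulls appearing first
df.sort(col("age").desc_nulls_first()).show()
Sort using column reference
# Sort using column reference with descending order and nulls first
df.sort(col("age").desc_nulls_first()).show()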
Source code in src/fenic/api/column.py
def desc_nulls_first(self) -> Column:
    """Apply descending order putting nulls first to this column during a dataframe sort or order_by.

    This method creates an expression that provides a column and sort order to the sort function.

    Returns:
        Column: A Column expression that provides a column and sort order to the sort function

    Example: Sort by age in descending order with nulls first
        ```python
        df.sort(col("age").desc_nulls_first()).show()
        ```

    Example: Sort using column reference
        ```python
        df.sort(col("age").desc_nulls_first()).show()
        ```
    """
    return Column._from_logical_expr(
        SortExpr(self._logical_expr, ascending=False, nulls_last=False)
    )

desc_nulls_last

desc_nulls_last() -> Column

Apply descending order putting nulls last to this column during a dataframe sort or order_by.

This method creates an expression that provides a column and sort order to the sort function.

Returns:

  • Column ( Column ) –

    A Column expression that provides a column and sort order to the sort function

Sort by age in descending order with nulls last
# Sort a dataframe by age in descending order, with nulls appearing last
df.sort(col("age").desc_nulls_last()).show()
Sort using column reference
# Sort using column reference with descending order and nulls last
df.sort(col("age").desc_nulls_last()).show()
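
Descending order combined with explicit null placement can be sketched in plain Python. The `desc_key` helper and the `scores` data are hypothetical; the `nulls_last` flag is named after the argument visible in the source listings on this page, but the code is only an illustration of the resulting order.

```python
scores = [31, None, 25, 40, None]


def desc_key(nulls_last: bool):
    """Build a sort key for descending order with explicit null placement."""
    def key(v):
        is_null = v is None
        # Negating the value gives descending order under an ascending sort.
        return (is_null if nulls_last else not is_null, 0 if is_null else -v)
    return key


desc_nulls_first = sorted(scores, key=desc_key(nulls_last=False))
desc_nulls_last = sorted(scores, key=desc_key(nulls_last=True))

print(desc_nulls_first)
print(desc_nulls_last)
```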
Source code in src/fenic/api/column.py
def desc_nulls_last(self) -> Column:
    """Apply descending order putting nulls last to this column during a dataframe sort or order_by.

    This method creates an expression that provides a column and sort order to the sort function.

    Returns:
        Column: A Column expression that provides a column and sort order to the sort function

    Example: Sort by age in descending order with nulls last
        ```python
        # Sort a dataframe by age in descending order, with nulls appearing last
        df.sort(col("age").desc_nulls_last()).show()
        ```

    Example: Sort using column reference
        ```python
        # Sort using column reference with descending order and nulls last
        df.sort(col("age").desc_nulls_last()).show()
        ```
    """
    return Column._from_logical_expr(
        SortExpr(self._logical_expr, ascending=False, nulls_last=True)
    )

ends_with

ends_with(other: Union[str, Column]) -> Column

Check if the column ends with a substring.

This method creates a boolean expression that checks if each value in the column ends with the specified substring. The substring can be either a literal string or a column expression.

Parameters:

  • other (Union[str, Column]) –

    The substring to check for at the end (can be a string or column expression)

Returns:

  • Column ( Column ) –

    A boolean column indicating whether each value ends with the substring

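Find rows where email ends with "@gmail.com"
# Filter rows where the email column ends with "@gmail.com"
df.filter(col("email").ends_with("@gmail.com"))
Find rows where text ends with a dynamic pattern
# Filter rows where text ends with a value from another column
df.filter(col("text").ends_with(col("suffix")))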

Raises:

  • ValueError

    If the substring ends with a regular expression anchor ($)

Source code in src/fenic/api/column.py
def ends_with(self, other: Union[str, Column]) -> Column:
    """Check if the column ends with a substring.

    This method creates a boolean expression that checks if each value in the column
    ends with the specified substring. The substring can be either a literal string
    or a column expression.

    Args:
        other (Union[str, Column]): The substring to check for at the end (can be a string or column expression)

    Returns:
        Column: A boolean column indicating whether each value ends with the substring

    Example: Find rows where email ends with "@gmail.com"
        ```python
        df.filter(col("email").ends_with("@gmail.com"))
        ```

    Example: Find rows where text ends with a dynamic pattern
        ```python
        df.filter(col("text").ends_with(col("suffix")))
        ```

    Raises:
        ValueError: If the substring ends with a regular expression anchor ($)
    """
    if isinstance(other, str):
        return Column._from_logical_expr(EndsWithExpr(self._logical_expr, other))
    else:
        return Column._from_logical_expr(
            EndsWithExpr(self._logical_expr, other._logical_expr)
        )

get_item

get_item(key: Union[str, int]) -> Column

Access an item in a struct or array column.

This method allows accessing elements in complex data types:

  • For array columns, the key should be an integer index
  • For struct columns, the key should be a field name

Parameters:

  • key (Union[str, int]) –

    The index (for arrays) or field name (for structs) to access

Returns:

  • Column ( Column ) –

    A Column representing the accessed item

Access an array element
# Get the first element from an array column
df.select(col("array_column").get_item(0))
Access a struct field
# Get a field from a struct column
df.select(col("struct_column").get_item("field_name"))
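
The two key kinds can be sketched on a single row in plain Python, with a list standing in for an array value and a dict standing in for a struct value. The standalone `get_item` helper is hypothetical, and behavior for out-of-range indexes or missing fields is not modeled.

```python
from typing import Any, Union


def get_item(value: Any, key: Union[str, int]) -> Any:
    """Index a list by position or a dict by field name."""
    if isinstance(value, list) and isinstance(key, int):
        return value[key]
    if isinstance(value, dict) and isinstance(key, str):
        return value[key]
    raise TypeError(f"cannot use key {key!r} on {type(value).__name__}")


row = {"array_column": [10, 20, 30], "struct_column": {"field_name": "x"}}
first = get_item(row["array_column"], 0)
field = get_item(row["struct_column"], "field_name")

print(first, field)
```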
Source code in src/fenic/api/column.py
def get_item(self, key: Union[str, int]) -> Column:
    """Access an item in a struct or array column.

    This method allows accessing elements in complex data types:

    - For array columns, the key should be an integer index
    - For struct columns, the key should be a field name

    Args:
        key (Union[str, int]): The index (for arrays) or field name (for structs) to access

    Returns:
        Column: A Column representing the accessed item

    Example: Access an array element
        ```python
        # Get the first element from an array column
        df.select(col("array_column").get_item(0))
        ```

    Example: Access a struct field
        ```python
        # Get a field from a struct column
        df.select(col("struct_column").get_item("field_name"))
        ```
    """
    return Column._from_logical_expr(IndexExpr(self._logical_expr, key))

ilike

ilike(other: str) -> Column

Check if the column matches a SQL LIKE pattern (case-insensitive).

This method creates a boolean expression that checks if each value in the column matches the specified SQL LIKE pattern, ignoring case. The pattern must be a literal string and cannot be a column expression.

SQL LIKE pattern syntax:

  • % matches any sequence of characters
  • _ matches any single character

Parameters:

  • other (str) –

    The SQL LIKE pattern to match against

Returns:

  • Column ( Column ) –

    A boolean column indicating whether each value matches the pattern

Find rows where name starts with "j" and ends with "n" (case-insensitive)
# Filter rows where name matches the pattern "j%n" (case-insensitive)
df.filter(col("name").ilike("j%n"))
Find rows where code matches pattern (case-insensitive)
# Filter rows where code matches the pattern "a_b%" (case-insensitive)
df.filter(col("code").ilike("a_b%"))
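
One way to reason about a case-insensitive LIKE pattern is to translate it into a regular expression. The sketch below does that in plain Python; `like_to_regex` and the standalone `ilike` are hypothetical helpers, escape sequences are not handled, and this is not necessarily how fenic evaluates the pattern.

```python
import re


def like_to_regex(pattern: str) -> str:
    """Translate a SQL LIKE pattern into an equivalent regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")  # any sequence of characters
        elif ch == "_":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def ilike(value: str, pattern: str) -> bool:
    """Match the whole value against the pattern, ignoring case."""
    flags = re.IGNORECASE | re.DOTALL
    return re.fullmatch(like_to_regex(pattern), value, flags) is not None


print(ilike("JOAN", "j%n"), ilike("Johnny", "j%n"))
```

The pattern must cover the whole value, so "Johnny" does not match "j%n" even though it starts with "j" and contains an "n".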
Source code in src/fenic/api/column.py
def ilike(self, other: str) -> Column:
    r"""Check if the column matches a SQL LIKE pattern (case-insensitive).

    This method creates a boolean expression that checks if each value in the column
    matches the specified SQL LIKE pattern, ignoring case. The pattern must be a literal string
    and cannot be a column expression.

    SQL LIKE pattern syntax:

    - % matches any sequence of characters
    - _ matches any single character

    Args:
        other (str): The SQL LIKE pattern to match against

    Returns:
        Column: A boolean column indicating whether each value matches the pattern

    Example: Find rows where name starts with "j" and ends with "n" (case-insensitive)
        ```python
        # Filter rows where name matches the pattern "j%n" (case-insensitive)
        df.filter(col("name").ilike("j%n"))
        ```

    Example: Find rows where code matches pattern (case-insensitive)
        ```python
        # Filter rows where code matches the pattern "a_b%" (case-insensitive)
        df.filter(col("code").ilike("a_b%"))
        ```
    """
    return Column._from_logical_expr(ILikeExpr(self._logical_expr, other))

is_in

is_in(other: Union[List[Any], ColumnOrName]) -> Column

Check if the column is in a list of values or a column expression.

Parameters:

  • other (Union[List[Any], ColumnOrName]) –

    A list of values or a Column expression

Returns:

  • Column ( Column ) –

    A boolean Column indicating whether each value of this column is in the list or column expression

Check if name is in a list of values
# Filter rows where name is in a list of values
df.filter(col("name").is_in(["Alice", "Bob"]))
Check if value is in another column
# Filter rows where name is in another column
df.filter(col("name").is_in(col("other_column")))
Source code in src/fenic/api/column.py
def is_in(self, other: Union[List[Any], ColumnOrName]) -> Column:
    """Check if the column is in a list of values or a column expression.

    Args:
        other (Union[List[Any], ColumnOrName]): A list of values or a Column expression

    Returns:
        Column: A Column expression representing whether each element of Column is in the list

    Example: Check if name is in a list of values
        ```python
        # Filter rows where name is in a list of values
        df.filter(col("name").is_in(["Alice", "Bob"]))
        ```

    Example: Check if value is in another column
        ```python
        # Filter rows where name is in another column
        df.filter(col("name").is_in(col("other_column")))
        ```
    """
    if isinstance(other, list):
        try:
            type_ = infer_dtype_from_pyobj(other)
            return Column._from_logical_expr(InExpr(self._logical_expr, LiteralExpr(other, type_)))
        except TypeInferenceError as e:
            raise ValidationError(f"Cannot apply IN on {other}. List argument to IN must be a valid Python List literal.") from e
    else:
        return Column._from_logical_expr(InExpr(self._logical_expr, other._logical_expr))

is_not_null

is_not_null() -> Column

Check if the column contains non-NULL values.

This method creates an expression that evaluates to TRUE when the column value is not NULL.

Returns:

  • Column ( Column ) –

    A Column representing a boolean expression that is TRUE when this column is not NULL

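Filter rows where a column is not NULL
# Filter rows where some_column is not NULL
df.filter(col("some_column").is_not_null())
Use in a complex condition
# Filter rows where col1 is not NULL and col2 is at most 50
df.filter(col("col1").is_not_null() & (col("col2") <= 50))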
Source code in src/fenic/api/column.py
def is_not_null(self) -> Column:
    """Check if the column contains non-NULL values.

    This method creates an expression that evaluates to TRUE when the column value is not NULL.

    Returns:
        Column: A Column representing a boolean expression that is TRUE when this column is not NULL

    Example: Filter rows where a column is not NULL
        ```python
        df.filter(col("some_column").is_not_null())
        ```

    Example: Use in a complex condition
        ```python
        df.filter(col("col1").is_not_null() & (col("col2") <= 50))
        ```
    """
    return Column._from_logical_expr(IsNullExpr(self._logical_expr, False))

is_null

is_null() -> Column

Check if the column contains NULL values.

This method creates an expression that evaluates to TRUE when the column value is NULL.

Returns:

  • Column ( Column ) –

    A Column representing a boolean expression that is TRUE when this column is NULL

Filter rows where a column is NULL
# Filter rows where some_column is NULL
df.filter(col("some_column").is_null())
Use in a complex condition
# Filter rows where col1 is NULL or col2 is greater than 100
df.filter(col("col1").is_null() | (col("col2") > 100))
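
The combined conditions can be sketched in plain Python, with `None` standing in for NULL. The `rows` data is invented and deliberately has no NULL in `col2`, because how comparisons against NULL behave is not covered here.

```python
rows = [
    {"col1": None, "col2": 10},
    {"col1": "a", "col2": 150},
    {"col1": "b", "col2": 20},
]

# Counterpart of col("col1").is_null() | (col("col2") > 100)
null_or_large = [r for r in rows if r["col1"] is None or r["col2"] > 100]

# Counterpart of col("col1").is_not_null() & (col("col2") <= 50)
present_and_small = [r for r in rows if r["col1"] is not None and r["col2"] <= 50]

print(null_or_large)
print(present_and_small)
```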
Source code in src/fenic/api/column.py
def is_null(self) -> Column:
    """Check if the column contains NULL values.

    This method creates an expression that evaluates to TRUE when the column value is NULL.

    Returns:
        Column: A Column representing a boolean expression that is TRUE when this column is NULL

    Example: Filter rows where a column is NULL
        ```python
        # Filter rows where some_column is NULL
        df.filter(col("some_column").is_null())
        ```

    Example: Use in a complex condition
        ```python
        # Filter rows where col1 is NULL or col2 is greater than 100
        df.filter(col("col1").is_null() | (col("col2") > 100))
        ```
    """
    return Column._from_logical_expr(IsNullExpr(self._logical_expr, True))

like

like(other: str) -> Column

Check if the column matches a SQL LIKE pattern.

This method creates a boolean expression that checks if each value in the column matches the specified SQL LIKE pattern. The pattern must be a literal string and cannot be a column expression.

SQL LIKE pattern syntax:

  • % matches any sequence of characters
  • _ matches any single character

Parameters:

  • other (str) –

    The SQL LIKE pattern to match against

Returns:

  • Column ( Column ) –

    A boolean column indicating whether each value matches the pattern

Find rows where name starts with "J" and ends with "n"
# Filter rows where name matches the pattern "J%n"
df.filter(col("name").like("J%n"))
Find rows where code matches specific pattern
# Filter rows where code matches the pattern "A_B%"
df.filter(col("code").like("A_B%"))
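
To make the two wildcards concrete, here is a minimal sketch that translates a LIKE pattern into a Python regular expression. It is not how fenic evaluates `like`; it assumes case-sensitive matching against the whole value and no escape character, and the helpers `like_to_regex` and `like` are hypothetical.

```python
import re
from typing import Optional

def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a SQL LIKE pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")           # % matches any sequence of characters
        elif ch == "_":
            parts.append(".")            # _ matches exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is matched literally
    return re.compile("".join(parts), re.DOTALL)

def like(value: Optional[str], pattern: str) -> Optional[bool]:
    """Return None for a missing value, else whether the whole value matches."""
    if value is None:
        return None
    return like_to_regex(pattern).fullmatch(value) is not None

print(like("John", "J%n"))      # matches: starts with "J", ends with "n"
print(like("Johnny", "J%n"))    # no match: does not end with "n"
print(like("AxB-42", "A_B%"))   # matches: one character between "A" and "B"
```

Because other characters are escaped, a pattern such as `"a.b"` matches only the literal string `a.b`, unlike the same text used as a regular expression.
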
Source code in src/fenic/api/column.py
def like(self, other: str) -> Column:
    r"""Check if the column matches a SQL LIKE pattern.

    This method creates a boolean expression that checks if each value in the column
    matches the specified SQL LIKE pattern. The pattern must be a literal string
    and cannot be a column expression.

    SQL LIKE pattern syntax:

    - % matches any sequence of characters
    - _ matches any single character

    Args:
        other (str): The SQL LIKE pattern to match against

    Returns:
        Column: A boolean column indicating whether each value matches the pattern

    Example: Find rows where name starts with "J" and ends with "n"
        ```python
        # Filter rows where name matches the pattern "J%n"
        df.filter(col("name").like("J%n"))
        ```

    Example: Find rows where code matches specific pattern
        ```python
        # Filter rows where code matches the pattern "A_B%"
        df.filter(col("code").like("A_B%"))
        ```
    """
    return Column._from_logical_expr(LikeExpr(self._logical_expr, other))

otherwise

otherwise(value: Column) -> Column

Evaluates a list of conditions and returns one of multiple possible result expressions.

If Column.otherwise() is not invoked, None is returned for rows that match no condition. When it is invoked, its value is also returned for rows whose inputs are None.

Parameters:

  • value (Column) –

    A literal value or Column expression to return

Returns:

  • Column ( Column ) –

    A Column expression that yields the result of the first matching condition, or value for rows not matched by any previous condition

Use when/otherwise for conditional logic
# Create a DataFrame with age and name columns
df = session.create_dataframe(
    {"age": [2, 5], "name": ["Alice", "Bob"]}
)

# Use when/otherwise to create a case result column
df.select(
    col("name"),
    when(col("age") > 3, 1).otherwise(0).alias("case_result")
).show()
# Output:
# +-----+-----------+
# | name|case_result|
# +-----+-----------+
# |Alice|          0|
# |  Bob|          1|
# +-----+-----------+
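
The effect of omitting `otherwise()` can be modeled in plain Python. The sketch below is an illustration only: `None` stands in for NULL, the row for "Cara" is a hypothetical addition with a missing age, and the helpers `age_over_3` and `when_otherwise` are not part of fenic.

```python
from typing import Any, Callable, Dict, Optional

Row = Dict[str, Any]

def age_over_3(row: Row) -> Optional[bool]:
    """Condition col("age") > 3, yielding None when age is missing."""
    age = row["age"]
    return None if age is None else age > 3

def when_otherwise(
    row: Row,
    condition: Callable[[Row], Optional[bool]],
    value: Any,
    default: Any = None,
) -> Any:
    """Return value when the condition is exactly True, else the default."""
    return value if condition(row) is True else default

rows = [
    {"name": "Alice", "age": 2},
    {"name": "Bob", "age": 5},
    {"name": "Cara", "age": None},  # hypothetical row with a missing input
]

# Equivalent of when(col("age") > 3, 1).otherwise(0)
with_default = [when_otherwise(r, age_over_3, 1, default=0) for r in rows]
# Equivalent of when(col("age") > 3, 1) with no otherwise()
without_default = [when_otherwise(r, age_over_3, 1) for r in rows]

print(with_default)
print(without_default)
```

In this model the row with a missing age falls through to the default, so it receives 0 when `otherwise(0)` is present and None when it is not.
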
Source code in src/fenic/api/column.py
def otherwise(self, value: Column) -> Column:
    """Evaluates a list of conditions and returns one of multiple possible result expressions.

    If Column.otherwise() is not invoked, None is returned for rows that match no condition.
    When it is invoked, its value is also returned for rows whose inputs are None.

    Args:
        value (Column): A literal value or Column expression to return

    Returns:
        Column: A Column expression that yields the result of the first matching condition, or value for rows not matched by any previous condition

    Example: Use when/otherwise for conditional logic
        ```python
        # Create a DataFrame with age and name columns
        df = session.create_dataframe(
            {"age": [2, 5], "name": ["Alice", "Bob"]}
        )

        # Use when/otherwise to create a case result column
        df.select(
            col("name"),
            when(col("age") > 3, 1).otherwise(0).alias("case_result")
        ).show()
        # Output:
        # +-----+-----------+
        # | name|case_result|
        # +-----+-----------+
        # |Alice|          0|
        # |  Bob|          1|
        # +-----+-----------+
        ```
    """
    return Column._from_logical_expr(OtherwiseExpr(self._logical_expr, value._logical_expr))

rlike

rlike(other: str) -> Column

Check if the column matches a regular expression pattern.

This method creates a boolean expression that checks if each value in the column matches the specified regular expression pattern. The pattern must be a literal string and cannot be a column expression.

Parameters:

  • other (str) –

    The regular expression pattern to match against

Returns:

  • Column ( Column ) –

    A boolean column indicating whether each value matches the pattern

Find rows where phone number matches pattern
# Filter rows where phone number matches a specific pattern
df.filter(col("phone").rlike(r"^\d{3}-\d{3}-\d{4}$"))
Find rows where text contains word boundaries
# Filter rows where text contains a word with boundaries
df.filter(col("text").rlike(r"\bhello\b"))
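
The two patterns above can be tried out with Python's `re` module. This is a sketch rather than fenic's implementation: it assumes `rlike` succeeds when the pattern matches anywhere in the value unless anchored (as the word-boundary example suggests), and fenic's regular expression dialect may differ from Python's. The helper `rlike` is hypothetical.

```python
import re
from typing import Optional

PHONE = re.compile(r"^\d{3}-\d{3}-\d{4}$")  # anchored: the whole value must be a phone number
HELLO = re.compile(r"\bhello\b")            # unanchored: "hello" as a whole word, anywhere

def rlike(value: Optional[str], pattern: re.Pattern) -> Optional[bool]:
    """Return None for a missing value, else whether the pattern is found in it."""
    if value is None:
        return None
    return pattern.search(value) is not None

print(rlike("555-123-4567", PHONE))       # whole value is a phone number
print(rlike("call 555-123-4567", PHONE))  # rejected by the ^ anchor
print(rlike("say hello there", HELLO))    # "hello" appears as a whole word
print(rlike("othello", HELLO))            # "hello" is only part of a longer word
```
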
Source code in src/fenic/api/column.py
def rlike(self, other: str) -> Column:
    r"""Check if the column matches a regular expression pattern.

    This method creates a boolean expression that checks if each value in the column
    matches the specified regular expression pattern. The pattern must be a literal string
    and cannot be a column expression.

    Args:
        other (str): The regular expression pattern to match against

    Returns:
        Column: A boolean column indicating whether each value matches the pattern

    Example: Find rows where phone number matches pattern
        ```python
        # Filter rows where phone number matches a specific pattern
        df.filter(col("phone").rlike(r"^\d{3}-\d{3}-\d{4}$"))
        ```

    Example: Find rows where text contains word boundaries
        ```python
        # Filter rows where text contains a word with boundaries
        df.filter(col("text").rlike(r"\bhello\b"))
        ```
    """
    return Column._from_logical_expr(RLikeExpr(self._logical_expr, other))

starts_with

starts_with(other: Union[str, Column]) -> Column

Check if the column starts with a substring.

This method creates a boolean expression that checks if each value in the column starts with the specified substring. The substring can be either a literal string or a column expression.

Parameters:

  • other (Union[str, Column]) –

    The substring to check for at the start (can be a string or column expression)

Returns:

  • Column ( Column ) –

    A boolean column indicating whether each value starts with the substring

Find rows where name starts with "Mr"
# Filter rows where name starts with "Mr"
df.filter(col("name").starts_with("Mr"))
Find rows where text starts with a dynamic pattern
# Filter rows where text starts with a value from another column
df.filter(col("text").starts_with(col("prefix")))

Raises:

  • ValueError

    If the substring starts with a regular expression anchor (^)
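
The difference between a literal prefix and a column of prefixes can be sketched in plain Python, with lists standing in for columns and `None` for NULL. The function below is a hypothetical model, not fenic code; the `ValueError` branch mirrors the documented behavior for a leading `^`, and returning None for missing values is an assumption.

```python
from typing import List, Optional, Sequence, Union

def starts_with(
    values: Sequence[Optional[str]],
    prefix: Union[str, Sequence[Optional[str]]],
) -> List[Optional[bool]]:
    """Row-wise prefix check against one literal or a per-row list of prefixes."""
    if isinstance(prefix, str):
        if prefix.startswith("^"):
            raise ValueError("prefix must not start with a regular expression anchor (^)")
        prefixes: List[Optional[str]] = [prefix] * len(values)  # same literal for every row
    else:
        prefixes = list(prefix)                                 # one prefix per row
    return [
        None if v is None or p is None else v.startswith(p)
        for v, p in zip(values, prefixes)
    ]

names = ["Mr Smith", "Ms Jones", None]
print(starts_with(names, "Mr"))                   # literal prefix
print(starts_with(["abc", "abc"], ["ab", "bc"]))  # prefix taken from another column
```
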

Source code in src/fenic/api/column.py
def starts_with(self, other: Union[str, Column]) -> Column:
    """Check if the column starts with a substring.

    This method creates a boolean expression that checks if each value in the column
    starts with the specified substring. The substring can be either a literal string
    or a column expression.

    Args:
        other (Union[str, Column]): The substring to check for at the start (can be a string or column expression)

    Returns:
        Column: A boolean column indicating whether each value starts with the substring

    Example: Find rows where name starts with "Mr"
        ```python
        # Filter rows where name starts with "Mr"
        df.filter(col("name").starts_with("Mr"))
        ```

    Example: Find rows where text starts with a dynamic pattern
        ```python
        # Filter rows where text starts with a value from another column
        df.filter(col("text").starts_with(col("prefix")))
        ```

    Raises:
        ValueError: If the substring starts with a regular expression anchor (^)
    """
    if isinstance(other, str):
        return Column._from_logical_expr(StartsWithExpr(self._logical_expr, other))
    else:
        return Column._from_logical_expr(
            StartsWithExpr(self._logical_expr, other._logical_expr)
        )

when

when(condition: Column, value: Column) -> Column

Evaluates a list of conditions and returns one of multiple possible result expressions.

If Column.otherwise() is not invoked, None is returned for rows that match no condition. When it is invoked, its value is also returned for rows whose inputs are None.

Parameters:

  • condition (Column) –

    A boolean Column expression

  • value (Column) –

    A literal value or Column expression to return if the condition is true

Returns:

  • Column ( Column ) –

    A Column expression that yields value for rows where the condition is true and no earlier condition matched

Raises:

  • TypeError

    If the condition is not a boolean Column expression

Use when/otherwise for conditional logic
# Create a DataFrame with age and name columns
df = session.create_dataframe(
    {"age": [2, 5], "name": ["Alice", "Bob"]}
)

# Use when/otherwise to create a case result column
df.select(
    col("name"),
    when(col("age") > 3, 1).otherwise(0).alias("case_result")
).show()
# Output:
# +-----+-----------+
# | name|case_result|
# +-----+-----------+
# |Alice|          0|
# |  Bob|          1|
# +-----+-----------+
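
When several `when` calls are chained, the order of the conditions matters. The sketch below models the usual CASE WHEN rule that the first matching branch wins; it assumes fenic follows that rule, and the helper `first_match` and the labels are hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]
Branch = Tuple[Callable[[Row], bool], Any]

def first_match(row: Row, branches: List[Branch], default: Any = None) -> Any:
    """Return the value of the first branch whose condition is True, else the default."""
    for condition, value in branches:
        if condition(row) is True:
            return value
    return default

rows = [{"age": 2}, {"age": 5}, {"age": 15}]

# Most specific condition first: age > 12 is tested before age > 3.
specific_first: List[Branch] = [
    (lambda r: r["age"] > 12, "teen_or_older"),
    (lambda r: r["age"] > 3, "child"),
]
# Reversed order: age > 3 shadows age > 12, so the second branch never fires.
general_first: List[Branch] = list(reversed(specific_first))

print([first_match(r, specific_first, default="toddler") for r in rows])
print([first_match(r, general_first, default="toddler") for r in rows])
```

Placing the narrower condition first keeps it from being shadowed by a broader one.
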
Source code in src/fenic/api/column.py
def when(self, condition: Column, value: Column) -> Column:
    """Evaluates a list of conditions and returns one of multiple possible result expressions.

    If Column.otherwise() is not invoked, None is returned for rows that match no condition.
    When it is invoked, its value is also returned for rows whose inputs are None.

    Args:
        condition (Column): A boolean Column expression
        value (Column): A literal value or Column expression to return if the condition is true

    Returns:
        Column: A Column expression that yields value for rows where the condition is true and no earlier condition matched

    Raises:
        TypeError: If the condition is not a boolean Column expression

    Example: Use when/otherwise for conditional logic
        ```python
        # Create a DataFrame with age and name columns
        df = session.create_dataframe(
            {"age": [2, 5], "name": ["Alice", "Bob"]}
        )

        # Use when/otherwise to create a case result column
        df.select(
            col("name"),
            when(col("age") > 3, 1).otherwise(0).alias("case_result")
        ).show()
        # Output:
        # +-----+-----------+
        # | name|case_result|
        # +-----+-----------+
        # |Alice|          0|
        # |  Bob|          1|
        # +-----+-----------+
        ```
    """
    return Column._from_logical_expr(WhenExpr(self._logical_expr, condition._logical_expr, value._logical_expr))