fenic.api.functions.json
JSON functions.
Functions:
- contains – Check if a JSON value contains the specified value using recursive deep search.
- get_type – Get the JSON type of each value.
- jq – Applies a JQ query to a column containing JSON-formatted strings.
contains
contains(column: ColumnOrName, value: str) -> Column
Check if a JSON value contains the specified value using recursive deep search.
Parameters:
- column (ColumnOrName) – Input column of type JsonType.
- value (str) – Valid JSON string to search for.
Returns:
- Column (Column) – A column of booleans indicating whether the JSON contains the value.
Matching Rules
- Objects: Uses partial matching - {"role": "admin"} matches {"role": "admin", "level": 5}
- Arrays: Uses exact matching - [1, 2] only matches exactly [1, 2], not [1, 2, 3]
- Primitives: Uses exact matching - 42 matches 42 but not "42"
- Search is recursive: Searches at all nesting levels throughout the JSON structure
- Type-aware: Distinguishes between 42 (number) and "42" (string)
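The matching rules above can be modelled in plain Python to make their interaction concrete. This is a minimal sketch and not fenic's implementation: the function names (`strict_equal`, `node_matches`, `contains_sketch`) are hypothetical, the standard library `json` module is imported as `stdlib_json` to avoid confusion with `fenic.api.functions.json`, and it assumes that values inside a partially matched object are compared exactly.

```python
import json as stdlib_json
from typing import Any


def strict_equal(a: Any, b: Any) -> bool:
    """Type-aware equality: 42 != "42", and true/false are not numbers."""
    if isinstance(a, bool) or isinstance(b, bool):
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a == b
    if type(a) is not type(b):
        return False
    if isinstance(a, list):
        # Arrays: same length, same elements, same order.
        return len(a) == len(b) and all(strict_equal(x, y) for x, y in zip(a, b))
    if isinstance(a, dict):
        return a.keys() == b.keys() and all(strict_equal(a[k], b[k]) for k in a)
    return a == b  # strings and null


def node_matches(node: Any, target: Any) -> bool:
    """Objects match partially; arrays and primitives must match exactly."""
    if isinstance(target, dict):
        # Assumption: every key of the target must be present with an equal value.
        return isinstance(node, dict) and all(
            key in node and strict_equal(node[key], val) for key, val in target.items()
        )
    return strict_equal(node, target)


def contains_sketch(document: str, value: str) -> bool:
    """Search every nesting level of `document` for `value` (both JSON strings)."""
    target = stdlib_json.loads(value)

    def walk(node: Any) -> bool:
        if node_matches(node, target):
            return True
        if isinstance(node, dict):
            return any(walk(child) for child in node.values())
        if isinstance(node, list):
            return any(walk(child) for child in node)
        return False

    return walk(stdlib_json.loads(document))
```

With this model, `contains_sketch('{"user": {"name": "Alice"}}', '{"name": "Alice"}')` is true because the search descends into `user`, while `contains_sketch('["read", "write", "admin"]', '["read", "write"]')` is false because arrays must match exactly.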
Find objects with partial structure match
# Find objects with partial structure match (at any nesting level)
df.select(json.contains(col("json_data"), '{"name": "Alice"}'))
# Matches: {"name": "Alice", "age": 30} and {"user": {"name": "Alice"}}
Find exact array match
# Find exact array match (at any nesting level)
df.select(json.contains(col("json_data"), '["read", "write"]'))
# Matches: {"permissions": ["read", "write"]} but not ["read", "write", "admin"]
Find exact primitive values
# Find exact primitive values (at any nesting level)
df.select(json.contains(col("json_data"), '"admin"'))
# Matches: {"role": "admin"} and ["admin", "user"] but not {"role": "administrator"}
Type distinction matters
# Type distinction matters
df.select(json.contains(col("json_data"), '42')) # number 42
df.select(json.contains(col("json_data"), '"42"')) # string "42"
Raises:
- ValidationError – If value is not valid JSON.
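Because `value` must itself be valid JSON, a bare word such as `admin` is rejected while `"admin"` (with the inner quotes) is accepted. One way to avoid quoting mistakes is to build the string from a Python object. The sketch below uses only the standard library, imported as `stdlib_json` to keep it distinct from `fenic.api.functions.json`. The helper names are hypothetical, and it assumes fenic's validation agrees with Python's parser for ordinary inputs.

```python
import json as stdlib_json
from typing import Any


def to_search_value(obj: Any) -> str:
    """Serialize a Python object into a JSON string suitable for `value`."""
    return stdlib_json.dumps(obj)


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as JSON according to Python's parser."""
    try:
        stdlib_json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True
```

For example, `to_search_value("admin")` produces the string `"admin"` including its double quotes, and `to_search_value(42)` produces `42`, which preserves the string-versus-number distinction described in the matching rules.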
Source code in src/fenic/api/functions/json.py
get_type
get_type(column: ColumnOrName) -> Column
Get the JSON type of each value.
Parameters:
- column (ColumnOrName) – Input column of type JsonType.
Returns:
- Column (Column) – A column of strings indicating the JSON type ("string", "number", "boolean", "array", "object", "null").
Get JSON types
df.select(json.get_type(col("json_data")))
Filter by type
# Filter by type
df.filter(json.get_type(col("data")) == "array")
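The six type names listed above correspond to the standard JSON value kinds. As a rough illustration of that classification (not fenic's implementation), the following sketch maps a JSON string to the same names using Python's standard library. The function name `json_type_name` is hypothetical.

```python
import json as stdlib_json


def json_type_name(raw: str) -> str:
    """Classify a JSON document by the type of its top-level value."""
    value = stdlib_json.loads(raw)
    if value is None:
        return "null"
    # bool must be checked before int, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"unexpected parsed type: {type(value).__name__}")
```

Note that only the top-level value is classified: `[1, 2]` is an "array" regardless of what its elements are.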
Source code in src/fenic/api/functions/json.py
jq
jq(column: ColumnOrName, query: str) -> Column
Applies a JQ query to each JSON value in a column.
Parameters:
- column (ColumnOrName) – Input column of type JsonType.
- query (str) – A JQ expression used to extract or transform values.
Returns:
- Column (Column) – A column containing the result of applying the JQ query to each row's JSON input.
Notes
- The input column must be of type JsonType. Use cast(JsonType) if needed to ensure correct typing.
- This function supports extracting nested fields, transforming arrays/objects, and other standard JQ operations.
Extract nested field
# Extract the "user.name" field from a JSON column
df.select(json.jq(col("json_col"), ".user.name"))
Cast to JsonType before querying
df.select(json.jq(col("raw_json").cast(JsonType), ".event.type"))
Work with arrays
# Work with arrays using JQ functions
df.select(json.jq(col("json_array"), "map(.id)"))
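For readers less familiar with JQ syntax, the two queries used above can be approximated in plain Python. This sketch only illustrates what the JQ expressions compute for well-formed inputs. It does not show how fenic represents the resulting column, and the helper names are hypothetical. The standard library module is imported as `stdlib_json` to keep it distinct from `fenic.api.functions.json`.

```python
import json as stdlib_json
from typing import Any, List


def dot_user_name(raw: str) -> Any:
    """Approximates the JQ query `.user.name` for an object input."""
    doc = stdlib_json.loads(raw)
    user = doc.get("user")
    # In JQ, indexing null yields null, so a missing "user" gives null (None here).
    return None if user is None else user.get("name")


def map_id(raw: str) -> List[Any]:
    """Approximates the JQ query `map(.id)` for an array of objects."""
    items = stdlib_json.loads(raw)
    # A missing "id" key yields null in JQ, which is None in Python.
    return [item.get("id") for item in items]
```

In both cases a missing key produces null rather than an error, which is standard JQ behaviour for object inputs.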
Source code in src/fenic/api/functions/json.py