Skip to content

Add deprecation warnings for Expr passed to confirmed literal-only function arguments#1605

Open
kosiew wants to merge 2 commits into
apache:mainfrom
kosiew:deprecation-typehint-02-1496
Open

Add deprecation warnings for Expr passed to confirmed literal-only function arguments#1605
kosiew wants to merge 2 commits into
apache:mainfrom
kosiew:deprecation-typehint-02-1496

Conversation

@kosiew

@kosiew kosiew commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Which issue does this PR close?

Rationale for this change

As part of the audit of Expr | literal function arguments, this change targets a set of arguments that act as control/configuration parameters rather than row-varying data expressions. These arguments are intended to be provided as Python literals (for example, encoding names, digest methods, data type specifiers, and metadata keys).

To prepare for future API cleanup while preserving compatibility, passing an Expr to these arguments now emits a DeprecationWarning but continues to work.

What changes are included in this PR?

  • Added a reusable helper, _warn_if_expr_for_literal_arg, that emits a deprecation warning when an Expr is supplied to a confirmed literal-only argument.

  • Added deprecation warnings for the following function arguments:

    • encode(..., encoding=...)
    • decode(..., encoding=...)
    • digest(..., method=...)
    • arrow_cast(..., data_type=...)
    • arrow_try_cast(..., data_type=...)
    • arrow_metadata(..., key=...)
  • Preserved existing behavior by continuing to coerce these values to expressions after warning.

  • Added tests covering both warning and non-warning cases.

Are these changes tested?

Yes.

Added tests in python/tests/test_functions.py that verify:

  • Passing literal(...) / Expr values to the audited literal-only arguments emits a DeprecationWarning.

  • Passing native Python literal values does not emit a DeprecationWarning.

  • Coverage includes:

    • encode
    • decode
    • digest
    • arrow_cast
    • arrow_try_cast
    • arrow_metadata

Are there any user-facing changes?

Yes.

Users will now receive a DeprecationWarning when passing an Expr to the following literal-only arguments:

  • encode(..., encoding=...)
  • decode(..., encoding=...)
  • digest(..., method=...)
  • arrow_cast(..., data_type=...)
  • arrow_try_cast(..., data_type=...)
  • arrow_metadata(..., key=...)

Existing behavior is otherwise unchanged, and Python literal inputs remain fully supported without warnings.

LLM-generated code disclosure

This PR includes code, comments generated with assistance from LLM. All LLM-generated content has been manually reviewed.

kosiew added 2 commits June 18, 2026 16:45
- Introduced shared `_warn_if_expr_for_literal_arg` in `functions/__init__.py`
- Added `DeprecationWarning` for the following methods when `Expr` is passed as argument:
- `encode(..., encoding=Expr)`
- `decode(..., encoding=Expr)`
- `digest(..., method=Expr)`
- `arrow_cast(..., data_type=Expr)`
- `arrow_try_cast(..., data_type=Expr)`
- `arrow_metadata(..., key=Expr)`

test: update tests to check for warnings

- Implemented tests in `test_functions.py` to ensure:
- Warning is raised for `Expr` form
- No warning for native literal form
@kosiew kosiew marked this pull request as ready for review June 18, 2026 12:30

@nuno-faria nuno-faria left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kosiew. In addition to the comments below, it's better to update the arrow_cast calls at user-guide/common-operations/functions.md which are still using expressions.

Comment on lines 67 to +82
def _warn_expr_for_literal_arg(function_name: str, arg_name: str) -> None:
warnings.warn(
f"Passing Expr for {function_name}() argument {arg_name!r} is deprecated; "
"pass a Python literal instead.",
DeprecationWarning,
stacklevel=4,
)


def _warn_if_expr_for_literal_arg(
value: Any, function_name: str, arg_name: str
) -> None:
if isinstance(value, Expr):
_warn_expr_for_literal_arg(function_name, arg_name)


Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be better to combine these two into a single function?

Comment on lines 1603 to 1619
def test_binary_string_functions(df):
df = df.select(
f.encode(column("a").cast(pa.string()), literal("base64").cast(pa.string())),
f.decode(
f.encode(
column("a").cast(pa.string()), literal("base64").cast(pa.string())
),
literal("base64").cast(pa.string()),
),
)
result = df.collect()
assert len(result) == 1
result = result[0]
assert result.column(0) == pa.array(["SGVsbG8", "V29ybGQ", "IQ"])
assert pa.array(result.column(1)).cast(pa.string()) == pa.array(
["Hello", "World", "!"]
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The calls to encode and decode need to be updated to not raise warnings.

assert result.column(0) == expected_result


def test_hash_functions(df):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The calls here also need to be updated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants