Skip to content

fix: detect ScyllaDB via SUPPORTED protocol extensions, not shard count#910

Open
mykaul wants to merge 1 commit into
scylladb:masterfrom
mykaul:fix-scylla-detection-without-sharding
Open

fix: detect ScyllaDB via SUPPORTED protocol extensions, not shard count#910
mykaul wants to merge 1 commit into
scylladb:masterfrom
mykaul:fix-scylla-detection-without-sharding

Conversation

@mykaul

@mykaul mykaul commented Jun 16, 2026

Copy link
Copy Markdown

Summary

The driver previously identified ScyllaDB solely by the presence of shard-related fields (SCYLLA_NR_SHARDS, SCYLLA_SHARD, etc.) in the SUPPORTED response.
When shard-awareness is disabled on the server side (allow_shard_aware_drivers: false), those fields are absent, causing the driver to misidentify a ScyllaDB cluster as Cassandra.
The immediate consequence is that the driver leaves peers_v2 enabled (which ScyllaDB does not support) and fails to connect — the same regression described in scylladb/gocql#902.

This fix mirrors scylladb/gocql#903.

Changes

cassandra/protocol_features.py

  • Add ProtocolFeatures.is_scylla bool (default False), set by the new detect_scylla() static method.
  • detect_scylla() returns True when any known Scylla-specific extension key (SCYLLA_LWT_ADD_METADATA_MARK, SCYLLA_RATE_LIMIT_ERROR, TABLETS_ROUTING_V1) is present in the SUPPORTED response, or when sharding_info is populated.
    These keys are advertised by ScyllaDB regardless of whether shard-awareness is enabled, so is_scylla stays True even with sharding disabled.
  • Fix latent crash in parse_sharding_info: int(shard_id) raised TypeError when SCYLLA_PARTITIONER/SCYLLA_SHARDING_ALGORITHM was present without SCYLLA_SHARD. Default to 0.

cassandra/shard_info.py / cassandra/c_shard_info.pyx

  • _ShardingInfo.__init__ / ShardingInfo.__init__: guard int(shards_count) and int(sharding_ignore_msb) against None (default to 0), matching the existing pattern used for shard_aware_port. Fixes the remaining crash path where a shard-defining field was present without SCYLLA_NR_SHARDS.

cassandra/cluster.py

  • _uses_peers_v2 disable and _metadata_request_timeout now gate on is_scylla instead of sharding_info is not None, so both work correctly when sharding is disabled.

cassandra/metadata.py

  • _is_not_scylla() now uses not is_scylla instead of shard_id is None.
    The old check was always False after the OPTIONS exchange (Cassandra gets shard_id=0, and 0 is None is False), meaning trigger-metadata queries were silently skipped even for Cassandra/DSE connections.

tests/unit/test_protocol_features.py

  • 7 new unit tests covering: LWT-only / rate-limit-only / tablets-only detection, full-sharding detection, pure-Cassandra non-detection, and two crash-regression tests for the int(None) paths.

Shard-aware pooling is unaffected

pool.py continues to gate per-shard connection opening on sharding_info is not None.
A ScyllaDB node with sharding disabled correctly gets is_scylla=True (so peers_v2 is disabled and USING TIMEOUT works) but sharding_info=None (so no per-shard connections are opened). This matches the gocql#903 fix's isScyllaConn() && nrShards != 0 guard.

Note on Cassandra behavioral change

_is_not_scylla() was previously always returning False post-OPTIONS for Cassandra (shard_id=0, 0 is None = False), so system_schema.triggers queries were silently skipped even for Cassandra/DSE. With this fix, Cassandra/DSE connections
will correctly issue the trigger-metadata query and populate TableMetadata.triggers.

See also

@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Copy link
Copy Markdown

Review Change Stack

Important

Review skipped

No new commits to review since the last review.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 4d8b4f8b-199c-4873-a47d-1c4f3f9933e3

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

ProtocolFeatures now stores an is_scylla flag derived from SUPPORTED data and handles missing sharding fields by defaulting them to 0. ControlConnection._try_connect() uses connection.features.is_scylla to choose peers_v2 and metadata timeout behavior, and SchemaParserV3._is_not_scylla() now returns not is_scylla. Unit tests cover detection via Scylla-specific extensions, sharding metadata, Cassandra-only payloads, and partial sharding advertisements.

Sequence Diagram(s)

sequenceDiagram
  participant ProtocolFeatures
  participant ControlConnection
  participant SchemaParserV3
  ProtocolFeatures->>ProtocolFeatures: parse_from_supported(supported)
  ProtocolFeatures->>ProtocolFeatures: detect_scylla(supported, sharding_info)
  ProtocolFeatures-->>ControlConnection: is_scylla
  ControlConnection->>ControlConnection: set _uses_peers_v2 and metadata timeout
  ControlConnection->>SchemaParserV3: _is_not_scylla(features)
  SchemaParserV3-->>ControlConnection: not is_scylla
Loading
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main change: ScyllaDB detection now relies on SUPPORTED extensions instead of shard count.
Description check ✅ Passed The description covers the summary, change details, impacts, and tests, though it omits the template's pre-review checklist section.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

The driver previously identified ScyllaDB by the presence of shard-related
fields (SCYLLA_NR_SHARDS, SCYLLA_SHARD, etc.) in the SUPPORTED response.
When shard-awareness is disabled on the server side
(allow_shard_aware_drivers: false) those fields are absent, causing the
driver to misidentify a ScyllaDB cluster as Cassandra.  The immediate
consequence is that the driver enables peers_v2 (which ScyllaDB does not
support) and fails to connect -- the same regression described in
scylladb/gocql#902.

Fix (mirrors scylladb/gocql#903):

* Add ProtocolFeatures.is_scylla: set to True whenever ANY known
  Scylla-specific extension key (SCYLLA_LWT_ADD_METADATA_MARK,
  SCYLLA_RATE_LIMIT_ERROR, TABLETS_ROUTING_V1) is present in the
  SUPPORTED response, or whenever sharding_info is populated.  This
  flag stays True even when sharding is disabled.

* ProtocolFeatures.sharding_info remains None when sharding is disabled,
  so shard-aware connection pooling (pool.py) is correctly left inactive.

* cluster.py: use is_scylla (not sharding_info is not None) to gate the
  peers_v2 disable and the USING TIMEOUT metadata request timeout.

* metadata.py _is_not_scylla(): use is_scylla instead of the previous
  check (shard_id is None), which was always False after the OPTIONS
  exchange and therefore silently broke Cassandra trigger-metadata queries.

* protocol_features.py parse_sharding_info(): fix a latent crash -- if
  SCYLLA_PARTITIONER or SCYLLA_SHARDING_ALGORITHM were present without
  SCYLLA_SHARD, int(None) would raise TypeError.  Default shard_id to 0.

Fixes: scylladb/python-driver#<TBD>
See also: scylladb/gocql#902, scylladb/gocql#903

Fixes: scylladb#909
@mykaul mykaul force-pushed the fix-scylla-detection-without-sharding branch from a64d05e to bea45ac Compare June 16, 2026 13:32
@mykaul mykaul requested a review from Copilot June 16, 2026 15:50

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes ScyllaDB detection by using Scylla-specific SUPPORTED protocol extension keys (and sharding info when present) instead of relying on shard-related fields alone, preventing misclassification as Cassandra when shard-awareness is disabled and avoiding peers_v2-related connection failures.

Changes:

  • Add ProtocolFeatures.is_scylla and implement Scylla detection via known Scylla-specific SUPPORTED extension keys.
  • Harden sharding parsing/initialization against missing shard fields to prevent int(None) crashes.
  • Gate peers_v2 usage, USING TIMEOUT, and trigger-metadata behavior on is_scylla instead of shard-awareness heuristics; add unit tests for detection/regressions.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
cassandra/protocol_features.py Introduces is_scylla detection and adjusts sharding parsing to avoid crashes.
cassandra/shard_info.py Guards integer conversion for missing shard-related fields.
cassandra/c_shard_info.pyx Mirrors shard-info conversion guards in the Cython implementation.
cassandra/cluster.py Switches peers_v2 disabling and metadata timeout gating to is_scylla.
cassandra/metadata.py Updates Scylla-vs-not logic to use features.is_scylla.
tests/unit/test_protocol_features.py Adds unit tests for Scylla detection and crash regressions.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +101 to +103
resolved_shard_id = int(shard_id) if shard_id is not None else 0
return resolved_shard_id, _ShardingInfo(shard_id, shards_count, partitioner, sharding_algorithm, sharding_ignore_msb,
shard_aware_port, shard_aware_port_ssl)
Comment on lines +91 to +95
# SCYLLA_PARTITIONER passes the sharding guard, so sharding_info is populated
# with zero defaults rather than crashing.
assert pf.sharding_info is not None
assert pf.sharding_info.shards_count == 0
assert pf.sharding_info.sharding_ignore_msb == 0
Comment on lines +106 to +108
assert pf.is_scylla is True
assert pf.sharding_info is not None
assert pf.sharding_info.shards_count == 0
@mykaul

mykaul commented Jun 26, 2026

Copy link
Copy Markdown
Author

@coderabbitai review

@mykaul mykaul marked this pull request as ready for review June 26, 2026 17:28
@coderabbitai

coderabbitai Bot commented Jun 26, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
tests/unit/test_protocol_features.py (1)

79-108: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Assert the normalized pf.shard_id in these regressions.

These cases exercise the missing-SCYLLA_SHARD path, but they never check the new top-level normalization to 0. Adding assert pf.shard_id == 0 would pin the actual fix and catch a regression back to None.

Suggested assertions
     def test_scylla_without_sharding_no_crash(self):
         ...
         assert pf.is_scylla is True
+        assert pf.shard_id == 0
         assert pf.sharding_info is not None
         assert pf.sharding_info.shards_count == 0
         assert pf.sharding_info.sharding_ignore_msb == 0

     def test_scylla_sharding_algorithm_only_no_crash(self):
         ...
         assert pf.is_scylla is True
+        assert pf.shard_id == 0
         assert pf.sharding_info is not None
         assert pf.sharding_info.shards_count == 0
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/test_protocol_features.py` around lines 79 - 108, The regression
tests in test_scylla_without_sharding_no_crash and
test_scylla_sharding_algorithm_only_no_crash only verify sharding_info, but they
do not assert the new normalized top-level shard_id value. Update both
ProtocolFeatures.parse_from_supported regression cases to also check that
pf.shard_id is 0, so the tests pin the missing-SCYLLA_SHARD normalization
behavior and catch any return to None.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/unit/test_protocol_features.py`:
- Around line 79-108: The regression tests in
test_scylla_without_sharding_no_crash and
test_scylla_sharding_algorithm_only_no_crash only verify sharding_info, but they
do not assert the new normalized top-level shard_id value. Update both
ProtocolFeatures.parse_from_supported regression cases to also check that
pf.shard_id is 0, so the tests pin the missing-SCYLLA_SHARD normalization
behavior and catch any return to None.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 6e195da2-f03f-4a94-a124-f7657f65a989

📥 Commits

Reviewing files that changed from the base of the PR and between 763af09 and bea45ac.

📒 Files selected for processing (6)
  • cassandra/c_shard_info.pyx
  • cassandra/cluster.py
  • cassandra/metadata.py
  • cassandra/protocol_features.py
  • cassandra/shard_info.py
  • tests/unit/test_protocol_features.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants