Pipe: merge batched aligned chunks in scan parser by Caideyipi · Pull Request #18010 · apache/iotdb

Caideyipi · 2026-06-23T09:22:10Z

Description

This PR improves the pipe TsFile scan parser for legal aligned TsFiles whose value chunks are physically written in column batches, such as files produced by batched aligned compaction.

The current scan parser emits an aligned tablet when the value chunk occurrence index changes. For batched aligned compaction output, value chunks can be laid out as:

time chunk 0, time chunk 1
value columns 0-9 for chunk 0 and chunk 1
value columns 10-19 for chunk 0 and chunk 1
...

This layout is valid, but with the current parser behavior the emitted tablets inherit the physical compaction batch width (commonly 10 columns, from compaction_max_aligned_series_num_in_one_batch), even when pipe reader memory allows a wider aligned tablet. That increases the number of tablets and hurts pipe performance.
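As a rough illustration of the tablet count, the sketch below compares the two cases under idealized assumptions: every time chunk fits in a single tablet and no row-count or memory limit splits it further. The class and method names are hypothetical and do not come from the IoTDB code base.

```java
/** Idealized estimate only; real tablet counts also depend on row and memory limits. */
final class TabletCountEstimate {

  /** Tablets emitted when every tablet is limited to one physical column batch. */
  static long tabletsSplitAtBatchWidth(int valueColumns, int batchWidth, int timeChunks) {
    // Number of column batches, rounded up.
    int batches = (valueColumns + batchWidth - 1) / batchWidth;
    return (long) batches * timeChunks;
  }

  /** Tablets emitted when all column batches of a time chunk are merged (enough memory). */
  static long tabletsWhenMerged(int timeChunks) {
    return timeChunks;
  }

  public static void main(String[] args) {
    // The layout from the description: 20 value columns, batch width 10, 2 time chunks.
    System.out.println(tabletsSplitAtBatchWidth(20, 10, 2));
    System.out.println(tabletsWhenMerged(2));
  }
}
```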

This PR changes the scan parser to cache pending aligned value chunk groups by time chunk index and emit them only when memory limits or chunk group boundaries require it. With enough memory, consecutive physical value column batches for the same aligned chunks are merged into wider aligned tablets instead of being split at the compaction batch boundary.
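The caching idea can be sketched as follows. This is a minimal, simplified model and not the actual TsFileInsertionEventScanParser code: the class, its methods, the one-byte chunk sizes, and the flush policy (flush everything before a chunk that would exceed the limit) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical, simplified model of the caching idea; not the real scan parser. */
final class PendingAlignedGroupsSketch {
  private final long memoryLimitInBytes;
  // time chunk index -> columns whose value chunks are cached for that time chunk
  private final Map<Integer, List<String>> pending = new TreeMap<>();
  private long pendingBytes = 0;
  // each element is the column list of one emitted aligned tablet
  final List<List<String>> emittedTablets = new ArrayList<>();

  PendingAlignedGroupsSketch(long memoryLimitInBytes) {
    this.memoryLimitInBytes = memoryLimitInBytes;
  }

  /** Cache one value chunk; flush first if it would push the cache over the memory limit. */
  void onValueChunk(int timeChunkIndex, String column, long sizeInBytes) {
    if (pendingBytes > 0 && pendingBytes + sizeInBytes > memoryLimitInBytes) {
      flush();
    }
    pending.computeIfAbsent(timeChunkIndex, k -> new ArrayList<>()).add(column);
    pendingBytes += sizeInBytes;
  }

  /** A chunk group boundary always forces the cached groups out. */
  void onChunkGroupEnd() {
    flush();
  }

  private void flush() {
    // One tablet per time chunk, carrying every column cached for it so far.
    for (List<String> columns : pending.values()) {
      emittedTablets.add(new ArrayList<>(columns));
    }
    pending.clear();
    pendingBytes = 0;
  }

  /** Feeds two column batches (s0-s1, then s2-s3) for time chunks 0 and 1, one byte per chunk. */
  static PendingAlignedGroupsSketch runBatchedLayout(long memoryLimitInBytes) {
    PendingAlignedGroupsSketch parser = new PendingAlignedGroupsSketch(memoryLimitInBytes);
    String[][] batches = {{"s0", "s1"}, {"s2", "s3"}};
    for (String[] batch : batches) {
      for (String column : batch) {
        for (int timeChunk = 0; timeChunk < 2; timeChunk++) {
          parser.onValueChunk(timeChunk, column, 1);
        }
      }
    }
    parser.onChunkGroupEnd();
    return parser;
  }

  public static void main(String[] args) {
    System.out.println(runBatchedLayout(1000).emittedTablets); // ample memory
    System.out.println(runBatchedLayout(4).emittedTablets); // tight memory
  }
}
```

With the generous limit, both column batches are merged into one four-column tablet per time chunk. With the tight limit, the cache is flushed at the batch boundary and the tablets stay two columns wide, which mirrors the memory-boundary flushing described above.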

It also defines pipeDataStructureTabletRowSize <= 0 as disabling the row-count cap for pipe tablets. In that mode, tablet row count is calculated only from pipe_data_structure_tablet_size_in_bytes, so users can rely on the memory-size limit instead of the fixed row-count limit.
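The rule for the row-count cap can be sketched like this. The helper below is hypothetical: the real method in PipeMemoryWeightUtil, its signature, and how it estimates bytes per row are not shown in this description, so the parameters here are assumptions.

```java
/** Hypothetical helper illustrating the cap rule; not the real PipeMemoryWeightUtil API. */
final class TabletRowCountSketch {

  /**
   * @param tabletSizeInBytes memory budget for one tablet
   * @param estimatedBytesPerRow assumed per-row cost, supplied by the caller
   * @param tabletRowSize row-count cap; a value <= 0 disables the cap
   */
  static int calculateRowCount(long tabletSizeInBytes, long estimatedBytesPerRow, int tabletRowSize) {
    long bytesPerRow = Math.max(1L, estimatedBytesPerRow);
    // Always allow at least one row so that a tablet can make progress.
    long rowsByMemory = Math.max(1L, tabletSizeInBytes / bytesPerRow);
    long rows = tabletRowSize <= 0 ? rowsByMemory : Math.min(rowsByMemory, (long) tabletRowSize);
    return (int) Math.min(rows, (long) Integer.MAX_VALUE);
  }

  public static void main(String[] args) {
    System.out.println(calculateRowCount(1_000_000L, 100L, 2048)); // cap applies
    System.out.println(calculateRowCount(1_000_000L, 100L, 0)); // cap disabled
    System.out.println(calculateRowCount(1_000_000L, 100L, -1)); // cap disabled
  }
}
```

When the cap is positive, the smaller of the two limits wins. When it is zero or negative, only the byte budget bounds the row count.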

Changes

Cache aligned value chunks in pending groups keyed by time chunk index in TsFileInsertionEventScanParser.
Preserve chunk/page memory protection when merging multiple physical aligned value chunk groups.
Keep cached value chunk replay subject to the same memory threshold checks.
Treat non-positive pipeDataStructureTabletRowSize as no row-count cap in PipeMemoryWeightUtil.
Add tests for batched aligned value chunk layout merging, memory-boundary flushing, and disabling the tablet row-size cap with 0/negative values.

Validation

mvn spotless:apply -pl iotdb-core/datanode
git diff --check

I also tried:

mvn -Ddevelocity.off=true -pl iotdb-core/datanode -DskipTests compile
mvn -Ddevelocity.off=true -Dmaven.main.skip=true -pl iotdb-core/datanode -Dtest=TsFileInsertionEventParserTest#testScanParserMergesBatchedAlignedValueChunkGroups+testPipeTabletRowSizeCanBeDisabledByNonPositiveValue test
mvn -pl iotdb-core/datanode -Dtest=TsFileInsertionEventParserTest#testScanParserMergesBatchedAlignedValueChunkGroups+testScanParserFlushesBatchedAlignedValueChunkGroupsByMemoryLimit+testPipeTabletRowSizeCanBeDisabledByNonPositiveValue test

These Maven compile/test attempts are blocked in this workspace by existing datanode-wide compile issues outside this PR, including generated query fill/aggregation classes and IOUtils.readFully unresolved symbols in unrelated files. The focused tests did not get executed because compilation fails before Surefire runs.

sonarqubecloud · 2026-06-23T10:03:58Z

Quality Gate passed

Issues
9 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code


Caideyipi added 2 commits June 23, 2026 17:19

Pipe: merge batched aligned chunks in scan parser

66b19a5

Test pipe batched aligned chunk memory boundaries

f2bd2eb

Caideyipi wants to merge 2 commits into master from fix/pipe-merge-batched-aligned-chunks
