Add support for everything Firmware_extractor supported #146

Open
akhilnarang wants to merge 11 commits into sebaubuntu-python:master from deadman96385:master
@akhilnarang

No description provided.

deadman96385 and others added 11 commits April 2, 2026 13:22
Add a new Rust + PyO3 native extension (firmware-parsers/) that provides
Android sparse image conversion, replacing the external simg2img binary
dependency. The crate exposes sparse_to_raw, sparse_chunks_to_raw, and
is_sparse functions to Python via the firmware_parsers module.

All simg2img subprocess calls in multipartitions.py, raw_image.py, and
sparsed_images.py now use the native Rust implementation when available,
with graceful fallback to the simg2img binary when firmware_parsers is
not installed. The simg2img tool requirement is also conditionally
removed from REQUIRED_TOOLS in dumpyara.py.
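
The optional-dependency pattern described in this commit could look roughly like the sketch below. It assumes `firmware_parsers.sparse_to_raw` takes an input and an output path, which the commit message does not confirm; `looks_sparse`, `required_tools`, and the wrapper `sparse_to_raw` are hypothetical names, not dumpyara's own. The magic value is the standard Android sparse header magic (0xED26FF3A).

```python
import shutil
import subprocess
from pathlib import Path
from typing import List

try:
    # Optional native extension; its exact Python signatures are assumed here.
    import firmware_parsers
except ImportError:
    firmware_parsers = None

SPARSE_MAGIC = b"\x3a\xff\x26\xed"  # 0xED26FF3A stored little-endian


def looks_sparse(path: Path) -> bool:
    """Return True if the file starts with the Android sparse image magic."""
    with open(path, "rb") as f:
        return f.read(4) == SPARSE_MAGIC


def required_tools(base: List[str]) -> List[str]:
    """Add simg2img to the caller's tool list only when the native module is missing."""
    tools = list(base)
    if firmware_parsers is None:
        tools.append("simg2img")
    return tools


def sparse_to_raw(src: Path, dst: Path) -> None:
    """Convert with the native module when present, else shell out to simg2img."""
    if firmware_parsers is not None:
        # Assumed call shape: (input path, output path).
        firmware_parsers.sparse_to_raw(str(src), str(dst))
        return
    if shutil.which("simg2img") is None:
        raise RuntimeError("neither firmware_parsers nor simg2img is available")
    subprocess.run(["simg2img", str(src), str(dst)], check=True)
```

Doing the import check once at module load keeps every call site free of its own try/except.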

Add Rust extractors for NB0, PAC, and MTK signed image formats and expose them through the PyO3 module.

Update format detection to recognize PAC and NB0 files directly and inspect zip archives for MTK signed image payloads.

Implement Phases 3-6 of the firmware-parsers plan:
- Phase 3: OPPO ozip (AES-128-ECB decrypt, 35 keys, mode-1 direct and
  mode-2 zip-wrapped with inner .ozip entry decryption) and Sony SIN/FTF
  (v3/v4/v5 + legacy SSSS/BFBF + Sony sparse chunks 0xCAC1-0xCAC5)
- Phase 4: Amlogic USB burning tool and Rockchip RKFW/AFP containers
- Phase 5: QFIL rawprogram XML, ZTE update.zip, KDDI .bin extractors
- Phase 6: Wire firmware_parsers.detect() into extract_archive.py with
  graceful fallback to shutil.unpack_archive on failure
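
A minimal sketch of the Phase 6 wiring, under stated assumptions: `detect` stands in for `firmware_parsers.detect()` and is assumed to return a format name or `None`, and `extractors` is a hypothetical mapping from format name to an `(archive, out_dir)` callable. Only `shutil.unpack_archive` is taken from the commit message.

```python
import shutil
from pathlib import Path
from typing import Callable, Dict, Optional

# Hypothetical signature: a native extractor takes (archive, output_dir).
Extractor = Callable[[Path, Path], None]


def extract_archive(
    archive: Path,
    out_dir: Path,
    detect: Optional[Callable[[Path], Optional[str]]] = None,
    extractors: Optional[Dict[str, Extractor]] = None,
) -> str:
    """Try a format-specific extractor first, then fall back to shutil.

    Returns the name of the path taken, which is handy for logging.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    if detect is not None and extractors:
        try:
            fmt = detect(archive)
            if fmt in extractors:
                extractors[fmt](archive, out_dir)
                return fmt
        except Exception:
            # Detection or native extraction failed; use the generic path.
            pass
    shutil.unpack_archive(str(archive), str(out_dir))
    return "shutil"
```

A failed native extractor may leave partial output behind; this sketch does not clean it up before falling back.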

Also fix issues found in review of Phase 1-2 code:
- Remove "sparse" from detect() returns (no matching Python function)
- Strengthen NB0 probe with printable-ASCII filename validation
- Validate BFBF/MTK header magic before 0x4040 byte strip
- Sanitize output filenames in nb0/pac/amlogic/rockchip/qfil (path traversal)
- Fix SIN tar entry processing (per-entry instead of concatenation)
- Use static OnceLock<Regex> in ZTE/KDDI instead of per-call compilation
- Expand partition name whitelist for ZTE/KDDI bin detection
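
The path-traversal fix in the list above is implemented in Rust. The same idea in Python might look like the sketch below, which is an illustration rather than a port of the crate's code; `sanitize_name` and `safe_join` are hypothetical helper names.

```python
from pathlib import Path, PurePosixPath


def sanitize_name(raw: str) -> str:
    """Reduce an untrusted container entry name to a safe basename."""
    # Treat both separators as path boundaries, then keep the last component.
    name = PurePosixPath(raw.replace("\\", "/")).name
    if name in ("", ".", "..") or "\x00" in name:
        raise ValueError(f"unsafe entry name: {raw!r}")
    return name


def safe_join(out_dir: Path, raw: str) -> Path:
    """Join the sanitized name and verify the result stays inside out_dir."""
    base = out_dir.resolve()
    target = (base / sanitize_name(raw)).resolve()
    if base not in target.parents:
        raise ValueError(f"entry escapes output directory: {raw!r}")
    return target
```

Keeping only the basename flattens any directory structure stored in the container, which suits formats whose entries are single partition images.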

Format-specific fixes:
- OZIP: handle PK-wrapped mode-2 inputs, decrypt inner .ozip entries,
  strip .ozip extension so downstream recognizes decrypted payloads;
  preserve decrypted filename for non-zip payloads (e.g.
  system.new.dat.br.ozip -> system.new.dat.br, not .br.img);
  use temp work directory for zip extraction to avoid leaked temp files;
  propagate decryption failure instead of silently renaming ciphertext
- QFIL: group <program> entries by label to merge sparsechunks instead
  of overwriting; resolve filenames by basename for flattened layouts;
  sanitize XML labels against path traversal; honor file_sector_offset
  when rebuilding partitions from shared backing files; branch on
  all_same_file without requiring non-zero first offset
- Amlogic: detect tar.bz2 wrappers by content (BZh magic) instead of
  filename extension; verify Amlogic magic before accepting tar entries
  to avoid selecting non-Amlogic .img files that precede the payload
- SIN: stream-scan for gzip/tar offset instead of reading entire file
  into memory (avoids OOM on multi-GB files); strip .sin extension
  case-insensitively to match case-insensitive FTF detection
- detect.rs: scan all tar.bz2 members for Amlogic magic (not just
  the first entry, since packages may have readme/manifest first);
  tighten bzip2 check from 2-byte "BZ" to 3-byte "BZh"

Housekeeping:
- Remove unused Cargo dependencies (thiserror, memmap2, bytemuck, byteorder)

Rockchip RKFW header uses packed/unaligned fields:
- AFP container offset at 0x21 (not 0x1C)
- AFP container size at 0x25 (not 0x20)

AFP container structure corrected:
- AFP header is 0x8C bytes (not 0x4C)
- entry_count at offset 0x88 (not 0x44)
- Each entry is 0x70 / 112 bytes (not 0x48 / 72)
- Entry offset at 0x60, size at 0x6C (not 0x20/0x28)
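
The corrected offsets can be exercised with a short parser sketch. The offsets come from the commit message above. The field width (32-bit little-endian), the entry table starting immediately after the 0x8C-byte header, and all function names are assumptions made for illustration; magic validation is omitted.

```python
import struct
from typing import List, Tuple

RKFW_AFP_OFFSET_AT = 0x21   # unaligned field in the RKFW header
RKFW_AFP_SIZE_AT = 0x25
AFP_HEADER_SIZE = 0x8C
AFP_ENTRY_COUNT_AT = 0x88
AFP_ENTRY_SIZE = 0x70
ENTRY_OFFSET_AT = 0x60
ENTRY_SIZE_AT = 0x6C


def u32(buf: bytes, pos: int) -> int:
    """Read an unaligned little-endian u32 (field width is an assumption)."""
    return struct.unpack_from("<I", buf, pos)[0]


def afp_span(rkfw: bytes) -> Tuple[int, int]:
    """Return (offset, size) of the AFP container inside an RKFW image."""
    return u32(rkfw, RKFW_AFP_OFFSET_AT), u32(rkfw, RKFW_AFP_SIZE_AT)


def afp_entries(afp: bytes) -> List[Tuple[int, int]]:
    """Return (offset, size) for each entry in an AFP container."""
    count = u32(afp, AFP_ENTRY_COUNT_AT)
    if AFP_HEADER_SIZE + count * AFP_ENTRY_SIZE > len(afp):
        raise ValueError("entry table runs past the end of the container")
    entries = []
    for i in range(count):
        base = AFP_HEADER_SIZE + i * AFP_ENTRY_SIZE
        entries.append((u32(afp, base + ENTRY_OFFSET_AT),
                        u32(afp, base + ENTRY_SIZE_AT)))
    return entries
```

The `<` prefix in the `struct` format disables native alignment, which is what allows reads at odd offsets such as 0x21.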

* upstream/master:
  build(deps): bump gitpython from 3.1.46 to 3.1.47
  dumpyara: Fix checks action file license header
  dumpyara: v1.1.0
  dumpyara: README.md: Reformat
  dumpyara: Update Codacy badge
  dumpyara: Add checks action
  dumpyara: Move to SPDX license headers
  dumpyara: pyright
  dumpyara: ruff format
  dumpyara: ruff check
  dumpyara: Add a ruff config
  dumpyara: REUSE compliance
  dumpyara: Pull gitignore updates from GitHub repo
  dumpyara: Use the proper way to show the program description
  dumpyara: Drop module_path and current_path
  dumpyara: Skip copying non-regular ramdisk files
  dumpyara: Update sebaubuntu_libs to v2.0.0
  dumpyara: Drop unused Poetry section

Signed-off-by: Akhil Narang <me@akhilnarang.dev>

# Conflicts:
#	dumpyara/dumpyara.py
#	dumpyara/steps/extract_archive.py
#	dumpyara/utils/multipartitions.py
#	dumpyara/utils/raw_image.py
#	dumpyara/utils/sparsed_images.py

When a firmware ships both a real partition and its alias (e.g. modem.img
alongside NON-HLOS.img on recent Oppo/ColorOS builds), fix_aliases logged
"Ignoring <alias> (<name> already extracted)" and unlinked the alias, but
then fell through and still attempted move(alias, partition). Since the
alias was just deleted, this raised FileNotFoundError and aborted the dump.

Add the missing continue so the redundant alias is dropped and the real
partition is left untouched.
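
The control flow after the fix can be sketched as follows. This is a simplified stand-in, not dumpyara's actual `fix_aliases`: the `aliases` mapping (alias stem to partition stem) and the `.img` suffix handling are assumptions made for illustration.

```python
from pathlib import Path
from shutil import move
from typing import Dict


def fix_aliases(folder: Path, aliases: Dict[str, str]) -> None:
    """Rename alias images to their canonical partition names.

    `aliases` maps alias stem -> partition stem; the mapping used by
    dumpyara itself is not reproduced here.
    """
    for alias, name in aliases.items():
        alias_path = folder / f"{alias}.img"
        partition_path = folder / f"{name}.img"
        if not alias_path.is_file():
            continue
        if partition_path.is_file():
            # Real partition already extracted: drop the redundant alias...
            alias_path.unlink()
            # ...and skip the move, which would otherwise hit a missing file.
            continue
        move(str(alias_path), str(partition_path))
```

Without the second `continue`, the loop would fall through to `move` on a path that was just unlinked, which is the `FileNotFoundError` described above.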

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Some firmware (e.g. aftermarket car head units) wrap the real dumpable ROM
one zip deeper -- the outer archive holds vendor blobs and APK dirs, while
the actual block-based OTA (system.new.dat.br + *.transfer.list, payload.bin,
super*.img, *.tar.md5) lives inside an inner .zip such as update_car.zip.
dumpyara only scanned the outer archive's top level, found no recognized
partition container, and aborted with "System folder doesn't exist".

Extend nested-archive handling: for each top-level .zip not already covered
by NESTED_ARCHIVES, peek its central directory (zipfile namelist, no
extraction) and recurse only when it contains a partition marker. The marker
patterns are anchored to a path boundary and the *.new.dat.br / *.transfer.list
markers are restricted to known partition names, so APK/config zips (which
carry none of these) are never exploded.
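
A rough sketch of the central-directory peek described above. The marker file names are the ones listed in the commit message; the partition-name tuple is an illustrative subset, and the regex and function name are hypothetical rather than the ones in the PR.

```python
import re
import zipfile

# Illustrative subset; the actual list of partition names is not shown here.
PARTITIONS = ("system", "vendor", "product", "system_ext", "odm")
_PART = "|".join(map(re.escape, PARTITIONS))

# (?:^|/) anchors each marker to the start of a path component.
MARKERS = re.compile(
    r"(?:^|/)(?:"
    r"payload\.bin"
    r"|super[^/]*\.img"
    r"|[^/]+\.tar\.md5"
    rf"|(?:{_PART})\.new\.dat\.br"
    rf"|(?:{_PART})\.transfer\.list"
    r")$"
)


def contains_partition_marker(zip_source) -> bool:
    """Peek at the central directory only; nothing is extracted."""
    try:
        with zipfile.ZipFile(zip_source) as zf:
            return any(MARKERS.search(n) for n in zf.namelist())
    except zipfile.BadZipFile:
        return False
```

Because `namelist()` reads only the central directory, the check stays cheap even for multi-gigabyte inner archives, and a name such as `mysuper.img` does not match because the marker must begin at a path boundary.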

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@codacy-production

Not up to standards ⛔

🔴 Issues 1 medium · 1 minor

Alerts:
⚠ 2 issues (≤ 0 issues of at least minor severity)

Results:
2 new issues

| Category | Results |
| --- | --- |
| UnusedCode | 1 medium |
| Documentation | 1 minor |

🟢 Metrics 623 complexity · 48 duplication

| Metric | Results |
| --- | --- |
| Complexity | 623 |
| Duplication | 48 |

