gsoc26: Format Conversion Layer (Layer 2) + CLI refactor (#59, #61) #62
Integer-Ctrl left a comment
One last note: Right now, the --compression option can only be used to create compressed files. It would be cool if there were options like --compression none or --compression decompress. But if you'd rather work on the rest of the GSOC project first, that's fine too. In that case, I'll just create an issue to add decompression later on. Let me know if you’d like to do this now, later, or maybe not at all. I'm fine with your preference 👍
Overall, well done. I did discover a few edge cases, though, that need to be resolved. Once those are fixed, feel free to merge the pull request yourself if you feel confident. I trust you on that 😄. Otherwise, let me know, and I'll test it again and merge it if everything works.
Thanks for the review! I'd prefer to fix all 4 comments before merging (I'll push as soon as they're fixed). Since this is the first milestone, I think you should go ahead and merge once you've rechecked it; I can do the merging myself for later ones. 😄
Pull Request
Description
Implements Layer 2 (Format Conversion) for the Databus Python Client download pipeline, bringing it to feature parity with the Java client as described in Frey et al. Users can now convert between RDF serialization formats and tabular formats on-the-fly during download using the new --format flag. Also refactors the compression CLI (Issue #61) by replacing --convert-to / --convert-from with a single --compression flag.
Related Issues
Issue #59 (Format and Mapping Conversion Layer — Layer 2)
Issue #61 (Refactor CLI compression)
Type of change
Checklist:
poetry run pytest - all tests passed
poetry run ruff check - no linting errors
What was added:
databusclient/filehandling/format.py — Layer 2 with TripleHandler, QuadHandler, TSDHandler classes using rdflib.Graph, rdflib.Dataset, and list[list[str]] as intermediate representations respectively. Each handler exposes read(), write(), and convert().
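To illustrate the read(), write(), convert() pattern described above, here is a minimal sketch of a tabular handler that uses list[list[str]] as its intermediate representation. This is not the code from databusclient/filehandling/format.py: the class name TSDHandlerSketch, the method signatures, and the DELIMITERS table are assumptions made for illustration, and the real handler may support other formats and options.

```python
import csv
import io

# Hypothetical delimiter table; the formats supported by the real handler may differ.
DELIMITERS = {"csv": ",", "tsv": "\t"}


class TSDHandlerSketch:
    """Illustrative only; signatures are assumed, not taken from the PR."""

    def read(self, text: str, fmt: str) -> list[list[str]]:
        # Parse delimited text into the intermediate representation.
        reader = csv.reader(io.StringIO(text), delimiter=DELIMITERS[fmt])
        return [row for row in reader]

    def write(self, rows: list[list[str]], fmt: str) -> str:
        # Serialize the intermediate representation into the target format.
        buffer = io.StringIO()
        writer = csv.writer(buffer, delimiter=DELIMITERS[fmt], lineterminator="\n")
        writer.writerows(rows)
        return buffer.getvalue()

    def convert(self, text: str, source_fmt: str, target_fmt: str) -> str:
        # Conversion is a read into rows followed by a write out of them.
        return self.write(self.read(text, source_fmt), target_fmt)


handler = TSDHandlerSketch()
# The quoted CSV field contains a comma, so it must survive as one TSV cell.
tsv_text = handler.convert('name,city\n"Doe, Jane",Leipzig\n', "csv", "tsv")
```

Routing every conversion through one intermediate representation means each format needs only a reader and a writer, rather than a dedicated converter for every pair of formats.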
What was changed:
Tests:
Closes #59
Closes #61
Summary by CodeRabbit
New Features
--compression and --format options
Documentation
Tests