Add cpio format to create an initramfs #48

Open
opened 2025-05-15 02:07:35 +00:00 by Wulf · 6 comments

Hi,
I'd love to see a cpio output format in mmdebstrap. It shouldn't go through tar first, but produce the cpio archive directly.
Use case is to build a system that I can easily kexec into.

My current workaround is to use the tar output, untar it again inside a fakeroot into a temporary location, and run find . -print0 | cpio -o -0 --format=newc. I pipe the output into zstd.
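
The workaround described above could be wrapped roughly as follows. This is a minimal sketch, not something mmdebstrap ships: the function name tar_to_newc and the file names in the usage comment are made up for illustration, and it assumes GNU cpio is installed and that the function is called under fakeroot (or as root) so that file ownership survives the round trip.

```shell
#!/bin/sh
# Sketch of the unpack-and-repack workaround.
# tar_to_newc is a hypothetical helper name, not part of mmdebstrap.
tar_to_newc() {
    tarball=$1
    tmpdir=$(mktemp -d) || return 1
    # Unpack the tarball into a scratch directory.
    tar -xf "$tarball" -C "$tmpdir" || { rm -rf "$tmpdir"; return 1; }
    # Re-archive the unpacked tree as a newc cpio stream on stdout.
    (cd "$tmpdir" && find . -print0 | cpio -o -0 --format=newc --quiet)
    status=$?
    rm -rf "$tmpdir"
    return $status
}

# Usage (assumed file names):
#   mmdebstrap unstable chroot.tar
#   fakeroot sh -c '. ./tar_to_newc.sh; tar_to_newc chroot.tar' | zstd > initramfs.cpio.zst
```

The temporary copy on disk is exactly the overhead that a direct cpio output format would avoid.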

Owner

I think direct cpio output would be a nice feature!

Do you happen to know about a cpio archiver which is able to accept a tarball on standard input and then produce the cpio on standard output without having to unpack everything first?

In the worst case, I can also write that myself. cpio is not a very complicated format.

Author

> Do you happen to know about a cpio archiver which is able to accept a tarball on standard input and then produce the cpio on standard output without having to unpack everything first?

I looked for one and couldn't find any.
Would it be so much more effort to not use tar if cpio output is used?

Owner

Usually the bsdtar utility from the libarchive-tools package contains surprising amounts of functionality. Can you test if this works for your use-case:

mmdebstrap unstable - | bsdtar -cf - --format cpio @- > ./unstable-chroot.cpio
Author

Nice, didn't know this tool existed.

The format needs to be newc to work. bsdtar's cpio is some other format the kernel doesn't like.
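
With that correction, the conversion step might look like the sketch below. It assumes that bsdtar from libarchive-tools accepts newc as a --format value, as reported in this comment; the function name tar_stream_to_newc and the output file name are illustrative only.

```shell
#!/bin/sh
# Convert a tar stream on stdin into a newc cpio stream on stdout,
# without unpacking anything to disk.
# tar_stream_to_newc is a hypothetical helper name.
tar_stream_to_newc() {
    # "@-" tells bsdtar to read an existing archive from stdin and
    # copy its entries into the archive being created.
    bsdtar -cf - --format newc @-
}

# Usage (assumed output name):
#   mmdebstrap unstable - | tar_stream_to_newc > ./unstable-chroot.cpio
```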

There's a small caveat though: hard-linked files are missing after extraction.
When I extract the archive manually with cpio -i I get:

cpio: ./usr/bin/perl5.36.0: unknown file type
cpio: ./usr/bin/perlthanks: unknown file type

I don't really care about those two files. But it might break in other cases.

The bsdtar manpage states:

> Converting between dissimilar archive formats (such as tar and cpio) using the @- convention can cause hard link
> information to be lost. (This is a consequence of the incompatible ways that different archive formats store
> hardlink information.)

I could add a hook to replace any hard links. Alternatively tar -c --hard-dereference would work.
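
To illustrate the second option, here is a minimal sketch that packs a directory with GNU tar's --hard-dereference, so every hard-linked name is stored as an ordinary file before bsdtar sees it. The function name pack_without_hardlinks and the paths in the usage comment are assumptions for illustration; how mmdebstrap would wire this into its own tar invocation is not shown here.

```shell
#!/bin/sh
# Pack a directory tree as a tar stream on stdout, storing each
# hard-linked name as an independent regular file (GNU tar only).
# pack_without_hardlinks is a hypothetical helper name.
pack_without_hardlinks() {
    dir=$1
    tar -c --hard-dereference -C "$dir" -f - .
}

# Usage (assumed paths):
#   pack_without_hardlinks /path/to/chroot \
#       | bsdtar -cf - --format newc @- > chroot.cpio
```

The trade-off is size: the file contents are written once per link name, so trees with many hard links produce a larger archive.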

Owner

Right, throwing in a --hard-dereference on the tar side of things is easy. Would you be fine for mmdebstrap to require bsdtar for this functionality?

I also wonder how the format should be called given the format differences between cpio and newc. Should it just be --format=newc and should the auto-detect filename extension be *.newc?

Author

I changed my build script to pipe the output of mmdebstrap through bsdtar, so I'm not sure it would be a big benefit if mmdebstrap did the piping for me. It might save a few bytes in my script, and other users who need cpio output might not know about bsdtar either, so perhaps there is some benefit to this approach.

My original idea was to somehow use the cpio command instead of tar. Either implemented directly, or by allowing a hook which is executed inside the namespace/fakeroot/etc.

> Would you be fine for mmdebstrap to require bsdtar for this functionality?

I wouldn't mind, I already use it anyway. But imo it should be an optional dependency.

> I also wonder how the format should be called given the format differences between cpio and newc. Should it just be --format=newc and should the auto-detect filename extension be *.newc?

*.newc looks awkward to me. I don't think anyone would need a flavour of cpio other than SVR4, with (cpio -H crc) or without (cpio -H newc) checksum. The Linux kernel supports only those two. rpm seems to use some derivative of the crc format, and pax seems to support multiple formats. With some luck nobody else uses cpio.

bsdtar doesn't appear to support writing the crc format, which leaves us with newc as the only useful thing for bsdtar to output when cpio is chosen.
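
To check which flavour a given archive actually is, the first six bytes are enough. The sketch below relies on the ASCII magic numbers of the cpio header formats (070701 for newc, 070702 for crc, 070707 for the older POSIX odc layout, which as far as I know is what bsdtar's plain cpio format writes). The function name cpio_flavour is made up for illustration, and the input must be uncompressed.

```shell
#!/bin/sh
# Report the cpio flavour of an uncompressed archive by its ASCII magic.
# cpio_flavour is a hypothetical helper name.
cpio_flavour() {
    magic=$(head -c 6 "$1")
    case $magic in
        070701) echo newc ;;     # SVR4 without checksum
        070702) echo crc ;;      # SVR4 with checksum
        070707) echo odc ;;      # old POSIX portable ASCII format
        *)      echo unknown ;;  # compressed, binary cpio, or not cpio
    esac
}

# Usage (assumed file name):
#   cpio_flavour ./unstable-chroot.cpio
```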

Reference: josch/mmdebstrap#48