WIP: Support file:/ mirrors #25
Hi,
mmdebstrap currently doesn't support file:/ mirrors, as it wants to have all deb files in /var/cache/apt/archives, which amounts to a lot of guesswork. This PR removes the guesswork by "just" asking apt to tell us which deb files it wants to install instead. Easy, right?
I think for the long term we can talk about adding a --print-archive-filenames flag (name subject to change) in apt, but for this WIP I opted for no-apt-change, which seems to work for the most part. Beware, though, that I have neither tested it extensively nor am I a Perl coder: I had to look up what the deal is with $var vs %var vs @var, that is my Perl experience level (aka: n00b).
Thankfully, the current WIP mostly consists of code removal (I just love diffs adding features/fixing bugs while reducing line count) and apt sorcery; the tricky and hard part of making it maintainable & work for everyone I am gladly leaving up to someone else if need be.
Known bugs so far
Requires /tmp in chroot?
Test (56/183) mode=unshare,variant=custom: missing /dev, /sys, /proc inside the chroot fails with
E: failed to open /tmp/mmdebstrap.9TctWt82qW/tmp/mmdebstrap.listofdebs.alMoIAtC1c0U for reading: No such file or directory
I suppose this chroot also lacks /tmp. If so, this was already broken before if you were using a non-empty archives directory; it is just that this situation isn't tested much.
We don't actually need apt to generate a file, I just lack the Perl-fu to open a file descriptor and work with that accordingly. An apt proof-of-concept looks like this:
apt install awesome -o Debug::pkgDpkgPm=1 -o Dir::Log=/dev/null -oAPT::Keep-Fds::=5 -oDPkg::Tools::options::'cat >&5'::InfoFD=5 -oDpkg::Pre-Install-Pkgs::='cat >&5' -o Debug::NoLocking=1 -o Dpkg::Use-Pty=0 -y 5>/tmp/foo.lst; cat /tmp/foo.lst
I fully expect more test failures; I just haven't investigated further so far, given that it takes literal ages to run the tests and skipping and such has to be implemented by hand…
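The file-less variant hinted at above can be tried out without apt first. The following is only a sketch of the shell plumbing, not what mmdebstrap does: capture_fd5 and LISTED_DEBS are made-up names, the deb path is an example, and the sh -c command merely stands in for an apt call whose hooks write to fd 5 as in the proof-of-concept.

```shell
# Hypothetical helper: run "$@" with fd 5 connected to a command
# substitution and store whatever the command writes to fd 5 in
# LISTED_DEBS. The command's ordinary stdout is passed through unchanged.
capture_fd5() {
    # fd 3 temporarily remembers the caller's stdout; inside $(...) fd 1 is
    # the capture pipe, so fd 5 is pointed at it and stdout is restored.
    { LISTED_DEBS=$("$@" 5>&1 1>&3 3>&-); } 3>&1
}

# Stand-in for the apt invocation (assumption): prints a status line on
# stdout and an example package path on fd 5.
capture_fd5 sh -c 'echo unpacking; echo /tmp/mirror/pool/foo.deb >&5'
printf 'debs reported on fd 5: %s\n' "$LISTED_DEBS"
```

With the real apt call in place of the stand-in, the `5>/tmp/foo.lst` redirection and the temporary file would no longer be needed, which also sidesteps the missing /tmp inside the chroot.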
Reproducibility of /var/cache/apt/archives
The invocation I played with is reproducible just fine except for one minor problem: if no deb file is downloaded/copied to /var/cache/apt/archives, the directory has a different timestamp than if something was (temporarily) there.
For context: WTH am I doing?
My test invocation is:
mmdebstrap --variant=apt --architectures=amd64 --skip=essential/unlink --setup-hook='mkdir "$1/tmp/mirror"; mount -o ro,bind /tmp/mirror "$1/tmp/mirror";' --customize-hook='sync-out /var/cache/apt/archives ./cache' --customize-hook='umount "$1/tmp/mirror"; rmdir "$1/tmp/mirror";' unstable - 'mirror+file:///tmp/mirror/list' | mmtarfilter --path-exclude='/dev/*' > debian-amd64-unstable.tar
(Note that ./cache needs to exist before the call for me. A --setup-hook as advised in the archives-preserving example creates it with the wrong uid for me.)
The mirrorlist file /tmp/mirror/list contains:
The general idea here being that every file it can get via file:/ is accessed this way, while files not already cached are acquired from the web, synced out of the chroot and, in a step not shown here, merged into the local mirror structure.
So, yes, my distant end goal is basically reinventing caching proxies, except that this "proxy" has no server component I have to set up and is agnostic to the protocol used to access one (or more) upstream mirrors. apt-cacher-ng is already at its limits if the mirror is https; good luck wanting to access mirrors only via tor or downloading from multiple mirrors in parallel.
Not sure if that will become truly useful in the end, as so far it's just me stretching various tools to their limits for fun (& to find regressions I caused myself…). That is to say: this has at best wishlist priority for me; it isn't blocking any serious work or anything like that.
Very nice, thank you! I'm going to mangle your commits a little bit because I see that the EIPP support that one commit introduces is removed in another.
Do you think it's possible to somehow autodetect which directories to bind-mount?
If I have a sources.list entry like deb file:///tmp/mirror ..., is there a way to extract the /tmp/mirror bit without me writing my own sources.list parser? I cannot use apt-get indextargets output, because apt inside the chroot will not be able to access /tmp/mirror when running apt-get update.
Thanks!
In that sense, I also don't understand how this can work:
For me, this test fails with:
Which makes sense, because the mmdebstrap call doesn't set up a bind-mount. So let's set one up by adding the following:
But then it fails with:
Since mmdebstrap assumes that it had downloaded the deb, it will clean it up again so that it doesn't end up inside the resulting tarball. But with a file:// mirror and bind mounts we now have to figure out whether we downloaded the deb or whether it was there before. How can we do that?
d3952de003 to 6baf4151f9
EDSP->EIPP: Yeah, I replaced it later on… At first I was thinking I would go with an external planner and/or a 'dpkg' which does the extract rather than getting the filenames out, but that turned messy rather fast, so I dropped that direction. I left it in nonetheless, as I thought that this might have value even if the rest of the MR turns into a dumpster fire.
tests: I can't say much about it as I don't quite follow how that coverage script works and I got a bit annoyed by having one of the very last tests fail on me on a clean checkout after half a day with no easy way of just trying that test again. My machine isn't powerful enough to enjoy these tests properly & my "usecase" works, so who cares about the rest… 😜 So as said in the MR, I haven't really run the tests much, so changes there are mostly done blind.
hooks: I have a setup-hook which basically does that, yes. I also have --skip=essential/unlink in my calls as I want to keep $1/var/cache/apt/archives so I can sync the debs out of the chroot and merge them into my mirror later on… As that isn't very documented, btw: I managed to figure out that the call you have to do in a hook script is
mmdebstrap --hook-helper "$MMDEBSTRAP_ROOT" "$MMDEBSTRAP_MODE" "$MMDEBSTRAP_HOOK" 'env' "$MMDEBSTRAP_VERBOSITY" 'sync-out' '/var/cache/apt/archives' "${CACHE}/archives" >&${MMDEBSTRAP_HOOKSOCK} <&${MMDEBSTRAP_HOOKSOCK}
(I just added the commit setting $MMDEBSTRAP_VERBOSITY… I think that makes sense so that a hook script can be as verbose (or not) as the mmdebstrap call is supposed to be.) I suppose the unlink should just check if the deb it wants to delete is in /var/cache/apt/archives and ignore it otherwise, as it ignores cached debs already.
autobind: Mhh. apt-get indextargets --no-release-info doesn't need access to any downloaded files. It means it will potentially talk about files which do not exist in reality, but that should be fine as we are mostly interested in the Repo-URI field, I assume.
The EIPP stuff was a very interesting read but unfortunately I don't know whether I'll need it anywhere else now. :/
Yeah, the testsuite has to be rewritten but since I was (so far) the only one running it, there wasn't really much motivation to do so. A related problem is, that I don't want to write a testsuite runner from scratch and am looking for a framework that does what I want but wasn't successful in finding one so far.
There are a bunch of undocumented options (but as an apt developer you are probably used to that) which are undocumented because I do not consider their interface stable yet. An MMDEBSTRAP_VERBOSITY variable sounds like a good idea, thanks! (I wish we had a setting to run maintainer scripts with -x without me having to rebuild src:glibc just because I want to put set -x in its preinst...)
Ah, an undocumented --no-release-info argument setting an undocumented option APT::Get::IndexTargets::ReleaseInfo. We were just talking about those undocumented bits. ;) See, this is why I'm happy to have an apt developer here who can teach me all the magic incantations. Indeed all the undocumented --hook-helper magic of mmdebstrap is just because I want to be just as cool as apt! :D
That sounds good. I think I'll write a hook script that does the right indextargets magic and can be used by those who want to use file:// and want to automount their stuff. Thanks!
I know you are very serious with the later remarks, but: --no-release-info is actually documented in doc/acquire-additional-files.md. That the underlying option isn't documented is somewhat normal, as it's just there to tell the code that the flag was used, and there is no reason to use that option instead of the documented flag.
As I tried to mention, the EDSP->EIPP code I dropped could theoretically be used to fold the extract step into the download step. Whether that is a useful direction though… Also, EIPP is faster than EDSP (as less data is pushed around), so if the entire MR had been trashed (as mentioned on IRC, I half-expected that), at least that part could have been salvaged.
automount: Not sure I would try to be too clever here. file:/ support is all nice, but as this MR mentions I am actually using mirror+file:/, which points to a single file somewhere and from within potentially to other methods; I gave an example above. Trying to support that might be a bit overkill…
Yeah, I had read that part but wasn't sure how you meant this could be done. Could you roughly outline the idea behind it being possible to download and extract at the same time?
What would a mirror+file:// entry look like in apt-get indextargets output? Currently, I have this setup-hook script:
And this customize-hook script:
I think I'm done reworking your commits. I fixed some of the Perl problems (arrays are hard) and am using a file descriptor instead of a file in /tmp (super easy to set up all those read/write handles and fork an extra process to read from it...). With those changes, the tests pass with nearly no changes, so I don't think that this should introduce any regressions but just enable new functionality. If you have any comments, then I'm eager to hear them. Otherwise, this is what I'm going to commit to close this pull request:
The Repo-URI for a (.*\+)mirror+file is e.g. tor+mirror+file:/path/to/file/mirror.file/ – which is the filename of the file with a / at the end, as that is how the URI will look in general, like tor+mirror+file:/path/to/file/mirror.file/dists/unstable/InRelease.
Regarding merging download and unpack: You could e.g. extract in a dpkg::pre-install-pkgs hook and then have apt somewhat normally run dpkg over it. Probably harder than I make it sound though.
The patch seems mostly fine (except that you are adding yet another quote…) with a few comments:
main – some fuzzy, some offset, but it can be applied automatically.
proxysolver as it's no longer in use
Apart from that, the diff shows that I only made a couple of Perl errors… gonna write "Perl Expert" in my CV now. 😉 My "setup" also still works with this patch, so as said, and in summary, fine by me 👍
Okay, thanks. Currently, the file-mirror-automount hook scripts only support file:// mirrors but I think that's already a useful start.
I'm currently in the process of rewriting the test suite such that it becomes trivial to skip tests or only run specific tests, in an attempt to motivate future contributions. Looking forward to your patch! ;)
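For illustration, the core of such an automount hook could be reduced to pulling the local directories out of Repo-URI fields. The sketch below is an assumption about how one might do it, not the actual file-mirror-automount script: file_mirror_dirs is a hypothetical name, and it expects stanza output like that of apt-get indextargets --no-release-info on stdin (only the Repo-URI lines are shown in the sample input, all other fields are left out).

```shell
# Hypothetical helper: print the local directory of every plain file:/
# repository found in "Repo-URI:" lines on stdin, without a trailing
# slash and without duplicates. Other methods (http, mirror+file, ...)
# do not match the pattern and are skipped.
file_mirror_dirs() {
    sed -n 's|^Repo-URI: file:/*\(.*\)$|/\1|p' \
        | sed -e 's|/*$||' -e '/^$/d' \
        | sort -u
}

# Sample input (assumed); prints /tmp/mirror
file_mirror_dirs <<'EOF'
Repo-URI: file:///tmp/mirror/

Repo-URI: http://deb.debian.org/debian/

Repo-URI: tor+mirror+file:/path/to/file/mirror.file/
EOF
```

In a setup hook, each printed directory would then be bind-mounted read-only below "$1", as in the test invocation at the top of this thread. A mirror+file entry is deliberately ignored here, since its Repo-URI names the mirror list file rather than a directory.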
Hey, I need those so that future-me can still figure out what the apt magic is all about!
Yes, I have a number of uncommitted changes locally that I will push after I am done with rewriting the testsuite.
Originally, I wrote the proxysolver for a completely different use-case outside of mmdebstrap. Even if mmdebstrap isn't using it anymore, I need a package that ships it until I port everything else away from it.
Thank you! Fixed locally.
The big thing in Perl that takes a lot of getting used to (at least it did for me) is that the type of a variable is decided by the context the variable is used in. You only confused Perl arrays with Perl lists, resulting in Odd number of elements in anonymous hash even though you didn't attempt to create a hash anywhere. ;)
Awesome! I really like your patch because (as you also already pointed out) it removes so much code without breaking functionality. Let me know if you ever implement apt-get bootstrap so that I can remove another few thousand lines. ;)
Fixed in cc3150ef04
Pull request closed