Breaking dependency cycles

Depending on build dependency there exist multiple ways to make that build dependency optional for bootstrapping.

  1. Introduce build profiles
  2. Move dependencies from Build-Depends to Build-Depends-Indep
  3. Choose different installation sets for not-strong dependencies
  4. Split the source package (so that the part that creates the cycle can be built separately) or join the source packages forming the cycle (so that the cycle can be broken in debian/rules)
  5. Cross compile the appropriate set of source packages

While (1) is probably the most obvious solution, it cannot be implemented until the Debian archive accepts source packages with the build profile syntax. Depending on the type of build dependency, (2) can already be implemented right now for build dependencies on documentation generating packages and the like. Point (3) requires a deeper insight into the overall dependency situation and should be left to an actual bootstrapper instead of a package maintainer. This is because of the required inside knowledge and the fact that strong dependencies might change rapidly over time. Point (4) is most likely undesirable in most situations. The last point (5) is again up to the bootstrapper but cross buildability is a welcome feature in any source package.

Self-Cycles

Self-cycles are dependency cycles of source packages with themselves. There exist three types of self cycles. For Debian to be bootstrappable, all self-cycles must be breakable. Self-cycles have a special status compared to cycles of larger sizes because they only leave one option to break them. This makes them a hard heuristic as they are not-optional and have to be worked on by package maintainers.

Type 1 Self-Cycles

These are the most obvious self-cycles and mostly consist of compilers which need themselves to be built. Cycles of this type can easily be found by checking whether a source package builds a binary package it also build depends on. On the other hand, the solution to these cycles is often hard because it requires to make the compiler able to bootstrap itself. Upstream often does not provide this functionality.

Type 2 Self-Cycles

These self-cycles are of the same size as the former ones but are not anymore as obvious. They consist of a source package A builld depending on a binary package B which strongly depends on another binary package C which in turn is built from source package A. Since only build dependencies can be modified, the only way to break this cycle is to not have the source package A build depend on binary package B anymore. Binary package B might be completely unrelated to source package A as it is only connected to C via a long dependency chain but nevertheless this is the only way to break these cycles.

Type 3 Self-Cycles

These are like type 2 self-cycles except that the dependency relationship between B and C is not strong anymore. This means that this cyclic dependency might be solvable using other means.

Degree based heuristics

Degree based heuristics are those which take no other input than the connectivity of single vertices in the dependency graph with their neighbors.

Amount of Missing Build Dependencies

This heuristic maps source packages to the amount of their build dependencies which are not available. This is not equal to the amount of build dependencies but less because some of the build dependencies might already be available.

Only Weak Build Dependencies Missing

This heuristic maps source packages to the amount of their build dependencies which have been classified as "weak". This classification is a soft one and mostly considers binary packages which are in most cases used for documentation generation. Often, these "weak" build dependencies can be dealt with by moving them into Build-Depends-Indep

Ratio Binary Heuristic

Find binary packages A that are only needed by few source package B but need many other source packages C to be built to satisfy the runtime dependencies of A. Maybe the source packages B that needs the binary packages A can be built without them?

Ratio Source Heuristic

Find source packages A that build-depend on many others but are only needed by few binary packages B which are in turn only needed buy a few source packages C. Maybe the source packages C that need those few binary packages B can be built without them and thus not require all the source packages A?

Cycle based heuristics

Cycle based heuristics are those which take no other input than a list of cycles in the dependency graph. Since the total amount of cycles goes in the millions, only cycles up to a small length are considered.

Amount of Cycles through Edges

This heuristics finds build dependencies of source packages on binary packages which are part of many dependency cycles in the graph. If such a build dependency can be broken, then this means that all those dependency cycles are immediately broken as well.

Cycles

This heuristic lists the found cycles. For small cycles it might be possible to immediately identify and easy way to break them.

Feedback Arc Set

A feedback arc set is a set of edges which, if removed from the graph, make the graph acyclic. The heuristic used produces a very small feedback arc set. If a build dependency is listed as being part of a calculated feedback arc set, then making it removable would greatly simplify the cyclic dependency situation.

Strong bridges and articulation points

Strong bridges and strong articulation points are the edges and vertices respectively that, if removed from the graph, split it into more strongly connected components than before.

Strong Articulation Points

If a source package is listed as a strong articulation point of the dependency graph, then its removal will break the graph into multiple strongly connected components. If the amount of resulting strongly connected components the dependency graph will be split into is high, then it might be worthwhile to make the source package available, for example by cross compilation.

Strong Bridges

If a build dependency is listed as a strong bridge of the dependency graph, then its removal will break the graph into multiple strongly connected components. If the amount of resulting strongly connected compoents the dependency graph will be split into is high, then it might be worthwhile to make this build dependency droppable.