# Package Manager

## 1 What

Today's coding exercise works through a toy example of writing a "Package Manager" such as apt or yum. A "package" is an installable package. A "package manager" is a program that offers functionality for defining and operating on packages. The typical operations include installing packages, uninstalling packages, and listing the installed packages. Moreover, a package manager usually allows one to establish dependencies among packages, and will automatically handle installing or uninstalling packages as needed in order to keep the system in a consistent state.

Here's an example of a script containing the kinds of commands we have in mind.

DEPEND FTP NETSTACK NETLIB
DEPEND NETSTACK NETLIB
DEPEND DNS NETSTACK NETLIB
DEPEND BROWSER NETSTACK HTML
INSTALL NETLIB
INSTALL FTP
INSTALL foo
REMOVE NETLIB
INSTALL BROWSER
INSTALL DNS
LIST
REMOVE FTP
REMOVE NETLIB
REMOVE DNS
REMOVE NETLIB
INSTALL NETLIB
REMOVE NETSTACK
REMOVE BROWSER
REMOVE NETSTACK
LIST
END


As we can see, we want a system that supports these commands.

DEPEND item1 item2
both defines packages item1 and item2 and establishes a dependency between them
INSTALL item1
installs item1 and possibly any packages it depends on
REMOVE item1
removes item1 and possibly those it depends on
LIST
lists installed packages

Note that a couple of "rules" go along with these commands.

1. Package installation can be either direct or indirect. Direct installation occurs when a package is named in an INSTALL command. Indirect installation occurs when a package is a dependency of another package, which is itself being installed. We distinguish direct and indirect installation/removal for reasons that will become clear.
2. Package installation is recursive. When a package is installed, the package manager will ensure that the packages it depends on also are installed, and will install them if necessary. In so doing, it will check the dependencies of those packages and install them if necessary, as well.
3. Package installation should be efficient. If a package is already installed, then if a direct installation is requested via an INSTALL command, then nothing should happen. Likewise, if an indirect installation of an already-installed package would occur by virtue of dependency relationships, then nothing should happen.
4. Package removal should also be efficient. If a package is not installed, then if a direct removal is requested via a REMOVE command, then nothing should happen. Note that the indirect removal of packages that no other packages depend on need not be considered if the package manager maintains consistency.
5. Package management should be consistent. All of the dependencies of an installed package should themselves be installed. Otherwise, errors will occur in the use of those packages.
6. Consequently, package removal should be safe. A package should not be uninstalled if other installed packages depend on it.
7. Package management should be optimal. If a package is the target of a REMOVE command, then of course it should be uninstalled, if possible (i.e., if no other packages depend on it). Moreover, all packages that it depends on should also be considered for indirect removal (if possible).
8. Package removal should be deferential. If a package would be considered for indirect removal because packages that depend on it are being removed, then it should be removed only if the user (via input commands) has not requested a direct installation of that package, via an INSTALL command. If the package was installed directly, then all bets are off when it comes to removal.

## 2 Why

This is a fun little exercise that brings together a few important concepts.

## 3 How

The strategy is as follows. We define a Registry, which is a sort of "database". As a database, it records the state of the system: what packages have been defined, how they are related, and which ones have been installed (either directly or indirectly). It also has a number of operations defined on it, which correspond to the commands in our little scripting language defined above. Consequently, we also define a Command, which generalizes the concept of these operations. Specific instances of a Command house the code that implements the corresponding operation. We also need something to install, so we define a Package that encapsulates the package's name, the packages that it depends on, and the packages that depend upon it. Finally, we need something to orchestrate these pieces and to bootstrap the system, so we define a Client so that we have an actual program to run. Let's work through these elements one by one. Note that we use Java for this exercise. There's no particular reason for this choice, though being an Object-Oriented language (like Python, Ruby, and many others) it maps decently-well to our strategy. Of course, there are other strategies, as well. In any case, Java might not be the easiest choice for whipping up an ad-hoc quick-and-dirty proof-of-concept exercise like this, but we don't have to insist on always making things easy for ourselves.

There are a few other things to mention about the code that follows. First, every class occupies a whole, single code block. Unfortunately, this article's publishing system, Emacs Org Mode, doesn't have great support for Java and among its constraints is that every example code block must be a public class. Not only that, but every block must be executable with the Java interpreter, which is why second, every class contains a static main method, even if it doesn't do anything. Now, onto the code.

The Package class represents an installable package.

 1: import java.util.*;
2:
3: public class Package {
4:     public static void main (String[] args) {}
5:
6:     String name = "";
7:     ArrayList<Package> children = new ArrayList<>();
8:     ArrayList<Package> parents = new ArrayList<>();
9:     boolean installed = false;
10:     boolean root = false;
11:
12:     Package (String name) {
13:         this.name = name;}
14:
15:     Package (String name, List<Package> deps) {
16:         this(name);
18:
19:     void depend (Package c) {
22:
23:     void install (boolean root) {
24:         if (!installed) {
25:             for (Package c : children) c.install(false);
26:             installed = true;
27:             System.out.println(String.format("    Installing %s", name));
28:             this.root = root;}
29:         else
30:             if (root) System.out.println(String.format("    %s is already installed.", name));}
31:
32:     void remove (boolean root) {
33:         if (installed) {
34:             if (!hasParents()) {
35:                 installed = false;
36:                 System.out.println(String.format("    Removing %s", this));
37:                 for (Package child : children)
38:                     if (!child.isRoot()) {
39:                         child.parents.remove(this);
40:                         child.remove(false);}}
41:             else if (root) System.out.println(String.format("    %s is still needed.", name));}
42:         else System.out.println(String.format("    %s is not installed.", name));}
43:
44:     boolean hasChildren () {
45:         return children.isEmpty();}
46:
47:     boolean hasParents () {
48:         return !parents.isEmpty();}
49:
50:     boolean isRoot () {
51:         return this.root;}
52:
53:     public boolean equals (Object o) {
54:         if (!(o instanceof Package)) return false;
55:         if (o==this) return true;
56:         return name.equalsIgnoreCase(((Package)o).name);}
57:
58:     public int hashCode () {
59:         return name.hashCode();}
60:
61:     public String toString () {
62:         return name;}}


The Command class encapsulates commands. It's abstract so that it can at the same time define an Application Programmer Interface (API), leave parts of its implementation unspecified, such as the run method, and yet specify other parts, such as the constructor, along with specifying the class's internal structure.

1: public abstract class Command {
2:     public static void main (String[] args) {}
3:
4:     Registry reg = null;
5:
6:     Command (Registry reg) {
7:         this.reg = reg;}
8:
9:     abstract public void run (String[] args);}


The Registry stores information about what packages exist, implicitly what their dependencies are, implicitly whether or not they're installed, and also a list of applicable commands. By "implicitly" I mean the packages themselves know whether they are or are not installed, whether they are a "root" (i.e., the target of a INSTALL command), what they depend on, and who depends on them. This concept of a "root" is important, as this is what guarantees that the system will be deferential as described above. Root packages—those whose installation was specifically requested by the user—should not be uninstalled indirectly by means of dependency-tracking (which is what makes the system optimal as described above). Root packages should only be uninstalled if their removal is specifically requested by the user. Of course, packages root or otherwise should also only be installed if no other packages that will remain installed depend upon them, something that guarantees safety as described above.

 1: import java.util.*;
2:
3: public class Registry {
4:     public static void main (String[] args) {}
5:
6:     Map<String, Package> db = new HashMap<>();
7:     Map<String, Command> cmds = new HashMap<>();
8:     Registry () {
9:         cmds.put("DEPEND", new Command(this) {
10:                 public void run (String[] args) {
11:                     Package root = null;
12:                     Package dep = null;
13:                     for (String name : args)
14:                         if (root==null) {
15:                             root = db.containsKey(name) ? db.get(name) : new Package(name);
16:                             db.put(name, root);}
17:                         else {
18:                             dep = db.containsKey(name) ? db.get(name) : new Package(name);
19:                             db.put(name, dep);
20:                             root.depend(dep);}}});
21:         cmds.put("INSTALL", new Command(this) {
22:                 public void run (String[] args) {
23:                     String name = args[0];
24:                     if (!db.containsKey(name)) {
25:                         Package root = new Package(name);
26:                         db.put(name, root);}
27:                     db.get(name).install(true);}});
28:         cmds.put("REMOVE", new Command(this) {
29:                 public void run (String[] args) {
30:                     String name = args[0];
31:                     db.get(name).remove(true);}});
32:         cmds.put("LIST", new Command(this) {
33:                 public void run (String[] args) {
34:                     for (Package c : db.values())
35:                         if (c.installed) System.out.println(String.format("    %s", c));}});
36:         cmds.put("END", new Command(this) {
37:                 public void run (String[] args) {
38:                     System.exit(0);}});}
39:
40:     Command getCommand (String name) {
41:         return cmds.get(name);}
42:
43:     boolean isInstalled (Package c) {
44:         return db.containsKey(c.name) && db.get(c.name).installed;}
45:
46:     boolean isInstalled (String name) {
47:         return db.containsKey(name) && db.get(name).installed;}}


Finally, we need a Client to run everything.

 1: import java.io.FileNotFoundException;
2: import java.io.File;
3: import java.util.*;
4:
5: public class Client {
6:     public static void main (String[] args) throws FileNotFoundException {
7:         Scanner in = new Scanner(new File("test.txt"));
8:         Registry reg = new Registry();
9:         while (in.hasNextLine()) {
10:             String line = in.nextLine();
11:             System.out.println(line);
12:             List<String> tokens = new ArrayList<>();
13:             StringTokenizer tokenizer = new StringTokenizer(line);
15:             reg.getCommand(tokens.get(0)).run(tokens.subList(1, tokens.size()).toArray(new String[0]));}}}

DEPEND FTP NETSTACK NETLIB
DEPEND NETSTACK NETLIB
DEPEND DNS NETSTACK NETLIB
DEPEND BROWSER NETSTACK HTML
INSTALL NETLIB
Installing NETLIB
INSTALL FTP
Installing NETSTACK
Installing FTP
INSTALL foo
Installing foo
REMOVE NETLIB
NETLIB is still needed.
INSTALL BROWSER
Installing HTML
Installing BROWSER
INSTALL DNS
Installing DNS
LIST
NETSTACK
BROWSER
FTP
NETLIB
foo
DNS
HTML
REMOVE FTP
Removing FTP
REMOVE NETLIB
NETLIB is still needed.
REMOVE DNS
Removing DNS
REMOVE NETLIB
NETLIB is still needed.
INSTALL NETLIB
REMOVE NETSTACK
NETSTACK is still needed.
REMOVE BROWSER
Removing BROWSER
Removing NETSTACK
Removing HTML
REMOVE NETSTACK
NETSTACK is not installed.
LIST
NETLIB
foo
END