## October 20, 2021 Single Round match 816 Editorials

#### AirportCodes

In this problem we have to shorten each of the given airport names into a unique three-letter code, if possible. The constraints are small, so the problem is solvable by various brute force approaches – but we still have to be at least a little careful, as the most naive brute force can still time out.

In the time complexity estimates below we will use N to denote the number of airports, L to denote the maximum length of an airport name, and S to denote the size of the alphabet. The constraints in the problem were N=L=30 and S=26, so when determining how fast an approach will be, we can simply say that “everything is 30”.

There are O(L^3) ways how to choose which three characters of an airport’s name will form its code. Once we choose the code, we need to make sure that we cannot choose the same code for a different airport.

We could consider doing this by the same brute force: for each of the other N-1 airports, go through all O(L^3) ways to select a code and check that none of them matches the one we have. However, this approach then runs in O(N*L^6) time, which is too much.

Luckily, there is a better way to check whether an airport can have a given code. We just need to read the name of the airport once. E.g., when checking whether an airport can have the code “ABC”, we will read the name of the airport until we see an ‘A’, then continue from there until we see a ‘B’, and from there we then look for a ‘C’. If we succeed, the airport can have the code “ABC” (and we found one way to produce it), if we fail, we can be certain that the airport cannot have the code “ABC”. This algorithm improves the above solution to run in O(N*L^4) time, which is fast enough by a wide margin. A sample implementation follows.

```
boolean canHaveCode(String airport, String code) {
int seen = 0;
for (char c : airport.toCharArray()) {
if (c == code.charAt(seen)) ++seen;
if (seen == 3) return true;
}
return false;
}
String getCode(String[] airports, int which) {
int N = airports.length;
int L = airports[which].length();
for (int a=0; a<L; ++a) for (int b=a+1; b<L; ++b) for (int c=b+1; c<L; ++c){
String candidate = "";
candidate += airports[which].charAt(a);
candidate += airports[which].charAt(b);
candidate += airports[which].charAt(c);
boolean conflict = false;
for (int i=0; i<N; ++i) if (i != which)
if (canHaveCode(airports[i], candidate)) conflict = true;
if (!conflict) return candidate;
}
return "";
}
public String[] name(String[] airports) {
int N = airports.length;
String[] codes = new String[N];
for (int n=0; n<N; ++n) codes[n] = getCode(airports, n);
return codes;
}
```

There are also many other approaches that work. We’ll add one more: one that would be usable for much bigger N and L, too. For this approach, note that there are only S^3 possible airport codes – this is the number of all possible 3-letter strings. We can allocate an S*S*S array to remember information about each of them. Initially, the whole array will be initialized to the value -1, meaning “we haven’t seen any airport that could have this code”. Then, for each of the N airports, we go through all O(L^3) ways to generate its possible code, and for each of those codes we mark it in the array: Once we find the first airport that can have a particular code, we change its value from -1 to the id of that airport. Once we find a second airport that can have that code, we change the airport id to the value -2, meaning “this code can belong to multiple airports and is therefore unusable”. In the end, we then look through the entire S*S*S array to find all codes that can be used for airports.

As described above, the time complexity of this solution is O(S^3 + N*L^3), which is an order of magnitude faster than the previous solution.

Challenge: Assuming L is large and S is still 26, can you find a better function that takes an airport name as input and generates all possible codes for that airport? Ideally, the time complexity of your function should only be linear in L, not cubic.

#### SmallOccupiedAirplane

In this problem we just need to do what we are told. The main challenge is choosing the data structures and algorithms in such a way that we get a short clean implementation with as few special cases as possible.

A sample implementation is shown below. We represent the airplane by a 2D array of booleans (marking occupied seats) and a 1D array of integers (for each row, the current number of occupied seats). The first representation tells us quickly whether a particular seat is available, the second one tells us whether a group fits into this row.

Probably the trickiest part is handling the cases when we are seating a single passenger. In those, we want to iterate through the plane three times: the first time looking for a window seat, the second time looking for an aisle seat, and the third time looking for a middle seat. Check the implementation below to see how this can be handled using nested cycles instead of copying and pasting code.

```
int R;
boolean[][] occupied;
int[] occupiedCount;
String[] answer;
int seated;
String seatNumber(int row, int column) { return "" + (row + 1) + "ABCDEF".charAt(column); }
void seatOne() {
int[][] attempts = new int[][] { {0,5}, {2,3}, {1,4} };
for (int a=0; a<3; ++a) for (int r=0; r<R; ++r) for (int c : attempts[a]) {
if (!occupied[r][c]) {
occupied[r][c] = true;
++occupiedCount[r];
answer[seated++] = seatNumber(r,c);
return;
}
}
}
void seatMany(int count) {
for (int r=0; r<R; ++r) if (occupiedCount[r] + count <= 6) {
occupiedCount[r] += count;
for (int c=0; c<6; ++c) if (count > 0 && !occupied[r][c]) {
occupied[r][c] = true;
answer[seated++] = seatNumber(r,c);
--count;
}
return;
}
for (int i=0; i<count; ++i) seatOne();
}
public String[] seat(int _R, int[] groups) {
R = _R;
occupied = new boolean[R][6];
occupiedCount = new int[R];
int sg = 0;
for (int g : groups) sg += g;
answer = new String[sg];
seated = 0;
for (int g : groups) if (g == 1) seatOne(); else seatMany(g);
return answer;
}
```

#### BalancedAirplane

The main trick in this problem is to think about it in terms of prefix sums. Let’s look at one particular column c, and let P[r] be the number of people in the first r rows of our plane. How many people are in this column in rows from the half-open range [r1,r2)? Exactly P[r2] – P[r1].

A balanced section of the plane is a section such that each column of that section contains the same number of people. In other words, a section [r1,r2) is balanced if and only if the values P[r2] – P[r1] for all c are equal to the same constant k. Or, stating the same thing in yet another way, the values P[*][r2] are the values P[*][r1], each incremented by the same constant k.

This gives us a general idea on how to look for balanced sections. We will process the airplane from the front to the back. Each time we process another row, we add its people to the running totals, so now we know the number of people we already saw in each column. Now we can ask the question: how many balanced sections end with the row we just processed? Each such section corresponds to some previous situation in which we saw the same vector of people counts we see now, only with each value decreased by the same constant. This is what we need to do efficiently.

There are multiple more-or-less equivalent ways to do this. The two most straightforward ones are differences and normalization. Let’s look at both.

In both cases, let u=(u[0],…,u[S-1]) be a vector of S values.

The difference operator D takes a vector of S values and returns a vector of S-1 values: the differences between adjacent elements of the input vector. That is, D(u) = (u[1]-u[0], u[2]-u[1], …, u[S-1]-u[S-2] ).

Now, clearly, two vectors u and v differ by adding a constant to each term if and only if D(u) = D(v).

Hence, if the vector u represents the current number of passengers in each column, the question “how many balanced sections end here” can now be rephrased as “how many times have we already seen a vector u’ such that D(u’) = D(u)?”. And this is easy to answer efficiently: we will simply use a map where we remember, for each vector of differences, the number of times we have already seen it.

This gives us a solution that runs in guaranteed O(R*S*log(R)) time if we use a balanced tree map, and in expected O(RS) time if we use a hashmap.

The other option we mentioned above is normalization. Consider the following two operators: N0(u) subtracts the value u[0] from all elements of u, thus producing a new vector in which the initial element is zero. Nm(u) subtracts the value min(u) from all elements of u, thus producing a new vector in which the smallest element is zero. Clearly, either of these operators can be used instead of the difference operator and the solution still works: u and v differ by the same constant everywhere if and only if N0(u) = N0(v), and also if and only if Nm(u) = Nm(v). The latter option is implemented in the code shown below.

```
void normalize(vector<int> &counts) {
int m = counts[0];
for (int x : counts) m = min(m, x);
for (int &x : counts) x -= m;
}
long long count(int S, int R, const vector<int> &A) {
map< vector<int>, int > seen;
vector<int> counts(S, 0);
++seen[counts];
long long answer = 0;
for (int r=0; r<R; ++r) {
for (int s=0; s<S; ++s) if ((A[r] >> s) & 1) ++counts[s];
normalize(counts);
answer += seen[counts];
++seen[counts];
}
return answer;
}
```

#### OccupiedAirplane

In this problem we need to come up with an efficient way of looking for available seats in an airplane, according to the rules given in the statement. We will show two different efficient ways of doing so.

One solution will be based on the following observation: as there are only 6 seats in each row, there are only 2^6 possible patterns of free and occupied seats in a row. We will divide all rows into groups according to their current pattern. Clearly, whenever seating people, from each group only the first row matters, because if we encounter that one and don’t use it, we also won’t use any other row that looks the same. This gives us only a constant number of candidates to consider for each seating operation. Once we seat a person, the row where it happens needs to be moved into a different group, which involves two operations on ordered sets.

In total, the worst-case time complexity of this approach for a plane with R rows of S seats is O(RS*(2^S + log R)). This is the solution shown in the implementation below.

Another efficient solution can be based on the idea of amortized time complexity. Suppose at some point we were trying to seat a group of 4, we looked at row 47 and concluded that at this moment the group does not fit into this row. Clearly this means that in the future we never need to check row 47 for a group of size 4, as the number of empty seats in a row cannot increase. Thus, for each group size between 2 and 6, inclusive, we can simply maintain a single index: the last row into which we tried to place a group of this size. Whenever we get a next group of the same size, that is where we’ll start the search for their places.

(One can also argue that if a row does not fit a group of size 4, it won’t fit any bigger group. This improvement can eliminate some steps but it’s not necessary to implement it and it doesn’t change the worst-case time complexity of the solution.)

While any single query can take long to process, during the entire run of the solution each of these variables only increases, and thus for each group size we’ll only traverse the entire plane once.

For individual passengers we can do essentially the same, but we’ll also need to remember which types of seats we are currently looking for, and the variable will therefore traverse the plane three times: once while assigning window seats, once for aisle, and once for middle seats.

This approach therefore works in worst-case O(RS) time complexity, which is clearly optimal. (Well, technically it’s O(RS^2) if we are lazy with how we implement actually finding empty seats for a group within a row, but improving this is an easy exercise.)

```
int[][] seatNumbers = { {0,5}, {2,3}, {1,4} }; // window/aisle/middle seats
int R;
int[] airplane;
long seated;
long checksum;
ArrayList< TreeSet<Integer> > rowTypes;
int[] minVal;
boolean[][] hasSeatType;
int[] freeSeats;
boolean precomputeHasSeatType(int row, int type) {
for (int seat : seatNumbers[type])
if (((row >> seat) & 1) == 0) return true;
return false;
}
void seatIndividual(int row, int[] columnOptions) {
// Seat an individual into the given row, onto the first available of
// the listed column options. Then, update the sets of rows.
for (int c : columnOptions) {
if ((airplane[row] & 1<<c) == 0) {
int oldRow = airplane[row], newRow = airplane[row] | 1<<c;
airplane[row] = newRow;
rowTypes.get(oldRow).remove(row);
rowTypes.get(newRow).add(row);
minVal[oldRow] = rowTypes.get(oldRow).first();
minVal[newRow] = rowTypes.get(newRow).first();
++seated;
checksum += seated * (6*row + c);
return;
}
}
}
public void findIndividualPlace() {
// find the best place to seat an individual traveler and put them there
for (int t=0; t<3; ++t) {
int bestRow = R;
for (int r=0; r<64; ++r)
if (hasSeatType[r][t]) bestRow = Math.min( bestRow, minVal[r] );
if (bestRow < R) {
seatIndividual( bestRow, seatNumbers[t] );
return;
}
}
}
public void seatGroup(int size) {
// find the places for a group of the given size
if (size == 1) { findIndividualPlace(); return; }
int bestRow = R;
for (int r=0; r<64; ++r)
if (freeSeats[r] >= size) bestRow = Math.min( bestRow, minVal[r] );
if (bestRow < R) {
int[] options = {0, 1, 2, 3, 4, 5};
for (int i=0; i<size; ++i) seatIndividual(bestRow, options);
} else {
for (int i=0; i<size; ++i) findIndividualPlace();
}
}
public long seat(int _R, int[] groups, int seed) {
// initialize an empty airplane
R = _R;
airplane = new int[R];
// for each row pattern, precompute which types of individual seats it has
// and what is the total number of free seats in it
hasSeatType = new boolean[64][3];
for (int r=0; r<64; ++r)
for (int t=0; t<3; ++t) hasSeatType[r][t] = precomputeHasSeatType(r,t);
freeSeats = new int[64];
for (int r=0; r<64; ++r)
for (int s=0; s<6; ++s) if (((r >> s) & 1) == 0) ++freeSeats[r];
// divide all airplane rows into groups and cache their minima
rowTypes = new ArrayList< TreeSet<Integer> >();
for (int i=0; i<64; ++i) rowTypes.add( new TreeSet<Integer>() );
minVal = new int[64];
for (int i=0; i<64; ++i) { rowTypes.get(i).add(R); minVal[i] = R; }
minVal[0] = 0;
for (int r=0; r<R; ++r) rowTypes.get(0).add(r);
seated = 0;
checksum = 0;
// generate and seat all groups
for (int g : groups) seatGroup(g);
long state = seed;
while (true) {
state = (state * 1103515245 + 12345) % (1L << 31);
int g = (int)(((state >> 10) % 6) + 1);
if (seated + g > 6*R) break;
seatGroup(g);
}
return checksum;
}
```

#### FlightReduction

Many optimization problems related to dense subgraphs turn out to be NP-hard, but this particular version (finding the subgraph in which the number of edges divided by the number of vertices is maximal) is solvable in polynomial time. The first person to discover and publish an efficient algorithm for this problem was Goldberg in 1984. The main idea of the algorithm is to find a clever reduction from our problem to maximum flow (or, equivalently, minimum cut) in a related graph.

Suppose we have a non-negative guess g for the density of a subgraph. How can we verify whether our graph has any subgraph with density at least g?

Let’s take our original graph and assign capacity 1 to each of its edges. We’ll now add two new vertices (the source s and the sink t) and 2*n extra edges:

- for each original vertex v, an edge between s and v with capacity m, where m is the number of original edges
- for each original vertex v, an edge between v and t. Initially, let’s also set the capacity of these edges to m, but we’ll tweak them later.

One possible s-t cut in this graph is the cut ({s}, everything else). The capacity of this cut is n*m.

In any other cut there is some non-empty set A of |A|=a vertices of the original graph that is in the same part as vertex s. Let B be the other vertices of the original graph What is the capacity of the cut {s}+A, {t}+B? The cut consists of:

- (n-a) edges between s and vertices in B, total capacity (n-a)*m
- all edges between A and B, total capacity equal to their number
- all edges between t and vertices in A, total capacity is currently a*m

So, currently the capacity of this cut would be n*m + the number of edges between A and B. That’s not what we want. Our goal is to set the capacities for v-t edges so that the total will somehow be related to the density of the set A instead. For that, we DO want the edges in A, but we DON’T want the edges between A and B. How can we do something along those lines? For each v-t edge, we can add -degree(v) to its capacity, where degree(v) is the degree of v in the original graph. Then, the sum of capacities of all v-t edges decreases by the sum of degrees in A. The sum of degrees in A equals twice the number of edges within A, plus the number of edges between A and B. Hence, now the capacity of our s-t cut became equal to n*m – 2*e(A), where e(A) is the number of edges in A.

We now want to compare the density of A to our guess g. If our guess is exactly correct, A should have exactly a*g edges. Hence, let’s add 2*g to each of the v-t edges.

Summary of what we got: in the final graph each of the v-t edges has the capacity m + 2*g – degree(v). Then, for any subset A of vertices of the original graph, the capacity of the cut ({s}+A, {t}+B) is n*m + 2*(a*g – e(A)).

What this means:

- If our guess g is too high, the capacity of each such cut is greater than n*m, and therefore the minimum cut capacity is exactly n*m and the optimal cut is ({s}, everything else)
- If our guess g is too low, the capacity of the cut that corresponds to the most dense subgraph will be less than n*m, and therefore the minimum cut capacity will be less than n*m.
- If our guess happens to be exact, the minimum cut capacity is n*m but there is more than one such cut, not just the trivial one.

Hence, we can use binary search to look for the optimal density g. This can either be implemented using doubles, or using integers. The main question in either case is when to stop. The density of any subgraph is a fraction with denominator at most equal to n. Hence, the difference between any two valid densities is at most 1/n(n-1). If solving in doubles, as soon as our interval becomes shorter than that, we know that we are done. If solving in integers, we can binary search on the list of all possible fractions, and once we are testing a g that is a fraction, we can scale everything by its denominator and then compute the whole max flow / min cut in integers.

**misof**