Some thoughts around fenwick trees

References

Questions

https://codeforces.com/contest/863/problem/E

https://www.hackerearth.com/practice/data-structures/advanced-data-structures/fenwick-binary-indexed-trees/tutorial/
Two's complement : https://en.wikipedia.org/wiki/Two%27s_complement
Proof of correctness : https://cs.stackexchange.com/questions/86835/proof-of-fenwick-trees-correctness
Range update point query : https://cp-algorithms.com/data_structures/fenwick.html#2-range-update-and-point-query

Details

How does it work?

Sigh.

Again divide things into power of 2s.

All about $x \& (-x)$

This gives back the last set bit in a number $x$.

How do you represent $-x$?

So we use 2's complement to represent signed integers in binary. This just means that $-x_2 = 2^n - x$
So let's say we want to represent -6, then we find the bitwise representation of $6 = (0110)_2$
Now to find $-6 = 2^4 - 6 = 16 - 6 = 10 = (1010)_2$

Another way to do this is to just change the first bit from 0 to 1, where 1 always represents a negative number, then flip all the remaining bits (perform a bitwise not operation), and then add 1.

Again $6 = (0110)_2$

So $-6 = (1001)_2 + 1_2 = (1010)_2$

2's complement is the standard way to represent negative numbers in almost all hardwarres.

Finding the last set bit now

A very good explanation for this is present at : https://www.hackerearth.com/practice/data-structures/advanced-data-structures/fenwick-binary-indexed-trees/tutorial/

Let's say you have $x = a1b$ where 1 is the last set bit, b is all zeroes, and a can have both 1s and 0s.

We know that $-x = (a1b)' + 1$, where $'$ means NOT operation.

$-x = a'0(1...1) + 1 = a'1(0...0)$

So if we do $x \& -x = (a1b) \& a'1(0...0) = (0...0)1(0...0)$
That's why this operation works to find the last set bit in the number.

How does that help with fenwick trees?

Let's first work for the standard fenwick tree use case of point update, range query, where we can update certain indices with different running values, and in between we can have range queries.

For example : Let's say this array is already present : [1, 3, 0, 2, 5]

As operation 1, we say that add 1 to index 3, then this becomes : [1, 3, 1, 2, 5]

Then as operation 2, we ask what is the sum of elements in the range 2 to 4, so the answer would be 3 + 1 + 2 = 6.

The last significant bit represents the size of the range that the current number supports. So for example, $lsb(10) = 2$ so it supports two indexes - 9 and 10. When we add $lsb$ to itself, then we find another range. So $10 + lsb(10) = 10 + 2 = 12$, where 12 represents 9, 10, 11 and 12 - therefore whatever we are updating in 10, needs to get updated in 12 as well.

So basically, there are multiple overlapping ranges, which have the same index - therefore when this index is updated, all of these ranges are required to be updated.

Now LSB is the range of numbers that $i$ covers. If I *add* LSB to $i$, then it will give us a number that will have a higher LSB. That means that by default, that number would contain $i$ in it's range.

But how do we ensure that all relevant indexes that cover this index are updated?

When we are going up the tree, we add $LSB(i)$ to $i$ - but why this specifically? And what if this doesn't cover all the indexes that cover i? In that case we would miss updating some valid indexes, and one of our query operations would be wrong.

How do we do range update and point queries?

Ref : https://cp-algorithms.com/data_structures/fenwick.html#2-range-update-and-point-query

Consider a case where as an operation you can update ranges, and then you need to query the values at some points, after these range updates.

We can use fenwick trees here, by re-using the point-update, range query method above.

This stems from the basic pre-fix sum method of ranges. If you have a set of ranges, and you want to query the number of ranges covering a point, you can just put a +1 at L and -1 at R + 1. Then you can start from the leftmost point, and keep on adding till the right.

For example, let's say we have an array of size 5 : [0, 0, 0, 0, 0] and we have the following ranges:
1. [1,3]
2. [3,4]
3. [1,5]
4. [4,5]

Now for each index, we want to find the number of ranges covering them. So using the above mentioned methodology:

1. [1,3] : [1, 0, 0, -1, 0]
2. [3,4] : [1, 0, 1, -1, -1]
3. [1,5] : [2, 0, 1, -1, -1]
4. [4,5] : [2, 0, 1, 0, -1]
Let's make this a pre-fix array:

[2, 2, 3, 3, 2]

Now 1 is covered by 2 ranges : [1,3] and [1, 5] - so this is correct
2 is covered by the same two ranges
3 is covered by 3 ranges : [1,3], [3, 4] and [1,5]
4 is covered by 3 ranges as well : [3, 4], [1, 5] and [4,5]
5 is covered by 2 ranges : [1, 5] and [4, 5]

Therefore this method indeed works

How to do that via fenwick trees:

Let's say we have an empty array of size 5 : [0, 0, 0, 0, 0], and we want to update a range [2,3] with 1s. So this becomes : [0, 1, 1, 0, 0]

We can use a fenwick tree for this by:
1. For range [2,3], increase by 1, the index 2 in the BIT.
1. So this will mean that in the tree : [0, 1, 0, 1, 0] will happen, because an update in 2 will mean an update in 4 as well.
2. Now for range above 3 (that is from index 4), decrease that by 1 : That will update only 4 : [0, 1, 0, 0, 0]
3. Now when you query for index 1, you get 0
4. When you query for index 2, you get 1
5. When you query for index 3, you get 1
6. When you query for index 4 you get 0
7. When you query for index 5 you get 0

Some thoughts around sparse tables

References

https://cp-algorithms.com/data_structures/sparse-table.html
Problems:

https://codeforces.com/blog/entry/70418

Very good : https://codeforces.com/blog/entry/101083

https://codeforces.com/blog/entry/75611

There is something called as a [[Disjoint sparse trees]] as well

Details

What's the main idea?

The major use case that sparse tables solve is the range minimum query - for an immutable array, in $O(1)$ time.

The main idea behind that is : If I want to get the minimum of a range say $[L,R]$ = then that's equivalent to saying $min(min(L, L + k), min(R - k + 1, R))$ where $k \geq \frac{L + R} {2}$. This is because even if some middle elements in the range are common in both the ranges, the output still remains correct.

If you've this minimum of sub-ranges already pre-calculated for ks of appropriate size, then this operation can be done in $O(1)$ time.

How's the pre-calculation done here?

We can keep ranges like this:
- For every index inside the array - we keep the minimums for all powers of 2 (only until it makes sense, that is at max $log_2(n)$ where $n$ is the size of the array).

So let's say we have an array like this : [1, 2, 9, 4, 11]

Here, if we are talking from the perspective of index = 1, which is 2, then we will keep:

Range of size 1 ( $= 2^0$ ) from index = 1, has min = 2 (only 2 in the current range)
Range of size 2 ($=2^1$) from index = 1, has min = 2 (2 and 9 in the current range)
Range of size 4 ($=2^2$) from index = 1, has min = 2 (2, 9, 4 and 11 in the current range)
Range of size 8 - not possible, as array size itself is just 5.

We save this information for all the indexes of the array. This, therefore is saved inside a 2-D matrix. Let's call this matrix $st$, where $st[i][j]$ = minimum for the range from index $i$ of size $2^j$

Therefore note that $st[i][0]$ will just be the array itself.

Now we can calculate this easily by noticing that for a range of size 4, you can get the answer from ranges of sizes 2. That is for min of [a, b, c, d], you can use the min of [a, b] and [c, d].

Therefore we can calculate each power of 2 level-wise, and use that to calculate the next level.

Therefore we can say:

$$ st[i][j] = min(st[i][j - 1], st[i + (1<<(j - 1))][j - 1]) $$

This just means that for the range of size $2^j$, we are making use of two ranges of size $2^{j-1}$. One that starts from $i$ and goes until the middle element of the range : $i + (1<<(j - 1)) - 1$. Second that starts from next of the middle element : $i + (1<<(j - 1))$, till the end of the range which is $i + 2 \times (1 << (j - 1))$ which is just equal to $i + (1<<j)$. That's how this calculation works under the hood.

Now once $st$ is pre-calculated, you can use this to answer range minimum queries.

Let's say you need to find the min for the range $[L, R]$, you can do so by:
1. Find the size of the range = $R - L + 1$
2. Find the largest power of 2 that is lower or equal to this range = $log_2(R - L + 1)$. Let's call this $h$
3. Now use $h$ to calculate : $min(st[L][h], st[R - (1<<h) + 1][h]$

The final output will give you the min of the range.

Blatherstrike - The strike of the nonsense

Sunday, 25 May 2025