Some thoughts around fenwick trees
References
- Questions
- https://www.hackerearth.com/practice/data-structures/advanced-data-structures/fenwick-binary-indexed-trees/tutorial/
- Two's complement : https://en.wikipedia.org/wiki/Two%27s_complement
- Proof of correctness : https://cs.stackexchange.com/questions/86835/proof-of-fenwick-trees-correctness
- Range update point query : https://cp-algorithms.com/data_structures/fenwick.html#2-range-update-and-point-query
Details
How does it work?
Sigh.
Again divide things into power of 2s.
All about $x \& (-x)$
This gives back the last set bit in a number $x$.
How do you represent $-x$?
So we use 2's complement to represent signed integers in binary. This just means that $-x_2 = 2^n - x$
So let's say we want to represent -6, then we find the bitwise representation of $6 = (0110)_2$
Now to find $-6 = 2^4 - 6 = 16 - 6 = 10 = (1010)_2$
Another way to do this is to just change the first bit from 0 to 1, where 1 always represents a negative number, then flip all the remaining bits (perform a bitwise not operation), and then add 1.
Again $6 = (0110)_2$
So $-6 = (1001)_2 + 1_2 = (1010)_2$
2's complement is the standard way to represent negative numbers in almost all hardwarres.
Finding the last set bit now
A very good explanation for this is present at : https://www.hackerearth.com/practice/data-structures/advanced-data-structures/fenwick-binary-indexed-trees/tutorial/
Let's say you have $x = a1b$ where 1 is the last set bit, b is all zeroes, and a can have both 1s and 0s.
We know that $-x = (a1b)' + 1$, where $'$ means NOT operation.
$-x = a'0(1...1) + 1 = a'1(0...0)$
So if we do $x \& -x = (a1b) \& a'1(0...0) = (0...0)1(0...0)$
That's why this operation works to find the last set bit in the number.
How does that help with fenwick trees?
Let's first work for the standard fenwick tree use case of point update, range query, where we can update certain indices with different running values, and in between we can have range queries.
For example : Let's say this array is already present : [1, 3, 0, 2, 5]
As operation 1, we say that add 1 to index 3, then this becomes : [1, 3, 1, 2, 5]
Then as operation 2, we ask what is the sum of elements in the range 2 to 4, so the answer would be 3 + 1 + 2 = 6.
The last significant bit represents the size of the range that the current number supports. So for example, $lsb(10) = 2$ so it supports two indexes - 9 and 10. When we add $lsb$ to itself, then we find another range. So $10 + lsb(10) = 10 + 2 = 12$, where 12 represents 9, 10, 11 and 12 - therefore whatever we are updating in 10, needs to get updated in 12 as well.
So basically, there are multiple overlapping ranges, which have the same index - therefore when this index is updated, all of these ranges are required to be updated.
Now LSB is the range of numbers that $i$ covers. If I *add* LSB to $i$, then it will give us a number that will have a higher LSB. That means that by default, that number would contain $i$ in it's range.
But how do we ensure that all relevant indexes that cover this index are updated?
When we are going up the tree, we add $LSB(i)$ to $i$ - but why this specifically? And what if this doesn't cover all the indexes that cover i? In that case we would miss updating some valid indexes, and one of our query operations would be wrong.
How do we do range update and point queries?
Ref : https://cp-algorithms.com/data_structures/fenwick.html#2-range-update-and-point-query
Consider a case where as an operation you can update ranges, and then you need to query the values at some points, after these range updates.
We can use fenwick trees here, by re-using the point-update, range query method above.
This stems from the basic pre-fix sum method of ranges. If you have a set of ranges, and you want to query the number of ranges covering a point, you can just put a +1 at L and -1 at R + 1. Then you can start from the leftmost point, and keep on adding till the right.
For example, let's say we have an array of size 5 : [0, 0, 0, 0, 0] and we have the following ranges:
1. [1,3]
2. [3,4]
3. [1,5]
4. [4,5]
Now for each index, we want to find the number of ranges covering them. So using the above mentioned methodology:
1. [1,3] : [1, 0, 0, -1, 0]
2. [3,4] : [1, 0, 1, -1, -1]
3. [1,5] : [2, 0, 1, -1, -1]
4. [4,5] : [2, 0, 1, 0, -1]
Let's make this a pre-fix array:
[2, 2, 3, 3, 2]
Now 1 is covered by 2 ranges : [1,3] and [1, 5] - so this is correct
2 is covered by the same two ranges
3 is covered by 3 ranges : [1,3], [3, 4] and [1,5]
4 is covered by 3 ranges as well : [3, 4], [1, 5] and [4,5]
5 is covered by 2 ranges : [1, 5] and [4, 5]
Therefore this method indeed works
How to do that via fenwick trees:
Let's say we have an empty array of size 5 : [0, 0, 0, 0, 0], and we want to update a range [2,3] with 1s. So this becomes : [0, 1, 1, 0, 0]
We can use a fenwick tree for this by:
1. For range [2,3], increase by 1, the index 2 in the BIT.
1. So this will mean that in the tree : [0, 1, 0, 1, 0] will happen, because an update in 2 will mean an update in 4 as well.
2. Now for range above 3 (that is from index 4), decrease that by 1 : That will update only 4 : [0, 1, 0, 0, 0]
3. Now when you query for index 1, you get 0
4. When you query for index 2, you get 1
5. When you query for index 3, you get 1
6. When you query for index 4 you get 0
7. When you query for index 5 you get 0