Previously, we sometimes used the initial loading code and sometimes
did not, and had to remember to `return`. Keep the code paths separate
and let the extension decide what to do.
It was a boolean with unclear semantics, when what
we really meant was "if this is a room update and there is room
data included in this response, send back extension data".
Previously, we would not send unread count INCREASES to the client,
as we would expect the actual event update to wake up the client conn.
This was great because it meant the event+unread count arrived atomically
on the client. This was implemented as "parse unread counts first, then events".
However, this introduced a bug when there was more than one user in the same room. In this
scenario, one poller may get the event first, which would go through to the client.
The subsequent unread count update would then be dropped and not sent to the client.
This would just be an unfortunate UI bug if it weren't for sorting `by_notification_count`
and sorting `by_notification_level`. Both of these sort operations use the unread counts
to determine room list ordering. This list would be updated on the server, but no
list operation would be sent to the client, causing the room lists to de-sync, and
resulting in incorrect DELETE/INSERT ops. This would manifest as duplicate rooms
on the room list.
In the process of fixing this, also fix a bug where typing notifications would not
always be sent to the client: due to incorrect type switches, they were only sent
when piggybacked.
Also fix another bug which prevented receipts from always being sent to the client.
This was caused by the extensions handler not checking if the receipt extension had
data to determine if it should return. This then interacted with an as-yet unfixed bug
which cleared the extension on subsequent updates, causing the receipt to be lost entirely.
A fix for this will be inbound soon.
When connections expire, the user cache listener is removed.
This acquires a write lock.
When user cache events are emitted, a read lock is acquired
_and held_ when invoking listeners.
Unfortunately, invoking a listener can cause a connection to
expire in the case of a full buffer, which would then deadlock
and prevent new user cache listeners from being added.
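One common way to avoid this kind of deadlock is to copy the listeners while holding the read lock, release it, and only then invoke them. The sketch below illustrates that pattern; it is not necessarily the fix that was applied, and `UserCache`, `Listener`, `Subscribe`, `Unsubscribe` and `Emit` are hypothetical names.

```go
package main

import "sync"

// Listener is a hypothetical callback type for user cache events.
type Listener func(event string)

// UserCache is a hypothetical stand-in, not the proxy's real type.
type UserCache struct {
	mu        sync.RWMutex
	listeners map[int]Listener
	nextID    int
}

func NewUserCache() *UserCache {
	return &UserCache{listeners: make(map[int]Listener)}
}

// Subscribe registers a listener and returns its ID. Takes the write lock.
func (c *UserCache) Subscribe(l Listener) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.nextID++
	c.listeners[c.nextID] = l
	return c.nextID
}

// Unsubscribe removes a listener. Takes the write lock.
func (c *UserCache) Unsubscribe(id int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.listeners, id)
}

// Emit snapshots the listeners under the read lock, releases the lock,
// and only then invokes them, so a listener may call Unsubscribe safely.
// Invocation order is unspecified because it follows map iteration order.
func (c *UserCache) Emit(event string) {
	c.mu.RLock()
	snapshot := make([]Listener, 0, len(c.listeners))
	for _, l := range c.listeners {
		snapshot = append(snapshot, l)
	}
	c.mu.RUnlock()

	for _, l := range snapshot {
		l(event)
	}
}
```

The trade-off is that a listener removed after the snapshot was taken may still be invoked once for the in-flight event.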
The user cache listeners slice is written to by HTTP goroutines
when clients make requests, and is read by the callbacks from v2
pollers. Only writes to this slice were protected, not reads, so
reads could race with writes. Expanded the mutex to be RW to handle this.
cancelOutstandingReq is the context cancellation function to terminate
previous requests when a new request arrives. Whilst the request itself
is guarded by a mutex, invoking this cancellation function was not guarded
by anything. Added an extra mutex for this.
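A minimal sketch of guarding a stored cancellation function with its own mutex follows. Only `cancelOutstandingReq` comes from the text above; `Conn`, `cancelMu` and `StartRequest` are hypothetical, and the sketch derives from `context.Background()` where real code would derive from the incoming request's context.

```go
package main

import (
	"context"
	"sync"
)

// Conn is a hypothetical connection type; only the field name
// cancelOutstandingReq is taken from the commit message.
type Conn struct {
	cancelMu             sync.Mutex
	cancelOutstandingReq context.CancelFunc
}

// StartRequest swaps in a new cancel function under the mutex and cancels
// the previous request, if any. It returns the context for the new request.
func (c *Conn) StartRequest() context.Context {
	ctx, cancel := context.WithCancel(context.Background())

	c.cancelMu.Lock()
	prev := c.cancelOutstandingReq
	c.cancelOutstandingReq = cancel
	c.cancelMu.Unlock()

	if prev != nil {
		// A CancelFunc may be called from any goroutine and more than once.
		prev()
	}
	return ctx
}
```

Swapping the field under the lock and calling the old function outside it keeps the critical section short while still ensuring that two concurrent requests never read and write the field at the same time.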
The RoomFinder accesses s.allRooms and is used when sorting the
room list, where we would expect many accesses. Previously, we
returned copies of room metadata, which caused significant amounts
of GC churn, enough to show up on traces.
Swap to using pointers and rename the function to `ReadOnlyRoom(roomID)`
to indicate that it isn't safe to write to this return value.
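The difference between returning a copy and returning a pointer can be sketched as below. `ReadOnlyRoom` and `allRooms` are named in the text; the `Store` and `RoomMetadata` types and their fields are assumptions for illustration, and synchronization is omitted.

```go
package main

// RoomMetadata is a hypothetical stand-in for the real room metadata struct.
type RoomMetadata struct {
	RoomID    string
	Name      string
	JoinCount int
}

// Store is a hypothetical holder of the allRooms map.
type Store struct {
	allRooms map[string]*RoomMetadata
}

// ReadOnlyRoom returns the stored pointer instead of a copy, so repeated
// lookups during sorting allocate nothing. Callers must not write to the
// returned value. It returns nil for an unknown room.
func (s *Store) ReadOnlyRoom(roomID string) *RoomMetadata {
	return s.allRooms[roomID]
}
```

Nothing in the type system enforces the read-only contract here; the function name is the only guard, which is why the rename matters.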
This allows you to send `timeline_limit: 1` in one request, then
swap to `timeline_limit: 10` in the 2nd request and get 10 events,
without it affecting the window (no ops or required_state resent).
This is being added to support fast preloading on mobile devices,
where `timeline_limit: 1` is used to populate the room preview in the
room list and then `timeline_limit: 20` is used to quickly pre-cache
a screen full of messages in case the user clicks through to the room.
This results in flaky tests and a bad UX because one SS response can
say "the count changed from 0 to 1" but the message itself only arrives
in a later SS response. We _only_ send notification counts if they _decrease_,
and piggyback increases on the events in question which caused the
counts to go up.
- Scope transaction IDs to the device ID (access token) rather
than the user ID, as this is more in line with the spec.
- Batch up all transaction ID lookups for all rooms being returned
into a single query. Previously, we would sequentially call SELECT
n times, one per room being returned, which was taking lots of time
just due to RTTs to the database server (often this table is empty).
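The batching in the second bullet can be sketched as building one `IN (...)` query rather than issuing n separate SELECTs. The table name `txns`, the column names, and keying the lookup on event IDs are all assumptions for illustration; `buildTxnIDQuery` is a hypothetical helper, and the `$n` placeholders assume a PostgreSQL-style driver.

```go
package main

import (
	"strconv"
	"strings"
)

// buildTxnIDQuery returns a single SELECT covering every event ID for one
// device, along with its positional arguments. The schema is hypothetical.
// It returns an empty query and nil args when there is nothing to look up,
// since "IN ()" is not valid SQL.
func buildTxnIDQuery(deviceID string, eventIDs []string) (string, []interface{}) {
	if len(eventIDs) == 0 {
		return "", nil
	}
	args := make([]interface{}, 0, len(eventIDs)+1)
	args = append(args, deviceID) // bound to $1
	placeholders := make([]string, 0, len(eventIDs))
	for i, id := range eventIDs {
		// Event IDs are bound to $2, $3, ... in order.
		placeholders = append(placeholders, "$"+strconv.Itoa(i+2))
		args = append(args, id)
	}
	query := "SELECT event_id, txn_id FROM txns WHERE device_id = $1 AND event_id IN (" +
		strings.Join(placeholders, ", ") + ")"
	return query, args
}
```

The returned query and arguments can be passed to `database/sql` as `db.Query(query, args...)`, costing one round trip regardless of how many rooms are being returned.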