Specifically, this targets invite rejections, where the leave
event is inside the leave block of the sync v2 response.
Previously, we would make a snapshot from this leave event. If the
proxy wasn't in the room, the room state would then consist of just
the leave event, which is wrong. If the proxy was in the room, the
state would correctly be rolled forward.
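A minimal sketch of the fixed behaviour follows; the store and method names are illustrative rather than the proxy's actual API. The key point is that the leave block only rolls a room forward when a snapshot for it already exists:

```go
package main

import "fmt"

// stateStore is a hypothetical stand-in for the proxy's snapshot storage.
type stateStore struct {
	// room ID -> state events in the current snapshot (heavily simplified).
	snapshots map[string][]string
}

// onLeaveBlock sketches the corrected handling of a leave event found in the
// `leave` block of a sync v2 response.
func (s *stateStore) onLeaveBlock(roomID, leaveEvent string) {
	snapshot, known := s.snapshots[roomID]
	if !known {
		// Rejected invite: the proxy holds no state for this room, so a
		// snapshot made from the leave event alone would be wrong. Skip it.
		return
	}
	// The proxy was in the room: roll the existing snapshot forward as usual.
	s.snapshots[roomID] = append(snapshot, leaveEvent)
}

func main() {
	s := &stateStore{snapshots: map[string][]string{
		"!joined:example.org": {"m.room.create", "m.room.member (join)"},
	}}
	s.onLeaveBlock("!joined:example.org", "m.room.member (leave)")
	s.onLeaveBlock("!rejected-invite:example.org", "m.room.member (leave)")
	fmt.Println(s.snapshots) // only the already-known room rolls forward
}
```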
The previous query would:
- Map room IDs to snapshot NIDs
- UNNEST(events) on all those state snapshots
- Check whether the type/state_key match the filter
This was very slow under the following circumstances:
- The rooms have lots of members (e.g. Matrix HQ)
- The `required_state` has no filter on `m.room.member`, which is what Element X does.
To improve this, we now have _two_ columns per state snapshot:
- membership_events : only the m.room.member events
- events : everything else
Now if a query comes in which doesn't need m.room.member events, we only need
to look in the everything-else bucket, which is significantly smaller.
This reduces these queries from ~500ms to about 50ms.
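A rough sketch of how the lookup can now skip the membership bucket; the table and column names here are illustrative, not the exact schema:

```go
// snapshotQuery sketches the split-bucket lookup. If the required_state
// filter never asks for m.room.member, only the much smaller `events`
// column needs unnesting; otherwise both buckets are concatenated.
func snapshotQuery(wantsMemberEvents bool) string {
	if wantsMemberEvents {
		return `
		SELECT UNNEST(events || membership_events) AS event_nid
		FROM state_snapshots WHERE snapshot_id = ANY($1)`
	}
	// Fast path for Element X style requests with no m.room.member filter.
	return `
	SELECT UNNEST(events) AS event_nid
	FROM state_snapshots WHERE snapshot_id = ANY($1)`
}
```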
Features:
- Add `typing` extension.
- Add `receipts` extension.
- Add comprehensive Prometheus `/metrics` activated via `SYNCV3_PROM`.
- Add `SYNCV3_PPROF` support.
- Add `by_notification_level` sort order.
- Add `include_old_rooms` support.
- Add support for `$ME` and `$LAZY`.
- Add correct filtering when `*,*` is used as `required_state`.
- Add `num_live` to each room response to indicate how many timeline entries are live.
Bug fixes:
- Use a stricter comparison function on ranges: fixes an issue whereby unit tests fail on go1.19 due to a change in the sorting algorithm.
- Send back an `errcode` on HTTP errors (e.g. expired sessions).
- Remove `unsigned.txn_id` on insertion into the DB. Otherwise one user's txn IDs would be visible to other users :(
- Improve range delta algorithm: previously it didn't handle cases like `[0,20] -> [20,30]` and would panic.
- Send HTTP 400 for invalid range requests.
- Don't publish no-op unread counts, which just add extra noise.
- Fix leaking DB connections which could eventually consume all available connections.
- Ensure we always unblock WaitUntilInitialSync even on invalid access tokens. Other code relies on WaitUntilInitialSync() actually returning at _some_ point: for example, on startup we have N workers which bound the number of concurrent pollers made at any one time, and we must not hog a worker forever (see the sketch after this list).
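A minimal sketch of why this matters, with illustrative names and pool size: the poller startup path holds one of the N worker slots while it waits, so if WaitUntilInitialSync never returned on a bad token, that slot would be lost forever.

```go
// poller is a stand-in for the proxy's per-device poller.
type poller struct{ initialSyncDone chan struct{} }

// WaitUntilInitialSync must return at _some_ point, even for invalid access
// tokens, otherwise the worker slot acquired below is never released.
func (p *poller) WaitUntilInitialSync() { <-p.initialSyncDone }

// workers bounds how many pollers run their initial sync at any one time.
var workers = make(chan struct{}, 16)

// startPoller holds a worker slot for the duration of the initial sync.
func startPoller(p *poller) {
	workers <- struct{}{}        // acquire a slot
	defer func() { <-workers }() // always release it, even on error paths
	p.WaitUntilInitialSync()
}
```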
Improvements:
- Greatly improve startup times of sync3 handlers by improving `JoinedRoomsTracker`: a modest amount of data used to take ~28s to create the handler; now it takes ~4s.
- Massively improve initial v3 sync times by refactoring `JoinedRoomsTracker`: from ~47s to <1s.
- Add `SlidingSyncUntil...` in tests to reduce races.
- Tweak the API shape of JoinedUsersForRoom to reduce state block processing time for large rooms from 63s to 39s.
- Add trace task for initial syncs.
- Include the proxy version in UA strings.
- HTTP errors now wait 1s before returning to stop clients tight-looping on error.
- Increase the pending event buffer size to 2000 events.
- Index the room ID first to cull the most events when returning timeline entries. Speeds up `SelectLatestEventsBetween` by a factor of 8.
- Remove cancelled `m.room_key_request` events from the to-device inbox. This cuts down the number of events in the inbox by ~94% for very large (20k+) inboxes and ~50% for moderately sized (200 events) inboxes, and adds book-keeping to remember the unacked to-device position for each client (see the sketch after this list).
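A sketch of the key-request clean-up; the struct and the matching key are illustrative (the real code may also match on the requesting device ID). A `request_cancellation` removes the earlier `request` with the same sender and request ID, and the cancellation itself is dropped too:

```go
// toDeviceEvent is a simplified view of a to-device event.
type toDeviceEvent struct {
	Type      string
	Sender    string
	Action    string // "request" or "request_cancellation"
	RequestID string
}

// removeCancelledKeyRequests drops m.room_key_request pairs which cancel
// each other out before the inbox is sent to the client.
func removeCancelledKeyRequests(inbox []toDeviceEvent) []toDeviceEvent {
	cancelled := make(map[string]bool) // keyed on sender + request ID
	for _, ev := range inbox {
		if ev.Type == "m.room_key_request" && ev.Action == "request_cancellation" {
			cancelled[ev.Sender+"/"+ev.RequestID] = true
		}
	}
	kept := make([]toDeviceEvent, 0, len(inbox))
	for _, ev := range inbox {
		if ev.Type == "m.room_key_request" && cancelled[ev.Sender+"/"+ev.RequestID] {
			continue // drop both the original request and its cancellation
		}
		kept = append(kept, ev)
	}
	return kept
}
```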
A fundamental assumption in the proxy has been that the order of events
in `timeline` in v2 will be the same all the time. There's some evidence
to suggest this isn't true in the wild. This commit refactors the proxy
to not assume this. It does this by:
- Not relying on the number of newly inserted rows and slicing the events
to figure out _which_ events are new. Now the INSERT has `RETURNING event_id, event_nid`
and we return a map from event ID to event NID to explicitly say which
events are new (see the sketch after this list).
- Adding more paranoia when calculating new state snapshots: if we see the
same (type, state key) tuple more than once in a snapshot, we error out.
- Adding regression tests which try to insert events out of order to trip the
proxy up.
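A rough sketch of the RETURNING approach, assuming a Postgres-style events table; the table, column and type names are illustrative, and the real code presumably batches rather than inserting row by row. The insert itself reports which rows were actually written, so callers no longer infer newness from row counts and slice offsets:

```go
package state

import (
	"database/sql"

	"github.com/jmoiron/sqlx"
)

// Event is a simplified view of an event to be inserted.
type Event struct {
	ID     string
	RoomID string
	JSON   []byte
}

// insertEvents returns a map of event ID -> event NID for the events which
// were genuinely new. Already-known events hit the ON CONFLICT clause,
// return no row, and so never appear in the map.
func insertEvents(txn *sqlx.Tx, events []Event) (map[string]int64, error) {
	newEvents := make(map[string]int64, len(events))
	for _, ev := range events {
		var nid int64
		err := txn.QueryRow(`
			INSERT INTO events(event_id, room_id, event)
			VALUES ($1, $2, $3)
			ON CONFLICT (event_id) DO NOTHING
			RETURNING event_nid`,
			ev.ID, ev.RoomID, ev.JSON,
		).Scan(&nid)
		if err == sql.ErrNoRows {
			continue // the event already existed: not new
		}
		if err != nil {
			return nil, err
		}
		newEvents[ev.ID] = nid
	}
	return newEvents, nil
}
```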
Specifically, we can double-process events if we don't take into account
the event NID. This happens because we can receive live events before
we have loaded the initial connection state (list of joined rooms). We
base the initial load around an event NID, so we need to make sure to
ignore any live streamed events which are <= the initial load NID.
We could alternatively have loaded the initial connection state and
/then/ registered to receive live events, but then we could drop
events in the gap between loading and making the register call,
which is arguably worse. We could slap a mutex around it all to atomically
do this, but this means that getting pushed new events is tied to
loading (potentially a lot of) state for a single Conn, increasing
lock contention.
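A minimal sketch of the guard, with illustrative names: the connection state records the event NID it was loaded at, and any live event at or below that position is ignored because the initial load already covered it.

```go
import "encoding/json"

// connState is a stand-in for the proxy's per-connection state.
type connState struct {
	loadPosition int64 // the event NID the initial load was based on
}

// onNewEvent is called for every live-streamed event.
func (c *connState) onNewEvent(eventNID int64, ev json.RawMessage) {
	if eventNID <= c.loadPosition {
		// Already covered by the initial load: processing it again would
		// double-process the event.
		return
	}
	c.apply(eventNID, ev)
}

// apply would update the connection's lists and room data; elided here.
func (c *connState) apply(eventNID int64, ev json.RawMessage) {}
```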
It's easier to roll forward than to roll backwards. Add a 'replaces_nid' field
on the events table which records which NID in the snapshot gets replaced, if any.
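A sketch of rolling a snapshot forward using `replaces_nid`, assuming snapshots are stored as arrays of event NIDs; the function name and the zero sentinel for "nothing replaced" are illustrative:

```go
// rollForward builds the next snapshot from the previous one: the NID named
// by replaces_nid (0 meaning "nothing replaced") is dropped and the new
// event's NID is appended.
func rollForward(snapshot []int64, newEventNID, replacesNID int64) []int64 {
	next := make([]int64, 0, len(snapshot)+1)
	for _, nid := range snapshot {
		if replacesNID != 0 && nid == replacesNID {
			continue // this (type, state_key) entry is superseded
		}
		next = append(next, nid)
	}
	return append(next, newEventNID)
}
```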