Features:
- Add `typing` extension.
- Add `receipts` extension.
- Add comprehensive Prometheus `/metrics`, activated via `SYNCV3_PROM`.
- Add `SYNCV3_PPROF` support.
- Add `by_notification_level` sort order.
- Add `include_old_rooms` support.
- Add support for `$ME` and `$LAZY`.
- Add correct filtering when `*,*` is used as `required_state` (see the request sketch after this list).
- Add `num_live` to each room response to indicate how many timeline entries are live.
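
As a rough illustration of how several of these features fit together in a single list request. Field names follow MSC3575; the exact request structs in this repo may differ, so treat this as a sketch:

```go
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// One sliding sync list making use of the new filters and sorts. Field names
	// follow MSC3575; the concrete request structs in this repo may differ.
	list := map[string]interface{}{
		"ranges": [][2]int{{0, 20}},
		// Sort rooms with unread notifications/highlights first.
		"sort": []string{"by_notification_level", "by_recency"},
		"required_state": [][2]string{
			{"*", "*"},                 // all state events (now filtered correctly)
			{"m.room.member", "$ME"},   // the requesting user's own membership event
			{"m.room.member", "$LAZY"}, // lazy-load members for timeline senders
		},
		// Also pull in predecessor rooms from upgrade chains.
		"include_old_rooms": map[string]interface{}{
			"timeline_limit": 0,
			"required_state": [][2]string{{"m.room.create", ""}},
		},
	}
	b, _ := json.MarshalIndent(list, "", "  ")
	fmt.Println(string(b))
}
```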
Bug fixes:
- Use a stricter comparison function on ranges: fixes an issue whereby unit tests fail on Go 1.19 due to a change in its sorting algorithm.
- Send back an `errcode` on HTTP errors (e.g. expired sessions).
- Remove `unsigned.txn_id` on insertion into the DB, otherwise users could see each other's transaction IDs :(
- Improve the range delta algorithm: previously it didn't handle cases like `[0,20] -> [20,30]` and would panic (see the sketch after this list).
- Send HTTP 400 for invalid range requests.
- Don't publish no-op unread counts, which just add extra noise.
- Fix leaking DB connections which could eventually consume all available connections.
- Ensure we always unblock `WaitUntilInitialSync`, even on invalid access tokens. Other code relies on `WaitUntilInitialSync()` actually returning at _some_ point: for example, on startup we have N workers which bound the number of concurrent pollers at any one time, and we must not hog a worker forever.
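
The range delta referred to above is, conceptually, a set difference over the indexes covered by the old and new ranges. A minimal sketch of that idea, assuming inclusive `[start,end]` pairs (the real algorithm in this repo is more involved):

```go
package main

import "fmt"

// indexesIn expands a set of inclusive [start,end] ranges into a set of indexes.
func indexesIn(ranges [][2]int64) map[int64]bool {
	set := make(map[int64]bool)
	for _, r := range ranges {
		for i := r[0]; i <= r[1]; i++ {
			set[i] = true
		}
	}
	return set
}

// delta returns the indexes newly added and removed when moving from oldRanges
// to newRanges, handling overlapping cases like [0,20] -> [20,30].
func delta(oldRanges, newRanges [][2]int64) (added, removed []int64) {
	oldSet, newSet := indexesIn(oldRanges), indexesIn(newRanges)
	for i := range newSet {
		if !oldSet[i] {
			added = append(added, i)
		}
	}
	for i := range oldSet {
		if !newSet[i] {
			removed = append(removed, i)
		}
	}
	return
}

func main() {
	// The previously-panicking case: [0,20] -> [20,30]. Index 20 overlaps.
	added, removed := delta([][2]int64{{0, 20}}, [][2]int64{{20, 30}})
	fmt.Println(len(added), len(removed)) // 10 added (21-30), 20 removed (0-19)
}
```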
Improvements:
- Greatly improve startup times of sync3 handlers by improving `JoinedRoomsTracker`: previously a modest amount of data took ~28s to create the handler; it now takes 4s.
- Massively improve initial v3 sync times by refactoring `JoinedRoomsTracker`: from ~47s to <1s.
- Add `SlidingSyncUntil...` in tests to reduce races.
- Tweak the API shape of JoinedUsersForRoom to reduce state block processing time for large rooms from 63s to 39s.
- Add trace task for initial syncs.
- Include the proxy version in UA strings.
- HTTP errors now wait 1s before returning to stop clients tight-looping on error.
- Increase the pending event buffer size to 2000.
- Index the room ID first to cull the most events when returning timeline entries. Speeds up `SelectLatestEventsBetween` by a factor of 8.
- Remove cancelled `m.room_key_requests` from the to-device inbox. Cuts down the amount of events in the inbox by ~94% for very large (20k+) inboxes, ~50% for moderate sized (200 events) inboxes. Adds book-keeping to remember the unacked to-device position for each client.
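
In sketch form, the cancellation culling pairs each `m.room_key_request` with a later `request_cancellation` for the same `request_id` and drops both. The types below are simplified stand-ins; the real code works on raw JSON and also has to persist the unacked to-device position per client:

```go
package main

import "fmt"

// toDeviceEvent is a simplified view of a to-device event.
type toDeviceEvent struct {
	Type      string
	Action    string // "request" or "request_cancellation" for m.room_key_request
	RequestID string
}

// dropCancelledKeyRequests removes m.room_key_request events whose request_id was
// cancelled, along with the cancellation events themselves.
func dropCancelledKeyRequests(events []toDeviceEvent) []toDeviceEvent {
	cancelled := make(map[string]bool)
	for _, ev := range events {
		if ev.Type == "m.room_key_request" && ev.Action == "request_cancellation" {
			cancelled[ev.RequestID] = true
		}
	}
	var out []toDeviceEvent
	for _, ev := range events {
		if ev.Type == "m.room_key_request" && cancelled[ev.RequestID] {
			continue // drop both the request and its cancellation
		}
		out = append(out, ev)
	}
	return out
}

func main() {
	events := []toDeviceEvent{
		{Type: "m.room_key_request", Action: "request", RequestID: "a"},
		{Type: "m.room_key_request", Action: "request_cancellation", RequestID: "a"},
		{Type: "m.room.encrypted"},
	}
	fmt.Println(len(dropCancelledKeyRequests(events))) // 1
}
```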
Track joined/invited member counts in the global room metadata so clients can accurately calculate the push rule:
```
{"kind":"room_member_count","is":"2"}
```
Also fixed a bug in the global room metadata where the joined/invited counts could be wrong because Synapse sends duplicate join events and we were tracking +-1 deltas. We now calculate these counts based on the set of user IDs in a specific membership state.
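
A sketch of counting from the membership set rather than applying deltas, which is naturally idempotent when Synapse sends a duplicate join (types are illustrative only):

```go
package main

import "fmt"

// memberships maps user ID -> current membership ("join", "invite", "leave", ...).
type memberships map[string]string

// counts derives joined/invited counts from the membership set, so processing the
// same join event twice cannot skew the numbers the way a +1/-1 delta could.
func counts(m memberships) (joined, invited int) {
	for _, membership := range m {
		switch membership {
		case "join":
			joined++
		case "invite":
			invited++
		}
	}
	return
}

func main() {
	m := memberships{}
	m["@alice:example.com"] = "join"
	m["@alice:example.com"] = "join" // duplicate join event: still one entry
	m["@bob:example.com"] = "invite"
	j, i := counts(m)
	fmt.Println(j, i) // 1 1
}
```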
Fix https://github.com/matrix-org/sliding-sync/issues/23: this was caused by us not updating the `CanonicalisedName`, which is what we use to sort on, when the room name changed. This field is a bit of an oddity because it lived outside the user/global cache fields: it is a value calculated from global cache data in the scope of a user, whereas other user cache values are derived directly from specific data (notif counts, DM-ness). This is a silly distinction however, since spaces are derived from global room data as well, so `CanonicalisedName` now lives in the UserCache and is kept updated when the room name changes.
Longer term: we need to clean this up so that only the user cache is responsible for updating user cache fields, and connstate treats user room data and global room data as immutable. This is _mostly_ true today, but not always, and it causes headaches. In addition, it looks like we maintain O(n) caches based on the number of lists the user has made: we needn't do this, and should lean much more heavily on `s.allRooms`, just keeping pointers into this slice from whatever lists the user requests.
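
In sketch form, the intent is for the user cache to own this derived sort key and refresh it whenever the global room name data changes. Names and the canonicalisation step below are illustrative, not the exact APIs in this repo:

```go
package main

import (
	"fmt"
	"strings"
)

// Illustrative only: the real UserCache/GlobalCache types in this repo differ.
type userRoomData struct {
	CanonicalisedName string // the sort key derived from the calculated room name
}

type userCache struct {
	rooms map[string]*userRoomData
}

// onRoomNameChanged recalculates the per-user sort key whenever the global room
// name data changes, so the sort order can no longer go stale.
func (c *userCache) onRoomNameChanged(roomID, calculatedName string) {
	urd, ok := c.rooms[roomID]
	if !ok {
		urd = &userRoomData{}
		c.rooms[roomID] = urd
	}
	// Canonicalise in whatever way the sort comparator expects; lowercasing is
	// just a stand-in here.
	urd.CanonicalisedName = strings.ToLower(calculatedName)
}

func main() {
	c := &userCache{rooms: map[string]*userRoomData{}}
	c.onRoomNameChanged("!room:example.com", "My Room")
	fmt.Println(c.rooms["!room:example.com"].CanonicalisedName) // my room
}
```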
- Added an `InvitesTable`.
- Allow invites to be sorted/searched the same as any other room by implementing RoomMetadata for the invite (though this is best-effort as we don't have heroes).
A separate `caches` package is required to avoid import loops: `sync3` depends on `extensions` because the extension type is embedded in `sync3.Request`, but extensions need to know about the caches for account data to function correctly.
Previously the 'initial' flag was set on the entire response, which was pointless as the client can detect this from the presence/absence of `?pos=`. Instead, the `initial` flag now lives on the Room object, so clients know when to replace or update their databases. This is used to fix a bug where the timeline would render incorrectly when clicking on rooms, because we appended to the timeline when the room subscription data was collected.
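
A sketch of how a client might use the per-room flag. The `initial` field is as described above; the client-side types here are made up:

```go
package main

import "fmt"

// room is a trimmed-down view of a sliding sync room response.
type room struct {
	Initial  bool     `json:"initial,omitempty"`
	Timeline []string `json:"timeline"`
}

// applyRoom decides whether to replace or extend the locally stored timeline
// based on the per-room initial flag, rather than a response-wide one.
func applyRoom(local []string, r room) []string {
	if r.Initial {
		// First time this room is sent: replace whatever the client had,
		// instead of appending and duplicating events.
		return append([]string(nil), r.Timeline...)
	}
	return append(local, r.Timeline...)
}

func main() {
	local := []string{"$old1", "$old2"}
	local = applyRoom(local, room{Initial: true, Timeline: []string{"$a", "$b"}})
	fmt.Println(local) // [$a $b]
}
```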
`sync3` contains data structures and logic which are very isolated and testable (think `ConnMap`, `Room`, `Request`, `SortableRooms`, etc), whereas `sync3/handler` contains control flow which calls into `sync3` data structures.
This has numerous benefits:
- Gnarly complicated structs like `ConnState` are now more isolated
from the codebase, forcing better API design on `sync3` structs.
- The inability to do import cycles forces structs in `sync3` to remain
simple: they cannot pull in control flow logic from `sync3/handler`
without causing a compile error.
- It's significantly easier for new developers to figure out where to start looking for
  the code that executes when a new request is received.
- It reduces the number of things that `ConnState` can touch. Previously
  we were gut-wrenching out of convenience, but now we're forced to move
  more logic from `ConnState` into `sync3` (depending on the API design).
  For example, adding `SortableRooms.RoomIDs()`.
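
A sketch of the kind of API this split forces (method bodies and the `connState` shape here are illustrative): the handler asks `SortableRooms` for what it needs rather than reaching into its internals.

```go
package main

import "fmt"

// SortableRooms lives in sync3: a small, testable data structure.
type SortableRooms struct {
	roomIDs []string
}

// RoomIDs exposes just the ordered room IDs, so callers in sync3/handler don't
// need to know how the rooms are stored or sorted internally.
func (s *SortableRooms) RoomIDs() []string {
	return append([]string(nil), s.roomIDs...)
}

// connState stands in for the control-flow code in sync3/handler.
type connState struct {
	rooms *SortableRooms
}

func (c *connState) visibleRooms() []string {
	return c.rooms.RoomIDs() // no gut-wrenching into SortableRooms internals
}

func main() {
	c := &connState{rooms: &SortableRooms{roomIDs: []string{"!a", "!b"}}}
	fmt.Println(c.visibleRooms())
}
```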
RoomMetadata stores the current invite/join counts, heroes for the
room, most recent timestamp, name event content, canonical alias, etc.
This information is consistent across all users, so it can be globally
cached for future use. Make ConnState call CalculateRoomName with
RoomMetadata to run the name algorithm.
This is *almost* complete, but as there are no Heroes in the
metadata yet, things don't quite render correctly.
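
A shape sketch of the idea. Field names are illustrative and the real room-name algorithm (per the Matrix client spec) is more involved than this:

```go
package main

import "fmt"

// RoomMetadata holds room data that is identical for every user, so it can be
// cached globally. Field names here are illustrative.
type RoomMetadata struct {
	JoinCount            int
	InviteCount          int
	Heroes               []string // not populated yet, hence rendering issues
	LastMessageTimestamp uint64
	NameEvent            string // content of m.room.name, if any
	CanonicalAlias       string
}

// CalculateRoomName runs a (greatly simplified) room-name algorithm against the
// globally cached metadata, in the scope of a particular user.
func CalculateRoomName(m *RoomMetadata, userID string) string {
	if m.NameEvent != "" {
		return m.NameEvent
	}
	if m.CanonicalAlias != "" {
		return m.CanonicalAlias
	}
	if len(m.Heroes) == 0 {
		return "Empty Room" // without heroes we cannot do better, as noted above
	}
	return fmt.Sprintf("%s and %d others", m.Heroes[0], m.JoinCount-1)
}

func main() {
	fmt.Println(CalculateRoomName(&RoomMetadata{CanonicalAlias: "#room:example.com"}, "@me:example.com"))
}
```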
Keep the user cache pure (not dependent on `state.Storage`) to make testing
easier. The responsibility for fanning out user cache updates
lies with the Handler, as it generally deals with glue code.
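
A sketch of that fan-out glue, with illustrative names and a single update type to keep it small:

```go
package main

import "fmt"

// userCache holds per-user derived data; it deliberately has no dependency on
// state.Storage so it can be unit tested with plain values.
type userCache struct {
	unreadCounts map[string]int
}

func (c *userCache) OnUnreadCount(roomID string, count int) {
	c.unreadCounts[roomID] = count
}

// handler is the glue code: it owns storage/pollers and fans updates out to the
// user caches for the users affected.
type handler struct {
	caches map[string]*userCache
}

func (h *handler) onNewUnreadCount(userID, roomID string, count int) {
	if c, ok := h.caches[userID]; ok {
		c.OnUnreadCount(roomID, count)
	}
}

func main() {
	h := &handler{caches: map[string]*userCache{
		"@alice:example.com": {unreadCounts: map[string]int{}},
	}}
	h.onNewUnreadCount("@alice:example.com", "!room:example.com", 3)
	fmt.Println(h.caches["@alice:example.com"].unreadCounts["!room:example.com"]) // 3
}
```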