With a regression test. The behaviour is:
- Delete the connection, such that incoming requests will end up with M_UNKNOWN_POS
- The next request will then return HTTP 401.
This has knock-on effects:
- We no longer send HTTP 502 if /whoami returns 401; instead we return 401.
- When the token is expired (pollers get 401), the device is deleted from the DB.
Previously, we would not send unread count INCREASES to the client,
as we would expect the actual event update to wake up the client conn.
This was great because it meant the event+unread count arrived atomically
on the client. This was implemented as "parse unread counts first, then events".
However, this introduced a bug when there were >1 user in the same room. In this
scenario, one poller may get the event first, which would go through to the client.
The subsequent unread count update would then be dropped and not sent to the client.
This would just be an unfortunate UI bug if it weren't for the `by_notification_count`
and `by_notification_level` sort orders. Both of these sort operations use the unread counts
to determine room list ordering. This list would be updated on the server, but no
list operation would be sent to the client, causing the room lists to de-sync, and
resulting in incorrect DELETE/INSERT ops. This would manifest as duplicate rooms
on the room list.
In the process of fixing this, also fix a bug where typing notifications would not
always be sent to the client: they would only be sent when piggybacked onto other
updates, due to incorrect type switches.
Also fix another bug which prevented receipts from always being sent to the client.
This was caused by the extensions handler not checking if the receipt extension had
data to determine if it should return. This then interacted with an as-yet-unfixed bug
which cleared the extension on subsequent updates, causing the receipt to be lost entirely.
A fix for this will be inbound soon.
- Scope transaction IDs to the device ID (access token) rather
  than the user ID, as this more closely matches the spec.
- Batch up all transaction ID lookups for all rooms being returned
into a single query. Previously, we would sequentially call SELECT
n times, one per room being returned, which was taking lots of time
just due to RTTs to the database server (often this table is empty).
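A minimal sketch of the batched lookup, assuming a hypothetical `syncv3_txns` table with `device_id`, `event_id` and `txn_id` columns (not necessarily the real schema), using a postgres array parameter so all rooms are covered in one round trip:

```go
package txns

import (
	"github.com/jmoiron/sqlx"
	"github.com/lib/pq"
)

// TxnIDsForEvents returns event_id -> txn_id for the given device in a single
// query, instead of one SELECT per room being returned.
func TxnIDsForEvents(db *sqlx.DB, deviceID string, eventIDs []string) (map[string]string, error) {
	result := make(map[string]string, len(eventIDs))
	rows, err := db.Query(
		`SELECT event_id, txn_id FROM syncv3_txns
		 WHERE device_id = $1 AND event_id = ANY($2)`,
		deviceID, pq.StringArray(eventIDs),
	)
	if err != nil {
		return nil, err
	}
	defer rows.Close()
	for rows.Next() {
		var eventID, txnID string
		if err := rows.Scan(&eventID, &txnID); err != nil {
			return nil, err
		}
		result[eventID] = txnID
	}
	return result, rows.Err()
}
```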
- `Conn`s now expose a direct `OnUpdate(caches.Update)` function
for updates which concern a specific device ID.
- Add a bitset in `DeviceData` to indicate if the OTK or fallback keys were changed (see the sketch after the flow below).
- Pass through the affected `DeviceID` in `pubsub.V2DeviceData` updates.
- Remove `DeviceDataTable.SelectFrom` as it was unused.
- Refactor how the poller invokes `OnE2EEData`: it now only does this if
there are changes to OTK counts and/or fallback key types and/or device lists,
and _only_ sends those fields, setting the rest to the zero value.
- Remove noisy logging.
- Add `caches.DeviceDataUpdate` which has no data but serves to wake up the long poller.
- Only send OTK counts / fallback key types when they have changed, not constantly. This
  matches the behaviour described in MSC3884.
The entire flow now looks like:
- Poller notices a diff against the in-memory version of the OTK count and invokes `OnE2EEData`.
- Handler updates the device data table and sets the changed bit for OTK counts.
- The other handler gets the pubsub update, directly finds the `Conn` based on the `DeviceID`
  and invokes `OnUpdate(caches.DeviceDataUpdate)`.
- This update is handled by the E2EE extension which then pulls the data out from the database
and returns it.
- On initial connections, all OTK / fallback data is returned.
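A minimal sketch of the changed-bit idea, with assumed field and constant names (the real `DeviceData` in the proxy may differ):

```go
package sync2

// Bits recording which parts of the device's E2EE data changed since last read.
const (
	ChangedOTKCounts        = 1 << iota // bit 0: OTK counts changed
	ChangedFallbackKeyTypes             // bit 1: fallback key types changed
)

type DeviceData struct {
	UserID           string
	DeviceID         string
	OTKCounts        map[string]int
	FallbackKeyTypes []string
	ChangedBits      int
}

func (dd *DeviceData) SetOTKCountChanged()       { dd.ChangedBits |= ChangedOTKCounts }
func (dd *DeviceData) SetFallbackKeysChanged()   { dd.ChangedBits |= ChangedFallbackKeyTypes }
func (dd *DeviceData) OTKCountChanged() bool     { return dd.ChangedBits&ChangedOTKCounts != 0 }
func (dd *DeviceData) FallbackKeysChanged() bool { return dd.ChangedBits&ChangedFallbackKeyTypes != 0 }

// ClearChanged is called once the E2EE extension has read the data, so OTK counts /
// fallback key types are only sent when they have changed, matching MSC3884.
func (dd *DeviceData) ClearChanged() { dd.ChangedBits = 0 }
```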
Features:
- Add `typing` extension.
- Add `receipts` extension.
- Add comprehensive Prometheus `/metrics`, activated via `SYNCV3_PROM`.
- Add `SYNCV3_PPROF` support.
- Add `by_notification_level` sort order.
- Add `include_old_rooms` support.
- Add support for `$ME` and `$LAZY`.
- Add correct filtering when `*,*` is used as `required_state`.
- Add `num_live` to each room response to indicate how many timeline entries are live.
Bug fixes:
- Use a stricter comparison function on ranges: fixes an issue whereby unit tests fail on go1.19 due to a change in the sorting algorithm.
- Send back an `errcode` on HTTP errors (e.g. expired sessions).
- Remove `unsigned.txn_id` on insertion into the DB. Otherwise users would see other users' txn IDs :(
- Improve range delta algorithm: previously it didn't handle cases like `[0,20] -> [20,30]` and would panic.
- Send HTTP 400 for invalid range requests.
- Don't publish no-op unread counts, which just add extra noise.
- Fix leaking DB connections which could eventually consume all available connections.
- Ensure we always unblock `WaitUntilInitialSync`, even on invalid access tokens. Other code relies on `WaitUntilInitialSync()` actually returning at _some_ point: e.g. on startup we have N workers which bound the number of concurrent pollers made at any one time, so we must not hog a worker forever.
Improvements:
- Greatly improve startup times of sync3 handlers by improving `JoinedRoomsTracker`: a modest amount of data would take ~28s to create the handler, now it takes 4s.
- Massively improve initial v3 sync times by refactoring `JoinedRoomsTracker`: from ~47s to <1s.
- Add `SlidingSyncUntil...` in tests to reduce races.
- Tweak the API shape of `JoinedUsersForRoom` to reduce state block processing time for large rooms from 63s to 39s.
- Add trace task for initial syncs.
- Include the proxy version in UA strings.
- HTTP errors now wait 1s before returning to stop clients tight-looping on error.
- Pending event buffer is now 2000.
- Index the room ID first to cull the most events when returning timeline entries. Speeds up `SelectLatestEventsBetween` by a factor of 8.
- Remove cancelled `m.room_key_requests` from the to-device inbox. Cuts down the number of events in the inbox by ~94% for very large (20k+) inboxes, ~50% for moderately sized (200 events) inboxes. Adds book-keeping to remember the unacked to-device position for each client.
We don't care about that as they never form part of the timeline.
Also, only send up a `timeline_limit: 1` filter to sync v2 when there
is no `?since` token. Otherwise, we want a timeline limit >1 so we
can ensure that we remain gapless (else the proxy drops events).
- Completely ignore events in the `state` block when processing
sync v3 requests with a large `timeline_limit`. We should never
have been including them in the first place as they are not
chronological at all.
- Perform sync v2 requests with a timeline limit of 1 to ensure
we can always return a `prev_batch` token to the caller. This
means on the first startup, clicking a room will force a `/messages`
hit until there have been `$limit` new events, at which point the proxy
will be able to serve these events from the local DB. Critically,
this ensures that we never send back an empty `prev_batch`, which
causes clients to believe that there is no history in a room.
We can do this now because we store the access token for each device.
Throttled at 16 concurrent sync requests to avoid causing
thundering herds on startup.
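A minimal sketch of the 16-way throttle using a buffered-channel semaphore; the real implementation may use a different mechanism:

```go
package sync2

// At most 16 devices will be performing their initial v2 sync at any one time,
// to avoid a thundering herd against the homeserver on startup.
var initialSyncSemaphore = make(chan struct{}, 16)

// doInitialSync runs the supplied sync function once a slot is free.
func doInitialSync(sync func() error) error {
	initialSyncSemaphore <- struct{}{}        // acquire a slot (blocks when 16 are in flight)
	defer func() { <-initialSyncSemaphore }() // release the slot when done
	return sync()
}
```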
- Add `SYNCV3_SECRET` env var which is SHA256'd and used as an AES
key to encrypt/decrypt tokens.
- Add column `v2_token_encrypted` to `syncv3_sync2_devices`.
- Update unit tests to check encryption/decryption work.
This provides an extra layer of security in case the database is
compromised and real user access tokens are leaked. This forces
an attacker to obtain both the database table _and_ the secret
env var (which will typically be stored in secure storage e.g.
k8s secrets). Unfortunately, we need to have the access_token
in plaintext, so we cannot rely on password-style storage algorithms
like bcrypt/scrypt, which would be safer.
Fixes https://github.com/matrix-org/sliding-sync/issues/23
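A minimal sketch of the scheme, assuming AES-GCM with a random nonce and hex encoding; the proxy's exact cipher mode, encoding and function names may differ:

```go
package sync2

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
)

// keyFromSecret hashes SYNCV3_SECRET into a 32-byte AES-256 key.
func keyFromSecret(secret string) []byte {
	h := sha256.Sum256([]byte(secret))
	return h[:]
}

// encryptToken encrypts a v2 access token so that a leaked database alone is not
// enough to recover it; the attacker also needs SYNCV3_SECRET.
func encryptToken(secret, token string) (string, error) {
	block, err := aes.NewCipher(keyFromSecret(secret))
	if err != nil {
		return "", err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return "", err
	}
	// Prepend the nonce so decryptToken can recover it.
	return hex.EncodeToString(gcm.Seal(nonce, nonce, []byte(token), nil)), nil
}

// decryptToken reverses encryptToken: split off the nonce, then open the ciphertext.
func decryptToken(secret, encrypted string) (string, error) {
	raw, err := hex.DecodeString(encrypted)
	if err != nil {
		return "", err
	}
	block, err := aes.NewCipher(keyFromSecret(secret))
	if err != nil {
		return "", err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return "", err
	}
	if len(raw) < gcm.NonceSize() {
		return "", fmt.Errorf("ciphertext too short")
	}
	nonce, ciphertext := raw[:gcm.NonceSize()], raw[gcm.NonceSize():]
	plain, err := gcm.Open(nil, nonce, ciphertext, nil)
	if err != nil {
		return "", err
	}
	return string(plain), nil
}
```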
- Add `InvitesTable`.
- Allow invites to be sorted/searched the same as any other room by
  implementing `RoomMetadata` for the invite (though this is best effort
  as we don't have heroes).
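A minimal, hypothetical sketch of the best-effort metadata derivation from stripped `invite_state` events (function and type names are assumptions; the `m.room.name` / `m.room.canonical_alias` fallback is standard Matrix room naming):

```go
package sync3

import "encoding/json"

type strippedEvent struct {
	Type     string          `json:"type"`
	StateKey string          `json:"state_key"`
	Sender   string          `json:"sender"`
	Content  json.RawMessage `json:"content"`
}

// inviteRoomName picks a display name for an invited room from its stripped state:
// prefer m.room.name, fall back to m.room.canonical_alias, and finally the
// inviter's user ID (no heroes are available for invites).
func inviteRoomName(inviteState []strippedEvent, inviter string) string {
	var alias string
	for _, ev := range inviteState {
		switch ev.Type {
		case "m.room.name":
			var c struct {
				Name string `json:"name"`
			}
			if json.Unmarshal(ev.Content, &c) == nil && c.Name != "" {
				return c.Name
			}
		case "m.room.canonical_alias":
			var c struct {
				Alias string `json:"alias"`
			}
			if json.Unmarshal(ev.Content, &c) == nil && c.Alias != "" {
				alias = c.Alias
			}
		}
	}
	if alias != "" {
		return alias
	}
	return inviter
}
```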
Clients rely on transaction IDs coming down their /sync streams so they
can pair up an incoming event with an event they just sent but have not
yet got the event ID for.
The proxy has not historically handled this because of the shared work
model of operation, where we store exactly 1 copy of the event in the
database and no more. This means if Alice and Bob are running in the
same proxy, then Alice sends a message, Bob's /sync stream may get the
event first and that will NOT contain the `transaction_id`. This then
gets written into the database. Later, when Alice /syncs, she will not
get the `transaction_id` for the event which she sent.
This commit fixes this by having a TTL cache which maps (user, event)
-> txn_id. Transaction IDs are inherently ephemeral, so keeping the
last 5 minutes worth of txn IDs in-memory is an easy solution which
will be good enough for the proxy. Actual server implementations of
sliding sync will be able to trivially deal with this behaviour natively.
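A minimal sketch of such a TTL cache with assumed names; the real cache may use a library or evict differently:

```go
package caches

import (
	"sync"
	"time"
)

type txnEntry struct {
	txnID   string
	expires time.Time
}

// TxnIDCache maps (user, event) -> txn_id for a short window, so only the sender
// of an event sees their own transaction_id.
type TxnIDCache struct {
	mu      sync.Mutex
	entries map[string]txnEntry // key: userID + "\x00" + eventID
	ttl     time.Duration
}

func NewTxnIDCache() *TxnIDCache {
	return &TxnIDCache{entries: make(map[string]txnEntry), ttl: 5 * time.Minute}
}

func (c *TxnIDCache) Store(userID, eventID, txnID string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[userID+"\x00"+eventID] = txnEntry{txnID: txnID, expires: time.Now().Add(c.ttl)}
}

// Get returns the txn_id if it was stored within the TTL, deleting expired entries lazily.
func (c *TxnIDCache) Get(userID, eventID string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	key := userID + "\x00" + eventID
	e, ok := c.entries[key]
	if !ok || time.Now().After(e.expires) {
		delete(c.entries, key)
		return "", false
	}
	return e.txnID, true
}
```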
- Modify the API to instead have `WaitUntilInitialSync()` which is backed by a `WaitGroup`.
- Call this new function when a poller exists and hasn't been terminated. Previously,
we would assume that if a poller exists then it has done an initial sync, which may
not always be true. This could lead to position mismatches as a connection would be
re-created after EnsurePolling returned.
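A minimal sketch of the `WaitGroup`-backed API, with assumed struct and method names:

```go
package sync2

import "sync"

type Poller struct {
	// Add(1) when the poller is created; Done() after the first v2 sync
	// completes or the poller is terminated.
	wg sync.WaitGroup
}

func NewPoller() *Poller {
	p := &Poller{}
	p.wg.Add(1)
	return p
}

// OnInitialSyncComplete must be called exactly once, even on failure/termination,
// so that callers of WaitUntilInitialSync are always unblocked.
func (p *Poller) OnInitialSyncComplete() { p.wg.Done() }

// WaitUntilInitialSync blocks until the poller has performed its first v2 sync
// (or has terminated), rather than assuming an existing poller has already synced.
func (p *Poller) WaitUntilInitialSync() { p.wg.Wait() }
```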
- Persist OTK counts and device list changes in-memory per Poller.
- Expose a new `E2EEFetcher` to allow the E2EE extension code to
grab said E2EE data from the Poller.
- OTK counts are replaced outright.
- Device lists are updated in a `user_id -> changed|left` map which is then
  deleted when read (see the sketch after this list).
- Add tests for basic functionality and some edge cases like ensuring that
v3 request retries still return changed|left values.
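A minimal sketch of the per-Poller in-memory state described above (names are assumptions): OTK counts are replaced wholesale, and device list diffs are handed over and cleared when read:

```go
package sync2

import "sync"

type pollerE2EEState struct {
	mu              sync.Mutex
	otkCounts       map[string]int    // algorithm -> count, replaced outright each poll
	deviceListDiffs map[string]string // user_id -> "changed" | "left"
}

func (s *pollerE2EEState) UpdateOTKCounts(counts map[string]int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.otkCounts = counts
}

func (s *pollerE2EEState) AccumulateDeviceListChanges(changed, left []string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.deviceListDiffs == nil {
		s.deviceListDiffs = make(map[string]string)
	}
	for _, userID := range changed {
		s.deviceListDiffs[userID] = "changed"
	}
	for _, userID := range left {
		s.deviceListDiffs[userID] = "left"
	}
}

// DeviceListChanges returns the accumulated diffs and deletes them, so each
// change is only handed to the E2EE extension once.
func (s *pollerE2EEState) DeviceListChanges() map[string]string {
	s.mu.Lock()
	defer s.mu.Unlock()
	diffs := s.deviceListDiffs
	s.deviceListDiffs = nil
	return diffs
}
```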
This means we can serve rooms/events from the v3 database
immediately if they exist. The downside is that we still do
need to hit v2 to pull in to-device messages, but they can
come in later.
- Add `AccountDataTable` with tests.
- Read global and per-room account data from sync v2 and add new callbacks to the poller.
- Update the `SyncV3Handler` to persist account data from sync v2 then notify the user cache.
- Update the `UserCache` to update `UserRoomData.IsDM` status on `m.direct` events.
- Read `m.direct` event from the DB when `UserCache` is created to track DM status per-room.
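A minimal sketch of deriving per-room DM status from `m.direct` content (the `user_id -> [room_id]` shape is per the Matrix spec; the function name is hypothetical):

```go
package caches

import "encoding/json"

// DMRoomsFromMDirect returns the set of room IDs which are DMs according to the
// given m.direct account data content.
func DMRoomsFromMDirect(mDirectContent []byte) (map[string]bool, error) {
	var content map[string][]string // user_id -> list of room IDs
	if err := json.Unmarshal(mDirectContent, &content); err != nil {
		return nil, err
	}
	dmRooms := make(map[string]bool)
	for _, roomIDs := range content {
		for _, roomID := range roomIDs {
			dmRooms[roomID] = true // UserCache then sets UserRoomData.IsDM = true
		}
	}
	return dmRooms, nil
}
```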
- Only have a single database for all tests, like CI.
- Calling `PrepareDBConnectionString` drops all tables before returning
the string.
- Tests must be run with no concurrency, else they will step on each other
  due to the previous point.
This should prevent cases where local tests pass but CI fails.
Document a nasty race condition which can happen if >1 user is joined
to the same room. Fix it to ensure that `GlobalCache` always stays
in sync with the database without having to hit the database.