18 Commits

Author SHA1 Message Date
Kegan Dougal
fdbebaea68 Some review comments; swap to UPDATE..RETURNING 2024-05-20 08:22:48 +01:00
Kegan Dougal
b383ed0d82 Add migrations and refactor internal structs 2024-05-17 13:45:14 +01:00
Kegan Dougal
2cd9a81ab2 Add DeviceListTable
Shift over unit tests from DeviceDataTable
2024-05-17 09:37:38 +01:00
Kegan Dougal
f564f2d774 Redo the device data unit tests
They were A) confusing and B) testing the wrong thing as they got refactored
after the bad refactor of the source code.
2024-05-09 15:32:21 +01:00
David Robertson
3150c17cde
Test helper drive-by comment 2023-09-13 19:17:53 +01:00
Kegan Dougal
acf9748fe8 Remove nonsense tests now that JSONB is no more 2023-08-14 19:03:43 +01:00
Kegan Dougal
97113aa1a6 Test upsert as well 2023-08-11 10:55:38 +01:00
Kegan Dougal
651c27bdda Add test cases for bad data in the DB 2023-08-11 10:05:12 +01:00
Till Faelligen
9c6d620347
DeviceID can stay 2023-08-10 10:46:07 +02:00
Till Faelligen
721c3f80bd
Replace Bob with 💣 to validate we can upsert/swap 2023-08-10 10:44:47 +02:00
Till
079c6699bd
Update state/device_data_table_test.go
Co-authored-by: kegsay <kegan@matrix.org>
2023-08-04 15:32:35 +02:00
Till Faelligen
1fe920d1b6
Fix tests, as they weren't actually checking DeviceLists 2023-08-04 14:39:53 +02:00
Till Faelligen
ee714ea95e
Set a fillfactor of 90%, use DeepEqual instead of marshal -> bytes.Equal 2023-07-26 08:56:29 +02:00
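A minimal sketch of how the two comparison styles named in ee714ea95e can disagree. The commit does not say why the tests were switched, so this shows only one possible difference: `encoding/json` skips fields tagged `json:"-"`, while `reflect.DeepEqual` compares every field. The `deviceData` struct and both helper functions are hypothetical and are not taken from the repository.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"reflect"
)

// deviceData is a hypothetical stand-in for the struct under test.
type deviceData struct {
	UserID      string         `json:"user_id"`
	OTKCounts   map[string]int `json:"otk_counts"`
	ChangedBits int            `json:"-"` // never serialised by encoding/json
}

// equalByMarshal compares two values by their JSON encodings.
func equalByMarshal(a, b deviceData) bool {
	ja, errA := json.Marshal(a)
	jb, errB := json.Marshal(b)
	return errA == nil && errB == nil && bytes.Equal(ja, jb)
}

// equalByDeepEqual compares two values field by field.
func equalByDeepEqual(a, b deviceData) bool {
	return reflect.DeepEqual(a, b)
}

func main() {
	a := deviceData{
		UserID:      "@alice:example.org",
		OTKCounts:   map[string]int{"signed_curve25519": 50},
		ChangedBits: 1,
	}
	b := a
	b.ChangedBits = 0 // differs only in the field that JSON skips

	fmt.Println("marshal + bytes.Equal:", equalByMarshal(a, b))
	fmt.Println("reflect.DeepEqual:   ", equalByDeepEqual(a, b))
}
```

In this sketch the marshal-based check reports the two values as equal even though one field differs, which is the kind of gap a test would want to avoid.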
Kegan Dougal
a6c3f8f3fc When a device is deleted, remove all device data with it (to-device events, device lists) 2023-03-01 16:56:04 +00:00
Kegan Dougal
a7eed93722 Add comprehensive regression test for GlobalSnapshot(); ensure we clear db conns when tests end 2023-01-18 14:54:26 +00:00
Kegan Dougal
6c4f7d3722 improvement: completely refactor device data updates
- `Conn`s now expose a direct `OnUpdate(caches.Update)` function
  for updates which concern a specific device ID.
- Add a bitset in `DeviceData` to indicate if the OTK or fallback keys were changed.
- Pass through the affected `DeviceID` in `pubsub.V2DeviceData` updates.
- Remove `DeviceDataTable.SelectFrom` as it was unused.
- Refactor how the poller invokes `OnE2EEData`: it now only does this if
  there are changes to OTK counts and/or fallback key types and/or device lists,
  and _only_ sends those fields, setting the rest to the zero value.
- Remove noisy logging.
- Add `caches.DeviceDataUpdate` which has no data but serves to wake up the long poller.
- Only send OTK counts / fallback key types when they have changed, not constantly. This
  matches the behaviour described in MSC3884.

The entire flow now looks like:
- Poller notices a diff against the in-memory version of the OTK count and invokes `OnE2EEData`.
- Handler updates the device data table and bumps the changed bit for the OTK count.
- The other handler gets the pubsub update, finds the `Conn` directly from the `DeviceID`, and
  invokes `OnUpdate(caches.DeviceDataUpdate)`.
- This update is handled by the E2EE extension which then pulls the data out from the database
  and returns it.
- On initial connections, all OTK / fallback data is returned.
2022-12-22 15:08:42 +00:00
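A minimal sketch of the changed-bit idea described in 6c4f7d3722 above: a bitset records which E2EE fields changed, and a response carries only those fields unless the connection is new. Every name here (`bitOTKCount`, `bitFallbackKeys`, `deviceData`, `e2eeResponse`, `setBit`, `isSet`, `buildResponse`) is an assumption made for illustration; the commit message does not show the real identifiers or bit layout.

```go
package main

import "fmt"

// Hypothetical bit positions for the "changed" bitset.
const (
	bitOTKCount = iota
	bitFallbackKeys
)

// deviceData is a hypothetical stored row for one device.
type deviceData struct {
	OTKCounts        map[string]int
	FallbackKeyTypes []string
	ChangedBits      int
}

// e2eeResponse holds what is sent to the client; unchanged fields stay zero.
type e2eeResponse struct {
	OTKCounts        map[string]int
	FallbackKeyTypes []string
}

func setBit(bits, pos int) int { return bits | (1 << pos) }
func isSet(bits, pos int) bool { return bits&(1<<pos) != 0 }

// buildResponse returns everything on an initial connection,
// otherwise only the fields whose changed bit is set.
func buildResponse(dd deviceData, initial bool) e2eeResponse {
	var res e2eeResponse
	if initial || isSet(dd.ChangedBits, bitOTKCount) {
		res.OTKCounts = dd.OTKCounts
	}
	if initial || isSet(dd.ChangedBits, bitFallbackKeys) {
		res.FallbackKeyTypes = dd.FallbackKeyTypes
	}
	return res
}

func main() {
	dd := deviceData{
		OTKCounts:        map[string]int{"signed_curve25519": 49},
		FallbackKeyTypes: []string{"signed_curve25519"},
	}
	// The poller noticed a new OTK count, so only that bit is bumped.
	dd.ChangedBits = setBit(dd.ChangedBits, bitOTKCount)

	fmt.Printf("live update: %+v\n", buildResponse(dd, false))
	fmt.Printf("initial:     %+v\n", buildResponse(dd, true))
}
```

The sketch leaves out clearing the bits after a response is sent, which a real implementation would also need to handle.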
Kegan Dougal
aa28df161c Rename package -> github.com/matrix-org/sliding-sync 2022-12-15 11:08:50 +00:00
Kegan Dougal
be8543a21a add extensions for typing and receipts; bugfixes and additional perf improvements
Features:
 - Add `typing` extension.
 - Add `receipts` extension.
 - Add comprehensive prometheus `/metrics` activated via `SYNCV3_PROM`.
 - Add `SYNCV3_PPROF` support.
 - Add `by_notification_level` sort order.
 - Add `include_old_rooms` support.
 - Add support for `$ME` and `$LAZY`.
 - Add correct filtering when `*,*` is used as `required_state`.
 - Add `num_live` to each room response to indicate how many timeline entries are live.

Bug fixes:
 - Use a stricter comparison function on ranges: fixes an issue whereby unit tests fail on go1.19 due to a change in the sorting algorithm.
 - Send back an `errcode` on HTTP errors (e.g. expired sessions).
 - Remove `unsigned.txn_id` on insertion into the DB. Otherwise other users would see other users' txn IDs :(
 - Improve range delta algorithm: previously it didn't handle cases like `[0,20] -> [20,30]` and would panic.
 - Send HTTP 400 for invalid range requests.
 - Don't publish no-op unread counts, which just add extra noise.
 - Fix leaking DB connections which could eventually consume all available connections.
 - Ensure we always unblock `WaitUntilInitialSync` even on invalid access tokens. Other code relies on `WaitUntilInitialSync()` actually returning at _some_ point: e.g. on startup, N workers bound the number of concurrent pollers created at any one time, so a poller must not hog a worker forever.
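
A minimal sketch of a delta between two inclusive index ranges, covering the `[0,20] -> [20,30]` case mentioned in the bug-fix list above. This is not the proxy's algorithm, which the commit message does not show; the `span` type and `rangeDelta` function are hypothetical, and the sketch handles a single old range and a single new range only.

```go
package main

import "fmt"

// span is a hypothetical inclusive index range [lo, hi].
type span struct{ lo, hi int64 }

// rangeDelta reports which indexes enter and leave the window
// when it moves from prev to next.
func rangeDelta(prev, next span) (added, removed []span) {
	// No overlap at all: everything old leaves, everything new enters.
	if next.lo > prev.hi || next.hi < prev.lo {
		return []span{next}, []span{prev}
	}
	// Overlapping: only the parts outside the intersection change.
	if next.lo < prev.lo {
		added = append(added, span{next.lo, prev.lo - 1})
	}
	if next.hi > prev.hi {
		added = append(added, span{prev.hi + 1, next.hi})
	}
	if prev.lo < next.lo {
		removed = append(removed, span{prev.lo, next.lo - 1})
	}
	if prev.hi > next.hi {
		removed = append(removed, span{next.hi + 1, prev.hi})
	}
	return added, removed
}

func main() {
	// The ranges touch at index 20 only, so 20 stays in the window.
	added, removed := rangeDelta(span{0, 20}, span{20, 30})
	fmt.Println("added:", added, "removed:", removed)
}
```

With these inputs, index 20 is retained, 21 to 30 are added and 0 to 19 are removed.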

Improvements:
 - Greatly improve startup times of sync3 handlers by improving `JoinedRoomsTracker`: a modest amount of data would take ~28s to create the handler, now it takes 4s.
 - Massively improve initial v3 sync times, by refactoring `JoinedRoomsTracker`, from ~47s to <1s.
 - Add `SlidingSyncUntil...` in tests to reduce races.
 - Tweak the API shape of JoinedUsersForRoom to reduce state block processing time for large rooms from 63s to 39s.
 - Add trace task for initial syncs.
 - Include the proxy version in UA strings.
 - HTTP errors now wait 1s before returning to stop clients tight-looping on error.
 - Pending event buffer is now 2000.
 - Index the room ID first to cull the most events when returning timeline entries. Speeds up `SelectLatestEventsBetween` by a factor of 8.
 - Remove cancelled `m.room_key_request` events from the to-device inbox. Cuts down the number of events in the inbox by ~94% for very large (20k+) inboxes, ~50% for moderate sized (200 events) inboxes. Adds book-keeping to remember the unacked to-device position for each client.
2022-12-14 18:53:55 +00:00
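A minimal sketch of the to-device inbox clean-up listed under Improvements in be8543a21a above. The `action` values `request` and `request_cancellation`, plus the `request_id` and `requesting_device_id` fields, come from the Matrix `m.room_key_request` event. The `keyRequest` struct, the `dropCancelled` function and the choice to drop both the request and its cancellation are assumptions, not the proxy's actual code.

```go
package main

import "fmt"

// keyRequest is a hypothetical, flattened view of an
// m.room_key_request to-device event.
type keyRequest struct {
	Sender    string
	DeviceID  string // requesting_device_id
	RequestID string // request_id
	Action    string // "request" or "request_cancellation"
}

// dropCancelled removes every request that has a matching cancellation,
// along with the cancellation itself, preserving inbox order.
func dropCancelled(inbox []keyRequest) []keyRequest {
	type key struct{ sender, device, id string }
	cancelled := make(map[key]bool)
	for _, ev := range inbox {
		if ev.Action == "request_cancellation" {
			cancelled[key{ev.Sender, ev.DeviceID, ev.RequestID}] = true
		}
	}
	var kept []keyRequest
	for _, ev := range inbox {
		if cancelled[key{ev.Sender, ev.DeviceID, ev.RequestID}] {
			continue // the pair cancels out, so the client needs neither
		}
		kept = append(kept, ev)
	}
	return kept
}

func main() {
	inbox := []keyRequest{
		{"@bob:example.org", "DEV1", "r1", "request"},
		{"@bob:example.org", "DEV1", "r1", "request_cancellation"},
		{"@bob:example.org", "DEV1", "r2", "request"},
	}
	for _, ev := range dropCancelled(inbox) {
		fmt.Println(ev.RequestID, ev.Action)
	}
}
```

Only the uncancelled request `r2` survives in this example; a real inbox would also contain other event types that must pass through untouched.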