22 Commits

Author SHA1 Message Date
Kegan Dougal
6d6a2d6c08 Add a sensible timeline_limit cap
To avoid pathological cases where large timeline limits are requested.
2024-02-21 13:22:20 +00:00
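
A minimal sketch of what such a cap could look like. The name `clampTimelineLimit` and the value of `maxTimelineLimit` are assumptions for illustration; the commit message does not state the real cap or where it is applied.

```go
package main

import "fmt"

// maxTimelineLimit is a hypothetical cap. The commit does not say what the
// real value is.
const maxTimelineLimit = 50

// clampTimelineLimit bounds a client-requested timeline_limit to the range
// [0, maxTimelineLimit] so that very large requests cannot force huge loads.
func clampTimelineLimit(requested int) int {
	if requested < 0 {
		return 0
	}
	if requested > maxTimelineLimit {
		return maxTimelineLimit
	}
	return requested
}

func main() {
	for _, requested := range []int{1, 20, 100000} {
		fmt.Printf("requested=%d effective=%d\n", requested, clampTimelineLimit(requested))
	}
}
```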
Kegan Dougal
c868e540db bugfix: handle malformed state/timeline responses
Specifically, look for the create event in the timeline as this
has been seen in the wild on Synapse. Fixes #367.
2023-11-08 10:36:36 +00:00
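
The fallback described above might be sketched as follows: check the state block for the create event first, then look in the timeline. The `event` type and the function names are hypothetical and heavily simplified; they are not taken from the proxy's code.

```go
package main

import "fmt"

// event is a hypothetical, trimmed-down event shape for illustration only.
type event struct {
	ID       string
	Type     string
	StateKey *string // nil for non-state events
}

// isCreateEvent reports whether ev is the room's m.room.create state event,
// which has an empty state key.
func isCreateEvent(ev event) bool {
	return ev.Type == "m.room.create" && ev.StateKey != nil && *ev.StateKey == ""
}

// findCreateEvent searches the state block first and falls back to the
// timeline, covering responses where the create event only appears there.
func findCreateEvent(state, timeline []event) (event, bool) {
	for _, ev := range state {
		if isCreateEvent(ev) {
			return ev, true
		}
	}
	for _, ev := range timeline {
		if isCreateEvent(ev) {
			return ev, true
		}
	}
	return event{}, false
}

func main() {
	emptyKey := ""
	state := []event{{ID: "$name", Type: "m.room.name", StateKey: &emptyKey}}
	timeline := []event{
		{ID: "$create", Type: "m.room.create", StateKey: &emptyKey},
		{ID: "$msg", Type: "m.room.message"},
	}
	if ev, ok := findCreateEvent(state, timeline); ok {
		fmt.Println("found create event:", ev.ID)
	}
}
```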
Kegan Dougal
d3285a39f1 Always ensure integ tests send a timeline event when sending state 2023-08-22 16:10:52 +01:00
David Robertson
9a787d08ab Add extra integration test 2023-08-01 17:02:21 +01:00
David Robertson
7c5442d7e8 Integration test review comments 2023-07-28 18:55:18 +01:00
David Robertson
d0067008e1 Fix new integration test timing 2023-07-27 13:03:14 +01:00
David Robertson
142290fa0e WIP make test pass?? 2023-07-25 19:32:33 +01:00
David Robertson
a8253759c7 Reproduce the problem 2023-07-25 19:08:09 +01:00
David Robertson
9982ab24ee Use different device names for each test 2023-05-15 19:14:09 +01:00
kegsay
53f6d5e4f2 Merge pull request #63 from matrix-org/kegan/missing-msg-fix
bugfix: fix #56 by sending the correct NID with each live event
2023-04-11 17:21:00 +01:00
Kegan Dougal
6293edc98a Correct SYNC range 2023-04-11 17:11:30 +01:00
Kegan Dougal
74c9f77e5d bugfix: fix #56 by sending the correct NID with each live event
Early versions of the proxy tended to send a list of event JSONs
and the "latest" NID for that batch. This then interacted badly
with later code which used these NIDs to determine if the event
in question should be returned to the client or not. We sometimes
filter events out when the "initial" room has already included
them. For example, if a room has msgs A,B,C which were pulled in
initially via a DB call and we then receive C down as an update, we
should not include it, else we will send back A,B,C,C. By only
sending the "latest" NID, we would filter out other events in
that batch as they are <= the previously seen latest NID.

This was not tested in E2E tests because it relies on slow pollers
which cause >1 timeline event for a single room to arrive. This
may be a cause of flaky tests. We now have an integration test for
this which injects batches of events for the same room and ensures
they are all seen down the connection.

Thanks to @jplatte and @manuroe for helping debug this.
2023-04-11 16:57:44 +01:00
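
A simplified model of the failure mode, under the assumption that the filter is a plain "NID must be greater than the last seen NID" check. The `liveEvent` type and `deliver` function are hypothetical and only illustrate why tagging every event with the batch's latest NID drops events.

```go
package main

import "fmt"

// liveEvent is a hypothetical pairing of an event ID with the NID it was
// tagged with when sent down as a live update.
type liveEvent struct {
	ID  string
	NID int64
}

// deliver returns the IDs of events that pass the "newer than last seen"
// check, advancing lastSeen as each event is accepted.
func deliver(events []liveEvent, lastSeen int64) []string {
	var out []string
	for _, ev := range events {
		if ev.NID <= lastSeen {
			continue // already covered by the initial load or an earlier update
		}
		out = append(out, ev.ID)
		lastSeen = ev.NID
	}
	return out
}

func main() {
	// The initial room load already covered A, B, C (NIDs 1 to 3).
	const loadedUpTo = 3

	// Old behaviour: D and E both carry the batch's latest NID (5).
	batchTagged := []liveEvent{{"D", 5}, {"E", 5}}
	fmt.Println(deliver(batchTagged, loadedUpTo)) // [D]

	// Fixed behaviour: every event carries its own NID; the duplicate C is dropped.
	perEvent := []liveEvent{{"C", 3}, {"D", 4}, {"E", 5}}
	fmt.Println(deliver(perEvent, loadedUpTo)) // [D E]
}
```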
David Robertson
af893f526b update tests 2023-04-04 22:25:39 +01:00
Kegan Dougal
c327fe7ed2 Add more timeline trickling tests 2023-02-01 11:04:36 +00:00
Kegan Dougal
425b5d1284 Unbreak tests 2023-01-23 13:52:01 +00:00
Kegan Dougal
5d29512ac5 Merge branch 'main' into kegan/lists-as-keys 2023-01-23 13:25:30 +00:00
Kegan Dougal
5f1b95b914 feat: support timeline 'trickling' by resending when the limit changes
This allows you to send `timeline_limit: 1` in one request, then
swap to `timeline_limit: 10` in the 2nd request and get 10 events,
without it affecting the window (no ops or required_state resent).

This is being added to support fast preloading on mobile devices,
where timeline_limit: 1 is used to populate the room preview in the
room list and then timeline_limit: 20 is used to quickly pre-cache
a screen full of messages in case the user clicks through to the room.
2023-01-20 18:48:10 +00:00
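
One way to picture trickling: remember the previous `timeline_limit` for a room and resend the latest events only when the new limit is larger. This is a sketch under that assumption; `timelineToResend` is a hypothetical helper, and events are represented as plain strings.

```go
package main

import "fmt"

// timelineToResend returns the most recent newLimit events when the limit
// has grown since the previous request, and nil otherwise. Treating only an
// increase as a trigger is an assumption made for this sketch.
func timelineToResend(events []string, prevLimit, newLimit int) []string {
	if newLimit <= prevLimit {
		return nil
	}
	if newLimit > len(events) {
		newLimit = len(events)
	}
	return events[len(events)-newLimit:]
}

func main() {
	events := []string{"A", "B", "C", "D", "E"}
	fmt.Println(timelineToResend(events, 0, 1)) // [E]
	fmt.Println(timelineToResend(events, 1, 3)) // [C D E]
	fmt.Println(timelineToResend(events, 3, 3)) // []
}
```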
Kegan Dougal
ca6ceb28da BREAKING: Change the API to refer to lists by keys not index positions
This provides more flexibility to refer to lists as well as delete them.
2022-12-20 13:32:39 +00:00
Kegan Dougal
aa28df161c Rename package -> github.com/matrix-org/sliding-sync 2022-12-15 11:08:50 +00:00
Kegan Dougal
be8543a21a add extensions for typing and receipts; bugfixes and additional perf improvements
Features:
 - Add `typing` extension.
 - Add `receipts` extension.
 - Add comprehensive prometheus `/metrics` activated via `SYNCV3_PROM`.
 - Add `SYNCV3_PPROF` support.
 - Add `by_notification_level` sort order.
 - Add `include_old_rooms` support.
 - Add support for `$ME` and `$LAZY`.
 - Add correct filtering when `*,*` is used as `required_state`.
 - Add `num_live` to each room response to indicate how many timeline entries are live.

Bug fixes:
 - Use a stricter comparison function on ranges: fixes an issue whereby UTs fail on go1.19 due to change in sorting algorithm.
 - Send back an `errcode` on HTTP errors (e.g. expired sessions).
 - Remove `unsigned.txn_id` on insertion into the DB. Otherwise other users would see other users' txn IDs :(
 - Improve range delta algorithm: previously it didn't handle cases like `[0,20] -> [20,30]` and would panic.
 - Send HTTP 400 for invalid range requests.
 - Don't publish no-op unread counts, which just add extra noise.
 - Fix leaking DB connections which could eventually consume all available connections.
 - Ensure we always unblock WaitUntilInitialSync even on invalid access tokens. Other code relies on WaitUntilInitialSync() actually returning at _some_ point, e.g. on startup we have N workers which bound the number of concurrent pollers at any one time, so we must not hog a worker forever.
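
To illustrate the range delta case mentioned in the bug fixes, here is a rough sketch that computes which inclusive index ranges are added and removed when a window moves. The function names and the `[2]int` representation are assumptions for this example, not the proxy's algorithm.

```go
package main

import "fmt"

// subtract returns the parts of inclusive range a not covered by inclusive
// range b, as zero, one or two ranges.
func subtract(a, b [2]int) [][2]int {
	if b[1] < a[0] || b[0] > a[1] {
		return [][2]int{a} // no overlap: all of a survives
	}
	var out [][2]int
	if b[0] > a[0] {
		out = append(out, [2]int{a[0], b[0] - 1})
	}
	if b[1] < a[1] {
		out = append(out, [2]int{b[1] + 1, a[1]})
	}
	return out
}

// rangeDelta reports which indexes enter and leave the window when it moves
// from prev to next.
func rangeDelta(prev, next [2]int) (added, removed [][2]int) {
	return subtract(next, prev), subtract(prev, next)
}

func main() {
	// The two ranges share only index 20.
	added, removed := rangeDelta([2]int{0, 20}, [2]int{20, 30})
	fmt.Println("added:", added, "removed:", removed)
}
```

For `[0,20] -> [20,30]` this sketch reports `[21,30]` as added and `[0,19]` as removed, since index 20 stays in the window.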

Improvements:
 - Greatly improve startup times of sync3 handlers by improving `JoinedRoomsTracker`: a modest amount of data would take ~28s to create the handler, now it takes 4s.
 - Massively improve initial v3 sync times, by refactoring `JoinedRoomsTracker`, from ~47s to <1s.
 - Add `SlidingSyncUntil...` in tests to reduce races.
 - Tweak the API shape of JoinedUsersForRoom to reduce state block processing time for large rooms from 63s to 39s.
 - Add trace task for initial syncs.
 - Include the proxy version in UA strings.
 - HTTP errors now wait 1s before returning to stop clients tight-looping on error.
 - Pending event buffer is now 2000.
 - Index the room ID first to cull the most events when returning timeline entries. Speeds up `SelectLatestEventsBetween` by a factor of 8.
 - Remove cancelled `m.room_key_requests` from the to-device inbox. Cuts down the number of events in the inbox by ~94% for very large (20k+) inboxes, ~50% for moderately sized (200 events) inboxes. Adds book-keeping to remember the unacked to-device position for each client.
2022-12-14 18:53:55 +00:00
Kegan Dougal
5b0e8568ea tests: move Match* functions to testutils/m
In preparation for migrating end-to-end style integration tests
to be actual end-to-end tests. The intended split is:
 - Does the test exclusively use the public sliding sync API for test assertions?
 - Does the test exclusively use the public sync v2 API for configuring the test?
If the answer to both questions is YES, then they should be end-to-end tests.
Some examples of this include testing core functionality of the API like
room subscriptions, multiple lists, filters, extensions, etc.

Some examples of tests which are NOT end-to-end tests include:
 - Testing connection handling (e.g. sending multiple duplicate requests)
 - Ensuring outstanding requests get cancelled.
 - Testing restarts of the proxy.
 - Testing out-of-order responses.
 - Benchmarks.

These all involve configuring the test / asserting different things, which would
be extremely difficult to reliably engineer using a real homeserver.
2022-07-26 10:11:06 +01:00
Kegan Dougal
75c3579f9e refactor: move integration tests to tests-integration directory
Add tests-e2e directory for end-to-end tests which require a Synapse
server.
2022-07-25 15:06:13 +01:00