Update API docs

Kegan Dougal 2021-10-07 17:50:59 +01:00
parent 362aca5954
commit 12f9226eab
4 changed files with 446 additions and 327 deletions

437
api.md Normal file

@ -0,0 +1,437 @@
## Paginated Sync v3
*Please file an issue on this repository if you wish to make comments on this.*
This is a proposal to replace Sync v2 (the current sync mechanism in the r0 spec) with a new paginated sync mechanism.
### Why?
- Sync v2 slows down with more rooms due to lack of room pagination. Some accounts are now in thousands of rooms, which makes them completely impractical to sync.
- Sync v2 sends far too much data that you cannot opt out of, e.g. receipts for ALL rooms.
- Sync v2 supports very old sync tokens, forcing the server to calculate extremely large and costly deltas.
A new sync mechanism should have the following properties:
- Sync time should be independent of the number of rooms you are in.
- Time from launch to confident usability should be as low as possible.
- Time from login on existing accounts to usability should be as low as possible.
- Bandwidth should be minimised.
- Support lazy-loading of things like read receipts (and avoid sending unnecessary data to the client).
- Support informing the client when state changes from under it due to state resolution.
- Clients should be able to work correctly without ever syncing in the full set of rooms they're in.
- Don't incrementally sync rooms you don't care about.
- Support combining uploaded filters with ad-hoc filter parameters (which isn't possible with sync v2 today).
- Servers should not need to store all past since tokens. If a since token has been discarded we should gracefully degrade to initial sync.
A critical component to all of these properties is to support *paginated rooms*, which sync v2 does not do.
### What Matrix does currently
A homeserver assigns an immutable position to every event it receives. Sync tokens are thus positions in this single linear stream (ignoring the vector clocks that Synapse workers use). This has problems: if you sync with an ancient position, you get an enormous number of events. This was the failure mode of sync v1 (`/initialSync` and `/events`): using an old position would cause massive amounts of data to be sent to clients via `/events`. Sync v2 remedied this by introducing room state deltas and timeline limits. This helps, but calculating the state delta is still very costly on the server. Sync v3 is required because the number of rooms people are in is now large enough to cause unreasonably long delays. We want to paginate rooms and cut down on the amount of room state that must be sent to the client before it is operational.
### What this could look like
The overarching model here is to imagine `/sync` as a pubsub system, where you are "subscribing" to *ranges* of a sorted room list array. In addition, you can also "subscribe" to explicit room IDs whenever you want e.g. when you are viewing the room or receiving a permalink for a room, and data is de-duplicated between these two subscriptions if the room is both explicitly subscribed to and in the subslice.
`POST /v3/sync`:
```json=
{
// Identifies the session for the purposes of remembering request
// parameters. This allows a single device to have multiple sync
// sessions active and not have them step on each other.
// "to-device" messages will only be deleted from the server once
// ALL sessions have received said message. Sessions can be deleted
// by the server after a period of inactivity. Deleted sessions do
// not result in to-device messages being purged if they have never
// been delivered to any session yet: they must be delivered to at
// least one active session on the device.
// If this id is missing, it is set to 'default'.
"session_id": "arbitrary-client-chosen-string",
// first 100 rooms
"rooms": [ [0,99] ],
// how `rooms` gets sorted. Note "by_name" means servers need to
// implement the room name calculation algorithm. We may be able to
// add a "locale" key for sorting rooms which are composed of user
// names more sensibly according to i18n.
"sort": [ "by_notification_count", "by_recency", "by_name" ],
"required_state": [
["m.room.join_rules", ""],
["m.room.history_visibility", ""],
["m.space.child", "*"] // wildcard
],
// the initial timeline limit to send for a new room, live stream
// data can exceed this limit
"timeline_limit": 10,
"room_subscriptions": {
"!sub1:bar": { // the client may be actively viewing this room
"required_state": [ ["*","*"] ], // all state events
"timeline_limit": 50
},
// empty object will use the same request params as the list subscription
"!sub2:bar": {}
},
// if the client was already subscribed to this room, this is how you unsub
// unsubbing twice is a no-op
"unsubscribe_rooms": [ "!sub3:bar" ],
"filters": {
// only returns rooms in these spaces (ignores subspaces)
"spaces": ["!space1:example.com", "!space2:example.com"],
// options to control which events should be live-streamed e.g not_types, types from sync v2
}
}
```
Returns:
```json=
{
"ops": [
{
"range": [0,99],
"op": "SYNC",
"rooms": [
{
"room_id": "!foo:bar",
"name": "The calculated room name",
// this is the CURRENT STATE, unlike v2 sync
"required_state": [
{"sender":"@alice:example.com","type":"m.room.join_rules", "state_key":"", "content":{"join_rule":"invite"}},
{"sender":"@alice:example.com","type":"m.room.history_visibility", "state_key":"", "content":{"history_visibility":"joined"}},
{"sender":"@alice:example.com","type":"m.space.child", "state_key":"!foo:example.com", "content":{"via":["example.com"]}},
{"sender":"@alice:example.com","type":"m.space.child", "state_key":"!bar:example.com", "content":{"via":["example.com"]}},
{"sender":"@alice:example.com","type":"m.space.child", "state_key":"!baz:example.com", "content":{"via":["example.com"]}}
],
"timeline": [
// We can de-dupe events in `required_state` via a top-level event map so only the event IDs are referenced here.
{"sender":"@alice:example.com","type":"m.room.join_rules", "state_key":"", "content":{"join_rule":"invite"}},
{"sender":"@alice:example.com","type":"m.room.message", "content":{"body":"A"}},
{"sender":"@alice:example.com","type":"m.room.message", "content":{"body":"B"}},
{"sender":"@alice:example.com","type":"m.room.message", "content":{"body":"C"}},
{"sender":"@alice:example.com","type":"m.room.message", "content":{"body":"D"}},
],
"notification_count": 54, // from sync v2
"highlight_count": 3 // from sync v2
},
{
"room_id": "!sub1:bar"
// because this is an explicit room subscription, the
// room data goes into room_subscriptions and
// only the bare minimum data is here to provide the sort ordering
}
// ... 98 more items
],
}
],
"room_subscriptions": {
"!sub1:bar": {
"name": "#canonical-alias:localhost",
"required_state": [
{"sender":"@alice:example.com","type":"m.room.create", "state_key":"", "content":{"creator":"@alice:example.com"}},
{"sender":"@alice:example.com","type":"m.room.join_rules", "state_key":"", "content":{"join_rule":"invite"}},
{"sender":"@alice:example.com","type":"m.room.history_visibility", "state_key":"", "content":{"history_visibility":"joined"}},
{"sender":"@alice:example.com","type":"m.room.member", "state_key":"@alice:example.com", "content":{"membership":"join"}}
],
"timeline": [
{"sender":"@alice:example.com","type":"m.room.create", "state_key":"", "content":{"creator":"@alice:example.com"}},
{"sender":"@alice:example.com","type":"m.room.join_rules", "state_key":"", "content":{"join_rule":"invite"}},
{"sender":"@alice:example.com","type":"m.room.history_visibility", "state_key":"", "content":{"history_visibility":"joined"}},
{"sender":"@alice:example.com","type":"m.room.member", "state_key":"@alice:example.com", "content":{"membership":"join"}},
{"sender":"@alice:example.com","type":"m.room.message", "content":{"body":"A"}},
{"sender":"@alice:example.com","type":"m.room.message", "content":{"body":"B"}},
],
// 0 notif count fields are required initially as if they are
// omitted it may indicate "no update/change" instead of "0".
"notification_count": 0, // from sync v2
"highlight_count": 0 // from sync v2
},
"!sub2:bar": {
// this room isn't even in the first 100 rooms but it is here
// because we had an explicit room_subscription for it
}
},
// the total number of rooms the user is joined to, used to pre-allocate
// placeholder rooms for smooth scrolling
"count": 1337,
"notifications": { .... } // see later section
}
```
Subsequent updates are just live-streamed to the client as and when they happen. For a topic change in the 4th room:
```json=
{
"ops": [
{
"index": 3,
"op": "UPDATE",
"room": {
"timeline": [
{"sender":"@alice:example.com","type":"m.room.topic", "state_key":"", "content":{"topic":"This is a nice topic"}},
],
"notification_count": 55, // increments by 1
}
}
],
"count": 1337 // the total number of rooms the user is joined to
}
```
UPDATEs do exactly that: they update fields without removing existing fields. The above response means "append to the timeline". Clients need to know that state events in the timeline ALSO update the current state of the room. Updates which affect [calculating the room name](https://matrix.org/docs/spec/client_server/latest#calculating-the-display-name-for-a-room) will also update the `name` field for that room, in addition to returning the event which modifies the room name. This means clients don't need to implement the room name calculation algorithm at all. If an update occurs in a room which is both in the sorted list and an explicit room subscription, only the room subscription will receive the information; there will be no explicit UPDATE operation:
```json=
{
"room_subscriptions": {
"!sub1:bar": {
"timeline": [
{"sender":"@alice:example.com","type":"m.room.topic", "state_key":"", "content":{"topic":"This is a nice topic"}},
],
"notification_count": 55, // increments by 1
}
}
}
```
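A minimal client-side sketch of these merge semantics, assuming rooms are cached as plain dictionaries. The helper name `apply_update` is hypothetical, and reusing `required_state` as the client's current-state cache is an assumption rather than something the proposal specifies.

```python
def apply_update(room: dict, update: dict) -> dict:
    """Merge an UPDATE payload into a cached room (hypothetical helper)."""
    merged = dict(room)
    for key, value in update.items():
        if key == "timeline":
            # UPDATE appends to the timeline instead of replacing it.
            merged["timeline"] = list(room.get("timeline", [])) + list(value)
        else:
            # Other fields (name, notification_count, ...) are overwritten.
            merged[key] = value

    # State events in the timeline also update the current room state.
    state = {
        (e["type"], e["state_key"]): e
        for e in merged.get("required_state", [])
    }
    for event in update.get("timeline", []):
        if "state_key" in event:
            state[(event["type"], event["state_key"])] = event
    merged["required_state"] = list(state.values())
    return merged
```

Fields absent from the update (such as `name`) are left untouched, which matches "update fields without removing existing fields".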
If the user leaves the 9th room, we need to bump everything up and add an entry at the 100th position:
```json=
{
"ops": [
{
"index": 8,
"op": "UPDATE",
"room": {
"timeline": [
{"sender":"@alice:example.com","type":"m.room.member", "state_key":"@alice:example.com", "content":{"membership":"leave"}},
]
}
},
{
"op": "DELETE",
"index": 8
},
{
"op": "INSERT",
"index": 99,
"room": {
"room_id": "!foo:bar",
"required_state": [
{"sender":"@alice:example.com","type":"m.room.join_rules", "state_key":"", "content":{"join_rule":"invite"}},
{"sender":"@alice:example.com","type":"m.room.history_visibility", "state_key":"", "content":{"history_visibility":"joined"}},
{"sender":"@alice:example.com","type":"m.space.child", "state_key":"!foo:example.com", "content":{"via":["example.com"]}},
{"sender":"@alice:example.com","type":"m.space.child", "state_key":"!bar:example.com", "content":{"via":["example.com"]}},
{"sender":"@alice:example.com","type":"m.space.child", "state_key":"!baz:example.com", "content":{"via":["example.com"]}}
],
"timeline": [
// We can de-dupe events in `required_state` via a top-level
// event map so only the event IDs are referenced here.
{"sender":"@alice:example.com","type":"m.room.join_rules", "state_key":"", "content":{"join_rule":"invite"}},
{"sender":"@alice:example.com","type":"m.room.message", "content":{"body":"A"}},
{"sender":"@alice:example.com","type":"m.room.message", "content":{"body":"B"}},
{"sender":"@alice:example.com","type":"m.room.message", "content":{"body":"C"}},
{"sender":"@alice:example.com","type":"m.room.message", "content":{"body":"D"}},
]
},
}
],
// the count is AFTER the ops have been applied so decremented by 1
"count": 1336
}
```
It is up to the client to decide what to do here. We could have configurable options for:
- Leaving a room removes from the list.
- Getting banned in a room does NOT remove from the list (so the user can see they were banned).
- Forgetting a room (e.g a banned room) then removes it from the list.
If a user joins a room in the 35th position we need to get rid of the 100th entry:
```json=
{
"ops": [
{
"op": "DELETE",
"index": 99
},
{
"op": "INSERT",
"index": 34,
"room": {
"room_id": "!foo:bar",
"required_state": [
{"sender":"@alice:example.com","type":"m.room.join_rules", "state_key":"", "content":{"join_rule":"invite"}},
{"sender":"@alice:example.com","type":"m.room.history_visibility", "state_key":"", "content":{"history_visibility":"joined"}},
{"sender":"@alice:example.com","type":"m.space.child", "state_key":"!foo:example.com", "content":{"via":["example.com"]}},
{"sender":"@alice:example.com","type":"m.space.child", "state_key":"!bar:example.com", "content":{"via":["example.com"]}},
{"sender":"@alice:example.com","type":"m.space.child", "state_key":"!baz:example.com", "content":{"via":["example.com"]}}
],
"timeline": [
// We can de-dupe events in `required_state` via a top-level
// event map so only the event IDs are referenced here.
{"sender":"@alice:example.com","type":"m.room.join_rules", "state_key":"", "content":{"join_rule":"invite"}},
{"sender":"@alice:example.com","type":"m.room.message", "content":{"body":"A"}},
{"sender":"@alice:example.com","type":"m.room.message", "content":{"body":"B"}},
{"sender":"@alice:example.com","type":"m.room.message", "content":{"body":"C"}},
{"sender":"@alice:example.com","type":"m.room.message", "content":{"body":"D"}},
]
},
}
],
"count": 1337 // the count is AFTER the ops so incremented by 1
}
```
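The leave and join examples can both be replayed with ordinary list semantics. The following is a sketch only: it assumes the client keeps the subscribed window as a list of room IDs and applies the ops in the order they arrive. `apply_list_ops` is a hypothetical name, and UPDATE ops are ignored here.

```python
def apply_list_ops(rooms: list, ops: list) -> list:
    """Apply DELETE and INSERT ops, in order, to a copy of the room ID list."""
    result = list(rooms)
    for op in ops:
        if op["op"] == "DELETE":
            # Entries after the deleted index shift up by one.
            del result[op["index"]]
        elif op["op"] == "INSERT":
            # Entries at and after the index shift down by one.
            result.insert(op["index"], op["room"]["room_id"])
    return result
```

Because every DELETE is paired with an INSERT in these examples, the window stays at 100 entries while `count` tracks the total number of joined rooms.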
Invites would be handled outside the core `rooms` array as they often appear in their own prominent section. If a room is tracked via an explicit subscription and it enters or leaves the sorted list, only the INSERT/DELETE operations will be present, and the INSERT operation will only have the `room_id` field.
If the user scrolls down, we need to request and subscribe to the next 100 rooms:
`POST /v3/sync`:
```json=
{
"rooms": [ [0,99], [100,199] ], // first 200 rooms
// request parameters are sticky and don't need to be specified again
// a notable exception to this is 'unsubscribe_rooms' which merely alters
// the 'room_subscriptions' map when it is received and then gets cleared.
}
```
The server sees the client wanting to subscribe to 0-99 but there is already an active subscription so it's a no-op. It is required though because the _absence_ of the range would unsubscribe the client from 0-99. The server sees a new range 100-199 so returns:
```json=
{
"ops": [
{
"range": [100,199],
"op": "SYNC",
"rooms": [
// ... 100 rooms ...
]
}
]
}
```
Updates happen in the first 200 rooms now. When the client scrolls even more, the client just requests 0-99 and 200-299 (effectively the 1st and 3rd pages):
`POST /v3/sync`:
```json=
{
"rooms": [ [0,99], [200,299] ]
}
```
The server sees 100-199 is missing and issues an invalidation to tell the client that they will be working on stale data for this range. When the user scrolls back up they will need to re-subscribe to this range:
```json=
{
"ops": [
{
"op": "INVALIDATE",
"range": [100,199]
}
]
}
```
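On the server, deciding which ranges need a `SYNC` and which need an `INVALIDATE` amounts to a set difference between the previously active ranges and the newly requested ones. The sketch below assumes ranges are compared as whole units (partial overlaps are not handled); `diff_ranges` is a hypothetical name.

```python
def diff_ranges(active: list, requested: list) -> tuple:
    """Return (to_sync, to_invalidate) for inclusive [start, end] ranges."""
    active_set = {tuple(r) for r in active}
    requested_set = {tuple(r) for r in requested}
    # Newly requested ranges need their rooms sent in full.
    to_sync = sorted(requested_set - active_set)
    # Ranges missing from the request are no longer tracked.
    to_invalidate = sorted(active_set - requested_set)
    return [list(r) for r in to_sync], [list(r) for r in to_invalidate]
```

A range that appears in both lists (such as `[0,99]` above) produces no operation, which is the no-op case described earlier.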
It's up to the client to decide what to do when rooms are INVALIDATEd. For offline support, these rooms should still be visible and clickable, and ultimately interactable. The client needs to speedily request that range again in case the rooms have shifted from under them. Alternatively, they can just delete the rooms and display placeholders until the range is requested again.
#### Limitations of this approach
- Scrolling the room list becomes expensive. If a page is invalidated, its rooms need to be fully synced from scratch again. This consumes needless bandwidth if the rooms haven't changed much.
- Resyncing after the connection has been closed becomes expensive. The client may have many timeline events and state for a room, but will be told all of this again. If there have been no events in the room, this becomes needlessly bandwidth consuming.
- A lot rides on the ability to detect when a connection has been closed. This is tricky (but possible) to do with long-poll connections by relying on timeouts. If the client doesn't send another `/sync` request after N seconds then the "connection" is treated as closed and a sync request with that sync token returns `M_UNKNOWN_SYNC_TOKEN` which causes the client to start over from scratch.
- You lose the ability to "replay" sync requests. Events are live-streamed then dropped.
Care needs to be taken on the server to synchronise incoming requests for additional pages with returning deltas to the client, i.e. protect these operations with a shared mutex. Failure to do so could result in duplicates or missing data. For example, the client knows `[0,99]` and then requests `[100,199]`. At the same time, room `115` gets an event and gets bumped to position `0`. If the range request is processed first, the bump needs to take into account the newly tracked range. If the event is processed first, the range request must not return the room again in `[100,199]`.
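One way to picture the shared mutex is a per-session object whose range requests and list bumps take the same lock, so each operation sees a consistent snapshot. This is a sketch under assumptions: `SessionList` and its methods are hypothetical, and a real server would also emit the corresponding ops.

```python
import threading


class SessionList:
    """Hypothetical per-session view of the sorted room list."""

    def __init__(self, sorted_room_ids):
        self._lock = threading.Lock()
        self._rooms = list(sorted_room_ids)
        self._ranges = []  # inclusive (start, end) pairs the client tracks

    def subscribe(self, start, end):
        """Track a new range and return the rooms currently inside it."""
        with self._lock:
            self._ranges.append((start, end))
            return self._rooms[start:end + 1]

    def bump_to_top(self, room_id):
        """Move a room to index 0; report its old index and if it was tracked."""
        with self._lock:
            old_index = self._rooms.index(room_id)
            self._rooms.insert(0, self._rooms.pop(old_index))
            tracked = any(s <= old_index <= e for s, e in self._ranges)
            return old_index, tracked
```

If the bump of room `115` runs first, a later `subscribe(100, 199)` no longer contains that room. If the range request runs first, the bump sees the room as tracked and can account for the new range.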
### Hybrid approach
We want sync v3 to work in low bandwidth scenarios. This means making as much use as possible of the data we know the client already has. On re-establishing a sync connection, or re-requesting a page that was previously INVALIDATEd, the server will perform the following operations:
- For this device/session: check the last sent event ID for the room ID in question. Count the number of timeline events from that point to the latest event. Call it `N`.
- For this specific sync request: calculate a reasonable upper-bound for how many events will be returned in a reasonable worst-case scenario. This is simply `timeline_limit + len(required_state)` (ignoring `*` wildcards on state). Call it `M`.
- If N > M then we would probably send more events if we did a delta than just telling the client everything from scratch, so issue a `SYNC` for this room.
- If N < M then we don't have many events since the connection was last established, so just send the delta as an `UPDATE`.
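The decision above reduces to a single comparison. The sketch below is illustrative only: `choose_op` is a hypothetical name, each `required_state` entry (including a `*` wildcard) is counted as one, and the tie case `N == M`, which the text leaves open, is treated as an `UPDATE` by assumption.

```python
def choose_op(events_since_last_sent: int, timeline_limit: int,
              required_state: list) -> str:
    """Pick SYNC or UPDATE for a room when a connection or page is restored."""
    n = events_since_last_sent
    # Upper bound on events sent by a from-scratch SYNC; a wildcard entry
    # counts as one here (assumption).
    m = timeline_limit + len(required_state)
    if n > m:
        return "SYNC"
    # N < M per the text; N == M is treated the same way (assumption).
    return "UPDATE"
```

With the request shown earlier (`timeline_limit` of 10 and three `required_state` entries), `M` is 13, so a room with 5 missed events gets an `UPDATE` and one with 500 gets a `SYNC`.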
This approach has numerous benefits:
- In the common case when you scroll the room list, you won't get any `SYNC`s for rooms that were invalidated, because a room is highly unlikely to receive 10+ events while you scroll (assuming you scroll back up in reasonable time).
- When you reconnect after sleeping your laptop overnight, most rooms will be `UPDATE`s, and busy rooms like Matrix HQ will be `SYNC`ed from fresh rather than sending 100s of events.
This imposes more restrictions on the server implementation:
- Servers still need the absolute stream ordering for events to work out how many events from `$event_id` to `$latest_event_id`.
- Servers need to remember the last sent event ID for each session for each room. If rooms share a single monotonically increasing stream, then this is a single integer per session (akin to today's sync tokens for PDU events). Servers need to remember _which rooms_ have been sent to the client, along with the stream position when that was sent. So it's basically a `map[string]int64`.
An example of what this looks like in the response:
```json=
{
"ops": [
{
"range": [100,117],
"op": "SYNC",
"rooms": [
// ... 18 rooms with complete state ...
]
},
{
"range": [118,124],
"op": "UPDATE",
"rooms": [
// ... 7 rooms with a few timeline events ...
// It is assumed that clients will keep a map of room_id -> Room object
// and when a room gets DELETEd or INVALIDATEd in this API that the Rooms
// are persisted as stale such that an UPDATE like this can bring it
// up-to-date again.
]
},
{
"range": [125,177],
"op": "SYNC",
"rooms": [
// ... 53 rooms with complete state ...
]
}
]
}
```
Some clients don't want to store state and are happy with using more bandwidth. For these clients, sync v2 has `?full_state=`. We can add a similar flag in this API to say "never incrementally catch me up from an earlier connection / invalidated page".
If a client gets a `SYNC` for a room for which they previously had timeline events and state, they MUST drop the state but can keep the timeline events as a disjointed timeline section. They may be able to tie the sections together again via `/messages` requests (backfilling).
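A client-side sketch of this rule, assuming cached rooms are dictionaries. The `timeline_sections` field and the name `apply_room_sync` are hypothetical: they are one possible way to keep disjoint timeline sections, not part of the proposal.

```python
def apply_room_sync(cached, synced: dict) -> dict:
    """Replace cached state with a SYNC payload, keeping old timeline sections."""
    sections = list(cached.get("timeline_sections", [])) if cached else []
    if synced.get("timeline"):
        # The new timeline is a separate section: there may be a gap
        # between it and whatever the client held before.
        sections.append(list(synced["timeline"]))
    # Build the room from the SYNC payload only, so old state is dropped.
    room = dict(synced)
    room.pop("timeline", None)
    room["timeline_sections"] = sections
    return room
```

A `SYNC` with an empty timeline leaves the existing sections untouched and only replaces the state.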
For cases where the state resolution algorithm has deleted state, we can force a `SYNC` on that room to re-issue the correct state, with an empty timeline section to inform the client that no new events have been sent, but the current state has changed.
#### Notifications
If you are tracking the top 5 rooms and an event arrives in the 6th room, you will be notified about the event ONLY IF the sort order means the room bumps into the top 5. If for example you sorted `by_name` then you won't be notified about the event in the 6th room, unless it's an `m.room.name` event which moves the room into the top 5. In most "recent" sort orders a new event *will result* in the 6th room bumping to the top of the list. A notable exception is when the rooms are sorted in *alphabetical order* (`by_name`), which is what some other chat clients do. In this case, you don't care about the event unless it is a "highlightable" event (e.g. a direct @mention). If you are explicitly "highlighted" in a room (according to push rules), a new section appears **at the top-level**:
```json=
{
"notifications": [
{
"room_id": "!foo:bar",
"event_id": "$aaaaaabbbbbccccc",
"highlight_count": 1,
"name": "The room name",
"last_message_timestamp": 1633105777488
}
]
}
```
If a client gets a notification when they are not connected to this API, the first `SYNC` response will contain a `notifications` section like this. A client will want to display this on the UI e.g "NEW UNREADS" in the below image:
![](https://i.imgur.com/Wc0A9c7.png)
In order for the "NEW UNREADS" message to be positioned at the top or bottom of the list, we need to include sorting information. This is why the notification contains enough information to sort the notification into the room list client-side. We may want to replace `last_message_timestamp` with the actual `event` which caused the notification in order to immediately display tray notifications (e.g on Desktop, which may need lazy-loaded members as well).
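As a rough illustration of how that sorting information could be used, the sketch below places a notification relative to the visible part of the list. It assumes the list is sorted `by_recency`, that the client knows the timestamps of the visible rooms (most recent first, non-empty), and that `notification_position` is a hypothetical helper.

```python
def notification_position(visible_timestamps: list, notification: dict) -> str:
    """Say whether a notified room sorts above, below or inside the window."""
    ts = notification["last_message_timestamp"]
    if ts > visible_timestamps[0]:
        # More recent than the first visible room.
        return "above"
    if ts < visible_timestamps[-1]:
        # Older than the last visible room.
        return "below"
    return "visible"
```

A client could use the result to decide whether the "NEW UNREADS" marker belongs at the top or the bottom of the list.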
Clients need to "subscribe" to this room to track this room and pull in any other timeline events and state for this room. Why? Because the client has not explicitly subscribed (in a pubsub sense) to this room, so we aren't going to flood them with data whenever an unsolicited @mention arrives. This means we can send redundant data (e.g if the same user @mentions the client it's possible we will send 2x `m.room.member` events for each lazy-loaded member).
##### End-to-end encrypted (E2EE) rooms
The server cannot calculate the `highlight_count` in E2EE rooms as it cannot read the message content. This is a problem when clients want to sort by `highlight_count`. In comparison, the server can calculate the name, `unread_count`, and work out the most recent timestamp when sorting by those fields. What should the server do when the client wants to sort by `highlight_count` (which is pretty typical!)? It can:
- Assume `highlight_count == 1` whenever `unread_count > 0`. This ensures that E2EE rooms are always bumped above unreads in the list, but doesn't allow sorting within the list of highlighted rooms.
- Assume `highlight_count == 0` always. This will always sort E2EE rooms below the highlight list, even if the E2EE room has a @mention.
- Sort E2EE rooms in their own dedicated list.
In all cases, the client needs to do additional work to calculate the `highlight_count`. When the client is streaming this work is very small as it just concerns a single event. However, when the client has been offline for a while there could be hundreds or thousands of missed events. There are 3 options here:
- Do no work and immediately red-highlight the room. Risk of false positives.
- Grab the last N messages and see if any of them are highlights. **Current implementations using sync v2 do this.**
- Grab all the missed messages and see if any of them are highlights. Denial of service risk if there are thousands of messages.
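The second option (inspect only the last N messages) could look roughly like the sketch below. It is deliberately simplified: real highlight detection follows the user's push rules, whereas this stand-in only looks for the user ID or display name in the decrypted message body. `estimate_highlights` and its parameters are hypothetical.

```python
def estimate_highlights(decrypted_events: list, user_id: str,
                        display_name: str, n: int = 20) -> int:
    """Estimate the highlight count from the last n decrypted events."""
    recent = decrypted_events[-n:] if n > 0 else []
    # Skip empty needles: an empty string would match every body.
    needles = [s for s in (user_id, display_name) if s]
    count = 0
    for event in recent:
        body = event.get("content", {}).get("body", "")
        if any(needle in body for needle in needles):
            count += 1
    return count
```

Bounding the scan at `n` events caps the work per room, at the cost of missing highlights further back in the history.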
Once the highlight count has been adequately *estimated* (it's only truly calculated if you grab all messages), this may affect the sort order for this room - it may diverge from that of the server. More specifically, it may bump the room up or down the list, depending on what the sort implementation is for E2EE rooms (top of list or below rooms with highlights). How this interacts with this API has not yet been fully determined.
### Missing bits
- Room invites. This can be in a separate section of the response, outside the sorted `rooms` array.
- Typing notifs, read receipts, room tag data, and any other room-scoped data. This can be added as request params to state whether you want these or not.
- Account data. Again, this can be added as request params and we can do similar pubsub for updates to types the client is interested in.
- To-device messages. It would be nice to have a queue per event type / sender / room so clients can rapidly get at room keys without having to wade through lots of key share requests. Need to check with the crypto team whether the ordering on to-device messages cross-event-type is important or not.
- Presence and member lists in general.
- Device lists and OTK counts.


@ -236,15 +236,13 @@ const render = (container) => {
 const r = rooms.roomIdToRoom[roomId];
 if (!r) {
 // placeholder
-roomCell.getElementsByClassName("roomname")[0].textContent = randomName(i, false);
-roomCell.getElementsByClassName("roomcontent")[0].textContent = randomName(i, true);
-roomCell.getElementsByClassName("roominfo")[0].style = "filter: blur(5px);";
+roomCell.getElementsByClassName("roomname")[0].textContent = "";
+roomCell.getElementsByClassName("roomcontent")[0].textContent = "";
 roomCell.getElementsByClassName("roomavatar")[0].src = "/client/placeholder.svg";
 roomCell.style = "";
 continue;
 }
 roomCell.style = "";
-roomCell.getElementsByClassName("roominfo")[0].style = "";
 roomCell.getElementsByClassName("roomname")[0].textContent = r.name || r.room_id;
 if (r.avatar) {
 roomCell.getElementsByClassName("roomavatar")[0].src = mxcToUrl(r.avatar);


@ -1,323 +0,0 @@
### Room List Stream
This is the first stream a client will access on a freshly logged-in client.
The purpose of this API is to:
- provide the client with a list of room IDs sorted based on some useful sort criteria. Critically,
the data used to sort these rooms _is told to the client_ so they can continue to sort rooms as live streaming data comes in.
- Track the list of room IDs the client is interested in, for feeding into other APIs automagically.
*This list is called the "**active set**".* This saves bandwidth as the full list of room IDs doesn't need to be constantly sent back and forth.
- provide **exactly** the information about a room ID to populate room summary without having to track the entire room state.
At present, the client first needs to know the room IDs of the top-level spaces it is joined to if it wishes to make use of the Spaces
functionality in this API. Clients can get these space room IDs via [MSC2946](https://github.com/matrix-org/matrix-doc/pull/2946).
This stream is paginatable. The first "initial" sync always returns a paginated response to seed the client
with data initially. This is **pagination mode**.
```
POST /sync?since=
{
room_list: {
sort: ["by_name", "by_recency", "by_space_order"],
limit: 5,
state_events: [
["m.room.topic", ""],
["m.room.avatar", ""],
["m.space.parent", "*"]
],
spaces: ["!foo:bar", "!baz:quuz"],
lazy_load_members: true,
track_notifications: true
}
}
```
- `sort`: (default: `by_recency`) The sort operations to perform on the rooms. The first element is applied first, then tiebreaks
are done with the 2nd element, then 3rd, and so on. For example: `["by_name", "by_recency"]`:
```
Another Room Name
Random Room Name (last msg: 5m ago)
Random Room Name (last msg: 15h ago)
Some Room Name
```
* `by_name`: [Calculate the room name](https://spec.matrix.org/unstable/client-server-api/#calculating-the-display-name-for-a-room) and sort lexicographically A-Z, A comes first. This means servers need to handle heroes and internationalisation.
* `by_recency`: Sort based on the last event's `origin_server_ts`, higher comes first.
* `by_space_order`: Sort by the space ordering according to the `order` in `m.space.child` events. If this is set, `spaces` cannot be empty.
* `by_highlight_count`: Sort based on the number of highlight-able (typically red) unread notifications. Highest first.
* `by_notification_count`: Sort based on the number of unread notifications (typically grey). Highest first.
- `limit`: (default: 20) The number of rooms to return per request.
- `state_events`: (default: `[]`) Array of 2-element arrays. The subarray `[0]` is the event type, `[1]` is the state key.
The state events to return in the response. This format compresses better in low bandwidth mode. If the state key is `*` then return all matching
events with that event type.
- `spaces`: (default: `[]`) Restrict the rooms returned in this API to the following spaces rather than all the rooms the client is joined to.
This can be used to reduce the total amount of data sent to the client. The client must be joined to these spaces. Subspaces are not enumerated.
- `lazy_load_members`: (default: `true`) If true, returns the `m.room.member` events for senders in the timeline, along with `m.room.member` events
for all members which are used to form the room name. This allows clients to render a room avatar based on the members in the room if needed.
- `track_notifications`: (default: `true`) If true, events which would [cause a notification to appear](https://spec.matrix.org/unstable/client-server-api/#receiving-notifications) will cause the room to appear in this list. For E2E rooms, all events will be notified as it's impossible to know if the encrypted event should cause a notification.
Returns the response:
```
{
room_list: {
rooms: [
{
room_id: "!foo:bar",
name: "My room name",
state_events: [
{ event JSON }, { lazy m.room.member }
],
prev_batch: "backpagination_token",
timeline: [
{ last event JSON }
]
},
{ ... }
],
next_page: "get_more_rooms_token",
notifications: [
{
room_id: "!bbbb:bar",
name: "a random room not used in a long time",
state_events: [
{ event JSON }, { lazy m.room.member }
],
events: [ { ... } ],
highlight_count: 52,
notification_count: 102,
},
{
room_id: "!encrypted:bar",
name: "My Encrypted Room",
state_events: [
{ event JSON }, { lazy m.room.member }
],
prev_batch: "backpagination_token",
timeline: [
{ last event JSON }
]
},
]
},
next_batch: "s1"
}
```
- `rooms`: The ordered list of rooms.
- `next_page`: A pagination token to retrieve more rooms.
- `rooms[].room_id`: The room ID of this room.
- `rooms[].name`: The calculated name of this room.
- `rooms[].highlight_count`: The unread highlight count (see `unread_notifications` in /sync v2).
- `rooms[].notification_count`: The unread notification count (see `unread_notifications` in /sync v2).
- `rooms[].state_events`: A list of state events requested via `state_events`.
- `rooms[].prev_batch`: The backpagination token for `/messages`.
- `rooms[].timeline`: An ordered list of events in the timeline. The last event is the most recent.
- `notifications`: An unordered list of `rooms` based on notification criteria. Returned only if `track_notifications: true`.
Notifiable events do not return a `timeline` as this section only produces the odd notified event and not a coherent stream of events.
E2EE rooms however DO return a `timeline` as every single event is "notifiable" effectively as the server doesn't know. By sending this
as a timeline, it allows the room data API to not send events after a given event ID, thus saves bandwidth.
- `notifications[].events`: The notifiable events. Typically this will just contain a single entry but if there is a large gap it's possible
to receive two or more notifiable events in the same room. Critically, these events do not form part of a coherent timeline, hence why it's
called `events` and not `timeline`.
Clients don't need to paginate through the entire list of rooms so they can ignore `next_page` if they wish.
If they want to paginate, they provide the value of `next_page` in the next request along with the `next_batch` value,
to pin the results to a particular snapshot in time:
```
POST /sync?since=s1
{
room_list: {
next_page: "get_more_rooms_token"
}
}
```
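As a rough illustration, a client could build this follow-up request as sketched below. The type names `RoomListPage` and `SyncRequest` and the helper `BuildPageRequest` are hypothetical and exist only for this example; only the `since` query parameter and the `room_list.next_page` field come from the request above.

```go
package main

import (
	"encoding/json"
	"net/url"
)

// RoomListPage and SyncRequest are hypothetical types mirroring the request
// body shown above; they are not part of any existing client library.
type RoomListPage struct {
	NextPage string `json:"next_page,omitempty"`
}

type SyncRequest struct {
	RoomList *RoomListPage `json:"room_list,omitempty"`
}

// BuildPageRequest returns the path and JSON body for fetching the next page
// of rooms, pinned to the snapshot identified by the since token.
func BuildPageRequest(since, nextPage string) (string, []byte, error) {
	body, err := json.Marshal(SyncRequest{RoomList: &RoomListPage{NextPage: nextPage}})
	if err != nil {
		return "", nil, err
	}
	return "/sync?since=" + url.QueryEscape(since), body, nil
}
```

Reusing the same `since` value for every page is what keeps the pages consistent with each other, since the sort order is evaluated against one snapshot.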
The server will remember the "active set" of room IDs for this stream according to the following rules:
- If `spaces` is non-empty then remember the entire set of top-level children rooms for each space. De-duplicate as needed.
Clients can track subspaces by including the subspace room ID in `spaces`.
- If `spaces` is empty then remember the room IDs sent to the client in the response, increasing this list as the client paginates.
- When the client leaves a room in the active set, remove it.
NB: Clients can specify `limit: 0` and `spaces: []` to indicate an empty set as the active set. This is useful when combined
with `track_notifications: true` which means this API will _only_ return a `notifications` section.
NBB: There is no special handling for left rooms. Clients will receive their leave
event in the `timeline` and then no further events. It is up to the client to then remove this room from the sorted list.
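The active set rules above can be sketched as a small in-memory structure. This is only an illustration under stated assumptions: `ActiveSet` and its methods are hypothetical names, and the lookup of each space's top-level children is assumed to happen elsewhere and is passed in as `spaceChildren`.

```go
package main

// ActiveSet is a hypothetical container for the room IDs the server
// remembers for a single sync stream.
type ActiveSet struct {
	rooms map[string]struct{}
}

func NewActiveSet() *ActiveSet {
	return &ActiveSet{rooms: make(map[string]struct{})}
}

// Remember applies the first two rules. spaceChildren maps every space ID in
// `spaces` to its top-level child rooms (assumed to be resolved by the caller).
// If it is non-empty, only those children are remembered; otherwise the room
// IDs just sent to the client are added, so the set grows as the client paginates.
func (a *ActiveSet) Remember(spaceChildren map[string][]string, sentRoomIDs []string) {
	if len(spaceChildren) > 0 {
		for _, children := range spaceChildren {
			for _, roomID := range children {
				a.rooms[roomID] = struct{}{} // map keys de-duplicate shared children
			}
		}
		return
	}
	for _, roomID := range sentRoomIDs {
		a.rooms[roomID] = struct{}{}
	}
}

// Leave applies the third rule: a room the user leaves is removed.
func (a *ActiveSet) Leave(roomID string) {
	delete(a.rooms, roomID)
}

func (a *ActiveSet) Contains(roomID string) bool {
	_, ok := a.rooms[roomID]
	return ok
}

func (a *ActiveSet) Len() int {
	return len(a.rooms)
}
```

With `limit: 0` and `spaces: []`, `Remember` is never given any room IDs, so the set stays empty and only the `notifications` section can produce events.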
If a request comes in without a `next_page` pagination token but with a `?since=` value, this swaps this API into **streaming mode**.
When operating in **streaming mode**, rooms will be sent to the client based on the "active set" and `track_notifications`:
- For an event E that arrives in a room R, if R is in the active set then send E to the client.
- Else if `track_notifications` is `true` and E is a notifiable event or E is in an E2EE room based on the `m.room.encryption` state event,
then send E to the client.
- Else drop the event.
With this API, you can _mostly_ (favourites need to be part of a space) emulate Element-Web's LHS room list with the following request:
```
POST /sync?since=
{
room_list: {
sort: ["by_highlight_count", "by_notification_count", "by_recency"],
limit: 10,
state_events: [
["m.room.avatar", ""]
],
lazy_load_members: true,
track_notifications: true
},
}
```
Caveats:
- The room list stream cannot be used to track invitations. That needs a new stream which is fine as they aren't sorted in the same way as joined rooms.
- Tracking notifications can be heavy if the user is joined to a lot of E2EE rooms. It might be nice to have some way of filtering this list down,
particularly as we expect the number of E2EE rooms to increase over time.
### Room Data API
The purpose of this API is to provide the entire room state/timeline for a single room (often the room the user is viewing). This stream
is NOT paginatable. This API is evaluated after the room list API if both streams are requested in a single `/sync` request.
```
POST /sync?since=
{
room_data: {
room_id: "room_list",
earliest_timeline_event_ids: ["$aaaa","$bbbb"],
room_member_limit: 5,
room_member_sort: "by_pl",
timeline_limit: 20,
format: "client|federation",
types: [ "m.room.message" ],
not_types: [ "m.room.topic" ]
}
}
```
- `room_id`: The room ID to get data for, or the magic constant `"room_list"` to pull from tracked room IDs.
- `earliest_timeline_event_ids`: Optional. If set, the room data API will not return any events between the event ID given
and the `since` value provided in this request (inclusive of both), as the server will assume the client has been fetching the timeline by other means
such as the room list API. This saves bandwidth by not sending duplicate events if the event IDs are the earliest events the client has
for each room being tracked.
- `room_member_limit`: The maximum number of room members to fetch for each room. If the limit is exceeded, a pagination token for
the room member stream will be provided. If 0 or missing, does not paginate room members.
- `room_member_sort`: Enum representing the sort order. See the room member stream for full values.
- `timeline_limit`: The max number of timeline events to return. Mostly this is useful if you are calling the room data stream without having a room list stream running.
- `format`: (default: `client`) The event format to send back. See [Filtering](https://spec.matrix.org/unstable/client-server-api/#filtering)
- `types` and `not_types`: [Event filters](https://spec.matrix.org/unstable/client-server-api/#filtering) to use.
Returns the response:
```
{
room_data: {
rooms: {
$room_id: {
state: {
state_before: "$aaaa",
events: [ ... ]
},
members: {
events: [ ... ],
next_batch: "s1",
next_page: "p1"
},
timeline: {
prev_batch: "p1",
events: [ ... ]
}
}
}
}
}
```
- `state.state_before`: Optional. Set to an event ID specified in `earliest_timeline_event_ids`. If set, the `state.events` refer to the state
of the room _before_ the event ID specified in `state_before`. Clients should set the room state to these events, then roll forward their already stored timeline
events. Only after that point should the events in `timeline.events` be applied. In this case, the `timeline.prev_batch` refers to the batch of
events prior to the event in `state_before`, NOT `timeline.events[0].event_id`. To be clear:
```
from room list timeline.events
[$aaa,$bbb,$ccc,$ddd] [$eee, $fff, $ggg]
^
|
state.events
state.state_before = "$aaa"
timeline.prev_batch = "token_for_events_before_$aaa"
```
This only applies if `earliest_timeline_event_ids` is not empty AND contains an event in one of the room IDs given.
- `state.events`: The state events at the start of the timeline, excluding room member events. The timeline may either be `timeline.events` or the earliest event given in `earliest_timeline_event_ids`.
- `members.events`: The `m.room.member` state events at the start of the timeline, same as `state.events`. May be partial, depending on the `room_member_limit`.
Sorted according to the `room_member_sort` value.
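The client-side ordering described for `state.state_before` can be sketched as follows. This is a minimal illustration, not a reference client: the `Event` shape, `RoomState` and `ApplyRoomData` are hypothetical, state is modelled as a map keyed by event type and state key, and `members.events` are assumed to be passed in together with `state.events`.

```go
package main

import "fmt"

// Event is a simplified, hypothetical event shape for this sketch.
type Event struct {
	ID       string
	Type     string
	StateKey *string // nil for non-state events such as messages
	Content  string
}

type stateTuple struct{ Type, StateKey string }

// RoomState maps (event type, state key) to the current state event.
type RoomState map[stateTuple]Event

func (s RoomState) apply(ev Event) {
	if ev.StateKey != nil {
		s[stateTuple{ev.Type, *ev.StateKey}] = ev
	}
}

// ApplyRoomData rebuilds the current room state in the order described above:
// 1. set the state to state.events,
// 2. if state_before is set, roll forward the already stored timeline,
//    starting at the state_before event,
// 3. apply timeline.events.
func ApplyRoomData(stateEvents []Event, stateBefore string, stored, timeline []Event) (RoomState, error) {
	state := make(RoomState)
	for _, ev := range stateEvents {
		state.apply(ev)
	}
	if stateBefore != "" {
		start := -1
		for i, ev := range stored {
			if ev.ID == stateBefore {
				start = i
				break
			}
		}
		if start < 0 {
			return nil, fmt.Errorf("state_before event %s is not in the stored timeline", stateBefore)
		}
		for _, ev := range stored[start:] {
			state.apply(ev)
		}
	}
	for _, ev := range timeline {
		state.apply(ev)
	}
	return state, nil
}
```

Stored events older than `state_before` are deliberately skipped, because `state.events` already reflects them.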
### Room Member API
The purpose of this API is to provide a paginated list of room members for a given room.
```
POST /sync?since=
{
room_member: {
room_id: "room_list",
limit: 5,
sort: "by_pl"
}
}
```
- `room_id`: The room to fetch members in.
- `limit`: The max number of members to fetch per page.
- `sort`: How to sort the list of room members. One of:
* `by_name`: Lexicographical order from A->Z (case-insensitive, unicode case-folding)
* `by_pl`: Sort highest power level first, then `by_name`.
Returns the response:
```
{
room_member: {
limit: 5,
events: [ m.room.member events ],
next_page: "p1"
}
}
```
- `limit`: The negotiated limit, may be lower than the `limit` requested.
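The two `sort` values could be implemented along these lines. This is a sketch under assumptions: `Member` and `SortMembers` are hypothetical, power levels are assumed to be already resolved per member, and `strings.ToLower` is used as a simple stand-in for full Unicode case-folding.

```go
package main

import (
	"sort"
	"strings"
)

// Member is a hypothetical, flattened view of an m.room.member event.
type Member struct {
	Name       string
	PowerLevel int
}

// SortMembers sorts in place. "by_pl" puts the highest power level first and
// falls back to name order; any other value is treated as "by_name".
func SortMembers(members []Member, order string) {
	sort.SliceStable(members, func(i, j int) bool {
		a, b := members[i], members[j]
		if order == "by_pl" && a.PowerLevel != b.PowerLevel {
			return a.PowerLevel > b.PowerLevel
		}
		// Approximation of case-insensitive A->Z ordering.
		return strings.ToLower(a.Name) < strings.ToLower(b.Name)
	})
}
```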
### Server implementation guide
Server-side, the pagination operations performed for `room_list` are:
- Multiplex together the `room_list` filter params as per normal v3 semantics.
- Load latest stream position or use `?since=` if provided, call it `SP`. Results are anchored at this position.
- Load all joined rooms for this user at `SP`.
- If `spaces` is non-empty, reduce the set of joined rooms to ones belonging to these spaces. Add all these rooms to the "active set".
- Sort the joined rooms according to `sort`. If it is `by_name` then the server needs to calculate the room name and handle internationalisation.
- Subslice the room IDs based on `limit` (and `next_page` if it exists) to produce a list of room IDs `R`.
- For each room in `R`:
* Load the state events based on `state_events`, honouring wildcard state keys.
* Calculate the room name and set it on `name`. If `m.room.member` events were required to do this, include them in the state events if `lazy_load_members: true`.
* Load the most recent event in the room. Include it in the `timeline`, and include the `m.room.member` event of the sender in the state events if `lazy_load_members: true`.
* Set the `prev_batch` value based on the timeline event.
* Sort the state events lexicographically by event type then state key then add them to the `state_events` response for this room.
* Add `R` to the "active set" if `spaces` is empty.
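The sort and subslice steps above might look like the sketch below. Several details are assumptions made for illustration only: the `Room` struct and `PageRooms` are hypothetical, the `next_page` token is modelled as a plain integer offset, higher counts and more recent events are assumed to sort first, and the room ID is used as a final tie-break so pages stay deterministic.

```go
package main

import "sort"

// Room is a hypothetical summary of a joined room at stream position SP.
type Room struct {
	ID                string
	HighlightCount    int
	NotificationCount int
	LastEventTS       int64
}

// roomLess compares two rooms using each sort key in turn.
func roomLess(a, b Room, sortBy []string) bool {
	for _, key := range sortBy {
		switch key {
		case "by_highlight_count":
			if a.HighlightCount != b.HighlightCount {
				return a.HighlightCount > b.HighlightCount
			}
		case "by_notification_count":
			if a.NotificationCount != b.NotificationCount {
				return a.NotificationCount > b.NotificationCount
			}
		case "by_recency":
			if a.LastEventTS != b.LastEventTS {
				return a.LastEventTS > b.LastEventTS
			}
		}
	}
	return a.ID < b.ID // assumed tie-break
}

// PageRooms sorts a copy of rooms and returns the slice for this page plus the
// offset of the next page, or -1 when there are no more rooms.
func PageRooms(rooms []Room, sortBy []string, offset, limit int) ([]Room, int) {
	sorted := append([]Room(nil), rooms...)
	sort.Slice(sorted, func(i, j int) bool { return roomLess(sorted[i], sorted[j], sortBy) })
	if offset < 0 {
		offset = 0
	}
	if offset > len(sorted) {
		offset = len(sorted)
	}
	if limit < 0 {
		limit = 0
	}
	end := offset + limit
	if end > len(sorted) {
		end = len(sorted)
	}
	next := end
	if end == len(sorted) {
		next = -1
	}
	return sorted[offset:end], next
}
```

Because results are anchored at `SP`, a real implementation would need to evaluate every page against the same snapshot rather than re-sorting live data between pages.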
Server-side, the streaming operations performed for `room_list` are:
- Load the latest stream position `SP` and the `since` value.
- If the delta between the two positions is too large (heuristic), reset the session.
- Multiplex together the `room_list` filter params as per normal v3 semantics.
- Load the active set, updating it if needed (e.g if `spaces` is non-empty and a new room has been added to the space between the 2 stream positions).
Events _should_ start flowing from the point the room was added to the space, but ultimately it isn't a hard requirement so long as the client is
able to view these events.
- Load the set of room IDs `Rsent` which have had a complete response already sent to the client.
- For all events between `since` and `SP`:
* apply visibility checks to filter out events the user cannot see due to not being joined anymore.
* if the event is in a room in the "active set", add it to `rooms[].timeline`.
* else if the event is a notifiable event based on push rules or E2EE presence then add it to `notifications`. If the event
is part of an E2EE room then add it to `notifications[].timeline`, else add it to `notifications[].events`. If there are multiple
notifiable events in the same non-E2EE room, append them to `events`.
* if the room ID in the event is not in `Rsent` then add it to `Rsent` and include all the necessary `state_events` according to the filter,
in addition to the `name` and `prev_batch` (see the pagination section).
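The per-event routing in the streaming steps above can be sketched roughly as follows. This is not the actual implementation: `Event`, `RoomUpdate`, `Delta` and `RouteEvents` are hypothetical, visibility checks are assumed to have been applied already, `Notifiable` is assumed to be precomputed from push rules, and the `Rsent` check is interpreted as applying only to events that are actually sent.

```go
package main

// Event is a simplified, hypothetical event shape for this sketch.
type Event struct {
	ID         string
	RoomID     string
	Notifiable bool // assumed to be precomputed from the user's push rules
}

// RoomUpdate collects what is sent for one room in this response.
type RoomUpdate struct {
	Timeline     []Event // coherent stream: active-set rooms and E2EE rooms
	Events       []Event // isolated notifiable events in non-E2EE rooms
	IncludeState bool    // room not in Rsent yet: attach state_events, name, prev_batch
}

// Delta mirrors the `rooms` and `notifications` sections of the response.
type Delta struct {
	Rooms         map[string]*RoomUpdate
	Notifications map[string]*RoomUpdate
}

// RouteEvents walks the events between `since` and SP in order. The sent map
// (Rsent) must be non-nil and is updated in place.
func RouteEvents(events []Event, active, encrypted, sent map[string]bool, trackNotifications bool) Delta {
	d := Delta{Rooms: map[string]*RoomUpdate{}, Notifications: map[string]*RoomUpdate{}}
	for _, ev := range events {
		inActive := active[ev.RoomID]
		isE2EE := encrypted[ev.RoomID]
		var section map[string]*RoomUpdate
		switch {
		case inActive:
			section = d.Rooms
		case trackNotifications && (isE2EE || ev.Notifiable):
			section = d.Notifications
		default:
			continue // drop the event
		}
		u := section[ev.RoomID]
		if u == nil {
			u = &RoomUpdate{}
			section[ev.RoomID] = u
		}
		if inActive || isE2EE {
			u.Timeline = append(u.Timeline, ev)
		} else {
			u.Events = append(u.Events, ev)
		}
		if !sent[ev.RoomID] {
			sent[ev.RoomID] = true
			u.IncludeState = true
		}
	}
	return d
}
```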
### Notes, Rationale and Queries
#### Room data
- Only the member events are paginated, not the entire room state. This is probably okay as the vast
majority of current state in rooms is actually just member events. Member events can be sorted coherently, but
arbitrary state events cannot (what do you sort by? Timestamp?).


@ -264,3 +264,10 @@ func (h *SyncLiveHandler) AddToDeviceMessages(userID, deviceID string, msgs []go
_, err := h.Storage.ToDeviceTable.InsertMessages(deviceID, msgs)
return err
}
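// UpdateUnreadCounts stores the unread highlight and notification counters for the given
// user in the given room. Failures are logged rather than returned to the caller.
func (h *SyncLiveHandler) UpdateUnreadCounts(roomID, userID string, highlightCount, notifCount *int) {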
err := h.Storage.UnreadTable.UpdateUnreadCounters(userID, roomID, highlightCount, notifCount)
if err != nil {
logger.Err(err).Str("user", userID).Str("room", roomID).Msg("failed to update unread counters")
}
}