As a follow-up to the inline layout discussion in Orlando, I wanted to float ideas for how to implement fragmentation in Servo layout 2.0. Fragmentation encompasses pagination, multi-column, and inline layout, as well as possible future CSS features such as overflow: fragments. As I see it, there are basically three models we can choose from, each with advantages and disadvantages. Because it's not immediately obvious which to choose, I'd like to lay them all out on the table so that we can get a clear picture of the advantages and disadvantages of each.
One-tree model
In this model, we have one layout tree. During frame construction, we build the tree without any fragmentation. We then mutate it as we determine where line and page breaks are in the process of layout to create the post-fragmentation tree (via continuation frames). For dynamic layout, we need logic to undo these changes and then redo them as appropriate as styles change. This corresponds to what Gecko does today; I think Blink and WebKit may still do this as well. The latter two engines are both moving away from this model in their rewrites, as I understand it.
Advantages: Having just one tree saves memory, and the data model is simple.
Disadvantages: Having to undo changes adds a large amount of complexity. Cutting up trees is both complex and bad for cache locality. Tree mutation during layout makes layout harder to understand and potentially negatively impacts parallelism.
Verdict: I'm basically assuming that this model is a non-starter, for the reasons above.
Two-tree model
For this model, we have two layout trees: a pre-fragmentation tree and a post-fragmentation tree. Frame construction generates the pre-fragmentation tree, and layout is (at a high level) a pure function that takes a pre-fragmentation tree as input and produces a post-fragmentation tree as output. Portions of the post-fragmentation tree must be invalidated whenever the frame constructor is reinvoked due to style changes.
In an ECS model, we could conceivably share entity IDs between the boxes in the pre-fragmentation tree and the first fragment of each box in the post-fragmentation tree. However, if we do so we will need to be careful about which layout data belongs to which tree, lest we inherit some of the problems of the one-tree model. For instance, if we store resolved border sizes somewhere (which we probably wouldn't do in practice, but just as an example) then we would need to be clear about whether those resolved border sizes refer to the borders of a box before fragmentation or after fragmentation. This is a potential footgun. On the other hand, sharing entity IDs in this way makes the common (99%+) case of no pagination potentially a lot faster, because we don't have to generate so many boxes.
This model corresponds to the model that Blink and WebKit are moving to, per my understanding.
Advantages: Layout becomes more functional, which reduces complexity. There is no need to undo changes during dynamic situations in frame construction; we can simply invalidate the post-fragmentation data as appropriate. Because we have two trees that can be arbitrarily similar or different, we can handle anything the CSS working groups might throw at us.
Disadvantages: Cutting up trees for line and page breaks is complex. Having two trees reduces cache efficiency and increases memory usage.
Verdict: This is an attractive approach because of its conceptual cleanliness. But for inline layout this could have negative complexity and performance implications similar to those that we see in Gecko with continuation frames, which is vexing because inline layout is the 99% case. Making what is overwhelmingly the most common (and expensive!) layout scenario slower to unify the logic with rare features doesn't sit well with me (though I'm of course open to it if it's ultimately the right thing to do).
Range model
This model represents fragment containers as ranges. Like the Range API in the DOM, a range is simply a (start entity ID, start offset, end entity ID, end offset) tuple. Such a range represents all boxes that would be visited in between the start and end entities during an in-order traversal of the frame tree. Dynamic layout is accomplished by simply invalidating the ranges as necessary and rebuilding them.
Previously, I was referring to this model as the "flat list" model, and in fact a further optimization of this model for inline layout is to flatten inline frames that are part of the same inline formatting context into a single linked list or array. However, it doesn't seem feasible to flatten frames where pagination and multicol are concerned. So instead I call this the range model, because ranges are the fundamental concept that can potentially apply to not only inline but also paginated and multi-column layout.
As mentioned before, the DOM has the Range concept, which is very similar to this model except that it operates over the DOM instead of the frame tree. OS line layout systems such as Apple's Core Text frequently use this model (in Core Text's case via "attributed strings").
Fragments need to store data unique to them--for example, if an inline element gets split, each fragment needs a position of its own--and so this model must choose a strategy to handle this. The two strategies that I can see correspond to the one-tree model and the two-tree model above: either we mutate the frame tree to split frames during layout and undo that mutation as necessary for dynamic layout, or we build up fragments into a separate list that then gets invalidated and rebuilt as necessary. The former approach is what Servo does today, but I think the latter is probably preferable for Servo Layout 2.0.
The big issue with this model is that it requires that the specs are written in such a way as to guarantee that ranges are actually sufficient to determine page and multi-column boundaries, and I'm not sure that this is the case. Note, however, that we could extend this model to treat fragmentainers as a set of ranges, much like the DOM Selection API treats selections as a set of ranges.
Advantages: The model maximizes cache efficiency and therefore performance. There's no need to store two trees, which is both simpler and better for memory consumption.
Disadvantages: It's not clear that this model will be flexible enough to handle everything that CSS requires right now or that CSS working groups might throw at us in the future.
Verdict: I'm leaning toward this model, if it works.
Discussion
I think that either the two-tree or the range model are likely the way to go, but I'm somewhat torn between them. I'm very curious to know what others think.
As a follow-up to the inline layout discussion in Orlando, I wanted to float ideas for how to implement fragmentation in Servo layout 2.0. Fragmentation encompasses pagination, multi-column, and inline layout, as well as possible future CSS features such as
overflow: fragments. As I see it, there are basically three models we can choose from, each with advantages and disadvantages. Because it's not immediately obvious which to choose, I'd like to lay them all out on the table so that we can get a clear picture of the advantages and disadvantages of each.One-tree model
In this model, we have one layout tree. During frame construction, we build the tree without any fragmentation. We then mutate it as we determine where line and page breaks are in the process of layout to create the post-fragmentation tree (via continuation frames). For dynamic layout, we need logic to undo these changes and then redo them as appropriate as styles change. This corresponds to what Gecko does today; I think Blink and WebKit may still do this as well. The latter two engines are both moving away from this model in their rewrites, as I understand it.
Advantages: Having just one tree saves memory, and the data model is simple.
Disadvantages: Having to undo changes adds a large amount of complexity. Cutting up trees is both complex and bad for cache locality. Tree mutation during layout makes layout harder to understand and potentially negatively impacts parallelism.
Verdict: I'm basically assuming that this model is a non-starter, for the reasons above.
Two-tree model
For this model, we have two layout trees: a pre-fragmentation tree and a post-fragmentation tree. Frame construction generates the pre-fragmentation tree, and layout is (at a high level) a pure function that takes a pre-fragmentation tree as input and produces a post-fragmentation tree as output. Portions of the post-fragmentation tree must be invalidated whenever the frame constructor is reinvoked due to style changes.
In an ECS model, we could conceivably share entity IDs between the boxes in the pre-fragmentation tree and the first fragment of each box in the post-fragmentation tree. However, if we do so we will need to be careful about which layout data belongs to which tree, lest we inherit some of the problems of the one-tree model. For instance, if we store resolved border sizes somewhere (which we probably wouldn't do in practice, but just as an example) then we would need to be clear about whether those resolved border sizes refer to the borders of a box before fragmentation or after fragmentation. This is a potential footgun. On the other hand, sharing entity IDs in this way makes the common (99%+) case of no pagination potentially a lot faster, because we don't have to generate so many boxes.
This model corresponds to the model that Blink and WebKit are moving to, per my understanding.
Advantages: Layout becomes more functional, which reduces complexity. There is no need to undo changes during dynamic situations in frame construction; we can simply invalidate the post-fragmentation data as appropriate. Because we have two trees that can be arbitrarily similar or different, we can handle anything the CSS working groups might throw at us.
Disadvantages: Cutting up trees for line and page breaks is complex. Having two trees reduces cache efficiency and increases memory usage.
Verdict: This is an attractive approach because of its conceptual cleanliness. But for inline layout this could have negative complexity and performance implications similar to those that we see in Gecko with continuation frames, which is vexing because inline layout is the 99% case. Making what is overwhelmingly the most common (and expensive!) layout scenario slower to unify the logic with rare features doesn't sit well with me (though I'm of course open to it if it's ultimately the right thing to do).
Range model
This model represents fragment containers as ranges. Like the
RangeAPI in the DOM, a range is simply a (start entity ID, start offset, end entity ID, end offset) tuple. Such a range represents all boxes that would be visited in between the start and end entities during an in-order traversal of the frame tree. Dynamic layout is accomplished by simply invalidating the ranges as necessary and rebuilding them.Previously, I was referring to this model as the "flat list" model, and in fact a further optimization of this model for inline layout is to flatten inline frames that are part of the same inline formatting context into a single linked list or array. However, it doesn't seem feasible to flatten frames where pagination and multicol are concerned. So instead I call this the range model, because ranges are the fundamental concept that can potentially apply to not only inline but also paginated and multi-column layout.
As mentioned before, the DOM has the
Rangeconcept, which is very similar to this model except that it operates over the DOM instead of the frame tree. OS line layout systems such as Apple's Core Text frequently use this model (in Core Text's case via "attributed strings").Fragments need to store data unique to them--for example, if an inline element gets split, each fragment needs a position of its own--and so this model must choose a strategy to handle this. The two strategies that I can see correspond to the one-tree model and the two-tree model above: either we mutate the frame tree to split frames during layout and undo that mutation as necessary for dynamic layout, or we build up fragments into a separate list that then gets invalidated and rebuilt as necessary. The former approach is what Servo does today, but I think the latter is probably preferable for Servo Layout 2.0.
The big issue with this model is that it requires that the specs are written in such a way as to guarantee that ranges are actually sufficient to determine page and multi-column boundaries, and I'm not sure that this is the case. Note, however, that we could extend this model to treat fragmentainers as a set of ranges, much like the DOM Selection API treats selections as a set of ranges.
Advantages: The model maximizes cache efficiency and therefore performance. There's no need to store two trees, which is both simpler and better for memory consumption.
Disadvantages: It's not clear that this model will be flexible enough to handle everything that CSS requires right now or that CSS working groups might throw at us in the future.
Verdict: I'm leaning toward this model, if it works.
Discussion
I think that either the two-tree or the range model are likely the way to go, but I'm somewhat torn between them. I'm very curious to know what others think.