Skip to content

font: Consider lang attribute when shaping#43447

Merged
mrobinson merged 4 commits into
servo:mainfrom
RichardTjokroutomo:font-language
Mar 23, 2026
Merged

font: Consider lang attribute when shaping#43447
mrobinson merged 4 commits into
servo:mainfrom
RichardTjokroutomo:font-language

Conversation

@RichardTjokroutomo

@RichardTjokroutomo RichardTjokroutomo commented Mar 19, 2026

Copy link
Copy Markdown
Contributor

Fonts: Added language field in the struct ShapingOptions so Harfbuzz can also consider language when shaping glyphs.

Testing: Existing WPT. Most recent try run: link

2 new passes, 5 new fails.

Failures:

  • /css/css-text-decor/text-emphasis-punctuation-2.html should be a false positive since text-emphasis shorthand hasn't been supported on stylo yet.
  • Not quite sure about this, but /css/css-text/text-spacing-trim/text-spacing-trim-quote-001.html?class=halt,htb&lang=ja tests text-spacing-trim default behavior on JA texts. Since this property is not defined in Stylo, I believe that this property's behavior (including default) hasn't been considered in Servo. So Servo previously passing the test should be a false positive. As a side note, this test also fails on Firefox.
  • /html/canvas/element/manual/text/canvas.2d.lang.dynamic.html, /html/canvas/element/manual/text/canvas.2d.lang.html, /html/canvas/element/manual/text/canvas.2d.lang.inherit.disconnected.canvas.html fail because canvas' experimental lang attribute hasn't been supported yet.

credits to @mrobinson for figuring out the reason for last 3 failing WPT tests!

Fixes: #41825

@codecov-commenter

Copy link
Copy Markdown

⚠️ JUnit XML file not found

The CLI was unable to find any JUnit XML files to upload.
For more help, visit our troubleshooting guide.

@RichardTjokroutomo RichardTjokroutomo marked this pull request as draft March 19, 2026 03:12
@RichardTjokroutomo RichardTjokroutomo force-pushed the font-language branch 2 times, most recently from 8332aea to da0893e Compare March 19, 2026 03:41
Signed-off-by: Richard Tjokroutomo <richard.tjokro2@gmail.com>
@RichardTjokroutomo RichardTjokroutomo marked this pull request as ready for review March 19, 2026 06:02
@servo-highfive servo-highfive added the S-awaiting-review There is new code that needs to be reviewed. label Mar 19, 2026
@RichardTjokroutomo RichardTjokroutomo changed the title Consider language when shaping glyphs Fonts: Consider language when shaping glyphs Mar 19, 2026
@mrobinson

Copy link
Copy Markdown
Member

Please update the testing field to with your analysis of the WPT differences. Manual testing is not enough for these kind of changes typically.

@RichardTjokroutomo

Copy link
Copy Markdown
Contributor Author

Please update the testing field to with your analysis of the WPT differences. Manual testing is not enough for these kind of changes typically.

Updated. I've also pushed the updated WPT expectations. PTAL!

Comment thread components/fonts/Cargo.toml Outdated
url = { workspace = true }
uuid = { workspace = true, features = ["v4"] }
webrender_api = { workspace = true }
icu_locid.workspace = true

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be inserted into the list in alphabetical order and also follow the pattern of other entries:

Suggested change
icu_locid.workspace = true
icu_locid = { workspace = true }

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my bad; I ran cargo add without actually checking...

Comment thread components/fonts/font.rs
/// The Unicode script property of the characters in this run.
pub script: Script,

pub language: Language,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs rustdoc.

Comment on lines +414 to +418
let lang = parent_style.get_font()._x_lang.clone();
let lang_identifier = match lang.0.parse() {
Ok(parsed_language) => parsed_language,
Err(_) => Language::UND,
};

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let lang = parent_style.get_font()._x_lang.clone();
let lang_identifier = match lang.0.parse() {
Ok(parsed_language) => parsed_language,
Err(_) => Language::UND,
};
let language = parent_style.get_font()._x_lang.0.parse().unwrap_or(Langauge::UND);

Comment thread components/layout/Cargo.toml Outdated
fonts_traits = { workspace = true }
html5ever = { workspace = true }
icu_locid = { workspace = true }
icu_locid.workspace = true

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep this consistent with the other entries. There's no reason to make one different.

Comment on lines +2348 to +2349
let mut current_text_run = UnshapedTextRun::default();
current_text_run.set_language(language.clone());

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of doing this, add a new() method that takes the Language and remove the implementation of Default on UnshapedTextRun.

Comment thread components/script/Cargo.toml Outdated
x25519-dalek = { workspace = true }
xml5ever = { workspace = true }
xpath = { workspace = true }
icu_locid.workspace = true

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see above comments.

Comment thread Cargo.toml
gstreamer-base = "0.25"
gstreamer-gl = "0.25"
gstreamer-gl-egl = "0.25"
gstreamer-gl-sys = "0.25"

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks unrelated?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, maybe it's a leftover from earlier commits? I originally used icu_locale, but switched to icu_locid because icu_locale is based off newer icu version.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should probably revert this particular change as it is unrelated. Cargo.toml isn't automatically generated, so this change comes from some change that you made or was made locally in your working directory.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah? I thought I've re-added it in previous commit.

My bad; should be good now.

@servo-highfive servo-highfive added S-needs-code-changes Changes have not yet been made that were requested by a reviewer. and removed S-awaiting-review There is new code that needs to be reviewed. labels Mar 19, 2026
@servo-highfive servo-highfive added S-awaiting-review There is new code that needs to be reviewed. and removed S-needs-code-changes Changes have not yet been made that were requested by a reviewer. labels Mar 19, 2026
@RichardTjokroutomo

Copy link
Copy Markdown
Contributor Author

Thanks for the review! I've addressed them all, PTAL again.

@RichardTjokroutomo RichardTjokroutomo force-pushed the font-language branch 2 times, most recently from 5644efa to 98aa949 Compare March 19, 2026 13:48

@mrobinson mrobinson left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good with two changes before landing.

Comment thread components/fonts/font.rs Outdated
pub word_spacing: Au,
/// The Unicode script property of the characters in this run.
pub script: Script,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra line here.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

Comment thread Cargo.toml
gstreamer-base = "0.25"
gstreamer-gl = "0.25"
gstreamer-gl-egl = "0.25"
gstreamer-gl-sys = "0.25"

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should probably revert this particular change as it is unrelated. Cargo.toml isn't automatically generated, so this change comes from some change that you made or was made locally in your working directory.

@servo-highfive servo-highfive removed the S-awaiting-review There is new code that needs to be reviewed. label Mar 19, 2026
@mrobinson mrobinson changed the title Fonts: Consider language when shaping glyphs font: Consider `lang attribute when shaping Mar 19, 2026
@mrobinson mrobinson changed the title font: Consider `lang attribute when shaping font: Consider lang attribute when shaping Mar 19, 2026
Signed-off-by: Richard Tjokroutomo <richard.tjokro2@gmail.com>
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Mar 19, 2026
@servo-highfive servo-highfive added S-tests-failed The changes caused existing tests to fail. and removed S-awaiting-merge The PR is in the process of compiling and running tests on the automated CI. labels Mar 19, 2026
script,
..Default::default()
string: Default::default(),
language,

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the cause of new failing tests.

Previously, I leave it to default (Language::UND), but after @mrobinson suggested to add new method for UnshapedTextRun, I set the value to language.

I think setting it to language should be the way to go, so I'll need to investigate the new failing tests, sorry about this...

@RichardTjokroutomo

RichardTjokroutomo commented Mar 20, 2026

Copy link
Copy Markdown
Contributor Author

@mrobinson regarding the 3 new failing tests, here's what's known:

  • Firefox also fails all 3 tests.
  • All 3 tests use JS to set the language, but the ref test themselves use html attribute. But Servo can set language via JS (elem.lang = ...). So the reason of failure shouldn't be JS-related.
  • All 3 tests use FontFace API to set the font. But Servo already support FontFace since this PR, so Servo should've chosen the correct font for the test.

I still think that setting the language field in UnshapedTextRun is correct, but I'm not sure why the 3 WPT tests failed.

I'm thinking that for now, I can either set the language to unknown & add a TODO, or update the WPT expectations and open a new issue, because it seems the problem only exists with <canvas> element. What do you say?

@mrobinson

Copy link
Copy Markdown
Member

@RichardTjokroutomo It's probably fine to leave the language set to undefined there and look at it later, but maybe it's worth at least looking at how the results differ first, in case they point to some unexpected behavior:

  1. ./mach test-wpt -r /html/canvas/element/manual/text/canvas.2d.lang.html --log-html out.html
  2. Then open out.html to see the differences in the actual versus expected results.

@servo-highfive servo-highfive removed the S-tests-failed The changes caused existing tests to fail. label Mar 20, 2026
@RichardTjokroutomo

Copy link
Copy Markdown
Contributor Author

@RichardTjokroutomo It's probably fine to leave the language set to undefined there and look at it later, but maybe it's worth at least looking at how the results differ first, in case they point to some unexpected behavior:

  1. ./mach test-wpt -r /html/canvas/element/manual/text/canvas.2d.lang.html --log-html out.html
  2. Then open out.html to see the differences in the actual versus expected results.

Thanks, I didn't know we can do this! I've created an issue and update it once this PR lands (since the code in question doesnt exist in main branch yet)

@RichardTjokroutomo

RichardTjokroutomo commented Mar 20, 2026

Copy link
Copy Markdown
Contributor Author

newest try run (just to be certain).

@mrobinson

Copy link
Copy Markdown
Member

@RichardTjokroutomo Were you able to understand the reason that the tests are failing or at least upload a couple of the screenshots here?

@RichardTjokroutomo

RichardTjokroutomo commented Mar 20, 2026

Copy link
Copy Markdown
Contributor Author

@mrobinson hmm, no. For example, for /html/canvas/element/manual/text/canvas.2d.lang.html, the expected result is that one will render "fi" ligature for en text when lang=en & another one for tr text when lang=tr. Instead, both render the ligature for Turkish text.

I plan to upload all screenshots in this issue, but I can upload it here. Gimme a while.

@RichardTjokroutomo

Copy link
Copy Markdown
Contributor Author

screenshots from test results:

  • /html/canvas/element/manual/text/canvas.2d.lang.dynamic.html

actual:
lang-dynamic-actual

expected:
lang-dynamic-expected

  • /html/canvas/element/manual/text/canvas.2d.lang.html

actual:
lang-actual

expected:
lang-expected

  • /html/canvas/element/manual/text/canvas.2d.lang.inherit.disconnected.canvas.html

actual:
lang-disconnected-actual

expected:
lang-disconnected-expected

@mrobinson

Copy link
Copy Markdown
Member

I think the issue here is that there is also an experimental lang attribute on CanvasRenderingContext2D. We do not support this.

@RichardTjokroutomo

Copy link
Copy Markdown
Contributor Author

I think the issue here is that there is also an experimental lang attribute on CanvasRenderingContext2D. We do not support this.

In that case, let me update the TODO desc

@mrobinson

Copy link
Copy Markdown
Member

@RichardTjokroutomo Okay. Please update the PR with the results and also update the Testing field of the PR to mention the canvas failures, explaining why they are now failing.

…experimental `lang` attribute

Signed-off-by: Richard Tjokroutomo <richard.tjokro2@gmail.com>
@RichardTjokroutomo

Copy link
Copy Markdown
Contributor Author

@mrobinson newest try run succeeded. Is there anything else you want to change?

@servo-highfive servo-highfive removed the S-awaiting-review There is new code that needs to be reviewed. label Mar 23, 2026
@mrobinson mrobinson added this pull request to the merge queue Mar 23, 2026
@servo-highfive servo-highfive added the S-awaiting-merge The PR is in the process of compiling and running tests on the automated CI. label Mar 23, 2026
Merged via the queue into servo:main with commit afa4b52 Mar 23, 2026
32 checks passed
@servo-highfive servo-highfive removed the S-awaiting-merge The PR is in the process of compiling and running tests on the automated CI. label Mar 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Servo does not appear to respect the lang attribute when rendering text

4 participants