Skip to content

utf-8 encoding of runtime names#610

Merged
kennykerr merged 8 commits intomasterfrom
kennykerr-compact
May 7, 2020
Merged

utf-8 encoding of runtime names#610
kennykerr merged 8 commits intomasterfrom
kennykerr-compact

Conversation

@kennykerr
Copy link
Collaborator

@kennykerr kennykerr commented May 6, 2020

Part of #607 to store runtime names in binaries using utf-8 rather than utf-16.

The next step is to fold common prefixes to reduce binary size even further.

@kennykerr
Copy link
Collaborator Author

@DefaultRyan I tried using u8"literals" as you suggested but that doesn't seem to work in a way that will work with both C++17 and C++20. When compiled with C++20 those literals are char8_t string literals but char8_t is new to C++20 so I can't use it in C++17. I think I'll just leave it until we switch to C++20 and leave C++17 behind.

@DefaultRyan
Copy link
Member

An approach I've taken elsewhere is to conditionally define the char type using the char8_t feature test macro:

#ifdef __cpp_char8_t
using char_type = char8_t;
#else
using char_type = char;
#endif

@kennykerr
Copy link
Collaborator Author

/azp run

@azure-pipelines
Copy link

No pipelines are associated with this pull request.

@kennykerr kennykerr requested a review from DefaultRyan May 7, 2020 03:22
@kennykerr
Copy link
Collaborator Author

/azp run

@azure-pipelines
Copy link

No pipelines are associated with this pull request.

@YohDeadfall
Copy link
Contributor

@kennykerr looks like you're doing changes from a different machine or the git config doesn't contain the mail associated with your account.

@walbourn Since all processes which share a library have separate memory they can store a flag to indicate does the current process uses UTF-8 or UTF-16. So if an app has UTF-8 strings, then there should be no conversion. Otherwise, conversion is required. Near the same way Nt and Zw functions work in Windows.

In case of WinRT where strings represented as handles and where the string lifetime is controlled by a reference counter, RAII could help: c_str should return a temporary struct which is implicitly convertible to char8_t/char, which does a conversion in the constructor and deallocation in the destructor if the flag reports that an app uses UTF-16. In case of UTF-8 no actions are needed.

Copy link
Member

@DefaultRyan DefaultRyan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a couple recommendations, but nothing I want to block over.

template <size_t Size>
hstring literal_to_hstring(std::array<char_type, Size> const& value) noexcept
{
return to_hstring(std::string_view(reinterpret_cast<char const*>(value.data()), Size - 1));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend adding an overload of to_hstring for std::u8string_view, which will reduce developer friction if when they use char8_t strings. Then, the reinterpret_cast can live there, as its only purpose is to translate from char8_t to the char required by MultiByteToWideChar.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was hoping to use u8string_view but it looks like its new to C++20.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

u8string_view is C++to, but basic_string_view<winrt::impl::char_type> always works.

REQUIRE(winrt::name_of<winrt_container>() == midl_container::z_get_rc_name_impl());

constexpr auto name = winrt::name_of<winrt::guid>();
auto name = winrt::name_of<winrt::guid>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change makes this not really a constexpr test any more. Shouldn't this test still be using name_v and comparing it to u8"Guid"?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I should just remove this test.

@kennykerr kennykerr merged commit 08eca8b into master May 7, 2020
@kennykerr kennykerr deleted the kennykerr-compact branch May 7, 2020 19:51
kennykerr added a commit that referenced this pull request May 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants

Comments