Zero-dependency BCP 47 / RFC 5646 language tag toolkit for JavaScript and TypeScript
Built with AI, end to end. Every line in this repo — source, tests, docs, release scripts — is generated, reviewed, and verified through Claude Code.
- Strict conventions enforced by the agent.
.claude/rules/pin TypeScript strict mode,Array<T>,readonlyby default, noany, semantic naming (nodata/result/item), and 18 testing rules — applied on every change without manual review.- Bundled Claude Code skills (
commit,verify,review,release,testing) that drive the full lifecycle: lint → build → test → conventional commit → tag-triggered release with SLSA provenance via OIDC trusted publishing — powered bynpm-trust(the/solo-npm:releaseskill itself comes from thesolo-npmmarketplace plugin).- Co-located Vitest specs for every operator (
parse,canonicalize,filter,lookup,extensionU/extensionT,acceptLanguage). The agent writes the failing test first, then makes it pass —pnpm testis the regression net, not a checkbox.- PRs disabled by design. Contributors can't push code directly; the maintainer takes issues and discussions through Claude Code. See
CONTRIBUTING.mdfor the full process.
- Parse any BCP 47 language tag into a structured, typed object
- Stringify a tag object back into a well-formed language tag string
- Canonicalize with case normalization and IANA registry data (deprecated subtags, suppress-script, extlang)
- Match language tags with
filterandlookupper RFC 4647 - Extension U/T extraction for Unicode locales (RFC 6067) and transformed content (RFC 6497)
- Accept-Language header parsing per RFC 9110
- WCAG-ready — use
parse()to validatelangattributes per WCAG 2.x SC 3.1.1 and SC 3.1.2 - TypeScript-first with full type inference and strict types out of the box
- Zero dependencies, tree-shakeable, works in Node.js and browsers
npm install rfc-bcp47Tree-shakeable operators — import only what you need.
import { parse, stringify } from 'rfc-bcp47';
const tag = parse('en-Latn-US');
if (tag?.type === 'langtag') {
tag.langtag.language // 'en'
tag.langtag.script // 'Latn'
tag.langtag.region // 'US'
}
stringify(tag!); // 'en-Latn-US'
parse('invalid!'); // nullparse returns one of three tag types or null for invalid input:
type |
When | Fields available |
|---|---|---|
'langtag' |
Standard language tags (en-US, zh-Hant-TW) |
langtag.language, langtag.script, langtag.region, langtag.extlang, langtag.variant, langtag.extension, langtag.privateuse |
'privateuse' |
Private use tags (x-custom) |
privateuse |
'grandfathered' |
Legacy registered tags (i-klingon) |
grandfathered.type, grandfathered.tag |
Build a tag from known parts without parsing a string. Validates subtags and throws RangeError on invalid input:
import { langtag, stringify } from 'rfc-bcp47';
const tag = langtag('en', { region: 'US' });
stringify(tag); // 'en-US'
langtag('!!!'); // RangeError — invalid languageReduce equivalent tags to a single canonical form — handles case normalization, deprecated subtags, suppress-script, extlang promotion, and extension ordering using IANA registry data:
import { canonicalize } from 'rfc-bcp47';
canonicalize('iw'); // 'he' (deprecated language)
canonicalize('zh-cmn'); // 'cmn' (extlang to preferred)
canonicalize('en-Latn'); // 'en' (suppress-script)
canonicalize('de-DD'); // 'de-DE' (deprecated region)Find all matching tags with subtag-aware filtering per RFC 4647 §3.3.2:
import { filter } from 'rfc-bcp47';
const tags = ['de', 'de-DE', 'de-Latn-DE', 'de-AT', 'en-US', 'fr-FR'];
filter(tags, 'de-DE'); // ['de-DE', 'de-Latn-DE'] (skips Latn to match DE)
filter(tags, 'de'); // ['de', 'de-DE', 'de-Latn-DE', 'de-AT'] (all German)
filter(tags, '*-DE'); // ['de-DE', 'de-Latn-DE'] (* wildcard = any language)Find the single best match via progressive truncation per RFC 4647 §3.4:
import { lookup } from 'rfc-bcp47';
const tags = ['en', 'en-US', 'fr', 'de'];
lookup(tags, 'en-US-x-custom'); // 'en-US' (truncates to match)
lookup(tags, 'fr-CA'); // 'fr' (truncates to match)
lookup(tags, 'ja', 'en'); // 'en' (default fallback)Pair with acceptLanguage for HTTP content negotiation:
import { acceptLanguage, lookup } from 'rfc-bcp47';
const prefs = acceptLanguage('fr-CA, en-US;q=0.8, en;q=0.5');
const best = lookup(['en', 'en-US', 'fr', 'fr-CA'], prefs.map((p) => p.tag));
// 'fr-CA'Extract Unicode locale and transformed content extensions:
import { parse, extensionU, extensionT } from 'rfc-bcp47';
extensionU(parse('de-DE-u-co-phonebk-ca-buddhist')!);
// { attributes: [], keywords: { co: 'phonebk', ca: 'buddhist' } }
extensionT(parse('und-t-it-m0-ungegn')!);
// { source: 'it', fields: { m0: 'ungegn' } }Parse HTTP Accept-Language headers:
import { acceptLanguage } from 'rfc-bcp47';
acceptLanguage('fr-CA, en-US;q=0.8, en;q=0.5, *;q=0.1');
// [
// { tag: 'fr-CA', quality: 1.0 },
// { tag: 'en-US', quality: 0.8 },
// { tag: 'en', quality: 0.5 },
// { tag: '*', quality: 0.1 }
// ]Try the interactive demo or see the examples/ folder for more usage patterns.
| Operator | Description |
|---|---|
parse(tag) |
Parse a BCP 47 tag string into a structured object. Returns null for invalid input. |
stringify(tag) |
Convert a parsed tag object back into a well-formed string. |
langtag(language, options?) |
Create a langtag object with sensible defaults. Throws RangeError on invalid input. |
canonicalize(tag) |
Normalize casing, sort extensions, apply IANA registry mappings (deprecated subtags, suppress-script, extlang). Returns null for invalid input. |
filter(tags, patterns) |
Subtag-aware filtering with * wildcard support per RFC 4647 §3.3.2. Returns matched tags. |
lookup(tags, preferences, defaultValue?) |
Lookup per RFC 4647 §3.4. Returns first match or defaultValue/null. |
extensionU(tag) |
Extract Unicode locale attributes and keywords from the u extension. Takes a BCP47Tag (not a string). Returns null if absent. |
extensionT(tag) |
Extract transformed content data from the t extension. Takes a BCP47Tag (not a string). Returns null if absent. |
acceptLanguage(header) |
Parse an Accept-Language header into sorted { tag, quality } entries. |
Typed constants mapping extension keys to human-readable descriptions, sourced from the CLDR BCP 47 data. Zero runtime cost — tree-shaken if unused.
| Constant | Description |
|---|---|
UNICODE_LOCALE_KEYS |
U extension keys → descriptions (e.g. ca → 'Calendar', nu → 'Numbering system') |
TRANSFORM_KEYS |
T extension keys → descriptions (e.g. m0 → 'Transform mechanism', s0 → 'Transform source') |
| I want to... | Use |
|---|---|
| Validate a language tag string | parse(tag) !== null |
| Read subtags (language, script, region) | parse(tag) → access .langtag.* |
| Build a tag from known parts | langtag(language, options) → stringify(tag) |
| Normalize casing and deprecated subtags | canonicalize(tag) |
| Read Unicode locale preferences (calendar, collation) | parse(tag) → extensionU(parsedTag) |
| Read transformed content metadata (source language) | parse(tag) → extensionT(parsedTag) |
| Find all locales matching a preference | filter(tags, patterns) |
| Pick the single best locale for a user | lookup(tags, preferences, defaultValue) |
| Parse an HTTP Accept-Language header | acceptLanguage(header) → lookup() or filter() |
Note:
canonicalizeandacceptLanguagetake strings.extensionUandextensionTtake a pre-parsedBCP47Tagfromparse(). This avoids re-parsing when you need multiple operations on the same tag.
See CHANGELOG.md for breaking changes and release history.
