Parse a BCP 47 language tag
Decomposes a BCP 47 language tag into its constituent subtags following the
syntax defined in RFC 5646. Both hyphen (-) and underscore (_) are
accepted as subtag separators.
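As a rough illustration of the separator handling described above, the base R sketch below converts underscores to hyphens, lowercases the tag, and splits it into subtags. The helper name split_subtags is hypothetical and not part of the package; the way bcp_parse does this internally may differ.

```r
# Hypothetical helper (not part of the package): split a tag into
# lowercase subtags, treating "-" and "_" as equivalent separators.
split_subtags <- function(tag) {
  stopifnot(is.character(tag), length(tag) == 1L, !is.na(tag))
  # Normalize the separator and case before splitting.
  normalized <- tolower(gsub("_", "-", tag, fixed = TRUE))
  strsplit(normalized, "-", fixed = TRUE)[[1]]
}

# Both spellings yield the same subtag vector.
identical(split_subtags("zh_Hans_CN"), split_subtags("zh-Hans-CN"))
```

Normalizing before splitting keeps every later step independent of which separator the caller used.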
Value
A named list with the following elements:
language
  The primary language subtag (e.g., "en", "zh"), or NA for a pure private-use tag.
extlang
  A character vector of extended language subtags (three-letter codes following the primary language), or NULL.
script
  The four-letter script subtag (e.g., "latn", "hans"), or NA if absent.
region
  The two-letter or three-digit region subtag (e.g., "us", "419"), or NA if absent.
variants
  A character vector of variant subtags, or NULL.
extensions
  A named list of extension subtag sequences, keyed by the single-letter extension singleton; an empty list if the tag has no extensions.
private
  A character vector of private-use subtags (following x-), or NULL.
All subtags are returned in lower-case.
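To make the shape rules behind these elements concrete, here is a minimal base R sketch that classifies subtags by length and character class, as the RFC 5646 grammar does. The function classify_simple is hypothetical and is not the package's implementation. It assumes a simple tag and deliberately skips extlang, extensions, grandfathered tags, and validation of the language subtag.

```r
# Hypothetical sketch (not the package's implementation): classify the
# subtags of a simple tag by shape. Extlang, extensions, and grandfathered
# tags are not handled, and the language subtag is not validated.
classify_simple <- function(tag) {
  parts <- strsplit(tolower(gsub("_", "-", tag, fixed = TRUE)),
                    "-", fixed = TRUE)[[1]]
  out <- list(language = NA_character_, script = NA_character_,
              region = NA_character_, variants = NULL, private = NULL)

  # Everything after the "x" singleton is private use.
  x_pos <- match("x", parts)
  if (!is.na(x_pos)) {
    if (x_pos < length(parts)) {
      out$private <- parts[(x_pos + 1L):length(parts)]
    }
    parts <- parts[seq_len(x_pos - 1L)]
  }
  if (length(parts) == 0L) return(out)

  out$language <- parts[1]
  for (p in parts[-1]) {
    no_region <- is.na(out$region) && is.null(out$variants)
    if (grepl("^[a-z]{4}$", p) && is.na(out$script) && no_region) {
      out$script <- p                      # four letters
    } else if (grepl("^([a-z]{2}|[0-9]{3})$", p) && no_region) {
      out$region <- p                      # two letters or three digits
    } else if (grepl("^([a-z0-9]{5,8}|[0-9][a-z0-9]{3})$", p)) {
      out$variants <- c(out$variants, p)   # 5-8 chars, or digit + 3 chars
    } else {
      stop("unrecognized subtag: ", p)
    }
  }
  out
}

str(classify_simple("de-1901"))
```

The order of the checks mirrors the order in which subtags may appear in a tag: script, then region, then variants, with private-use subtags split off first.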
Examples
bcp_parse('en-US')
#> $language
#> [1] "en"
#>
#> $extlang
#> NULL
#>
#> $script
#> [1] NA
#>
#> $region
#> [1] "us"
#>
#> $variants
#> NULL
#>
#> $extensions
#> list()
#>
#> $private
#> NULL
#>
bcp_parse('zh-Hans-CN')
#> $language
#> [1] "zh"
#>
#> $extlang
#> NULL
#>
#> $script
#> [1] "hans"
#>
#> $region
#> [1] "cn"
#>
#> $variants
#> NULL
#>
#> $extensions
#> list()
#>
#> $private
#> NULL
#>
bcp_parse('de-1901')
#> $language
#> [1] "de"
#>
#> $extlang
#> NULL
#>
#> $script
#> [1] NA
#>
#> $region
#> [1] NA
#>
#> $variants
#> [1] "1901"
#>
#> $extensions
#> list()
#>
#> $private
#> NULL
#>
bcp_parse('x-private')
#> $language
#> [1] NA
#>
#> $extlang
#> NULL
#>
#> $script
#> [1] NA
#>
#> $region
#> [1] NA
#>
#> $variants
#> NULL
#>
#> $extensions
#> list()
#>
#> $private
#> [1] "private"
#>