Latest revision as of 17:45, 9 April 2025
This page uses Creative Commons licensed content from Wikipedia.
Implements Lua functions mw.text.decode, mw.text.encode in a module.
{{#invoke:decodeEncode|decode|s=Source&nbsp;text&copy;}} → Source text©
See List of XML and HTML character entity references.
Decode (&copy; → ©)
- Decodes named entities from the entity name into a regular (Unicode) character:
&copy; → ©
&gt; → >
All well-defined named entities are decoded (HTML Named character references, formally: as defined in the PHP table).
- A regular, rendered sentence:
- "At 100 °F, & with a "burning" sun above, we ⁄walked⁄."
- In code:
- "At 100 &deg;F, &amp; with a &quot;burning&quot; sun above, we &frasl;walked&frasl;." -- wikitext
- Processing:
{{#invoke:decodeEncode|decode|s=At 100 &deg;F, &amp; with a &quot;burning&quot; sun above, we &frasl;walked&frasl;.}}
- → At 100 °F, & with a "burning" sun above, we ⁄walked⁄. -- In code: straight characters, no named entities.
- Renders, again:
- "At 100 °F, & with a "burning" sun above, we ⁄walked⁄."
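For readers who want to reproduce the decode step outside the wiki, the following is a minimal sketch in Python. It assumes that the standard-library function html.unescape is an acceptable stand-in for mw.text.decode with all named entities enabled; the two are separate implementations and may differ on edge cases. The helper name decode is illustrative only and is not part of the module.

```python
import html


def decode(s: str) -> str:
    """Turn HTML character references (e.g. &copy;) into plain characters.

    Assumption: html.unescape approximates mw.text.decode with all
    named entities enabled; it is not the module's own code.
    """
    return html.unescape(s)


# The example sentence, with its special characters written as named entities.
source = ("At 100 &deg;F, &amp; with a &quot;burning&quot; sun above, "
          "we &frasl;walked&frasl;.")
decoded = decode(source)
print(decoded)
```

After decoding, the string holds straight characters only, so decoding it a second time leaves it unchanged.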
Decode a reduced set only
By setting |subset_only=true, only these five entity names are decoded: '&lt;', '&gt;', '&amp;', '&quot;', '&nbsp;' (that is, into '<', '>', '&', '"' and the non-breaking space).
- Note: This differs from the corresponding Lua parameter. (This only concerns you if you also work directly with the Lua mw.text.decode function.) The Lua documentation defines the parameter decodeNamedEntities with this effect: when omitted or false, only the reduced set of entities is recognized and decoded. The meaning of 'false' is inverted in |subset_only=: decodeNamedEntities=false is equivalent to |subset_only=true.
- Also, this module ignores the "omitted" logic: |subset_only= must be set explicitly to 'true' to take effect.
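The relation between the two switches can be sketched in Python as follows. This is an illustration under assumptions, not the module's code: the function name decode, the flag subset_only as a Python boolean, and the use of html.unescape for the full entity set are all stand-ins. The reduced set is the five names lt, gt, amp, quot and nbsp.

```python
import html
import re

# The five entity names of the reduced set and their characters.
SUBSET = {"lt": "<", "gt": ">", "amp": "&", "quot": '"', "nbsp": "\u00a0"}
_SUBSET_RE = re.compile(r"&(lt|gt|amp|quot|nbsp);")


def decode(s: str, subset_only: bool = False) -> str:
    """Decode entities; subset_only=True mirrors decodeNamedEntities=false."""
    if subset_only:
        # Single pass, so "&amp;lt;" becomes "&lt;" and is not decoded twice.
        return _SUBSET_RE.sub(lambda m: SUBSET[m.group(1)], s)
    # Default (flag not set): decode every named entity.
    return html.unescape(s)


text = "&lt;b&gt; &amp; &copy;"
print(decode(text, subset_only=True))  # &copy; is left untouched
print(decode(text))                    # &copy; is decoded as well
```

Note that the default here is the full set, matching the rule that the reduced set applies only when the flag is set explicitly.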
Encode (© → &copy;)
- Function encode encodes some characters that have an entity name into that name (for example: & → &amp;).
Regular sentence:
- "At >100 °F, & with a "burning" sun above, we walked. ©"
In code:
- "At >100 °F, & with a "burning" sun above, we walked. ©"
Encode:
{{#invoke:decodeEncode|encode|s=At >100 °F, & with a "burning" sun above, we walked. ©|charset=&<>{{!}}°"'&©}}
- → At &gt;100 &#176;F, &amp; with a &quot;burning&quot; sun above, we walked. &#169;
- Renders as:
- "At >100 °F, & with a "burning" sun above, we walked. ©"
Character set to encode
Per the Lua documentation, only a small set of characters is encoded by default. The character set can be changed (expanded) with |charset=.
- Example:
- |charset=<>" \'& (the default), |charset=<>°"'&©{{!}}; characters not in the default set are replaced by their decimal numeric entity: © → &#169; (a decimal number, not hexadecimal and not the named &copy;)
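The effect of the character set can be sketched in Python. This is a minimal illustration, not the module's code: the names encode, DEFAULT_CHARSET and NAMED are hypothetical, and the sketch assumes that '<', '>', '&', '"' and the non-breaking space get named entities while every other character in the set gets a decimal numeric entity. The exact spelling of numeric entities (for example zero padding) may differ in the real function.

```python
# Assumed default set: < > & " ' and the non-breaking space.
DEFAULT_CHARSET = "<>&\"'\u00a0"

# Characters that are written as named entities in this sketch.
NAMED = {
    "<": "&lt;",
    ">": "&gt;",
    "&": "&amp;",
    '"': "&quot;",
    "\u00a0": "&nbsp;",
}


def encode(s: str, charset: str = DEFAULT_CHARSET) -> str:
    """Replace each character found in charset by an HTML entity."""
    out = []
    for ch in s:
        if ch in charset:
            # Named entity if known, otherwise decimal numeric entity.
            out.append(NAMED.get(ch, f"&#{ord(ch)};"))
        else:
            out.append(ch)
    return "".join(out)


sentence = 'At >100 °F, & with a "burning" sun above, we walked. ©'
print(encode(sentence))                        # default set: ° and © stay as they are
print(encode(sentence, charset="&<>|°\"'©"))   # expanded set: ° and © become numeric
```

With the default set only '>', '&' and '"' are replaced in this sentence; adding ° and © to the set is what makes them come out as numeric entities.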
Known issues
- 13 Sep 2021: NOTE: The encode function with user-supplied charset is now used productively in {{R/superscript}} and {{R/ref}}. Before implementing breaking changes here, these templates need to be adjusted accordingly!
- 26 Sep 2021:
- Note: Possible bug: Decoding &ThinSpace; works, but &thinsp; doesn't.
- Resolved in code.
- 4 Feb 2023:
- Tracked in Phabricator as task T328840.
- See Module talk:DecodeEncode § Bug report: bad decoding of U+03B5 ε (epsilon)
- Resolved in code.
See also
- mw.text.decode
- mw.text.encode
- Module:Urldecode