008. H5 Special Characters - V-SignOn, A Personal Knowledge System for Technology

    Special symbols required in HTML
    
    Space ( ): &#32; (the non-breaking space is &nbsp; or &#160;)
    Ampersand (&): &amp; &#38;
    Less-than sign (<): &lt; &#60;
    Greater-than sign (>): &gt; &#62;
    Half-width double quotation mark ("): &quot; &#34;
    Half-width single quotation mark ('): &apos; &#39; (&apos; is defined in HTML5 and XML only; &#39; works everywhere)
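As an illustration of how the list above is typically applied, here is a minimal sketch of an escaping helper. The name escapeHtml and the lookup table are assumptions made for this example; they are not part of any standard API.

```javascript
// Map each reserved character to an entity from the list above.
// &#39; is used for the single quote because it works in every HTML version.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// escapeHtml is a hypothetical helper name, not a built-in function.
function escapeHtml(text) {
  // Replace every reserved character in one pass so "&" is never escaped twice.
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml(`<a href="x">Tom & 'Jerry'</a>`));
// &lt;a href=&quot;x&quot;&gt;Tom &amp; &#39;Jerry&#39;&lt;/a&gt;
```

Doing the replacement in a single pass matters: escaping "&" in a separate later step would corrupt the entities already produced.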
    
    &#00000; is the decimal form of a numeric character reference, and &#x0000; is the hexadecimal form. In HTML both forms denote a Unicode code point, so they are not limited to UCS-2. UCS-2 itself is compatible with ASCII and is the subset of UTF-16 that fits in a single 16-bit code unit.
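The two numeric forms spell the same number, once in decimal and once in hexadecimal. A small sketch, assuming a hypothetical helper named toNumericReferences and using the standard String.prototype.codePointAt method:

```javascript
// toNumericReferences is a hypothetical helper name used for illustration.
function toNumericReferences(char) {
  // codePointAt(0) reads the full code point, even for 4-byte characters.
  const codePoint = char.codePointAt(0);
  return {
    decimal: `&#${codePoint};`,
    hex: `&#x${codePoint.toString(16).toUpperCase()};`,
  };
}

console.log(toNumericReferences("<")); // { decimal: '&#60;', hex: '&#x3C;' }
console.log(toNumericReferences("💩")); // { decimal: '&#128169;', hex: '&#x1F4A9;' }
```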
    
    Special symbols required in JS
    Half-width double quotation mark ("): \u0022
    Half-width single quotation mark ('): \u0027
    In JS, \u followed by four hexadecimal digits represents one 16-bit (UCS-2) code unit.
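A short sketch of these escapes in use. The variable names are chosen for this example only.

```javascript
// \u0022 and \u0027 are the escaped forms of " and '.
const doubleQuote = "\u0022";
const singleQuote = "\u0027";

console.log(doubleQuote === '"'); // true
console.log(singleQuote === "'"); // true

// Both kinds of quote inside one literal, without ending the literal early.
const sentence = "She said \u0022it\u0027s fine\u0022";
console.log(sentence); // She said "it's fine"
```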
    
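    Special symbols required in CSS
    Half-width double quotation mark ("): \0022
    Half-width single quotation mark ('): \0027
    In CSS, \ followed by one to six hexadecimal digits represents a Unicode code point. When fewer than six digits are used and the next character could be read as a hexadecimal digit, put a space after the escape.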
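CSS escapes can also be generated from script. The sketch below assumes a hypothetical helper named toCssEscape and always emits the six-digit form, which needs no trailing space because a CSS escape takes at most six hexadecimal digits.

```javascript
// toCssEscape is a hypothetical helper name used for illustration.
function toCssEscape(char) {
  const hex = char.codePointAt(0).toString(16).toUpperCase();
  // Pad to six digits so the escape cannot swallow a following hex digit.
  return "\\" + hex.padStart(6, "0");
}

console.log(toCssEscape('"')); // \000022
console.log(toCssEscape("'")); // \000027

// Example: build a rule whose content property is a single double quote.
const rule = `q::before { content: "${toCssEscape('"')}"; }`;
console.log(rule); // q::before { content: "\000022"; }
```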
    
    With these escape codes, an HTML page can represent any character.
    
    JavaScript strings are sequences of 16-bit code units. This was historically described as UCS-2, which is a subset of UTF-16 rather than the complete UTF-16 encoding scheme. (CSS escapes, by contrast, name the code point directly.) Characters that cannot be represented in two bytes are represented in JavaScript by surrogate pairs: two 16-bit code units that together encode one character. This is the UCS-2 + surrogate pair encoding method, which in effect is UTF-16. The first (high) surrogate lies in the range U+D800 to U+DBFF, and the second (low) surrogate lies in the range U+DC00 to U+DFFF.
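To make the surrogate pair scheme concrete, here is a sketch of the standard UTF-16 arithmetic. The function name toSurrogatePair is an assumption for this example; the formula itself is the one defined by UTF-16.

```javascript
// toSurrogatePair is a hypothetical helper name used for illustration.
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10ffff) {
    throw new RangeError("code point is not in the supplementary range");
  }
  const offset = codePoint - 0x10000;     // a 20-bit value
  const high = 0xd800 + (offset >> 10);   // top 10 bits
  const low = 0xdc00 + (offset & 0x3ff);  // bottom 10 bits
  return [high, low];
}

const [high, low] = toSurrogatePair(0x1f4a9);
console.log(high.toString(16), low.toString(16));     // d83d dca9
console.log(String.fromCharCode(high, low) === "💩"); // true
```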
    
    In pre-ES6 JavaScript, a surrogate pair is treated as two characters, which often produces incorrect results.
    
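    ES6 added features that treat a surrogate pair as a single character, such as string iteration with the spread syntax and the u flag for regular expressions. With them, characters outside the two-byte range can be processed accurately.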
    
    For example, pre-ES6 JavaScript processes an extended character as follows:
    "bytes:💩".split("")
    The result is:
    ["b", "y", "t", "e", "s", ":", "\ud83d", "\udca9"]
    
    The ES6 way to process extended characters (spread syntax, which expands the string into an array):
    [..."bytes:💩"]
    The result is:
    ["b", "y", "t", "e", "s", ":", "💩"]
    
    That is, pre-ES6 string methods handle a surrogate pair as two characters, while ES6 iteration handles a surrogate pair as a single character.
    So searching and splitting strings in JS needs special care: 4-byte characters must be taken into account.
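The difference shows up in lengths and search positions as well. A brief sketch using only standard string and array methods; the sample string is chosen for this example.

```javascript
const text = "💩bytes";

// length and indexOf count 16-bit code units.
console.log(text.length);       // 7
console.log(text.indexOf("b")); // 2

// Spreading the string first makes the same operations count characters.
console.log([...text].length);       // 6
console.log([...text].indexOf("b")); // 1

// codePointAt reads the whole surrogate pair as one number.
console.log(text.codePointAt(0).toString(16)); // 1f4a9
```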
    
    For example, the following approach is incorrect:
    "bytes:💩".substring(0,7)
    The result is:
    "bytes:\ud83d" (the emoji is cut in half, leaving a lone high surrogate)
    
    The correct method is:
    [..."bytes:💩"].splice(0,7).join("");
    The result is:
    "bytes:💩"
    
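The same idea can be wrapped in a reusable function. This is a minimal sketch under the assumption that counting by code point is what is wanted; the name codePointSubstring is hypothetical. It uses slice rather than splice, since slice does not modify the intermediate array.

```javascript
// codePointSubstring is a hypothetical helper name used for illustration.
// start and end are positions counted in code points, not 16-bit code units.
function codePointSubstring(text, start, end) {
  return Array.from(text).slice(start, end).join("");
}

console.log(codePointSubstring("bytes:💩", 0, 7)); // bytes:💩
console.log(codePointSubstring("💩💩💩", 1, 2));   // 💩
```

This keeps surrogate pairs intact, but sequences built from several code points (for example a letter followed by a combining accent) can still be split.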
    ES6 also extended regular expressions by adding the u flag. With the u flag, a 4-byte character (a surrogate pair whose first code unit is in the range D800 to DBFF) is matched as a single character.
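A short sketch of the effect of the u flag on matching; the sample strings are chosen for this example.

```javascript
const pile = "💩";

// Without u, "." matches one 16-bit code unit, so a lone "." cannot span the pair.
console.log(/^.$/.test(pile));  // false

// With u, "." matches the whole surrogate pair.
console.log(/^.$/u.test(pile)); // true

// The u flag also enables code point escapes of the form \u{...}.
console.log(/\u{1F4A9}/u.test(pile)); // true

// Counting characters with a global match.
console.log("a💩b".match(/./gu).length); // 3
console.log("a💩b".match(/./g).length);  // 4
```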
    The correct way in ES6 to extract a surname written with a character from the extended character set:
    var surname=[..."💩No"][0];
    The result is:
    "💩"
    