Source: Internet, 2024-03-06 04:04:26

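Before introducing ASCII, GB2312, Unicode, UTF-8, UTF-16 and the rest, I think it is worth explaining the title first: what is a character set, and what is an encoding? In my humble opinion, a character set is simply a collection of characters, such as ASCII, GB2312, or Unicode, in which every character is assigned a number (its code value). An encoding is the rule that maps those code values to the bytes actually stored. For the Unicode character set, for example, there are the UTF-8, UTF-16, and UTF-32 encodings.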
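To make the distinction concrete, here is a minimal Python sketch. The helper name `encodings_of` is made up for illustration. It takes one character, which has a single Unicode code value, and shows the different byte sequences that three encodings produce for it.

```python
def encodings_of(ch, names=("utf-8", "utf-16-be", "utf-32-be")):
    """Return {encoding name: hex string of the encoded bytes} for one character."""
    return {name: ch.encode(name).hex() for name in names}

ch = "王"
# The code value is a property of the character set (Unicode).
print(f"code value: U+{ord(ch):04X}")
# The bytes depend on which encoding is chosen.
for name, hex_bytes in encodings_of(ch).items():
    print(f"{name:10} -> {hex_bytes}")
```

The big-endian variants are used here only so that no byte order mark is added to the output.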

ASCII

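Standard ASCII, also called basic ASCII, uses a 7-bit binary number (the remaining bit of the byte is 0) to represent all uppercase and lowercase letters, the digits 0 to 9, punctuation marks, and the special control characters used in American English.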
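The 7-bit property is easy to check. The following is a small sketch, with `is_seven_bit` as a hypothetical helper name, that prints a few ASCII characters with their byte value in binary so the leading 0 bit is visible.

```python
def is_seven_bit(text):
    """True if every character has a code value in 0-127, i.e. fits in 7 bits."""
    return all(ord(c) < 0x80 for c in text)

for c in ("A", "z", "0", "~"):
    byte = c.encode("ascii")[0]
    # The 8-digit binary form always starts with 0 for standard ASCII.
    print(c, byte, format(byte, "08b"))

print(is_seven_bit("Hello, 0-9!"))
print(is_seven_bit("é"))
```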

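For English, 128 code values are enough to represent everything, but they are not enough for other languages. So the unused highest bit of the byte was put to work to encode additional symbols, which gave rise to the extended ASCII encodings. Because alphabets differ from country to country, there are many kinds of extended ASCII. In all of them, however, values 0-127 represent the same symbols; only the range 128-255 differs. Only two of them are introduced below.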
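As a rough illustration, the sketch below decodes the same byte with three 8-bit code pages that ship with Python. The choice of `cp1252`, `cp437`, and `iso8859_5` is arbitrary and only for demonstration, and `decode_with` is a made-up helper. A byte below 128 gives the same character everywhere, while a byte above 127 does not.

```python
def decode_with(raw, codecs=("cp1252", "cp437", "iso8859_5")):
    """Decode the same bytes with several single-byte code pages."""
    return {name: raw.decode(name) for name in codecs}

# 0x41 is in the shared 0-127 range: every code page agrees.
print(decode_with(b"\x41"))
# 0xE9 is in the 128-255 range: each code page assigns its own symbol.
print(decode_with(b"\xe9"))
```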

Extended ASCII (non-Latin format)

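Extended ASCII (Latin1)

Note that the table below follows Windows-1252 (code page 1252): it matches ISO-8859-1 (Latin-1) for 160-255 but assigns printable characters to most of 128-159, which ISO-8859-1 reserves for control characters.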

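| DEC | OCT | HEX | BIN | Symbol | HTML Number | HTML Name | Description |
|-----|-----|-----|-----|--------|-------------|-----------|-------------|
| 128 | 200 | 80 | 10000000 | € | &#8364; | &euro; | Euro sign |
| 129 | 201 | 81 | 10000001 | | | | |
| 130 | 202 | 82 | 10000010 | ‚ | &#8218; | &sbquo; | Single low-9 quotation mark |
| 131 | 203 | 83 | 10000011 | ƒ | &#402; | &fnof; | Latin small letter f with hook |
| 132 | 204 | 84 | 10000100 | „ | &#8222; | &bdquo; | Double low-9 quotation mark |
| 133 | 205 | 85 | 10000101 | … | &#8230; | &hellip; | Horizontal ellipsis |
| 134 | 206 | 86 | 10000110 | † | &#8224; | &dagger; | Dagger |
| 135 | 207 | 87 | 10000111 | ‡ | &#8225; | &Dagger; | Double dagger |
| 136 | 210 | 88 | 10001000 | ˆ | &#710; | &circ; | Modifier letter circumflex accent |
| 137 | 211 | 89 | 10001001 | ‰ | &#8240; | &permil; | Per mille sign |
| 138 | 212 | 8A | 10001010 | Š | &#352; | &Scaron; | Latin capital letter S with caron |
| 139 | 213 | 8B | 10001011 | ‹ | &#8249; | &lsaquo; | Single left-pointing angle quotation mark |
| 140 | 214 | 8C | 10001100 | Œ | &#338; | &OElig; | Latin capital ligature OE |
| 141 | 215 | 8D | 10001101 | | | | |
| 142 | 216 | 8E | 10001110 | Ž | &#381; | | Latin capital letter Z with caron |
| 143 | 217 | 8F | 10001111 | | | | |
| 144 | 220 | 90 | 10010000 | | | | |
| 145 | 221 | 91 | 10010001 | ‘ | &#8216; | &lsquo; | Left single quotation mark |
| 146 | 222 | 92 | 10010010 | ’ | &#8217; | &rsquo; | Right single quotation mark |
| 147 | 223 | 93 | 10010011 | “ | &#8220; | &ldquo; | Left double quotation mark |
| 148 | 224 | 94 | 10010100 | ” | &#8221; | &rdquo; | Right double quotation mark |
| 149 | 225 | 95 | 10010101 | • | &#8226; | &bull; | Bullet |
| 150 | 226 | 96 | 10010110 | – | &#8211; | &ndash; | En dash |
| 151 | 227 | 97 | 10010111 | — | &#8212; | &mdash; | Em dash |
| 152 | 230 | 98 | 10011000 | ˜ | &#732; | &tilde; | Small tilde |
| 153 | 231 | 99 | 10011001 | ™ | &#8482; | &trade; | Trade mark sign |
| 154 | 232 | 9A | 10011010 | š | &#353; | &scaron; | Latin small letter s with caron |
| 155 | 233 | 9B | 10011011 | › | &#8250; | &rsaquo; | Single right-pointing angle quotation mark |
| 156 | 234 | 9C | 10011100 | œ | &#339; | &oelig; | Latin small ligature oe |
| 157 | 235 | 9D | 10011101 | | | | |
| 158 | 236 | 9E | 10011110 | ž | &#382; | | Latin small letter z with caron |
| 159 | 237 | 9F | 10011111 | Ÿ | &#376; | &Yuml; | Latin capital letter Y with diaeresis |
| 160 | 240 | A0 | 10100000 | | &#160; | &nbsp; | Non-breaking space |
| 161 | 241 | A1 | 10100001 | ¡ | &#161; | &iexcl; | Inverted exclamation mark |
| 162 | 242 | A2 | 10100010 | ¢ | &#162; | &cent; | Cent sign |
| 163 | 243 | A3 | 10100011 | £ | &#163; | &pound; | Pound sign |
| 164 | 244 | A4 | 10100100 | ¤ | &#164; | &curren; | Currency sign |
| 165 | 245 | A5 | 10100101 | ¥ | &#165; | &yen; | Yen sign |
| 166 | 246 | A6 | 10100110 | ¦ | &#166; | &brvbar; | Pipe, broken vertical bar |
| 167 | 247 | A7 | 10100111 | § | &#167; | &sect; | Section sign |
| 168 | 250 | A8 | 10101000 | ¨ | &#168; | &uml; | Spacing diaeresis - umlaut |
| 169 | 251 | A9 | 10101001 | © | &#169; | &copy; | Copyright sign |
| 170 | 252 | AA | 10101010 | ª | &#170; | &ordf; | Feminine ordinal indicator |
| 171 | 253 | AB | 10101011 | « | &#171; | &laquo; | Left double angle quotes |
| 172 | 254 | AC | 10101100 | ¬ | &#172; | &not; | Not sign |
| 173 | 255 | AD | 10101101 | | &#173; | &shy; | Soft hyphen |
| 174 | 256 | AE | 10101110 | ® | &#174; | &reg; | Registered trade mark sign |
| 175 | 257 | AF | 10101111 | ¯ | &#175; | &macr; | Spacing macron - overline |
| 176 | 260 | B0 | 10110000 | ° | &#176; | &deg; | Degree sign |
| 177 | 261 | B1 | 10110001 | ± | &#177; | &plusmn; | Plus-or-minus sign |
| 178 | 262 | B2 | 10110010 | ² | &#178; | &sup2; | Superscript two - squared |
| 179 | 263 | B3 | 10110011 | ³ | &#179; | &sup3; | Superscript three - cubed |
| 180 | 264 | B4 | 10110100 | ´ | &#180; | &acute; | Acute accent - spacing acute |
| 181 | 265 | B5 | 10110101 | µ | &#181; | &micro; | Micro sign |
| 182 | 266 | B6 | 10110110 | ¶ | &#182; | &para; | Pilcrow sign - paragraph sign |
| 183 | 267 | B7 | 10110111 | · | &#183; | &middot; | Middle dot - Georgian comma |
| 184 | 270 | B8 | 10111000 | ¸ | &#184; | &cedil; | Spacing cedilla |
| 185 | 271 | B9 | 10111001 | ¹ | &#185; | &sup1; | Superscript one |
| 186 | 272 | BA | 10111010 | º | &#186; | &ordm; | Masculine ordinal indicator |
| 187 | 273 | BB | 10111011 | » | &#187; | &raquo; | Right double angle quotes |
| 188 | 274 | BC | 10111100 | ¼ | &#188; | &frac14; | Fraction one quarter |
| 189 | 275 | BD | 10111101 | ½ | &#189; | &frac12; | Fraction one half |
| 190 | 276 | BE | 10111110 | ¾ | &#190; | &frac34; | Fraction three quarters |
| 191 | 277 | BF | 10111111 | ¿ | &#191; | &iquest; | Inverted question mark |
| 192 | 300 | C0 | 11000000 | À | &#192; | &Agrave; | Latin capital letter A with grave |
| 193 | 301 | C1 | 11000001 | Á | &#193; | &Aacute; | Latin capital letter A with acute |
| 194 | 302 | C2 | 11000010 | Â | &#194; | &Acirc; | Latin capital letter A with circumflex |
| 195 | 303 | C3 | 11000011 | Ã | &#195; | &Atilde; | Latin capital letter A with tilde |
| 196 | 304 | C4 | 11000100 | Ä | &#196; | &Auml; | Latin capital letter A with diaeresis |
| 197 | 305 | C5 | 11000101 | Å | &#197; | &Aring; | Latin capital letter A with ring above |
| 198 | 306 | C6 | 11000110 | Æ | &#198; | &AElig; | Latin capital letter AE |
| 199 | 307 | C7 | 11000111 | Ç | &#199; | &Ccedil; | Latin capital letter C with cedilla |
| 200 | 310 | C8 | 11001000 | È | &#200; | &Egrave; | Latin capital letter E with grave |
| 201 | 311 | C9 | 11001001 | É | &#201; | &Eacute; | Latin capital letter E with acute |
| 202 | 312 | CA | 11001010 | Ê | &#202; | &Ecirc; | Latin capital letter E with circumflex |
| 203 | 313 | CB | 11001011 | Ë | &#203; | &Euml; | Latin capital letter E with diaeresis |
| 204 | 314 | CC | 11001100 | Ì | &#204; | &Igrave; | Latin capital letter I with grave |
| 205 | 315 | CD | 11001101 | Í | &#205; | &Iacute; | Latin capital letter I with acute |
| 206 | 316 | CE | 11001110 | Î | &#206; | &Icirc; | Latin capital letter I with circumflex |
| 207 | 317 | CF | 11001111 | Ï | &#207; | &Iuml; | Latin capital letter I with diaeresis |
| 208 | 320 | D0 | 11010000 | Ð | &#208; | &ETH; | Latin capital letter ETH |
| 209 | 321 | D1 | 11010001 | Ñ | &#209; | &Ntilde; | Latin capital letter N with tilde |
| 210 | 322 | D2 | 11010010 | Ò | &#210; | &Ograve; | Latin capital letter O with grave |
| 211 | 323 | D3 | 11010011 | Ó | &#211; | &Oacute; | Latin capital letter O with acute |
| 212 | 324 | D4 | 11010100 | Ô | &#212; | &Ocirc; | Latin capital letter O with circumflex |
| 213 | 325 | D5 | 11010101 | Õ | &#213; | &Otilde; | Latin capital letter O with tilde |
| 214 | 326 | D6 | 11010110 | Ö | &#214; | &Ouml; | Latin capital letter O with diaeresis |
| 215 | 327 | D7 | 11010111 | × | &#215; | &times; | Multiplication sign |
| 216 | 330 | D8 | 11011000 | Ø | &#216; | &Oslash; | Latin capital letter O with slash |
| 217 | 331 | D9 | 11011001 | Ù | &#217; | &Ugrave; | Latin capital letter U with grave |
| 218 | 332 | DA | 11011010 | Ú | &#218; | &Uacute; | Latin capital letter U with acute |
| 219 | 333 | DB | 11011011 | Û | &#219; | &Ucirc; | Latin capital letter U with circumflex |
| 220 | 334 | DC | 11011100 | Ü | &#220; | &Uuml; | Latin capital letter U with diaeresis |
| 221 | 335 | DD | 11011101 | Ý | &#221; | &Yacute; | Latin capital letter Y with acute |
| 222 | 336 | DE | 11011110 | Þ | &#222; | &THORN; | Latin capital letter THORN |
| 223 | 337 | DF | 11011111 | ß | &#223; | &szlig; | Latin small letter sharp s - ess-zed |
| 224 | 340 | E0 | 11100000 | à | &#224; | &agrave; | Latin small letter a with grave |
| 225 | 341 | E1 | 11100001 | á | &#225; | &aacute; | Latin small letter a with acute |
| 226 | 342 | E2 | 11100010 | â | &#226; | &acirc; | Latin small letter a with circumflex |
| 227 | 343 | E3 | 11100011 | ã | &#227; | &atilde; | Latin small letter a with tilde |
| 228 | 344 | E4 | 11100100 | ä | &#228; | &auml; | Latin small letter a with diaeresis |
| 229 | 345 | E5 | 11100101 | å | &#229; | &aring; | Latin small letter a with ring above |
| 230 | 346 | E6 | 11100110 | æ | &#230; | &aelig; | Latin small letter ae |
| 231 | 347 | E7 | 11100111 | ç | &#231; | &ccedil; | Latin small letter c with cedilla |
| 232 | 350 | E8 | 11101000 | è | &#232; | &egrave; | Latin small letter e with grave |
| 233 | 351 | E9 | 11101001 | é | &#233; | &eacute; | Latin small letter e with acute |
| 234 | 352 | EA | 11101010 | ê | &#234; | &ecirc; | Latin small letter e with circumflex |
| 235 | 353 | EB | 11101011 | ë | &#235; | &euml; | Latin small letter e with diaeresis |
| 236 | 354 | EC | 11101100 | ì | &#236; | &igrave; | Latin small letter i with grave |
| 237 | 355 | ED | 11101101 | í | &#237; | &iacute; | Latin small letter i with acute |
| 238 | 356 | EE | 11101110 | î | &#238; | &icirc; | Latin small letter i with circumflex |
| 239 | 357 | EF | 11101111 | ï | &#239; | &iuml; | Latin small letter i with diaeresis |
| 240 | 360 | F0 | 11110000 | ð | &#240; | &eth; | Latin small letter eth |
| 241 | 361 | F1 | 11110001 | ñ | &#241; | &ntilde; | Latin small letter n with tilde |
| 242 | 362 | F2 | 11110010 | ò | &#242; | &ograve; | Latin small letter o with grave |
| 243 | 363 | F3 | 11110011 | ó | &#243; | &oacute; | Latin small letter o with acute |
| 244 | 364 | F4 | 11110100 | ô | &#244; | &ocirc; | Latin small letter o with circumflex |
| 245 | 365 | F5 | 11110101 | õ | &#245; | &otilde; | Latin small letter o with tilde |
| 246 | 366 | F6 | 11110110 | ö | &#246; | &ouml; | Latin small letter o with diaeresis |
| 247 | 367 | F7 | 11110111 | ÷ | &#247; | &divide; | Division sign |
| 248 | 370 | F8 | 11111000 | ø | &#248; | &oslash; | Latin small letter o with slash |
| 249 | 371 | F9 | 11111001 | ù | &#249; | &ugrave; | Latin small letter u with grave |
| 250 | 372 | FA | 11111010 | ú | &#250; | &uacute; | Latin small letter u with acute |
| 251 | 373 | FB | 11111011 | û | &#251; | &ucirc; | Latin small letter u with circumflex |
| 252 | 374 | FC | 11111100 | ü | &#252; | &uuml; | Latin small letter u with diaeresis |
| 253 | 375 | FD | 11111101 | ý | &#253; | &yacute; | Latin small letter y with acute |
| 254 | 376 | FE | 11111110 | þ | &#254; | &thorn; | Latin small letter thorn |
| 255 | 377 | FF | 11111111 | ÿ | &#255; | &yuml; | Latin small letter y with diaeresis |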
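The numeric columns of such a table can be regenerated rather than looked up. This is a sketch only: it assumes Python's `cp1252` codec is an acceptable stand-in for the code page shown, and `table_row` is a hypothetical helper name.

```python
def table_row(value, codec="cp1252"):
    """Return (DEC, OCT, HEX, BIN, symbol) for one byte value."""
    try:
        symbol = bytes([value]).decode(codec)
    except UnicodeDecodeError:
        symbol = ""  # this position is unassigned in the chosen code page
    return (value, format(value, "o"), format(value, "02X"),
            format(value, "08b"), symbol)

for v in (128, 129, 233, 255):
    print(table_row(v))
```

Unassigned positions such as 129 come back with an empty symbol, which corresponds to the blank rows in the table.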

GB2312

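GB2312 is used for information interchange between systems for Chinese character processing, Chinese communication and so on, and it uses a double-byte encoding. The whole character set is divided into 94 rows (区), each with 94 cells (位). Each row-cell position holds exactly one character, so a character can be identified by its row and cell; this is called the row-cell code (区位码). Taking “王” as the example again, as the figure below shows, “王” sits at row 45, cell 85, so its row-cell code is 4585. Adding 0xA0 to each of the two numbers gives the bytes actually stored, 0xCD and 0xF5, so the GB2312 encoded value of “王” is 0xCDF5.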
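The relationship between the stored bytes and the row-cell position can be sketched in Python. It assumes Python's built-in `gb2312` codec, which produces the two-byte stored form where each byte is the row or cell number plus 0xA0. The helper name `row_cell` is made up for illustration.

```python
def row_cell(ch):
    """Return (row, cell) for a single two-byte GB2312 character."""
    hi, lo = ch.encode("gb2312")   # exactly two bytes for a Chinese character
    return hi - 0xA0, lo - 0xA0    # undo the 0xA0 offset on each byte

print("王".encode("gb2312").hex().upper())  # the stored bytes
print(row_cell("王"))                        # the row and cell numbers
```

ASCII characters encode to a single byte under this codec, so the sketch only applies to characters from the double-byte area.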

UTF-8

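As mentioned at the beginning of the article, UTF-8 is one of the encodings of Unicode. Its most notable feature is that it is variable-length: it uses 1 to 4 bytes to represent a character, and the length depends on the character.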
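A quick way to see the variable length is to encode characters from different ranges and count the bytes. The four sample characters below are chosen only for illustration, and `utf8_length` is a hypothetical helper.

```python
def utf8_length(ch):
    """Number of bytes UTF-8 uses for one character."""
    return len(ch.encode("utf-8"))

for ch in ("A", "é", "王", "😀"):
    print(ch, f"U+{ord(ch):04X}", utf8_length(ch), "byte(s)")
```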

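For example, the Unicode code value of the Chinese character “王” is 0x738B. In binary this is 111 0011 1000 1011, 15 bits in total. The 2-byte UTF-8 form can carry only 11 bits, so encoding this character takes 3 bytes.

The 3-byte UTF-8 binary layout is 1110XXXX 10XXXXXX 10XXXXXX. The lowest six bits, 001011, go into the last byte; the next six, 001110, go into the middle byte; and the remaining 111, padded with a leading zero to 0111, goes into the first byte. The resulting UTF-8 encoding is 11100111 10001110 10001011, or 0xE78E8B in hexadecimal.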
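The same bit-filling steps can be written out by hand and compared with the built-in encoder. This is a minimal sketch covering only the three-byte case; the function name `utf8_three_bytes` is made up, and it does not reject the surrogate range, which a real encoder would.

```python
def utf8_three_bytes(code_value):
    """Hand-encode a code value in 0x0800-0xFFFF as 1110xxxx 10xxxxxx 10xxxxxx."""
    if not 0x0800 <= code_value <= 0xFFFF:
        raise ValueError("only the three-byte range is handled in this sketch")
    first = 0b11100000 | (code_value >> 12)              # top 4 bits
    middle = 0b10000000 | ((code_value >> 6) & 0b111111)  # next 6 bits
    last = 0b10000000 | (code_value & 0b111111)           # lowest 6 bits
    return bytes([first, middle, last])

encoded = utf8_three_bytes(0x738B)
print(" ".join(format(b, "08b") for b in encoded))
print(encoded.hex().upper())
print(encoded == "王".encode("utf-8"))
```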

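That covers the commonly used character sets and encodings. When you come across an encoded character and cannot tell which character set it belongs to, you can first look it up in the tables above. Alternatively, if you want to know the value of a character in a particular character set, or its value under a particular encoding, you can do the following; the software I use here is UE.

After selecting the encoding at the bottom, type the text you want to look up, then choose Edit > Hex Mode at the top, and you will see the encoded values that correspond to the text.

Chinese characters work too, of course.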
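If no editor with a hex view is at hand, roughly the same lookup can be done in a few lines of Python. This is an alternative sketch, not what the editor does internally; `hex_view` is a hypothetical helper, and the encoding names are those of Python's standard codecs.

```python
def hex_view(text, encoding):
    """Return the bytes of text under the given encoding as spaced hex pairs."""
    return " ".join(f"{b:02X}" for b in text.encode(encoding))

for encoding in ("gb2312", "utf-8", "utf-16-be"):
    print(f"{encoding:10}", hex_view("王", encoding))

print(f"{'ascii':10}", hex_view("A", "ascii"))
```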