====== Overview ======

ICU (International Components for Unicode) is the library maintained by the Unicode Consortium. Among other things, it provides character encoding conversion.

====== Transcoding ======

ICU mapping tables: [[https://github.com/unicode-org/icu-data/tree/main/charset/data/ucm|UCM tables]], [[https://github.com/unicode-org/icu-data/tree/main/charset/data/xml|XML tables]]

====== Miscellaneous ======

  * The longest contiguous run of Unicode code points covered by GBK is U+4E00 to U+9FA5 (一 to 龥): 0x9FA5 - 0x4E00 + 1 = 20902 characters.
  * The character "的" (U+7684) appears in 280 mapping tables.
  * The character "の" (U+306E) appears in 258 mapping tables.
  * The character "丝" (U+4E1D) appears in 49 mapping tables, listed below:
    * euc-tw-2014.ucm
    * gb-18030-2000.ucm
    * gb-18030-2005.ucm
    * glibc-EUC_CN-2.1.2.ucm
    * glibc-EUC_CN-2.3.3.ucm
    * glibc-GBK-2.3.3.ucm
    * hpux-hp15CN-11.11.ucm
    * ibm-13125_P100-1997.ucm
    * ibm-13676_P102-2001.ucm
    * ibm-1380_P100-1995.ucm
    * ibm-1380_X100-1995.ucm
    * ibm-1381_P110-1999.ucm
    * ibm-1381_X110-1999.ucm
    * ibm-1382_P100-1995.ucm
    * ibm-1382_X100-1995.ucm
    * ibm-1383_P110-1999.ucm
    * ibm-1383_X110-1999.ucm
    * ibm-1385_P100-1997.ucm
    * ibm-1385_P100-2005.ucm
    * ibm-1386_P100-2001.ucm
    * ibm-1386_P110-1997.ucm
    * ibm-1388_P103-2001.ucm
    * ibm-1388_P110-2000.ucm
    * ibm-17221_P100-2001.ucm
    * ibm-4933_P100-1996.ucm
    * ibm-4933_P100-2002.ucm
    * ibm-5478_P100-1995.ucm
    * ibm-5488_P100-2001.ucm
    * ibm-837_P100-1995.ucm
    * ibm-837_P100-2011.ucm
    * ibm-837_X100-1995.ucm
    * ibm-928_P100-1995.ucm
    * ibm-935_P110-1999.ucm
    * ibm-935_X110-1999.ucm
    * ibm-946_P100-1995.ucm
    * ibm-9577_P100-2001.ucm
    * ibm-9580_P110-1999.ucm
    * java-Cp1381-1.3_P.ucm
    * java-Cp1383-1.3_P.ucm
    * java-Cp935-1.3_P.ucm
    * java-EUC_CN-1.3_P.ucm
    * macos-1057-10.2.ucm
    * solaris-zh_CN.euc-2.7.ucm
    * solaris-zh_CN.gbk-2.7.ucm
    * solaris-zh_CN_cp935-2.7.ucm
    * windows-10008-2000.ucm
    * windows-20936-2000.ucm
    * windows-51936-2000.ucm
    * windows-936-2000.ucm
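
Per-character counts like the ones above can be checked by scanning the ''.ucm'' files for a given code point. The following is a minimal sketch, not necessarily how the figures on this page were obtained. It assumes a local checkout of the icu-data repository and that every mapping line begins with a ''<UXXXX>'' token. The function names ''ucm_contains'' and ''tables_containing'' are hypothetical helpers introduced for illustration.

```python
import re
from pathlib import Path

# Assumption: a mapping line starts with "<UXXXX>" (4 to 6 hex digits),
# followed by the encoded bytes and a precision flag, e.g. "<UXXXX> \xHH\xHH |0".
# Header lines such as "<code_set_name>" do not match this pattern.
_MAPPING = re.compile(r"^<U([0-9A-Fa-f]{4,6})>")


def ucm_contains(ucm_text, code_point):
    """Return True if some mapping line in the .ucm text starts with code_point."""
    for line in ucm_text.splitlines():
        match = _MAPPING.match(line.strip())
        if match and int(match.group(1), 16) == code_point:
            return True
    return False


def tables_containing(ucm_dir, code_point):
    """Return the sorted names of the .ucm files in ucm_dir that map code_point."""
    hits = []
    for path in sorted(Path(ucm_dir).glob("*.ucm")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if ucm_contains(text, code_point):
            hits.append(path.name)
    return hits
```

For example, ''tables_containing("charset/data/ucm", 0x4E1D)'' would return the file names that map "丝", with the path adjusted to wherever the repository is checked out. The result may differ from the figures above depending on the repository revision and on whether fallback mappings (precision flags other than ''|0'') are counted. This sketch counts every mapping line regardless of flag.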