8- . , . 8- . , : , , .

, , 8- . , HTML .

.

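The Windows NT family uses the two-byte UTF-16LE encoding for the internal representation of file names and other system strings. System calls that take string parameters exist in single-byte and double-byte variants. For details, see the article on Unicode in the Microsoft Windows family of operating systems.

UNIX-like operating systems, including GNU/Linux, BSD, and OS X, use UTF-8 to represent Unicode. Most programs can handle UTF-8 text just like traditional single-byte encodings, regardless of the fact that a character is represented by several consecutive bytes. To work with individual characters, strings are usually converted to UCS-4, so that each character corresponds to one machine word.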
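To make the variable width of UTF-8 concrete, here is a minimal Python sketch that compares the size of a single character in UTF-8 and in a fixed-width four-byte form. Python's `utf-32-le` codec stands in for UCS-4 here, and the helper name `encoded_sizes` and the sample code points are illustrative assumptions, not part of the text above.

```python
def encoded_sizes(ch: str) -> tuple:
    """Return (UTF-8 byte count, UTF-32 byte count) for one character."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-32-le"))


# Sample code points: ASCII letter, Cyrillic letter, euro sign, emoji.
for ch in ["A", "\u0416", "\u20ac", "\U0001F600"]:
    utf8_len, utf32_len = encoded_sizes(ch)
    print(f"U+{ord(ch):04X}: UTF-8 {utf8_len} byte(s), UTF-32 {utf32_len} bytes")
```

The UTF-8 size grows from one to four bytes as the code point grows, while the four-byte form stays constant, which is why it is convenient for per-character processing.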

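One of the first successful commercial implementations of Unicode was the Java programming environment. It abandoned the 8-bit representation of characters in favor of a 16-bit one. This decision increased memory consumption, but it brought an important abstraction back to programming: an arbitrary single character (the char type). In particular, a programmer could treat a string as a simple array. The success was not final, however: Unicode outgrew the 16-bit limit, and by J2SE 5.0 an arbitrary character again occupied a variable number of memory units, one char or two (see surrogate pair).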

, .

, .

Microsoft Windows

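Main article: Unicode in Microsoft Windows operating systems

Starting with Windows 2000, the Character Map utility (charmap.exe) displays the characters available in the OS and lets you copy them to the clipboard. It supports only the Basic Multilingual Plane (character codes U+0000..U+FFFF). Characters with codes from U+10000 upward are not displayed.

A similar table is available, for example, in Microsoft Word.

In some programs you can type a hexadecimal code and press Alt+X, and the code is replaced by the corresponding character, for example in WordPad and Microsoft Word. In these editors Alt+X also performs the reverse conversion.

In many MS Windows programs, to enter a Unicode character, hold down Alt and type the decimal value of the character code on the numeric keypad. For example, Alt+0171 («), Alt+0187 (») and Alt+0769 (combining acute accent, used as a stress mark) are useful when typing Cyrillic text. Alt+0133 (ellipsis) and Alt+0151 (em dash) are also worth knowing.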

Macintosh

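Mac OS 8.5 and later support an input method called Unicode Hex Input. While holding down Option, type the four-digit hexadecimal code of the required character. This method also allows entering characters with codes above U+FFFF by using surrogate pairs; the operating system automatically replaces such pairs with single characters. The input method must first be enabled in the system settings and then selected as the current input method in the keyboard menu.

Starting with Mac OS X 10.2, there is also the Character Palette application, which lets you pick characters from a table that can be filtered to a particular block or to the characters supported by a particular font.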

GNU/Linux

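GNOME also has a Character Map utility (formerly gucharmap), which displays the characters of a particular block or writing system and supports searching by character name or description. When the code of the required character is known, it can be entered according to ISO 14755: hold down Ctrl+⇧ Shift and type the hexadecimal code (in newer versions of GTK+, the code must be preceded by pressing U). The hexadecimal code may be up to 32 bits long, so any Unicode character can be entered without surrogate pairs.

All X Window applications, including GNOME and KDE, support input via the Compose key. On keyboards without a dedicated Compose key, any key can be assigned to this role, for example ⇪ Caps Lock.

The GNU/Linux console also allows entering a Unicode character by its code: type the decimal code on the numeric keypad while holding Alt. Characters can also be entered by hexadecimal code: hold AltGr and use the numeric keypad keys from NumLock to ↵ Enter (clockwise) for the digits A-F. Input according to ISO 14755 is supported as well. For these methods to work, Unicode mode must be enabled in the console with unicode_start(1) and a suitable font selected with setfont(8).

Mozilla Firefox for Linux supports character input according to ISO 14755.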

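In Unicode, the English "a" and the Polish "a" are the same character. In the same way, the Russian "а" and the Serbian "а" are treated as one character (though a different one from the Latin "a"). This encoding principle is not universal; apparently, a solution for all cases cannot exist at all.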

, , . .

. , , () (), ( CJK-), . , ( , . ). , .

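Conversion from lowercase to uppercase also depends on the language. For example, Turkish has the letters İi and Iı, so the Turkish case-conversion rules conflict with the English ones, which require "i" to be converted to "I". Similar problems exist in other languages; for example, in Canadian French, case conversion works slightly differently than in France[37].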
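The conflict is easy to observe with language-independent case mappings. The following Python sketch relies on the default `str.upper()` and `str.lower()` behaviour; the helper `turkish_upper` is a hypothetical illustration of language-specific tailoring, not a standard library function.

```python
# Default (language-independent) mappings built into Python strings.
print("i".upper() == "I")             # True: plain i becomes plain I
print("\u0131".upper() == "I")        # True: dotless i also becomes plain I
print("\u0130".lower() == "i\u0307")  # True: dotted capital I becomes i + combining dot


def turkish_upper(text: str) -> str:
    """Hypothetical helper: apply the two Turkish-specific mappings first."""
    return text.replace("i", "\u0130").replace("\u0131", "I").upper()


print(turkish_upper("diyarbak\u0131r") == "D\u0130YARBAKIR")  # True
```

Because both "i" and "ı" map to "I" by default, a round trip through uppercase loses the distinction that Turkish text depends on, which is why the language has to be known before case conversion.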

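Even Arabic numerals have typographic subtleties: digits can be "uppercase" or "lowercase", proportional or monospaced[38], and Unicode makes no distinction between them. Such nuances are left to the software.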

, .

, , , ( UTF-8 , ASCII, , ASCII[39]). , , ́ , [40]. ; , , , .

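Although Unicode support is implemented in the most widespread operating systems, not all application software handles it correctly yet. In particular, byte order marks (BOM) are not always processed, and diacritical characters are poorly supported. The problem is temporary and results from the relative novelty of the Unicode standards (compared with single-byte national encodings).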

( ) .

. , , , , , .

?

Unicode ( , , Unicode Consortium), , .

. - ( uni- -: , , , ) . , , , , uni- - (, . .), , , UNICEF United Nations International Childrens Emergency Fund .

. . MS Windows .

, Unicode . [1].

, Unicode, .

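UTF-16 (Unicode Transformation Format) is one of the ways of encoding Unicode characters as a sequence of 16-bit words. It can represent Unicode characters in the ranges U+0000..U+D7FF and U+E000..U+10FFFF (1 112 064 in total). Each character is written as one or two words (a surrogate pair).

UTF-16 is described in Annex Q of the international standard ISO/IEC 10646 and is also the subject of IETF RFC 2781, "UTF-16, an encoding of ISO 10646".

The first version of Unicode (1991) was a fixed-width 16-bit encoding; the total number of distinct characters was 2^16 (65 536). In the second version of Unicode (1996) it was decided to extend the code space considerably; UTF-16 was created to preserve compatibility with systems that had already implemented 16-bit Unicode. The range 0xD800..0xDFFF was set aside for surrogate pairs.

Since UTF-16 can represent 2^20 + 2^16 − 2048 (1 112 064) characters, this number was chosen as the new size of the Unicode code space.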

lead \ trail   DC00     ...   DFFE     DFFF
D800           010000   ...   0103FE   0103FF
D801           010400   ...   0107FE   0107FF
...
DBFF           10FC00   ...   10FFFE   10FFFF
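The corner cells of this table and the total count of 1 112 064 can be checked with a few lines of Python. This is a minimal sketch; the function name `pair_to_code` is an illustrative assumption.

```python
def pair_to_code(lead: int, trail: int) -> int:
    """Combine a leading (D800..DBFF) and trailing (DC00..DFFF) word."""
    return 0x10000 + ((lead - 0xD800) << 10) + (trail - 0xDC00)


# Corner cells of the table: first and last rows, first and last columns.
for lead in (0xD800, 0xD801, 0xDBFF):
    row = [pair_to_code(lead, trail) for trail in (0xDC00, 0xDFFE, 0xDFFF)]
    print(f"{lead:04X}: " + " ".join(f"{code:06X}" for code in row))

# 2^20 codes reachable through pairs, plus 2^16 single words,
# minus the 2048 word values reserved for the surrogates themselves.
print(2**20 + 2**16 - 2048)  # 1112064
```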

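In UTF-16, characters are encoded as two-byte words using the full range of values (0 to 0xFFFF). This makes it possible to encode Unicode characters in the ranges 0x0000..0xD7FF and 0xE000..0x10FFFF. The excluded range 0xD800..0xDFFF is used for the so-called surrogate pairs, that is, characters encoded with two 16-bit words.

Unicode characters up to and including 0xFFFF (excluding the surrogate range) are written as is, in a single 16-bit word.

Characters in the range 0x10000..0x10FFFF (wider than 16 bits) are encoded as follows:

0x10000 is subtracted from the character code. The result is a value from zero to 0xFFFFF, which fits in 20 bits.

The upper 10 bits (a number in the range 0x0000..0x03FF) are added to 0xD800, and the result goes into the leading (first) word, which falls in the range 0xD800..0xDBFF.

The lower 10 bits (also a number in the range 0x0000..0x03FF) are added to 0xDC00, and the result goes into the trailing (second) word, which falls in the range 0xDC00..0xDFFF.

In both words the upper 6 bits mark the word as a surrogate. Bits 11 to 15 (counting from zero) hold the binary value 11011, and bit 10 is 0 in the leading word and 1 in the trailing word. This makes it easy to tell which kind of word is which.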
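The steps above can be sketched in Python as follows. This is an illustrative sketch rather than a reference implementation; the function name `encode_utf16_units` and the choice to raise `ValueError` for invalid input are assumptions.

```python
from typing import List


def encode_utf16_units(code: int) -> List[int]:
    """Return the one or two 16-bit words that encode a code point."""
    if not 0 <= code <= 0x10FFFF or 0xD800 <= code <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {code:#x}")
    if code < 0x10000:
        return [code]                    # written as is, in a single word
    offset = code - 0x10000              # 20-bit value, 0..0xFFFFF
    lead = 0xD800 + (offset >> 10)       # upper 10 bits
    trail = 0xDC00 + (offset & 0x3FF)    # lower 10 bits
    return [lead, trail]


units = encode_utf16_units(0x1F600)
print([f"{u:04X}" for u in units])       # ['D83D', 'DE00']
print([f"{u >> 10:06b}" for u in units]) # ['110110', '110111']
```

The second line of output shows the six marker bits: 11011 followed by 0 for the leading word and by 1 for the trailing word.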

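A single UTF-16 character is represented by a sequence of two bytes or two pairs of bytes. Which of the two bytes in a word comes first, the high or the low one, depends on the byte order. Systems compatible with x86 processors are called little endian, and those with m68k and SPARC processors big endian.

The byte order mark (BOM) is used to determine the byte order. The code U+FEFF is written at the beginning of the text. If U+FFFE is read instead of U+FEFF, the byte order is reversed, because Unicode has no character with the code U+FFFE. Since UTF-8 never uses the byte values 0xFE and 0xFF, the byte order mark can also serve as a way to tell UTF-16 from UTF-8.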
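A minimal Python sketch of this check is shown below. The function name `detect_utf16_bom` is an assumption, and the sketch deliberately ignores other encodings (for instance, the bytes FF FE also begin the UTF-32 little-endian mark).

```python
import codecs
from typing import Optional


def detect_utf16_bom(data: bytes) -> Optional[str]:
    """Guess the UTF-16 byte order from the first two bytes, if marked."""
    if data.startswith(codecs.BOM_UTF16_BE):   # bytes FE FF
        return "utf-16-be"
    if data.startswith(codecs.BOM_UTF16_LE):   # bytes FF FE
        return "utf-16-le"
    return None                                # no mark: order is unknown


sample = codecs.BOM_UTF16_BE + "hi".encode("utf-16-be")
order = detect_utf16_bom(sample)
print(order)                        # utf-16-be
print(sample[2:].decode(order))     # hi
```

Valid UTF-8 text can never trigger either branch, because the bytes 0xFE and 0xFF do not occur in UTF-8.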

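UTF-16LE and UTF-16BE

The byte order can also be specified externally: in that case the encoding must be declared as UTF-16LE or UTF-16BE (little-endian / big-endian) rather than just UTF-16. The byte order mark (U+FEFF) is then not needed.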
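Python's built-in codecs follow the same convention, which makes the difference easy to see. In this sketch the helper name `encode_variants` and the sample string are illustrative assumptions; the codec names `utf-16`, `utf-16-le` and `utf-16-be` are the standard ones.

```python
def encode_variants(text: str) -> dict:
    """Encode text with the three UTF-16 codec flavours."""
    return {name: text.encode(name) for name in ("utf-16-le", "utf-16-be", "utf-16")}


# "A" followed by U+10400, which needs the surrogate pair D801 DC00.
variants = encode_variants("A\U00010400")
print(variants["utf-16-le"].hex())  # 410001d800dc
print(variants["utf-16-be"].hex())  # 0041d801dc00
# The plain "utf-16" codec prepends a BOM in the platform's native order.
print(variants["utf-16"][:2].hex())
```

With the explicit LE and BE codecs no mark is written, and the reader has to be told the byte order by some other means.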

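UTF-16 in Windows

Main article: Unicode in Microsoft operating systems

The Win32 API, used in modern versions of the Microsoft Windows operating system, has two ways of representing text: traditional 8-bit code pages and UTF-16.

When UTF-16 is used, Windows places no restrictions on how applications encode text files, allowing both UTF-16LE and UTF-16BE by writing and interpreting the corresponding byte order mark. The internal format of Windows, however, is always UTF-16LE. This should be kept in mind when working with executable files that use the Unicode versions of WinAPI functions. Strings in them are always encoded in UTF-16LE[1].

In the NTFS file system, as well as in FAT with long file name support, file names are also stored in UTF-16LE.

The examples below are written in pseudocode and do not take the byte order mark into account; they only show the essence of the encoding. The byte order is from low to high (Little-Endian, Intel x86). The type Word is a two-byte word (16-bit unsigned integer), and the type UInt32 is a 32-bit unsigned integer. Hexadecimal values start with the dollar sign "$".

In the example, WriteWord() is a notional procedure that writes one word (advancing an internal pointer). The function LoWord() returns the low word of a 32-bit integer (the high bits are simply discarded).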

// Valid values of Code: $0000..$D7FF, $E000..$10FFFF.
Procedure WriteUTF16Char(Code: UInt32)
    If (Code < $10000) Then
        WriteWord(LoWord(Code))
    Else
        Code = Code - $10000
        Var Lo10: Word = LoWord(Code And $3FF)
        Var Hi10: Word = LoWord(Code Shr 10)
        WriteWord($D800 Or Hi10)
        WriteWord($DC00 Or Lo10)
    End If
End Procedure

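In the example, ReadWord() reads a word from the stream (advancing an internal pointer). It can also correct the byte order if necessary. The function WordToUInt32 widens a two-byte word to a four-byte unsigned integer, filling the high bits with zeros. Error() aborts execution (in essence, an exception).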

// On success, returns values
// in the ranges $0000..$D7FF and $E000..$10FFFF.
Function ReadUTF16Char: UInt32
    Var Leading: Word  // Leading (first) word.
    Var Trailing: Word // Trailing (second) word.

    Leading = ReadWord();
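As a hedged illustration of the reading direction, the following Python sketch decodes a sequence of 16-bit words into code points. It is not a transcription of the pseudocode: the name `decode_utf16_units`, the generator interface, and the use of `ValueError` in place of Error() are all assumptions made for this example.

```python
from typing import Iterable, Iterator


def decode_utf16_units(units: Iterable[int]) -> Iterator[int]:
    """Yield code points decoded from a stream of 16-bit words."""
    stream = iter(units)
    for leading in stream:
        if leading < 0xD800 or leading > 0xDFFF:
            yield leading                     # ordinary single-word character
        elif leading >= 0xDC00:
            raise ValueError("trailing surrogate without a leading one")
        else:
            trailing = next(stream, None)     # second word of the pair
            if trailing is None or not 0xDC00 <= trailing <= 0xDFFF:
                raise ValueError("leading surrogate without a trailing one")
            yield 0x10000 + ((leading & 0x3FF) << 10) + (trailing & 0x3FF)


print([f"{c:X}" for c in decode_utf16_units([0x41, 0xD83D, 0xDE00])])
# ['41', '1F600']
```

On success the yielded values fall in the ranges 0x0000..0xD7FF and 0xE000..0x10FFFF, matching the ranges given for the encoding.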





: 2016-11-24; !; : 1107 |

