Charset formatter.

Charset factories support the following options:

  • :charset - charset type; use charset-formatter to set this option

  • :replacement - replacement string

  • :malformed-mode - see charset-formatter for available options

  • :unmappable-mode - see charset-formatter for available options

(ns foo.baz
  (:api dunaj))

(vec (print utf-8 "hello world \u03BB"))
;;=> [104 101 108 108 111 32 119 111 114 108 100 32 -50 -69]

(def v [-16 -99 -109 -105 -16 -99 -109 -82 -16 -99 -109 -75 -16 -99 -109 -75 -16 -99 -109 -72 32 -16 -99 -108 -128 -16 -99 -109 -72 -16 -99 -109 -69 -16 -99 -109 -75 -16 -99 -109 -83])
;;=> #'foo.baz/v

(str (parse utf-8 v))
;;=> "𝓗𝓮𝓵𝓵𝓸 𝔀𝓸𝓻𝓵𝓭"

(vec (print utf-16 (parse utf-8 v)))
;;=> [-2 -1 -40 53 -36 -41 -40 53 -36 -18 -40 53 -36 -11 -40 53 -36 -11 -40 53 -36 -8 0 32 -40 53 -35 0 -40 53 -36 -8 -40 53 -36 -5 -40 53 -36 -11 -40 53 -36 -19]

;; write utf-8 text to file
(with-scope
  (write! "out.txt" (print utf-8 "lorem ipsum dolor sit amet")))
;;=> 26

;; ad-hoc charset with custom replacement
(str (parse (assoc (charset-formatter "ASCII") :replacement "!")
            [104 101 108 108 111 32 119 111 114 108 100 -20]))
;;=> "hello world!"

charset-formatter

Available since version 1.0

Usage:
  • (charset-formatter charset & {:as opts})

Type signature:
  • (String+ ⨯ Any) → (I IParserFactory IPrinterFactory)

Returns a charset formatter for the given charset string.

The following additional options may be supplied:

  • :replacement - nil (default, uses the charset's default replacement) or a string.

  • :malformed-mode - :ignore, :replace (default), :report.

  • :unmappable-mode - :ignore, :replace (default), :report.
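
The three modes can be illustrated with plain Clojure and the host's java.nio.charset API. This is only a sketch: it assumes that :ignore, :replace and :report behave like the JVM's CodingErrorAction constants IGNORE, REPLACE and REPORT, which this page does not confirm. The names decode-with and sample are hypothetical and are not part of Dunaj.

```clojure
;; Plain Clojure on the JVM, not Dunaj API.
(import '(java.nio ByteBuffer)
        '(java.nio.charset Charset CodingErrorAction CharacterCodingException))

;; "hello world" followed by one byte (-20) that is not valid ASCII
(def sample [104 101 108 108 111 32 119 111 114 108 100 -20])

(defn decode-with
  "Hypothetical helper: decodes a seq of byte values with the named charset,
  applying `action` to both malformed input and unmappable characters."
  [charset-name action replacement byte-vals]
  (let [decoder (-> (Charset/forName charset-name)
                    (.newDecoder)
                    (.onMalformedInput action)
                    (.onUnmappableCharacter action)
                    (.replaceWith replacement))]
    (str (.decode decoder (ByteBuffer/wrap (byte-array (map byte byte-vals)))))))

;; REPLACE substitutes the replacement string for the bad byte
(decode-with "US-ASCII" CodingErrorAction/REPLACE "!" sample)
;;=> "hello world!"

;; IGNORE silently drops the bad byte
(decode-with "US-ASCII" CodingErrorAction/IGNORE "!" sample)
;;=> "hello world"

;; REPORT throws instead of returning a string
(try
  (decode-with "US-ASCII" CodingErrorAction/REPORT "!" sample)
  (catch CharacterCodingException e :reported))
;;=> :reported
```

If the assumption holds, :report is the mode to choose when silently altered text is worse than a failed parse.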

default-charset

Available since version 1.0

VAR of type (I IParserFactory IPrinterFactory)

The default charset formatter factory, as specified by the host.
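
To see which charset a JVM host would report as its default, a sketch in plain Clojure follows. It assumes the host default mentioned above corresponds to java.nio.charset.Charset/defaultCharset, which this page does not state. The name host-default is hypothetical.

```clojure
;; Plain Clojure on the JVM, not Dunaj API.
(import '(java.nio.charset Charset))

;; The JVM's default charset; the value depends on the host configuration.
(def host-default (Charset/defaultCharset))

;; Canonical name of that charset, for example "UTF-8" on many systems.
(.name host-default)
```

Because the result varies between machines, naming a charset explicitly is usually safer for data that leaves the host.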

utf-16

Available since version 1.0

VAR of type (I IParserFactory IPrinterFactory)

UTF-16 charset formatter factory. It honors a byte-order mark when decoding, defaults to big endian when no mark is present, and encodes in big endian.
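
The byte-order behavior can be observed with the JVM's own "UTF-16" charset. This sketch uses plain Clojure interop and assumes the utf-16 factory delegates to that host charset, which this page does not confirm. The helper utf16-decode is hypothetical.

```clojure
;; Plain Clojure on the JVM, not Dunaj API.
(import '(java.nio.charset StandardCharsets))

(defn utf16-decode
  "Hypothetical helper: decodes a seq of byte values as UTF-16."
  [byte-vals]
  (String. (byte-array (map byte byte-vals)) StandardCharsets/UTF_16))

;; Encoding writes a big-endian byte-order mark (FE FF), then big-endian units
(vec (.getBytes "hi" StandardCharsets/UTF_16))
;;=> [-2 -1 0 104 0 105]

;; A little-endian mark (FF FE) is honored when decoding
(utf16-decode [-1 -2 104 0 105 0])
;;=> "hi"

;; Without a mark, input is read as big endian
(utf16-decode [0 104 0 105])
;;=> "hi"
```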

utf-8

Available since version 1.0

VAR of type (I IParserFactory IPrinterFactory)

UTF-8 charset formatter factory.