This uses macro magic, compound literals, and _Generic to take
printf() to the next level: type-safe printing, printing into compound
literal char arrays, easy UTF-8, -16, and -32, with good error
handling.
The goal is to be safe by removing the need for function varargs.
The usual C printf formatting syntax is used, with some restrictions
and quite a few extensions.
This lets you mix UTF-8, -16, and -32 strings seamlessly in input and
output strings, without manual string format conversions, and without
using different format specifiers or print function names.
This liberates you from thinking about %u vs. %lu vs. %llu vs. %zu,
even in portable code with different integer types: the compiler
chooses the right function to call for your parameter, and they all
print fine with %u. (Even strings will print fine with %u.)
‘Type-safe’ in this context does not mean that you get more compile
errors, but that the format specifier does not need to specify the
argument type and just defines the print format. In fact, format
strings with this library can have less compile-time checking (namely
none) than with modern compilers for standard printf. This approach
is still safer: with this library, you just cannot pass the wrong
size parameter and crash.
Compatibility
This library requires at least a C11 compiler (for _Generic,
char16_t, char32_t), and it uses a few gcc extensions that are
also understood by Clang and some other compilers (({...}),
,##__VA_ARGS__, __typeof__, __attribute__).
Synopsis
In the following, Char may be char, char16_t, or char32_t:
#include
va_stream_file_t VA_STREAM_FILE(FILE *f);

#include
Char va_snprintf(Char *s, size_t n, Char const *format, …);
Char va_szprintf(Char s[], Char const *format, …);
char va_nprintf(size_t n, Char const *format, …);
char16_t va_unprintf(size_t n, Char const *format, …);
char32_t va_Unprintf(size_t n, Char const *format, …);
va_stream_charp_t VA_STREAM_CHARP(Char const *s, size_t n);

#include
char va_mprintf(void *(*alloc)(void *, size_t, size_t), Char const *, …)
char16_t va_umprintf(void *(*alloc)(void *, size_t, size_t), Char const *, …)
char32_t va_Umprintf(void *(*alloc)(void *, size_t, size_t), Char const *, …)
va_stream_vec_t VA_STREAM_VEC(void *(*alloc)(void *, size_t, size_t));
va_stream_vec16_t VA_STREAM_VEC16(void *(*alloc)(void *, size_t, size_t));
va_stream_vec32_t VA_STREAM_VEC32(void *(*alloc)(void *, size_t, size_t));

#include
size_t va_lprintf(Char const *format, …);
va_stream_len_t VA_STREAM_LEN();

#include
va_stream_…t va_xprintf(va_stream_…t *s, Char const *format, …)
void va_iprintf(va_stream_…t *s, Char const *format, …);
void va_pprintf(va_stream_vtab_t *v, Char const *format, …);

#include
typedef struct { … } va_stream_t;
typedef struct { … } va_stream_vtab_t;
typedef struct { unsigned code; } va_error_t
#define VA_E_OK
#define VA_E_NULL
#define VA_E_DECODE
#define VA_E_ENCODE
#define VA_E_TRUNC
va_stream_t VA_STREAM(va_stream_vtab_t const *vtab)
Description
This library provides a type-safe printing mechanism to print
any kind of string of base type char, char16_t, or char32_t,
or any integer or pointer into a new string, an array, or a file.
The library also provides functions for user-defined output streams
that can print into any other kind of stream.
The arguments to the formatted print are passed into a _Generic()
macro instead of ‘...’ and the resulting function call is thus
type-safe based on the actual argument type, and cannot crash due to a
wrong format specifier.
The format specifiers in this printing mechanism serve to define which
output format should be used, as they are not needed for type
information. The format specifier “%v” can be used as a generic
‘default’ output format.
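The dispatch idea described above can be sketched in plain C11. This is a minimal illustration, not the library's implementation: the names fmt_signed, fmt_unsigned, fmt_string, and FMT_ANY are hypothetical, and standard snprintf is used for the actual formatting.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-type formatters (not part of the library).
 * Each writes into buf and returns buf. */
static char *fmt_signed(char *buf, size_t n, long long v)
{
    assert(buf != NULL && n > 0);
    snprintf(buf, n, "%lld", v);
    return buf;
}

static char *fmt_unsigned(char *buf, size_t n, unsigned long long v)
{
    assert(buf != NULL && n > 0);
    snprintf(buf, n, "%llu", v);
    return buf;
}

static char *fmt_string(char *buf, size_t n, char const *s)
{
    assert(buf != NULL && n > 0);
    snprintf(buf, n, "%s", s ? s : "");
    return buf;
}

/* The compiler picks the formatter from the static type of x, so the
 * caller never has to name the integer width in a format string. */
#define FMT_ANY(buf, n, x) _Generic((x),        \
    int:                fmt_signed,              \
    long:               fmt_signed,              \
    long long:          fmt_signed,              \
    unsigned:           fmt_unsigned,            \
    unsigned long:      fmt_unsigned,            \
    unsigned long long: fmt_unsigned,            \
    char *:             fmt_string,              \
    char const *:       fmt_string)((buf), (n), (x))
```

With a local char buf[32], FMT_ANY(buf, sizeof buf, (size_t)42) and FMT_ANY(buf, sizeof buf, -5) both compile to the matching formatter without any length modifier at the call site.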
Format Specifiers
Like in C, a format specifier begins with ‘%’ followed by:
- a list of flag characters
- a width specifier
- a precision specifier
- a list of integer mask and quotation specifiers
- a conversion letter
The following flags are recognised:
- # prints in alternative form. For numeric formats, a prefix that
  designates the base is put in front of the value, except for 0:
  - for o and base 8, 0 is prefixed,
  - for b and base 2, 0b is prefixed,
  - for B and base 2, 0B is prefixed,
  - for x and base 16, 0x is prefixed,
  - for X and base 16, 0X is prefixed,
  - for e and base 32, 0e is prefixed,
  - for E and base 32, 0E is prefixed.
  For quoted strings, this inhibits printing of delimiting quotes.
- 0 pads numerics with zero 0 on the left instead of with a space
  character. If a precision is given, this flag is ignored.
  For C and JSON quotation, this selects to quote non-US-ASCII
  characters using \u and \U instead of printing them in
  output encoding.
- - selects to left flush instead of the default right flush.
- ' ' (a space character) selects that a space is printed
  in front of positive signed integers.
- + selects that a + is printed in front of positive signed
  integers.
- = specifies that the last value is printed again using this
  new format specifier. This is a meager replacement for the $
  position specifiers that are not implemented in this library.
A width is either a decimal integer, or a *. The * selects
that the width is taken from the next function parameter. If fewer
code points result from the conversion, the output is padded with
white space up to the width. A negative width is interpreted as
a - flag followed by a positive width.
A precision is specified by a . (period) followed by either a
decimal integer or a *. The * selects that the precision is taken
from the next function parameter. If the precision is just .,
it is interpreted as zero. The precision defines the minimum number
of digits in numeric conversions. For strings, this is the maximum
number of raw code units read from the input string (not the number
of converted code points, but the low-level number of elements
in the string, so that non-NUL terminated arrays can be printed
with their size passed as precision, even with multi-byte/multi-word
encodings stored inside). The input decoder will not read incomplete
encodings at the end of limited strings, but will stop before. If
a pointer to a string pointer is passed, then the
pointer will be updated so that it points to the next character, i.e.,
the one after the last one that was read.
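The rule that a limited read stops before an incomplete encoding can be sketched for UTF-8 as follows. This is only an illustration under simplifying assumptions: the function names are hypothetical, only the lead byte and the continuation bytes are inspected, and full UTF-8 validation (overlong forms, surrogates) is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Length of a UTF-8 sequence judged by its lead byte.  Stray bytes
 * count as one unit so that the scan always advances. */
static size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if (lead >= 0xC2 && lead <= 0xDF) return 2;
    if (lead >= 0xE0 && lead <= 0xEF) return 3;
    if (lead >= 0xF0 && lead <= 0xF4) return 4;
    return 1;   /* continuation byte or invalid lead byte */
}

/* Return how many of the first `limit` code units of s can be read
 * without cutting a multi-byte sequence in half.  Reading also stops
 * at a NUL terminator. */
static size_t utf8_readable_units(char const *s, size_t limit)
{
    size_t i = 0;
    assert(s != NULL);
    while (i < limit && s[i] != '\0') {
        size_t len = utf8_seq_len((unsigned char)s[i]);
        if (len > limit - i) {
            break;                  /* sequence would cross the limit */
        }
        for (size_t k = 1; k < len; k++) {
            if (((unsigned char)s[i + k] & 0xC0) != 0x80) {
                len = 1;            /* broken sequence: take one unit */
                break;
            }
        }
        i += len;
    }
    return i;
}
```

A caller holding a pointer to the string can advance it by the returned count, which mirrors the pointer update described above.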
The following integer mask and quotation specifiers are recognised:
- h applies the mask 0xffff to an integer, then zero extends unsigned
  values, or sign extends signed values.
  E.g., va_printf("%#hx",0xabcdef) prints -0x3211.
- hh applies the mask 0xff to an integer, then zero extends unsigned
  values, or sign extends signed values.
  E.g., va_printf("%hhX",0xabcdU) prints CD.
- z reinterprets a signed integer as unsigned (mnemonic: zero
  extension). z is implicit in formats u (and U).
  E.g., va_printf("%hhu", -1) prints 255.
- q selects C quotation for strings and char format. There is a
  separate section below to explain this.
- Q selects JSON quotation for strings and char format. There is a
  separate section below to explain this.
- k selects Bourne or Korn shell quotation. There is a
  separate section below to explain this.
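The mask-then-extend arithmetic of h, hh, and z can be reproduced with ordinary integer operations. The helper names below are hypothetical and the sketch only mirrors the arithmetic of the examples above, not the library's code.

```c
#include <assert.h>
#include <stdint.h>

/* Mask a signed value to `bits` bits (8 or 16), then sign extend.
 * (low ^ sign) - sign avoids implementation-defined narrowing casts. */
static int64_t mask_signed(int64_t v, unsigned bits)
{
    assert(bits == 8 || bits == 16);
    uint64_t mask = (UINT64_C(1) << bits) - 1;
    uint64_t sign = UINT64_C(1) << (bits - 1);
    uint64_t low  = (uint64_t)v & mask;
    return (int64_t)(low ^ sign) - (int64_t)sign;
}

/* Mask an unsigned value to `bits` bits; the upper bits become zero. */
static uint64_t mask_unsigned(uint64_t v, unsigned bits)
{
    assert(bits == 8 || bits == 16);
    return v & ((UINT64_C(1) << bits) - 1);
}
```

Here mask_signed(0xabcdef, 16) yields -0x3211, because 0xcdef has its 16-bit sign bit set, and mask_unsigned applied to the bit pattern of -1 with 8 bits yields 255, matching the %#hx and %hhu examples.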
Note that most of the usual length specifiers (l, ll, etc.) known
from C make no sense and are not recognised (nor ignored), because
type casting control in varargs is not needed here due to the
type-safety.
The following conversion letters are recognised:
- v prints anything in default notation (mnemonic: ‘value’).
  There are a lot of other unassigned letters that print in default
  notation. v is not used by standard C printf and is not going to
  be assigned any special notation.
- o selects octal integer notation for numeric printing (including
  pointers).
- d or i selects decimal integer notation for numeric
  printing (including pointers).
- u is equivalent to zd, i.e., prints a signed integer as
  unsigned in decimal notation.
- x or X selects hexadecimal integer notation for numeric
  printing (including pointers). x uses lower case digits,
  X upper case. Note that this also prints signed numbers with
  a - if appropriate: va_printf("%#x", -5) prints -0x5.
  There is the z flag to print signed integers as unsigned.
- b or B selects binary integer notation for numeric
  printing (including pointers). b uses lower case prefix,
  B uses upper case. The difference is only visible
  with the # flag.
- e or E selects Base32 notation using the digits
  ‘a’..‘z’,‘2’..‘7’. e uses lower case digits and prefix,
  E uses upper case.
- p prints like #x, and for any pointer, including strings,
  prints the pointer value instead of the contents. Note that
  it also prints signed numbers: va_printf("%p",-5) prints -0x5.
- P is just like p, but with upper case hexadecimal digits.
- c prints integers (but not pointers) as characters, like a
  one-element string. Note that the NUL character is not printed,
  but behaves like an empty string. For string quotation where
  hexadecimals are printed, this uses lower case characters.
- C is just like c, but in string quotation when hexadecimals
  are printed, uses upper case characters.
- t prints the argument type before modifications, in C
  syntax: int8_t..int64_t, uint8_t..uint64_t, char*,
  char16_t*, char32_t*, void*. Note that va_error_t*
  arguments never print, and never consume a % format, but
  always just return the stream error.
- any letter not mentioned above or any combination of letter and
  type not mentioned above prints in default notation. If the letter
  is upper case, it uses upper case letters where appropriate.
Function parameters behind the last format specifier in the format
string are printed in default notation after everything that is
printed in the format string.
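Printing a signed value as sign, base prefix, and magnitude (so that -5 in alternative hexadecimal form becomes -0x5) can be sketched as below. The function format_int is hypothetical, handles lower case only, and takes its base 32 alphabet from the description of e above; how the library itself prints the value 0 in base 32 is not covered by this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Format v in base 2, 8, 16, or 32 as: sign, prefix, digits.
 * Returns the number of characters written (without the NUL),
 * or 0 if the result does not fit into out[n]. */
static size_t format_int(char *out, size_t n, long long v,
                         unsigned base, int alt)
{
    static char const digits16[] = "0123456789abcdef";
    static char const digits32[] = "abcdefghijklmnopqrstuvwxyz234567";
    char const *digits = (base == 32) ? digits32 : digits16;
    char const *prefix = "";
    char tmp[70];                       /* enough for 64 binary digits */
    size_t len = 0, pos = 0;
    /* Negate in unsigned arithmetic so that LLONG_MIN is handled. */
    unsigned long long mag = (v < 0) ? 0ULL - (unsigned long long)v
                                     : (unsigned long long)v;

    assert(out != NULL);
    assert(base == 2 || base == 8 || base == 16 || base == 32);

    if (alt && v != 0) {                /* no prefix for the value 0 */
        prefix = (base == 2) ? "0b" : (base == 8) ? "0"
               : (base == 16) ? "0x" : "0e";
    }
    do {                                /* digits in reverse order */
        tmp[len++] = digits[mag % base];
        mag /= base;
    } while (mag != 0);

    if (len + strlen(prefix) + 2 > n) { /* sign + NUL need 2 more */
        return 0;
    }
    if (v < 0) {
        out[pos++] = '-';
    }
    for (char const *p = prefix; *p != '\0'; p++) {
        out[pos++] = *p;
    }
    while (len > 0) {
        out[pos++] = tmp[--len];
    }
    out[pos] = '\0';
    return pos;
}
```

Keeping the sign separate from the magnitude is what distinguishes this from standard printf, which would print -5 with %x as a large unsigned value.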
Function Parameters
The following function parameter types are recognised:
- int, unsigned, char, signed char, unsigned char, short,
  unsigned short, long, unsigned long, long long,
  unsigned long long: these are integers and are printed
  in unsigned or signed decimal integer notation by default.
  This means that char, char16_t, and char32_t all print in
  numeric format by default, not in character format, as they are not
  distinct types. For interpreting them as a 1-element Unicode
  codepoint string, c format should be used.
  Also note that character constants like 'a' have type int in C
  and print numerically by default.
- char *, char const *: 8-bit character strings or
  arrays. They print as is by default.
  The default string encoding is UTF-8. It can be reset to
  a user encoding by #defining va_char_p_decode. Also see
  the section on encoding below.
  Unquoted, NULL prints empty and sets the VA_E_NULL error.
  Also see the section on quotation below.
- char16_t *, char16_t const *: 16-bit character strings or
  arrays. The default encoding is UTF-16, which can be
  switched using va_char16_p_decode.
  Unquoted, NULL prints empty and sets the VA_E_NULL error.
- char32_t *, char32_t const *: 32-bit character strings or
  arrays. The default encoding is UTF-32, which can be
  switched using va_char32_p_decode.
  Unquoted, NULL prints empty and sets the VA_E_NULL error.
- Char **, Char const **: pointers to pointers to
  characters, i.e., pointers to string, will print the string
  and then update the pointer to point to the code
  unit just behind the last one that was read from the
  string. With no precision given in the format, they will
  point to the terminating NUL character. When these
  parameters are printed multiple times using the = flag,
  the string will be reset each time and the updated value
  will correspond to the end position during the last print
  of the string.
- va_error_t*: this retrieves the error code from the
  stream and writes it into the passed struct. This can
  be used to check for encoding or decoding errors, out
  of memory conditions, or hitting the end of the output
  array. NULL must not be passed as a pointer.
- va_read_iter_t*: this is an internal type to read from
  strings. There are a few constraints on how to
  define a valid va_read_iter_t, which are not all
  documented here.
- anything else: is tried to be converted to a pointer and
  printed in hexadecimal encoding by default, i.e., in %x
  format.
Unicode
Internally, this library uses 32-bit codepoints with 24-bit payload
and 8-bit tags for processing strings, and by default, the payload
representation is Unicode. The library tries not to interpret the
payload data unless necessary, so that other encodings could in
principle be used and passed through the library.
The only place the core library uses Unicode interpretation is when
quoting C or JSON strings for codepoints >0x80 (e.g., when formatting
with %0qs), and if a decoding error is encountered or if the value
is not valid Unicode, then it uses \ufffd to indicate this, because the
quotation using \u or \U would otherwise be a lie.
The internal representation allows any value inside 24 bits to be used
for codepoints. 0 is interpreted as ‘end of string’ and is never
printed into the output stream.
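A 32-bit unit with a 24-bit payload and an 8-bit tag can be packed and unpacked with shifts and masks. The layout below (tag in the top 8 bits) and the cp_ names are assumptions made for illustration; the text above does not state which bits the library actually uses for the tag.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: tag in bits 24..31, payload in bits 0..23.
 * The library's real layout may differ. */
#define CP_PAYLOAD_MASK UINT32_C(0x00FFFFFF)

static uint32_t cp_pack(uint8_t tag, uint32_t payload)
{
    assert(payload <= CP_PAYLOAD_MASK);   /* payload must fit 24 bits */
    return ((uint32_t)tag << 24) | payload;
}

static uint32_t cp_payload(uint32_t cp)
{
    return cp & CP_PAYLOAD_MASK;
}

static uint8_t cp_tag(uint32_t cp)
{
    return (uint8_t)(cp >> 24);
}
```

Since all of Unicode (up to 0x10FFFF) fits into 21 bits, a 24-bit payload leaves room for values outside Unicode, which is what allows other encodings to pass through untouched.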
UTF-8, -16, and -32 encoders and decoders check that the Unicode
constraints are met, like excluding anything above 0x10FFFF and high
and low UTF-16 surrogates, and detecting decoding errors according to
the Unicode recommendations and best practices. The encoder/decoder
pairs usually try to pass through faulty sequences as is, if possible,
e.g., reading ISO-8859-1 data from a UTF-8 %s and printing it into
a UTF-8 output stream preserves the original ISO-8859-1 byte
sequence, although the intermediate steps do raise ‘illegal sequence’
errors.
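The two constraints named above (nothing above 0x10FFFF, no surrogates) amount to a short range check in front of an encoder. The following is a generic UTF-8 encoder sketch with hypothetical names; it rejects invalid values outright and does not model the library's pass-through of faulty sequences.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True for Unicode scalar values: at most 0x10FFFF, no surrogates. */
static bool is_unicode_scalar(uint32_t c)
{
    return c <= 0x10FFFF && !(c >= 0xD800 && c <= 0xDFFF);
}

/* Encode one scalar value as UTF-8.  Returns the number of bytes
 * written to out, or 0 if c is not a valid scalar value. */
static size_t utf8_encode(uint32_t c, unsigned char out[4])
{
    assert(out != NULL);
    if (!is_unicode_scalar(c)) {
        return 0;
    }
    if (c < 0x80) {
        out[0] = (unsigned char)c;
        return 1;
    }
    if (c < 0x800) {
        out[0] = (unsigned char)(0xC0 | (c >> 6));
        out[1] = (unsigned char)(0x80 | (c & 0x3F));
        return 2;
    }
    if (c < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (c >> 12));
        out[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (c & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (c >> 18));
    out[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (c & 0x3F));
    return 4;
}
```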
Integers print without Unicode checks, i.e., if an integer is printed
as a character using %c, then the lower 24 bits are passed down to
the output stream encoder as is. If integers larger than 0xffffff are
tried to be printed with %c, this results in a decoding error, and
only the lower 24 bits are used.
Encodings
The library supports different string encodings for the format string,
for input strings, and for output streams. The defaults are UTF-8,
UTF-16, or UTF-32. This can be switched by setting the following
#defines before including headers of this library, i.e., it cannot be
switched dynamically out of the box, because this would mean that all
the encoding modules would always be linked. Dynamic switching can
be added by defining a new encoding that internally switches dynamically.
The following #defines switch function names:
Format String Encoding
The default is UTF-8, -16, or -32 encoding, and it can be changed
by #defining the following before the #include:
#define va_char_p_format utf8
#define va_char16_p_format utf16
#define va_char32_p_format utf32
These macros are appended to an identifier to get the appropriate
reader for the format string as follows:
va_char_p_read_vtab ## va_char_p_format
va_char16_p_read_vtab ## va_char16_p_format
va_char32_p_read_vtab ## va_char32_p_format
When using a different encoding than the default, it must also be ensured
that the corresponding vtab declarations are visible.
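Selecting an identifier by appending a user-defined suffix relies on the preprocessor's ## operator with one extra level of macro expansion. The sketch below shows the general technique with hypothetical names (reader_name_utf8, my_char_p_format, and so on); it is not taken from the library's headers.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical readers standing in for per-encoding vtab objects. */
static char const *reader_name_utf8(void)   { return "utf8"; }
static char const *reader_name_latin1(void) { return "latin1"; }

/* User-selectable suffix; a user would #define it before the #include. */
#ifndef my_char_p_format
#define my_char_p_format utf8
#endif

/* Two macro levels are needed: ## pastes its operands unexpanded, so
 * the outer macro expands my_char_p_format before the inner one pastes. */
#define MY_CONCAT_(a, b) a ## b
#define MY_CONCAT(a, b)  MY_CONCAT_(a, b)
#define my_reader_name   MY_CONCAT(reader_name_, my_char_p_format)

/* Resolves at compile time to reader_name_utf8() with the default. */
static char const *selected_reader(void)
{
    char const *name = my_reader_name();
    assert(name != NULL);
    return name;
}
```

Because the selection happens in the preprocessor, only the chosen reader is referenced, which fits the remark above that unused encoding modules need not be linked.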
String Value Encoding
The default for reading string values is UTF-8, -16, or -32 encoding,
for "...", u"...", and U"..." strings, respectively. The default can be
changed by defining one of the following macros before the #include:
#define va_char_p_decode utf8
#define va_char16_p_decode utf16
#define va_char32_p_decode utf32
These macros are appended to an identifier to get the appropriate
reader for the string value as follows:
va_xprintf_char_p_ ## va_char_p_decode
va_xprintf_char_pp_ ## va_char_p_decode
va_xprintf_char_const_pp_ ## va_char_p_decode
va_xprintf_char16_p_ ## va_char16_p_decode
va_xprintf_char16_pp_ ## va_char16_p_decode
va_xprintf_char16_const_pp_ ## va_char16_p_decode
va_xprintf_char32_p_ ## va_char32_p_decode
va_xprintf_char32_pp_ ## va_char32_p_decode
va_xprintf_char32_const_pp_ ## va_char32_p_decode
Note that for each parameter type, a different printer function is used,
so for a different encoding, three functions need to be provided. A
typical such function implementation looks as follows:
va_stream_t *va_xprintf_char_p_utf8(
va_stream_t *s,
char const *x)
{
va_read_iter_t iter = VA_READ_ITER(&va_char_p_read_vtab_utf8, x);
return va_xprintf_iter(s, &iter);
}
Output Stream Encoding
For encoding strings into character arrays, the default encoding is
UTF-8, UTF-16, or UTF-32, depending on the string type. To override
the default, the following #defines can be set
before the #include.
#define va_char_p_encode utf8
#define va_char16_p_encode utf16
#define va_char32_p_encode utf32
These are suffixed to get the vtab object for writing:
va_char_p_vtab_ ## va_char_p_encode
va_char16_p_vtab_ ## va_char16_p_encode
va_char32_p_vtab_ ## va_char32_p_encode
For dynamically allocated arrays, there are separate #definitions:
#define va_vec8_encode utf8
#define va_vec16_encode utf16
#define va_vec32_encode utf32
These are suffixed to get the vtab object for writing:
va_vec_vtab_ ## va_vec_encode
va_vec16_vtab_ ## va_vec16_encode
va_vec32_vtab_ ## va_vec32_encode
For FILE* output, the default encoding is UTF-8, UTF-16BE,
and UTF-32BE, depending on output character width. The
following #defines correspond to the encoding:
#define va_file8_encode utf8
#define va_file16_encode utf16be
#define va_file32_encode utf32be
These are suffixed to get the vtab object for writing:
va_file_vtab_ ## va_file_encode
va_file16_vtab_ ## va_file16_encode
va_file32_vtab_ ## va_file32_encode
Quotation
C/C++ quotation
- q quotation option in format specifier
  - when printing integers, this is ignored
  - when printing pointers, this adds the # flag, i.e., the
    0x prefix is printed
  - when printing strings, this selects C format quoted output
    - NULL strings print as NULL, and do not set the
      VA_E_NULL error, in contrast to unquoted printing.
    - without #, prints quotation marks, single for c and C
      conversion, otherwise double.
    - with z, prints the string size indicator based on the input
      string: empty for char, u for char16_t, and U for
      char32_t (and also U for 64-bit ints).
    - quotation of unprintable characters
    - quotation of some characters in special notation:
      \t, \r, \n, \b, \f, \', \", \\.
    - 0 flag quotes all non-ASCII using \u or \U. Note
      that \x is not used, because such an escape has no fixed
      length and may not terminate, so quoting \x1 followed by a
      literal 1 is more complicated.
    - with 0 flag, chars that are marked as decoding errors are
      quoted as \ufffd, the replacement character, to avoid
      printing encoding errors with \u quotation, which would
      make the resulting string more wrong than with only the
      encoding errors. Without 0 flag, encoding errors are
      passed through if the input encoding equals output
      encoding, otherwise U+FFFD is encoded.
    - upper case formats use upper case letters in hexadecimals
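The special-notation escapes listed above can be illustrated with a small byte-wise quoting routine. This is a sketch under assumptions, not the library's algorithm: the name c_quote is hypothetical, every listed character is escaped unconditionally, and unprintable bytes are written as three-digit octal escapes, a choice made here because that form has a fixed length and so avoids the termination problem of \x.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Quote the NUL-terminated byte string s as a C string literal.
 * Returns the length written (without the NUL), or 0 if the result
 * does not fit into out[n]. */
static size_t c_quote(char *out, size_t n, char const *s)
{
    static char const special[] = "\t\r\n\b\f\'\"\\";
    static char const letter[]  = "trnbf\'\"\\";
    size_t pos = 0;

    assert(out != NULL && s != NULL);
    if (n < 3) {
        return 0;                       /* no room for "" plus NUL */
    }
    out[pos++] = '"';
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        char esc[4];
        size_t len = 0;
        char const *hit = strchr(special, c);

        if (hit != NULL) {              /* e.g. tab becomes \t */
            esc[len++] = '\\';
            esc[len++] = letter[hit - special];
        } else if (c < 0x20 || c >= 0x7F) {
            esc[len++] = '\\';          /* fixed-length octal escape */
            esc[len++] = (char)('0' + ((c >> 6) & 7));
            esc[len++] = (char)('0' + ((c >> 3) & 7));
            esc[len++] = (char)('0' + (c & 7));
        } else {
            esc[len++] = (char)c;
        }
        if (pos + len + 2 > n) {        /* keep room for quote and NUL */
            return 0;
        }
        memcpy(out + pos, esc, len);
        pos += len;
    }
    out[pos++] = '"';
    out[pos] = '\0';
    return pos;
}
```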
Examples:
- va_printf("%qs", "foo'bar") prints "foo'bar".
- va_printf("%qzs", u"foo'bar") prints u"foo'bar".
- va_printf("%qc", 10) prints '\n'.
- va_printf("%qzc", 10) prints U'\n'.
- va_printf("%#qc", 16) prints