* internals.texi (C Integer Types): New section.

This follows up and records an email in <http://lists.gnu.org/archive/html/emacs-devel/2012-07/msg00496.html>.
author: Paul Eggert 2012-12-10 16:13:44 -0800
committer: Paul Eggert 2012-12-10 16:13:44 -0800
commit: d92d9c95017a384e8bd04bd139fb050d3e50bac1
tree: 37b8d7b7c4ad581fc54c94b8af8cc08a0e7185a7
parent: ed6f2cd47f126b38f81ab0f45b7da42a8ae1985f
2 files changed, 94 insertions, 0 deletions
diff --git a/doc/lispref/ChangeLog b/doc/lispref/ChangeLog
index 05716cd77b3..43d737b618f 100644
--- a/doc/lispref/ChangeLog
+++ b/doc/lispref/ChangeLog
@@ -1,3 +1,9 @@
+2012-12-11  Paul Eggert  <eggert@cs.ucla.edu>
+        * internals.texi (C Integer Types): New section.
+        This follows up and records an email in
+        <http://lists.gnu.org/archive/html/emacs-devel/2012-07/msg00496.html>.
 2012-12-10  Stefan Monnier  <monnier@iro.umontreal.ca>
        * control.texi (Pattern maching case statement): New node.
diff --git a/doc/lispref/internals.texi b/doc/lispref/internals.texi
index 830a00ec9e6..025042a6869 100644
--- a/doc/lispref/internals.texi
+++ b/doc/lispref/internals.texi
@@ -16,6 +16,7 @@ internal aspects of GNU Emacs that may be of interest to C programmers.
 * Memory Usage::        Info about total size of Lisp objects made so far.
 * Writing Emacs Primitives::   Writing C code for Emacs.
 * Object Internals::    Data formats of buffers, windows, processes.
+* C Integer Types::     How C integer types are used inside Emacs.
 @end menu
 @node Building Emacs
@@ -1531,4 +1532,91 @@ Symbol indicating the type of process: @code{real}, @code{network},
 @end table
+@node C Integer Types
+@section C Integer Types
+@cindex integer types (C programming language)
+Here are some guidelines for use of integer types in the Emacs C
+source code.  These guidelines sometimes give competing advice; common
+sense is advised.
+@itemize @bullet
+@item
+Avoid arbitrary limits.  For example, avoid @code{int len = strlen
+(s);} unless the length of @code{s} is required for other reasons to
+fit in @code{int} range.
+@item
+Do not assume that signed integer arithmetic wraps around on overflow.
+This is no longer true of Emacs porting targets: signed integer
+overflow has undefined behavior in practice, and can dump core or
+even cause earlier or later code to behave ``illogically''.  Unsigned
+overflow does wrap around reliably, modulo a power of two.
+@item
+Prefer signed types to unsigned, as code gets confusing when signed
+and unsigned types are combined.  Many other guidelines assume that
+types are signed; in the rarer cases where unsigned types are needed,
+similar advice may apply to the unsigned counterparts (e.g.,
+@code{size_t} instead of @code{ptrdiff_t}, or @code{uintptr_t} instead
+of @code{intptr_t}).
+@item
+Prefer @code{int} for Emacs character codes, in the range 0 ..@: 0x3FFFFF.
+@item
+Prefer @code{ptrdiff_t} for sizes, i.e., for integers bounded by the
+maximum size of any individual C object or by the maximum number of
+elements in any C array.  This is part of Emacs's general preference
+for signed types.  Using @code{ptrdiff_t} limits objects to
+@code{PTRDIFF_MAX} bytes, but larger objects would cause trouble
+anyway since they would break pointer subtraction, so this does not
+impose an arbitrary limit.
+@item
+Prefer @code{intptr_t} for internal representations of pointers, or
+for integers bounded only by the number of objects that can exist at
+any given time or by the total number of bytes that can be allocated.
+Currently Emacs sometimes uses other types when @code{intptr_t} would
+be better; fixing this is lower priority, as the code works as-is on
+Emacs's current porting targets.
+@item
+Prefer the Emacs-defined type @code{EMACS_INT} for representing values
+converted to or from Emacs Lisp fixnums, as fixnum arithmetic is based
+on @code{EMACS_INT}.
+@item
+When representing a system value (such as a file size or a count of
+seconds since the Epoch), prefer the corresponding system type (e.g.,
+@code{off_t}, @code{time_t}).  Do not assume that a system type is
+signed, unless this assumption is known to be safe.  For example,
+although @code{off_t} is always signed, @code{time_t} need not be.
+@item
+Prefer the Emacs-defined type @code{printmax_t} for representing
+values that might be any signed integer value that can be printed,
+using a @code{printf}-family function.
+@item
+Prefer @code{intmax_t} for representing values that might be any
+signed integer value.
+@item
+In bitfields, prefer @code{unsigned int} or @code{signed int} to
+@code{int}, as @code{int} is less portable: it might be signed, and
+might not be.  Single-bit bit fields are invariably @code{unsigned
+int} so that their values are 0 and 1.
+@item
+In C, Emacs commonly uses @code{bool}, 1, and 0 for boolean values.
+Using @code{bool} for booleans can make programs easier to read and a
+bit faster than using @code{int}.  Although it is also OK to use
+@code{int}, this older style is gradually being phased out.  When
+using @code{bool}, respect the limitations of the replacement
+implementation of @code{bool}, as documented in the source file
+@file{lib/stdbool.in.h}, so that Emacs remains portable to pre-C99
+platforms.
+@end itemize
 @c FIXME Mention src/globals.h somewhere in this file?
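
The following is a minimal C sketch of a few of the guidelines in the new section. It is not taken from the Emacs sources. The names string_length, checked_add, unsigned_wrap, and struct flags are hypothetical and exist only for illustration. It shows ptrdiff_t used for a size, a signed addition that tests for overflow beforehand instead of relying on wraparound, unsigned arithmetic that does wrap, and a single-bit bit field declared as unsigned int.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the length of S as a ptrdiff_t rather
   than an int, so that no arbitrary int-sized limit is imposed.  */
static ptrdiff_t
string_length (char const *s)
{
  size_t len = strlen (s);
  /* Objects larger than PTRDIFF_MAX bytes would break pointer
     subtraction anyway, so treat them as a programming error.  */
  assert (len <= PTRDIFF_MAX);
  return (ptrdiff_t) len;
}

/* Hypothetical helper: add A and B without assuming that signed
   arithmetic wraps around.  The range test is done before the
   addition, so the addition itself never overflows.  Return true
   and store the result in *SUM on success, false on overflow.  */
static bool
checked_add (intmax_t a, intmax_t b, intmax_t *sum)
{
  if (b < 0 ? a < INTMAX_MIN - b : INTMAX_MAX - b < a)
    return false;
  *sum = a + b;
  return true;
}

/* Unsigned arithmetic wraps around modulo a power of two, so
   UINT_MAX + 1u is reliably zero.  */
static unsigned int
unsigned_wrap (void)
{
  return UINT_MAX + 1u;
}

/* A single-bit bit field declared unsigned int holds 0 or 1.
   With plain int its signedness would be up to the compiler.  */
struct flags
{
  unsigned int dirty : 1;
};
```

A caller would check the return value of checked_add before using the sum, for example by signaling an error when it returns false. Testing the operands first is one of several possible approaches. It is used here only because it needs nothing beyond standard C.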