path: root/admin/unidata/README
Commit message  [Author, Age, Files, Lines (-removed/+added)]
* Update Unicode data and files to Unicode 10.0  [Eli Zaretskii, 2017-07-08, 1 file, -7/+11]

    * admin/notes/unicode:
    * admin/unidata/README:
    * admin/unidata/BidiBrackets.txt:
    * admin/unidata/BidiMirroring.txt:
    * admin/unidata/Blocks.txt:
    * admin/unidata/IVD_Sequences.txt:
    * admin/unidata/NormalizationTest.txt:
    * admin/unidata/SpecialCasing.txt:
    * admin/unidata/UnicodeData.txt:
    * lisp/international/characters.el:
    * lisp/international/fontset.el (script-representative-chars):
    * lisp/international/mule-cmds.el (ucs-names): Update per Unicode 10.0.
* Support casing characters which map into multiple code points (bug#24603)  [Michal Nazarewicz, 2017-04-06, 1 file, -0/+4]

    Implement the unconditional special casing rules defined in the Unicode
    standard.  Among other things, they deal with cases where a single code
    point is replaced by multiple ones because a single character does not
    exist (e.g. the ‘ﬁ’ ligature turning into ‘FI’) or is not commonly used
    (e.g. ß turning into SS).

    * admin/unidata/SpecialCasing.txt: New data file pulled from the
    Unicode standard distribution.
    * admin/unidata/README: Mention SpecialCasing.txt.
    * admin/unidata/unidata-gen.el (unidata-gen-table-special-casing)
    (unidata-gen-table-special-casing--do-load): New functions generating
    the ‘special-uppercase’, ‘special-lowercase’ and ‘special-titlecase’
    character Unicode properties built from the SpecialCasing.txt Unicode
    data file.
    * src/casefiddle.c (struct casing_str_buf): New structure for
    representing short strings used to handle one-to-many character
    mappings.
    (case_character_impl): New function which can handle one-to-many
    character mappings.
    (case_character, case_single_character): Wrappers for the above
    function.  The former may map one character to multiple (or no) code
    points while the latter does what the former used to do (i.e. handles
    one-to-one mappings only).
    (do_casify_natnum, do_casify_unibyte_string)
    (do_casify_unibyte_region): Use case_single_character.
    (do_casify_multibyte_string, do_casify_multibyte_region): Support new
    features of case_character.
    (do_casify_region): Update to reflect do_casify_multibyte_string
    changes.
    (casify_word): Handle the situation where the length of a word in
    characters can change, which affects where the end of the word is.
    (upcase, capitalize, upcase-initials): Update documentation to mention
    limitations when working on characters.
    * test/src/casefiddle-tests.el (casefiddle-tests-char-properties): Add
    test cases for the newly introduced character properties.
    (casefiddle-tests-casing): Update test cases which are now passing.
    * test/lisp/char-fold-tests.el (char-fold--ascii-upcase)
    (char-fold--ascii-downcase): New functions which behave like the old
    ‘upcase’ and ‘downcase’.
    (char-fold--test-match-exactly): Use the new functions.  This is needed
    because otherwise ﬁ and similar characters are turned into their
    multi-character representation.
    * doc/lispref/strings.texi: Describe the issue with casing characters
    versus strings.
    * doc/lispref/nonascii.texi: Describe the new character properties.
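
The one-to-many mappings this commit describes can be illustrated with a minimal Python sketch. It is not the Emacs implementation (which lives in unidata-gen.el and casefiddle.c); it only shows how unconditional entries in the SpecialCasing.txt field layout yield multi-character results. The names `SAMPLE`, `decode`, `parse_special_casing` and `TABLE` are hypothetical, and the sample holds three hand-copied entries rather than the real file.

```python
# Sketch only: read SpecialCasing.txt-style entries and keep the
# unconditional ones, i.e. those without a condition list.
# Field layout: <code>; <lower>; <title>; <upper>; (<condition_list>;)? # <comment>
SAMPLE = """\
# Three hand-copied entries, not the full data file.
00DF; 00DF; 0053 0073; 0053 0053; # LATIN SMALL LETTER SHARP S
FB01; FB01; 0046 0069; 0046 0049; # LATIN SMALL LIGATURE FI
03A3; 03C2; 03A3; 03A3; Final_Sigma; # GREEK CAPITAL LETTER SIGMA
"""


def decode(field):
    """Turn a space-separated list of hex code points into a string."""
    return "".join(chr(int(cp, 16)) for cp in field.split())


def parse_special_casing(text):
    """Map each character to its lower/title/upper special casing."""
    table = {}
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop comments
        if not data:
            continue
        fields = [f.strip() for f in data.split(";")]
        if fields[-1] == "":  # the trailing ';' leaves an empty field
            fields.pop()
        if len(fields) != 4:  # a fifth field is a condition list: skip
            continue
        code, lower, title, upper = fields
        table[chr(int(code, 16))] = {
            "lower": decode(lower),
            "title": decode(title),
            "upper": decode(upper),
        }
    return table


TABLE = parse_special_casing(SAMPLE)
```

With this sample, `TABLE["ß"]["upper"]` is the two-character string "SS", and the conditional Final_Sigma rule is left out, matching the commit's restriction to unconditional rules.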
* Add tests for ucs-normalize.el  [Noam Postavsky, 2016-07-16, 1 file, -0/+4]

    Some tests are marked as expected to fail.

    * test/lisp/international/ucs-normalize-tests.el: New tests.
    * admin/unidata/NormalizationTest.txt: Add data for tests.
    * admin/unidata/README: Add URL for NormalizationTest.txt.
    * admin/notes/unicode: Add note about running (and updating the data
    for) the new tests.  Remove note about normalization being unsupported.
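
As a rough illustration of what a data-driven normalization test checks, the sketch below verifies the conformance invariants that the NormalizationTest.txt header defines over its five columns, using Python's `unicodedata` rather than ucs-normalize.el. The rows in `SAMPLE` are hand-written in the file's format, and the helper names `parse_row` and `row_conforms` are hypothetical.

```python
import unicodedata

# Sketch only: each data row is c1;c2;c3;c4;c5; where c2 = NFC(c1),
# c3 = NFD(c1), c4 = NFKC(c1) and c5 = NFKD(c1).
SAMPLE = """\
@Part1 # Character by character test
1E0A;1E0A;0044 0307;1E0A;0044 0307; # LATIN CAPITAL LETTER D WITH DOT ABOVE
00C5;00C5;0041 030A;00C5;0041 030A; # LATIN CAPITAL LETTER A WITH RING ABOVE
FB01;FB01;FB01;0066 0069;0066 0069; # LATIN SMALL LIGATURE FI
"""


def parse_row(line):
    """Return the five columns as strings, or None for non-data lines."""
    data = line.split("#", 1)[0].strip()
    if not data or data.startswith("@"):  # comment or part header
        return None
    fields = data.split(";")[:5]
    if len(fields) != 5:
        return None
    return ["".join(chr(int(cp, 16)) for cp in f.split()) for f in fields]


def row_conforms(cols):
    """Check the NFC, NFD, NFKC and NFKD invariants for one row."""
    c1, c2, c3, c4, c5 = cols
    norm = unicodedata.normalize
    return (
        all(norm("NFC", c) == c2 for c in (c1, c2, c3))
        and all(norm("NFC", c) == c4 for c in (c4, c5))
        and all(norm("NFD", c) == c3 for c in (c1, c2, c3))
        and all(norm("NFD", c) == c5 for c in (c4, c5))
        and all(norm("NFKC", c) == c4 for c in cols)
        and all(norm("NFKD", c) == c5 for c in cols)
    )


ROWS = [r for r in map(parse_row, SAMPLE.splitlines()) if r is not None]
RESULTS = [row_conforms(r) for r in ROWS]
```

A normalizer under test would take the place of `unicodedata.normalize` here; rows for which it disagrees with the data file are the failures such a test suite reports.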
* Generate char-script-table from Unicode source. (Bug#20789)  [Glenn Morris, 2015-06-16, 1 file, -0/+4]

    * admin/unidata/Makefile.in (AWK): New, set by configure.
    (all): Add charscript.el.
    (blocks): New variable.
    (charscript.el, ${unidir}/charscript.el): New targets.
    (extraclean): Also remove generated charscript.el.
    * admin/unidata/blocks.awk: New script.
    * admin/unidata/Blocks.txt: New data file, from unicode.org.
    * lisp/international/characters.el: Load charscript.
    * src/Makefile.in (charscript): New variable.
    (${charscript}): New target.
    (${lispintdir}/characters.elc): Depend on charscript.elc.
    (temacs$(EXEEXT)): Depend on charscript.
    ; * admin/unidata/README: Mention Blocks.txt.
    ; * .gitignore: Add lisp/international/charscript.el.
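
To give a feel for the input that such a generator consumes, here is a small Python sketch that reads lines in the Blocks.txt layout (`start..end; Block Name`) and looks up the block containing a character. It stops at block names: how blocks.awk turns them into the script names stored in char-script-table is not shown in this log, so no such mapping is attempted. `SAMPLE`, `parse_blocks` and `block_of` are hypothetical names, and the sample lists only four blocks.

```python
import bisect

# Sketch only: a four-line excerpt in the Blocks.txt layout.
SAMPLE = """\
# Start Code..End Code; Block Name
0000..007F; Basic Latin
0080..00FF; Latin-1 Supplement
0370..03FF; Greek and Coptic
0400..04FF; Cyrillic
"""


def parse_blocks(text):
    """Return a sorted list of (start, end, name) tuples."""
    blocks = []
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop comments
        if not data:
            continue
        code_range, name = data.split(";", 1)
        start, end = code_range.split("..")
        blocks.append((int(start, 16), int(end, 16), name.strip()))
    blocks.sort()
    return blocks


def block_of(blocks, char):
    """Return the name of the block containing CHAR, or None."""
    cp = ord(char)
    starts = [b[0] for b in blocks]
    i = bisect.bisect_right(starts, cp) - 1  # last block starting <= cp
    if i >= 0 and blocks[i][0] <= cp <= blocks[i][1]:
        return blocks[i][2]
    return None


BLOCKS = parse_blocks(SAMPLE)
```

Because the ranges are sorted and do not overlap, a binary search on the start points is enough; a character outside every listed range yields `None`.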
* Update admin/unidata data files to latest versions  [Glenn Morris, 2014-06-21, 1 file, -5/+5]

    * admin/unidata/BidiMirroring.txt: Update to 7.0.0 (only comment
    changes).
    * admin/unidata/UnicodeData.txt: Update to 7.0.0.
    * admin/unidata/IVD_Sequences.txt: Update to 2014-05-16 version.
    * admin/unidata/README: Update for above changes.
* Include sources used to create macuvs.h.  [Paul Eggert, 2014-05-26, 1 file, -4/+18]

    * admin/unidata/IVD_Sequences.txt: New file.
    * admin/unidata/Makefile.in (${top_srcdir}/src/macuvs.h): New rule.
    (all): Build it.
    (extraclean): Remove it.
    * admin/unidata/README: Mention BidiMirroring.txt and
    IVD_Sequences.txt.
    * admin/unidata/copyright.html: Update to current version from the
    Unicode Consortium.
    * admin/unidata/uvs.el: Rename from admin/mac/uvs.el.
    (uvs-print-table-ivd): Output a header in the form that unidata-gen.el
    generates.
    * lisp/international/README: Refer to the Unicode Terms of Use rather
    than copying it bodily here, as that simplifies maintenance.
    * src/Makefile.in ($(srcdir)/macuvs.h): New rule.
    * src/macuvs.h: Use automatically-generated header.
* Update the Unicode database and derived files for Unicode 6.1.  [Eli Zaretskii, 2012-04-07, 1 file, -1/+1]

    admin/unidata/README:
    admin/unidata/copyright.html:
    admin/unidata/BidiMirroring.txt:
    admin/unidata/UnicodeData.txt: Update for the latest version 6.1 of the
    Unicode Standard.
    lisp/international/uni-bidi.el:
    lisp/international/uni-category.el:
    lisp/international/uni-combining.el:
    lisp/international/uni-decimal.el:
    lisp/international/uni-decomposition.el:
    lisp/international/uni-digit.el:
    lisp/international/uni-lowercase.el:
    lisp/international/uni-mirrored.el:
    lisp/international/uni-name.el:
    lisp/international/uni-numeric.el:
    lisp/international/uni-titlecase.el:
    lisp/international/uni-uppercase.el: Update for Unicode 6.1.
* *** empty log message ***  [Kenichi Handa, 2009-10-13, 1 file, -1/+1]
* Adjusted for Unicode 5.0.  [Kenichi Handa, 2006-08-21, 1 file, -35/+4]
* *** empty log message ***  [Kenichi Handa, 2005-05-07, 1 file, -1/+1]
* New file.  [Kenichi Handa, 2005-01-30, 1 file, -0/+35]