path: root/admin/unidata/README
Commit message | Author | Age | Files | Lines
* Use stable URLs for files imported from Unicode | Robert Pluim | 2024-09-13 | 1 | -22/+12
    * admin/notes/unicode: Point people at "admin/unidata/README" for URLs for Unicode files.
    * admin/unidata/README: Use stable URLs for the various files. Remove dates, the files self-describe their dates anyway.
* ; Prefer HTTPS to HTTP in many URLs | Stefan Kangas | 2022-10-15 | 1 | -7/+7
* Add textsec-domain-suspicious-p | Lars Ingebrigtsen | 2022-01-18 | 1 | -2/+2
    * .gitignore: Ignore idna-mapping.el.
    * admin/notes/unicode: Note idna-mapping file.
    * admin/unidata/IdnaMappingTable.txt: New file.
    * admin/unidata/Makefile.in (all): Generate idna-mapping.el.
    * admin/unidata/unidata-gen.el (unidata-gen-idna-mapping): Generate.
    * lisp/international/textsec.el (textsec-domain-suspicious-p): New function.
* Add textsec support for confusable characters | Lars Ingebrigtsen | 2022-01-18 | 1 | -0/+4
    * admin/notes/unicode: Note the confusables.txt file.
    * admin/unidata/Makefile.in (${unidir}/uni-confusable.el): Generate the confusable file.
    * admin/unidata/README (https): Add confusables.txt.
    * admin/unidata/confusables.txt: New file.
    * admin/unidata/unidata-gen.el (unidata-gen-confusable): Parse the confusables.txt file.
    * lisp/international/textsec.el (textsec-ascii-confusable-p)
      (textsec-unconfuse-string): New functions.
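
To make the confusables entry above more concrete, here is a minimal Python sketch of reading data laid out like confusables.txt (semicolon-separated fields: source code point, target code point sequence, type, then a `#` comment). It is not the Emacs implementation, which lives in unidata-gen.el and textsec.el; the names `parse_confusables`, `unconfuse`, `SAMPLE`, and `TABLE` are hypothetical, and the sample lines are illustrative rather than copied from the real file.

```python
# Illustrative lines in the confusables.txt layout (assumed sample, not the real file).
SAMPLE = """\
# source ; target ; type # comment
0430 ;\t0061 ;\tMA\t# CYRILLIC SMALL LETTER A -> LATIN SMALL LETTER A
0133 ;\t0069 006A ;\tMA\t# LATIN SMALL LIGATURE IJ -> i, j
"""


def parse_confusables(text):
    """Return a dict mapping each source character to its target string."""
    table = {}
    for line in text.splitlines():
        # Drop the trailing comment and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = [field.strip() for field in line.split(";")]
        if len(fields) < 2:
            continue
        # Each field holds one or more space-separated hex code points.
        source = "".join(chr(int(cp, 16)) for cp in fields[0].split())
        target = "".join(chr(int(cp, 16)) for cp in fields[1].split())
        table[source] = target
    return table


def unconfuse(string, table):
    """Replace every character that has a table entry with its target."""
    return "".join(table.get(ch, ch) for ch in string)


TABLE = parse_confusables(SAMPLE)
```

With this sample table, `unconfuse("p\u0430y", TABLE)` maps the Cyrillic letter to its Latin look-alike. A real consumer would load the full data file and would probably also care about the type field, which this sketch ignores.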
* Add support for functions that deal with Unicode scripts | Lars Ingebrigtsen | 2022-01-17 | 1 | -0/+12
    * admin/unidata/Makefile.in (${unidir}/uni-scripts.el): Build uni-scripts.el.
    * admin/unidata/Scripts.txt:
    * admin/unidata/ScriptExtensions.txt:
    * admin/unidata/PropertyValueAliases.txt: New files from Unicode.
    * admin/unidata/README: Update.
    * admin/unidata/unidata-gen.el (unidata-gen-charprop): Allow writing other data, too.
      (unidata-gen-scripts, unidata-gen--read-script-aliases)
      (unidata-gen--insert-file): New functions to parse the Script* files.
    * lisp/international/textsec.el: Implement some functions that work on scripts.
* Add emoji insertion support to Emacs | Lars Ingebrigtsen | 2021-11-06 | 1 | -0/+4
    * .gitignore: Ignore the generated emoji-labels.el file.
    * admin/unidata/Makefile.in (${unidir}/emoji-labels.el): Generate the emoji-labels.el file.
      (gen-clean): Delete it.
    * admin/unidata/README (https): Note the source for the Unicode file that has emoji categorisations.
    * admin/unidata/emoji-test.txt: Import another Unicode file.
    * doc/emacs/mule.texi (Input Methods): Document the new key bindings.
    * lisp/international/emoji.el: New file.
    * lisp/international/mule-cmds.el (ctl-x-map): Bind the emoji commands.
* Support for Unicode emoji sequences | Robert Pluim | 2021-09-20 | 1 | -0/+8
    This covers both sequences using Zero-Width-Joiner codepoints and those without. Bug#39799, I hope.
    * .gitignore: Add emoji-zwj.el
    * admin/notes/unicode: Add emoji-zwj-sequences.txt and emoji-sequences.txt references. Describe how to test after updating to a newer Unicode version.
    * admin/unidata/Makefile.in (all): add emoji-zwj.el as a dependency.
      (emoji-zwj.el): Add target plus rules for building.
      (gen-clean): Add emoji-zwj.el.
    * admin/unidata/README: Add emoji-zwj-sequences.txt and emoji-sequences.txt references.
    * admin/unidata/blocks.awk: Force emoji script to be used for certain codepoints that are used by the Unicode sequences.
    * admin/unidata/emoji-sequences.txt: New file.
    * admin/unidata/emoji-zwj-sequences.txt: New file.
    * admin/unidata/emoji-zwj.awk: New file. Derives composition-function-table rules from emoji-zwj-sequences.txt, plus hardcodes some derived manually from emoji-sequences.txt.
    * etc/NEWS: Announce change.
    * lisp/international/characters.el: Load the generated emoji-zwj.el
    * src/Makefile.in (emoji-zwj): New target.
      (temacs): Add emoji-zwj as a dependency.
* ; admin/unidata/README: remove mistaken addition of local file | Glenn Morris | 2021-09-20 | 1 | -4/+0
* ; admin/unidata/README: sort entries | Glenn Morris | 2021-09-20 | 1 | -12/+12
* ; admin/unidata/README: update file dates | Glenn Morris | 2021-09-20 | 1 | -7/+11
    I'm not sure how useful it is to keep this information in the README. Also, add missing EastAsianWidth.txt.
* Split Unicode emoji into their own script | Robert Pluim | 2021-09-17 | 1 | -0/+4
    * admin/notes/unicode: Describe how to update emoji for new Unicode release.
    * admin/unidata/Makefile.in: Pass emoji-data.txt to blocks.awk script.
    * admin/unidata/README: Add pointer to emoji-data.txt file.
    * admin/unidata/blocks.awk: Parse emoji-data.txt, add emoji codepoints to the 'emoji' script (except for the ASCII ones).
    * admin/unidata/emoji-data.txt: New file.
    * etc/NEWS: Describe new 'emoji' script.
    * etc/TODO: Update item about 'emoji' script.
    * lisp/international/fontset.el (script-representative-chars): Add 'emoji' script.
      (setup-default-fontset): Add 'emoji' script. Use "Noto Color Emoji" as default font for it.
* Update Unicode data and files to Unicode 10.0 | Eli Zaretskii | 2017-07-08 | 1 | -7/+11
    * admin/notes/unicode:
    * admin/unidata/README:
    * admin/unidata/BidiBrackets.txt:
    * admin/unidata/BidiMirroring.txt:
    * admin/unidata/Blocks.txt:
    * admin/unidata/IVD_Sequences.txt:
    * admin/unidata/NormalizationTest.txt:
    * admin/unidata/SpecialCasing.txt:
    * admin/unidata/UnicodeData.txt:
    * lisp/international/characters.el:
    * lisp/international/fontset.el (script-representative-chars):
    * lisp/international/mule-cmds.el (ucs-names): Update per Unicode 10.0.
* Support casing characters which map into multiple code points (bug#24603) | Michal Nazarewicz | 2017-04-06 | 1 | -0/+4
    Implement unconditional special casing rules defined in Unicode standard. Among other things, they deal with cases when a single code point is replaced by multiple ones because single character does not exist (e.g. ‘ﬁ’ ligature turning into ‘FI’) or is not commonly used (e.g. ß turning into SS).
    * admin/unidata/SpecialCasing.txt: New data file pulled from Unicode standard distribution.
    * admin/unidata/README: Mention SpecialCasing.txt.
    * admin/unidata/unidata-gen.el (unidata-gen-table-special-casing, unidata-gen-table-special-casing--do-load): New functions generating ‘special-uppercase’, ‘special-lowercase’ and ‘special-titlecase’ character Unicode properties built from the SpecialCasing.txt Unicode data file.
    * src/casefiddle.c (struct casing_str_buf): New structure for representing short strings used to handle one-to-many character mappings.
      (case_character_impl): New function which can handle one-to-many character mappings.
      (case_character, case_single_character): Wrappers for the above functions. The former may map one character to multiple (or no) code points while the latter does what the former used to do (i.e. handles one-to-one mappings only).
      (do_casify_natnum, do_casify_unibyte_string, do_casify_unibyte_region): Use case_single_character.
      (do_casify_multibyte_string, do_casify_multibyte_region): Support new features of case_character.
      (do_casify_region): Updated to reflect do_casify_multibyte_string changes.
      (casify_word): Handle situation when the character length of a word can change, affecting where the end of the word is.
      (upcase, capitalize, upcase-initials): Update documentation to mention limitations when working on characters.
    * test/src/casefiddle-tests.el (casefiddle-tests-char-properties): Add test cases for the newly introduced character properties.
      (casefiddle-tests-casing): Update test cases which are now passing.
    * test/lisp/char-fold-tests.el (char-fold--ascii-upcase, char-fold--ascii-downcase): New functions which behave like old ‘upcase’ and ‘downcase’.
      (char-fold--test-match-exactly): Use the new functions. This is needed because otherwise ‘ﬁ’ and similar characters are turned into their multi-character representation.
    * doc/lispref/strings.texi: Describe issue with casing characters versus strings.
    * doc/lispref/nonascii.texi: Describe the new character properties.
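
The one-to-many mappings this commit describes are not specific to Emacs, so they can be illustrated outside it. The following Python sketch is only an analogy: Python's `str.upper` applies the same unconditional Unicode special-casing rules, and the helper name `expands_on_upcase` is hypothetical and not part of casefiddle.c.

```python
SHARP_S = "\u00df"      # ß, LATIN SMALL LETTER SHARP S
FI_LIGATURE = "\ufb01"  # ﬁ, LATIN SMALL LIGATURE FI


def expands_on_upcase(ch):
    """Return True if upcasing the single character CH yields several code points."""
    return len(ch.upper()) > 1


# One input character, two output characters in both cases.
UPCASED = {ch: ch.upper() for ch in (SHARP_S, FI_LIGATURE, "a")}
```

Because the result can be longer than the input, code that upcases a region or a word has to recompute where the text ends afterwards, which is the situation the `casify_word` entry above refers to.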
* Add tests for ucs-normalize.el | Noam Postavsky | 2016-07-16 | 1 | -0/+4
    Some tests are marked as expected to fail.
    * test/lisp/international/ucs-normalize-tests.el: New tests.
    * admin/unidata/NormalizationTest.txt: Add data for tests.
    * admin/unidata/README: Add URL for NormalizationTest.txt.
    * admin/notes/unicode: Add note about running (and updating the data for) the new tests. Remove note about normalization being unsupported.
* Generate char-script-table from Unicode source. (Bug#20789) | Glenn Morris | 2015-06-16 | 1 | -0/+4
    * admin/unidata/Makefile.in (AWK): New, set by configure.
      (all): Add charscript.el.
      (blocks): New variable.
      (charscript.el, ${unidir}/charscript.el): New targets.
      (extraclean): Also remove generated charscript.el.
    * admin/unidata/blocks.awk: New script.
    * admin/unidata/Blocks.txt: New data file, from unicode.org.
    * lisp/international/characters.el: Load charscript.
    * src/Makefile.in (charscript): New variable.
      (${charscript}): New target.
      (${lispintdir}/characters.elc): Depend on charscript.elc.
      (temacs$(EXEEXT)): Depend on charscript.
    ; * admin/unidata/README: Mention Blocks.txt.
    ; * .gitignore: Add lisp/international/charscript.el.
* Update admin/unidata data files to latest versions | Glenn Morris | 2014-06-21 | 1 | -5/+5
    * admin/unidata/BidiMirroring.txt: Update to 7.0.0 (only comment changes).
    * admin/unidata/UnicodeData.txt: Update to 7.0.0.
    * admin/unidata/IVD_Sequences.txt: Update to 2014-05-16 version.
    * admin/unidata/README: Update for above changes.
* Include sources used to create macuvs.h. | Paul Eggert | 2014-05-26 | 1 | -4/+18
    * admin/unidata/IVD_Sequences.txt: New file.
    * admin/unidata/Makefile.in (${top_srcdir}/src/macuvs.h): New rule.
      (all): Build it.
      (extraclean): Remove it.
    * admin/unidata/README: Mention BidiMirroring.txt and IVD_Sequences.txt.
    * admin/unidata/copyright.html: Update to current version from Unicode Consortium.
    * admin/unidata/uvs.el: Rename from admin/mac/uvs.el.
      (uvs-print-table-ivd): Output a header in the form that unidata-gen.el generates.
    * lisp/international/README: Refer to the Unicode Terms of Use rather than copying it bodily here, as that simplifies maintenance.
    * src/Makefile.in ($(srcdir)/macuvs.h): New rule.
    * src/macuvs.h: Use automatically-generated header.
* Update the Unicode database and derived files for Unicode 6.1. | Eli Zaretskii | 2012-04-07 | 1 | -1/+1
    admin/unidata/README:
    admin/unidata/copyright.html:
    admin/unidata/BidiMirroring.txt:
    admin/unidata/UnicodeData.txt: Update for the latest version 6.1 of the Unicode Standard.
    lisp/international/uni-bidi.el:
    lisp/international/uni-category.el:
    lisp/international/uni-combining.el:
    lisp/international/uni-decimal.el:
    lisp/international/uni-decomposition.el:
    lisp/international/uni-digit.el:
    lisp/international/uni-lowercase.el:
    lisp/international/uni-mirrored.el:
    lisp/international/uni-name.el:
    lisp/international/uni-numeric.el:
    lisp/international/uni-titlecase.el:
    lisp/international/uni-uppercase.el: Update for Unicode 6.1.
* *** empty log message *** | Kenichi Handa | 2009-10-13 | 1 | -1/+1
* Adjusted for Unicode 5.0. | Kenichi Handa | 2006-08-21 | 1 | -35/+4
* *** empty log message *** | Kenichi Handa | 2005-05-07 | 1 | -1/+1
* New file. | Kenichi Handa | 2005-01-30 | 1 | -0/+35