path: root/admin/notes/unicode
Date | Commit message | Author | Files | Lines
2025-09-14 | Fix 'ucs-normalize' tests following Unicode 17.0 import | Eli Zaretskii | 1 | -8/+13
* lisp/international/ucs-normalize.el (ucs-normalize-composition-exclusions): Doc fix.
* test/lisp/international/ucs-normalize-tests.el (ucs-normalize-tests--failing-lines-part1): Update to _really_ match Unicode 17.0.
* admin/notes/unicode: Update instructions.
2025-09-13 | Fix Unicode-related tests | Eli Zaretskii | 1 | -0/+8
* test/lisp/international/mule-tests.el (mule-cmds-tests--ucs-names-missing-names): Update no-name regions of codepoints to Unicode 17.0.
* lisp/international/mule-cmds.el (ucs-names): Fix comments.
* admin/notes/unicode: Update instructions.
2025-01-02 | Update copyright year to 2025 | Stefan Kangas | 1 | -1/+1
Run "TZ=UTC0 admin/update-copyright".
2025-01-01 | Update copyright year to 2025 | Paul Eggert | 1 | -1/+1
Run "TZ=UTC0 admin/update-copyright".
2024-09-17 | ; * admin/notes/unicode: Need to run textsec-tests (bug#73312). | Eli Zaretskii | 1 | -2/+8
2024-09-13 | Use stable URLs for files imported from Unicode | Robert Pluim | 1 | -0/+3
* admin/notes/unicode: Point people at "admin/unidata/README" for URLs for Unicode files.
* admin/unidata/README: Use stable URLs for the various files. Remove dates, the files self-describe their dates anyway.
2024-01-02 | ; Add 2024 to copyright years | Po Lu | 1 | -1/+1
2023-09-17 | Support Unicode version 15.1 | Eli Zaretskii | 1 | -6/+7
* admin/unidata/BidiBrackets.txt:
* admin/unidata/BidiMirroring.txt:
* admin/unidata/Blocks.txt:
* admin/unidata/IdnaMappingTable.txt:
* admin/unidata/NormalizationTest.txt:
* admin/unidata/PropertyValueAliases.txt:
* admin/unidata/ScriptExtensions.txt:
* admin/unidata/Scripts.txt:
* admin/unidata/SpecialCasing.txt:
* admin/unidata/UnicodeData.txt:
* admin/unidata/confusables.txt:
* admin/unidata/copyright.html:
* test/manual/BidiCharacterTest.txt:
* admin/unidata/emoji-data.txt:
* admin/unidata/emoji-sequences.txt:
* admin/unidata/emoji-test.txt:
* admin/unidata/emoji-variation-sequences.txt:
* admin/unidata/emoji-zwj-sequences.txt: Update from Unicode data files.
* admin/notes/unicode: Update instructions.
* lisp/international/characters.el: Update 'char-width-table' data.
* etc/NEWS: Announce support for Unicode 15.1.
2023-09-16 | Fix Unicode normalization of characters | Eli Zaretskii | 1 | -0/+5
* lisp/international/ucs-normalize.el (ucs-normalize-composition-exclusions, check-range): Update from Unicode 15.0 data. (Bug#65996)
* test/lisp/international/ucs-normalize-tests.el (ucs-normalize-tests--failing-lines-part1)
(ucs-normalize-tests--failing-lines-part2): Update to reflect changes in ucs-normalize.el.
* admin/notes/unicode: Mention the updates in ucs-normalize.el.
2023-08-11 | Update lists of non-UTF files | Paul Eggert | 1 | -5/+19
* .gitattributes: Don't diff text files with encodings incompatible with UTF-8. Add some new binary file extensions, like '.webp'. etc/e/eterm-direct and java/emacs.keystore are also binary.
* admin/notes/unicode: Update similarly.
2023-08-06 | ; * admin/notes/unicode (char-width-table): Update instructions. | Eli Zaretskii | 1 | -1/+4
2023-05-28 | Add instructions and test file for VS-15/VS-16 | Robert Pluim | 1 | -1/+16
* admin/notes/unicode: Add instructions for emoji-variation-sequences.txt
* admin/unidata/emoji-variation-sequences.txt: New file, imported from Unicode 15.
2023-01-01 | ; Add 2023 to copyright years. | Eli Zaretskii | 1 | -1/+1
2022-09-20 | Mention that src/macuvs.h sometimes needs committing | Robert Pluim | 1 | -0/+4
* admin/notes/unicode: src/macuvs.h is generated, but needs to be committed sometimes.
2022-05-04 | Fix 'bidi-class' property of unassigned codepoints | Eli Zaretskii | 1 | -0/+6
* admin/unidata/unidata-gen.el (unidata-file-alist): Update the default values of 'bidi-class' according to the latest Unicode Standard.
* admin/notes/unicode: Mention possible changes in DerivedBidiClass.txt that need to be reflected in unidata-gen.el.
* lisp/international/characters.el (#xfb50, #xfdf0): Fix the Arabic block characters. (Bug#55256)
2022-01-18 | Add textsec-domain-suspicious-p | Lars Ingebrigtsen | 1 | -7/+9
* .gitignore: Ignore idna-mapping.el.
* admin/notes/unicode: Note idna-mapping file.
* admin/unidata/IdnaMappingTable.txt: New file.
* admin/unidata/Makefile.in (all): Generate idna-mapping.el.
* admin/unidata/unidata-gen.el (unidata-gen-idna-mapping): Generate.
* lisp/international/textsec.el (textsec-domain-suspicious-p): New function.
2022-01-18 | Add textsec support for confusable characters | Lars Ingebrigtsen | 1 | -1/+2
* admin/notes/unicode: Note the confusables.txt file.
* admin/unidata/Makefile.in (${unidir}/uni-confusable.el): Generate the confusable file.
* admin/unidata/README (https): Add confusables.txt.
* admin/unidata/confusables.txt: New file.
* admin/unidata/unidata-gen.el (unidata-gen-confusable): Parse the confusables.txt file.
* lisp/international/textsec.el (textsec-ascii-confusable-p)
(textsec-unconfuse-string): New functions.
2022-01-17 | ; * admin/notes/unicode: Update. | Eli Zaretskii | 1 | -1/+4
2022-01-01 | ; Add 2022 to copyright years. | Eli Zaretskii | 1 | -1/+1
2021-11-06 | Fix Emoji-related documentation | Eli Zaretskii | 1 | -5/+8
* etc/NEWS: Fix wording and spelling.
* doc/emacs/mule.texi (Input Methods): Add index entries and fix wording.
* admin/notes/unicode: Update instructions for updating Emacs for the latest Unicode Standard.
2021-10-19 | * admin/notes/unicode: Refer to Unicode's emoji-style.txt | Robert Pluim | 1 | -2/+6
2021-09-20 | Support for Unicode emoji sequences | Robert Pluim | 1 | -4/+11
This covers both sequences using Zero-Width-Joiner codepoints and those without. Bug#39799, I hope.
* .gitignore: Add emoji-zwj.el
* admin/notes/unicode: Add emoji-zwj-sequences.txt and emoji-sequences.txt references. Describe how to test after updating to a newer Unicode version.
* admin/unidata/Makefile.in (all): add emoji-zwj.el as a dependency.
(emoji-zwj.el): Add target plus rules for building.
(gen-clean): Add emoji-zwj.el.
* admin/unidata/README: Add emoji-zwj-sequences.txt and emoji-sequences.txt references.
* admin/unidata/blocks.awk: Force emoji script to be used for certain codepoints that are used by the Unicode sequences.
* admin/unidata/emoji-sequences.txt: New file.
* admin/unidata/emoji-zwj-sequences.txt: New file.
* admin/unidata/emoji-zwj.awk: New file. Derives composition-function-table rules from emoji-zwj-sequences.txt, plus hardcodes some derived manually from emoji-sequences.txt.
* etc/NEWS: Announce change.
* lisp/international/characters.el: Load the generated emoji-zwj.el
* src/Makefile.in (emoji-zwj): New target.
(temacs): Add emoji-zwj as a dependency.
2021-09-20 | Base emoji script membership on Emoji_Presentation | Robert Pluim | 1 | -1/+1
The Emoji property describes which codepoints can be displayed as emoji, but Emoji_Presentation governs which are displayed as emoji by default.
* admin/notes/unicode: Adjust check-emoji-coverage to look in the Emoji_Presentation sections of emoji-data.txt
* admin/unidata/blocks.awk: Assign emoji script using the Emoji_Presentation section.
2021-09-17 | Split Unicode emoji into their own script | Robert Pluim | 1 | -6/+33
* admin/notes/unicode: Describe how to update emoji for new Unicode release.
* admin/unidata/Makefile.in: Pass emoji-data.txt to blocks.awk script.
* admin/unidata/README: Add pointer to emoji-data.txt file.
* admin/unidata/blocks.awk: Parse emoji-data.txt, add emoji codepoints to the 'emoji' script (except for the ASCII ones).
* admin/unidata/emoji-data.txt: New file.
* etc/NEWS: Describe new 'emoji' script.
* etc/TODO: Update item about 'emoji' script.
* lisp/international/fontset.el (script-representative-chars): Add 'emoji' script.
(setup-default-fontset): Add 'emoji' script. Use "Noto Color Emoji" as default font for it.
2021-09-15 | Update Unicode support to Unicode version 14.0.0 | Eli Zaretskii | 1 | -2/+7
* admin/unidata/copyright.html:
* admin/unidata/UnicodeData.txt:
* admin/unidata/Blocks.txt:
* admin/unidata/BidiBrackets.txt:
* admin/unidata/BidiMirroring.txt:
* admin/unidata/IVD_Sequences.txt:
* admin/unidata/NormalizationTest.txt:
* admin/unidata/SpecialCasing.txt:
* test/manual/BidiCharacterTest.txt: Updated files from Unicode 14.0.
* lisp/international/fontset.el (script-representative-chars): Add new scripts.
(otf-script-alist): Update from latest version.
(setup-default-fontset): Add new scripts.
* lisp/international/characters.el: Update syntax and category tables for new characters and scripts.
(char-width-table): Update for changes in Unicode 14.0.
* lisp/international/mule-cmds.el (ucs-names): Update used and unused ranges per Unicode 14.0.
* test/lisp/international/ucs-normalize-tests.el (ucs-normalize-tests--failing-lines-part1)
(ucs-normalize-tests--failing-lines-part2): Update per the test results.
* doc/lispref/nonascii.texi (Character Properties): Update Unicode version number.
* etc/NEWS: Announce support for Unicode 14.0.
* admin/notes/unicode: Minor copyedits.
2021-01-28 | Use lexical-binding in all of `lisp/emacs-lisp` | Stefan Monnier | 1 | -0/+1
* lisp/emacs-lisp/bindat.el: Use lexical-binding.
(bindat--unpack-group, bindat--length-group, bindat--pack-group): Declare `last` and `tag` as dyn-scoped.
(bindat-unpack, bindat-pack): Bind `bindat-raw` and `bindat-idx` via `let` rather than via the formal arglist.
* lisp/emacs-lisp/package-x.el:
* lisp/emacs-lisp/generic.el:
* lisp/emacs-lisp/eieio-opt.el:
* lisp/emacs-lisp/derived.el:
* lisp/emacs-lisp/crm.el: Use lexical-binding.
* lisp/emacs-lisp/helper.el: Use lexical-binding.
(Helper-help-map): Move initialization into declaration.
* lisp/emacs-lisp/regi.el: Use lexical-binding.
(regi-interpret): Remove unused var `tstart`. Declare `curframe`, `curentry` and `curline` as dyn-scoped.
* lisp/emacs-lisp/shadow.el: Use lexical-binding.
(load-path-shadows-find): Remove unused var `file`. Tighten a regexp, use `push`.
* lisp/emacs-lisp/tcover-ses.el: Use lexical-binding. Require `ses`. Remove correspondingly redundant declarations.
(ses--curcell-overlay): Declare.
(ses-exercise): Use `dlet` and use a properly-prefixed var name. Fix name of `curcell-overlay` variable.
* lisp/emacs-lisp/unsafep.el: Use lexical-binding.
(unsafep): Bind `unsafep-vars` via `let` rather than via the formal arglist.
2021-01-27 | * admin/notes/unicode: titdic-cnv.el is now utf-8. | Paul Eggert | 1 | -9/+0
2021-01-01 | Update copyright year to 2021 | Paul Eggert | 1 | -1/+1
Run "TZ=UTC0 admin/update-copyright".
2021-01-01 | Update copyright year to 2021 | Paul Eggert | 1 | -1/+1
Run "TZ=UTC0 admin/update-copyright $(git ls-files)".
2020-01-05 | Go back to iso-2022-7bit for titdic-cnv.el again | Paul Eggert | 1 | -1/+9
* admin/notes/unicode: Mention this.
* lisp/international/titdic-cnv.el: Go back to iso-2022-7bit for this file, since utf-8-emacs unified characters that tsanq-quick-converter did not want unified. Problem reported by Eli Zaretskii in: https://lists.gnu.org/r/emacs-devel/2020-01/msg00156.html
2020-01-01 | Update copyright year to 2020 | Paul Eggert | 1 | -1/+1
Run "TZ=UTC0 admin/update-copyright $(git ls-files)".
2019-08-17 | Minor update in admin/notes/unicode | Eli Zaretskii | 1 | -0/+3
* admin/notes/unicode: Mention changes to be done in setup-default-fontset in fontset.el. (Bug#14461)
2019-03-09 | Import Unicode 12.0 data files | Eli Zaretskii | 1 | -5/+18
* admin/unidata/copyright.html:
* admin/unidata/UnicodeData.txt:
* admin/unidata/SpecialCasing.txt:
* admin/unidata/NormalizationTest.txt:
* admin/unidata/Blocks.txt:
* admin/unidata/BidiMirroring.txt:
* admin/unidata/BidiBrackets.txt: New versions from Unicode 12.0.
* admin/unidata/unidata-gen.el (unidata-gen-file):
* admin/unidata/blocks.awk (name2alias): Adapt to changes in new data files.
* admin/notes/unicode: Update and improve instructions for importing a new Unicode Standard.
* lisp/international/characters.el (char-width-table): Update lists of characters according to Unicode 12.0.
* lisp/international/fontset.el (script-representative-chars): Add characters from new scripts to 'script-representative-chars'.
(otf-script-alist): Update according to data on the MS site.
* lisp/international/mule-cmds.el (ucs-names): Update unused ranges of codepoints according to Unicode 12.0.
* test/lisp/international/ucs-normalize-tests.el (ucs-normalize-tests--failing-lines-part1)
(ucs-normalize-tests--failing-lines-part2): Update for the new NormalizationTest.txt file.
* test/manual/BidiCharacterTest.txt: Update with the new version from Unicode 12.0.
2019-01-08 | * admin/notes/unicode: Update to match recent changes. | Paul Eggert | 1 | -23/+7
2019-01-01 | Update copyright year to 2019 | Paul Eggert | 1 | -1/+1
Run 'TZ=UTC0 admin/update-copyright $(git ls-files)'.
2018-06-09 | Update Unicode data files to version 11.0.0 of Unicode | Eli Zaretskii | 1 | -1/+1
* admin/unidata/UnicodeData.txt:
* admin/unidata/SpecialCasing.txt:
* admin/unidata/NormalizationTest.txt:
* admin/unidata/copyright.html:
* admin/unidata/BidiMirroring.txt:
* admin/unidata/BidiBrackets.txt: Import from Unicode 11.0.
* admin/notes/unicode: Update the URL for OTF script tags.
* lisp/international/mule-cmds.el (ucs-names): Update unused ranges.
* lisp/international/fontset.el (script-representative-chars): Add hanifi-rohingya, old-sogdian, sogdian, dogra, gunjala-gondi, makasar, and medefaidrin.
(otf-script-alist): Add old-hungarian.
* lisp/international/characters.el (tbl): Add syntax entries for Supplemental Mathematical Operators, Miscellaneous Symbols and Arrows, and Supplemental Punctuation. Update the list of wide characters.
* test/lisp/international/ucs-normalize-tests.el (ucs-normalize-tests--failing-lines-part2): Update to match admin/unidata/NormalizationTest.txt.
* doc/lispref/nonascii.texi (Character Properties): Update the reference to the Unicode Standard.
* doc/misc/efaq.texi (New in Emacs 26):
* etc/NEWS: Mention compatibility with Unicode 11.0.
2018-05-19 | * admin/notes/unicode: HELLO is again UTF-8. | Paul Eggert | 1 | -4/+0
2018-04-20 | Revert "* admin/notes/unicode: HELLO is now UTF-8." | Michael Albinus | 1 | -0/+4
This reverts commit 0585bd643dae2592214e77998b875347e6e59bab.
2018-04-20 | * admin/notes/unicode: HELLO is now UTF-8. | Paul Eggert | 1 | -4/+0
2018-02-16 | ; Fix doc typos related to indefinite articles | Glenn Morris | 1 | -1/+1
2018-01-01 | Update copyright year to 2018 | Paul Eggert | 1 | -1/+1
Run admin/update-copyright.
2017-09-13 | Prefer HTTPS to FTP and HTTP in documentation | Paul Eggert | 1 | -1/+1
Most of this change is to boilerplate commentary such as license URLs. This change was prompted by ftp://ftp.gnu.org's going-away party, planned for November. Change these FTP URLs to https://ftp.gnu.org instead. Make similar changes for URLs to other organizations moving away from FTP. Also, change HTTP to HTTPS for URLs to gnu.org and fsf.org when this works, as this will further help defend against man-in-the-middle attacks (for this part I omitted the MS-DOS and MS-Windows sources and the test tarballs to keep the workload down). HTTPS is not fully working to lists.gnu.org so I left those URLs alone for now.
2017-07-08 | Update Unicode data and files to Unicode 10.0 | Eli Zaretskii | 1 | -1/+2
* admin/notes/unicode:
* admin/unidata/README:
* admin/unidata/BidiBrackets.txt:
* admin/unidata/BidiMirroring.txt:
* admin/unidata/Blocks.txt:
* admin/unidata/IVD_Sequences.txt:
* admin/unidata/NormalizationTest.txt:
* admin/unidata/SpecialCasing.txt:
* admin/unidata/UnicodeData.txt:
* lisp/international/characters.el:
* lisp/international/fontset.el (script-representative-chars):
* lisp/international/mule-cmds.el (ucs-names): Update per Unicode 10.0.
2016-12-31 | Update copyright year to 2017 | Paul Eggert | 1 | -1/+1
Run admin/update-copyright.
2016-10-15 | Fix char-width-table values for some Emoji | Eli Zaretskii | 1 | -0/+4
* lisp/international/characters.el (char-width-table): Add missing range U+1F400..U+1F43E. (Bug#24699)
* admin/notes/unicode: Mention the need to verify char-width-table setting against data in EastAsianWidth.txt.
2016-09-21 | ; * admin/notes/unicode: Mention BidiCharacterTest.txt. | Eli Zaretskii | 1 | -4/+9
2016-07-16 | Add tests for ucs-normalize.el | Noam Postavsky | 1 | -2/+9
Some tests are marked as expected to fail.
* test/lisp/international/ucs-normalize-tests.el: New tests.
* admin/unidata/NormalizationTest.txt: Add data for tests.
* admin/unidata/README: Add URL for NormalizationTest.txt.
* admin/notes/unicode: Add note about running (and updating the data for) the new tests. Remove note about normalization being unsupported.
2016-03-12 | Update Unicode notes for importing a new Unicode version | Eli Zaretskii | 1 | -2/+6
* admin/notes/unicode: Mention the need to update otf-script-alist in fontset.el when importing data files from a new Unicode version.
2016-03-11 | Update admin/notes/unicode | Eli Zaretskii | 1 | -0/+7
* admin/notes/unicode: Update the list of files from the UCD we are using. Mention the possible need to change 'ucs-names' when importing a new version of the Unicode Standard.
2016-01-01 | Update copyright year to 2016 | Paul Eggert | 1 | -1/+1
Run admin/update-copyright.