Update emoji character data to Unicode 10.0 / Emoji 5.0 (which also
removes U+1F93B MODERN PENTATHLON from the emoji base letters).
Also add unit tests for line breaking for new characters (based on
earlier work by Seigo Nonaka).
Test: All new and existing unit tests pass.
Test: Manually tested line breaking of new emojis in TextView.
Bug: 28364892
Bug: 28678294
Bug: 30874706
Change-Id: I367cdab09187dc08a66a3112a5181a2b7fb338a5
Refactor WordBreaker to make it ready for more complex behavior.
Test: existing unit tests continue to pass
Change-Id: Ife758f3e2cf48922ab56109e6c5d3cffa3673feb
LayoutCache keeps only the results of layout and can outlive a
FontCollection that has been destroyed by GC.
This kind of failure will be captured by minikin_stress_tests in the
subsequent CL (I1bf4ba43e6e97cd04e7d6dd42d388dd17ce64c7b)
Test: ran minikin_tests
Bug: 36223724
Change-Id: I639b73c0f1041549158c43212a901c82df4b02db
Previously, we stayed on the conservative side and disallowed any
grapheme breaks (and thus cursoring) where a virama was followed by a
letter, since we did not know if the virama would be forming a
cluster with the letter or not. This created problems with Indic
languages with infrequent conjuncts, such as Tamil.
Now we use the information in calculated advances to find if a
cluster is formed. If there is no cluster, we break the grapheme and
allow cursoring after the virama.
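The check can be sketched as follows. This is a minimal illustration,
not the Minikin implementation: it assumes that a character merged into
the preceding cluster is reported with an advance of zero, and the
function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Assumption: advances[i] is the advance attributed to character i, and a
// character that was merged into the preceding cluster has an advance of 0.
bool formsClusterWithPrevious(const std::vector<float>& advances, size_t offset) {
    return offset < advances.size() && advances[offset] == 0.0f;
}

// A grapheme break (and thus a cursor stop) after a virama is allowed only
// when the following letter did not join the virama's cluster.
bool isBreakAllowedAfterVirama(const std::vector<float>& advances, size_t letterOffset) {
    return !formsClusterWithPrevious(advances, letterOffset);
}
```

With advances of {10, 0, 0} for letter, virama, letter, the second
letter is treated as part of a conjunct and no break is allowed; with
{6, 0, 7} the letter stands alone and the break is allowed.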
Test: Unit tests added to GraphemeBreakTests and MeasurementTests.
Test: Also manually tested Tamil sequences.
Bug: 35721792
Change-Id: Ib159edb94b3ad6f693f0d3dad016b332b2cef447
To share the calculated coverage information across the processes, make
SparseBitSet serializable.
Bug: 34042446
Test: minikin_tests passes
Change-Id: I0463138adcf234739bb3ce1cdadf382021921f3e
This CL includes:
- Stop using utils/Mutex and use std::mutex instead.
- Stop using utils/Singleton.
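A minimal sketch of the std::mutex pattern, assuming a hypothetical
counter guarded by a file-local lock. The names gExampleLock and
incrementGuarded are illustrative and do not come from Minikin.

```cpp
#include <mutex>

// Hypothetical file-local lock, standing in for a former utils/Mutex.
static std::mutex gExampleLock;
static int gExampleCounter = 0;

// std::lock_guard acquires the mutex on construction and releases it when
// the guard goes out of scope, including on early return.
int incrementGuarded() {
    std::lock_guard<std::mutex> lock(gExampleLock);
    return ++gExampleCounter;
}
```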
Test: minikin_tests passed
Change-Id: Ib3f75b83397a546472bb5f91e066e44506e78263
This is the 2nd attempt at I9e01d237c9adcb05e200932401cb1a4780049f86.
The previous CL was reverted because 8-bit integers were too small to
store the indices of mFamilyVec. This CL changes it to 16-bit integers
since size_t is still unnecessarily large.
Theoretically, 32-bit integers are necessary for the indices of
mFamilyVec since the size of mFamilyVec can be 0x10EE01. However, in
practice, 16-bit integers are enough for the indices of mFamilyVec.
The length of mFamilyVec for the system fonts is 2084. Even if the
developers load their own very large fonts, it can only increase the
number of elements in mFamilyVec to at most 0x10FF.
As a result, memory usage of the FontCollections for the system fonts
decreases as follows.
64-bit process: before: 398,264 bytes, after: 282,568 bytes (-115,696 bytes)
32-bit process: before: 199,132 bytes, after: 149,548 bytes (-49,584 bytes)
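The width argument above can be checked with a small sketch. The struct
names below are hypothetical stand-ins for FontCollection::Range, and
indicesFit is an illustrative helper, not part of Minikin.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical stand-ins for a range of indices stored at different widths.
struct RangeSizeT { size_t start; size_t end; };
struct Range8 { uint8_t start; uint8_t end; };
struct Range16 { uint16_t start; uint16_t end; };

// True if every index into a vector of `length` elements, plus an exclusive
// end equal to `length`, fits in IndexT.
template <typename IndexT>
constexpr bool indicesFit(size_t length) {
    return length <= static_cast<size_t>(std::numeric_limits<IndexT>::max());
}

static_assert(sizeof(Range16) == 4, "two 16-bit indices occupy 4 bytes");
static_assert(sizeof(RangeSizeT) == 2 * sizeof(size_t), "two size_t indices");
static_assert(!indicesFit<uint8_t>(2084), "8 bits cannot index 2084 entries");
static_assert(indicesFit<uint16_t>(2084), "16 bits can index 2084 entries");
```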
Bug: 33562608
Test: Verified Emoji and CJK characters are present.
Test: android.text.cts.EmojiTest passed
Test: Minikin unit tests passed
Change-Id: I6796fd55ac30fe30528a212ebf6097b1d672e2f8
With this change, different languages can have different minimum
lengths for suffixes and prefixes when hyphenating. Previously, the
defaults used for English, 2 and 3, were used for every language.
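A minimal sketch of a per-language limit check. The struct and function
names are hypothetical, and it assumes that the English defaults 2 and 3
apply to the prefix and the suffix respectively.

```cpp
#include <cstddef>

// Hypothetical per-language limits; the field names are illustrative.
struct HyphenationLimits {
    size_t minPrefix;  // minimum letters kept before the break
    size_t minSuffix;  // minimum letters carried to the next line
};

// Assumption: the English defaults 2 and 3 are the prefix and the suffix.
constexpr HyphenationLimits kEnglishLimits = {2, 3};

// Returns true if breaking a word of wordLength letters before index
// breakOffset leaves enough letters on each side.
constexpr bool isBreakAllowed(const HyphenationLimits& limits, size_t wordLength,
                              size_t breakOffset) {
    return breakOffset <= wordLength && breakOffset >= limits.minPrefix &&
           wordLength - breakOffset >= limits.minSuffix;
}
```

Under these assumptions a six-letter word can break after its third
letter with the English limits, but not after its fourth, while a
language with limits {1, 2} would allow both.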
Bug: 35712376
Test: Minikin unit tests were updated and they pass
Change-Id: Iffaf11c6b208c57d28d45b17246e177572dc1210
Since there are no known users of Minikin outside Android yet, these
files are simply a maintenance burden with no actual benefit.
Removing the samples until there are potential external users.
Test: Not needed
Change-Id: If7f1fb775cae427fbe31b86c202d1380c701bf28
This adds better support for Arabic script languages, Armenian,
Catalan, Hebrew, Kannada, Malayalam, Polish, Tamil, and Telugu by
adding various hyphenation types and edits appropriate for the
locales.
For Arabic script languages, soft hyphens act transparently with
regard to joining: if the two characters around a soft hyphen were
joining each other before the break, they continue to appear joined
when the line is broken at that soft hyphen and a hyphen glyph is
inserted. This is needed for Central Asian languages such as Uighur.
For Armenian, U+058A ARMENIAN HYPHEN is used for line breaks caused
by either automatic hyphenation or soft hyphens.
For Catalan, nonstandard line breaks are implemented for "l·l", which
hyphenates as "l-/l".
For Polish, when there is a line break at a hyphen, the hyphen is
repeated at the next line.
For the South Indic languages, when breaks happen due to soft breaks
or automatic hyphenation, no visible hyphen is inserted, although a
penalty is added.
For Hebrew, support for using U+05BE HEBREW PUNCTUATION MAQAF has
been implemented, but it's turned off pending confirmation of
desirability.
Also, hard hyphens, which previously had no penalty added for
breaking the line after them, now have the same penalty as an
automatic or soft break, with the difference that no hyphen is
inserted when they break.
Finally, some bugs have been fixed with hyphenating multiscript and
multi-font words.
Bug: 19950445
Bug: 19955011
Bug: 25623243
Bug: 26154469
Bug: 26154471
Bug: 33387871
Bug: 33560754
Bug: 33752592
Bug: 33754204
Test: Unit tests added, plus thorough manual testing
Change-Id: Iaccf776ce8d1d434ee8b1c534ff3659d80fdc338
Use shared_ptr, since manual ref counting is bug-prone and taking the
global mutex inside the destructor has not been useful for some time.
Removing the raw pointer manipulation required changing the Layout
constructors: Layout is no longer copyable, and a FontCollection must
be passed to its constructor.
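The resulting ownership shape can be sketched as below. Both classes
are hypothetical stand-ins with the real state omitted, and the exact
constructor signature is an assumption.

```cpp
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical stand-in; the real class carries far more state.
class FontCollection {};

class Layout {
public:
    // Assumption: the collection is shared, so the Layout keeps it alive.
    explicit Layout(std::shared_ptr<FontCollection> collection)
        : mCollection(std::move(collection)) {}

    // Copying is disabled, as described above.
    Layout(const Layout&) = delete;
    Layout& operator=(const Layout&) = delete;

private:
    std::shared_ptr<FontCollection> mCollection;
};

static_assert(!std::is_copy_constructible<Layout>::value, "Layout is not copyable");
```

The reference count drops automatically when the Layout is destroyed, so
no lock is needed in the destructor for bookkeeping.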
Bug: 28119474
Test: minikin_tests passed
Test: hwui_unit_tests passed
Test: No performance regression in minikin_perftest.
Change-Id: I8824593206ecba74cbc9731e298f045e1ae442a3
The Bulgarian hyphenation patterns contain a line consisting of '0ь0'
which has no practical effect on hyphenation. Add an exception in
roundtrip testing to make sure we don't fail while comparing our tables
with the input data.
Test: make -j works and creates .hyb files for bg and cu
Change-Id: Ia46b8a45fe522f5194d8105d31b34b0e27528cc9
Keep the region code and pass it to HarfBuzz during layout.
Test: minikin_tests
Bug: 30746293
Change-Id: I7c908701ca677238f663c82c597f8615d190e055
Mongolian fonts need to shape across U+202F NARROW NO-BREAK SPACE
(NNBSP). But if the first font in the fallback chain supports NNBSP,
it would break Mongolian shaping since the text would be broken into
three font runs. By making NNBSP sticky, we make sure Mongolian text
is kept in one font run and is shaped properly.
See http://www.unicode.org/L2/L2017/17036-mongolian-suffix.pdf for
background. The proposed character in the proposal was not accepted
for encoding by the Unicode Technical Committee, but the document
explains in more detail why this change is needed.
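A simplified sketch of sticky itemization. The per-character
preferredFont input and the FontRun struct are hypothetical, and the
sketch omits any check that the current run's font actually has a glyph
for NNBSP.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint32_t kNnbsp = 0x202F;  // NARROW NO-BREAK SPACE

struct FontRun {
    size_t start;  // inclusive
    size_t end;    // exclusive
    int fontId;
};

// preferredFont[i] is a hypothetical input: the id of the font the fallback
// chain would pick for text[i] when considered on its own.
std::vector<FontRun> itemize(const std::vector<uint32_t>& text,
                             const std::vector<int>& preferredFont) {
    assert(text.size() == preferredFont.size());
    std::vector<FontRun> runs;
    for (size_t i = 0; i < text.size(); ++i) {
        // A sticky character stays in the font of the run before it.
        const bool sticky = text[i] == kNnbsp && !runs.empty();
        const int font = sticky ? runs.back().fontId : preferredFont[i];
        if (!runs.empty() && runs.back().fontId == font) {
            runs.back().end = i + 1;
        } else {
            runs.push_back({i, i + 1, font});
        }
    }
    return runs;
}
```

With text {U+1820, U+202F, U+1820} and preferred fonts {1, 0, 1}, the
sketch yields a single run in font 1 instead of three runs.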
Bug: 34344220
Test: manual
Change-Id: I344a63f383fa5485875603570025eac3c4eb2574
Flag sequences are handled correctly by the latest ICU.
Just add unit tests to catch regressions in the future.
Test: ran minikin_tests
Change-Id: I78d5461de8ff4d002ca06fb5bb81fcd7bc45d95e
"adb sync data" pushes test data as well as test executables.
Test: ran minikin_perftests
Test: ran minikin_tests
Change-Id: I08219f8abc4b59bd26d8f9155975b65b56a88b7b
This is the 2nd attempt at I08e9b74192f8af1d045f1276498fa4e60d73863e.
The original CL was reverted because it conflicted with another CL
submitted earlier.
Here is the original commit message of the reverted change.
This lays the groundwork for variation settings support.
Since we should regard different variations of a font as different fonts, we
need to create new typefaces. To reuse the same instance of MinikinFont, as
much as possible, FontFamily::createFamilyWithVariation now reuses an
existing instance, while incrementing the reference count.
Test: minikin_tests
Bug: 33062398
Change-Id: Ib25bf1bb5a5191e15a6523954146521464c91906
This is the 2nd attempt at 0470cdb3e4.
The difference is the addition of clearElementsWithLock to the Font
class, which is necessary to delete Font objects outside of minikin.
This method should be removed once http://b/28119474 is fixed.
Here is the original commit message of the reverted change.
This lays the groundwork for making SparseBitSet serializable.
FontFamily.addFont is only used when the FontFamily is constructed.
Thus, instead of calling FontFamily.addFont multiple times, pass the
Font list to the constructor. With this change, FontFamily can be
immutable.
By making FontFamily immutable, we can create a FontFamily with a
pre-calculated SparseBitSet.
Bug: 34042446
Bug: 28119474
Bug: 34378805
Test: minikin_tests has passed
Change-Id: Ice433931196f5ae79a1a7ee0c98020f914aeb5f2
We should override the advance function only when the glyph comes
from a color bitmap. This was introduced by
Ia88cb670ca9e0bb352bccef22c5ea3a789bcc1da.
Bug: 21705974
Test: ran minikin_tests
Change-Id: I3489d75ace8bffdd9035a5986a2641313feef04d
This lays the groundwork for variation settings support.
Since we should regard different variations of a font as different fonts, we
need to create new typefaces. To reuse the same instance of MinikinFont, as
much as possible, FontFamily::createFamilyWithVariation now reuses an
existing instance, while incrementing the reference count.
Test: minikin_tests
Bug: 33062398
Change-Id: I08e9b74192f8af1d045f1276498fa4e60d73863e
This lays the groundwork for making SparseBitSet serializable.
FontFamily.addFont is only used when the FontFamily is constructed.
Thus, instead of calling FontFamily.addFont multiple times, pass the
Font list to the constructor. With this change, FontFamily can be
immutable.
By making FontFamily immutable, we can create a FontFamily with a
pre-calculated SparseBitSet.
Bug: 34042446
Test: minikin_tests has passed
Change-Id: I2576789fba6cb27687e920e2488e8bedbcf7d36f
GraphemeBreak.tailoring/GraphemeBreak.genderBalancedEmoji started failing
after the ICU update to 58. The failure is around Rule GB9 in Unicode Standard
Annex #29. GB9 forbids breaks before extending characters and before ZWJ.
However the implementation in minikin only checks for extending characters.
It used to work with Unicode 8.0 since ZWJ had the Grapheme_Cluster_Break
property of Extend in Unicode 8.0 but it no longer has that property in
Unicode 9.0.
Thus, we need to check for ZWJ explicitly.
At the same time, this removes the manually added PREPEND character case from
tailoredGraphemeClusterBreak, which ICU 58 already supports.
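The explicit ZWJ check can be sketched as below. The enum and the
lookup are hypothetical and cover only three code points; in practice
the property values come from ICU.

```cpp
#include <cstdint>

// Hypothetical subset of Grapheme_Cluster_Break values.
enum class Gcb { Other, Extend, Zwj };

// Illustrative lookup covering a few code points with Unicode 9.0 values.
Gcb graphemeClusterBreak(uint32_t c) {
    if (c == 0x200D) return Gcb::Zwj;     // ZERO WIDTH JOINER
    if (c == 0x0301) return Gcb::Extend;  // COMBINING ACUTE ACCENT
    if (c == 0xFE0F) return Gcb::Extend;  // VARIATION SELECTOR-16
    return Gcb::Other;
}

// GB9: do not break before extending characters or before ZWJ. Checking
// only Extend misses ZWJ once it has its own property value.
bool isBreakForbiddenByGb9(uint32_t next) {
    const Gcb property = graphemeClusterBreak(next);
    return property == Gcb::Extend || property == Gcb::Zwj;
}
```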
Test: minikin_tests passes
Bug: 34117643
Change-Id: Ib46d48bebe4a866208e050d7defc715c61fcbeb1
To avoid lock contention in Skia, use the HarfBuzz implementation
for retrieving bounding box and advance information from the font.
Bug: 21705974
Test: Manually done
Change-Id: Ia88cb670ca9e0bb352bccef22c5ea3a789bcc1da
Since switching to 64-bit devices, size_t is now a 64-bit integer.
FontCollection::Range uses two size_t integers but they just point to an index
in mFamilies. To reduce the memory usage, this CL changes the size_t integers to
uint8_t.
The maximum size of each integer in Range is the size of FontCollection::mFamilies.
The largest this can go is the system font list plus a user defined family, which
has 91 families. So an 8-bit integer should be enough.
With this change, about 84 KiB of memory will be saved per font collection. Since
eight font collections are created during bootstrap, about 670 KiB of memory will
be saved with this CL.
Bug: 33562608
Test: Ran FontCollection.collectionAllocationSizeTest on a 64-bit device.
On my Nexus 5X, it changed from 327358 to 241342.
Change-Id: I9e01d237c9adcb05e200932401cb1a4780049f86
Add an "mJustified" member for justification, and tune the line breaking to
produce good results. Major differences for fully justified text include:
- Space can be shrunk in justified text.
- Hyphenation should be more aggressive in justified text.
Also adds a penalty for the last line being very short. This is tuned
to be more aggressive for ragged right than for justified text.
This is based on a patch by Raph Levien (raph@google.com).
Bug: 31707212
Test: Manually tested with Icbfab2faa11a6a0b52e6f0a77a9c9b5ef6e191da
Change-Id: If366f82800831ccc247ec07b7bc28ca4c6ae0ed6
This will handle installation for local builds as well as for the test
bundles.
Test: m -j minikin_tests; ls $OUT/data/nativetest*/minikin_tests
Test: m -j continuous_native_tests dist; zipinfo -1 out/dist/*continuous_native_tests*.zip
Test: /data/nativetest{,64}/minikin_tests/minikin_tests
Change-Id: Iafd31fa119e7c4d92937ca8ae8346e268a6c1f38
Merged-In: Iafd31fa119e7c4d92937ca8ae8346e268a6c1f38