Commit Graph

600 Commits

Author SHA1 Message Date
Roozbeh Pournader
f2fd20ec54 Update emoji grapheme breaking rules
The rules are updated to the latest UAX #29, with tailorings based on
the font in use: we can now use the clustering information
calculated by Layout, so we will only disallow a grapheme break if an
emoji ligature is actually formed.

Test: Unit tests have been updated and pass.
Bug: 30917298
Bug: 34211654
Change-Id: Idc0ef9f1f4f45dc45a50ed69e45c43ebfaea0306
2017-03-16 13:34:52 -07:00
Roozbeh Pournader
f3399b503e Refactor WordBreaker
Refactor WordBreaker to make it ready for more complex behavior.

Test: existing unit tests continue to pass
Change-Id: Ife758f3e2cf48922ab56109e6c5d3cffa3673feb
2017-03-16 12:23:08 -07:00
TreeHugger Robot
464d63f0d8 Merge "Introduce minikin_stress_tests to find race condition." 2017-03-16 00:50:50 +00:00
TreeHugger Robot
4b613e361b Merge "Serialize and deserialize supported axes." 2017-03-15 23:30:30 +00:00
Seigo Nonaka
e64e9b2176 Introduce minikin_stress_tests to find race condition.
This is designed for catching race conditions.
The stress tests are split from the unit test binary since they take
30 seconds on angler.

Bug: 36223724
Bug: 36208043
Test: ran minikin_stress_tests
Change-Id: I1bf4ba43e6e97cd04e7d6dd42d388dd17ce64c7b
2017-03-15 16:11:33 -07:00
TreeHugger Robot
0c8d074c0c Merge "Fallback from script-specific hyphens to normal hyphen first" 2017-03-15 22:45:31 +00:00
TreeHugger Robot
c044fd4ed7 Merge "In greedy line breaking, repeat breaks until the line fits" 2017-03-15 22:29:13 +00:00
Seigo Nonaka
ca8ac8a924 Serialize and deserialize supported axes.
To avoid reading font files during FontFamily construction, serialize
and deserialize supported axes and cmap coverage at the same time.

Bug: 36232655
Test: ran minikin_tests
Change-Id: I4086fb887e13f872390b533584bce6f1d5598ea0
2017-03-15 14:18:13 -07:00
Roozbeh Pournader
aae6468815 Fallback from script-specific hyphens to normal hyphen first
The previous code fell back directly from a script-specific hyphen to
the ASCII hyphen-minus if the font didn't support the script-specific
hyphen. Now we try the Unicode hyphen (U+2010) first before trying
the ASCII hyphen-minus.

Bug: 36201363
Test: Not needed
Change-Id: I374234fd73fab7edd990ea86f8937c38761c90bf
2017-03-15 14:15:46 -07:00
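The fallback order described in aae6468815 can be sketched as follows. This is a minimal illustration, not Minikin's code: pickHyphen and the fontHasGlyph callback are hypothetical names, and the only facts taken from the commit message are the order of the three candidates.

```cpp
#include <cassert>
#include <functional>

constexpr char32_t kUnicodeHyphen = 0x2010;  // U+2010 HYPHEN
constexpr char32_t kHyphenMinus = 0x002D;    // U+002D HYPHEN-MINUS

// Hypothetical helper: returns the first hyphen the font can render, trying
// the script-specific hyphen, then U+2010, then the ASCII hyphen-minus.
char32_t pickHyphen(char32_t scriptSpecific,
                    const std::function<bool(char32_t)>& fontHasGlyph) {
    if (fontHasGlyph(scriptSpecific)) return scriptSpecific;
    if (fontHasGlyph(kUnicodeHyphen)) return kUnicodeHyphen;
    return kHyphenMinus;  // last resort; returned without a coverage check
}
```

For example, with U+058A ARMENIAN HYPHEN as the script-specific hyphen, a font that covers only U+2010 and U+002D would yield U+2010 rather than dropping straight to U+002D.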
Roozbeh Pournader
d2aaf3394a In greedy line breaking, repeat breaks until the line fits
Previously, in greedy line breaking, when a line overflowed, we found
the best line breaking candidate before it and broke the line there.
But we didn't check to see if the remaining part now fits in a line.

With this change, we now repeat checking for overflows, and break
again until we have no breaking opportunity or the remaining text now
fits in a line.

Also fixed an issue in greedy line breaking with keeping the
hyphenation edit for the next line.

Test: Manual. The issue reported in the bug is now fixed.
Bug: 34185255
Bug: https://code.google.com/p/android/issues/detail?id=231437
Bug: 33560754
Change-Id: I93bdd341e4f8e1257710e453e4938f224cb2a1ff
2017-03-15 13:52:21 -07:00
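The repeat-until-it-fits behavior described in d2aaf3394a can be sketched with a simplified model. Everything here is an assumption made for illustration: the Candidate struct, the penalty-based choice of the "best" candidate, and the greedyBreaks function are hypothetical and do not reflect Minikin's LineBreaker internals.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a break opportunity: offset is the width of all text
// before it; a lower penalty means a more desirable break.
struct Candidate {
    float offset;
    int penalty;
};

// Returns the indices of the candidates chosen as line breaks. The last
// candidate is assumed to mark the end of the paragraph.
std::vector<std::size_t> greedyBreaks(const std::vector<Candidate>& cands,
                                      float lineWidth) {
    std::vector<std::size_t> breaks;
    float lineStart = 0.0f;
    std::size_t firstOpen = 0;  // first candidate after the previous break
    for (std::size_t i = 0; i < cands.size(); ++i) {
        // Keep breaking while the text up to candidate i overflows the line.
        while (cands[i].offset - lineStart > lineWidth) {
            bool found = false;
            std::size_t best = 0;
            for (std::size_t j = firstOpen; j < i; ++j) {
                if (cands[j].offset - lineStart > lineWidth) break;
                if (!found || cands[j].penalty <= cands[best].penalty) {
                    best = j;
                    found = true;
                }
            }
            if (!found) break;  // no breaking opportunity: accept the overflow
            breaks.push_back(best);
            lineStart = cands[best].offset;
            firstOpen = best + 1;
        }
    }
    return breaks;
}
```

With candidates at offsets 3, 9, and 15 (penalties 0, 5, 5) and a line width of 10, the first overflow is resolved by breaking at offset 3, but the remaining 12 units still overflow, so the loop breaks again at offset 9. Replacing the inner `while` with a single `if` would model the earlier behavior, which stopped after the first break.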
TreeHugger Robot
fde7453c82 Merge "Break grapheme clusters after viramas if they end a cluster" 2017-03-15 19:53:12 +00:00
Seigo Nonaka
3165aebe3b Do not keep FontCollection reference in Layout.
LayoutCache only keeps the result of layout and can outlive a
FontCollection that has been destroyed by GC.

This kind of failure will be caught by minikin_stress_tests in the
subsequent CL (I1bf4ba43e6e97cd04e7d6dd42d388dd17ce64c7b).

Test: ran minikin_tests
Bug: 36223724
Change-Id: I639b73c0f1041549158c43212a901c82df4b02db
2017-03-15 15:17:43 +00:00
Seigo Nonaka
8d9c9d7f20 Expose supportedAxes to frameworks/base
The list of supportedAxes is necessary for the return value of
setFontVariationSettings.

Bug: 35764323
Test: ran TextViewTest and PaintTest in cts
Change-Id: I52f244146ea0ce335df02c841f89285be2ed746e
2017-03-15 14:32:08 +00:00
Roozbeh Pournader
f4c0bd2e1b Break grapheme clusters after viramas if they end a cluster
Previously, we stayed on the conservative side and disallowed any
grapheme breaks (and thus cursoring) where a virama was followed by a
letter, since we did not know if the virama would be forming a
cluster with the letter or not. This created problems with Indic
languages with infrequent conjuncts, such as Tamil.

Now we use the information in calculated advances to find if a
cluster is formed. If there is no cluster, we break the grapheme and
allow cursoring after the virama.

Test: Unit tests added to GraphemeBreakTests and MeasurementTests.
Test: Also manually tested Tamil sequences.
Bug: 35721792
Change-Id: Ib159edb94b3ad6f693f0d3dad016b332b2cef447
2017-03-14 21:41:49 -07:00
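The advance-based check described in f4c0bd2e1b might look roughly like the sketch below. It rests on an assumption that is not stated in the commit message: that the advances array holds the full advance at the first code unit of each shaped cluster and 0 for code units folded into a preceding cluster. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical check: a grapheme break is allowed before the letter that
// follows a virama only if that letter starts its own cluster, which is
// assumed here to show up as a non-zero advance at its offset.
bool canBreakAfterVirama(const std::vector<float>& advances,
                         std::size_t offsetAfterVirama) {
    if (offsetAfterVirama >= advances.size()) return true;  // end of text
    return advances[offsetAfterVirama] != 0.0f;
}
```

Under that assumption, a Tamil letter + virama + letter sequence with advances {10, 0, 10} allows a break at offset 2, while a sequence shaped as a single conjunct with advances {14, 0, 0} does not.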
Seigo Nonaka
44914ce013 Revert "Use std::mutex instead of android::Mutex"
This reverts commit 0eac702718.

Bug: 36208043
Test: N/A

Change-Id: I165ab7a0718ea50a8034adb6277809e271fd762c
2017-03-14 10:48:42 -07:00
Seigo Nonaka
7945b2d019 Fix build failure due to unexpected merge.
FontLanguageListCache::kEmptyListId is gone, use kEmptyLanguageListId
instead.

Test: N/A
Change-Id: I96075849c53f23fbce8dbc180a51d8f97e45f316
2017-03-13 16:42:46 -07:00
TreeHugger Robot
aed4ec33ad Merge "Make SparseBitSet serializable." 2017-03-13 23:07:04 +00:00
Seigo Nonaka
ff9a6740ed Make SparseBitSet serializable.
To share the calculated coverage information across processes, make
SparseBitSet serializable.

Bug: 34042446
Test: minikin_tests passes
Change-Id: I0463138adcf234739bb3ce1cdadf382021921f3e
2017-03-13 14:07:55 -07:00
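As a rough illustration of what serializing coverage information involves, the sketch below round-trips a list of code point ranges through a byte buffer. It is not SparseBitSet's actual layout, which the commit message does not describe; the Range alias, both function names, and the host-endian count-plus-pairs format are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

using Range = std::pair<uint32_t, uint32_t>;  // [start, end) of covered code points

// Writes a 32-bit count followed by (start, end) pairs, in host byte order.
std::vector<uint8_t> serializeRanges(const std::vector<Range>& ranges) {
    std::vector<uint32_t> words;
    words.push_back(static_cast<uint32_t>(ranges.size()));
    for (const Range& r : ranges) {
        words.push_back(r.first);
        words.push_back(r.second);
    }
    std::vector<uint8_t> out(words.size() * sizeof(uint32_t));
    std::memcpy(out.data(), words.data(), out.size());
    return out;
}

// Reads the format above; returns an empty list if the buffer is truncated.
std::vector<Range> deserializeRanges(const std::vector<uint8_t>& bytes) {
    std::vector<Range> ranges;
    if (bytes.size() < sizeof(uint32_t)) return ranges;
    uint32_t count = 0;
    std::memcpy(&count, bytes.data(), sizeof(count));
    const std::size_t needed =
            sizeof(uint32_t) * (1 + 2 * static_cast<std::size_t>(count));
    if (bytes.size() < needed) return ranges;
    for (std::size_t i = 0; i < count; ++i) {
        uint32_t pair[2];
        std::memcpy(pair, bytes.data() + sizeof(uint32_t) * (1 + 2 * i),
                    sizeof(pair));
        ranges.emplace_back(pair[0], pair[1]);
    }
    return ranges;
}
```

A receiving process can rebuild the same coverage from the bytes without opening the font file, which is the motivation the commit gives for making the structure serializable.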
Seigo Nonaka
0eac702718 Use std::mutex instead of android::Mutex
This CL includes:
- Stop using utils/Mutex and use std::mutex instead.
- Stop using utils/Singleton.

Test: minikin_tests passed
Change-Id: Ib3f75b83397a546472bb5f91e066e44506e78263
2017-03-13 14:03:05 -07:00
Seigo Nonaka
684ac0b636 Reduce memory usage of FontCollection.
This is the 2nd attempt at I9e01d237c9adcb05e200932401cb1a4780049f86.

The previous CL was reverted because 8-bit integers were too small to
store the indices of mFamilyVec. This CL changes it to 16-bit integers
since size_t is still unnecessarily large.

Theoretically, 32-bit integers are necessary for the indices of
mFamilyVec since the size of mFamilyVec can be 0x10EE01. However, in
practice, 16-bit integers are enough for the indices of mFamilyVec.
The length of mFamilyVec for the system fonts is 2084. Even if the
developers load their own very large fonts, it can only increase the
number of elements in mFamilyVec to at most 0x10FF.

As a result, memory usage of the FontCollections for the system fonts
decreases as follows.
64-bit process: before: 398,264 bytes, after: 282,568 bytes (-115,696 bytes)
32-bit process: before: 199,132 bytes, after: 149,548 bytes (-49,584 bytes)

Bug: 33562608
Test: Verified Emoji and CJK characters are present.
Test: android.text.cts.EmojiTest passed
Test: Minikin unit tests passed
Change-Id: I6796fd55ac30fe30528a212ebf6097b1d672e2f8
2017-03-13 06:35:53 -07:00
Roozbeh Pournader
068c7ba2ea Customizable min suffix/prefix length for hyphenation in Minikin
With this change, different languages can have different minimum
lengths for suffixes and prefixes when hyphenating. Previously, the
defaults used for English, 2 and 3, were used for every language.

Bug: 35712376
Test: Minikin unit tests were updated and they pass
Change-Id: Iffaf11c6b208c57d28d45b17246e177572dc1210
2017-03-06 16:12:53 -08:00
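The effect of per-language minimum prefix and suffix lengths (068c7ba2ea) can be sketched as a filter over candidate hyphenation points. The function name and signature are hypothetical, and treating the English defaults 2 and 3 as prefix and suffix respectively is an assumption; the commit message does not say which value is which.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical filter: keeps only the candidate break positions that leave at
// least minPrefix characters before the break and minSuffix after it.
std::vector<std::size_t> filterHyphenationPoints(
        std::size_t wordLength, const std::vector<std::size_t>& candidates,
        std::size_t minPrefix, std::size_t minSuffix) {
    std::vector<std::size_t> kept;
    for (std::size_t pos : candidates) {
        if (pos > wordLength) continue;  // ignore out-of-range positions
        if (pos >= minPrefix && wordLength - pos >= minSuffix) {
            kept.push_back(pos);
        }
    }
    return kept;
}
```

For an 11-letter word with candidate breaks after 2 and 6 letters, a minimum prefix of 2 keeps both points, while a language configured with a minimum prefix of 3 keeps only the second.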
TreeHugger Robot
0b451587cc Merge "Remove sample directory" 2017-03-06 23:47:55 +00:00
Roozbeh Pournader
22d4e7e3f3 Remove sample directory
Since there are no known users of Minikin outside Android yet, these
files are simply a maintenance burden with no actual benefit.
Removing the samples until there are potential external users.

Test: Not needed
Change-Id: If7f1fb775cae427fbe31b86c202d1380c701bf28
2017-03-06 14:05:23 -08:00
Roozbeh Pournader
319073941e Correct hyphenation for various complex cases
This adds better support for Arabic script languages, Armenian,
Catalan, Hebrew, Kannada, Malayalam, Polish, Tamil, and Telugu by
adding various hyphenation types and edits appropriate for the
locales.

For Arabic script languages, soft hyphens act transparently with
regard to joining: If a line is broken at a soft hyphen where the two
characters around the soft hyphen were joining each other before,
they will continue to appear joining if the line is broken at the
soft hyphen and a hyphen glyph is inserted.  This is needed for
Central Asian languages such as Uighur.

For Armenian, U+058A ARMENIAN HYPHEN is used for line breaks caused
by either automatic hyphenation or soft hyphens.

For Catalan, nonstandard line breaks are implemented for "l·l", which
hyphenates as "l-/l".

For Polish, when there is a line break at a hyphen, the hyphen is
repeated at the next line.

For the South Indic languages, when breaks happen due to soft breaks
or automatic hyphenation, no visible hyphen is inserted, although a
penalty is added.

For Hebrew, support for using U+05BE HEBREW PUNCTUATION MAQAF has
been implemented, but it's turned off pending confirmation of
desirability.

Also, hard hyphens, which previously had no penalty added for
breaking the line after them, now have the same penalty as an
automatic or soft break, with the difference that no hyphen is
inserted when they break.

Finally, some bugs have been fixed with hyphenating multiscript and
multi-font words.

Bug: 19950445
Bug: 19955011
Bug: 25623243
Bug: 26154469
Bug: 26154471
Bug: 33387871
Bug: 33560754
Bug: 33752592
Bug: 33754204
Test: Unit tests added, plus thorough manual testing
Change-Id: Iaccf776ce8d1d434ee8b1c534ff3659d80fdc338
2017-03-02 15:26:13 -08:00
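Some of the locale-specific behaviors listed in 319073941e can be summarized as a mapping from a hyphenation style to the text inserted around a line break. The enum, struct, and function below are hypothetical and much simpler than the hyphenation types and edits the commit adds; Catalan, Hebrew, and the Arabic joining behavior are left out, and the use of U+2010 as the default hyphen is an assumption.

```cpp
#include <cassert>
#include <string>

enum class HyphenStyle { kDefault, kArmenian, kPolishRepeat, kSouthIndic };

struct LineEdit {
    std::u32string endOfLine;    // appended to the line being ended
    std::u32string startOfNext;  // prepended to the following line
};

// atHardHyphen is true when the break falls right after a hyphen that is
// already present in the text.
LineEdit editForBreak(HyphenStyle style, bool atHardHyphen) {
    if (atHardHyphen) {
        // Nothing is inserted at the end of the line; Polish repeats the
        // hyphen at the start of the next line.
        if (style == HyphenStyle::kPolishRepeat) return {U"", U"-"};
        return {U"", U""};
    }
    switch (style) {
        case HyphenStyle::kArmenian:
            return {U"\u058A", U""};  // U+058A ARMENIAN HYPHEN
        case HyphenStyle::kSouthIndic:
            return {U"", U""};  // no visible hyphen; only a penalty applies
        default:
            return {U"\u2010", U""};  // assumed default: U+2010 HYPHEN
    }
}
```

The penalty side of the change (hard hyphens now costing the same as automatic or soft breaks) would live in the line breaker's scoring and is not modeled here.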
Seigo Nonaka
2f1aebfcc6 Merge "Remove MinikinRefCounted and use shared_ptr instead" 2017-02-27 04:30:31 +00:00
Roozbeh Pournader
931945afb5 Add exception for Bulgarian to mk_hyb_file am: 131392748f am: 980c37e278
am: 76afbcadb6

Change-Id: Ieee903aff9b9759da30996d9c64cf0dcf94d7294
2017-02-24 14:46:53 +00:00
Roozbeh Pournader
76afbcadb6 Add exception for Bulgarian to mk_hyb_file am: 131392748f
am: 980c37e278

Change-Id: I960946e29e3aeb6bac9cc2028a2201e8c8d5ae4f
2017-02-24 14:43:52 +00:00
Roozbeh Pournader
980c37e278 Add exception for Bulgarian to mk_hyb_file
am: 131392748f

Change-Id: I551c3cae80bf13ca43f13548777691b244197df1
2017-02-24 14:41:45 +00:00
Roozbeh Pournader
131392748f Add exception for Bulgarian to mk_hyb_file
The Bulgarian hyphenation patterns contain a line consisting of '0ь0'
which has no practical effect on hyphenation. Add an exception in
roundtrip testing to make sure we don't fail while comparing our tables
with the input data.

Test: make -j works and creates .hyb files for bg and cu
Change-Id: Ia46b8a45fe522f5194d8105d31b34b0e27528cc9
(cherry picked from commit 6308ea4c4b)
2017-02-24 13:04:41 +00:00
Seigo Nonaka
a619860872 Remove MinikinRefCounted and use shared_ptr instead
Let's use shared_ptr since manual ref counting can be bug-prone and
using the global mutex inside the destructor has not been useful for some time.

To remove raw pointer manipulation, the Layout constructors needed to
change. Layout is no longer copyable, and a FontCollection must be
passed to its constructor.

Bug: 28119474
Test: minikin_tests passed
Test: hwui_unit_tests passed
Test: No performance regression in minikin_perftest.
Change-Id: I8824593206ecba74cbc9731e298f045e1ae442a3
2017-02-24 17:11:32 +09:00
Roozbeh Pournader
6308ea4c4b Add exception for Bulgarian to mk_hyb_file
The Bulgarian hyphenation patterns contain a line consisting of '0ь0'
which has no practical effect on hyphenation. Add an exception in
roundtrip testing to make sure we don't fail while comparing our tables
with the input data.

Test: make -j works and creates .hyb files for bg and cu
Change-Id: Ia46b8a45fe522f5194d8105d31b34b0e27528cc9
2017-02-22 18:48:18 -08:00
Seigo Nonaka
3fcf3634db Call hb_font_set_variation if font variations are provided.
Test: None
Change-Id: I203d9ba7e1a1fcfdb10cd6a711d9a35136cbddd6
2017-02-22 10:56:04 +09:00
Seigo Nonaka
480ad4a314 Pass region code to HarfBuzz.
Keep the region code and pass it to HarfBuzz when doing layout.

Test: minikin_tests
Bug: 30746293
Change-Id: I7c908701ca677238f663c82c597f8615d190e055
2017-02-15 20:29:38 +09:00
Roozbeh Pournader
30b9f327d3 Add U+202F NARROW NO-BREAK SPACE to the sticky white list
Mongolian fonts need to shape across U+202F NARROW NO-BREAK SPACE
(NNBSP). But if the first font in the fallback chain supports NNBSP,
it would break Mongolian shaping since the text would be broken into
three font runs. By making NNBSP sticky, we make sure Mongolian text
is kept in one font run and is shaped properly.

See http://www.unicode.org/L2/L2017/17036-mongolian-suffix.pdf for
background. The proposed character in the proposal was not accepted
for encoding by the Unicode Technical Committee, but the document
explains in more detail why this change is needed.

Bug: 34344220
Test: manual
Change-Id: I344a63f383fa5485875603570025eac3c4eb2574
2017-02-14 21:29:18 +00:00
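The effect of making a character "sticky" (30b9f327d3) can be sketched with a toy font itemizer. The Font struct, isSticky, and itemize are hypothetical names, and the rule that a sticky character stays with the previous character's font whenever that font has a glyph for it is an assumption about how the list is applied.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

struct Font {
    int id;
    std::function<bool(char32_t)> hasGlyph;
};

// Only the character added by this change; the real list has more entries.
bool isSticky(char32_t c) { return c == 0x202F; }  // NARROW NO-BREAK SPACE

// Returns the id of the font chosen for each code point of text.
std::vector<int> itemize(const std::u32string& text,
                         const std::vector<Font>& fallbackChain) {
    std::vector<int> fontIds;
    if (fallbackChain.empty()) return fontIds;
    const Font* prev = nullptr;
    for (char32_t c : text) {
        const Font* chosen = nullptr;
        if (prev != nullptr && isSticky(c) && prev->hasGlyph(c)) {
            chosen = prev;  // stay in the current font run
        } else {
            for (const Font& f : fallbackChain) {
                if (f.hasGlyph(c)) {
                    chosen = &f;
                    break;
                }
            }
        }
        if (chosen == nullptr) chosen = prev ? prev : &fallbackChain.front();
        fontIds.push_back(chosen->id);
        prev = chosen;
    }
    return fontIds;
}
```

With a fallback chain whose first font covers NNBSP but not Mongolian, a Mongolian letter, NNBSP, Mongolian letter sequence stays in the Mongolian font throughout instead of splitting into three font runs.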
TreeHugger Robot
44c14c81d1 Merge "Introduce unittests for flag sequence." 2017-02-08 17:00:39 +00:00
Seigo Nonaka
655d2a760d Introduce unittests for flag sequence.
Flag sequences are handled well by the latest ICU.
Just add unit tests to catch regressions in the future.

Test: ran minikin_tests
Change-Id: I78d5461de8ff4d002ca06fb5bb81fcd7bc45d95e
2017-02-09 00:10:35 +09:00
Seigo Nonaka
3c71c47a6a Update the instruction to run tests.
"adb sync data" pushes test data as well as test executables.

Test: ran minikin_perftests
Test: ran minikin_tests
Change-Id: I08219f8abc4b59bd26d8f9155975b65b56a88b7b
2017-02-07 15:52:25 +09:00
Seigo Nonaka
77a29ed5ec Introduce createCollectionWithVariation.
This is the 2nd attempt at I08e9b74192f8af1d045f1276498fa4e60d73863e.
The original CL was reverted because it conflicted with another CL submitted
earlier.

Here is the original commit message of the reverted change.

This lays the groundwork for variation settings support.
Since we should regard different variations of a font as different fonts, we
need to create new typefaces. To reuse the same instance of MinikinFont as
much as possible, FontFamily::createFamilyWithVariation now reuses an
existing instance, while incrementing the reference count.

Test: minikin_tests
Bug: 33062398
Change-Id: Ib25bf1bb5a5191e15a6523954146521464c91906
2017-01-31 13:19:56 +09:00
Seigo Nonaka
0afb39eaed Remove FontFamily.addFont and make FontFamily immutable.
This is the 2nd attempt at 0470cdb3e4.
The difference is the addition of clearElementsEithLock to the Font class,
which is necessary to delete Font objects outside of Minikin. This method
should be removed once http://b/28119474 is fixed.

Here is the original commit message of the reverted change.

This lays the groundwork for making SparseBitSet serializable.
FontFamily.addFont is only used when the FontFamily is constructed.
Thus, instead of calling FontFamily.addFont multiple times, pass the
Font list to the constructor. With this change, FontFamily can be
immutable.

By making FontFamily immutable, we can create a FontFamily with a
pre-calculated SparseBitSet.

Bug: 34042446
Bug: 28119474
Bug: 34378805
Test: minikin_tests has passed
Change-Id: Ice433931196f5ae79a1a7ee0c98020f914aeb5f2
2017-01-20 17:59:42 +09:00
Siyamed Sinir
92c0eb1f83 Revert "Remove FontFamily.addFont and make FontFamily immutable."
This reverts commit 0470cdb3e4.

Bug: 34378805
Change-Id: I8f1ee00b365c8b17c6140e9e286fbea082e31364
2017-01-20 02:01:21 +00:00
Siyamed Sinir
8a4ee2d1b4 Merge "Revert "Introduce createCollectionWithVariation."" 2017-01-20 01:59:54 +00:00
Siyamed Sinir
df0cbf3bc0 Revert "Introduce createCollectionWithVariation."
This reverts commit ed8318e4e8.

Bug: 34378805
Change-Id: I22b683f774813724f220b1b8584ab188f3cf4fa7
2017-01-20 01:13:24 +00:00
Siyamed Sinir
10bf33801b Merge "Revert "Reduce memory usage of FontCollection."" 2017-01-12 20:19:57 +00:00
Siyamed Sinir
defcd9d9c2 Revert "Reduce memory usage of FontCollection."
This reverts commit 41ef8b376f.

Test: Manually tested
Bug: 34247671
Change-Id: I0510009b2deac784770f26059681b1980800abc8
2017-01-12 20:13:11 +00:00
TreeHugger Robot
ac93b29649 Merge "Introduce createCollectionWithVariation." 2017-01-12 16:59:26 +00:00
Seigo Nonaka
7235e8c11d Fix inverse condition of forColorEmoji.
We should override the advance function only when the glyph comes
from a color bitmap. The inverted condition was introduced by
Ia88cb670ca9e0bb352bccef22c5ea3a789bcc1da.

Bug: 21705974
Test: ran minikin_tests
Change-Id: I3489d75ace8bffdd9035a5986a2641313feef04d
2017-01-12 18:35:27 +09:00
Seigo Nonaka
ed8318e4e8 Introduce createCollectionWithVariation.
This lays the groundwork for variation settings support.
Since we should regard different variations of a font as different fonts, we
need to create new typefaces. To reuse the same instance of MinikinFont as
much as possible, FontFamily::createFamilyWithVariation now reuses an
existing instance, while incrementing the reference count.

Test: minikin_tests
Bug: 33062398
Change-Id: I08e9b74192f8af1d045f1276498fa4e60d73863e
2017-01-12 18:26:46 +09:00
Seigo Nonaka
3ea8fce383 Merge "Use HarfBuzz metric implementation for emoji font." 2017-01-12 08:18:25 +00:00
Seigo Nonaka
5774d3acb6 Merge "Fix GraphemeBreak test failures." 2017-01-12 08:17:32 +00:00
TreeHugger Robot
d2cca3e411 Merge "Remove FontFamily.addFont and make FontFamily immutable." 2017-01-12 07:43:53 +00:00