PyKakasi ChangeLog¶
All notable changes to this project will be documented in this file.
v2.2.1 (10, July 2021)¶
Fixed¶
Add Zenkaku-Question(uFF1F) and other Zenkaku marks as endmark (#146)
v2.2.0 (22, June 2021)¶
Added¶
dictionary: add noun and adjectives from UniDic(#140)
Changed¶
Refactor main loop logic of convert()(#144)
Fixed¶
Fix segmentation (wakati) for combinations of Katakana and Hiragana(#142)
v2.1.1 (16, May 2021)¶
Added¶
Provide Kakasi.normalize(text) class method
Add unidic data into data (not used yet), and add parse utility.
Fixed¶
Put type hint stub into package
Copyright notifications
Changed¶
Expand all cletter into dictionary (#139)
Change primary kanwadict index from str to int
test: gather all legacy tests into the test_pykakasi_legacy.py file.
v2.1.0 (6, May 2021)¶
Added¶
Deprecation warning when using old api(#124)
Add type hint file(pyi) (#124)
Benchmark test codes(#122)
Changed¶
Cache internal results and improve performance by about 30-40 times.(#128)
Use standard pickle for database file(#128)
Exceptions module is now pykakasi, not pykakasi.exceptions
Removed¶
Dependency for klepto(#128)
v2.0.8 (4, May 2021)¶
Added¶
test: Benchmark and profiling (#122)
Changed¶
Performance: avoid ord() when checking long-mark, speed up about 6%
Reformat code by black(#121)
v2.0.7 (26, Feb. 2021)¶
Fixed¶
Infinite loop after running for a while, handle independent HW VOICED SOUND MARK (#115, #118)
v2.0.5 (5, Feb. 2021)¶
Changed¶
CLI: use argparse for option parse(#113)
Fixed¶
Handle 思った、言った、行った properly.(#114)
CI: fix coveralls error
Deprecated¶
CI: drop travis-ci test and badge
PyKakasi ChangeLog before v1.0¶
All notable changes to this project will be documented in this file.
v2.0.0a6 (30, Mar. 2020)¶
Added¶
Understand more kanji variations.
Fixed¶
Fix IVS handling to return correct word length to consume.
v2.0.0a5 (23, Mar. 2020)¶
Changed¶
Recognize Unicode standard Ideographic Variation Selector(IVS) and transliterate when used.(#97)
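As a rough illustration of what recognizing an IVS involves, the sketch below pairs each base character with a following variation selector from the Variation Selectors Supplement block (U+E0100 to U+E01EF). The helper names `is_ivs` and `split_base_and_ivs` are hypothetical and are not part of pykakasi; the library's actual handling may differ.

```python
# Hypothetical helpers for illustration only; not pykakasi's implementation.
IVS_START = 0xE0100  # VARIATION SELECTOR-17
IVS_END = 0xE01EF    # VARIATION SELECTOR-256


def is_ivs(ch: str) -> bool:
    """Return True if ch is an Ideographic Variation Selector."""
    return IVS_START <= ord(ch) <= IVS_END


def split_base_and_ivs(text: str) -> list:
    """Return (base, selector_or_None) pairs for each character in text."""
    pairs = []
    for ch in text:
        # Attach a selector to the preceding base character if it has none yet.
        if is_ivs(ch) and pairs and pairs[-1][1] is None:
            pairs[-1] = (pairs[-1][0], ch)
        else:
            pairs.append((ch, None))
    return pairs
```

A base character followed by a selector occupies two code points in the input, so a converter that looks up only the base character still has to consume both.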
v2.0.0a4 (20, Mar. 2020)¶
Added¶
Add type hinting.
Changed¶
Refactoring dictionary generation classes.
call super() from wakati.__init__()
test: detect whether the tests run under tox or raw pytest via the TOX_ENV environment variable. Under raw pytest, generate dictionaries as a fixture. Previous versions used the --runenv option for pytest.
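The environment check described in the test entry above could look roughly like the following. This is a minimal sketch, assuming tox exports a non-empty TOX_ENV variable; the function name `running_under_tox` is hypothetical and does not come from the project's conftest.py.

```python
import os


def running_under_tox(environ=None) -> bool:
    """Return True when TOX_ENV is set to a non-empty value.

    Assumption: the tox configuration exports TOX_ENV; raw pytest runs do not.
    """
    env = os.environ if environ is None else environ
    return bool(env.get("TOX_ENV"))
```

A test fixture can then generate dictionaries only when this returns False and rely on the installed dictionary otherwise.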
Fixed¶
NewAPI: fix return value when empty input string.
v2.0.0a3 (18, Mar. 2020)¶
Changed¶
Update test cases.
Fixed¶
Add guard for unknown symbol code points, which led to a NoneType error.
v2.0.0a2 (16, Mar. 2020)¶
Added¶
NewAPI: support kunrei and passport roman conversion rule.
Changed¶
CI: test by github actions
Fixed¶
Support extended kana(#77) (U0001b150-U0001b152, U0001b164-U0001b167)
v2.0.0a1 (14, Mar. 2020)¶
Added¶
Structured interface of Kakasi class.(#21)
Changed¶
Github workflows for packaging and release.(#91)
Fixed¶
fix data kakasidict.utf8: “本蓮沼”
Deprecated¶
Drop python 2.7 support.
v1.2 (26, Sep, 2019)¶
Fixed¶
Fix out-of-index error when kana-dash is placed on first of same character group.(#85)
v1.1a1 (8, Jul, 2019)¶
Changed¶
pytest: now run on project root without tox, by generating dictionary as a test fixture.
tox: run tox test with installed dictionary instead of a generated fixture.
Optimize kana conversion function.
Move kakasidict.py to src and conftest.py to tests
Fixed¶
Version naming follows PEP386.
Sometimes fails to insert space after punctuation(#79).
Special case in kana-roman passport conversion such as ‘etchu’ etc.
v1.0-rc1 (29, June, 2019)¶
Added¶
Threading test.
Test with Chinese kanji.
Test with extended kana which is outside the Unicode BMP.
t flag to specify not to change unknown characters to ???.
Changed¶
Refactoring itaiji and kanwa class as a thread-safe borg class.
Fixed¶
Fix test case issue68_2 for missing characters
v0.95 (8, June, 2019)¶
Added¶
Add manual document holder.
Test on Azure-Pipelines.
Tox has a check test pipeline
Add classifier to setup.py
Changed¶
Drop support for Python 3.4, which reached end-of-life in March 2019.
Add support for pypy and test on Travis-CI.
Version information on __init__.py
Use ‘tox’ and ‘pytest’ for test runner instead of ‘unittest’.
Fixed¶
Fix keyerror for some characters(#68).
Fix coveralls source code reference.
Removed¶
Test on AppVeyor
v0.94 (16, Feb, 2019)¶
Added¶
Implement word split feature by @oxij (#58).
Changed¶
Improve setup.py build script to generate pickled files when building bdist.
Use pytest and pytest-cov for unittest.
Use tox for CI/CD in travis-CI and appveyor.
Fixed¶
Kanwadict: remove entry for 市立 as ichiritsu
Issue #59: fix 0x30f7-30fc katakana conversion to be the same as for Hiragana.
Appveyor: twine upload credential environment variable name.
Deprecated¶
Drop python2.6 and python 3.3 from test target.
v0.93 (3, May, 2018)¶
Added¶
Add test for two types of exceptions
Add test for Upper case flags
Add Upper case flag with E2a mode.
Changed¶
Release source distribution from appveyor.
Refactoring how to import six
Fixed¶
Exception when converting Fullwidth colon uFF1A (#51)
Fixed non-working Upper case flag (“U”) which caused an exception
Removed¶
Drop canConvert method from itaiji.
v0.91 (29, Apr., 2018)¶
Added¶
Test case convert from Full-width Alphabet/symbols to Half-width (E2a).
Convert logic from Full-width alphabet/symbols to Half-width (E2a).
Add more words with repeat mark from SKK-JISYO.L (#46)
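The full-width to half-width (E2a) conversion listed above can be sketched as a code point shift. This is a minimal illustration, not pykakasi's actual E2a logic: it assumes only the full-width ASCII variants (U+FF01 to U+FF5E) and the ideographic space are converted, and the name `to_halfwidth` is hypothetical.

```python
# Hypothetical sketch for illustration; not pykakasi's E2a implementation.
FULLWIDTH_START = 0xFF01  # FULLWIDTH EXCLAMATION MARK
FULLWIDTH_END = 0xFF5E    # FULLWIDTH TILDE
OFFSET = 0xFEE0           # distance from a full-width character to its ASCII form

# Translation table: full-width ASCII variants plus the ideographic space.
_E2A_TABLE = {cp: cp - OFFSET for cp in range(FULLWIDTH_START, FULLWIDTH_END + 1)}
_E2A_TABLE[0x3000] = 0x20  # IDEOGRAPHIC SPACE -> SPACE


def to_halfwidth(text: str) -> str:
    """Map full-width alphabet, digits, and symbols to half-width ASCII."""
    return text.translate(_E2A_TABLE)
```

Characters outside the table, such as kanji and kana, pass through unchanged, so the function can be applied to mixed text.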
Changed¶
Do not distribute a binary wheel package, because the dictionary data depends on the Python version.
Fixed¶
Conversion of ○々 raised ‘TypeError: must be str, not NoneType’ (#46)
Appveyor: update deployment script.
v0.83 (29, Mar., 2018)¶
Fixed¶
Appveyor: fix twine not found error in deploy script
setup: clean old dictionary when building
v0.82 (29, Mar., 2018)¶
Added¶
Russian characters defined in JIS X0208(#13)
Changed¶
README: fix typo and add description for Kigou conversion.
README: update sample code to working one.
Appveyor: generate wheel artifacts
Fixed¶
MANIFEST: update to specify kanwadict3.db explicitly.
setup.py: allow reading README.rst written in UTF-8.
v0.80 (28, Mar., 2018)¶
Here is a release candidate for v1.0
Added¶
Readme: add dependency description.
Changed¶
Bump up version number.
Readme: recommend ‘pip install pykakasi’
Replace anydbm with semidbm, a pure-Python dbm implementation with good performance.
Fixed¶
Reduce test warnings.
No platform dependency now.
Fix dependency in the wheel package, which depended on gdbm in the previous release.
Removed¶
Binary release for windows and linux.
v0.26 (26, Mar., 2018)¶
Changed¶
Use six for Python 2 and 3 compatibility code.
Fixed¶
Build wheel with platform names.
v0.25 (25, Mar., 2018)¶
Added¶
Test on Python 3.5 and Python 3.6
Test on Windows using AppVeyor
Measure test coverage and monitor on coveralls.io
Changed¶
Move dictionary data to pykakasi/data
Build dictionary when running setup.py build
Recommend installation from github source, not pypi. (#17)
Converter configuration is now per instance, not class wide.
Fixed¶
kakasi.py: Fix exception class name typo of InvalidFlagValueException
kakasi.py, h2a.py, k2a.py: Do not import all exception class.
test_genkanwadict.py: Multi platform support for temp directory(#27).
setup.py: change _pre_build() to pre_build() (#17).
v0.23 (25, May., 2014)¶
Support following options in kakasi command.
same as original kakasi:
-J{aKH} -K{aH} -H{aK} -E{a} -rk -rh -w -s -S -C
additional options:
-v --version -h --help -O --output: output file -I --input: input file
Change default behavior to be almost the same as original kakasi
Zenkaku numbers conversion
Passport roman conversion table
Version 0.10 (25, April, 2014)¶
Works on Python 2.6, 2.7, 3.3, 3.4 (Thanks @FGtatsuro)
Kunrei and Hepburn roman table