discourse/spec/lib/tiny_japanese_segmenter_spec.rb
Alan Guo Xiang Tan 930f51e175 FEATURE: Split up text segmentation for Chinese and Japanese.
* Chinese segmentation will continue to rely on cppjieba
* Japanese segmentation will use our port of TinySegmenter
* Korean currently does not rely on segmentation, which was dropped in c677877e4f
* SiteSetting.search_tokenize_chinese_japanese_korean has been split
into SiteSetting.search_tokenize_chinese and
SiteSetting.search_tokenize_japanese respectively
2022-02-07 09:21:14 +08:00

# frozen_string_literal: true

require 'rails_helper'

describe TinyJapaneseSegmenter do
  describe '.segment' do
    it 'generates the segments for a given japanese text' do
      expect(TinyJapaneseSegmenter.segment("TinySegmenterはJavascriptだけ書かれた極めてコンパクトな日本語分かち書きソフトウェアです。")).to eq(
        %w{TinySegmenter は Javascript だけ 書か れ た 極め て コンパクト な 日本 語分 かち 書き ソフトウェア です 。}
      )
    end
  end
end