string-direction is a Ruby library that automatically detects the direction (left-to-right, right-to-left or bi-directional) in which a text should be displayed.
require 'string-direction'
detector = StringDirection::Detector.new
detector.direction('english') #=> 'ltr'
detector.direction('العربية') #=> 'rtl'
detector.direction("english العربية") #=> 'bidi'
detector.ltr?('english') #=> true
detector.rtl?('العربية') #=> true
detector.bidi?('english') #=> false
If you prefer, you can monkey patch String:
String.send(:include, StringDirection::StringMethods)
'english'.direction #=> 'ltr'
'العربية'.rtl? #=> true
string-direction uses several strategies to try to detect the direction of a string. The detector runs them one at a time and returns the result as soon as one of them succeeds, skipping any further analysis.
Strategies are passed to the detector during its initialization:
detector = StringDirection::Detector.new(:foo, :bar)
In the above example, the classes StringDirection::FooStrategy and StringDirection::BarStrategy have to be in the load path.
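The lookup and fall-through described above can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the library's actual code: the DirectionSketch namespace, its two toy strategies, and the choice to return nil when no strategy succeeds are all hypothetical.

```ruby
# Hypothetical stand-in namespace; NOT the real StringDirection internals.
module DirectionSketch
  # A strategy returns 'ltr', 'rtl', 'bidi', or nil when it cannot decide.
  class FooStrategy
    def run(_string)
      nil # never decides, so the detector moves on to the next strategy
    end
  end

  class BarStrategy
    def run(string)
      string.to_s.empty? ? nil : 'ltr'
    end
  end

  class Detector
    def initialize(*names)
      # :foo -> "FooStrategy", :always_ltr -> "AlwaysLtrStrategy"
      @strategies = names.map do |name|
        class_name = name.to_s.split('_').map(&:capitalize).join + 'Strategy'
        DirectionSketch.const_get(class_name).new
      end
    end

    # Runs the strategies in order and returns the first non-nil result.
    # Returning nil when every strategy declines is an assumption.
    def direction(string)
      @strategies.each do |strategy|
        result = strategy.run(string)
        return result unless result.nil?
      end
      nil
    end
  end
end

detector = DirectionSketch::Detector.new(:foo, :bar)
puts detector.direction('english') # prints "ltr"
```

Here FooStrategy declines, so the result comes from BarStrategy, the second strategy in the list.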
Three strategies are built in: marks, characters and dominant. marks and characters are used as the default strategies if no arguments are given to the detector.
The marks strategy looks for the presence of Unicode direction marks: left-to-right (\u200e) or right-to-left (\u200f).
detector = StringDirection::Detector.new(:marks)
detector.direction("\u200eالعربية") #=> "ltr"
detector.direction("\u200fEnglish") #=> "rtl"
The marks strategy can analyze not only strings but anything that responds to to_s.
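As a rough illustration of the idea, a mark check can be written in a few lines of plain Ruby. The method name direction_from_marks is hypothetical, and treating a string that contains both marks as 'bidi' is an assumption made for this sketch rather than documented behaviour.

```ruby
LTR_MARK = "\u200e" # LEFT-TO-RIGHT MARK
RTL_MARK = "\u200f" # RIGHT-TO-LEFT MARK

# Hypothetical helper: returns 'ltr', 'rtl', 'bidi', or nil if no mark is found.
def direction_from_marks(object)
  string = object.to_s # accept anything that responds to to_s
  has_ltr = string.include?(LTR_MARK)
  has_rtl = string.include?(RTL_MARK)

  return 'bidi' if has_ltr && has_rtl # assumption: both marks means bidi
  return 'ltr' if has_ltr
  return 'rtl' if has_rtl

  nil # no marks: let another strategy decide
end

puts direction_from_marks("\u200eالعربية") # prints "ltr"
puts direction_from_marks("\u200fEnglish") # prints "rtl"
```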
The characters strategy looks for the presence of right-to-left characters in the scripts used in the string.
By default, string-direction considers the following scripts to be written right-to-left:
- Arabic
- Hebrew
- Nko
- Kharoshthi
- Phoenician
- Syriac
- Thaana
- Tifinagh
detector = StringDirection::Detector.new(:characters)
detector.direction('english') #=> 'ltr'
detector.direction('العربية') #=> 'rtl'
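One way to picture this check is with Ruby's \p{Script} regular expression properties. The sketch below is not the library's implementation; the helper name and the rule that any letter outside the right-to-left scripts counts as left-to-right are assumptions.

```ruby
DEFAULT_RTL_SCRIPTS = %w[
  Arabic Hebrew Nko Kharoshthi Phoenician Syriac Thaana Tifinagh
].freeze

# Hypothetical helper: classifies each letter as rtl or ltr by its script.
def direction_from_characters(object, rtl_scripts = DEFAULT_RTL_SCRIPTS)
  # Build a character class such as [\p{Arabic}\p{Hebrew}...]
  properties = rtl_scripts.map { |script| "\\p{#{script}}" }.join
  rtl_regexp = Regexp.new("[#{properties}]")

  letters = object.to_s.scan(/\p{L}/) # ignore spaces, digits, punctuation
  rtl, ltr = letters.partition { |char| char.match?(rtl_regexp) }

  return 'bidi' if rtl.any? && ltr.any?
  return 'rtl' if rtl.any?
  return 'ltr' if ltr.any?

  nil # no letters at all
end

puts direction_from_characters('english')         # prints "ltr"
puts direction_from_characters('العربية')         # prints "rtl"
puts direction_from_characters('english العربية') # prints "bidi"
```

Passing a longer list as the second argument, for example DEFAULT_RTL_SCRIPTS + ['Runic'], mimics extending the set of right-to-left scripts.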
You can change these defaults:
detector.direction('ᚪᚫᚬᚭᚮᚯ') #=> 'ltr'
StringDirection.configure do |config|
  config.rtl_scripts << 'Runic'
end
detector.direction('ᚪᚫᚬᚭᚮᚯ') #=> 'rtl'
This is mainly useful for scripts that can be written both left-to-right and right-to-left:
- Bopomofo
- Carian
- Cypriot
- Lydian
- Old_Italic
- Runic
- Ugaritic
Keep in mind that only scripts recognized by Ruby regular expressions are allowed.
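If you are unsure whether Ruby recognizes a script name, you can test it by compiling a \p{...} property at runtime. The helper below is a hypothetical convenience and not part of string-direction; it relies only on Regexp.new raising RegexpError for an unknown property name.

```ruby
# Hypothetical helper: true if Ruby accepts \p{name} as a character property.
def script_recognized?(name)
  Regexp.new("\\p{#{name}}")
  true
rescue RegexpError
  false
end

puts script_recognized?('Runic')      # prints "true"
puts script_recognized?('NotAScript') # prints "false"
```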
The characters strategy can analyze not only strings but anything that responds to to_s.
With the dominant strategy, a string can be left-to-right or right-to-left, but never bidi. It returns whichever direction has more characters in the string.
detector = StringDirection::Detector.new(:dominant)
detector.direction('e العربية') #=> 'rtl'
detector.direction('english ة') #=> 'ltr'
As with the characters strategy, you can change which scripts are considered right-to-left.
The dominant strategy can analyze not only strings but anything that responds to to_s.
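The counting idea behind this strategy can be sketched as follows. This is an illustration only: the helper name is hypothetical, and both the tie-break (ties resolve to 'ltr') and the nil result for strings without letters are assumptions, since neither case is specified here.

```ruby
# Letters from the default right-to-left scripts listed earlier.
RTL_LETTER = /[\p{Arabic}\p{Hebrew}\p{Nko}\p{Kharoshthi}\p{Phoenician}\p{Syriac}\p{Thaana}\p{Tifinagh}]/

# Hypothetical helper: returns 'rtl' or 'ltr' by majority, never 'bidi'.
def dominant_direction(object)
  letters = object.to_s.scan(/\p{L}/)
  return nil if letters.empty? # assumption: nothing to count

  rtl_count = letters.count { |char| char.match?(RTL_LETTER) }
  ltr_count = letters.size - rtl_count

  # Assumption: a tie resolves to 'ltr'.
  rtl_count > ltr_count ? 'rtl' : 'ltr'
end

puts dominant_direction('e العربية') # prints "rtl"
puts dominant_direction('english ة') # prints "ltr"
```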
You can define your own strategies. To do so, define a class inside the StringDirection module with a name ending in Strategy. This class must respond to an instance method run, which takes the string as its argument. You can inherit from StringDirection::Strategy to get the convenience methods ltr, rtl and bidi, which return the expected result. If the strategy doesn't know the direction, it must return nil.
class StringDirection::AlwaysLtrStrategy < StringDirection::Strategy
  def run(string)
    ltr
  end
end
detector = StringDirection::Detector.new(:always_ltr)
detector.direction('العربية') #=> 'ltr'
marks and characters are the default strategies, but you can change them:
StringDirection.configure do |config|
  config.default_strategies = [:custom, :marks, :always_ltr]
end
If you wish, you can monkey patch String:
String.send(:include, StringDirection::StringMethods)
'english'.direction #=> 'ltr'
In that case, the strategies configured in string_methods_strategies are used:
StringDirection.configure do |config|
  config.string_methods_strategies = [:marks, :characters]
end
string-direction follows the principles of semantic versioning.
I'm not an expert in either world scripts or Unicode. So please, if you know of a case where this library does not work as it should, open an issue and help improve it.
Omniglot.com, where I learnt which of the scripts recognized by Ruby have a right-to-left writing system.
Copyright 2013 Marc Busqué
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.