Rules

Ocelot uses user-configurable rules to filter and match segments in an XLIFF document based on their metadata. A rule can contain:

  • An ITS data category to match
  • A value or range of values for that data category
  • Color information which is used to flag matching segments in the Ocelot UI

Configuring Rules

ITS Rules are defined in the rules.properties file located in the .ocelot directory in the user's home directory. If this file does not exist, you will need to create it. The ITS Rules are stored in Java .properties format, and can be edited with any text editor. Ocelot expects this file to be UTF-8.

Defining a Rule

Rule properties have a hierarchical structure of the form:

rule_name.category[.subcategory] = value

Each of these parts will be explained in turn.
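
As a rough illustration of how the parts fit together, the sketch below annotates two properties of a single rule. The rule name Low_MT and its values are hypothetical, and the comment lines assume that lines starting with # are ignored, as in the standard Java .properties format.

```properties
# rule_name = Low_MT, category = mtConfidence (no subcategory)
Low_MT.mtConfidence = 0.0-0.3
# rule_name = Low_MT, category = flag, subcategory = fill
Low_MT.flag.fill = #ffcc00
```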

Rule names

Each rule should have a unique name. This name is used as the prefix for every property in the rule definition. For example, the following rule is named Generic_Critical:

Generic_Critical.flag.text = *
Generic_Critical.flag.fill = #00ff00
Generic_Critical.flag.border = #00ff00
Generic_Critical.locQualityIssueSeverity = 50-100

Categories, Subcategories, and Values

Each rule property must contain a category. The acceptable values for each property depend on the category of that property.

ITS 2.0 Localization Quality categories

These categories are for use in rules that match segments with ITS 2.0 LQI metadata. By combining several properties with different categories, you can select segments that match criteria like "omission errors with a severity greater than 75". For reference, refer to the Localization Quality Issue section of ITS 2.0.

Category                 Values
locQualityIssueType      One or more ITS 2.0 issue type values. Use | to divide them, for example terminology|omission.
locQualityIssueSeverity  A range of numerical values from 0-100, in the form min-max.
locQualityIssueComment   Any string (interpreted as a regular expression).
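
The "omission errors with a severity greater than 75" criterion mentioned above might be written roughly as follows. This is only a sketch: the rule names and flag values are made up for illustration, and whether the range endpoints are inclusive is an assumption, so 75-100 may also match a severity of exactly 75. The second rule wraps its pattern in .* so that it should behave the same whether the regular expression has to match the whole comment or only part of it.

```properties
# Hypothetical rule: omission issues in the upper severity range
Severe_Omission.locQualityIssueType = omission
Severe_Omission.locQualityIssueSeverity = 75-100
Severe_Omission.flag.text = O
Severe_Omission.flag.fill = #ff9500
Severe_Omission.flag.border = #ff9500

# Hypothetical rule: any issue whose comment mentions "TODO"
Todo_Comment.locQualityIssueComment = .*TODO.*
Todo_Comment.flag.text = ?
Todo_Comment.flag.fill = #ffff00
Todo_Comment.flag.border = #ffff00
```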

ITS 2.0 Provenance categories

These categories match segments that have the specified provenance category and a matching value. For reference, refer to the Provenance section of ITS 2.0.

Category   Values
org        Any string (interpreted as a regular expression).
person     Any string (interpreted as a regular expression).
tool       Any string (interpreted as a regular expression).
revOrg     Any string (interpreted as a regular expression).
revPerson  Any string (interpreted as a regular expression).
revTool    Any string (interpreted as a regular expression).
provRef    Any string (interpreted as a regular expression).
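
A provenance rule could look something like the sketch below. The tool name ExampleMT and the rule name are placeholders, not real values. Because this page does not say whether the expression must match the entire value or just a portion of it, the pattern is wrapped in .* to cover both cases.

```properties
# Hypothetical rule: segments whose provenance names a tool containing "ExampleMT"
ExampleMT_Output.tool = .*ExampleMT.*
ExampleMT_Output.flag.text = T
ExampleMT_Output.flag.fill = #8080ff
ExampleMT_Output.flag.border = #8080ff
```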

ITS 2.0 MT Confidence category

This category selects for segments with an MT Confidence score in the matching range. For reference, see the MT Confidence section of ITS 2.0.

Category      Values
mtConfidence  A range of numerical values from 0-1, in the form min-max.

Flags and Visual Appearance

The flag category is special: it controls the visual appearance of segments that match the rule. A rule can have multiple flag properties, each with a different subcategory. The available subcategories are:

Subcategory  Values
text         The glyph/character used to render a flag next to matching segments. You can enter multiple characters, but only the first will be used. Unicode characters (such as ☃) are allowed.
fill         The background color of the flag, in the form #RRGGBB.
border       The color of the border of the flag, in the form #RRGGBB.
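
For instance, a rule that uses all three flag subcategories with a Unicode glyph might look like this. The rule name, severity range, and colors are illustrative only, and the literal ☃ assumes the file is saved as UTF-8, as described under Configuring Rules.

```properties
# Hypothetical rule: mark high-severity issues with a snowman glyph on a blue flag
Snowman.locQualityIssueSeverity = 90-100
Snowman.flag.text = ☃
Snowman.flag.fill = #0000ff
Snowman.flag.border = #000080
```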

Quick-add rules

You can assign hotkeys to commonly used localization quality issue (LQI) data, so that you can quickly add common issues to a document. To do this, include rules with the special quickAdd category and its subcategories:

Subcategory              Values
locQualityIssueType      The type of LQI to add. This should be a valid ITS 2.0 Localization Quality Issue type, such as omission or mistranslation.
locQualityIssueSeverity  The severity of the LQI to add. This should be a valid ITS 2.0 LQI severity between 0 and 100, inclusive.
locQualityIssueComment   The comment to attach to the LQI.
hotkey                   The hotkey to which this quick-add rule should be attached. This should be a number from 0-9, corresponding to the hotkeys Ctrl-0 to Ctrl-9 on Windows and Linux, or ⌘-0 to ⌘-9 on the Mac.
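
A quick-add definition that also sets a comment might be sketched as follows. The rule name, severity, and comment text are hypothetical. This page does not state whether a quick-add rule must also define matching and flag properties; the worked example under Examples includes them, so adding them here may be necessary.

```properties
# Hypothetical quick-add rule bound to hotkey 2 (Ctrl-2, or ⌘-2 on the Mac)
QuickAddOmission.quickAdd.locQualityIssueType = omission
QuickAddOmission.quickAdd.locQualityIssueSeverity = 50
QuickAddOmission.quickAdd.locQualityIssueComment = Text missing from the translation
QuickAddOmission.quickAdd.hotkey = 2
```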

State-qualifier Rules

Ocelot allows for several special rules to indicate the visual appearance of segments with specific values for the XLIFF 1.2 state-qualifier attribute. State-qualifier rules are the only rules that don't have normal names. They are always of the form:

state-qualifier-value = #RRGGBB

Here, state-qualifier-value is one of id-match, exact-match, fuzzy-match, or mt-suggestion.
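
As an illustration, all four values could be given colors in one place. The colors below are arbitrary choices, not defaults.

```properties
# Hypothetical color choices for each supported state-qualifier value
id-match = #00ff00
exact-match = #80ff80
fuzzy-match = #ffff80
mt-suggestion = #ff8080
```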

Examples

Escalating severity levels

These four rules all match LQI non-conformance issues, but they each match a different severity band and assign it a different color flag.

Critical_non-conformance.locQualityIssueType = non-conformance
Critical_non-conformance.locQualityIssueSeverity = 90-100
Critical_non-conformance.flag.text = *
Critical_non-conformance.flag.fill = #ff0000
Critical_non-conformance.flag.border = #ff0000
Significant_non-conformance.locQualityIssueType = non-conformance
Significant_non-conformance.locQualityIssueSeverity = 80-90
Significant_non-conformance.flag.text = *
Significant_non-conformance.flag.fill = #ff9500
Significant_non-conformance.flag.border = #ff9500
Important_non-conformance.locQualityIssueType = non-conformance
Important_non-conformance.locQualityIssueSeverity = 70-80
Important_non-conformance.flag.text = *
Important_non-conformance.flag.fill = #ffff00
Important_non-conformance.flag.border = #ffff00
Possible_non-conformance.locQualityIssueType = non-conformance
Possible_non-conformance.locQualityIssueSeverity = 60-70
Possible_non-conformance.flag.text = *
Possible_non-conformance.flag.fill = #ff56e1
Possible_non-conformance.flag.border = #ff56e1

The result is a set of similar flags that differ only in their highlight color:

[Image: non-conformance-rules-example.png]

Flag all segments with low MT Confidence

Unconfident.mtConfidence = 0.0-0.5
Unconfident.flag.fill = #808020
Unconfident.flag.border = #808020
Unconfident.flag.text = ~

Flag segments with any LQI of severity 3

Severity_3.locQualityIssueSeverity = 3-3
Severity_3.locQualityIssueType = mistranslation|omission|untranslated|addition|duplication|inconsistency|\
                     grammar|legal|register|locale-specific-content|locale-violation|style|characters|misspelling|\
                     typographical|formatting|inconsistent-entities|numbers|pattern-problem|whitespace|\
                     internationalization|length|uncategorized|other
Severity_3.flag.fill = #ff9500
Severity_3.flag.border = #ff9500
Severity_3.flag.text = 3

Highlight exact-match segments with a brilliant green

exact-match=#00FF00

Define a hotkey to add a common language quality issue

This defines the hotkey Ctrl-1 (⌘-1 on Mac) to add a "mistranslation, severity 75" issue to the current segment. It also defines flag information so that these issues appear as the letter 'M' on a red background.

QuickAddMistranslation.locQualityIssueSeverity = 75-100
QuickAddMistranslation.locQualityIssueType = mistranslation
QuickAddMistranslation.flag.fill = #ff0000
QuickAddMistranslation.flag.text = M
QuickAddMistranslation.flag.border = #ff0000
QuickAddMistranslation.quickAdd.locQualityIssueType = mistranslation
QuickAddMistranslation.quickAdd.locQualityIssueSeverity = 75
QuickAddMistranslation.quickAdd.hotkey = 1

Filtering Segments with the Rules UI

You can access the rules UI by selecting Rules from the Filter menu.

[Image: cFtHvo8.png, screenshot of the Rules dialog]

The top section of the dialog shows you the available ITS rules. For each rule, the name is displayed along with the flag as it will be rendered in Ocelot. There are three options for filtering segments by rules:

  • Show all segments (this option is selected by default)
  • Show only segments that have one or more pieces of ITS metadata attached to them
  • Show only segments that match a rule that is enabled from the available list. When this option is selected, rules may be selected individually by checking them.

The bottom section of the dialog allows for additional filtering by state-qualifier values. If any state-qualifier rules were defined, their colors are shown here. (In the screenshot, exact-match and mt-suggestion have colors defined.) There are two options for filtering segments by state-qualifier value:

  • Show all segments (this option is selected by default)
  • Show only segments that have a state-qualifier value matching one of the selected values.

Ocelot displays the intersection of the segments selected by these two filter criteria. This lets you specify combinations such as:

  • Show all segments that have ITS metadata AND a state-qualifier other than "exact-match"
  • Show all segments that have an MT Confidence score assigned AND a state-qualifier of "mt-suggestion"

Validation and Warnings

If Ocelot is unable to parse a rule, or if a rule contains incomplete data, a warning will be printed to the log.