TBX Checker is a general-purpose utility, written in Java, that checks files for compliance with a TBX TML. Any XML validator can verify that a file conforms to the TBX core structure DTD, but the TBX Checker additionally verifies that the file conforms to the constraints of its specific TML, as expressed in an XCS file.
To run this software, you need the Java Runtime Environment (version X) installed. To find out whether you have it, enter java -version at a command line. If you do not, the Java Runtime Environment is available for all major platforms.
To run the TBX Checker, double-click the .jar file contained in the .zip download above, or enter java -jar tbxcheck-_._._ (where the blanks represent version numbers, as before) at a command line. The program is fully graphical: the "Open" button brings up a dialog for selecting a TBX file to validate, and the dropdown menu controls the level of detail in the report, which appears in a new window.
You may encounter several common causes of non-compliance or incomplete compliance:
- Checker can't find necessary files. The TBX file refers to two other files by name, path, or URI: its core structure DTD and its XCS file. A third file is also necessary, namely the DTD for XCS (i.e., the file that explains how an XCS file is structured). The core structure DTD is named in the DOCTYPE declaration: <!DOCTYPE martif SYSTEM "TBXcoreStructV02.dtd">. The XCS is named within the header of the TBX file, in an element that looks like this: <p type="DCSName">TBXBasicXCSV02.xcs</p>. The DTD for XCS is named in the DOCTYPE declaration of the XCS file. If these files do not exist exactly as specified (including upper and lower case), the Checker will not be able to find them. (Unfortunately, the mrc2tbx package as presently constituted gets this wrong: the package provides the file with an .XCS extension in upper case, whereas the TBX file designates it in lower case. This will be fixed in the next release of mrc2tbx; in the meantime you can rename the file yourself.) Very often all three files are cited by name only, and therefore must be placed in the same directory as the TBX file.
- Improper languages. The XCS file specifies not only the data categories that may be used in a given TML, but also the languages. If the TBX file includes terms in languages not listed in the XCS, the TBX Checker will flag the error. As of version 1.2.8, the TBX Checker can be directed not to perform this validation by clicking a check box. In previous versions, you can edit the XCS file's <languages> element (just below the header) by hand to prevent this problem. In the near term we will release a simple utility to make this easier; it will identify all the language codes in an arbitrary TBX file and return an appropriate <languages> element (to be customized with the languages' names and pasted in).
- Broken links. If entries contain links to nonexistent targets (either other terminological entries, or parties responsible for changes), the TBX Checker will flag an error. At present it flags only one such error, no matter how many there are, so after fixing the problem you need to re-check the file. You don't need to restart the Checker; just click the "Open" button again. The origin of this bug is as yet unknown.
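The first two causes can be checked before running the TBX Checker. The following is a minimal sketch only, not part of the TBX Checker or of the planned language utility. The class name TbxPrecheck and its method names are hypothetical. It assumes that all three files are cited by name only (no path or URI), and that language codes appear as xml:lang attributes on langSet elements.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Stream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of TBX Checker.
public class TbxPrecheck {

    // Parse without reading the external DTD, so a missing DTD cannot abort the parse.
    static Document parseIgnoringDtd(Path file) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
        return builder.parse(file.toFile());
    }

    // True only if dir holds an entry with exactly this name, including case.
    // Listing the directory avoids false positives on case-insensitive file systems.
    static boolean existsExactCase(Path dir, String name) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.anyMatch(p -> p.getFileName().toString().equals(name));
        }
    }

    // System identifier from the DOCTYPE declaration, or null if there is none.
    static String doctypeSystemId(Document doc) {
        return doc.getDoctype() == null ? null : doc.getDoctype().getSystemId();
    }

    // XCS file name from <p type="DCSName">...</p>, or null if absent.
    static String xcsName(Document tbx) {
        NodeList paragraphs = tbx.getElementsByTagName("p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            Element p = (Element) paragraphs.item(i);
            if ("DCSName".equals(p.getAttribute("type"))) {
                return p.getTextContent().trim();
            }
        }
        return null;
    }

    // Names of referenced files that are absent from the TBX file's directory.
    // Assumption: references are bare file names, not paths or URIs.
    static List<String> missingReferences(Path tbxFile) throws Exception {
        Path dir = tbxFile.toAbsolutePath().getParent();
        Document tbx = parseIgnoringDtd(tbxFile);
        List<String> missing = new ArrayList<>();

        String coreDtd = doctypeSystemId(tbx);
        if (coreDtd != null && !existsExactCase(dir, coreDtd)) {
            missing.add(coreDtd);
        }
        String xcs = xcsName(tbx);
        if (xcs != null && !existsExactCase(dir, xcs)) {
            missing.add(xcs);
        } else if (xcs != null) {
            // The DTD for XCS is named in the XCS file's own DOCTYPE declaration.
            String xcsDtd = doctypeSystemId(parseIgnoringDtd(dir.resolve(xcs)));
            if (xcsDtd != null && !existsExactCase(dir, xcsDtd)) {
                missing.add(xcsDtd);
            }
        }
        return missing;
    }

    // Distinct xml:lang values found on langSet elements, in sorted order.
    static Set<String> languageCodes(Path tbxFile) throws Exception {
        Set<String> codes = new TreeSet<>();
        NodeList langSets = parseIgnoringDtd(tbxFile).getElementsByTagName("langSet");
        for (int i = 0; i < langSets.getLength(); i++) {
            String code = ((Element) langSets.item(i)).getAttribute("xml:lang");
            if (!code.isEmpty()) {
                codes.add(code);
            }
        }
        return codes;
    }

    public static void main(String[] args) throws Exception {
        Path tbxFile = Paths.get(args[0]);
        System.out.println("Missing files: " + missingReferences(tbxFile));
        System.out.println("Language codes: " + languageCodes(tbxFile));
    }
}
```

Run it with the path to a TBX file as its only argument. Any file reported as missing must be added (or renamed to match the case used in the reference), and the language codes it prints can be compared by hand against the <languages> element of the XCS file. The sketch does not check for broken links.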
You can test the checker with some sample files.
Another approach to checking TBX files is to use an integrated schema that combines the constraints of the core structure of TBX and one TBX TML. Integrated schemas for TBX-Basic have been developed using the RNG schema definition language and the XSD schema definition language. More information on the integrated schema approach to TBX checking will be made available later.
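As an illustration of the integrated schema approach, here is a minimal sketch that validates a file against an XSD schema with the standard javax.xml.validation API. The class name IntegratedSchemaCheck and the method name firstError are hypothetical, and the schema file is whatever integrated XSD you have obtained; none is bundled or named here. The Java runtime supports XSD out of the box; validating against an RNG schema would need a third-party library and is not shown.

```java
import java.io.File;
import java.io.IOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper for illustration; not part of TBX Checker.
public class IntegratedSchemaCheck {

    // Returns null if tbxFile is valid against schemaFile,
    // otherwise the message of the first validation error.
    static String firstError(File schemaFile, File tbxFile) throws SAXException, IOException {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(schemaFile); // throws SAXException if the schema itself is broken
        Validator validator = schema.newValidator();
        try {
            // Note: if the file carries a DOCTYPE declaration, the parser may
            // still try to read the DTD named there.
            validator.validate(new StreamSource(tbxFile));
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to an integrated XSD schema, args[1]: path to a TBX file
        String error = firstError(new File(args[0]), new File(args[1]));
        System.out.println(error == null ? "Valid" : "Invalid: " + error);
    }
}
```

Like the TBX Checker's current handling of broken links, this sketch stops at the first error; collecting every error would require registering an ErrorHandler on the validator.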